VITESSE Documentation						December 7, 1995 

		VITESSE V1.0  (c)1995  Jeff O'Connell

***************************************************************
Summary:
* Program:       VITESSE 
                 (means 'speed' in French)
* Function:      Likelihood Calculation on Pedigrees
* Version:       1.0
* Date:          1995.12.06
* Author:        Jeff O'Connell 
* Copyright:     (c) 1995 Jeff O'Connell 
* Collaborators: Daniel E. Weeks
* Language:      C (ANSI standard)
* Distribution:  via anonymous FTP from 
		 watson.hgen.pitt.edu directory /pub/vitesse (US)
                 ftp.ox.ac.uk    directory  pub/users/ayoung/vitesse (UK)
                 ftp.ebi.ac.uk   
		  directory  pub/software/linkage_and_mapping/hgen_pitt (UK)
                 via WWW from
                 http://info.ox.ac.uk/~ayoung/gas.html  (UK)
* Registration:  via e-mail to jeff@watson.hgen.pitt.edu
* Reference:    "The VITESSE algorithm for rapid exact multilocus 
		 linkage analysis via genotype set-recoding and 
		 fuzzy inheritance", O'Connell JR, Weeks DE,
		 Nature Genetics 11:402-408, December 1995

******************************************************************

--------------------------------------------------------------------
DESCRIPTION:

VITESSE is a software package that computes likelihoods with
the functionality of the LINKMAP and MLINK programs from LINKAGE.

VITESSE uses the novel algorithms of set-recoding and fuzzy inheritance
to reduce the number of genotypes needed for exact computation of the
likelihood, which accelerates the calculation. It also represents
multilocus genotypes locus-by-locus to reduce the memory requirements.

The algorithms in VITESSE were developed and coded by Jeff O'Connell at 
the University of Pittsburgh. Dan Weeks at the University of Pittsburgh 
and the Wellcome Trust Centre for Human Genetics at Oxford University 
collaborated.

-------------------------------------------------------------------
FTP DIRECTIONS:

Here are the instructions for retrieving VITESSE.

ftp watson.hgen.pitt.edu
(login as 'anonymous' with your e-mail address as password)

cd pub/vitesse
get vitesse.tar.Z
 
On your machine:

uncompress vitesse.tar.Z
tar xvf vitesse.tar

All files will appear in the directory ./vitesse.

For DOS users: 

cd pub/vitesse/DOS   

get README.DOS

This file will contain further instructions on how to get 
DOS executables when they are available.  
		   
**The entire ftp site on watson is mirrored at the EBI ftp site in
  the UK.
  ftp.ebi.ac.uk
  directory  pub/software/linkage_and_mapping/hgen_pitt

----------------------------------------------------------------------
REGISTRATION/MAILING LIST:

If you wish to be on the VITESSE mailing list to be informed of 
new releases, updates, bug fixes, etc., please register with me at 
jeff@watson.hgen.pitt.edu. 

-----------------------------------------------------------------
PLATFORMS:

VITESSE has been tested on the following platforms:

Sun SPARCStation 10, Solaris 2.3 
Sun SPARCStation 2, SunOS 4.1.3
DEC Alpha, OSF/1.3
HP workstation, HP/UX 9.01
IBM SP2 (parallel RS/6000 CPUs), AIX 3.21
Silicon Graphics Challenge, IRIX 5.2
Silicon Graphics Indigo, IRIX 4.0.5
IBM-PC DOS, BorlandC Compiler (being tested)

--------------------------------------------------------------------
COMPILING:

There are a total of 4 executables:

1.) vitesse - the main program 
2.) pedcheck - determines the pedigrees VITESSE can currently 
	       handle. See below.
3.) cnvrt_sh - converts an lcp-generated shell to run VITESSE. See below.
4.) pedshell - converts an lcp-generated shell to run 'pedcheck'. See below. 

If you have the Make utility, then type 'make all'.

**The default compiler is gcc with optimization -O. Edit the
 'Makefile' to change these options.

If you don't use the Make utility then compile as follows:

1.  gcc -O vitsrc.c -o vitesse -lm
2.  gcc -O pedsrc.c -o pedcheck -lm
3.  gcc -O cnvrt_sh.c -o cnvrt_sh -lm
4.  gcc -O pedshell.c -o pedshell -lm

Replace gcc -O as necessary or desired.

**VITESSE is written in ANSI C and uses new style function
prototypes. I've had trouble compiling with older Sun compilers.

**For DOS users, executables are provided.
------------------------------------------------
RUNNING:

Running VITESSE requires that you be familiar with running the 
LINKAGE/FASTLINK package. If you have not used LINKAGE/FASTLINK you 
can find user manuals and other very helpful information on Jurg Ott's
Linkage Analysis Web Site at Columbia University. 
   http://linkage.rockefeller.edu/

**VITESSE is currently set up to run in an lcp-generated script file.

**I have included a program called cnvrt_sh to convert any lcp-generated 
  shell file to another shell file for running VITESSE. Basically it 
  strips out calls to the LINKAGE/FASTLINK programs mlink and linkmap 
  and replaces them with calls to vitesse. For example,

 % cnvrt_sh
 Input file :pedin
 Output file: vpedin
   -> 79 lines processed
 %

**Set any necessary paths to where you keep the vitesse and other 
  executables.

**The converted shell file also uses different output file names. They
  are the same as for LINKAGE/FASTLINK except that the letter 'v' is 
  added as a prefix. For example, 'final.out' becomes 'vfinal.out'. 

The file names differ so that the user can compare answers with
LINKAGE/FASTLINK by running both shell files.

**The LINKAGE/FASTLINK program UNKNOWN is not used by VITESSE and so it is
  stripped from the shell file.  
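
The conversion rules described in this section can be pictured with a
short Python sketch. It is only an illustration of the stated rules, not
the logic of cnvrt_sh itself: the function names and the sample script
lines are hypothetical, and only 'final.out' is renamed because that is
the one file name this document gives as an example.

```python
import re

def convert_line(line):
    """Convert one script line; return None if the line should be dropped."""
    words = line.split()
    if not words:
        return line
    if words[0] == "unknown":
        return None  # UNKNOWN is not used by VITESSE, so its call is stripped
    if words[0] in ("mlink", "linkmap"):
        # Replace the LINKAGE/FASTLINK program name with vitesse.
        return line.replace(words[0], "vitesse", 1)
    # Prefix the output file name with 'v' (only the documented example).
    return re.sub(r"\bfinal\.out\b", "vfinal.out", line)

def convert_script(lines):
    """Apply convert_line to every line and drop the stripped ones."""
    converted = (convert_line(line) for line in lines)
    return [line for line in converted if line is not None]

# Simplified, made-up script lines (assumption, not real lcp output).
sample = ["unknown", "mlink", "cat outfile.dat >> final.out"]
print(convert_script(sample))
```

A real lcp-generated script contains many more lines, so use cnvrt_sh
itself for actual conversions.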

---------------------------------------------------------------------------
LIMITATIONS:

Version 1.0 will only handle simple pedigrees. Simple means there are no
loops and there is only one set of parents who are founders. I'm actively
working on general pedigrees without loops. Yes, I know this limitation 
is annoying!!

**If VITESSE encounters a pedigree it cannot handle, it issues a message and
exits. I have included yet another shell converter 'pedshell' to assist you
in determining which pedigrees are simple. Use 'pedshell' to convert
an lcp-generated shell file to another shell file containing a call
to the program 'pedcheck'. For example, 
 
 % pedshell
 Input file :pedin
 Output file: ppedin
   -> 79 lines processed
 %

When you run 'ppedin', a message will be displayed after each 
pedigree that is not simple. Delete those pedigrees from the pedigree 
file (make a backup first) and then run VITESSE.

For example, if you generated 'pedin' using lcp, then convert 'pedin'
to, say, 'ppedin' and run it. Delete pedigrees, if necessary. Then 
convert 'pedin' to 'vpedin' and run it.
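
As a rough illustration of the "only one set of parents who are
founders" condition, the Python sketch below counts the couples in which
both parents are founders. It is one reading of the definition given at
the top of this section, not the logic of pedcheck. It assumes the usual
LINKAGE convention that a founder has father and mother coded as 0, it
does not test for loops, and the function names and pedigrees are made
up.

```python
def founder_couples(people):
    """people: list of (person_id, father_id, mother_id); 0 = parent unknown.

    Returns the set of (father, mother) couples in which both are founders.
    """
    founders = {pid for pid, fa, mo in people if fa == 0 and mo == 0}
    couples = {(fa, mo) for _, fa, mo in people if fa != 0 and mo != 0}
    return {(fa, mo) for fa, mo in couples if fa in founders and mo in founders}

def has_single_founder_couple(people):
    """True if exactly one couple consists of two founders (loops not checked)."""
    return len(founder_couples(people)) == 1

# Made-up pedigree: founders 1 and 2 have child 3, who has child 5
# with the married-in founder 4.
one_couple = [(1, 0, 0), (2, 0, 0), (3, 1, 2), (4, 0, 0), (5, 3, 4)]

# Same pedigree, but person 4 now has founder parents 6 and 7 of their own.
two_couples = [(1, 0, 0), (2, 0, 0), (3, 1, 2),
               (6, 0, 0), (7, 0, 0), (4, 6, 7), (5, 3, 4)]

print(has_single_founder_couple(one_couple))
print(has_single_founder_couple(two_couples))
```

Under this reading the first pedigree passes the founder condition and
the second does not.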

------------------------------------------------------------
ALLELE LUMPING/RECODING:

   NEVER RECODE ALLELES. VITESSE does its own allele recoding.
   Hand recoding may lead to errors and any 'allele lumping'
   will not affect VITESSE's running time. My experience is that
   LINKAGE/FASTLINK does not always handle more than 31 alleles 
   at a locus correctly. VITESSE has no restrictions on the number
   of alleles at a locus.

------------------------------------------------------------
SCREEN OUTPUT:

   The final output from VITESSE appears at the end of the run 
   because the program is optimized for MLINK and LINKMAP runs.
   This means that when a trait locus moves between two markers,
   all thetas are done for that pedigree, instead of doing all 
   pedigrees for one theta. VITESSE will print which pedigree it
   is processing.

   The output during the run is quite different from that of LINKAGE.
   VITESSE prints the state of the calculation to give the user
   an idea of how complex the problem is. As each nuclear
   family is peeled from the pedigree, VITESSE prints the id's of
   the parents and children, and the number of Parental Pairs. This
   is the number of valid multilocus genotype pairings for the parents.
   For each pair, a calculation involving the compatible child genotypes
   is done, so this number relates to the complexity of that nuclear 
   family, and thus the pedigree - the more pairs, the longer the 
   calculation. As you run different pedigrees or add markers, you 
   should get a feel for the time complexity of the problems.

   When you reach the last nuclear family, the output looks slightly
   different because VITESSE uses a novel algorithm to exploit a
   special symmetry in this family, which can give up to a 2-fold
   speedup.

**The time and space complexity of LINKAGE/FASTLINK is tied to 
  the product of the number of alleles of the markers, called Maxhap.
  This constant is actually a false indicator of the complexity of the 
  problem.  The space and time complexity of VITESSE is a function of the 
  number of markers and the number of parental genotypes in the pedigree. 
  Note that Maxhap is irrelevant in VITESSE.
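
To make the Maxhap definition concrete, here is a minimal Python sketch
that multiplies the allele counts of the markers. The function name and
the allele counts are made up for illustration.

```python
import math

def maxhap(allele_counts):
    """Product of the number of alleles at each marker."""
    return math.prod(allele_counts)

# Hypothetical allele counts for four markers.
markers = [2, 8, 10, 12]
print(maxhap(markers))          # 2 * 8 * 10 * 12 = 1920

# Adding one more 10-allele marker multiplies the product by 10.
print(maxhap(markers + [10]))   # 19200
```

The product grows quickly as markers are added, whatever the pedigree
looks like.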

I'm developing a preprocessor program that will quickly give this 
complexity information without doing the entire likelihood calculation. 

--------------------------------------------------------------------------
PEDIGREE/MENDELIAN INCONSISTENCIES:

VITESSE assumes to some extent that the pedigree file is correct and does
not have extensive diagnostic checking. Assuming the pedigree file is 
correct, VITESSE is guaranteed to find any Mendelian inheritance 
inconsistencies in your pedigree. If VITESSE finds an inconsistency it 
exits and gives information on the screen and in the file 'vitesse.dbg' 
to assist you in locating the problem.

--------------------------------------------------------------------------
MEMORY:

The memory requirements are a function of the number of loci and 
number of parental pairs. On most problems I've run out of time
before running out of memory. I am testing other implementations
which use less memory and have not yet decided whether, or how, to
release them as separate versions.

--------------------------------------------------------------------------
COMPARING RESULTS FROM VITESSE AND LINKAGE/FASTLINK:

** IMPORTANT: VITESSE ALWAYS computes the exact likelihood.
   In LINKAGE/FASTLINK, if you use UNKNOWN, then the likelihood may 
   not be preserved because if everyone is untyped then they
   are made homozygous 1/1 with that allele frequency. (And I believe
   there are other things LINKAGE does unrelated to UNKNOWN that may
   alter the likelihood.) In general, my experience is that the more
   complex the pedigree, the more likely the likelihood will be 
   preserved. Although these techniques alter the raw likelihood, the
   LOD scores are preserved.
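
   Why the LOD score can survive a change in the raw likelihood is easy
   to check numerically. The Python sketch below uses the standard
   definition of the LOD score, log10 of L(theta) / L(theta = 0.5), and
   assumes that the alteration multiplies the likelihood by the same
   factor at every theta. The likelihood values and the factor are made
   up.

```python
import math

def lod(lik_theta, lik_half):
    """LOD score: log10 of L(theta) / L(theta = 0.5)."""
    return math.log10(lik_theta) - math.log10(lik_half)

# Made-up likelihoods, for illustration only.
lik_theta, lik_half = 3.2e-40, 1.0e-41

# Hypothetical factor that does not depend on theta (assumption).
scale = 0.25

lod_exact = lod(lik_theta, lik_half)
lod_scaled = lod(scale * lik_theta, scale * lik_half)

# Both are log10(32), about 1.505: the common factor cancels in the ratio.
print(lod_exact, lod_scaled)
```

   The raw log likelihoods differ between the two cases, but their
   difference, the LOD score, does not.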

** IMPORTANT: The log base 10 output of LINKAGE/FASTLINK is not
   always accurate. That is because the programs don't use the log10
   function, but rather divide the natural log by a fixed-length
   constant. This problem may have been fixed in newer versions of
   Pascal LINKAGE 5.* and is fixed in LINKAGE 6.0. Note that the
   inaccuracy is minor, because the answers are fine to 2-3 decimal places.
   
   For example, on my Sun Workstation I get inaccurate log10 answers from
   FASTLINK.  To get better accuracy in FASTLINK change the log10_ value 
   in commondefs.h to  2.302585093 or replace it with '(log(10.0))'. 
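
   The size of the effect can be seen with a small Python sketch that
   divides the natural log by a constant and compares the result with
   the library log10. The short constant 2.3026 and the likelihood value
   are hypothetical, chosen only for illustration. They are not taken
   from the LINKAGE/FASTLINK source.

```python
import math

def log10_via_constant(x, ln10):
    """Base-10 log computed as ln(x) / ln10, with ln10 supplied by the caller."""
    return math.log(x) / ln10

TRUNCATED_LN10 = 2.3026      # hypothetical short constant (assumption)
BETTER_LN10 = 2.302585093    # value suggested in this document

likelihood = 1e-120          # made-up likelihood, for illustration only

exact = math.log10(likelihood)
rough = log10_via_constant(likelihood, TRUNCATED_LN10)
better = log10_via_constant(likelihood, BETTER_LN10)

print("library log10   :", exact)
print("short constant  :", rough, " error:", abs(rough - exact))
print("longer constant :", better, " error:", abs(better - exact))
```

   With these made-up numbers the short constant is off in the fourth
   decimal place, while the longer constant agrees with the library
   function to well beyond six decimal places.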

** IMPORTANT: Since VITESSE is a new program and uses completely different
   algorithms than LINKAGE/FASTLINK, comparing output is advised where 
   possible. In general, the bugs I found while developing VITESSE 
   appeared while running 2-point analyses, so if all the 2-point runs
   agreed, other multipoint runs using those markers agreed.

***THUS, TO COMPARE:
     1. CHANGE THE LOG 10 CONSTANT; OTHERWISE LODS WON'T BE ACCURATE
	AND WON'T MATCH WITH VITESSE.
     2. IN THOSE CASES WHEN LINKAGE/FASTLINK GIVES EXACT LIKELIHOODS 
		                 		   ^^^^^
        THEN VITESSE MUST GIVE THE SAME NATURAL LOG AND BASE 10 LOG 
	VALUES. THE LOD SCORES, HOWEVER, MUST ALWAYS MATCH BETWEEN THE
                    ^^^                       ^^^^^^
	PROGRAMS. 

------------------------------------------------------------------
GAS/VITESSE INTERFACE:

VITESSE is also available with an alternative interface as part of the
GAS system -- Genetic Analysis System by Alan Young, Oxford University.

The GAS/VITESSE implementation offers some functionality different from
that of LINKMAP and MLINK: 2-locus optimization of theta, exclusion mapping,
automated multi-point mapping (up to 8 loci simultaneously) across any
number of adjacent marker loci, and a PostScript file for direct graphical
output.

For more details use either:

WWW   to   http://info.ox.ac.uk/~ayoung/gas.html
FTP   to   ftp.ox.ac.uk    directory  pub/users/ayoung
or send mail to ayoung@vax.ox.ac.uk.

VITESSE is also available at the above sites.

------------------------------------------------------------------
FUTURE WORK:

I'm currently working on various algorithmic additions to VITESSE,
so that it will cover the full range of analyses available in
LINKAGE/FASTLINK:

1. Loopless pedigrees (TOP PRIORITY)

2. Add loops: Then VITESSE will handle all pedigrees. I expect good
	results because in addition to VITESSE's speed the number of
	pedigree traversals due to untyped loop breakers will be reduced.

3. Linkage Disequilibrium.

4. Interference: Add other types of mapping functions.

5. Add ILINK capabilities. I'm looking at other optimization interfaces
   besides Gemini.

In addition,

6. Improved sequential algorithms. Can still greater speedup be achieved? 

7. Optimize existing code. Several functions need to be rewritten and there
    is some code no longer used.

The above improvements will also be incorporated into GAS/VITESSE.
----------------------------------------------------------------------
BETA TESTERS:

I would like to thank the following Beta testers for excellent 
information and suggestions on improving VITESSE and platform 
compatibility.

Jim Tomlin, NIH 
Venkateswara Rao Parasa, NIH 
John I. Powell, NIH 
Alan Young, Oxford University

----------------------------------------------------------------------
PUBLICATION CITATION: "The VITESSE algorithm for rapid exact multilocus
linkage analysis via genotype set-recoding and fuzzy inheritance",
O'Connell JR, Weeks DE, Nature Genetics 11:402-408, December 1995

Please reference this if you use VITESSE in any published work.

----------------------------------------------------------------------


I'm very interested in receiving feedback on your experiences. If you have 
any questions, problems, suggestions, or comments, please get in touch.

Thank you for your assistance,

Jeff O'Connell