
Documentation for EXTRACT program

The EXTRACT program is a utility that helps prepare files for input to the ANALYSIS, NONPARAMETRIC, and MULTI-ILINK programs. It extracts some (or all) of a set of marker loci from the input files and writes them to the output files in the order you specify, with the intermarker recombination fractions you supply. The ANALYSIS and NONPARAMETRIC programs require the loci to be in chromosome order with intermarker distances accurately specified, so you can use this utility to put the markers in your file in order, to extract only those markers that are syntenic, or both. The MULTI-ILINK program requires that the input files contain only those markers you wish to analyze jointly in a multipoint ILINK analysis, and that this subset of markers be ordered correctly.

Input files:

pedin.dat - LINKAGE format pedigree file containing the disease locus plus a set of allele-numbers marker loci

datain.dat - LINKAGE format parameter file containing the disease locus plus the same set of allele-numbers marker loci

Output files:

pedout.dat - LINKAGE format pedigree file containing the disease locus plus the ordered set of allele-numbers marker loci

dataout.dat - LINKAGE format parameter file containing the disease locus plus the ordered set of allele-numbers marker loci, with the intermarker recombination fractions specified at the bottom of the file

Usage of the program:

This program is very simple to use. Start the extract program from the directory containing your input files, and a message listing the number of alleles and the allele frequencies at each locus appears on your screen. As an example, let us reorder the two marker loci in the sample dataset from the 2-point/sample/autosomal directory:

         2
  0.990000  0.010000
         3
  0.317260  0.394050  0.288690
         2
  0.451190  0.548810

How many marker loci do you wish to extract into the new file ?

(The disease locus is automatically kept as locus 1. Since we want both marker loci in our output file, we answer '2'.)

What is the locus order of these 2 marker loci ?

Note: Disease is assumed to be locus 1 and need not be included here!

(The marker order we enter is '3 2', since we want the two markers in reverse order. Note that each locus number is the position of that locus in the input file: the disease is locus 1, the first marker is locus 2, and so on. The number of alleles at each locus and their allele frequencies are displayed when you start the program, so you can easily verify the number of each locus.)
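
The locus numbering described above can be checked before you answer the prompt. The following Python sketch is not part of EXTRACT. It assumes that the start-up message consists of alternating lines (an allele count, then that many allele frequencies), as in the sample message above, and the helper names parse_banner and output_order are hypothetical.

```python
def parse_banner(text):
    """Map each locus number (1 = disease) to its list of allele frequencies."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    loci = {}
    # Assumed layout: lines come in pairs, allele count first, frequencies second.
    pairs = zip(lines[0::2], lines[1::2])
    for number, (count_line, freq_line) in enumerate(pairs, start=1):
        n_alleles = int(count_line[0])
        freqs = [float(x) for x in freq_line]
        if len(freqs) != n_alleles:
            raise ValueError(f"locus {number}: expected {n_alleles} frequencies")
        loci[number] = freqs
    return loci


def output_order(marker_order):
    """Return input locus numbers in output order, with the disease locus first."""
    # Sanity check of this sketch: the disease locus is kept automatically.
    if 1 in marker_order:
        raise ValueError("locus 1 is the disease locus and need not be listed")
    return [1] + list(marker_order)


banner = """\
         2
  0.990000  0.010000
         3
  0.317260  0.394050  0.288690
         2
  0.451190  0.548810
"""

loci = parse_banner(banner)
for number in output_order([3, 2]):
    print("input locus", number, "has", len(loci[number]), "alleles")
```

For the sample message, this reports the disease locus (2 alleles) first, then input locus 3 (2 alleles), then input locus 2 (3 alleles), which matches the order '3 2' entered at the prompt.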

What are the recombination fractions between the marker loci ?

(Here we enter '0.05', because we want these two loci separated by a recombination fraction of 0.05 in our output file.)
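
It can help to check the values you plan to enter at this prompt. The Python sketch below is an illustration only, not part of EXTRACT. It assumes that one recombination fraction is entered per adjacent pair of extracted markers (inferred from the example, where two markers need one value) and that each value lies between 0 and 0.5. The function name check_recombination_fractions is hypothetical.

```python
def check_recombination_fractions(n_markers, thetas):
    """Validate the answers planned for the recombination-fraction prompt."""
    # Assumption: one value per adjacent pair of extracted markers.
    expected = n_markers - 1
    if len(thetas) != expected:
        raise ValueError(f"expected {expected} value(s), got {len(thetas)}")
    for theta in thetas:
        # A recombination fraction cannot be negative or exceed 0.5.
        if not 0.0 <= theta <= 0.5:
            raise ValueError(f"recombination fraction {theta} is outside [0, 0.5]")
    return list(thetas)


# The example from the text: two markers separated by 0.05.
print(check_recombination_fractions(2, [0.05]))
```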

The program then generates the new files pedout.dat and dataout.dat, which are the LINKAGE format pedigree and parameter files for the reordered (sub)set of marker loci plus the disease locus. These can now be copied to inped.dat and indata.dat for processing with DOWNFREQ, or to pedin.dat and datain.dat for input to the ANALYSIS, NONPARAMETRIC, MULTI-ILINK, or MULTIHOMOG programs.
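
The copying step can be scripted. The following Python sketch uses only the file names given above. The function name stage_outputs and the labels "downfreq" and "analysis" are hypothetical, chosen for this illustration.

```python
import shutil
from pathlib import Path

# File names are taken from the text; the dictionary keys are labels of this sketch.
TARGETS = {
    "downfreq": ("inped.dat", "indata.dat"),
    "analysis": ("pedin.dat", "datain.dat"),
}


def stage_outputs(workdir, target):
    """Copy pedout.dat and dataout.dat to the names the next program reads."""
    ped_name, data_name = TARGETS[target]
    workdir = Path(workdir)
    shutil.copyfile(workdir / "pedout.dat", workdir / ped_name)
    shutil.copyfile(workdir / "dataout.dat", workdir / data_name)
    return workdir / ped_name, workdir / data_name
```

For example, stage_outputs(".", "downfreq") prepares the current directory for DOWNFREQ. Note that copying to pedin.dat and datain.dat overwrites the original input files if they are in the same directory, so you may wish to keep a backup of them first.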


Converted to html by Marc Suchard