abyss-pe

abyss-pe(1)                      User Commands                     abyss-pe(1)



NAME
       abyss-pe - assemble reads into contigs

SYNOPSIS
       abyss-pe [OPTION]...  [PARAMETER=VALUE]...  [MAKE_TARGET]...

DESCRIPTION
       Assemble the reads of the input files into contigs. The reads may be in
       FASTA, FASTQ, qseq, export, SRA, SAM or BAM format, may be compressed
       with gz, bz2 or xz, and may be tarred.

       abyss-pe is a Makefile script. Any options of make may also be used
       with abyss-pe.


   Parameters of abyss-pe
       name, JOB_NAME
              The name of this assembly. The resulting scaffolds will be
              stored in ${name}-scaffolds.fa.

       in     input files. Use this variable when assembling data from a
              single library.

       lib    a quoted list of whitespace-separated paired-end library names.
              Use this variable when assembling data from multiple paired-end
              libraries.  For each library name in lib, the user must define a
              variable on the command line with the same name, which indicates
              the read files for that library. See EXAMPLES below for a
              concrete example of usage.

       pe     list of paired-end libraries that will be used only for merging
              unitigs into contigs and will not contribute toward the
              consensus sequence.

       mp     list of mate-pair libraries that will be used for scaffolding.
              Mate-pair libraries do not contribute toward the consensus
              sequence.

       long   list of long sequence libraries that will be used for
              rescaffolding.  Long sequence libraries do not contribute toward
              the consensus sequence.

       se     files containing single-end reads

       a      maximum number of branches of a bubble [2]

       b      maximum length of a bubble (bp) [""]
              abyss-pe has two bubble popping stages. The default limits are
              3*k bp for ABYSS and 10000 bp for PopBubbles.

       c      minimum mean k-mer coverage of a unitig [sqrt(median)]

       d      allowable error of a distance estimate (bp) [6]

       e      minimum erosion k-mer coverage [round(sqrt(median))]

       E      minimum erosion k-mer coverage per strand [1 if sqrt(median) > 2
              else 0]

       j      number of threads [2]

       k      size of a k-mer (when K is not set) or the span of a k-mer pair
              (when K is set)

       K      size of a single k-mer in a k-mer pair (bp)

       l      minimum alignment length of a read (bp) [40]

       m      minimum overlap of two unitigs (bp) [k-1]

       n      minimum number of pairs required for building contigs [10]

       N      minimum number of pairs required for building scaffolds [n]

       p      minimum sequence identity of a bubble [0.9]

       q      minimum base quality when trimming [3]
              Trim bases from the ends of reads whose quality is less than q.

       Q      minimum base quality [0]
              Mask all bases of reads whose quality is less than Q as `N'.

       s      minimum unitig size required for building contigs (bp) [1000]
              The seed length should be at least twice the value of k. If more
              sequence is assembled than the expected genome size, try
              increasing s.

       S      minimum contig size required for building scaffolds (bp)
              [1000-10000]

       SS     SS=--SS to assemble in strand-specific mode
              Requires that all libraries are strand-specific RNA-Seq
              libraries.  Assumes that the first read in a read pair is
              reversed with respect to the transcripts sequenced.

       t      maximum length of blunt contigs to trim [k]

       v      v=-v to enable verbose logging

       np, NSLOTS
              the number of processes of an MPI assembly

       mpirun the path to mpirun

       aligner
              The program to use to align the reads to the contigs [map].
              Permitted values are: map, kaligner, bwa, bwasw, bowtie,
              bowtie2, dida.  See the DIDA section below for further info on
              the dida option.

       cs     convert colour-space contigs to nucleotide contigs following
              assembly
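
       Several of the defaults above are stated relative to k. As a rough
       worked example, assuming k=64 (the value used in EXAMPLES), the shell
       arithmetic below prints the values implied by the descriptions of b,
       m, t and s. The variable names are illustrative only; abyss-pe does
       not read them.

```shell
# Worked arithmetic for the k-dependent values described above.
# These variable names are illustrative and are not abyss-pe parameters.
k=64

bubble_abyss=$((3 * k))   # default bubble length limit for ABYSS: 3*k bp
min_overlap=$((k - 1))    # default m: k-1
trim_len=$k               # default t: k
min_seed=$((2 * k))       # s should be at least twice the value of k

echo "k=$k b(ABYSS)=$bubble_abyss m=$min_overlap t=$trim_len s>=$min_seed"
```

       With k=64 the default s of 1000 is well above the 2*k guideline, so
       the guideline only becomes a constraint for much larger k or a
       manually lowered s.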

   Options of make
       -n, --dry-run
              Print the commands that would be executed, but do not execute
              them.
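
       Because abyss-pe is a Makefile script, a dry run is an inexpensive way
       to inspect the pipeline before starting a long assembly. The sketch
       below only builds and prints such a command line; the read file names
       are the placeholders used in EXAMPLES.

```shell
# Build a dry-run invocation. -n is the make option documented above;
# k, name and in are abyss-pe parameters. The file names are placeholders.
cmd="abyss-pe -n k=64 name=ecoli in='reads1.fa reads2.fa'"

# Print the command for review. Use: eval "$cmd"  to actually run it.
echo "$cmd"
```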

   Make targets for abyss-pe
       default
              Equivalent to `scaffolds scaffolds-dot stats'.

       unitigs
              Assemble unitigs.

       unitigs-dot
              Output the unitig overlap graph.

       pe-sam Map paired-end reads to the unitigs and output a SAM file. The
              SAM file will contain only reads mapping to different contigs,
              and the read ID, sequence and quality strings will be replaced
              with '*' characters.

       pe-bam Map paired-end reads to the unitigs and output a BAM file. The
              BAM file will contain only reads mapping to different contigs,
              and the read ID, sequence and quality strings will be replaced
              with '*' characters.

       pe-index
              Generate an index of the unitigs used by abyss-map.

       contigs
              Assemble contigs.

       contigs-dot
              Output the contig overlap graph.

       mp-sam Map mate-pair reads to the contigs and output a SAM file. The
              SAM file will contain only reads mapping to different contigs,
              and the read ID, sequence and quality strings will be replaced
              with '*' characters.

       mp-bam Map mate-pair reads to the contigs and output a BAM file. The
              BAM file will contain only reads mapping to different contigs,
              and the read ID, sequence and quality strings will be replaced
              with '*' characters.

       mp-index
              Generate an index of the contigs used by abyss-map.

       scaffolds
              Assemble scaffolds.

       scaffolds-dot
              Output the scaffold overlap graph.

       scaftigs
              Break scaffolds and generate AGP file.

       long-scaffs
              Rescaffold using RNA-Seq assembled contigs.

       long-scaffs-dot
              Output the RNA scaffold overlap graph.

       stats  Display assembly contiguity statistics.

       clean  Remove intermediate files.

       version
              Display the version of abyss-pe.

       versions
              Display the versions of all programs used by abyss-pe.

       help   Display a helpful message.
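
       Make targets are given after the parameters, as shown in SYNOPSIS. As
       an illustration, the sketch below prints one command per stage so that
       each stage can be run and checked separately. stage_commands is a
       hypothetical helper name, and the parameters are the placeholders used
       in EXAMPLES.

```shell
# Print one abyss-pe command per make target, in pipeline order.
# stage_commands is an illustrative helper, not part of ABySS.
stage_commands() {
    for target in unitigs contigs scaffolds stats; do
        echo "abyss-pe k=64 name=ecoli in='reads1.fa reads2.fa' $target"
    done
}

# Review the commands first. Use: stage_commands | sh  to execute them.
stage_commands
```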


DIDA
       ABySS supports the use of DIDA (Distributed Indexing Dispatched
       Alignment), an MPI-based alignment framework for computing sequence
       alignments across multiple machines. To use DIDA with ABySS, first
       download and install DIDA from
       http://www.bcgsc.ca/platform/bioinfo/software/dida, then specify `dida`
       as the value of the aligner parameter to abyss-pe.


   DIDA-related abyss-pe parameters
       DIDA_MPIRUN
              The `mpirun` command used to run DIDA jobs.

       DIDA_RUN_OPTIONS
              Runtime options such as number of threads per MPI rank and
              values for environment variables (e.g. PATH, LD_LIBRARY_PATH).
              Run `abyss-dida --help` for a list of available options.

       DIDA_OPTIONS
              Options that are passed directly to the DIDA binary. For
              example, this can be used to control the minimum alignment
              length threshold.  Run `dida-wrapper --help` for a list of
              available options.


   MPI COMPATIBILITY
       Due to its use of multi-threading, DIDA has known deadlocking issues
       with OpenMPI.  Using the MPICH MPI library is strongly recommended
       when running assemblies with DIDA. Testing was done with MPICH 3.1.3,
       compiled with --enable-threads=funneled.


   EXAMPLE
       The recommended runtime configuration for DIDA is 1 MPI rank per
       machine and 1 thread per CPU core. For example, to run an assembly
       across 3 cluster nodes with 12 cores each, do:

            abyss-pe k=64 name=ecoli in='reads1.fa reads2.fa' aligner=dida \
                DIDA_RUN_OPTIONS='-j12' \
                DIDA_MPIRUN='mpirun -np 3 -ppn 1 -bind-to board'

       This example uses the MPICH command line options for `mpirun`.  Here,
       `-np 3` indicates the number of MPI ranks, `-ppn 1` indicates the
       number of MPI ranks per "node", and `-bind-to board` defines a "node"
       to be a motherboard (i.e. a full machine).
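
       The same layout can be derived from the cluster shape rather than
       typed by hand. The following is a minimal sketch, assuming MPICH
       `mpirun` syntax as above; nodes and cores_per_node are illustrative
       shell variables, not abyss-pe parameters, and the script only prints
       the resulting command.

```shell
# Derive the DIDA settings from the recommended layout:
# 1 MPI rank per machine and 1 thread per CPU core.
nodes=3            # illustrative: number of cluster nodes
cores_per_node=12  # illustrative: CPU cores on each node

dida_run_options="-j$cores_per_node"
dida_mpirun="mpirun -np $nodes -ppn 1 -bind-to board"   # MPICH options

cmd="abyss-pe k=64 name=ecoli in='reads1.fa reads2.fa' aligner=dida \
DIDA_RUN_OPTIONS='$dida_run_options' DIDA_MPIRUN='$dida_mpirun'"

# Print the command for review. Use: eval "$cmd"  to actually run it.
echo "$cmd"
```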


ENVIRONMENT VARIABLES
       Any parameter that may be specified on the command line may also be
       specified in an environment variable.

       PATH   must contain the directory where the ABySS executables are
              installed.  Use `abyss-pe versions` to check that PATH is
              configured correctly.

       TMPDIR specifies a directory to use for temporary files
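
       As a brief illustration of the above, the sketch below first checks
       that abyss-pe is reachable through PATH, then exports k and name so
       that only the input files remain on the command line. The parameter
       values and file names are the placeholders used in EXAMPLES.

```shell
# Confirm that the ABySS executables are reachable through PATH.
if command -v abyss-pe >/dev/null 2>&1; then
    abyss-pe versions
else
    echo "abyss-pe not found; add its install directory to PATH" >&2
fi

# Parameters may come from the environment instead of the command line.
export k=64
export name=ecoli

# Only the input files remain to be given as arguments.
cmd="abyss-pe in='reads1.fa reads2.fa'"

# Print the command for review. Use: eval "$cmd"  to actually run it.
echo "$cmd"
```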

   Scheduler integration
       ABySS integrates well with cluster job schedulers, such as:
        * Sun Grid Engine (SGE)
        * Portable Batch System (PBS)
        * Load Sharing Facility (LSF)
        * IBM LoadLeveler

       The SGE environment variables JOB_NAME, SGE_TASK_ID and NSLOTS may be
       used to specify the parameters name, k and np, respectively, and
       similarly for other schedulers.
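
       abyss-pe picks up these variables itself; the sketch below merely
       spells out the documented correspondence as explicit parameters, which
       can help when debugging a job script. sge_command is a hypothetical
       helper name, and the fallback values are placeholders for running
       outside SGE.

```shell
# Spell out the SGE-to-parameter mapping described above.
# sge_command is an illustrative helper, not part of ABySS.
sge_command() {
    name=${JOB_NAME:-ecoli}   # JOB_NAME    -> name
    k=${SGE_TASK_ID:-64}      # SGE_TASK_ID -> k
    np=${NSLOTS:-8}           # NSLOTS      -> np
    echo "abyss-pe name=$name k=$k np=$np in='reads1.fa reads2.fa'"
}

# Print the equivalent explicit command line for the current environment.
sge_command
```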

EXAMPLES
   One paired-end library
        abyss-pe k=64 name=ecoli in='reads1.fa reads2.fa'

   Multiple paired-end libraries
        abyss-pe k=64 name=ecoli lib='lib1 lib2' \
            lib1='lib1_1.fa lib1_2.fa' lib2='lib2_1.fa lib2_2.fa' \
            se='se1.fa se2.fa'
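
       With many libraries, the per-library variables can be generated rather
       than typed. The following is a minimal sketch, assuming every library
       follows the <library>_1.fa and <library>_2.fa naming pattern of the
       example above; it omits the se parameter and only prints the command.

```shell
# Build lib and one variable per library name.
# Assumes read files are named <library>_1.fa and <library>_2.fa.
libs="lib1 lib2"

args="lib='$libs'"
for l in $libs; do
    args="$args $l='${l}_1.fa ${l}_2.fa'"
done

cmd="abyss-pe k=64 name=ecoli $args"

# Print the command for review. Use: eval "$cmd"  to actually run it.
echo "$cmd"
```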

   Paired-end and mate-pair libraries
        abyss-pe k=64 name=ecoli lib='pe1 pe2' mp='mp1 mp2' \
            pe1='pe1_1.fa pe1_2.fa' pe2='pe2_1.fa pe2_2.fa' \
            mp1='mp1_1.fa mp1_2.fa' mp2='mp2_1.fa mp2_2.fa' \
            se='se1.fa se2.fa'

   Including RNA-Seq assemblies
        abyss-pe k=64 name=ecoli lib=pe1 mp=mp1 long=long1 \
            pe1='pe1_1.fa pe1_2.fa' mp1='mp1_1.fa mp1_2.fa' \
            long1=long1.fa

   MPI
        abyss-pe np=8 k=64 name=ecoli in='reads1.fa reads2.fa'

   SGE
        qsub -N ecoli -t 64 -pe openmpi 8 \
            abyss-pe n=10 in='reads1.fa reads2.fa'

SEE ALSO
       make(1), ABYSS(1)

AUTHOR
       Written by Shaun Jackman.

REPORTING BUGS
       Report bugs to <abyss-users@googlegroups.com>.

COPYRIGHT
       Copyright 2015 Canada's Michael Smith Genome Sciences Centre



abyss-pe (ABySS) 2.1.5             2015-May                        abyss-pe(1)