Specifying Read Technology

Before leaving the Input Sequence Files and Define Experiments or Individual Replicates dialog and proceeding to the next screen, you must make a selection from the Read technology drop-down menu: Illumina, 454, Ion Torrent, Pac Bio, Sanger or Other. The default values for parameters and other assembly options in subsequent panels will be based on your selection.

 

Considerations when choosing a read technology:

 

      If you choose Illumina, SeqMan NGen assumes that you have paired-end data with reads longer than 50 bp and a 500 bp insert distance. For all other technologies, SeqMan NGen presumes single-end data.

 

Note: If the read length is shorter than 50 bp, you may wish to specify a shorter Mer size in the Assembly Options screen. Additional options to employ with very short reads include the Advanced Options Minimum aligned length and Maximum gap size.

 

      For de novo assemblies, if you select Illumina and enter an insert size of 150 bp or less in the Set Pair Information dialog, the assembler will assume the reads overlap and will attempt to create a single “super-read” from each pair. Read pairs that cannot be merged, either because they do not overlap or because they have numerous errors in the overlapping region, will not be included in the assembly. See this help topic for a description of how to use Contaminant scan to remove PhiX174 control sequence from Illumina data prior to assembly.

 

      Because PacBio technology does not support paired reads, PacBio is not available when you select Reference-guided assemblies with gap closure in the Choose Assembly Type screen. The gap closure step depends on the presence of paired reads.

 

      Both types of Ion Torrent paired reads, “mate pairs” and “paired ends”, are supported.

 

      For an assembly that uses solely Sanger data, we strongly recommend using either SeqMan Pro or the SeqMan NGen SNG assembler. If using both Sanger and Illumina data (Sanger Validation Workflow), choose Illumina for all reads.
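
The “super-read” merging described above for short-insert Illumina pairs can be illustrated with a small Python sketch. This is not SeqMan NGen's actual algorithm. The function names and the min_overlap and max_mismatch_rate parameters are hypothetical, chosen only to show the general idea: reverse-complement the second read, look for an overlap between the end of the first read and the start of the second, and discard the pair if no acceptable overlap exists.

```python
from typing import Optional

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(
    read1: str,
    read2: str,
    min_overlap: int = 10,            # hypothetical threshold, not an NGen setting
    max_mismatch_rate: float = 0.1,   # hypothetical threshold, not an NGen setting
) -> Optional[str]:
    """Merge a paired-end read pair into one super-read, or return None.

    read2 is assumed to come from the opposite strand, so it is
    reverse-complemented before the overlap search.
    """
    mate = reverse_complement(read2)
    longest = min(len(read1), len(mate))
    # Try the longest candidate overlap first, then shorter ones.
    for overlap in range(longest, min_overlap - 1, -1):
        tail = read1[-overlap:]
        head = mate[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_rate * overlap:
            # Keep read1 whole and append the non-overlapping part of the mate.
            return read1 + mate[overlap:]
    # No acceptable overlap: in this sketch the pair is simply dropped.
    return None


if __name__ == "__main__":
    fragment = "ACGTTGCAAGGCTTAACCGGATCTAGCATG"    # 30 bp toy insert
    r1 = fragment[:20]                             # forward read
    r2 = reverse_complement(fragment[10:])         # reverse read
    print(merge_pair(r1, r2))
    print(merge_pair("A" * 20, "G" * 20))
```

In this toy example the two 20 bp reads share a 10 bp overlap, so the pair can be merged back into the original 30 bp fragment, while the second pair has no acceptable overlap and is rejected. A real merger would also weigh base quality scores, which this sketch ignores.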