The Project Report for de novo assemblies will contain a subset of the following results:
Assembly Totals | |
---|---|
Contigs | Total number of contigs assembled. |
Contigs > 2K | Total number of assembled contigs that are more than 2000 base pairs in length. |
Contigs to Reach Genome Length ‘x’ | The number of contigs needed to cover the genome length specified in the Workflow pane. |
Contigs removed due to small size | The number of contigs removed due to being smaller than the threshold value. |
Assembled Sequences | The number of sequences utilized in the assembly. |
Unassembled Sequences | The number of sequences excluded from the assembly. These may be further categorized as: 1) Sequences not assembled due to complete trimming, and 2) Sequences removed due to small contig size. |
All Sequences | Total number of sequences in the project. |
Contig N50 | Contig length at which 50% of the assembled sequence data is contained in contigs of that length or longer. Note: In a typical microbial genome assembly, Contig N50 values exceed 80K base pairs and genome coverage is attained in fewer than 100 contigs. In many assemblies, Contig N50 exceeds 100K base pairs, with genome coverage attained in 25 contigs. If paired-end Roche 454 Life Sciences data are used, contigs can be ordered into a handful of large scaffolds to attain genome coverage, which greatly facilitates gap closure and completion of the genome assembly. |
Average Coverage | Average depth of coverage in the assembly. |
Average Totals | |
Sequences Per Contig | Average number of sequences used for each contig. |
Average Lengths | |
Contigs | Average contig length. |
Assembled Sequences | Average length of sequences used in the assembly. |
Unassembled Sequences | Average length of sequences excluded from the assembly. |
All Sequences | Average length of all sequences in the project. |
Average Quality | |
Assembled Sequences | Average quality score of sequences used in the assembly. |
Unassembled Sequences | Average quality score of sequences excluded from the assembly. |
All Sequences | Average quality score of all sequences in the project. |
Assembled Pair Statistics | |
Read Pairs | Total number of paired reads in the project. |
Assembled Pairs | The number of paired reads included in the assembly. |
Pairs Consistent Within a Contig | The number of paired sequences within a single contig that met pair constraints. One “pair” in this statistic represents two sequences. |
Pairs Inconsistent Within a Contig | The number of putative paired sequences within a contig that did not meet pair constraints. |
Split Pair Statistics (Ion Torrent paired reads and 454 data only) | |
Reads Split into Pairs | The number of reads that were split into pairs at the linker. |
Unsplit Reads with Pair Linker(s) | The number of reads that were not split into pairs because the linker was too far to one side. |
Unsplit Reads without Pair Linker(s) | The number of reads that were not associated with a linker. |
Assembly Parameters | |
Match Size | The values specified in the SeqMan NGen wizard prior to assembly. |
Match Spacing | |
Minimum Match Percentage | |
Match Score | |
Mismatch Penalty | |
Gap Penalty | |
Max Gap | |
Genome Length | |
Expected Coverage | |
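
To illustrate how several of the Assembly Totals relate to one another, the following minimal sketch computes Contigs, Contigs > 2K, Contig N50, and Contigs to Reach Genome Length from a plain list of contig lengths. It is not SeqMan NGen's own code: the function name `assembly_totals`, its parameters, and the use of the standard N50 definition (the length of the contig at which the running total of lengths, taken from longest to shortest, first reaches half of all assembled bases) are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def assembly_totals(
    contig_lengths: List[int],
    genome_length: int,
    size_cutoff: int = 2000,
) -> Dict[str, Optional[int]]:
    """Summarize a set of contig lengths (hypothetical helper, not a DNASTAR API).

    contig_lengths: length in base pairs of each assembled contig.
    genome_length:  expected genome length, as would be entered by the user.
    size_cutoff:    threshold for the "Contigs > 2K" style count.
    """
    # Work from the longest contig to the shortest.
    lengths = sorted(contig_lengths, reverse=True)
    total_bases = sum(lengths)

    def reach(target: float) -> Tuple[Optional[int], Optional[int]]:
        """Return (contig count, length of last contig added) once the
        cumulative length reaches `target`, or (None, None) if it never does."""
        running = 0
        for count, length in enumerate(lengths, start=1):
            running += length
            if running >= target:
                return count, length
        return None, None

    # N50: the contig length at which half of all assembled bases are covered.
    _, n50 = reach(total_bases / 2)
    # Number of contigs whose combined length covers the expected genome.
    contigs_for_genome, _ = reach(genome_length)

    return {
        "contigs": len(lengths),
        "contigs_over_cutoff": sum(1 for n in lengths if n > size_cutoff),
        "contig_n50": n50,
        "contigs_to_reach_genome_length": contigs_for_genome,
    }
```

For example, contigs of 5000, 4000, 3000, 2000, and 1000 base pairs total 15,000 bases. The two longest contigs together hold 9000 bases, which passes the halfway mark of 7500, so the N50 under this definition is 4000. Three contigs exceed 2000 base pairs, and three contigs are needed to reach a genome length of 12,000.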