The .assembly package is part of the output for XNG workflows. (The contents of the -noSplit.assembly package are similar to those of the .assembly package.)
In the file names below, the project name should be understood to precede any hyphen (-) or period (.) used at the beginning of file and folder names.
File Suffix | Description |
---|---|
It is intended that the entire .assembly package be opened in SeqMan Pro or SeqMan Ultra for viewing and analysis of the assembly. However, the following individual files also contain useful information. | |
.vcf | A VCF file (.vcf) is automatically created for all assemblies with variants. The file is modified in three ways while still adhering to the Variant Call Format (VCF) v. 4.2 specification: (1) In the FILTER field, each row is marked with one of three qualifiers to show whether or not a position was covered: “PASS” for positions where a call could be determined based on the sequence read data; “NC” for positions with no sequence read coverage (this is declared at the top of the file under ##FILTER); and “.” for positions where data for a call is missing or a call could not be made. These changes to the FILTER field apply to both single-sample and multi-sample VCFs, but not to VCFs lacking any sample information. (2) In the QUAL field, a Phred-scaled quality score is provided for the assertion made in the ALT column. The score is calculated as -10 log10 prob(call in ALT is wrong). In rows where the ALT column contains “.” (i.e., no variant was called), QUAL contains -10 log10 prob(variant). In rows where the ALT column does not contain “.” (i.e., a variant was called), QUAL contains -10 log10 prob(no variant). A missing value is specified as “.”. (3) The PA field contains the Pnotref value. Note that the QUAL scale is reversed relative to Pnotref when ALT is “.”, that is, when the call at that position is the reference. In either direction, QUAL scales logarithmically with Pnotref. As a result, QUAL is closer to Qcall (or “GQ”) when there is no “homozygous vs. heterozygous” call ambiguity, and diverges from it when that ambiguity is present. |
.bed, .txt, etc. | The target region file (.bed or manifest) for the assembly, if one was specified. |
.templateInfo | Contains general information for each contig in the assembly. |
.enrichment_Summary.txt | Contains the textual information for the Project > Show coverage of target regions option in SeqMan Pro and SeqMan Ultra. |
.sqd | This file is only created when the .assembly package is first opened in SeqMan Pro or SeqMan Ultra. It contains saved display-specific information, such as SNP filtering criteria. Double-clicking on this file will open the .assembly package in SeqMan Pro or SeqMan Ultra. |
-Transcriptome table folder containing the file .table.txt | This folder and its file, showing the putative gene identity for each transcript, are created for the de novo transcriptome RNA-seq workflow only. |
There is normally no reason to open the following files. | |
-0.assemblyInfo | Contains information about assembly parameters which can be used for combining multiple assemblies. This file is not present for SeqMan NGen assemblies made prior to version 14.0. In 14.0 and later, it is present in templated miRNA, ChIP-Seq, and RNA-Seq workflows. |
[project name]Transcriptome.table.txt | This file is present in RNA-seq workflows that used a .Transcriptome package as a template. It is equivalent to the .table.txt file in the Transcriptome table folder of the .Transcriptome package. |
.auxPair | (internal use only) |
.bam | The BAM formatted alignment file. |
.bam.bai | The BAM index file. |
.capture.userSNP.vcf | (internal use only) |
.combined.snpExt | (internal use only) |
.coverage | Contains information at each position along the contig where the coverage changes. |
.coverage2 | Contains information for the maximum coverage of 100-base-pair intervals across the contig. |
.coverage4 | Contains information for the maximum coverage of 10,000-base-pair intervals across the contig. |
.coverage.missingSNP | Contains information about positions in dbSNP that had coverage and were called the reference base in the assembly. |
.exomeCapture-features | (internal use only) |
.info | Contains files used by SeqMan Pro and SeqMan Ultra. |
.midinfo | (internal use only) |
missing.fas | A FASTA file of reads with no mers matching the reference. |
missing.fas.qual | A base quality file of reads with no mers matching the reference. |
.nocoverage.missingSNP | Contains information about positions in dbSNP that had no coverage in the assembly. |
outofOrder.txt | A text file of sequence reads not included in the final assembly due to excessive trimming during the alignment phase. |
.pair | (internal use only) |
.pairDist | Contains information about the position and distance between paired end reads. |
pairSpecifiers.txt | (internal use only) |
poor.fas | A FASTA file of reads rejected at the layout phase due to match scores below the threshold. |
poor.fas.qual | A base quality file of reads rejected at the layout phase due to match scores below the threshold. |
.quant | Reprises information in the .coverage4, .coverage2, and/or .coverage files. |
.region_capture.bed | (internal use only) |
report.txt | Contains the textual information for the Project > Report option in SeqMan Pro. See View the Project Report for information about the report contents for XNG and SNG workflows. |
.snp | Contains all the information for SNPs called using the “Simple” method. |
.snpExt | Contains all the information for SNPs called using either the “Diploid” or “Haploid” method. |
SNPs.log | An optional text form of the .snpExt table that contains information on how each SNP was calculated. If you encounter a problem, this file helps DNASTAR Support with troubleshooting. |
.splitExt | (internal use only) |
.template-comment | Contains the comment information for that contig. |
.template-features | Contains the feature information for that contig. |
.template-features2 | (internal use only) |
.template.fof | A file-of-files containing the path and file names of the reference sequences. |
.template-gapped-seq | A .seq file of the template containing gaps. |
.template-gaps | A binary file of the template gap information. |
.template-seq | A .seq file of the template without gaps. |
unaligned.fas.qual | A base quality file of reads rejected at the alignment phase. |
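
The FILTER and QUAL conventions described in the .vcf row above can be illustrated with a short Python sketch. It assumes the standard VCF v. 4.2 column order (CHROM, POS, ID, REF, ALT, QUAL, FILTER) and tab-separated rows. The function names and the three sample rows are hypothetical and invented for illustration; they are not taken from an actual .assembly package or from any DNASTAR tool.

```python
import math


def phred_to_prob(qual: float) -> float:
    """Invert QUAL = -10 * log10(p) to recover p, the probability
    that the assertion in the ALT column is wrong."""
    return 10 ** (-qual / 10)


def prob_to_phred(p: float) -> float:
    """Phred-scale a probability: -10 * log10(p)."""
    return -10 * math.log10(p)


def summarize_vcf_rows(lines):
    """Tally FILTER values and decode QUAL for each data row.

    Returns (counts, rows), where each row is
    (chrom, pos, alt, filter, p_wrong, meaning).
    """
    counts = {"PASS": 0, "NC": 0, ".": 0}
    rows = []
    for line in lines:
        # Skip blank lines and header lines (## meta lines and #CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, _ref, alt, qual, flt = fields[:7]
        counts[flt] = counts.get(flt, 0) + 1
        # A missing QUAL value is written as "."
        p_wrong = None if qual == "." else phred_to_prob(float(qual))
        # ALT "." (no variant called): QUAL encodes prob(variant).
        # Otherwise (variant called): QUAL encodes prob(no variant).
        meaning = "prob(variant)" if alt == "." else "prob(no variant)"
        rows.append((chrom, int(pos), alt, flt, p_wrong, meaning))
    return counts, rows


# Hypothetical rows, invented for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t30\tPASS\t.",   # variant call, QUAL 30
    "chr1\t101\t.\tC\t.\t20\tPASS\t.",   # reference call, QUAL 20
    "chr1\t102\t.\tT\t.\t.\tNC\t.",      # no coverage, QUAL missing
]

counts, rows = summarize_vcf_rows(example)
for row in rows:
    print(row)
print(counts)
```

Under these assumptions, a QUAL of 30 on a variant row corresponds to roughly a 0.001 probability that no variant is present, while a QUAL of 20 on a row with “.” in ALT corresponds to roughly a 0.01 probability that a variant is present.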