Variant Analysis/Resequencing workflows - User Guide to SeqMan NGen - 17.6

The following table describes each of the workflows available in the Variant Analysis/Resequencing tab of the Workflow screen.

Group	Workflow	Description
ABI / Sanger	Whole genome	Align Sanger trace data from one or multiple samples to a genomic reference or genome template package for accurate SNP/Indel analysis. This type of assembly can include billions of reads and large eukaryotic genomes. After assembly, compare results in ArrayStar using the SNP Report.
	Amplicon	Align Sanger trace data from one or multiple samples to targeted genes or genomic regions for accurate SNP/Indel analysis. Assembles a region of interest produced by PCR amplification.
	Clone verification	Align reads to confirm clone integrity and insert orientation. (Note: For a dedicated clone verification workflow, see this topic in the SeqBuilder Pro User Guide).
NGS-based	Whole genome	Align NGS sequence data from one or multiple samples to a genomic reference or genome template package for accurate SNP/Indel analysis. This type of assembly can include billions of reads and large eukaryotic genomes.
	Amplicon, gene panel, exome	Align NGS sequence data from one or multiple samples to targeted genes or genomic regions for accurate SNP/Indel analysis. Gene panels look at specific gene regions, usually those corresponding to known defects. Exome assembly saves assembly time and resources by specifically targeting only exons and coding regions, but requires you to have the corresponding .bed file from the capture kit. For instance, if you used Human Genome build 38 as the reference, the corresponding .bed file might be called Human genome build38.bed. If using this workflow with cancer samples, check the box next to Somatic/Cancer/Heterogeneous in the Analysis Options screen. In most cases, downstream analysis of these finished assemblies will take place in ArrayStar.
	Viral-host integration detection	Locate prophage and retro-viral insertion sites in a host genome. Available in Lasergene 17.1 and later. Used to locate putative viral insertion sites or to predict the location of other inserted sequences, such as transposable elements. When you select this workflow, SeqMan NGen automatically sets up a templated assembly that is optimized for locating viral insertion sites. Since chimeric reads (sequences consisting of both host and viral DNA) usually indicate viral insertion sites, SeqMan NGen looks for chimeric reads in a multi-step process. First, the viral genome is used as the initial assembly template. Next, the subset of reads that mapped to the viral genome is re-assembled against the host template. During both reference-guided assembly steps, SeqMan NGen “masks” (trims) whichever half of the chimeric read does not match the template for that step. The host template assembly results are output in BAM file format. SeqMan Ultra is used to explore possible viral insertion sites post-assembly. Launch SeqMan Ultra and use Contig > Contig Coverage to view tabular data for the individual contigs. Navigate to positions with multiple reads, as shown in the depth column. The reads at these positions should be trimmed to the same base, which indicates the insertion site. You may “untrim” the reads to verify that they also contain viral sequence.
PacBio / Nanopore	Whole genome	The long read version of the NGS workflow described above. This workflow uses a new long read alignment algorithm released with Lasergene 17.5 (July 2023). This aligner performs fast and accurate alignments for PacBio HiFi and ONT data while simultaneously calling variants.
	Amplicon, gene panel, exome	The long read version of the NGS workflow described above. This workflow uses a new long read alignment algorithm released with Lasergene 17.5 (July 2023). This aligner performs fast and accurate alignments for PacBio HiFi and ONT data while simultaneously calling variants.
	ARTIC Amplicon	Choose this workflow if you are running any templated assembly using Oxford Nanopore or PacBio CLR/HiFi long read data. The parameters for this workflow are tailored to viral long read data from PCR amplified fragments generated using ARTIC primer sets but, despite the name, will work with any long read data.
Variant Call Format (VCF) files	Functional annotation of a single sample	Annotates the variant positions with functional information from a database, including affected genes and impact on protein encoding regions and/or splice sites.
Variant Call Format (VCF) files	Annotation and comparison of multiple samples	Allows multiple samples in VCF format to be annotated and then compared to identify genes and/or variants of interest in ArrayStar. This workflow is designed for use with assemblies created outside SeqMan NGen (e.g., using BWA + GATK). Such assemblies often have .vcf files as their only output.
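
The Viral-host integration detection row above notes that reads at an insertion site should all be trimmed to the same base. The Python sketch below illustrates that idea only; it is not how SeqMan NGen or SeqMan Ultra works internally. The function name candidate_insertion_sites, the min_reads threshold, and the coordinates are hypothetical, and the input is assumed to be a plain list of host positions at which each read was trimmed.

```python
from collections import Counter


def candidate_insertion_sites(trim_positions, min_reads=2):
    """Return (position, read_count) pairs for host positions where at
    least `min_reads` reads are trimmed at the same base."""
    counts = Counter(trim_positions)
    return sorted((pos, n) for pos, n in counts.items() if n >= min_reads)


# Hypothetical host coordinates where each read's trimmed portion begins.
trim_positions = [10452, 10452, 10452, 20871, 33010, 33010]

for position, read_count in candidate_insertion_sites(trim_positions):
    print(f"possible insertion site at {position} ({read_count} reads)")
```

A position supported by a single read (20871 here) is dropped, which mirrors the guide's advice to look at positions with multiple reads in the depth column.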
