Create a custom transcript annotation database - User Guide to SeqMan NGen - 17.4


The Transcript Annotation Database screen allows you to upload a .fasta-formatted database for use in de novo RNA-seq workflows.

A custom database must meet the same formatting specifications as NCBI RefSeq files. It must:

- Be in .fasta format (both single-sequence and multi-sequence files are supported)
- Use the field delimiter ‘|’ (without quotes) between fields
- Have a header line for each entry, written in the format:

ref | [Accession] | [Organism Name] [Description] ([Gene Name])

… where:

- Accession – All characters between the third and fourth field delimiters
- Organism Name – The first two words after the fourth field delimiter
- Description – All words after the Organism Name up to the end of the line or, if a gene name exists, up to the first comma or opening parenthesis
- Gene Name – All characters in parentheses after the Description

Example:

ref | XM_005842486.1 | Chlorella variabilis hypothetical protein (CHLNCDRAFT_144668) mRNA, partial cds
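
To check a custom database before uploading it, the header rules above can be sketched in Python. This is only an illustration, not how SeqMan NGen reads the file: `parse_header` and `TranscriptHeader` are hypothetical names, and the sketch assumes the accession is the field immediately after the `ref` field, so it does not depend on how many fields come before `ref`.

```python
import re
from typing import NamedTuple, Optional


class TranscriptHeader(NamedTuple):
    """Hypothetical container for the four header fields."""
    accession: str
    organism: str
    description: str
    gene_name: Optional[str]


def parse_header(line: str) -> TranscriptHeader:
    """Split one header line into accession, organism, description, gene name."""
    # Drop the FASTA '>' marker if present, then split on the '|' delimiter.
    fields = [f.strip() for f in line.strip().lstrip(">").split("|")]
    if "ref" not in fields:
        raise ValueError("header has no 'ref' field")
    ref_index = fields.index("ref")
    if len(fields) < ref_index + 3:
        raise ValueError("header is missing the accession or the text field")

    # Assumption: the accession is the field right after 'ref'.
    accession = fields[ref_index + 1]
    if not accession:
        raise ValueError("accession is empty")

    # Everything after the accession is free text.
    text = "|".join(fields[ref_index + 2:])
    words = text.split(None, 2)
    if len(words) < 2:
        raise ValueError("organism name must be two words")
    organism = " ".join(words[:2])
    rest = words[2] if len(words) > 2 else ""

    # A gene name, if present, is the first parenthesized group.
    gene = re.search(r"\(([^)]*)\)", rest)
    if gene:
        # With a gene name, the description stops at the first ',' or '('.
        description = re.split(r"[,(]", rest, maxsplit=1)[0].strip()
        gene_name = gene.group(1)
    else:
        # Without a gene name, the description runs to the end of the line.
        description = rest.strip()
        gene_name = None
    return TranscriptHeader(accession, organism, description, gene_name)


header = parse_header(
    "ref | XM_005842486.1 | Chlorella variabilis hypothetical protein "
    "(CHLNCDRAFT_144668) mRNA, partial cds"
)
print(header)
```

Running a check like this over every header line in the .fasta file is one way to catch entries with a missing accession or a one-word organism name before they reach the wizard.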
