The Transcript Annotation Database screen allows you to upload a .fasta-formatted database for use in de novo RNA-seq workflows.
Custom databases must meet the same formatting specifications as NCBI RefSeq files. They must:
- Be in .fasta format (either single or multi-sequence files are supported)
- Use the field delimiter ‘|’ (without quotes) between fields
- Have a header line for each entry, written in the format:
ref | [Accession] | [Organism Name] [Description] ([Gene Name])
… where:
- Accession – All characters between the third and fourth field delimiters
- Organism Name – The first two words after the fourth field delimiter
- Description – All words after Organism Name up to the end of the line, or up to a comma or, if a gene name is present, the opening parenthesis
- Gene Name – All characters in parentheses after Description
Example:
ref | XM_005842486.1 | Chlorella variabilis hypothetical protein (CHLNCDRAFT_144668) mRNA, partial cds
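To check a custom database before uploading it, the header rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the parser the application uses. The function name `parse_header` is hypothetical, and the sketch assumes that the accession is the field immediately following the `ref` tag, wherever that tag falls among the delimited fields.

```python
import re


def parse_header(line):
    """Split a RefSeq-style header line into the four documented parts.

    Hypothetical helper for illustration; raises ValueError when the
    header does not contain a 'ref' tag followed by two more fields.
    """
    # Drop a leading '>' if present, then split on the '|' field delimiter.
    fields = [f.strip() for f in line.strip().lstrip(">").split("|")]
    if "ref" not in fields:
        raise ValueError("header has no 'ref' field")
    i = fields.index("ref")
    if i + 2 >= len(fields):
        raise ValueError("header is missing the accession or description field")

    # Assumption: the accession directly follows the 'ref' tag.
    accession = fields[i + 1]
    rest = " ".join(fields[i + 2:])

    # Organism Name: the first two words after the accession field.
    words = rest.split()
    organism = " ".join(words[:2])
    remainder = " ".join(words[2:])

    # Gene Name: the characters inside the first pair of parentheses, if any.
    match = re.search(r"\(([^)]*)\)", remainder)
    gene = match.group(1) if match else None

    # Description: everything up to the first comma or opening parenthesis.
    description = re.split(r"[,(]", remainder, maxsplit=1)[0].strip()

    return {
        "accession": accession,
        "organism": organism,
        "description": description,
        "gene": gene,
    }


if __name__ == "__main__":
    header = ("ref | XM_005842486.1 | Chlorella variabilis hypothetical "
              "protein (CHLNCDRAFT_144668) mRNA, partial cds")
    print(parse_header(header))
```

For the header used in this sketch, the result contains the accession XM_005842486.1, the organism name Chlorella variabilis, the description "hypothetical protein", and the gene name CHLNCDRAFT_144668. A header without parentheses yields a gene name of `None`.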