The Transcript Annotation Database screen allows you to upload a .fasta-formatted database for use in the Transcript Annotation workflow.
A custom database must meet the same formatting specifications as NCBI RefSeq files. It must:
• Be in FASTA format (both single-sequence and multi-sequence files are supported)
• Use the field delimiter ‘|’ (without quotes) between fields
• Have a header line for each entry, written in the format:
ref|[Accession]|[Organism Name] [Description] ([Gene Name])
… where:
Accession - All characters between the first and second field delimiters
Organism Name - The first two words after the second field delimiter
Description - All words after Organism Name, up to the end of the line or, if a gene name is present, up to the first comma or opening parenthesis
Gene Name - All characters in parentheses after Description
Example:
ref|XM_005842486.1|Chlorella variabilis hypothetical protein (CHLNCDRAFT_144668) mRNA, partial cds
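To check header lines before uploading a custom database, the field rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names HEADER_RE and parse_header are hypothetical, the pattern is not the parser the software itself uses, and a leading ">" (the usual FASTA header marker) is treated as optional.

```python
import re

# Hypothetical pattern mirroring the header rules described above;
# it is an illustration, not the parser used by the software.
HEADER_RE = re.compile(
    r"^>?ref\|(?P<accession>[^|]+)\|"   # accession between the two delimiters
    r"(?P<organism>\S+\s+\S+)"          # organism name: first two words
    r"(?:\s+(?P<rest>.*))?$"            # everything after the organism name
)


def parse_header(line):
    """Split one header line into its fields, or return None if it does not fit."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    rest = match.group("rest") or ""
    # Description stops at the first comma or opening parenthesis, if any.
    description = re.split(r"[,(]", rest, maxsplit=1)[0].strip()
    # Gene name is the text inside the first pair of parentheses, if present.
    gene = re.search(r"\(([^)]*)\)", rest)
    return {
        "accession": match.group("accession"),
        "organism": match.group("organism"),
        "description": description,
        "gene_name": gene.group(1) if gene else None,
    }


example = ("ref|XM_005842486.1|Chlorella variabilis hypothetical protein "
           "(CHLNCDRAFT_144668) mRNA, partial cds")
fields = parse_header(example)
print(fields)
```

Run against the example header, the sketch yields the accession XM_005842486.1, the organism name Chlorella variabilis, the description "hypothetical protein", and the gene name CHLNCDRAFT_144668. A line for which parse_header returns None does not follow the header format and should be corrected before upload.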