Creating a Custom Transcript Annotation Database

The Transcript Annotation Database screen allows you to upload a FASTA-formatted (.fasta) database for use in the Transcript Annotation workflow.


A custom database must meet the same formatting specifications as NCBI RefSeq files. It must:


      Be in FASTA format (both single-sequence and multi-sequence files are supported)


      Use the field delimiter ‘|’ (without quotes) between fields


      Have a header line for each entry, written in the format:


ref|[Accession]|[Organism Name] [Description] ([Gene Name])


… where:


Accession - All characters between the first and second field delimiters


Organism Name - The first two words after the second field delimiter


Description - All words after the Organism Name up to the end of the line or, if a gene name is present, up to the first comma or opening parenthesis


Gene Name - All characters in parentheses after Description





Example:


ref|XM_005842486.1|Chlorella variabilis hypothetical protein (CHLNCDRAFT_144668) mRNA, partial cds
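

To check headers before uploading, the field rules can be sketched in a few lines of Python. This is only an illustration of one reading of the rules, not the application's actual parser: the names parse_header and RefHeader are hypothetical, the accession is taken as the field that follows "ref|" in the format shown, and tolerating a leading ">" is an assumption.

```python
import re
from typing import NamedTuple, Optional


class RefHeader(NamedTuple):
    """Fields of one header line (hypothetical container for this sketch)."""
    accession: str
    organism: str
    description: str
    gene_name: Optional[str]


def parse_header(line: str) -> RefHeader:
    """Split a 'ref|Accession|Organism Description (Gene)' header into fields."""
    # Assumption: a leading FASTA '>' marker, if present, is ignored.
    line = line.strip().lstrip(">")

    # Split on the first two '|' delimiters only; the rest is free text.
    parts = line.split("|", 2)
    if len(parts) != 3 or parts[0] != "ref":
        raise ValueError(f"not a ref|Accession|... header: {line!r}")
    accession, rest = parts[1], parts[2].strip()

    # Organism Name: the first two words after the accession field.
    words = rest.split()
    if len(words) < 2:
        raise ValueError(f"organism name needs two words: {line!r}")
    organism = " ".join(words[:2])
    remainder = rest.split(None, 2)[2] if len(words) > 2 else ""

    # Gene Name: the characters inside the first pair of parentheses, if any.
    gene = re.search(r"\(([^)]*)\)", remainder)
    if gene:
        # With a gene name, the description stops at the first comma or '('.
        description = re.split(r"[,(]", remainder, maxsplit=1)[0].strip()
        gene_name = gene.group(1)
    else:
        # Without a gene name, the description runs to the end of the line.
        description = remainder.strip()
        gene_name = None

    return RefHeader(accession, organism, description, gene_name)


example = ("ref|XM_005842486.1|Chlorella variabilis hypothetical protein "
           "(CHLNCDRAFT_144668) mRNA, partial cds")
print(parse_header(example))
```

For the example header, this sketch yields the accession XM_005842486.1, the organism name Chlorella variabilis, the description "hypothetical protein", and the gene name CHLNCDRAFT_144668.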