This example can be adapted to extract annotated features of interest (e.g., specific feature types, or all features carrying specified annotations) for uses such as building BLAST databases, building consensus matrices, or performing alignments. Script A carries a fragment of an overlapping CDS (yeaC, at 411..414) into the extracted yeaA record, as shown in Output A. Scripts B and C both avoid this problem. They produce identical protein results, but only Script B also writes the extracted nucleotide sequences to a file (m54sCDS.fas).
Goal | To extract a set of annotated CDS features from a genome as protein sequences |
---|---|
Script A | m54sCDS.gbk=extract(m54s.gbk, 'CDS') m54s_proteins1.fas=translate("m54sCDS.gbk") |
Output A | LOCUS U00096:yeaA 414 bp DNA 13-JAN-2012 FEATURES Location/Qualifiers Source 1..414 Source complement (1..414) /seqninja_feature_id="0000000001//Users/guy/Desktop/seqninja_testing /m54s.gbk:1" /note="***Needs review***Cut segment head by 1860039 and tail by 2778768 units." /organism="Escherichia coli" CDS 411..414 /gene="yeaC" /seqninja_feature_id="0000000012//Users/guy/Desktop/seqninja_testing /m54s.gbk:1" /note="***Needs review***Cut segment tail by 314 units." CDS 1..414 /gene="yeaA" /seqninja_feature_id="0000000013//Users/guy/Desktop/seqninja_testing /m54s.gbk:1" ORIGIN 1 atggctaata aaccttcggc agaagaactg aaaaaaaatt tgtccgagat gcagttttac 61 gtgacgcaga atcatgggac agaaccgcca tttacgggtc gtttactgca taacaagcgt 121 gacggcgtat atcactgttt gatctgcgat gccccgctgt ttcattccca aaccaagtat 181 gattccggct gtggctggcc cagtttctac gaaccggtaa gtgaagaatc cattcgttat 241 atcaaagact tgtcacatgg aatgcagcgc atagaaattc gttgcggtaa ctgtgatgcc… |
Script B | m54sCDS.fas=extract(m54s.gbk, 'CDS') m54s_proteins2.fas=translate("m54sCDS.fas", '/transl_table=11') |
Script C | m54s_proteins3.fas=translate("m54s.gbk") |
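
To illustrate why the 4-base yeaC fragment in Output A is a problem, the following Python sketch extracts CDS features by coordinate and translates only those that contain whole codons. It is not SeqNinja code and does not describe how SeqNinja's extract or translate functions work internally. The function names, the feature tuple layout, and the shortened coordinates (yeaA as 1..30, the yeaC fragment as 27..30) are assumptions made for illustration. The sequence is the first 30 bases shown in Output A.

```python
from itertools import product

# Amino acids of the standard genetic code in NCBI "TCAG" codon order.
# Bacterial table 11 assigns the same amino acids to every codon and
# differs only in which codons may act as starts.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_dna(dna):
    """Translate in frame 1, stopping before the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # "X" for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def extract_cds_proteins(sequence, features):
    """Translate forward-strand CDS features given as (gene, start, end).

    Coordinates are 1-based and inclusive, as in GenBank feature tables.
    Features whose length is not a multiple of 3 are reported as skipped
    instead of being translated (hypothetical policy for this sketch).
    """
    proteins, skipped = {}, []
    for gene, start, end in features:
        segment = sequence[start - 1:end]
        if len(segment) % 3 != 0:
            skipped.append(gene)
            continue
        proteins[gene] = translate_dna(segment)
    return proteins, skipped


# First 30 bases of the yeaA record shown in Output A.
sequence = "atggctaataaaccttcggcagaagaactg"
# Shortened, illustrative coordinates: a full CDS plus a 4-base fragment.
features = [("yeaA", 1, 30), ("yeaC", 27, 30)]

proteins, skipped = extract_cds_proteins(sequence, features)
print(proteins)
print(skipped)
```

In this sketch the yeaC fragment is set aside because 4 bases do not contain a whole number of codons, which mirrors why a cut-off overlapping CDS should not appear in a protein set.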