Sample Sequences - User Guide to SeqNinja

The Sample Sequences template, located in the Templates panel, is used to make an output file that contains a filtered set of sequences from the source file. Source file sequences can be filtered according to one or more specified conditions, such as length, contents, and start/end sequence characters.

Initially, the template options are pre-selected (or pre-filled) with an example that filters for sequences at least 375 nt in length and containing the sequence “GATCT.” Overwrite these selections to fit your own needs.

At least one Filter row is needed to specify the sampling criteria. Two Filter rows are provided as examples and can be edited or removed.
- To add a new Filter row or delete an existing one, click the plus or minus tool on the right of the row.
- To edit a Filter row, make selections from the Filter drop-down menus and fill in the corresponding Value boxes. The Filter drop-down menus offer the following options:

Use this filter:	To include:	Allowable values
Minimum Length	Only sequences at least as long as the specified length.	Positive integer
Contains	Only sequences containing a specified sequence fragment. For sequences of DNA or unknown type, matches can occur on either strand.	DNA or protein sequence fragment using 1-letter IUPAC codes.
From Sequence Index	All sequences beginning with the sequence of this name.	Sequence name
To Sequence Index	All sequences up to and including the sequence of this name.	Sequence name
Sample Every	Every ‘nth’ sequence in the source file, where ‘n’ is a positive integer.	Positive integer
Sequence Name	All sequences with this name.	Sequence name
Probability to Include	A random subset of sequences. Each member of the source set individually has the given probability of being included.	A single-quoted decimal value from 0.0-1.0 (e.g., ‘0.7’)
Maximum Length	Only sequences no longer than the specified length.	Positive integer
Starts With	Only sequences beginning with a specified sequence fragment. For sequences of DNA or unknown type, matches can occur at the beginning of either strand.	DNA or protein sequence fragment using 1-letter IUPAC codes.
Ends With	Only sequences ending with a specified sequence fragment. For sequences of DNA or unknown type, matches can occur at the end of either strand.	DNA or protein sequence fragment using 1-letter IUPAC codes.
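The filter semantics in the table above can be sketched in plain Python. This is only an illustration of the logic, not SeqNinja's implementation or scripting interface: the function names (`contains`, `starts_with`, `ends_with`, `sample`) and the record layout of `(name, sequence)` pairs are hypothetical, the filters are assumed to combine with a logical AND (as in the pre-filled example), and only the unambiguous bases A, C, G and T are handled.

```python
import random

# Complement table for unambiguous DNA bases only.
# IUPAC ambiguity codes (N, R, Y, ...) are not handled in this sketch.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def contains(seq, fragment):
    """True if the fragment occurs on either strand of seq."""
    seq, fragment = seq.upper(), fragment.upper()
    return fragment in seq or reverse_complement(fragment) in seq


def starts_with(seq, fragment):
    """True if either strand begins with the fragment."""
    seq, fragment = seq.upper(), fragment.upper()
    # The reverse strand begins with the fragment exactly when the
    # forward strand ends with the fragment's reverse complement.
    return seq.startswith(fragment) or seq.endswith(reverse_complement(fragment))


def ends_with(seq, fragment):
    """True if either strand ends with the fragment."""
    seq, fragment = seq.upper(), fragment.upper()
    return seq.endswith(fragment) or seq.startswith(reverse_complement(fragment))


def sample(records, min_length=None, max_length=None,
           contains_fragment=None, probability=None, rng=None):
    """Keep the (name, sequence) records that pass every supplied filter."""
    rng = rng or random.Random()
    kept = []
    for name, seq in records:
        if min_length is not None and len(seq) < min_length:
            continue
        if max_length is not None and len(seq) > max_length:
            continue
        if contains_fragment is not None and not contains(seq, contains_fragment):
            continue
        # Each record is included independently with the given probability.
        if probability is not None and rng.random() >= probability:
            continue
        kept.append((name, seq))
    return kept


if __name__ == "__main__":
    records = [
        ("a", "A" * 400 + "GATCT"),  # long enough and contains GATCT
        ("b", "GATCT"),              # contains GATCT but too short
        ("c", "A" * 380),            # long enough but lacks GATCT
    ]
    selected = sample(records, min_length=375, contains_fragment="GATCT")
    print([name for name, _ in selected])
```

The usage at the bottom mirrors the pre-filled example (Minimum Length 375 plus Contains “GATCT”); of the three made-up records, only the first passes both filters.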

In the Choose Sequence(s) button row, enter the sequence you wish to sample (see Add and modify a sequence).

In the Save Results As area, choose the name and location in which to save the output (see Specify output format and location).

Use the Format drop-down menu to select the file type for the output file.

Example #1 input:

Example #1 output:

A .fasta file containing the name and length of each output sequence (sequences #1, 3, 5, 7 and 9), followed by the sequence itself.
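A selection of sequences #1, 3, 5, 7 and 9 is what a Sample Every filter would produce with a value of 2, if counting starts at the first sequence and the source holds nine or ten sequences. Both the value of 2 and the counting rule are assumptions made for illustration, since the input settings are not reproduced here. A minimal Python sketch of that rule, with a hypothetical helper name:

```python
def sample_every(items, n):
    """Return every nth item, starting with the first one."""
    if not isinstance(n, int) or n < 1:
        raise ValueError("n must be a positive integer")
    return items[::n]


if __name__ == "__main__":
    # Positions of ten sequences in a made-up source file, numbered from 1.
    positions = list(range(1, 11))
    print(sample_every(positions, 2))
```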

Example #2 input:

Example #2 output:

All sequences from contig01.fas that contain the sequence segment “TTGTT” have the bases “ATG” added to the beginnings of their sequences and “TAG” added to the ends of their sequences.
