Make a Custom BED File

If you chose Exome and Gene Panel from the Choose Assembly Workflow screen, you may import a targeted regions file.

 

BED files are used to define capture regions in the assembly, and can be generated by the sequence provider or made by hand. They are tab-separated text files saved with the .bed extension. See the UCSC Genome Bioinformatics BED file page for detailed information.

 

The BED file can consist of multiple sections, each with a different track name. Text is allowed between the tables without restriction.

 

Tables must follow all of the rules below:

 

      A header row (the one below appears in blue and black) is optional and can contain any text.

 

      The first three columns must be included, and in the same order as shown below.

 

      All cells in the first three columns must be filled.

 

      Additional table columns are allowed, but will be ignored.

 

      IMPORTANT: Each table in the file must be primarily sorted by the first column, and secondarily sorted by the second column. Make sure to sort the columns numerically (1, 2, 3…) and not alphabetically (1, 11, 12…).

 

Note: If only chromosome 1 (and possibly 11) appears in SeqMan Pro’s “Coverage of Targeted Regions” report (Project > Show coverage of target regions), the table is probably sorted incorrectly.
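The sorting rule can be illustrated with a short Python sketch. This is only an illustration and not part of SeqMan NGen. It assumes a single table, tab-separated rows, and chromosome names that are either plain numbers or numbers with a chr or ch prefix. The function names chrom_sort_key and sort_bed_rows are hypothetical. Rows whose start and end columns are not integers (for example, a header row) are dropped from the output.

```python
import re

def chrom_sort_key(name):
    """Sort numbered chromosomes numerically; other names sort after them."""
    # Accept "1", "chr1" or "ch1" (case-insensitive) as chromosome number 1.
    match = re.fullmatch(r"(?:chr|ch)?(\d+)", name, flags=re.IGNORECASE)
    if match:
        return (0, int(match.group(1)), "")
    # Non-numeric names (e.g. "chrX") are ordered alphabetically at the end.
    return (1, 0, name)

def sort_bed_rows(lines):
    """Return data rows sorted by chromosome, then numerically by chromStart."""
    rows = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Keep only rows with at least three columns and integer start/end
        # values; header rows and free text are skipped in this sketch.
        if len(fields) >= 3 and fields[0] and fields[1].isdigit() and fields[2].isdigit():
            rows.append(fields)
    rows.sort(key=lambda f: (chrom_sort_key(f[0]), int(f[1])))
    return ["\t".join(f) for f in rows]

example = [
    "chrom\tchromStart\tchromEnd\n",
    "11\t500\t900\n",
    "2\t1000\t2000\n",
    "1\t300\t400\n",
    "1\t20\t150\n",
]
sorted_rows = sort_bed_rows(example)
for row in sorted_rows:
    print(row)
```

With the example rows, chromosome 2 is placed before chromosome 11, and the two chromosome 1 rows are ordered by start position (20 before 300). An alphabetical sort would instead place 11 before 2.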

 

chrom (required): The name of the chromosome or scaffold. Numbers are preferred, but chr or ch prefixes are allowed.

chromStart (required): Starting position for the feature. BED intervals use a 0-based start position, and the end position is not included in the feature. Therefore, bases 1-100 of a chromosome are represented in a BED file as 0-100 (chromStart = 1 - 1 = 0, chromEnd = 100).

chromEnd (required): Ending position for the feature.

Column 4 and beyond (ignored): Data in these columns are ignored.
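The coordinate convention for chromStart and chromEnd can be checked with a small Python sketch. The helper names to_bed_interval and from_bed_interval are hypothetical and are used here only to show the arithmetic for converting between a 1-based, inclusive base range and a BED interval.

```python
def to_bed_interval(first_base, last_base):
    """Convert a 1-based, inclusive base range to BED (chromStart, chromEnd)."""
    if first_base < 1 or last_base < first_base:
        raise ValueError("expected 1 <= first_base <= last_base")
    # chromStart is 0-based, so subtract 1. chromEnd is exclusive, so it
    # equals the 1-based position of the last base.
    return first_base - 1, last_base

def from_bed_interval(chrom_start, chrom_end):
    """Convert BED (chromStart, chromEnd) back to a 1-based, inclusive range."""
    return chrom_start + 1, chrom_end

print(to_bed_interval(1, 100))    # bases 1-100 become the BED interval 0-100
print(from_bed_interval(0, 100))  # and the BED interval 0-100 covers bases 1-100
```

A useful consequence of this convention is that the length of a region is simply chromEnd minus chromStart (100 - 0 = 100 bases in the example).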

 

Note: SeqMan NGen can read and produce output using a variety of common chromosome naming conventions, including “chr1” and “ch1,” as well as Arabic and Roman numerals. Chromosome names are captured from genome template packages and used to assign contig IDs to entries from BED, VCF and manifest files.