Work with SNP Data

SNP (single-nucleotide polymorphism) data in ArrayStar is used to compare differences found in genomic (or exomic) data, usually through sequence assembly. ArrayStar does not calculate this data itself. Instead, it loads SNP data from SeqMan NGen assemblies, SeqMan SNP table exports, or tabular text files that you provide.

 

ArrayStar's SNP support is oriented around comparing multiple assemblies or samples as experiments. All of the samples must correspond to the same "coordinate space," meaning the set of chromosomes, their names, and the coordinates of SNPs and genes on the chromosomes must all be the same. You may still load different strains or variants with different genomes. However, before loading such data, you must first align or assemble them to the same reference genome.
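The "same coordinate space" requirement can be checked before loading data. The following is a minimal sketch, not part of ArrayStar: the function name, the dictionary layout, and the use of 1-based positions are all assumptions made for illustration. It reports any SNP whose chromosome name or position does not fit the shared reference.

```python
from typing import Dict, List, Tuple

# Hypothetical layouts, chosen only for this illustration:
#   reference:   chromosome name -> chromosome length in bases
#   sample_snps: sample name -> list of (chromosome name, 1-based position)
ChromosomeSizes = Dict[str, int]
SampleSnps = Dict[str, List[Tuple[str, int]]]


def coordinate_space_problems(reference: ChromosomeSizes,
                              sample_snps: SampleSnps) -> List[str]:
    """Return readable problems; an empty list means all samples fit the reference."""
    problems = []
    for sample, snps in sample_snps.items():
        for chrom, pos in snps:
            if chrom not in reference:
                # Typical cause: naming mismatch such as "1" versus "chr1".
                problems.append(f"{sample}: unknown chromosome {chrom!r}")
            elif not 1 <= pos <= reference[chrom]:
                problems.append(
                    f"{sample}: {chrom}:{pos} is outside 1..{reference[chrom]}"
                )
    return problems
```

A sample that was assembled against a different reference will usually show up here as a chromosome naming mismatch or as positions beyond the end of a chromosome.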

 

A "SNP" in ArrayStar is not necessarily a literal SNP. A literal SNP is a single-base substitution or, sometimes, a single-base insertion or deletion (indel). In ArrayStar, each SNP corresponds to a single base position in the reference, but multiple insertions or deletions might have been made at that position.

 

SNPs can be classified as follows:

 

Presence of change

Change – There was some sort of nucleotide change at this position on the reference.

No change – There was no nucleotide change at this position on the reference.

Region affected

Intergenic – This change falls within the regions between genes.

Genic – This change falls within the boundaries of a gene.

Coding – This change falls within a coding region of the genome.

Non-coding RNA – This change occurs either in an untranslated region (UTR) of an mRNA or within a non-coding RNA gene (e.g., rRNA, tRNA, or miRNA).
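The region categories above can be illustrated with a small sketch. This is not ArrayStar's implementation: the Gene class, the inclusive 1-based intervals, and the choice to return several labels for one position are assumptions made for this example. A position inside an exon but outside the coding intervals is treated as a UTR or non-coding RNA position.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed inclusive, 1-based (start, end)


@dataclass
class Gene:
    """Hypothetical gene model used only for this illustration."""
    name: str
    span: Interval                                          # whole gene, introns included
    exons: List[Interval] = field(default_factory=list)     # transcribed and retained
    coding: List[Interval] = field(default_factory=list)    # empty for non-coding RNA genes


def _inside(pos: int, intervals: List[Interval]) -> bool:
    return any(start <= pos <= end for start, end in intervals)


def classify_region(pos: int, genes: List[Gene]) -> List[str]:
    """Return the region labels that apply to one reference position."""
    labels = []
    for gene in genes:
        if not _inside(pos, [gene.span]):
            continue
        labels.append("Genic")
        if _inside(pos, gene.coding):
            labels.append("Coding")
        elif _inside(pos, gene.exons):
            # Exonic but not coding: a UTR of an mRNA or a non-coding RNA gene.
            labels.append("Non-coding RNA")
    # Drop duplicates from overlapping genes while keeping the order stable.
    return list(dict.fromkeys(labels)) or ["Intergenic"]
```

For example, with an mRNA gene spanning 100..500 whose coding intervals start at 150, position 120 would be labeled Genic and Non-coding RNA (5' UTR), while a position in an intron would be labeled Genic only.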

Relationship to splice site location

Splice – This change falls within either a splice donor site or a splice acceptor site.

Translation

Synonymous – This change does not alter the encoded amino acid sequence in this (translated) coding region.

Non-synonymous – This change alters the encoded amino acid sequence in this (translated) coding region. There are five sub-categories of non-synonymous SNPs:

 

      Substitution – The replacement of one amino acid by another amino acid.

      No-start – This change results in the disappearance of the start codon.

      No-stop – This change results in the disappearance of the stop codon.

      Nonsense – This change results in a premature stop codon.

      Frameshift – This change results from insertion and/or deletion of a number of nucleotides that is not evenly divisible by three.

 

SNPs in RNAs that are not translated are not classified as either synonymous or non-synonymous. For example, SNPs in mRNAs can be classified as synonymous or non-synonymous, while SNPs in tRNAs, rRNAs, misc_RNAs, tmRNAs, miRNAs, and snRNAs cannot.
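The translation categories above can be sketched in code. This is an illustration under stated assumptions, not ArrayStar's method: it assumes the standard genetic code, an ATG start codon, complete coding sequences that contain only A, C, G, and T, and it uses the net length difference as a stand-in for "inserted and/or deleted nucleotides." The function name and the is_translated flag are hypothetical.

```python
from itertools import product
from typing import Optional

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate every complete codon; '*' marks a stop codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def classify_translation(ref_cds: str, alt_cds: str,
                         is_translated: bool = True) -> Optional[str]:
    """Classify the effect of a change on a coding sequence (CDS)."""
    if not is_translated:
        # tRNA, rRNA, miRNA, and similar: neither synonymous nor non-synonymous.
        return None
    ref_cds, alt_cds = ref_cds.upper(), alt_cds.upper()
    if (len(alt_cds) - len(ref_cds)) % 3 != 0:
        return "Frameshift"
    if len(alt_cds) != len(ref_cds):
        # In-frame indel: no sub-category is named above, so report the parent class.
        return "Non-synonymous"
    ref_protein, alt_protein = translate(ref_cds), translate(alt_cds)
    if ref_protein == alt_protein:
        return "Synonymous"
    if ref_cds[:3] == "ATG" and alt_cds[:3] != "ATG":
        return "No-start"
    first_stop = alt_protein.find("*")
    if first_stop == -1:
        return "No-stop"
    if first_stop < len(alt_protein) - 1:
        return "Nonsense"
    return "Substitution"
```

With the reference CDS ATGGCTTAA (Met, Ala, stop), changing GCT to GCC leaves the protein unchanged and is synonymous, while changing GCT to GTT replaces Ala with Val and is a substitution.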