RPKM-CN

RPKM-CN normalization (Krumm et al., 2012) is available for CNV experiments and is calculated as follows:

 

RPKM-CN = RPKM / median of the exon's RPKMs; where RPKM > 1

 

RPKM-CN estimates copy number as the ratio of an exon's RPKM to the median exon RPKM in the experiment. The result is a unitless ratio (or log ratio) indicating relative copy number, because the units cancel in the division.

 

RPKM is defined as R / (K × M), where R is the number of reads mapped to the exon, K is the exon length in kilobases, and M is the number of millions of mapped reads in the experiment. Because M is a constant, it cancels out of the ratio, so the value of RPKM-CN is determined by the reads (R) and length (K) of each exon relative to the median. M affects only the scaling used for the initial filtering of low-coverage exons (RPKM > 1).
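
As a rough illustration of the calculation, the following Python sketch computes RPKM-CN for a single experiment. It assumes the standard RPKM definition, R / (K × M), and it assumes that the median is taken over the exons in that experiment that pass the RPKM > 1 filter, with filtered exons receiving no value. The function names `rpkm` and `rpkm_cn` and the example numbers are hypothetical and are not taken from the software.

```python
import statistics


def rpkm(reads, length_bp, total_mapped_reads):
    """Standard RPKM: reads per kilobase of exon per million mapped reads."""
    k = length_bp / 1_000               # exon length in kilobases (K)
    m = total_mapped_reads / 1_000_000  # millions of mapped reads (M)
    return reads / (k * m)


def rpkm_cn(rpkms, min_rpkm=1.0):
    """Divide each exon's RPKM by the median RPKM of the exons kept.

    Assumption: exons with RPKM <= min_rpkm are treated as low coverage,
    excluded from the median, and reported as None.
    """
    kept = [v for v in rpkms if v > min_rpkm]
    if not kept:
        return [None] * len(rpkms)
    med = statistics.median(kept)
    return [v / med if v > min_rpkm else None for v in rpkms]


# Four hypothetical 1 kb exons in an experiment with 1 million mapped reads.
values = [rpkm(r, 1_000, 1_000_000) for r in (200, 400, 100, 1)]
print(rpkm_cn(values))
```

In this sketch, doubling `total_mapped_reads` halves every RPKM but leaves the ratios of the retained exons unchanged. That is the sense in which M drops out and matters only for the RPKM > 1 filter.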

 

We recommend RPKM-CN only if you do not have enough samples to estimate a reliable standard deviation for each exon, which the zRPKM normalization method requires. Otherwise, zRPKM is the preferred method for CNV workflows.

 

Note: Following a method from NCBI, any item with a linear value of zero is automatically set to half of the lowest positive value in the set.
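
The zero-handling rule in the note can be sketched in Python as follows. This is a minimal illustration, not the software's implementation: the function name `replace_zeros` is hypothetical, and leaving the values unchanged when the set contains no positive value is an assumption.

```python
def replace_zeros(values):
    """Set each zero to half of the lowest positive value in the set."""
    positives = [v for v in values if v > 0]
    if not positives:
        # Assumption: with no positive value to scale from, leave as-is.
        return list(values)
    floor = min(positives) / 2
    return [floor if v == 0 else v for v in values]


print(replace_zeros([0, 4, 2, 0]))
```

Replacing zeros in this way keeps every value positive, so a log ratio can still be computed for each item.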