Convention 1B

Imagine a variant of Convention 1A in which the number of digits preceding the “f” or “r” is not constant, and in which some reads have the extension “*.abi” and others “*.ab1”. In that case you may want to use the expressions:

 

Forward Name: (\d+)f\.ab[i1]
Reverse Name: (\d+)r\.ab[i1]

 

Now one or more digits can precede the f or r, and for the purposes of defining a pair it does not matter whether the extension is “.abi” or “.ab1”, even when the two members of the same pair have different extensions. (Recall that only the parts of the expression in parentheses must match to define a pair. If you put parentheses around the part of the expression following the f or r, i.e. “(\.ab[i1])” instead of “\.ab[i1]”, both members of a pair must have the same extension to be recognized as a pair.)
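The pairing rule described above can be tried out with a short Python sketch. This is only an illustration under stated assumptions, not the program's own implementation: the file names are hypothetical, the helper names `pair_key` and `is_pair` are invented for this example, and the sketch assumes that an expression must match the whole file name (`re.fullmatch`), which may differ from how the program itself applies expressions.

```python
import re

# Expressions from the text above: only the digits are captured.
FORWARD = re.compile(r"(\d+)f\.ab[i1]")
REVERSE = re.compile(r"(\d+)r\.ab[i1]")

# Variant with the extension captured as well (parentheses added).
STRICT_FORWARD = re.compile(r"(\d+)f(\.ab[i1])")
STRICT_REVERSE = re.compile(r"(\d+)r(\.ab[i1])")


def pair_key(pattern, name):
    """Return the captured groups for name, or None if it does not match.

    Assumption: the expression must match the entire file name.
    """
    match = pattern.fullmatch(name)
    return match.groups() if match else None


def is_pair(forward_name, reverse_name, fwd=FORWARD, rev=REVERSE):
    """Two reads form a pair when both match and their captured groups agree."""
    f_key = pair_key(fwd, forward_name)
    r_key = pair_key(rev, reverse_name)
    return f_key is not None and f_key == r_key


# Hypothetical file names, chosen only for illustration.
print(is_pair("7f.abi", "7r.ab1"))        # True: digits agree, extension ignored
print(is_pair("1234f.ab1", "1234r.ab1"))  # True: any number of digits is allowed
print(is_pair("7f.abi", "8r.abi"))        # False: captured digits differ
print(is_pair("7f.abi", "7r.ab1", STRICT_FORWARD, STRICT_REVERSE))  # False
```

In the last call the extension sits inside parentheses, so “.abi” and “.ab1” are compared as part of the pair definition and the two reads no longer pair.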

 

For Convention 1B, a simpler expression could be used instead:

 

Forward Name: (\d+)f.*
Reverse Name: (\d+)r.*

 

This simpler expression allows the two members of a pair to have completely different extensions. Be cautious when using such a simplified expression: because “.*” matches any text after the f or r, some naming systems may produce pairs you did not intend.
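As a rough illustration of why caution is needed, the Python sketch below applies the simplified expressions to a few hypothetical file names. The names, the helper functions `captured` and `is_pair`, and the whole-name matching via `re.fullmatch` are all assumptions made for this example; the program's own matching behavior may differ.

```python
import re

# Simplified expressions from the text above.
SIMPLE_FORWARD = re.compile(r"(\d+)f.*")
SIMPLE_REVERSE = re.compile(r"(\d+)r.*")


def captured(pattern, name):
    """Return the captured groups for name, or None if it does not match.

    Assumption: the expression must match the entire file name.
    """
    match = pattern.fullmatch(name)
    return match.groups() if match else None


def is_pair(forward_name, reverse_name):
    """Two names pair when both match and the captured digits agree."""
    f_key = captured(SIMPLE_FORWARD, forward_name)
    r_key = captured(SIMPLE_REVERSE, reverse_name)
    return f_key is not None and f_key == r_key


# Hypothetical file names, chosen only for illustration.
print(is_pair("12f.ab1", "12r.scf"))      # True: extensions may differ freely
print(is_pair("12final.txt", "12r.ab1"))  # True: probably not an intended pair
print(is_pair("12f.ab1", "13r.ab1"))      # False: captured digits differ
```

The second call shows the risk: any name that starts with digits followed by “f” is treated as a forward read, whatever comes after it.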