The regular expressions below may be useful when writing paired end naming specifications. This is not a complete list of regular expression syntax, and the definitions given here are limited to how the terms apply to SeqMan NGen paired end naming specifications.

Special Characters
[ ] Character class used to enclose a list of alternatives. For example:

[Aa]bc matches abc and Abc.

If the first character is a caret (^), the class matches any character except those on the list. Thus: [^a]bc matches xbc but not abc.
\ A switch that makes special characters literal and literal characters special. For example, \. matches a literal period, while \d matches any digit.
( ) Grouping--used to delimit a string comprising a “phrase.” Phrases are necessary in paired end specification so you can match a pair of forward and reverse reads while still distinguishing their orientation. In SeqMan NGen, phrases in parentheses must match for two reads to qualify as a pair; phrases outside the parentheses are used to distinguish members of the same pair.
\d Any digit (0-9)
\D Any non-digit character.
\w Any alphanumeric “word” character (including “_”)
. Any character
| Alternate--either the term before “|” or after “|”
^ Match at the beginning of the line only.
$ Match at the end of the line only.
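
The special characters above can be tried out in any regular expression engine. The sketch below uses Python's `re` module as a stand-in; it is an assumption that SeqMan NGen treats each pattern the same way, and the helpers `matches` and `contains` are hypothetical names introduced only for this illustration.

```python
import re


def matches(pattern: str, name: str) -> bool:
    """Hypothetical helper: True if the entire name matches the pattern."""
    return re.fullmatch(pattern, name) is not None


def contains(pattern: str, name: str) -> bool:
    """Hypothetical helper: True if the pattern matches somewhere in the name."""
    return re.search(pattern, name) is not None


# Each entry pairs a special character with a check taken from the list above.
checks = {
    "character class": matches(r"[Aa]bc", "abc") and matches(r"[Aa]bc", "Abc"),
    "negated class": matches(r"[^a]bc", "xbc") and not matches(r"[^a]bc", "abc"),
    "escaped period": matches(r"a\.b", "a.b") and not matches(r"a\.b", "axb"),
    "any character": matches(r"a.b", "axb"),
    "digit, non-digit, word": matches(r"\d\D\w", "7-_"),
    "alternate": matches(r"a|b", "a") and matches(r"a|b", "b"),
    "start of line": contains(r"^abc", "abcdef") and not contains(r"^abc", "xabc"),
    "end of line": contains(r"abi$", "read1.abi") and not contains(r"abi$", "abi.read1"),
}

for label, ok in checks.items():
    print(f"{label}: {ok}")
```

Every check should print `True` under Python's `re`; a different result in another engine would indicate a dialect difference rather than an error in the pattern.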

Numerical Modifiers
* 0 or more
+ 1 or more
? 1 or 0
{n} Exactly n
{n,} At least n
{n,m} At least n but not more than m
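
The numerical modifiers apply to the character or group immediately before them. The following is a minimal sketch using Python's `re` module, on the assumption that SeqMan NGen interprets the modifiers the same way; `matches` is a hypothetical helper, not part of any SeqMan NGen interface.

```python
import re


def matches(pattern: str, name: str) -> bool:
    """Hypothetical helper: True if the entire name matches the pattern."""
    return re.fullmatch(pattern, name) is not None


# One check per modifier, in the order they are listed above.
checks = {
    "* (0 or more)": matches(r"ab*", "a") and matches(r"ab*", "abbb"),
    "+ (1 or more)": matches(r"ab+", "ab") and not matches(r"ab+", "a"),
    "? (1 or 0)": matches(r"ab?c", "ac") and matches(r"ab?c", "abc")
    and not matches(r"ab?c", "abbc"),
    "{n} (exactly n)": matches(r"\d{3}", "123") and not matches(r"\d{3}", "12"),
    "{n,} (at least n)": matches(r"\d{2,}", "12345") and not matches(r"\d{2,}", "1"),
    "{n,m} (n to m)": matches(r"\d{2,4}", "1234") and not matches(r"\d{2,4}", "12345"),
}

for label, ok in checks.items():
    print(f"{label}: {ok}")
```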

Example Expressions and Their Meanings
d Literally the letter d
\d Any digit (0-9)
\d* Zero or more digits
\d+ One or more digits
(\d+) A phrase comprising one or more digits--same as “\d+”, but causes SeqMan NGen to pair read names by the string inside the phrase, even when other characters in the names do not match.
\. Literally the period symbol (.)
. Any character
.+ One or more of any characters
.* Zero or more of any characters
a|b a OR b
ab[i1] abi or ab1
abi$ Ends with abi
[\.\d] A period OR a digit
[abc] a OR b OR c
[abc]+ One or more characters from the set a, b, c
.*f Any number of any characters followed by the letter “f”
(.*)f A phrase comprising any number of any characters, followed by the letter “f”--same as “.*f”, but causes SeqMan NGen to match on the phrase in parentheses without matching the “f” in a read name
(\D+)r(\d+) One or more non-digit characters followed by “r” followed by one or more digits.
(\d{2,4})f(\.abi) Two, three, or four digits followed by “f” followed by “.abi”
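
To illustrate how phrases in parentheses decide whether two reads form a pair, the sketch below compares the captured phrases of a forward and a reverse read name. It uses Python's `re` module and is not SeqMan NGen's actual implementation. The forward pattern is the last example above; the reverse pattern (with “r” in place of “f”), the read names, the whole-name matching, and the helpers `pair_key` and `is_pair` are all assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Forward pattern from the examples above; the reverse pattern is an
# assumed counterpart that differs only in the letter outside the phrases.
FORWARD = re.compile(r"(\d{2,4})f(\.abi)")
REVERSE = re.compile(r"(\d{2,4})r(\.abi)")


def pair_key(pattern: "re.Pattern[str]", name: str) -> Optional[Tuple[str, ...]]:
    """Return the phrases captured in parentheses, or None if the name does not match."""
    m = pattern.fullmatch(name)
    return m.groups() if m else None


def is_pair(forward_name: str, reverse_name: str) -> bool:
    """Two reads pair when both names match and their captured phrases are identical."""
    fwd = pair_key(FORWARD, forward_name)
    rev = pair_key(REVERSE, reverse_name)
    return fwd is not None and fwd == rev


print(pair_key(FORWARD, "1234f.abi"))     # captured phrases for the forward read
print(is_pair("1234f.abi", "1234r.abi"))  # same digits and extension
print(is_pair("1234f.abi", "1235r.abi"))  # digits differ
print(is_pair("1f.abi", "1r.abi"))        # too few digits for \d{2,4}
```

The “f” and “r” sit outside the parentheses, so they distinguish the two members of a pair without affecting whether the names are judged to belong together.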
