Regular expressions are a concise and flexible notation for finding and
replacing patterns of text. Regular expression support in ExamDiff Pro is
based on the Boost Regular Expression Library.
ExamDiff Pro uses Perl regular expression syntax.
| Expression |
Syntax |
Description |
| Literals |
|
All characters are literals except: ".", "|", "*", "?", "+",
"(", ")", "{", "}", "[", "]", "^", "$" and "\". These characters are literals
when preceded by a "\". |
| Any Character |
. |
Matches any one character. |
| Zero or More |
* |
Matches zero or more occurrences of the preceding expression.
For example, "ba*" matches "b", "ba", "baaa", and so on. |
| One or More |
+ |
Matches at least one occurrence of the preceding expression. For
example, "ba+" matches "ba" or "baaaa" but not "b". |
| Zero or One |
? |
Matches zero or one occurrence of the preceding expression. For
example, "ba?" will match "b" or "ba". |
| Number of Repeats |
{} |
Used to specify the minimum and maximum number of repeats. For
example, "a{2}" is the letter "a" repeated exactly twice, "a{2,4}" represents
the letter "a" repeated between 2 and 4 times, and "a{2,}" represents the letter
"a" repeated at least twice with no upper limit. |
| Grouping |
() |
Groups a subexpression. For example, the expression "(ab)*"
matches all of the string "ababab". |
| Or |
| |
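Matches the expression before or after the |. Mostly used within a group, because alternation binds more loosely than concatenation. For example, "(sponge|mud) bath" matches "sponge bath" and "mud bath", whereas "(sponge)|(mud) bath" matches "sponge" or "mud bath". |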
| Set of Characters |
[] |
Matches any one of the characters within the []. To specify a range of characters, list the starting and ending character separated by a dash (-), as in [a-z].
For example, "[abc]" matches any one of "a", "b", or "c". |
| Character Not in Set |
[^] |
Matches any one character not in the set of characters following the ^.
For example, "[^abc]" will match any character other than "a", "b", or "c". |
| Character Classes |
|
Character classes are denoted using the syntax "[:classname:]"
within a set declaration; for example, "[[:space:]]" is the set of all whitespace
characters. The available character classes are: |
| |
[:alnum:] |
Any alphanumeric character. |
| |
[:alpha:] |
Any alphabetical character a-z and A-Z. Other characters may
also be included depending upon the locale. |
| |
[:blank:] |
Any blank character, either a space or a tab. |
| |
[:cntrl:] |
Any control character. |
| |
[:digit:] or \d |
Any digit 0-9. |
| |
[:graph:] |
Any graphical character (a printable character other than a
space). |
| |
[:lower:] or \l |
Any lower case character a-z. Other characters may also be
included depending upon the locale. Note that this character class has to be
used in conjunction with the "Match case" option. |
| |
[:print:] |
Any printable character. |
| |
[:punct:] |
Any punctuation character. |
| |
[:space:] or \s |
Any whitespace character. |
| |
[:upper:] or \u |
Any upper case character A-Z. Other characters may also be
included depending upon the locale. Note that this character class has to be
used in conjunction with the "Match case" option. |
| |
[:xdigit:] |
Any hexadecimal digit character, 0-9, a-f and A-F. |
| |
[:word:] or \w |
Any word character - all alphanumeric characters plus the
underscore. |
| |
[:unicode:] |
Any character whose code is greater than 255. This applies to
the wide character traits classes only. |
| Character by Octal Code |
\0#### |
Escape character followed by the digit "0" followed by the octal
character code. For example, "\023" represents the character whose octal code is
23. Where ambiguity could occur, use parentheses to break the expression up:
"\0103" represents the character whose octal code is 103, whereas "(\010)3" represents the
character whose octal code is 10 followed by "3". Use with the "Match case" option. |
| Character by Hexadecimal Code |
\x#### |
\x followed by a string of hexadecimal digits, optionally
enclosed inside {}; for example, \xf0 or \x{aff}. Use with the "Match case" option. |
| Beginning of Line |
^ |
Anchors the match to the beginning of a line (significant only at the start of an expression). |
| End of Line |
$ |
Anchors the match to the end of a line (significant only at the end of an expression). |
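
The following is a minimal sketch for experimenting with the patterns in the table outside ExamDiff Pro. It uses Python's built-in `re` module as a stand-in, which is an assumption made for illustration: ExamDiff Pro itself uses the Boost library, and the two engines agree only on the common Perl-style syntax exercised here. In particular, Python's `re` does not support the "[:classname:]" classes, "\l", "\u", or the "\x{...}" form, so those are left out. The names `CASES` and `matches_whole` are hypothetical helpers, and `re.MULTILINE` is a Python flag used to make "^" and "$" work per line.

```python
import re

# Each entry: (pattern, text, whether the pattern should match the whole text).
CASES = [
    (r"ba*", "b", True),          # zero or more "a"
    (r"ba*", "baaa", True),
    (r"ba+", "b", False),         # at least one "a" is required
    (r"ba+", "baaaa", True),
    (r"ba?", "ba", True),         # zero or one "a"
    (r"ba?", "baa", False),
    (r"a{2}", "aa", True),        # exactly two
    (r"a{2,4}", "aaaaa", False),  # five is more than the maximum of four
    (r"a{2,}", "aaaaa", True),    # two or more, no upper limit
    (r"(ab)*", "ababab", True),   # the group is repeated as a unit
    (r"(sponge|mud) bath", "mud bath", True),
    # Without a shared group, "|" splits the whole expression in two:
    (r"(sponge)|(mud) bath", "sponge bath", False),
    (r"[abc]", "b", True),        # any one character from the set
    (r"[^abc]", "b", False),      # any one character NOT in the set
    (r"[^abc]", "d", True),
    (r".", "x", True),            # "." matches any one character
    (r"\.", "x", False),          # escaped "." is a literal dot
    (r"\.", ".", True),
    (r"\d+", "2024", True),       # digits
    (r"\w+", "snake_case", True), # word characters include "_"
    (r"\s", "\t", True),          # whitespace
]


def matches_whole(pattern: str, text: str) -> bool:
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


results = [matches_whole(pattern, text) for pattern, text, _ in CASES]

for (pattern, text, expected), got in zip(CASES, results):
    status = "ok" if got == expected else "UNEXPECTED"
    print(f"{status:10} {pattern!r:26} {text!r:14} -> {got}")

# Line anchors: with re.MULTILINE, "^" and "$" apply to every line.
sample = "one two\nthree four"
line_starts = re.findall(r"^\w+", sample, re.MULTILINE)
line_ends = re.findall(r"\w+$", sample, re.MULTILINE)
print(line_starts, line_ends)
```

Whole-text matching is used so that each result depends only on the pattern, not on where a partial match might begin. A search in ExamDiff Pro may instead find a match anywhere within a line, so treat these results as a guide to the syntax rather than a prediction of the program's behavior.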