Regular Expressions

Regular Expressions are a concise and flexible notation for finding and replacing patterns of text. Regular expression support in ExamDiff Pro is based on the Boost Regular Expression Library. ExamDiff Pro uses Perl regular expression syntax.

The table below lists the regular expression syntax that can be entered in the Find or Replace boxes, used to ignore lines or line parts in Compare Options, or used to define comments in Document Types Options.
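As a rough illustration of the "ignore lines" use case, the sketch below drops lines that match a pattern before two texts are compared. It uses Python's re module as a stand-in for the Boost library, and the helper name strip_ignored, the sample lines, and the pattern are all hypothetical; it is not a description of how ExamDiff Pro implements the option.

```python
import re

def strip_ignored(lines, pattern):
    """Return the lines that do NOT match `pattern` (hypothetical helper)."""
    rx = re.compile(pattern)
    return [line for line in lines if not rx.search(line)]

# Hypothetical sample data: the two texts differ only in a timestamp comment.
left = ["# generated 2024-01-01", "x = 1", "y = 2"]
right = ["# generated 2024-02-15", "x = 1", "y = 2"]

# Ignore any line consisting of "# generated " followed by a date.
ignore = r"^# generated \d{4}-\d{2}-\d{2}$"

same_after_ignoring = strip_ignored(left, ignore) == strip_ignored(right, ignore)
print(same_after_ignoring)
```

With the timestamp lines filtered out, the remaining lines are identical, so the comparison reports no difference.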

For more information and tutorials on regular expressions, visit Regular-Expressions.info.

Expression Syntax Description
Literals   All characters are literals except: ".", "|", "*", "?", "+", "(", ")", "{", "}", "[", "]", "^", "$" and "\". These characters are literals when preceded by a "\".
Any Character . Matches any one character.
Zero or More * Matches zero or more occurrences of the preceding expression. For example, "ba*" will match "b", "ba", "baaa", and so on.
One or More + Matches at least one occurrence of the preceding expression. For example, "ba+" will match "ba" or "baaaa" but not "b".
Zero or One ? Matches zero or one occurrence of the preceding expression. For example, "ba?" will match "b" or "ba".
Number of Repeats {} Used to specify the minimum and maximum number of repeats. For example, "a{2}" is the letter "a" repeated exactly twice, "a{2,4}" represents the letter "a" repeated between 2 and 4 times, and "a{2,}" represents the letter "a" repeated at least twice with no upper limit.
Grouping () Groups a subexpression. For example, the expression "(ab)*" would match all of the string "ababab".
Or | Matches the expression before or after the |. Mostly used within a group. For example, "(sponge|mud) bath" matches "sponge bath" and "mud bath".
Set of Characters [] Matches any one of the characters within the []. To specify a range of characters, list the starting and ending character separated by a dash (-), as in [a-z]. For example, "[abc]" will match any one of "a", "b", or "c".
Character Not in Set [^] Matches any character not in the set of characters following the ^. For example, "[^abc]" will match any character other than "a", "b", or "c".
Character Classes   Character classes are denoted using the syntax "[:classname:]" within a set declaration. For example, "[[:space:]]" is the set of all whitespace characters. The available character classes are:
  [:alnum:] Any alpha-numeric character.
  [:alpha:] Any alphabetical character a-z and A-Z. Other characters may also be included depending upon the locale.
  [:blank:] Any blank character, either a space or a tab.
  [:cntrl:] Any control character.
  [:digit:] or \d Any digit 0-9.
  [:graph:] Any graphical character (a printable character other than a space).
  [:lower:] or \l Any lower case character a-z. Other characters may also be included depending upon the locale. Note that this character class has to be used in conjunction with the "Match case" option.
  [:print:] Any printable character.
  [:punct:] Any punctuation character.
  [:space:] or \s Any whitespace character.
  [:upper:] or \u Any upper case character A-Z. Other characters may also be included depending upon the locale. Note that this character class has to be used in conjunction with the "Match case" option.
  [:xdigit:] Any hexadecimal digit character, 0-9, a-f and A-F.
  [:word:] or \w Any word character - all alphanumeric characters plus the underscore.
  [:unicode:] Any character whose code is greater than 255. This applies to the wide character traits classes only.
Character by Octal Code \0#### Escape character followed by the digit "0" followed by the octal character code. For example, "\023" represents the character whose octal code is 23. Where ambiguity could occur, use parentheses to break the expression up: "\0103" represents the character whose code is 103, while "(\010)3" represents the character 10 followed by "3". Use with the "Match case" option.
Character by Hexadecimal Code \x#### \x followed by a string of hexadecimal digits, optionally enclosed inside {}, for example \xf0 or \x{aff}. Use with the "Match case" option.
Beginning of Line ^ Starts the match at the beginning of a line (significant only at the start of an expression).
End of Line $ Anchors the match to the end of a line (significant only at the end of an expression).
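The examples in the table can be checked with a short script. The sketch below uses Python's re module as a stand-in, on the assumption that it agrees with the Perl syntax described above for these basic constructs. It does not use the Boost library itself, and constructs such as "[[:space:]]", \l, \u, and \x{...} are left out because Python's re module handles them differently. The names cases and whole_match are illustrative only.

```python
import re

# Each entry: (pattern, text, whether the pattern should match the WHOLE text)
cases = [
    (r"ba*", "b", True),           # zero or more
    (r"ba*", "baaa", True),
    (r"ba+", "b", False),          # one or more needs at least one "a"
    (r"ba+", "baaaa", True),
    (r"ba?", "ba", True),          # zero or one
    (r"a{2}", "aa", True),         # exactly two repeats
    (r"a{2,4}", "aaaaa", False),   # five repeats exceed the maximum of four
    (r"a{2,}", "aaaaa", True),     # at least two, no upper limit
    (r"(ab)*", "ababab", True),    # grouping
    (r"(sponge|mud) bath", "mud bath", True),  # alternation inside a group
    (r"[abc]", "b", True),         # set of characters
    (r"[^abc]", "b", False),       # character not in set
    (r"[a-z]\d\s\w", "x7 _", True),  # range, digit, whitespace, word character
    (r"\.", ".", True),            # escaped literal
    (r".", "x", True),             # any character
]

def whole_match(pattern, text):
    """True if `pattern` matches all of `text`."""
    return re.fullmatch(pattern, text) is not None

results = [whole_match(p, t) for p, t, _ in cases]
for (pattern, text, expected), got in zip(cases, results):
    print(f"{pattern!r} on {text!r}: {got} (expected {expected})")

# Anchors are shown with a search rather than a whole-string match.
starts_with_foo = re.search(r"^foo", "foo bar") is not None
ends_with_foo = re.search(r"foo$", "foo bar") is not None
print(starts_with_foo, ends_with_foo)
```

A whole-string match is used so that each case tests the complete text; in the Find box a pattern would normally match anywhere in a line, which is why the anchor examples use a search instead.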
