regular expression
[′reg·yə·lər ik′spresh·ən]
(text, operating system) An ordinary character (not one of the special characters discussed below) matches that character.
A backslash (\) followed by any special character matches the special character itself. The special characters are:
"." matches any character except NEWLINE; "RE*" (where the "*" is called the "Kleene star") matches zero or more occurrences of RE. If there is any choice, the longest leftmost matching string is chosen, in most regexp flavours.
"^" at the beginning of an RE matches the start of a line and "$" at the end of an RE matches the end of a line.
[string] matches any one character in that string. If the first character of the string is a "^" it matches any character except the remaining characters in the string (and also usually excluding NEWLINE). "-" may be used to indicate a range of consecutive ASCII characters.
\( RE \) matches whatever RE matches and \n, where n is a digit, matches whatever was matched by the RE between the nth \( and its corresponding \) earlier in the same RE. Many flavours use ( RE ) instead of \( RE \).
The concatenation of REs is an RE that matches the concatenation of the strings matched by each RE. RE1 | RE2 matches whatever RE1 or RE2 matches.
\< matches the beginning of a word and \> matches the end of a word. In many flavours of regexp, \> and \< are replaced by "\b", the special character for "word boundary".
RE\{m\} matches m occurrences of RE. RE\{m,\} matches m or more occurrences of RE. RE\{m,n\} matches between m and n occurrences.
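The constructs listed above can be tried out in any regexp flavour. The following is a minimal sketch, assuming Python's re module as the flavour; Python writes groups and intervals without backslashes, ( RE ) and RE{m,n}, and uses "\b" in place of \< and \>. The sample strings and variable names are illustrative only.

```python
import re

# An escaped special character matches itself; "." matches any character
# except newline.
escaped_dot = bool(re.fullmatch(r"a\.c", "a.c")) and not re.fullmatch(r"a\.c", "abc")
dot_skips_newline = re.fullmatch(r"a.c", "a\nc") is None

# "*" repeats the preceding RE zero or more times and takes as much as it can.
star = re.search(r"ab*", "abbbc").group()

# With re.MULTILINE, "^" and "$" anchor at every line, not only at the ends
# of the whole string.
text = "first line\nsecond line"
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
line_ends = re.findall(r"\w+$", text, re.MULTILINE)

# Bracket expressions: a set, then a negated range.
vowels = re.findall(r"[aeiou]", "regular")
non_lower = re.findall(r"[^a-z]", "ab1-c2")

# Grouping with a back-reference: \1 must repeat what group 1 matched.
doubled = re.search(r"(\w+) \1", "it is is here").group()

# Alternation, word boundaries, and interval counts.
either = [bool(re.fullmatch(r"cat|dog", w)) for w in ("cat", "dog", "cow")]
whole_words = [m.start() for m in re.finditer(r"\bcat\b", "cat concat cats cat.")]
counts = [bool(re.fullmatch(r"a{2,3}", "a" * n)) for n in range(1, 5)]
```

Here doubled picks out "is is", the only place where a word is immediately repeated, and whole_words holds the offsets of "cat" only where it stands as a separate word.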
The exact details of how regexp will work in a given application vary greatly from flavour to flavour. A comprehensive survey of regexp flavours is found in Friedl 1997 (see below).
[Jeffrey E.F. Friedl, "Mastering Regular Expressions", O'Reilly, 1997].
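Two such differences can be seen in a short sketch, again assuming Python's re module as the flavour under test. POSIX-style matchers choose the longest match at the leftmost position, whereas Python tries alternatives in order and keeps the first that succeeds. Python also has no \< or \>; a backslash before "<" simply yields a literal "<". The variable names are illustrative only.

```python
import re

# Ordered alternation: the first alternative that succeeds wins, even when a
# later alternative would match a longer string at the same position.
first_wins = re.search(r"a|ab", "ab").group()
longer_first = re.search(r"ab|a", "ab").group()

# "\<" is not a word-start anchor in Python; it matches a literal "<".
literal_angle = re.search(r"\<cat", "<cat") is not None
no_word_start = re.search(r"\<cat", "cat") is None

# The word-boundary escape "\b" is the closest Python equivalent.
word_start = re.search(r"\bcat", "a cat") is not None
```

Under leftmost-longest rules both alternation patterns would match "ab"; in Python only the second does, so the order of alternatives can change the result.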
regular expression
(2) Concatenation - pattern A concatenated with B matches a match for A followed by a match for B.
Or - pattern A-or-B matches either a match for A or a match for B.
Closure - zero or more matches for a pattern.
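The three operators are enough to build a small matcher. The following is a minimal sketch, assuming a hypothetical tuple-free pattern representation (the classes Sym, Cat, Or and Star are invented here for illustration) and a simple backtracking search; it is not how production regexp engines or finite-automaton constructions work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    """A single literal symbol."""
    ch: str


@dataclass(frozen=True)
class Cat:
    """Concatenation: left followed by right."""
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    """Or: either left or right."""
    left: object
    right: object


@dataclass(frozen=True)
class Star:
    """Closure: zero or more matches for inner."""
    inner: object


def ends(p, s, i):
    """Yield every index j such that pattern p matches s[i:j]."""
    if isinstance(p, Sym):
        if i < len(s) and s[i] == p.ch:
            yield i + 1
    elif isinstance(p, Cat):
        for j in ends(p.left, s, i):
            yield from ends(p.right, s, j)
    elif isinstance(p, Or):
        yield from ends(p.left, s, i)
        yield from ends(p.right, s, i)
    elif isinstance(p, Star):
        yield i  # zero occurrences
        for j in ends(p.inner, s, i):
            if j > i:  # ignore empty iterations so the recursion terminates
                yield from ends(p, s, j)


def matches(p, s):
    """Return True if pattern p matches the whole of string s."""
    return any(j == len(s) for j in ends(p, s, 0))


# (a|b)*c written with the three operators
pattern = Cat(Star(Or(Sym("a"), Sym("b"))), Sym("c"))
```

With this pattern, matches(pattern, "abac") and matches(pattern, "c") succeed, while matches(pattern, "abca") fails because the string does not end at the required "c".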
The earliest form of regular expressions (and the term itself) was invented by mathematician Stephen Cole Kleene in the mid-1950s, as a notation to easily manipulate "regular sets", formal descriptions of the behaviour of finite state machines, in regular algebra.
[S.C. Kleene, "Representation of events in nerve nets and finite automata", 1956, Automata Studies. Princeton].
[J.H. Conway, "Regular algebra and finite machines", 1971, Eds Chapman & Hall].
[Sedgewick, "Algorithms in C", page 294].