regexp − regular expression notation
A *regular expression* specifies a set of strings of
characters. A member of this set of strings is said to be
*matched* by the regular expression. In many applications a
delimiter character, commonly /, bounds a regular expression.
In the following specification for regular expressions the
word ‘character’ means any character (rune) but newline.
The syntax for a regular expression **e0** is
    e3:  literal | charclass | '.' | '^' | '$' | '(' e0 ')'
    e2:  e3
      |  e2 REP
    REP: '*' | '+' | '?'
    e1:  e2
      |  e1 e2
    e0:  e1
      |  e0 '|' e1
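The grammar maps directly onto a recursive-descent parser, one function per nonterminal. The following is a minimal sketch only, not the implementation of any particular library: the function name `parse` and the tuple tags (`alt`, `cat`, `rep`, `group`, `class`, `meta`, `lit`) are hypothetical, character classes are kept as raw text, and delimiter handling is left out.

```python
METACHARS = set(".*+?[]()|\\^$")


def parse(pattern):
    """Parse pattern per the e0..e3 grammar; return nested tuples."""
    pos = 0

    def peek():
        return pattern[pos] if pos < len(pattern) else None

    def take():
        nonlocal pos
        ch = pattern[pos]
        pos += 1
        return ch

    def e0():  # e0: e1 | e0 '|' e1
        node = e1()
        while peek() == "|":
            take()
            node = ("alt", node, e1())
        return node

    def e1():  # e1: e2 | e1 e2
        node = e2()
        while peek() not in (None, "|", ")"):
            node = ("cat", node, e2())
        return node

    def e2():  # e2: e3 | e2 REP
        node = e3()
        while peek() in ("*", "+", "?"):
            node = ("rep", take(), node)
        return node

    def e3():  # e3: literal | charclass | '.' | '^' | '$' | '(' e0 ')'
        ch = peek()
        if ch is None:
            raise ValueError("unexpected end of pattern")
        if ch == "(":
            take()
            node = e0()
            if peek() != ")":
                raise ValueError("missing ')'")
            take()
            return ("group", node)
        if ch == "[":
            return charclass()
        if ch in ".^$":
            return ("meta", take())
        if ch == "\\":
            take()
            if peek() is None:
                raise ValueError("trailing backslash")
            return ("lit", take())  # escaped metacharacter is a literal
        if ch in METACHARS:
            raise ValueError(f"unexpected {ch!r} at offset {pos}")
        return ("lit", take())

    def charclass():
        take()  # consume '['
        negated = peek() == "^"
        if negated:
            take()
        start = pos
        while peek() not in (None, "]"):
            if take() == "\\" and peek() is not None:
                take()  # skip the escaped character
        if peek() is None:
            raise ValueError("missing ']'")
        body = pattern[start:pos]
        take()  # consume ']'
        if not body:
            raise ValueError("empty character class")
        return ("class", negated, body)

    tree = e0()
    if pos != len(pattern):
        raise ValueError(f"unexpected {pattern[pos]!r} at offset {pos}")
    return tree


print(parse("ab|c*"))
```

The left-recursive rules are written as loops, which yields the same left-associative trees. Because `e1` is tried before `|` and `e2` before concatenation, repetition binds tightest, then concatenation, then alternation.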
A **literal** is any non-metacharacter, or a metacharacter
(one of .*+?[]()|\^$), or the delimiter preceded by \.
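As a brief illustration of escaping, the sketch below uses Python's `re` module. That is an assumption made for convenience: `re` is a different dialect from the notation specified here, but it treats a backslash before a metacharacter the same way.

```python
import re

# Unescaped, '.' is a metacharacter and matches any character.
dot_any = re.fullmatch(r"a.b", "axb") is not None
# Preceded by a backslash, '.' is a literal and matches only itself.
dot_literal = re.fullmatch(r"a\.b", "a.b") is not None
dot_rejects_x = re.fullmatch(r"a\.b", "axb") is None
# The same holds for other metacharacters, including '\' itself.
escaped_all = re.fullmatch(r"\(1\+1\)\\", "(1+1)\\") is not None

print(dot_any, dot_literal, dot_rejects_x, escaped_all)
# prints: True True True True
```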
A **charclass** is a nonempty string *s* bracketed [*s*] (or
[^*s*]); it matches any character in (or not in) *s*. A negated
character class never matches newline. A substring *a*-*b*, with
*a* and *b* in ascending order, stands for the inclusive range of
characters between *a* and *b*. In *s*, the metacharacters -, ],
an initial ^, and the regular expression delimiter must be
preceded by a \; other metacharacters have no special meaning
and may appear unescaped.
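The character class rules can be tried out with Python's `re` module, used here only as a stand-in dialect. The variable names are illustrative. Python diverges from this specification in at least one respect: its negated classes do match newline.

```python
import re

vowel = re.compile(r"[aeiou]")           # any one character in the set
non_digit = re.compile(r"[^0-9]")        # negated class built from a range
hex_digit = re.compile(r"[0-9a-fA-F]")   # several ranges in one class
dash_or_bracket = re.compile(r"[\-\]]")  # '-' and ']' escaped inside the class
punct = re.compile(r"[.*+?]")            # other metacharacters are ordinary here

in_class = vowel.fullmatch("e") is not None
negated_rejects_digit = non_digit.fullmatch("7") is None
range_is_inclusive = all(hex_digit.fullmatch(c) for c in "09afAF")
escaped_members = all(dash_or_bracket.fullmatch(c) for c in "-]")
star_is_plain = punct.fullmatch("*") is not None
# Divergence: in Python a negated class matches newline.
python_negated_matches_newline = non_digit.fullmatch("\n") is not None

print(in_class, negated_rejects_digit, range_is_inclusive,
      escaped_members, star_is_plain, python_negated_matches_newline)
# prints: True True True True True True
```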
A **.** matches any character.
A **^** matches the beginning of a line; **$** matches the end
of the line.
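A short sketch of the anchors and the dot, again assuming Python's `re` as a stand-in. The `re.MULTILINE` flag is needed there to make ^ and $ apply to every line rather than to the whole string; as in this specification, Python's dot does not match newline by default.

```python
import re

text = "first line\nsecond line"

# ^ anchors each match at the start of a line.
line_starts = re.findall(r"^[a-z]+", text, re.MULTILINE)
# $ anchors each match at the end of a line.
line_ends = re.findall(r"[a-z]+$", text, re.MULTILINE)
# '.' stops at the newline, so '^.+$' yields one match per line.
whole_lines = re.findall(r"^.+$", text, re.MULTILINE)

print(line_starts)  # ['first', 'second']
print(line_ends)    # ['line', 'line']
print(whole_lines)  # ['first line', 'second line']
```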
The **REP** operators match zero or more (*), one or more (+),
zero or one (?), instances respectively of the preceding regular
expression **e2**.
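The three repetition operators can be compared side by side. This sketch assumes Python's `re`, whose `*`, `+`, and `?` behave the same way for these simple cases; the helper `accepted` is hypothetical.

```python
import re

CANDIDATES = ("a", "ab", "abbb", "abab")


def accepted(pattern):
    """Return the candidates that the pattern matches in full."""
    return [s for s in CANDIDATES if re.fullmatch(pattern, s)]


star = accepted(r"ab*")        # zero or more b
plus = accepted(r"ab+")        # one or more b
optional = accepted(r"ab?")    # zero or one b
grouped = accepted(r"(ab)*")   # REP applied to a parenthesized e0

print(star)      # ['a', 'ab', 'abbb']
print(plus)      # ['ab', 'abbb']
print(optional)  # ['a', 'ab']
print(grouped)   # ['ab', 'abab']
```

Without parentheses the operator applies only to the single preceding item, which is why `ab*` rejects "abab" while `(ab)*` accepts it.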
A concatenated regular expression, **e1 e2**, matches a match to
**e1** followed by a match to **e2**.
An alternative regular expression, **e0|e1**, matches either a
match to **e0** or a match to **e1**.
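Since the grammar places concatenation below alternation, `ab|cd` groups as `(ab)|(cd)`, and parentheses are needed to alternate a single position. A small check, assuming Python's `re`, which shares this precedence; the helper name `matches` is illustrative.

```python
import re


def matches(pattern, s):
    """True if the whole string s is matched by pattern."""
    return re.fullmatch(pattern, s) is not None


# Concatenation binds tighter than '|'.
flat = {s: matches(r"ab|cd", s) for s in ("ab", "cd", "abd", "acd")}
# Parentheses restrict the alternation to one position.
grouped = {s: matches(r"a(b|c)d", s) for s in ("ab", "cd", "abd", "acd")}

print(flat)     # {'ab': True, 'cd': True, 'abd': False, 'acd': False}
print(grouped)  # {'ab': False, 'cd': False, 'abd': True, 'acd': True}
```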
A match to any part of a regular expression extends as
far as possible without preventing a match to the remainder
of the regular expression.
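For repetition, this rule can be observed with Python's `re`, assumed here purely for illustration. Python resolves alternation differently, by taking the first alternative that succeeds, so it is not a reliable guide to how | behaves under this rule.

```python
import re

# a* takes every available 'a' when nothing follows it in the pattern.
greedy = re.match(r"a*", "aaab").group()
# a* gives one 'a' back so that the trailing 'ab' can still match.
yields = re.match(r"a*ab", "aaab").group()
# .* runs to the last 'b' on the line, not the first.
last_b = re.match(r".*b", "abcb").group()

print(greedy)  # aaa
print(yields)  # aaab
print(last_b)  # abcb
```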
