A bracket expression is a list of characters enclosed by ‘[’ and ‘]’. It matches any single character in that list; if the first character of the list is the caret ‘^’, then it matches any character not in the list. For example, the regular expression ‘[0123456789]’ matches any single digit.
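As a rough illustration of the list and the negated list, here is a minimal sketch using Python's `re` module, which accepts the same basic bracket syntax for these two cases. Python is only a stand-in here, not grep, and the sample string is made up for the example.

```python
import re

# Any one character from the list.
digit = re.compile(r"[0123456789]")
# A leading caret negates the list: any one character NOT in it.
non_digit = re.compile(r"[^0123456789]")

text = "a1b22"  # made-up sample input
digits_found = digit.findall(text)
others_found = non_digit.findall(text)

print(digits_found)  # ['1', '2', '2']
print(others_found)  # ['a', 'b']
```

Each match is a single character, which is the point of the sketch: a bracket expression never matches more than one character at a time.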
Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive. In the default C locale, the sorting sequence is the native character order; for example, ‘[a-d]’ is equivalent to ‘[abcd]’. In other locales, the sorting sequence is not specified, and ‘[a-d]’ might be equivalent to ‘[abcd]’ or to ‘[aBbCcDd]’, or it might fail to match any character, or the set of characters that it matches might even be erratic. To obtain the traditional interpretation of bracket expressions, you can use the ‘C’ locale by setting the LC_ALL environment variable to the value ‘C’.
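The C-locale behavior described above can be sketched in Python, whose `re` module always orders ranges by code point and so resembles the native character order of the C locale. The helper `expand_range` is a hypothetical name introduced for this example; it does not model the locale-dependent orderings mentioned above.

```python
import re

def expand_range(first, last):
    """Return every character from first to last in code point order, inclusive."""
    return "".join(chr(c) for c in range(ord(first), ord(last) + 1))

expanded = expand_range("a", "d")
print(expanded)  # abcd

sample = "aBbCcDdEe"  # made-up sample input
by_range = re.findall(r"[a-d]", sample)
by_list = re.findall("[" + expanded + "]", sample)

print(by_range)             # ['a', 'b', 'c', 'd']
print(by_range == by_list)  # True
```

With code point ordering the upper-case letters fall outside the range, which corresponds to the ‘[abcd]’ interpretation rather than ‘[aBbCcDd]’.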
Finally, certain named classes of characters are predefined within bracket expressions, as follows. Their interpretation depends on the LC_CTYPE locale; for example, ‘[[:alnum:]]’ means the character class of numbers and letters in the current locale.
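‘[:digit:]’
Digits: 0 1 2 3 4 5 6 7 8 9.
‘[:lower:]’
Lower-case letters: a b c d e f g h i j k l m n o p q r s t u v w x y z.
‘[:punct:]’
Punctuation characters: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~.
‘[:upper:]’
Upper-case letters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z.
‘[:xdigit:]’
Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f.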
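The character lists above are what the digit, lower, punct, upper and xdigit classes contain in the C locale with ASCII encoding. Python's `re` module does not understand POSIX class names, so the sketch below spells the classes out as explicit bracket expressions built from constants in the standard `string` module. The table `ASCII_CLASSES` and the helper `class_pattern` are hypothetical names for this example, and the result ignores LC_CTYPE entirely.

```python
import re
import string

# ASCII contents of the classes listed above (C locale only; an assumption
# of this sketch, since real class membership depends on LC_CTYPE).
ASCII_CLASSES = {
    "digit": string.digits,
    "lower": string.ascii_lowercase,
    "punct": string.punctuation,
    "upper": string.ascii_uppercase,
    "xdigit": string.hexdigits,
}

def class_pattern(name):
    """Build an explicit bracket expression for one named class."""
    # re.escape protects characters such as ']' '^' '-' and '\' inside the set.
    return re.compile("[" + re.escape(ASCII_CLASSES[name]) + "]")

hex_found = class_pattern("xdigit").findall("0xBEEF!")
punct_found = class_pattern("punct").findall("a-b_c!")

print(hex_found)    # ['0', 'B', 'E', 'E', 'F']
print(punct_found)  # ['-', '_', '!']
```

Spelling the set out this way trades locale awareness for predictability, which is roughly what running grep with LC_ALL set to ‘C’ gives you.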
If you mistakenly omit the outer brackets and search for, say, ‘[:upper:]’, GNU grep prints a diagnostic and exits with status 2, on the assumption that you did not intend to search for the nominally equivalent regular expression ‘[:epru]’. Set the POSIXLY_CORRECT environment variable to disable this feature.
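The nominal equivalence can be seen in any engine that parses ‘[:upper:]’ as an ordinary bracket expression. The sketch below uses Python's `re` module, which has no named classes and prints no diagnostic, so it shows only what the pattern would mean without the outer brackets. The sample text is made up.

```python
import re

text = "Upper: RULE"  # made-up sample input

# Without outer brackets this is just the set of characters ':' 'u' 'p' 'e' 'r'.
without_outer = re.findall(r"[:upper:]", text)
spelled_out = re.findall(r"[:epru]", text)

print(without_outer)                 # ['p', 'p', 'e', 'r', ':']
print(without_outer == spelled_out)  # True
```

None of the upper-case letters match, which is why treating the pattern as a probable mistake is a reasonable default.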
Most meta-characters lose their special meaning inside bracket expressions.
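A small sketch of this behavior, again using Python's `re` module as a stand-in: the characters ‘.’, ‘*’, ‘+’ and ‘?’ are ordinary inside brackets. One known difference is that Python still treats the backslash as an escape inside a set, so that case is left out here.

```python
import re

text = "a.b*c+d?"  # made-up sample input

# Inside brackets these four characters are literals.
inside = re.findall(r"[.*+?]", text)
# Outside brackets '.' keeps its special meaning and matches any character.
outside = re.findall(r".", "a.b")

print(inside)   # ['.', '*', '+', '?']
print(outside)  # ['a', '.', 'b']
```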