20.4. Wildcards

Wildcards are useful in many ways on a GNU/Linux system. Commands can use wildcards to act on more than one file at a time, or to find part of a phrase in a text file. Wildcards come in two major forms. Globbing patterns (standard wildcards) are the form most often used by the shell. Regular expressions are the alternative, popular with many other commands and with text searching and manipulation.

Tip

If you have a file name with wildcard characters in it, you can use single quotes to stop bash from expanding them, use backslashes (escape characters), or both.

For example, to create a file called 'fo*' (fo and asterisk) you would do it like this (you shouldn't normally create files with names like this; it is just an example):
 touch 'fo*' 

Note that parts of both subsections on wildcards are based, at least in part, on the grep manual and info pages. Please see the Bibliography for further information.

20.4.1. Standard Wildcards (globbing patterns)

Standard wildcards (also known as globbing patterns) are used by various command-line utilities to work with multiple files. For more information on standard wildcards (globbing patterns) refer to the manual page by typing:

man 7 glob

Note: Can be used by

Standard wildcards can be used with nearly any command (including mv, cp, rm and many others), because the shell expands them before the command runs.

? (question mark)

this can represent any single character. If you typed something like "hd?" at the command line, the shell would look for hda, hdb, hdc and any other name made of "hd" followed by exactly one character (a letter, a digit or anything else).
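A minimal sketch of ? in action, assuming bash and a scratch directory. The file names are placeholders created only for this illustration:

```shell
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical file names, created only for this demonstration.
touch hd hda hdb hdc hdaa

# "hd?" matches names made of "hd" plus exactly one more character,
# so neither "hd" (too short) nor "hdaa" (too long) is listed.
matches=$(echo hd?)
echo "$matches"
```

Using echo before a real command is a cheap way to preview what a pattern will expand to.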

* (asterisk)

this can represent any number of characters (including zero; in other words, zero or more characters). If you specified "cd*" it would match "cda", "cdrom", "cdrecord" and anything else that starts with "cd", including "cd" itself. "m*l" could be mill, mull, ml, and anything else that starts with an m and ends with an l.
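The following sketch, assuming bash and a scratch directory, shows * matching zero or more characters. All file names are made up for the example:

```shell
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical file names, created only for this demonstration.
touch cd cda cdrom cdrecord ml mill mull mail.txt

# "cd*" also matches "cd" itself, because * may stand for nothing.
cd_matches=$(echo cd*)

# "m*l" needs an m at the start and an l at the end; mail.txt fails.
ml_matches=$(echo m*l)

echo "$cd_matches"
echo "$ml_matches"
```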

[ ] (square brackets)

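specifies a set or range of characters, exactly one of which must match. If you did m[aou]m it can become: mam, mum, mom. (List the characters without commas; a comma inside the brackets is treated as one more character to match.) If you did m[a-d]m it can become anything that starts and ends with m and has any single character from a to d in between. For example, these would work: mam, mbm, mcm, mdm. This kind of wildcard specifies an "or" relationship (you only need one to match).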
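A short sketch of both bracket forms, assuming bash and a scratch directory with made-up file names. The locale is pinned to C here as a precaution, because in some locales a range such as a-d can also take in other letters:

```shell
# Pin the locale so that a-d means only the ASCII letters a, b, c, d.
export LC_ALL=C

# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical file names, created only for this demonstration.
touch mam mbm mcm mdm mem mom mum

# A set: the middle character must be one of a, o or u.
set_matches=$(echo m[aou]m)

# A range: the middle character must be a, b, c or d.
range_matches=$(echo m[a-d]m)

echo "$set_matches"
echo "$range_matches"
```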

{ } (curly brackets)

terms are separated by commas, and each term must be a name or a wildcard pattern. The shell expands the braces into one word per term, so the command acts on anything that matches any of the patterns or exact names (an "or" relationship, one or the other).

For example, this would be valid:

cp {*.doc,*.pdf} ~

This will copy anything ending with .doc or .pdf to the user's home directory. Note that unquoted spaces are not allowed after the commas (or anywhere else inside the braces).
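The copy above can be tried safely with a sketch like this one. It assumes bash (brace expansion is not part of plain POSIX sh) and copies into a hypothetical dest directory rather than the home directory:

```shell
# Brace expansion is a bash feature; a plain POSIX sh may not support it.
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical files and destination, created only for this demonstration.
mkdir dest
touch report.doc notes.pdf image.png

# The braces expand to "*.doc *.pdf" before cp runs.
cp {*.doc,*.pdf} dest

# List what arrived in dest; image.png should not be there.
copied=$(cd dest && echo *)
echo "$copied"
```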

[!]

This construct is similar to the [ ] construct, except that rather than matching any character inside the brackets, it matches any single character that is not listed between the [ and ]. This is a logical NOT. For example, rm myfile[!9] will remove myfile1, myfile2 and so on, but won't remove myfile9. Because the brackets stand for exactly one character, a longer name such as myfile10 is not matched either.
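A sketch of the rm example in a scratch directory, assuming bash. The file names are placeholders. It shows that [!9] stands for exactly one character that is not 9:

```shell
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical file names, created only for this demonstration.
touch myfile1 myfile2 myfile9 myfile10

# Removes myfile1 and myfile2 only: myfile9 ends in the excluded
# character, and myfile10 has two characters after "myfile".
rm myfile[!9]

remaining=$(echo myfile*)
echo "$remaining"
```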

\ (backslash)

is used as an "escape" character, i.e. to protect a subsequent special character. Thus, "\\" matches a literal backslash. Note that you may need to use quotation marks as well as backslashes.
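To illustrate escaping, here is a small sketch that assumes bash and a scratch directory. It creates a file literally named fo* (a name chosen only for the demonstration) and compares the escaped and unescaped patterns:

```shell
# Pin the locale so the listing order below is predictable.
export LC_ALL=C

# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)"

# Hypothetical file names; 'fo*' is quoted so the shell leaves it alone.
touch 'fo*' foo fob

# With the backslash, * is an ordinary character: one literal name.
escaped=$(echo fo\*)

# Without it, * is a wildcard: every name starting with "fo".
unescaped=$(echo fo*)

echo "$escaped"
echo "$unescaped"
```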

20.4.2. Regular Expressions

Regular expressions are a more powerful kind of pattern, used when working with text rather than file names. They are used for many forms of text manipulation and by various programming languages that work with text. For more information on regular expressions refer to the manual page or try an online tutorial, for example IBM Developerworks using regular expressions. For the manual page type:

man 7 regex

Note: Regular expressions can be used by

Regular expressions are used by grep, and can also be used by find and many other programs.

Tip

If your regular expressions don't seem to be working, you probably need to put single quotation marks around the whole expression so that the shell does not interpret it, and put a backslash before any special character that you want to match literally.

. (dot)

will match any single character, equivalent to ? (question mark) in standard wildcard expressions. Thus, "m.a" matches "mpa" and "mea" but not "ma" or "mppa".
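The four words above can be checked with a quick sketch that feeds them to grep on standard input:

```shell
# Feed the four sample words to grep, one per line.
# "m.a" needs exactly one character between the m and the a.
dot_matches=$(printf '%s\n' mpa mea ma mppa | grep 'm.a')
echo "$dot_matches"
```

Only mpa and mea are printed; "ma" has nothing between the letters and "mppa" has too much.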

\ (backslash)

is used as an "escape" character, i.e. to protect a subsequent special character. Thus, "\\" searches for a backslash. Note you may need to use quotation marks and backslash(es).

.* (dot and asterisk)

is used to match any string, equivalent to * in standard wildcards.

* (asterisk)

the preceding item is to be matched zero or more times, i.e. n* will match n, nn, nnnn, nnnnnnn (and also the empty string), but not na or any other character.
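A small sketch of the starred item using grep. Because "zero times" is allowed, an unanchored n* would match every line, so this example uses the -x option to require that the whole line matches:

```shell
# -x makes grep accept a line only if the entire line matches n*.
star_matches=$(printf '%s\n' n nn nnnn na b | grep -x 'n*')
echo "$star_matches"
```

The lines n, nn and nnnn are printed; na and b are rejected because they contain a character other than n.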

^ (caret)

means "the beginning of the line". So "^a" means find a line starting with an "a".

$ (dollar sign)

means "the end of the line". So "a$" means find a line ending with an "a".

For example, this command searches the file myfile for lines starting with an "s" and ending with an "n", and prints them to the standard output (screen):

grep '^s.*n$' myfile

[ ] (square brackets)

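specifies a set or range of characters, exactly one of which must match. If you did m[aou]m it can become: mam, mum, mom. (List the characters without commas; a comma inside the brackets is treated as one more character to match.) If you did m[a-d]m it can become anything that starts and ends with m and has any single character from a to d in between. For example, these would work: mam, mbm, mcm, mdm. This kind of expression specifies an "or" relationship (you only need one to match).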
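The same bracket forms can be tried on text with grep. This is a sketch only; the locale is pinned to C as a precaution so that the range a-d covers just those four letters:

```shell
# Pin the locale so that a-d means only the ASCII letters a, b, c, d.
export LC_ALL=C

words=$(printf '%s\n' mam mbm mem mom mum)

# A set: the middle character must be a, o or u.
set_lines=$(printf '%s\n' "$words" | grep 'm[aou]m')

# A range: the middle character must be a, b, c or d.
range_lines=$(printf '%s\n' "$words" | grep 'm[a-d]m')

echo "$set_lines"
echo "$range_lines"
```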

|

This operator creates a logical OR between two regular expressions, so you can search for one thing or another. Put quotation marks around the whole expression, because otherwise the shell will interpret | as a pipe. In basic regular expressions (the grep default) GNU grep expects a backslash before it (\|); in extended regular expressions (grep -E) a plain | is used.
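A minimal sketch of alternation, using grep -E (extended regular expressions) and single quotes so that the shell never sees the | as a pipe. The sample words are arbitrary:

```shell
# The quotes keep | away from the shell; -E selects extended syntax,
# where a plain | means "either the left side or the right side".
either=$(printf '%s\n' apple banana cherry | grep -E 'apple|cherry')
echo "$either"
```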

[^]

This is the equivalent of [!] in standard wildcards. It performs a logical "not", matching any single character that is not listed within the square brackets. For example, the expression myfile[^9] matches myfile1, myfile2 and so on, but not myfile9. (Note that rm takes standard wildcards rather than regular expressions, so use a tool such as grep to try this.)
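A sketch of the negated bracket with grep, run on a list of made-up names. The -x option requires the whole line to match, which makes it visible that [^9] stands for exactly one character:

```shell
# -x makes grep accept a line only if the entire line matches.
# myfile9 ends in the excluded character; myfile10 is one character too long.
not_nine=$(printf '%s\n' myfile1 myfile2 myfile9 myfile10 | grep -x 'myfile[^9]')
echo "$not_nine"
```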

20.4.3. Useful categories of characters (as defined by the POSIX standard)

This information has been taken from the grep info page with a tiny amount of editing, see [10] in the Bibliography for further information.

Note: These are used with

These character classes will work with most tools that work with text (for example, tr).

As a more advanced example, this command scans the output of the ls -l command, and prints lines containing a capital letter followed by a digit:

ls -l | grep '[[:upper:]][[:digit:]]'

The command greps for [upper_case_letter][any_digit], meaning any uppercase letter immediately followed by any digit. If you removed the inner ][ in the middle, leaving '[[:upper:][:digit:]]', it would look for a single character that is an uppercase letter or a digit, because it would become [upper_case_letter any_digit].
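The difference between the two forms can be seen in a sketch like the following. The sample lines are invented for the illustration, and the last command shows the same kind of class being used with tr:

```shell
# Invented sample text, one item per line.
sample=$(printf '%s\n' 'A1 report' 'B draft' '7 items' 'no match')

# Two bracket expressions in a row: an uppercase letter, then a digit.
both=$(printf '%s\n' "$sample" | grep '[[:upper:]][[:digit:]]')

# One bracket expression holding two classes: uppercase letter OR digit.
either=$(printf '%s\n' "$sample" | grep '[[:upper:][:digit:]]')

# The same classes work with tr: convert lowercase letters to uppercase.
shouted=$(echo 'hello 42' | tr '[:lower:]' '[:upper:]')

echo "$both"
echo "$either"
echo "$shouted"
```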