What is globbing and how do I use it in Linux?

Globbing is like regular expressions for your filenames. Unfortunately, it’s often misunderstood, despite being an essential command-line skill that everyone knows at least something about. Turn that superficial knowledge into a better understanding.

What is globbing?

“Globbing” is an informal name for “filename expansion.” It lets you select files with patterns rather than exact literal names.

The first example of globbing that everyone comes across is in this kind of format:

ls *.txt

This shows a list of files with names ending in the extension “.txt” in your current directory. In this example, the * matches any string, including none at all, so it would match files with these names:

  • about.txt
  • a filename with spaces.txt

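It would not match a name that ends in anything else, such as notes.md.

Hidden files beginning with a . are a special case: a leading * never matches that initial “.”, so *.txt skips hidden files. To match hidden files, you must begin your pattern with a “.” character.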
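The following is a minimal sketch of these rules, run in a throwaway directory. The filenames notes.md and .hidden.txt are made up for illustration; the other two come from the list above.

```shell
# Work in a temporary directory so no real files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch about.txt "a filename with spaces.txt" notes.md .hidden.txt

# The shell expands the glob; "set --" stores the matches as $1, $2, ...
set -- *.txt
visible_count=$#      # about.txt and "a filename with spaces.txt"

# A pattern that starts with "." is needed to reach hidden files.
set -- .*.txt
hidden_count=$#
hidden_first=$1

echo "$visible_count visible match(es), $hidden_count hidden match(es): $hidden_first"
```

Note that the filename containing spaces still counts as a single match: the shell expands a glob into whole filenames and does not split them into words.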

There are only two other types of match in the standard case. The ? pattern matches exactly one character, whatever it is. And the […] pattern matches a single character from the enclosed set, which can also include ranges (like a-z) and character classes (like [:digit:], which must itself sit inside the brackets: [[:digit:]]).

With just these rules, you can build quite sophisticated patterns to match different sets of files, according to your needs. So, for example, you could use ls [amz]*[[:digit:]].?? to show all files beginning with an a, m, or z and ending with a digit followed by a period and an extension of exactly two characters.
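To see how a combined pattern behaves, here is a small sketch in a temporary directory. All five filenames are invented for the example, and it uses the doubled-bracket form [[:digit:]] for the digit class.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch alpha1.md m2.sh zeta9.txt beta3.md alpha.md

# [amz]        first character is a, m, or z
# *            any run of characters, possibly empty
# [[:digit:]]  exactly one digit
# .??          a period followed by exactly two characters
matches=$(echo [amz]*[[:digit:]].??)

echo "$matches"   # alpha1.md m2.sh
```

zeta9.txt is rejected because its extension has three characters, beta3.md because it starts with b, and alpha.md because there is no digit before the period.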

You may sometimes see a pattern involving braces, like ls *.{md,markdown}, which lists all files ending in either extension. Technically, this brace expansion is separate from the globbing process, but it’s common to use the two together.
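One way to see that brace expansion is a separate step is to run it where no files exist. This sketch uses an empty temporary directory and an invented base name, report. It calls bash explicitly because brace expansion is not part of POSIX sh, and it assumes Bash's default options (nullglob and failglob off).

```shell
dir=$(mktemp -d)   # empty directory: nothing can match

# Brace expansion generates both words without looking at the filesystem.
braces=$(cd "$dir" && bash -c 'echo report.{md,markdown}')

# A glob with no matching file is left as typed (Bash's default behaviour).
glob=$(cd "$dir" && bash -c 'echo report.*')

echo "$braces"   # report.md report.markdown
echo "$glob"     # report.*
```

This is why ls *.{md,markdown} can print a "No such file" error for one extension while still listing files for the other: each generated word is globbed on its own.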

How does globbing work?

Globbing is handled by your shell, not by the command that you’re running. Consider what happens when you run ls *.txt. You might think the ls program receives “*.txt” as an argument, works out which files that pattern matches, and prints the results.

Actually, the shell is responsible for transforming “*.txt” into “foo.txt,” “hello.txt,” etc. It then passes those values to the ls command as arguments. So ls never needs to worry about handling “*.txt”; the shell handles it.
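A quick way to convince yourself of this is to replace ls with a command that only counts its arguments. In the sketch below, count_args is a hypothetical helper defined for the demonstration, not a standard command.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch foo.txt hello.txt

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

unquoted=$(count_args *.txt)    # the shell expands the glob first
quoted=$(count_args '*.txt')    # quoting passes the pattern through untouched

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
# unquoted: 2 argument(s), quoted: 1 argument(s)
```

The helper never sees a "*" in the unquoted case; it receives foo.txt and hello.txt as two separate arguments.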

This is the cause of a common problem with commands like find:

find . -name *.txt

Running this command, you’d expect find to locate all text files in the current directory and all its subdirectories. However, instead, Bash does the following:

  • Sees the “*” and converts “*.txt” into a list of matching filenames, which will all be from the current directory only.
  • Runs the command, e.g., find . -name filea.txt fileb.txt filec.txt.

You can confirm the command your shell will actually run by prefixing it with echo. Because echo simply prints its arguments, using it with a glob shows the expanded result without running anything else.
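As a sketch of the echo trick, the following recreates the three files named in the steps above inside a temporary directory and previews the find command without running it.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch filea.txt fileb.txt filec.txt

# echo receives the already-expanded words and prints them back.
preview=$(echo find . -name *.txt)

echo "$preview"   # find . -name filea.txt fileb.txt filec.txt
```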

The problem now is that find will complain. On macOS and other BSD-derived systems, the error reads “unknown primary or operator”; GNU find on Linux reports “paths must precede expression” instead. Either way, you’ll probably be left scratching your head trying to work out what that means!

The -name option (find calls this a “primary”) can only take one argument itself, so find gets confused when it sees “fileb.txt,” and bails. The correct way to run this command is:

find . -name '*.txt'

Quoting the expression, with either single or double quotes, stops the shell from performing filename expansion, so the pattern is passed on to find intact.
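Here is a minimal sketch of the quoted form at work. The directory layout and filenames are invented for the example, and the output is piped through sort only to make the order predictable.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
mkdir -p docs/drafts
touch top.txt docs/guide.txt docs/drafts/idea.txt docs/readme.md

# The quotes keep the shell's hands off the pattern, so find itself
# applies it to every filename at every depth.
found=$(find . -name '*.txt' | LC_ALL=C sort)

echo "$found"
# ./docs/drafts/idea.txt
# ./docs/guide.txt
# ./top.txt
```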

What else can globbing do?

In Bash, the core globbing characters are *, ?, and […]. However, Bash supports an extended set of wildcards with more functionality. This is mostly to support repeated patterns and brings globbing closer to full regular expression matching.

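Extended globbing is often already switched on in interactive sessions (for example, by the bash-completion package), but Bash itself leaves it off by default, including in scripts and in Bash 3.2 on macOS. To be sure it’s available, enable it with this command: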

shopt -s extglob

Once enabled, you’ll be able to use the following:

  • ?(pattern-list) to match 0 or 1 occurrence of the given patterns.
  • *(pattern-list) to match 0 or more occurrences of the given patterns.
  • +(pattern-list) to match 1 or more occurrences of the given patterns.
  • @(pattern-list) to match 1 of the given patterns.
  • !(pattern-list) to match anything except one of the given patterns.

For example, you can match files named “a,” “aa,” and “aaaaaa” with the glob +(a). You can match the files “README.md,” “README.txt,” and “README” with the pattern README?(.md|.txt).

If you’re familiar with regular expressions, it’s easy to forget that globs are implicitly anchored at both ends. This means that a pattern like +(a) will only match files whose names consist solely of the “a” character, rather than any file that contains an “a” somewhere in its name.
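The sketch below tries three of these operators against a handful of invented files. It starts a child Bash with -O extglob, which enables the option before the pattern is parsed and keeps the setting out of your current session.

```shell
dir=$(mktemp -d)
touch "$dir/a" "$dir/aa" "$dir/banana" \
      "$dir/README" "$dir/README.md" "$dir/README.txt" "$dir/README.rst"

# +(a): one or more "a" characters and nothing else (globs are anchored).
only_a=$(cd "$dir" && bash -O extglob -c 'echo +(a)')

# ?(.md|.txt): an optional ".md" or ".txt" suffix.
readmes=$(cd "$dir" && bash -O extglob -c 'echo README?(.md|.txt)')

# !(README*): every name that does not start with README.
others=$(cd "$dir" && bash -O extglob -c 'echo !(README*)')

echo "$only_a"    # a aa
echo "$readmes"   # README README.md README.txt
echo "$others"    # a aa banana
```

banana contains several “a” characters but is not matched by +(a), which illustrates the anchoring described above.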

Since globbing is handled by your shell, the syntax it supports can vary, so you should always check your shell’s documentation. The Zsh shell also supports the basics, but it offers an extended syntax that’s closer to regular expressions. For example, Zsh supports grouping with parentheses, which looks similar to Bash’s extended syntax:

ls (file1|file2)  # ls file1 file2

Many shells, including Bash and Zsh, add a recursive globbing feature, where you can expand file matches recursively. Using this, you can match zero or more directories with the pattern **. So ls **/*.txt will find all .txt files inside the current directory or any of its subdirectories, at any depth.

In Bash, this is another optional shell setting (available since version 4.0), so you’ll need to enable it:

shopt -s globstar
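To compare the two behaviours side by side, this sketch runs the same pattern in a child Bash with and without the option. It assumes Bash 4.0 or later, and the directory layout is made up for the example.

```shell
dir=$(mktemp -d)
mkdir -p "$dir/docs/drafts"
touch "$dir/top.txt" "$dir/docs/guide.txt" \
      "$dir/docs/drafts/idea.txt" "$dir/docs/readme.md"

# Without globstar, ** behaves like a single *: exactly one directory deep.
without=$(cd "$dir" && bash -c 'printf "%s\n" **/*.txt' | LC_ALL=C sort)

# With globstar, ** matches zero or more directory levels.
with=$(cd "$dir" && bash -O globstar -c 'printf "%s\n" **/*.txt' | LC_ALL=C sort)

echo "$without"
# docs/guide.txt

echo "$with"
# docs/drafts/idea.txt
# docs/guide.txt
# top.txt
```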

It’s worth mastering the core, POSIX-compliant syntax for globbing: ?, *, and […]. With these patterns, you’ll be able to handle groups of files efficiently, saving precious typing effort. If you’re committed to one particular shell, you’ll get a small bonus from learning its extended syntax.
