Luckily enough, you are in the right place. Here bash pattern matching is treated thoroughly, starting from the basics and working towards more advanced pattern matching techniques. Bash pattern matching results, types, and tools are covered.
Pattern matching results
The result of pattern matching is a list of 1 or more matches. In the case of an empty list, the pattern did not match.
Types of patterns
Before we even get started with our first pattern matching example, let’s lay the groundwork to build on. That is, let’s list all the types of patterns to be treated in the scope of pattern matching and provide an overview of the examples to follow.
- Generic pattern
- String exact pattern
- String regular expression pattern
- File exact pattern
- File glob pattern
Patterns in general
In general, when we are looking to do pattern matching there are three base parameters: the pattern, the subject, and the relation. For simplicity purposes, we’ll assume that there is a function that maps the pattern into the subject and the result matches the subject. Let’s look at some examples.
General patterns: Alphabet soup
Suppose that we have a bowl of alphabet soup that we wish to make subject to pattern matching. For the pattern, we choose the letter P, as in Pikachu. Then, we throw the ball and wait for the result of pattern matching. The letter P matches alphabet soup. Now we can continue eating our breakfast.
General patterns: Spaghetti Os
Now instead, we have a bowl of Spaghetti-Os. Again, we use the letter P as the pattern and throw the ball. As you would expect, the letter P does not match Spaghetti-Os. Maybe we should have had alphabet soup for breakfast or picked a pattern more likely to match.
Patterns in strings
In bash, all variables, regardless of their attributes, are represented internally as strings. That is, all variables in bash are subject to pattern matching in the same way. String patterns can be exact or regular expression patterns.
String patterns: exact pattern
The string exact pattern is a string that represents only 1 string. When it matches, the result is the subject of pattern matching as a whole or the matching substring.
Example 1: simple pattern matching using string exact patterns
Pattern: ori
Matches(pattern,subject): true (ori)
See parameter expansion
Example 2: simple pattern mismatch using string exact patterns
Pattern: ali
Matches(pattern,subject): false ()
See tests
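The two exact-pattern examples above can be tried with a quoted substring test. This is only a minimal sketch: the helper name contains_exact is hypothetical, and the subject algorithm is an assumption, since the examples do not state their subject.

```bash
#!/usr/bin/env bash
# contains_exact SUBJECT PATTERN (hypothetical helper)
# Prints "true (PATTERN)" when PATTERN occurs literally in SUBJECT,
# and "false ()" otherwise. Quoting "${pattern}" switches off glob
# characters, so the pattern is compared character for character.
contains_exact() {
  local subject="${1}" pattern="${2}"
  if [[ "${subject}" == *"${pattern}"* ]]; then
    echo "true (${pattern})"
  else
    echo "false ()"
  fi
}

contains_exact algorithm ori   # prints: true (ori)
contains_exact algorithm ali   # prints: false ()
```

Because the pattern is quoted, a subject such as a*b only matches the pattern * where a literal asterisk is present.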
String patterns: regular expression patterns
The string regular expression pattern is a string that can be expanded to match one or more expressions. They come in handy when exact string matching just doesn’t cut it. That is, we need magic or regular expressions. Let’s go with the latter.
Example 3: simple pattern matching using string regular expression patterns for the word algorithm
Pattern: [logarithm]
Matches(pattern,subject): true (algorithm)
See example in tests
Example 4: simple pattern matching using string regular expression patterns for hyphen separated date strings
Pattern: [0-9-]*
Matches(pattern,subject): true (2010-01-01)
See example in tests
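Examples 3 and 4 can be sketched with the =~ operator of the [[ test. The helper name regex_match is hypothetical. Note that I anchored the patterns with ^ and $ and used + instead of *, which is my own adaptation: an unanchored [0-9-]* also matches the empty string, so it would report a match for any subject.

```bash
#!/usr/bin/env bash
# regex_match SUBJECT REGEX (hypothetical helper)
# Prints "true (MATCH)" with the matched text, or "false ()".
regex_match() {
  local subject="${1}" regex="${2}"
  # The regex must stay unquoted on the right-hand side of =~,
  # otherwise bash compares it as a literal string.
  if [[ "${subject}" =~ ${regex} ]]; then
    echo "true (${BASH_REMATCH[0]})"
  else
    echo "false ()"
  fi
}

regex_match algorithm  '^[logarithm]+$'   # prints: true (algorithm)
regex_match 2010-01-01 '^[0-9-]+$'        # prints: true (2010-01-01)
regex_match 2010/01/01 '^[0-9-]+$'        # prints: false ()
```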
Patterns in the tree
Bash has a feature called globbing that expands strings outside of quotes to the names of files or directories immediately present in the tree. File expansion, as it is also called, is enabled by default, so you never have to turn it on. However, in some cases, you may opt to turn it off. Do note that, although similar, globbing is not as extensive as the regular expressions seen in string patterns.
Example 5: glob all files in the working directory together
Pattern: *
Matches(pattern, subject): true (all files in working directory)
See example in file expansion
Example 6: glob all files in the working directory together with name containing only a single character
Pattern: ?
Matches(pattern, subject): true (single letter file and directory names)
See example in file expansion
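Examples 5 and 6 can be tried safely in a throwaway directory. The sketch below assumes mktemp is available; the function name glob_demo and the file names are made up for illustration.

```bash
#!/usr/bin/env bash
# glob_demo (hypothetical): the body runs in a subshell, so the
# caller's working directory is left untouched.
glob_demo() (
  dir="$(mktemp -d)" || exit 1
  trap 'rm -rf "${dir}"' EXIT   # clean up when the subshell exits
  cd "${dir}" || exit 1
  touch a b ab abc
  echo "star:" *        # every non-hidden name in the directory
  echo "question:" ?    # only names that are exactly 1 character long
)

glob_demo
# prints:
# star: a ab abc b
# question: a b
```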
Tools for pattern matching in bash
Bash does not have special builtins for pattern matching. Instead, it requires tools such as grep, sed, or awk in addition to bash builtins like file and parameter expansion, and tests. Here are the tools in and out of bash for pattern matching.
External tools for bash pattern matching
- grep
- gawk
- sed
- xxd
- find
grep
Grep is a simple yet powerful command-line utility and one of the reasons bash doesn’t know how to handle pattern matching. It searches for a pattern in a file. What more can you ask for?
On its own, it finds patterns within a file. Combined with find and xargs, it can be used to search for patterns across the filesystem.
Suppose that you want to search a directory called haystack for a file containing the word ‘needle’. Here is how we would use grep.
echo needle >> haystack/aa
find haystack -type f | xargs grep -e "needle" || echo not found
Note that I just happened to rename the sandbox directory in the example below to haystack.
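As a variation on the search above, here is a sketch that swaps xargs for the -exec ... {} + form of find, which copes with spaces in file names. The function name search_haystack and the sample files are hypothetical.

```bash
#!/usr/bin/env bash
# search_haystack DIR WORD (hypothetical helper)
# Prints the files under DIR that contain WORD, or "not found".
search_haystack() {
  local dir="${1}" word="${2}" hits
  # grep -l lists matching file names only; "{} +" hands the files
  # to grep in batches, much like xargs does.
  hits="$(find "${dir}" -type f -exec grep -l -e "${word}" {} +)"
  if [[ -n "${hits}" ]]; then
    echo "${hits}"
  else
    echo "not found"
  fi
}

# Throwaway haystack for the demonstration
haystack="$(mktemp -d)"
echo hay    > "${haystack}/aa"
echo needle > "${haystack}/my file"
search_haystack "${haystack}" needle    # prints the path of "my file"
search_haystack "${haystack}" thimble   # prints: not found
rm -rf "${haystack}"
```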
gawk (or awk)
Perhaps another reason why bash appears to not want anything to do with pattern matching is that awk, the pattern scanning and processing language, existed well before the first release of bash.
In practice, you will find gawk used extensively in many polyglot bash programs as a means of entering pattern matching mode from within a bash script.
Unlike other tools listed for bash pattern matching, gawk has the capability of creating new instances of bash or any other command-line utility through a builtin system function. However, in this case, it is more practical to handle using xargs to run in parallel or pipe into bash directly to run in sequence.
Gawk may also be used to implement primitive versions of common command-line utilities like tac and shuf, as seen in bash tac command and bash shuf command, respectively.
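To give an idea of what a primitive tac looks like, here is a minimal sketch. It calls plain awk rather than gawk so that it runs on more systems, and the function name awk_tac is hypothetical.

```bash
#!/usr/bin/env bash
# awk_tac (hypothetical): print standard input with the line order reversed.
awk_tac() {
  # Store every line by its line number, then walk the array backwards.
  awk '{ lines[NR] = $0 } END { for (i = NR; i >= 1; i--) print lines[i] }'
}

printf '%s\n' one two three | awk_tac
# prints:
# three
# two
# one
```

Note that the whole input is held in memory, which is fine for small files but not for very large ones.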
sed
Sed, yet another powerful command-line utility and another reason why bash can’t compete by itself in pattern matching, stands for stream editor. It uses a simple programming language built around regular expressions, allowing you to search, replace, edit files in place, or otherwise do more than string manipulation in bash.
It is commonly used in polyglot bash scripts to replace patterns in files that would otherwise be overkill trying to accomplish using bash parameter expansion.
As seen in bash sed examples, there is more to sed than pattern matching alone.
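As a small taste, the sketch below replaces every hyphen separated date in a stream. The function name redact_dates and the placeholder text are made up, and the -E option (extended regular expressions) is assumed to be supported by your sed, as it is in GNU and BSD sed.

```bash
#!/usr/bin/env bash
# redact_dates (hypothetical): replace each YYYY-MM-DD date on
# standard input with the placeholder <date>.
redact_dates() {
  sed -E 's/[0-9]{4}-[0-9]{2}-[0-9]{2}/<date>/g'
}

echo "released 2010-01-01, patched 2010-02-15" | redact_dates
# prints: released <date>, patched <date>
```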
xxd
xxd is a command-line utility available on most systems that allows you to convert input to and from hex notation. It makes pattern matching and replacement in non-text files easier when used in conjunction with other pattern matching tools in bash.
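Here is a rough sketch of that idea: dump the data as one hex byte per line, replace a byte with sed, and convert the result back. It assumes xxd is installed; the function name nul_to_dot is hypothetical.

```bash
#!/usr/bin/env bash
# nul_to_dot (hypothetical): replace every NUL byte (0x00) on standard
# input with a dot (0x2e).
nul_to_dot() {
  # -p gives a plain hex dump and -c 1 puts one byte on each line, so
  # the sed pattern can never match across a byte boundary.
  # xxd -r -p turns the plain hex dump back into bytes.
  xxd -p -c 1 | sed 's/^00$/2e/' | xxd -r -p
}

# Only run the demonstration where xxd is available.
if command -v xxd >/dev/null 2>&1; then
  printf 'a\0b\0c' | nul_to_dot   # prints: a.b.c
  echo
fi
```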
find
find is a command-line utility that can be used as an alternative to file expansion when recursion is required. It allows you to traverse the file system while listing files found matching the options set. For pattern matching on file names, the -name option may be used.
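The sketch below shows the -name option matching file names at any depth. The function name find_demo and the directory layout are invented for illustration.

```bash
#!/usr/bin/env bash
# find_demo (hypothetical): the body runs in a subshell inside a
# throwaway directory.
find_demo() (
  dir="$(mktemp -d)" || exit 1
  trap 'rm -rf "${dir}"' EXIT
  cd "${dir}" || exit 1
  mkdir -p top/nested
  touch top/aa top/nested/ab top/nested/zz
  # Quote the pattern so that find, not the shell, interprets it.
  find . -type f -name 'a?' | sort
)

find_demo
# prints:
# ./top/aa
# ./top/nested/ab
```

Unlike file expansion, the pattern is applied in every directory that find visits, not just the working directory.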
Internal tools for bash pattern matching
Bash has pattern matching capabilities when it comes to files and strings. Here are the tools for pure bash pattern matching: file expansion (globbing), parameter expansion, tests.
file expansion (globbing)
File expansion allows a string not surrounded by quotes containing the characters * or ? to be expanded into one or more paths matching the string. In cases where using the find command is not required, especially when working interactively on the command line, we may opt to use file expansion over the find command. File expansion is enabled by default. However, it may be disabled using the set builtin command.
Usage
Wildcard matching 0 or more characters in a filename
*
Wildcard matching 1 character in a filename
?
By default, unquoted strings will expand depending on files present in the working directory.
Globbing may be disabled and enabled by setting noglob.
Disable globbing
set -o noglob
Enable globbing (default)
set +o noglob
Alternatively, you may use the short command to disable globbing
set -f
For other ways to use set, see The Set Builtin. It deserves a section.
You may find The Shopt Builtin useful as well.
There are ways to modify the file globbing behavior in bash via the set and shopt builtins.
Commands
Run the following commands to set up a sandbox for file expansion (globbing).
mkdir sandbox
cd sandbox
touch {.,}{a..z}{a..z}
touch {.,}{a..z}{a..z}{a,b}
You should now be working in a directory named sandbox containing files such as aa, ab, …, zy, zz, including hidden files.
Match all hidden files and directories
echo .*
Match all files and directories
echo .* *
Match all files and directories starting with an ‘a’
echo a*
Match all files and directories starting with an ‘a’ and ending with a ‘b’
echo a*b
Match all files and directories with name containing 2 characters and starts with an ‘a’
echo a?
Match all files and directories with name containing 2 characters
echo ??
Last but not least, let’s try to glob with noglob set
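To see what globbing with noglob set looks like, here is a minimal sketch. It uses its own throwaway directory instead of the sandbox, and the function name noglob_demo is hypothetical.

```bash
#!/usr/bin/env bash
# noglob_demo (hypothetical): compare the same pattern with globbing
# switched on and off.
noglob_demo() (
  dir="$(mktemp -d)" || exit 1
  trap 'rm -rf "${dir}"' EXIT
  cd "${dir}" || exit 1
  touch aa ab
  echo a?            # globbing on: expands to the matching names
  set -o noglob
  echo a?            # globbing off: the pattern is printed as typed
  set +o noglob
  echo a?            # globbing back on
)

noglob_demo
# prints:
# aa ab
# a?
# aa ab
```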
parameter expansion
Parameter expansion in bash allows you to manipulate variables containing strings. It may be used to remove and replace a pattern within a string. Support for case insensitive pattern matching is available by using the shopt builtin command.
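Before getting to the function, here is a short sketch of the expansions that remove or replace a pattern. The function name expansion_demo and the sample path are made up for illustration.

```bash
#!/usr/bin/env bash
# expansion_demo (hypothetical): remove and replace patterns in a string.
expansion_demo() {
  local path="/var/log/app.tar.gz"
  echo "${path#*/}"        # shortest leading match of */ removed
  echo "${path##*/}"       # longest leading match of */ removed
  echo "${path%.*}"        # shortest trailing match of .* removed
  echo "${path%%.*}"       # longest trailing match of .* removed
  echo "${path/log/tmp}"   # first occurrence of log replaced
  echo "${path//a/A}"      # every occurrence of a replaced
}

expansion_demo
# prints:
# var/log/app.tar.gz
# app.tar.gz
# /var/log/app.tar
# /var/log/app
# /var/tmp/app.tar.gz
# /vAr/log/App.tAr.gz
```

The patterns here follow the same rules as file expansion, so * and ? behave as wildcards.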
Usage
Here is a little function I cooked up to show bash pattern matching in action using parameter expansion. It has 2 parameters: 1) subject; and 2) pattern. If the subject matches the pattern, the function prints a ‘0’; otherwise, it prints a ‘1’. It also prints what is left of the subject to standard error. The pattern is a glob-style pattern, as used in file expansion, rather than a regular expression.
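match() {
local subject
local pattern
local new_subject
subject="${1}"
pattern="${2}"
# Remove every occurrence of the pattern from the subject
new_subject="${subject//${pattern}/}"
# Show what is left of the subject on standard error
echo "${new_subject}" 1>&2
# The subject matched if removing the pattern changed it
test ! "${subject}" = "${new_subject}"
echo ${?}
}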
Commands
Here is a block of commands showing how the match function works.
match ${subject} a
match ${subject} ba
match ${subject} [a-d]
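For a self-contained run, here is a sketch of a variant that reports the result through its exit status instead of printing it. The name match_status and the sample subject abc are assumptions made for this illustration. The patterns are quoted so that file expansion does not get to them first.

```bash
#!/usr/bin/env bash
# match_status SUBJECT PATTERN (hypothetical variant)
# Succeeds (exit status 0) if the glob PATTERN occurs in SUBJECT.
match_status() {
  local subject="${1}" pattern="${2}"
  # Removing every occurrence of the pattern changes the subject
  # only when there was a match.
  [[ "${subject}" != "${subject//${pattern}/}" ]]
}

subject="abc"   # assumed sample value
for pattern in 'a' 'ba' '[a-d]' 'b*'; do
  if match_status "${subject}" "${pattern}"; then
    echo "${pattern}: match"
  else
    echo "${pattern}: no match"
  fi
done
# prints:
# a: match
# ba: no match
# [a-d]: match
# b*: match
```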
Output
tests
Tests in bash allow you to compare files, strings, and integers. They may be used to do pattern matching on a string. In the case of simple pattern matching on strings using regular expressions, we may opt to use tests instead of grep.
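As a quick illustration of a test doing the work of grep, the sketch below validates a hyphen separated date and pulls out its parts through the BASH_REMATCH array. The function name parse_date is hypothetical.

```bash
#!/usr/bin/env bash
# parse_date STRING (hypothetical helper)
# Prints the parts of a YYYY-MM-DD string, or "no match" with a
# non-zero exit status.
parse_date() {
  # Keeping the regex in a variable avoids quoting surprises with =~.
  local re='^([0-9]{4})-([0-9]{2})-([0-9]{2})$'
  if [[ "${1}" =~ ${re} ]]; then
    # BASH_REMATCH[1..3] hold the text matched by each parenthesized group.
    echo "year=${BASH_REMATCH[1]} month=${BASH_REMATCH[2]} day=${BASH_REMATCH[3]}"
  else
    echo "no match"
    return 1
  fi
}

parse_date 2010-01-01           # prints: year=2010 month=01 day=01
parse_date 2010/01/01 || true   # prints: no match
```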
Usage
Commands
_() {
# Succeeds if "algorithm" contains 9 consecutive characters from the set given in ${1}
[[ "algorithm" =~ [${1}]{9} ]]
echo ${?}
}
_ logarithm
_ algorithm
_ algorith_
Output
TLDR;
I’ll admit, pattern matching goes way beyond bash alone and may require another section with examples and exercises allowing you to get your hands dirty. I’ll just say that, in addition to pure bash pattern matching methods, becoming familiar with the command-line utilities listed as external tools for pattern matching in bash is a definite must. Happy bash programming!
Thanks,