You can extract substrings in the Bash shell primarily using parameter expansion, a built-in feature that is fast and efficient. Other methods include using external commands like cut, expr, and awk.
Parameter Expansion
This is the most common and recommended method for substring extraction. It’s built into the shell, making it very quick as it doesn’t require launching a separate process.
By Position and Length
The syntax for extracting a substring is ${string:position:length}.
string: The variable containing the string.
position: The starting index.
length: The number of characters to extract.
Indices in Bash start at 0. You can also use a negative position to count from the end of the string.
my_string="Hello, World!"
# Extract 5 characters starting at index 0
substring1=${my_string:0:5}
echo "$substring1"
# Output: Hello
# Extract from index 7 to the end of the string
substring2=${my_string:7}
echo "$substring2"
# Output: World!
# Extract 6 characters starting 6 characters from the end
substring3=${my_string: -6:6}
echo "$substring3"
# Output: World!
The space after the colon in ${my_string: -6:6} is required with a negative index. Without it, Bash reads :- as the default-value operator (${parameter:-word}) and does not extract a substring.
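
As a minimal sketch of this pitfall, the snippet below contrasts the negative-offset form with the default-value operator it can be mistaken for. The name unset_var is hypothetical and exists only for illustration, and the negative length in the last assignment assumes Bash 4.2 or later.

```shell
my_string="Hello, World!"

# With a space (or parentheses), -6 is an offset counted from the end
last_six=${my_string: -6}
also_last_six=${my_string:(-6)}

# Without the space, ":-" is the "use a default value" operator,
# so a set, non-empty variable expands unchanged
not_a_substring=${my_string:-6}

# For an unset variable (hypothetical name), the default "6" is used instead
unset unset_var
fallback=${unset_var:-6}

# Assumes Bash 4.2+: a negative length is an offset from the end,
# here dropping the final "!"
without_bang=${my_string:7:-1}

printf '%s\n' "$last_six" "$also_last_six" "$not_a_substring" "$fallback" "$without_bang"
```

The parenthesized form can be easier to spot in review than a single space, which is simple to delete by accident.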
By Pattern Matching
You can also use pattern matching to remove parts of a string and extract what’s left.
${string#pattern}: Removes the shortest matching pattern from the beginning.
${string##pattern}: Removes the longest matching pattern from the beginning.
${string%pattern}: Removes the shortest matching pattern from the end.
${string%%pattern}: Removes the longest matching pattern from the end.
filename="home/user/document.txt.bak"
# Get the filename by removing the longest prefix that matches */
base_name=${filename##*/}
echo "$base_name"
# Output: document.txt.bak
# Get the last extension by removing the longest prefix that matches *.
extension=${base_name##*.}
echo "$extension"
# Output: bak
# Get the filename without the last extension by removing the shortest suffix that matches .*
no_ext=${base_name%.*}
echo "$no_ext"
# Output: document.txt
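
The four operators can also be combined to split a path into its parts. The following is a sketch only, reusing the same sample path; the variable names dir_name, stem, and all_ext are chosen here for illustration, and the result assumes the path contains at least one slash and one dot.

```shell
filename="home/user/document.txt.bak"

# Shortest suffix matching /* removed: everything before the last slash
dir_name=${filename%/*}

# Longest prefix matching */ removed: everything after the last slash
base_name=${filename##*/}

# Longest suffix matching .* removed: the name before the first dot
stem=${base_name%%.*}

# Shortest prefix matching *. removed: everything after the first dot
all_ext=${base_name#*.}

printf '%s\n' "$dir_name" "$base_name" "$stem" "$all_ext"
```

If the path has no slash, ${filename%/*} leaves the value unchanged, so a script that depends on the directory part may want to check for that case first.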
External Commands
These tools are useful if you’re already using them in your script or prefer their syntax, but they are generally slower than parameter expansion because they are separate processes.
cut
The cut command extracts sections from each line of its input. For substrings, use the -c option, which selects characters by position, counting from 1.
my_string="extract me"
# Extract characters from position 9 to 10
echo "$my_string" | cut -c 9-10
# Output: me
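
A range given to cut -c may also be left open at either end. The sketch below shows both forms and feeds the string through a Bash here-string instead of echo and a pipe; the variable names first_part and last_part are illustrative.

```shell
my_string="extract me"

# "-7" means from the first character through position 7
first_part=$(cut -c -7 <<< "$my_string")

# "9-" means from position 9 through the end of the line
last_part=$(cut -c 9- <<< "$my_string")

printf '%s\n' "$first_part" "$last_part"
```

The here-string avoids one extra process for echo, although cut itself still runs as an external command.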
expr
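The expr command is an older tool for evaluating expressions, including string operations. Its substr operation counts positions from 1 and is a GNU extension, so it may be unavailable in non-GNU implementations of expr.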
my_string="bash_is_powerful"
# Extract 4 characters starting from the 10th position (counting from 1)
expr substr "$my_string" 10 4
# Output: ower
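
Because expr counts positions from 1 while parameter expansion counts from 0, converting between the two is an easy place for an off-by-one error. The sketch below shows two alternatives that do not depend on expr supporting substr: parameter expansion with the offset reduced by one, and the expr pattern-matching operator (:) with a capture group. The variables pos and len are hypothetical inputs chosen for illustration.

```shell
my_string="bash_is_powerful"
pos=9   # 1-based starting position, as expr would count it
len=8

# Parameter expansion: the offset is an arithmetic expression, so pos-1 works directly
via_expansion=${my_string:pos-1:len}

# expr match: skip pos-1 characters, then capture the next len characters
via_match=$(expr "$my_string" : ".\{$((pos - 1))\}\(.\{$len\}\)")

printf '%s\n' "$via_expansion" "$via_match"
```

The parameter expansion form is usually preferable inside Bash, since it runs without starting another process.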
awk
awk is a powerful text processing language. Its substr() function takes a string, a starting position counted from 1, and an optional length.
my_string="Bash scripting"
# Extract 6 characters starting from the 6th position
echo "$my_string" | awk '{print substr($0, 6, 6)}'
# Output: script
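
The string can also be handed to awk as a variable rather than through a pipe. The following is a sketch under the assumption that the string contains no backslashes, since awk interprets escape sequences in values assigned with -v; the awk variable name s is arbitrary.

```shell
my_string="Bash scripting"

# Pass the string with -v and do the work in a BEGIN block, so no input is read
six_chars=$(awk -v s="$my_string" 'BEGIN { print substr(s, 6, 6) }')

# Omitting the length returns everything from position 6 to the end
to_end=$(awk -v s="$my_string" 'BEGIN { print substr(s, 6) }')

# The same 6 characters with parameter expansion: the 0-based offset is 5
same_in_bash=${my_string:5:6}

printf '%s\n' "$six_chars" "$to_end" "$same_in_bash"
```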