How to Split String Into Variables in Bash

Abdul Mateen Feb 02, 2024
  1. Strings in Bash
  2. String Functions in Bash
  3. Split String Into Variables in Bash
  4. Use the cut Command to Split String Into Variables in Bash
  5. Use an Internal Field Separator (IFS) to Split String Into Variables in Bash
  6. Use read With sed to Split String Into Variables in Bash
  7. Use a Regular Expression With Match to Split String Into Variables in Bash

This tutorial will discuss different methods to split a string into variables in Bash.

We will start our discussion with a brief introduction to strings. Later, we will discuss various ways to split strings using Bash examples.

Strings in Bash

A string is a sequence of characters. Bash variables are untyped and hold strings by default; Bash supports integer arithmetic on strings of digits, but it has no floating-point type.

Characters may also include digits, as we see numbers inside a string as a sequence of ASCII characters.

In data types like integers or floats, the number is a single entity; its digits have no individual existence. In a string, however, every character (including spaces, commas, semicolons, and colons) is stored individually.

In combination, the individual characters also represent a complete string. Any character you can type from the keyboard can be part of the string.

For example:

s1="Welcome"
s2="This is a complete sentence and string as well"
s3="The string, has some special characters like ,$%^[]"
s4="The population of country ABC is 12345678"

Here, the first string contains letters only. The second string (s2) also contains spaces; these spaces are part of the string and are stored just like letters.

Therefore, the eight spaces in the second string take up eight bytes in memory.

The third string has special characters, which are again part of the string.

Lastly, the fourth string ends with what looks like a number but is stored as a sequence of digit characters. To calculate or compare with it numerically, the digits must first be extracted and evaluated in an arithmetic context.
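As a minimal sketch of that conversion, the digits can be pulled out with parameter expansion and then used inside an arithmetic expansion. The variable names digits and doubled are illustrative only, and the sketch assumes the string contains a single number without leading zeros (several numbers would be joined into one, and a leading zero would be read as octal).

```shell
s4="The population of country ABC is 12345678"

# Remove every character that is not a digit; the result is still a string.
digits="${s4//[!0-9]/}"

# Inside $(( )), Bash evaluates the digit string as an integer.
doubled=$(( digits * 2 ))

echo "Digits:  $digits"
echo "Doubled: $doubled"
```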

String Functions in Bash

Just as arithmetic operations apply to integers or floating-point numbers, a set of operations is available for strings.

For example, we can compare two strings for equality (equality operation is required if we have to find a string or sub-string). Here is the Bash script:

s1="World Cup."
s2="World Cup"

if [ "$s1" = "$s2" ]; then
    echo "Strings are equal."
else
    echo "Strings are not equal."
fi

s3="World Cup."

if [ "$s1" = "$s3" ]; then
    echo "Both the strings are equal."
else
    echo "Strings are not equal."
fi

Here, the two strings look the same; however, the first string has a dot at the end.

The above script first compares the two strings using an if statement, where string variables are enclosed in double quotes, and a single equal sign is used to compare.

Next, it declares another string, s3, the same as the first string, and performs a comparison again. The output of this script is:

Strings are not equal.
Both the strings are equal.
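A similar test can check whether one string contains another. The following is a minimal sketch, assuming Bash's [[ ]] keyword with glob patterns; the variable name found is illustrative only.

```shell
s1="World Cup."

# Inside [[ ]], the unquoted * matches any run of characters,
# so the test succeeds when "Cup" occurs anywhere in s1.
if [[ "$s1" == *"Cup"* ]]; then
    found="yes"
else
    found="no"
fi

echo "Contains 'Cup': $found"
```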

We can find the length of a string with the ${#variable} expansion, which uses a hash sign. The Bash script is:

s=abcAXYZ123456_AaBbCc
echo "Length of $s is: ${#s}"

The output of this script is:

Length of abcAXYZ123456_AaBbCc is: 20

We can also match a string against a regular expression and get the length of the matching sub-string (we will not go into the details of regular expressions here).

Here is the code for matching using a regular expression:

s=abcAXYZ123456_AaBbCc
expr match "$s" 'abc[A-Z]*.2'
expr "$s" : 'abc[A-Z]*.[0-9]*_'

In the first match, we match lowercase abc, followed by zero or more capital letters, then any single character (the dot), then the digit 2.

The second match uses the alternative expr string : regex syntax and matches the sub-string that ends with an underscore.

The output of the above script is:

9
14

expr prints the number of characters matched. The first match ends at the digit 2, which is the 9th character, and the second ends at the underscore, which is the 14th character.

Many more operations are possible on strings; however, we now move on to our main topic, splitting a string.

Split String Into Variables in Bash

Splitting a string into characters or sub-strings is a common operation. For example, an expression must be split into variables and operators (known as tokens) before it can be evaluated.

Before starting the translation phase, compilers or other language translators use lexical analyzers to analyze and split a program, a string, into variables, keywords, blocks, functions, etc. In natural language processing, the division of an article into sentences, words, verbs, nouns, etc., is required.

There are different ways in Bash to split a string. We will discuss them with examples.

Use the cut Command to Split String Into Variables in Bash

In Bash, we can use the cut command to split a string. The syntax of the cut command is:

cut -f(number) -d(delimiter)

The cut command splits a string based on the delimiter provided after the -d option. The delimiter in the strings can be any character that separates (or is assumed to be separating) two sub-strings.

For example, in the English language, sentences are separated by a dot; therefore, a dot can be considered as a delimiter to separate sentences. Similarly, space is a delimiter to separate words.

An operator is a delimiter between operands in an arithmetic expression (in programming).

Next, we have a script where we have a string with a hyphen as a delimiter. The string is split into three variables.

A dollar sign is not used when assigning a value to a variable; it is required whenever the variable's value is read. The script is:

v="0123-456-789"
v1=$(echo "$v" | cut -f1 -d-)
v2=$(echo "$v" | cut -f2 -d-)
v3=$(echo "$v" | cut -f3 -d-)
echo "$v1"
echo "$v2"
echo "$v3"

The output of this script is:

0123
456
789

The cut command works with other delimiters as well. Here is another script with a colon as a delimiter:

v="0123:456:789"
v1=$(echo "$v" | cut -f1 -d:)
v2=$(echo "$v" | cut -f2 -d:)
v3=$(echo "$v" | cut -f3 -d:)
echo "$v1"
echo "$v2"
echo "$v3"

The code is the same except for the delimiter. The string uses a colon as its delimiter, and the same colon is passed to the cut command.

The output of this script is:

0123
456
789
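Since the scripts above repeat the same pipeline for every field, it can be wrapped in a small function. This is only a sketch: the helper name field and its argument order are hypothetical, not part of cut or Bash.

```shell
# field N DELIM STRING: print the N-th DELIM-separated field of STRING.
# "field" is a hypothetical helper name used only for this sketch.
field() {
    printf '%s\n' "$3" | cut -d"$2" -f"$1"
}

v="0123:456:789"
v1=$(field 1 : "$v")
v2=$(field 2 : "$v")
v3=$(field 3 : "$v")

echo "$v1 $v2 $v3"
```

Using printf instead of echo avoids surprises when the string starts with a hyphen or contains backslashes.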

The cut command accepts only a single delimiter character; with the internal field separator, however, we can use multiple delimiters.

Use an Internal Field Separator (IFS) to Split String Into Variables in Bash

Using IFS, we can use single or multiple delimiters. A single delimiter such as a hyphen needs no quotes; to use several delimiters, list all of the delimiter characters together, inside quotes when they include a space or another character that is special to the shell.

Let’s look at a very basic example:

IFS=- read -r v1 v2 v3 v4 <<< "this-is-batch-file"
echo $v1
echo $v2
echo $v3
echo $v4

In the above script, IFS holds a single delimiter. With IFS set, read splits the string this-is-batch-file on the hyphens and assigns the pieces to the variables v1 to v4.

The output of this script is:

this
is
batch
file

Next, we have an example of multiple delimiters. See the following script:

IFS=" ,:" read -r v1 v2 v3 v4 <<< "is,something,strange:there"
echo $v1
echo $v2
echo $v3
echo $v4

Here, IFS lists three delimiter characters (space, comma, and colon), and the string uses two of them. The first three words are separated by commas, and the last word is separated by a colon.

The output is:

is
something
strange
there

Now using IFS, we have the option of multiple delimiters.
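When the number of fields is not known in advance, read can store them in an array instead of named variables. The following is a minimal sketch using the -a option of Bash's read; the names line and parts are illustrative only.

```shell
line="is,something,strange:there"

# -r keeps backslashes literal; -a stores the fields in an indexed array.
# The IFS assignment applies to this read command only.
IFS=',:' read -r -a parts <<< "$line"

echo "Number of fields: ${#parts[@]}"
for p in "${parts[@]}"; do
    echo "$p"
done
```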

Do we always need to give a delimiter to split the string? The answer is no; we have other ways as well.

Use read With sed to Split String Into Variables in Bash

We can split a string into an array of characters using read and sed (a stream editor that can perform many string operations line by line). The sed expression s/./& /g appends a space to every character, and read -a then splits the result on those spaces. Once we have an array of characters, we can manipulate them according to our specific requirements.

Here is the script:

read -ra var5 <<<"$(echo "12-34-56" | sed 's/./& /g')"
echo ${var5[0]}
echo ${var5[1]}
echo ${var5[2]}
echo ${var5[3]}

The output of this script is:

1
2
-
3

Note that even a hyphen character is placed as a character; therefore, this method cannot split the string on any delimiter. Instead, it breaks the string into an array of characters.

This method has one caveat. If the string itself contains spaces, read treats them as separators too, so the resulting array has no space elements.

Let’s understand this using the following script and its output:

read -ra var51 <<<"$(echo "FG HI JK" | sed 's/./& /g')"
echo ${var51[0]}
echo ${var51[1]}
echo ${var51[2]}
echo ${var51[3]}

The output is:

F
G
H
I

In the output, it is evident that subscript/index 2 contains the letter H instead of a space character.
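If the spaces must be kept, one alternative (not the read and sed method above) is to walk the string with Bash's substring expansion. This is a sketch under that assumption; the names s and chars are illustrative only.

```shell
s="FG HI"
chars=()

# ${s:i:1} is the one-character substring starting at index i.
for (( i = 0; i < ${#s}; i++ )); do
    chars+=("${s:i:1}")
done

echo "Count: ${#chars[@]}"
echo "Index 2: '${chars[2]}'"
```

Here index 2 holds the space character, because no word splitting is involved.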

Use a Regular Expression With Match to Split String Into Variables in Bash

Here, we have another way to split strings: matching a regular expression with the =~ operator, which stores the captured groups in the BASH_REMATCH array.

The script is:

re="^([^-]+)-(.*)$"
[[ "31-28-31" =~ $re ]] && var6="${BASH_REMATCH[1]}" && var_r="${BASH_REMATCH[2]}"
[[ $var_r =~ $re ]] && var7="${BASH_REMATCH[1]}" && var8="${BASH_REMATCH[2]}"
echo "First:" $var6
echo "Second:" $var7
echo "Third:" $var8

Again, a hyphen is used as the delimiter, and our regular expression contains a hyphen. The output is:

First: 31
Second: 28
Third: 31

Next, we use the same method with a different delimiter, a comma. The script is:

re1="^([^,]+),(.*)$"
[[ "high,risk" =~ $re1 ]] && v1="${BASH_REMATCH[1]}" && v2="${BASH_REMATCH[2]}"
echo "First:" $v1
echo "Second:" $v2

The output is:

First: high
Second: risk
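When the number of fields is fixed, a single regular expression with one capture group per field can replace the two-step match. The following is a minimal sketch assuming exactly three hyphen-separated fields; the names re3, first, second, and third are illustrative only.

```shell
# One capture group per field; [^-]+ matches a run of non-hyphen characters.
re3='^([^-]+)-([^-]+)-([^-]+)$'

if [[ "31-28-31" =~ $re3 ]]; then
    first="${BASH_REMATCH[1]}"
    second="${BASH_REMATCH[2]}"
    third="${BASH_REMATCH[3]}"
    echo "First: $first"
    echo "Second: $second"
    echo "Third: $third"
else
    echo "No match" >&2
fi
```

Testing the match in an if statement also makes the failure case explicit, instead of silently leaving the variables unset.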

We have presented different ways in Bash to split a string and store the parts in variables. Readers can choose the method that best suits their requirements.
