The boost::split Function in C++

Jinku Hu Oct 12, 2023
  1. Use the boost::split Function to Tokenize the Given String
  2. Use stringstream With getline Function to Split the String With Delimiters

This article will demonstrate how to use the boost::split function in C++.

Use the boost::split Function to Tokenize the Given String

Boost is a collection of mature, well-tested libraries that extend the C++ standard library. This article explores the boost::split function, which is part of the Boost String Algorithms library. That library also includes other string manipulation algorithms, such as trimming and replacing.

The boost::split function splits the given string into tokens separated by delimiters. Delimiters are identified by a predicate passed as the third parameter: it is called on each character and should return true if that character is a delimiter.

In the following example, we pass a predicate based on isspace to identify whitespace in the given text and split the text into tokens. boost::split also needs a destination sequence container to store the tokenized substrings. Note that the destination container is passed as the first parameter, and its previous contents are overwritten by the function call.

#include <boost/algorithm/string/split.hpp>
#include <cctype>
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

using std::cout;
using std::endl;
using std::string;
using std::vector;

int main() {
  string text = "Lorem ipsum  dolor sit amet, consectetur adipiscing elit.";
  vector<string> words;

  // Predicate: true for whitespace. Taking unsigned char keeps the
  // std::isspace call well-defined for every character value.
  boost::split(words, text,
               [](unsigned char c) { return std::isspace(c) != 0; });
  for (const auto &item : words) {
    cout << item << "; ";
  }
  cout << endl;

  return EXIT_SUCCESS;
}

Output:

Lorem; ipsum; ; dolor; sit; amet,; consectetur; adipiscing; elit.;

The boost::split call in the previous code snippet stores empty strings when two or more delimiters are adjacent. However, we can pass boost::token_compress_on as the optional fourth parameter, and adjacent delimiters will be merged, as shown in the following example. Note that the Boost headers must be installed on the system to compile these two code snippets.

#include <boost/algorithm/string/split.hpp>
#include <cctype>
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

using std::cout;
using std::endl;
using std::string;
using std::vector;

int main() {
  string text = "Lorem ipsum  dolor sit amet, consectetur adipiscing elit.";
  vector<string> words;

  // token_compress_on merges runs of adjacent delimiters into one.
  boost::split(words, text,
               [](unsigned char c) { return std::isspace(c) != 0; },
               boost::token_compress_on);
  for (const auto &item : words) {
    cout << item << "; ";
  }
  cout << endl;

  return EXIT_SUCCESS;
}

Output:

Lorem; ipsum; dolor; sit; amet,; consectetur; adipiscing; elit.;

Use stringstream With getline Function to Split the String With Delimiters

Alternatively, one can employ the std::stringstream class and the std::getline function to split text on a single delimiter character. In this case, we only use the C++ standard library, and there's no need to include Boost headers. Note that this version is bulkier and needs extra steps to merge adjacent delimiter characters, so the output of the following example still contains an empty token.

#include <cstdlib>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

using std::cout;
using std::endl;
using std::string;
using std::stringstream;
using std::vector;

int main() {
  string text = "Lorem ipsum  dolor sit amet, consectetur adipiscing elit.";
  vector<string> words;

  char space_char = ' ';
  stringstream sstream(text);
  string word;
  while (std::getline(sstream, word, space_char)) {
    words.push_back(word);
  }

  for (const auto &item : words) {
    cout << item << "; ";
  }
  cout << endl;

  return EXIT_SUCCESS;
}

Output:

Lorem; ipsum; ; dolor; sit; amet,; consectetur; adipiscing; elit.;