How to Remove Duplicates From Vector in C++

Jinku Hu Feb 12, 2024
  1. Remove Duplicates From a Vector in C++ Using std::sort and std::unique
  2. Remove Duplicates From a Vector in C++ Using std::sort and std::unique With resize
  3. Remove Duplicates From a Vector in C++ Using std::set
  4. Remove Duplicates From a Vector in C++ Using std::unordered_set for Improved Performance
  5. Remove Duplicates From a Vector in C++ Using a Loop
  6. Conclusion

Efficiently managing data is a fundamental aspect of software development, and one common challenge is handling duplicate elements within a collection like a vector. In C++, where versatility and performance are crucial, knowing effective methods to remove duplicates from a vector is essential.

This article explores several techniques for deduplicating a vector, ranging from standard library algorithms like std::sort and std::unique to containers like std::set and std::unordered_set. We’ll also cover a loop-based approach.

Remove Duplicates From a Vector in C++ Using std::sort and std::unique

Removal of duplicate elements from a vector in C++ can be efficiently achieved using the combination of std::sort and std::unique, two powerful functions provided by the C++ Standard Template Library (STL).

The std::sort function is used to sort the elements in a specified range. In the context of removing duplicates, sorting is essential as it brings identical elements together, making it easier for std::unique to identify and remove duplicates efficiently.

#include <algorithm>
#include <vector>

std::sort(myVector.begin(), myVector.end());

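The std::unique function, in turn, removes all but the first element from every run of consecutive equal elements. It does not require a sorted range, but it only compares neighboring elements, which is why we sort first. It shifts the retained elements towards the beginning of the range and returns an iterator pointing to the end of the new unique range. It does not change the size of the vector.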

auto last = std::unique(myVector.begin(), myVector.end());
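To see why the sort matters, the sketch below applies std::unique to the same data with and without sorting first. The helper names keptWithoutSort and keptWithSort are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: number of elements std::unique keeps when it is
// applied to the input as-is. Takes a copy so the caller's data is untouched.
std::size_t keptWithoutSort(std::vector<int> v) {
  auto last = std::unique(v.begin(), v.end());
  return static_cast<std::size_t>(std::distance(v.begin(), last));
}

// Hypothetical helper: same measurement, but the copy is sorted first so
// that equal values become neighbors.
std::size_t keptWithSort(std::vector<int> v) {
  std::sort(v.begin(), v.end());
  auto last = std::unique(v.begin(), v.end());
  return static_cast<std::size_t>(std::distance(v.begin(), last));
}
```

For the input {10, 23, 10, 10}, the unsorted version keeps three elements (10, 23, 10) because only the last two values are neighbors, while the sorted version keeps two.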

With the sorted and unique elements identified, the duplicates can be erased from the vector using the erase member function:

myVector.erase(last, myVector.end());

Now, let’s put these concepts into practice with a complete working example:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector<int> myVector = {10,  23,  10,  324, 10, 10, 424,
                               649, 110, 110, 129, 40, 424};

  std::sort(myVector.begin(), myVector.end());

  auto last = std::unique(myVector.begin(), myVector.end());

  myVector.erase(last, myVector.end());

  std::cout << "Unique elements: ";
  for (const auto& element : myVector) {
    std::cout << element << "; ";
  }

  return 0;
}

In the provided C++ code example, we begin by including the necessary headers for the Standard Template Library (STL) components we use:

#include <algorithm>
#include <iostream>
#include <vector>

In this example, our vector myVector is initialized with a set of integers, some of which are duplicates. The initial vector looks like this:

std::vector<int> myVector = {10,  23,  10,  324, 10, 10, 424,
                             649, 110, 110, 129, 40, 424};

Now, the first critical step is to sort the vector using std::sort. Sorting is necessary because std::unique only removes duplicates that are adjacent to each other.

The line of code below achieves this:

std::sort(myVector.begin(), myVector.end());

After sorting, our vector is transformed into a sorted sequence of elements:

10; 10; 10; 23; 40; 110; 110; 129; 324; 424; 424; 649;

Next, we utilize the std::unique function to identify consecutive duplicate elements within the sorted range. It returns an iterator pointing to the end of the newly formed unique range:

auto last = std::unique(myVector.begin(), myVector.end());

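After std::unique, the vector still holds all 13 elements. The first eight positions, up to last, now hold the unique values, and the elements from last onward are left with unspecified values:

10; 23; 40; 110; 129; 324; 424; 649;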

Finally, to reflect these modifications in the original vector, we use the erase member function. It removes elements from the vector starting from the last iterator up to the end:

myVector.erase(last, myVector.end());
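One way to confirm that std::unique on its own does not shrink the vector is to record the size before and after the call to erase. This is a minimal sketch; the helper name sizesAroundErase is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns {size right after std::unique, size after erase}.
std::pair<std::size_t, std::size_t> sizesAroundErase(std::vector<int> v) {
  std::sort(v.begin(), v.end());
  auto last = std::unique(v.begin(), v.end());
  std::size_t afterUnique = v.size();  // unchanged: std::unique never resizes
  v.erase(last, v.end());              // this call actually shrinks the vector
  return {afterUnique, v.size()};
}
```

For the input {10, 23, 10, 324, 10}, the first value is 5 and the second is 3.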

Now, our vector is updated, containing unique elements only. To visualize the result, we loop through the modified vector and output the unique elements.

Code Output:

Unique elements: 10; 23; 40; 110; 129; 324; 424; 649;

This output reflects the vector after successfully removing duplicates using the std::sort and std::unique combination.
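The three steps are often combined into a single statement, sometimes called the erase-unique idiom. The sketch below wraps it in a small template; the name sortedUnique is a hypothetical helper, not a standard library function, and it assumes the element type supports operator< and operator==.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns a sorted copy of the input with duplicates
// removed. The argument is taken by value, so the caller's vector is untouched.
template <typename T>
std::vector<T> sortedUnique(std::vector<T> v) {
  std::sort(v.begin(), v.end());
  // std::unique returns the new logical end; erase drops everything after it.
  v.erase(std::unique(v.begin(), v.end()), v.end());
  return v;
}
```

Calling sortedUnique(myVector) with the vector from the example above yields the same eight values as the program output.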

Remove Duplicates From a Vector in C++ Using std::sort and std::unique With resize

When tasked with removing duplicate elements from a vector, we have seen that the combination of std::sort and std::unique is a convenient choice. However, an alternative approach is to use the resize function instead of erase to modify the vector’s size directly.

The initial steps remain the same, where the vector is sorted using std::sort to bring duplicate elements together, and std::unique is applied to identify and shift the unique elements:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

std::sort(myVector.begin(), myVector.end());
auto last = std::unique(myVector.begin(), myVector.end());

Instead of erasing the duplicates with erase, we use the resize function, which directly modifies the size of the vector. The argument to resize is the distance between the beginning of the vector and the iterator returned by std::unique:

myVector.resize(std::distance(myVector.begin(), last));

Let’s explore this technique with a complete working example:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

int main() {
  std::vector<int> myVector = {10,  23,  10,  324, 10, 10, 424,
                               649, 110, 110, 129, 40, 424};

  std::sort(myVector.begin(), myVector.end());

  auto last = std::unique(myVector.begin(), myVector.end());

  myVector.resize(std::distance(myVector.begin(), last));

  std::cout << "Unique elements using resize: ";
  for (const auto& element : myVector) {
    std::cout << element << "; ";
  }

  return 0;
}

The code begins by including the necessary headers and initializing a vector with duplicate elements. After sorting and applying std::unique, instead of erasing duplicates, we employ the resize function to adjust the size of the vector directly.

This results in a vector containing only the unique elements.

myVector.resize(std::distance(myVector.begin(), last));

The loop at the end iterates through the modified vector to display the unique elements.

Code Output:

Unique elements using resize: 10; 23; 40; 110; 129; 324; 424; 649;

This output reflects the vector after successfully removing duplicates using std::sort and std::unique, with the resize function providing an efficient alternative to the erase operation.
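One difference worth keeping in mind: the single-argument resize requires the element type to be default-constructible, even when the vector only shrinks, whereas erase does not. The sketch below uses a hypothetical Id type without a default constructor to show the erase variant working in that situation; the names Id and countDistinct are assumptions made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical element type with no default constructor.
struct Id {
  int value;
  explicit Id(int v) : value(v) {}
};

bool operator<(const Id& a, const Id& b) { return a.value < b.value; }
bool operator==(const Id& a, const Id& b) { return a.value == b.value; }

// Hypothetical helper: counts the distinct ids. It uses erase, which only
// needs Id to be movable; resize(n) would also need a default constructor.
std::size_t countDistinct(std::vector<Id> ids) {
  std::sort(ids.begin(), ids.end());
  ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
  return ids.size();
}
```

For element types such as int, both variants behave the same, so the choice is mostly a matter of taste.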

Remove Duplicates From a Vector in C++ Using std::set

In C++, another effective approach to removing duplicate elements from a vector is to use the std::set container. Unlike vectors, sets store only unique elements, making them a natural choice for deduplication tasks. In this section, we’ll explore how to use std::set to achieve this goal.

The std::set container in C++ automatically maintains a sorted, unique collection of elements. To remove duplicates from a vector, we can initialize a set with the vector elements, and the set’s uniqueness property takes care of discarding duplicate values:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <set>
#include <vector>

std::set<int> uniqueSet(myVector.begin(), myVector.end());

Once the set is populated with unique elements, the assign function can be employed to overwrite the original vector with these unique values:

myVector.assign(uniqueSet.begin(), uniqueSet.end());

Now, let’s put these concepts into practice with a complete working example:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <set>
#include <vector>

int main() {
  std::vector<int> myVector = {10,  23,  10,  324, 10, 10, 424,
                               649, 110, 110, 129, 40, 424};

  std::set<int> uniqueSet(myVector.begin(), myVector.end());

  myVector.assign(uniqueSet.begin(), uniqueSet.end());

  std::cout << "Unique elements using std::set: ";
  for (const auto& element : myVector) {
    std::cout << element << "; ";
  }

  return 0;
}

In this example, we begin by including the necessary headers and initializing a vector with duplicate elements.

The critical step is creating a set, uniqueSet, and populating it with the elements of the original vector. The set’s unique property automatically ensures that only distinct elements are stored.

std::set<int> uniqueSet(myVector.begin(), myVector.end());

Following the creation of the set, we utilize the assign function to overwrite the original vector with the unique elements contained in the set:

myVector.assign(uniqueSet.begin(), uniqueSet.end());

Now, the vector, myVector, is updated to contain unique elements only. The loop at the end iterates through the modified vector to display the unique elements.

Code Output:

Unique elements using std::set: 10; 23; 40; 110; 129; 324; 424; 649;

This output displays the vector after successfully removing duplicates using the std::set container. The set’s inherent uniqueness property simplifies the deduplication process. Note that the result comes out in ascending order, because std::set keeps its elements sorted.
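Because std::set keeps its elements sorted, this approach does not preserve the original order of the vector. The following sketch makes that visible with a small template; uniqueViaSet is a hypothetical helper name, and it assumes the element type supports operator<.

```cpp
#include <set>
#include <vector>

// Hypothetical helper: returns the distinct values of `input` in ascending
// order, leaving `input` itself unchanged.
template <typename T>
std::vector<T> uniqueViaSet(const std::vector<T>& input) {
  std::set<T> uniqueSet(input.begin(), input.end());  // drops duplicates
  return std::vector<T>(uniqueSet.begin(), uniqueSet.end());
}
```

For the input {324, 10, 23, 10}, the result is {10, 23, 324} rather than {324, 10, 23}.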

Remove Duplicates From a Vector in C++ Using std::unordered_set for Improved Performance

If performance is a priority, using std::unordered_set to remove duplicates from a vector can be a highly efficient choice. Unlike std::set, std::unordered_set doesn’t maintain a sorted order, making it faster for insertion and lookup operations.

The std::unordered_set container in C++ is an unordered associative container that stores unique elements. Due to its hash-based implementation, it provides constant-time average complexity for insertion and lookup operations.

In the context of removing duplicates, we can use it as follows:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <unordered_set>
#include <vector>

std::unordered_set<int> uniqueSet;

The set starts out empty. We then pass over the vector with std::remove_if and try to insert each element into the set. The insertion fails for any value that is already present, so those elements are marked as duplicates and erased:

myVector.erase(std::remove_if(myVector.begin(), myVector.end(),
                              [&uniqueSet](const int& val) {
                                return !uniqueSet.insert(val).second;
                              }),
               myVector.end());

Let’s examine this approach with a complete working example:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <unordered_set>
#include <vector>

int main() {
  std::vector<int> myVector = {10,  23,  10,  324, 10, 10, 424,
                               649, 110, 110, 129, 40, 424};

  // Start with an empty set; it records the values seen so far.
  std::unordered_set<int> uniqueSet;

  // insert() reports false in .second when the value is already in the set,
  // so the predicate is true exactly for repeated values.
  myVector.erase(std::remove_if(myVector.begin(), myVector.end(),
                                [&uniqueSet](const int& val) {
                                  return !uniqueSet.insert(val).second;
                                }),
                 myVector.end());

  std::cout << "Unique elements using unordered_set: ";
  for (const auto& element : myVector) {
    std::cout << element << "; ";
  }

  return 0;
}

The code begins by including the necessary headers and initializing a vector with duplicate elements. An empty std::unordered_set is created to keep track of the values encountered so far.

The call to std::remove_if, combined with erase, then removes duplicates from the original vector based on the contents of the set.

myVector.erase(std::remove_if(myVector.begin(), myVector.end(),
                              [&uniqueSet](const int& val) {
                                return !uniqueSet.insert(val).second;
                              }),
               myVector.end());

The predicate checks whether the insertion into the unordered set succeeds. If the element already exists in the set, it is a duplicate and is removed from the vector. The first occurrence of each value is kept, so the remaining elements stay in their original relative order.

Code Output:

Unique elements using unordered_set: 10; 23; 324; 424; 649; 110; 129; 40;

This output reflects the vector after successfully removing duplicates using std::unordered_set. The unordered set’s hash-based implementation contributes to faster insertion and lookup times, making it a performant choice for deduplication tasks.
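The same idea can also be written as an explicit loop that starts from an empty set and copies each value the first time it is seen. This is a minimal sketch rather than the only way to do it; the helper name uniqueKeepOrder is hypothetical.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical helper: keeps the first occurrence of each value and
// preserves the original relative order of the input.
std::vector<int> uniqueKeepOrder(const std::vector<int>& input) {
  std::unordered_set<int> seen;  // starts empty
  std::vector<int> result;
  result.reserve(input.size());
  for (int value : input) {
    // insert() returns a pair; .second is true only when the value is new.
    if (seen.insert(value).second) {
      result.push_back(value);
    }
  }
  return result;
}
```

Unlike the sort-based and std::set-based approaches, this variant does not reorder the data: for the vector used throughout this article, it returns 10, 23, 324, 424, 649, 110, 129, 40.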

Remove Duplicates From a Vector in C++ Using a Loop

When simplicity is the priority and the vector is small, a straightforward approach is to use a loop to remove duplicates from a vector in C++. This method involves iterating through the vector and erasing each element that has already appeared earlier in the vector.

The core idea is to iterate through the vector and selectively erase elements that are duplicates. A loop condition checks if an element has already been encountered and, if so, removes it from the vector:

#include <algorithm>
#include <iostream>
#include <vector>

for (auto it = myVector.begin(); it != myVector.end(); ++it) {
  if (std::find(myVector.begin(), it, *it) != it) {
    it = myVector.erase(it) - 1;
  }
}

Let’s explore this loop-based technique with a complete working example:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector<int> myVector = {10,  23,  10,  324, 10, 10, 424,
                               649, 110, 110, 129, 40, 424};

  for (auto it = myVector.begin(); it != myVector.end(); ++it) {
    if (std::find(myVector.begin(), it, *it) != it) {
      it = myVector.erase(it) - 1;
    }
  }

  std::cout << "Unique elements using a loop: ";
  for (const auto& element : myVector) {
    std::cout << element << "; ";
  }

  return 0;
}

The code begins by including the necessary headers and initializing a vector with duplicate elements. The loop iterates through the vector using an iterator, and for each element, it checks if the element has already been encountered in the vector before the current position.

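If a duplicate is found, the element is erased. The erase function returns an iterator to the element that followed the erased one, so we step back by one to let the loop’s ++it land on that element. The first element can never be a duplicate, so this step back never moves before the beginning of the vector.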

for (auto it = myVector.begin(); it != myVector.end(); ++it) {
  if (std::find(myVector.begin(), it, *it) != it) {
    it = myVector.erase(it) - 1;
  }
}

This loop continues until the end of the vector is reached, effectively removing duplicates.

Code Output:

Unique elements using a loop: 10; 23; 324; 424; 649; 110; 129; 40;

This output reflects the vector after successfully removing duplicates using a loop. This method runs in quadratic time, since each element is compared against all the elements before it, so it is slower than the other approaches on large vectors. In return, it keeps the elements in their original order and needs no extra container.
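The loop can also be written without the - 1 adjustment by advancing the iterator only when nothing was erased. The sketch below shows this variant; uniqueViaLoop is a hypothetical helper name, and the behavior is intended to match the loop above.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: quadratic-time deduplication that keeps the first
// occurrence of each value. Works on a copy and returns the result.
std::vector<int> uniqueViaLoop(std::vector<int> v) {
  for (auto it = v.begin(); it != v.end();) {
    if (std::find(v.begin(), it, *it) != it) {
      it = v.erase(it);  // erase returns an iterator to the next element
    } else {
      ++it;  // advance only when the current element was kept
    }
  }
  return v;
}
```

Leaving the increment out of the for header makes it explicit that an erased position is re-examined rather than skipped.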

Conclusion

Removing duplicates from a vector in C++ is a common task with various approaches, each offering its advantages and considerations. We explored several techniques in this article, including the use of std::sort and std::unique, std::set, std::unordered_set for improved performance and a loop-based approach.

The std::sort and std::unique combination provides a simple and effective solution that runs in O(n log n) time, while std::set offers an alternative that also yields sorted output. For larger datasets where sorted output is not needed, std::unordered_set presents a hash-based approach with linear time on average.

Additionally, a loop-based method provides a straightforward alternative. The choice of method depends on the specific requirements, emphasizing the importance of considering factors such as performance, simplicity, and the need for a sorted output. With these techniques, you can confidently tackle the task of deduplicating vectors in C++ based on your unique project constraints and goals.

Author: Jinku Hu

Founder of DelftStack.com. Jinku has worked in the robotics and automotive industries for over 8 years. He sharpened his coding skills when he needed to do the automatic testing, data collection from remote servers and report creation from the endurance test. He is from an electrical/electronics engineering background but has expanded his interest to embedded electronics, embedded programming and front-/back-end programming.
