Ways to Remove xa0 From a String in Python

Najwa Riyaz Oct-22, 2021 Jul-02, 2021 Python Python String
  1. Use the Unicodedata’s Normalize() Function to Remove \xa0 From a String in Python
  2. Use the String’s replace() Function to Remove \xa0 From a String in Python
  3. Use the BeautifulSoup Library’s get_text() Function With strip Set as True to Remove \xa0 From a String in Python

This article introduces different methods to remove \xa0 from a string in Python.

\xa0 is the escape sequence for the Unicode character U+00A0, the no-break space (also called a hard space). It is represented as &nbsp; in HTML.
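As a quick illustration of what this character is, the minimal sketch below compares the different spellings of the no-break space and decodes the HTML entity with the standard library's html.unescape(). The variable names are chosen only for this example.

```python
import html

nbsp = "\xa0"

# "\xa0", "\u00a0" and "\N{NO-BREAK SPACE}" are three spellings of the same character.
print(nbsp == "\u00a0" == "\N{NO-BREAK SPACE}")  # True

# Python treats it as whitespace, but it is not equal to an ordinary space.
print(nbsp.isspace())  # True
print(nbsp == " ")     # False

# Decoding the HTML entity &nbsp; yields the same character.
decoded = html.unescape("17&nbsp;kg")
print(repr(decoded))   # '17\xa0kg'
```

This is why text scraped from web pages often contains \xa0 even though it looks like it has ordinary spaces.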

The Python functions that can help to remove \xa0 from a string are as follows.

  • The normalize() function of unicodedata
  • The string’s replace() function
  • The BeautifulSoup library’s get_text() function with strip enabled as True.

Use the Unicodedata’s Normalize() Function to Remove \xa0 From a String in Python

You can use the normalize() function from the standard library’s unicodedata module to remove \xa0 from a string.

The normalize() function is used as follows.

unicodedata.normalize("NFKD", string_to_normalize)

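Here, NFKD denotes Normalization Form KD (compatibility decomposition). It replaces every compatibility character with its equivalent, so \xa0 becomes a regular space. Note that it is not limited to \xa0: other compatibility characters, such as ligatures and superscripts, are replaced as well, and accented letters are split into a base letter plus a combining mark.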

The example program below illustrates this.

import unicodedata

str_hard_space = '17\xa0kg on 23rd\xa0June 2021'
print(str_hard_space)
xa = '\xa0'

# Check for the no-break space before normalizing
if xa in str_hard_space:
    print("xa0 is Found!")
else:
    print("xa0 is not Found!")

# NFKD maps the no-break space to a regular space
new_str = unicodedata.normalize("NFKD", str_hard_space)
print(new_str)
if xa in new_str:
    print("xa0 is Found!")
else:
    print("xa0 is not Found!")

Output:

17 kg on 23rd June 2021
xa0 is Found!
17 kg on 23rd June 2021
xa0 is not Found!
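Because NFKD normalization is not limited to \xa0, it can help to check what it does to other characters before applying it to real data. The sketch below runs a few sample strings (picked only for illustration) through normalize() and prints the before and after values.

```python
import unicodedata

# Sample strings chosen only for this illustration
samples = {
    "no-break space": "17\xa0kg",
    "superscript two": "x\u00b2",
    "fi ligature": "\ufb01le",
    "e with acute": "caf\u00e9",
}

# Normalize each sample with NFKD
results = {
    label: unicodedata.normalize("NFKD", text)
    for label, text in samples.items()
}

# repr() makes invisible or combining characters visible in the output
for label, text in samples.items():
    print(f"{label}: {text!r} -> {results[label]!r}")
```

If only \xa0 should change and every other character must stay intact, a targeted replacement is the safer choice.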

Use the String’s replace() Function to Remove \xa0 From a String in Python

You can use the string’s replace() function to remove \xa0 from a string.

The replace() function is used as follows.

str_hard_space.replace('\xa0', ' ')

The example below illustrates this.

str_hard_space = '16\xa0kg on 24th\xa0June 2021'
print(str_hard_space)
xa = '\xa0'

# Check for the no-break space before replacing
if xa in str_hard_space:
    print("xa0 Found!")
else:
    print("xa0 not Found!")

# replace() returns a new string; the original is left unchanged
new_str = str_hard_space.replace('\xa0', ' ')
print(new_str)
if xa in new_str:
    print("xa0 Found!")
else:
    print("xa0 not Found!")

Output:

16 kg on 24th June 2021
xa0 Found!
16 kg on 24th June 2021
xa0 not Found!
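The replacement can be wrapped in a small helper when it is needed in several places. The sketch below is one possible shape for it; replace_nbsp is a hypothetical name used only here, not a built-in. It also shows an alternative based on split() and join(), which relies on Python treating \xa0 as whitespace and additionally collapses runs of whitespace into single spaces.

```python
def replace_nbsp(text: str, replacement: str = " ") -> str:
    """Return text with every no-break space swapped for replacement.

    Hypothetical helper for illustration only.
    """
    return text.replace("\xa0", replacement)


raw = "16\xa0kg on 24th\xa0\xa0June 2021"

# Swap each \xa0 for a regular space (the doubled \xa0 becomes two spaces)
spaced = replace_nbsp(raw)

# Delete \xa0 entirely by replacing it with an empty string
removed = replace_nbsp(raw, "")

# split() with no arguments splits on any whitespace, including \xa0,
# so joining the pieces also collapses repeated whitespace
collapsed = " ".join(raw.split())

print(repr(spaced))
print(repr(removed))
print(repr(collapsed))
```

The split() and join() approach changes all whitespace, including tabs and newlines, so it only fits when that normalization is acceptable.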

Use the BeautifulSoup Library’s get_text() Function With strip Set as True to Remove \xa0 From a String in Python

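You can use the get_text() function of the third-party BeautifulSoup library with strip set to True to remove \xa0 from text extracted from HTML. With strip=True, whitespace (including \xa0) is stripped from the start and end of each piece of text; any \xa0 in the middle of the text is left in place.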

The get_text() function is used as follows.

clean_html = BeautifulSoup(input_html, "lxml").get_text(strip=True)

The example below illustrates this.

from bs4 import BeautifulSoup

# \xa0 at both ends of the text; strip=True trims it like any other whitespace
html = '\xa0Hello, This is a test message, Welcome to this website!\xa0'
print(html)

clean_text = BeautifulSoup(html, "lxml").get_text(strip=True)

print(clean_text)

Output:

Hello, This is a test message, Welcome to this website!
Hello, This is a test message, Welcome to this website!
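Since strip=True only trims the ends of each piece of text, a \xa0 in the middle of a string survives get_text(). The sketch below shows this and then removes the remaining character with replace(). It assumes the beautifulsoup4 package is installed and uses the built-in "html.parser" backend instead of "lxml" so that no extra parser is needed; the markup is made up for the example.

```python
from bs4 import BeautifulSoup

# Made-up markup with \xa0 at both ends and in the middle of the text
markup = "<p>\xa0Hello,\xa0world!\xa0</p>"
soup = BeautifulSoup(markup, "html.parser")

# strip=True trims the leading and trailing \xa0 only
stripped = soup.get_text(strip=True)
print(repr(stripped))

# A follow-up replace() handles the \xa0 left inside the text
cleaned = stripped.replace("\xa0", " ")
print(repr(cleaned))
```

Combining the two steps covers both padding around the text and no-break spaces between words.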
