Raw String and Unicode String in Python

Neema Muganga Jan-23, 2022 Oct-26, 2021
  1. Raw String in Python
  2. Python Unicode String

Raw String in Python

A raw string literal in Python is a normal string literal prefixed with r or R before the opening quote. If the string contains a backslash (\), the raw string treats it as a literal character rather than as the start of an escape sequence.

For example,
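A minimal sketch of this behavior follows. The sample text and variable names are illustrative assumptions, not taken from the original article.

```python
# Illustrative sample text (an assumption, not from the original article).
normal = 'a\tb'    # \t is an escape sequence: a single tab character
raw = r'a\tb'      # the backslash and the t stay as two literal characters
upper = R'a\tb'    # an uppercase R prefix behaves the same way

print(len(normal))   # 3
print(len(raw))      # 4
print(repr(raw))     # 'a\\tb'
print(raw == upper)  # True
```

The repr() output shows a doubled backslash because that is how Python displays one literal backslash inside a normal string.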




In a normal string literal, every backslash must be doubled so that it is not mistaken for the beginning of an escape sequence such as a newline (\n) or a tab (\t). Raw strings remove the need for this doubling, which is why they are common in regular expressions and in Windows file paths.
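The two use cases mentioned above can be sketched as follows. The file path, the pattern, and the sample sentence are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical Windows path, used only for illustration.
escaped_path = 'C:\\Users\\new\\table.txt'  # normal literal: every backslash doubled
raw_path = r'C:\Users\new\table.txt'        # raw literal: same text, no doubling
print(escaped_path == raw_path)             # True

# The same regular expression written both ways.
pattern_escaped = '\\d+\\.\\d+'
pattern_raw = r'\d+\.\d+'
print(pattern_escaped == pattern_raw)       # True

match = re.search(pattern_raw, 'version 3.10 released')
print(match.group())                        # 3.10
```

Both spellings produce identical strings, so the raw form is purely a readability choice.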

Note: r'\' raises a syntax error. Even in a raw string, a backslash placed directly before a quote keeps that quote from ending the literal, so the string is never closed. As a result, a raw string cannot end with a single backslash. Without the r prefix, the backslash is treated as an escape character.
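The error can be observed without breaking the script by passing the offending source text to the built-in compile() function. This is a sketch; the variable names and the C:\temp path are hypothetical.

```python
# The four-character source text r'\' is built as a normal string here,
# because writing it directly would stop this file from being parsed.
bad_source = "r'\\'"

try:
    compile(bad_source, '<example>', 'eval')
    raised = False
except SyntaxError:
    raised = True
print(raised)  # True

# Two common ways to get a trailing backslash anyway.
single_backslash = '\\'            # normal literal with an escaped backslash
folder = r'C:\temp' + '\\'         # raw literal plus a normal one
print(len(single_backslash))       # 1
print(folder)                      # C:\temp\

# A backslash before a quote is allowed mid-string, but both characters are kept.
kept = r'\''
print(len(kept))                   # 2
```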





Without the raw string flag r, the backslash is treated as an escape character. When a string containing \n is printed, the newline escape sequence takes effect, so the text on either side of it is printed on separate lines.

Using the same text example, add the r prefix before the string.
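The comparison can be sketched as follows. The text Hello\nWorld is an assumed stand-in, since the article's own example string is not shown here.

```python
# Assumed sample text; the article's original example string is not shown.
normal_text = 'Hello\nWorld'
raw_text = r'Hello\nWorld'

print(normal_text)
# Hello
# World

print(raw_text)
# Hello\nWorld
```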





With the r prefix, the backslash is treated as a literal character, so the text is printed with the backslash included. The printed output matches the text as written in the source because \n is no longer interpreted as an escape sequence.

For instance, '\\n' and r'\n' have the same value: a backslash followed by the letter n.
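A quick check of this equivalence, with illustrative variable names:

```python
escaped = '\\n'   # normal literal, backslash doubled
raw = r'\n'       # raw literal

print(escaped == raw)    # True
print(len(raw))          # 2
print(raw == '\n')       # False: '\n' is a single newline character
print(type(raw) is str)  # True: the r prefix does not create a separate type
```

The r prefix only changes how the literal is read from source code; the resulting object is an ordinary str.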


Python Unicode String

Unicode is a standard that can represent text from virtually every writing system. In Python 3, str is the default string type and always holds Unicode text, while raw bytes are stored in the separate bytes type. In Python 2, str is a byte string that defaults to ASCII, and Unicode text is stored in a separate unicode type.

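In Python 2, create a Unicode string by putting a u before the literal, as in u'string', or by calling the unicode() function, as in unicode('string'). In Python 3, the u prefix is still accepted but has no effect, and unicode() no longer exists; a bytes object is converted to text with its decode() method instead.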
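In Python 3, the rough counterpart of this conversion is decoding a bytes object. The sketch below assumes UTF-8 as the encoding; the variable names are illustrative.

```python
byte_string = b'string'                       # bytes object
unicode_string = byte_string.decode('utf-8')  # bytes -> str (Unicode text)
same = str(byte_string, 'utf-8')              # equivalent spelling
prefixed = u'string'                          # u prefix is accepted but has no effect

print(type(byte_string).__name__)     # bytes
print(type(unicode_string).__name__)  # str
print(unicode_string == same == prefixed)  # True
```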

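In Python 2, u'text' is a Unicode string while 'text' is a byte string. In Python 3, both are Unicode strings, and the byte string is written b'text'. A Unicode object generally takes more memory than a byte string holding the same ASCII content.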

For example,

test = u"一二三"
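A short sketch of how this string can be inspected in Python 3. The UTF-8 encoding step is an added assumption for illustration.

```python
test = u"一二三"

print(type(test).__name__)            # str
print(len(test))                      # 3
print([hex(ord(ch)) for ch in test])  # ['0x4e00', '0x4e8c', '0x4e09']

# Encoding turns the text into bytes; UTF-8 is assumed here.
encoded = test.encode('utf-8')
print(len(encoded))                   # 9
print(encoded.decode('utf-8') == test)  # True
```

The string holds three characters, but each one needs three bytes in UTF-8, so the encoded form is nine bytes long.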


