Pandas Convert String to Numeric Type

  1. pandas.to_numeric() Method
  2. Convert String Values of Pandas DataFrame to Numeric Type Using the pandas.to_numeric() Method
  3. Convert String Values of Pandas DataFrame to Numeric Type With Other Characters in It

This tutorial explains how to convert the string values of a Pandas DataFrame to a numeric type using the pandas.to_numeric() method.

import pandas as pd

items_df = pd.DataFrame({
    'Id': [302, 504, 708, 103, 343, 565],
    'Name': ['Watch', 'Camera', 'Phone', 'Shoes', 'Laptop', 'Bed'],
    'Cost': ["300", "400", "350", "100", "1000", "400"],
})

print(items_df)

Output:

    Id    Name  Cost
0  302   Watch   300
1  504  Camera   400
2  708   Phone   350
3  103   Shoes   100
4  343  Laptop  1000
5  565     Bed   400

We will use this DataFrame, whose Cost column holds numbers stored as strings, to demonstrate how to convert DataFrame values to a numeric type.

pandas.to_numeric() Method

Syntax

pandas.to_numeric(arg, 
                  errors='raise', 
                  downcast=None)

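It converts the argument passed as arg, which can be a scalar, list, tuple, 1-D array, or Series, to a numeric type. By default, the result has the dtype int64 or float64, depending on the data. The errors parameter controls what happens when a value cannot be parsed: 'raise' (the default) raises an exception, while 'coerce' sets the invalid value to NaN. The downcast parameter accepts 'integer', 'signed', 'unsigned', or 'float' and converts the result to the smallest numeric dtype of that kind that can hold the data.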
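The effect of the errors and downcast parameters can be seen in a minimal sketch. The sample values below are hypothetical and are not part of the items_df example.

```python
import pandas as pd

# Hypothetical sample values; "n/a" cannot be parsed as a number.
raw = pd.Series(["1", "2", "n/a"])

# errors="coerce" replaces unparseable entries with NaN instead of raising.
coerced = pd.to_numeric(raw, errors="coerce")
print(coerced.dtype)  # float64, because NaN requires a float column

# downcast="integer" requests the smallest signed integer dtype that fits.
small = pd.to_numeric(pd.Series(["1", "2", "3"]), downcast="integer")
print(small.dtype)  # int8
```

With the default errors='raise', the first call would raise a ValueError instead of producing NaN.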

Convert String Values of Pandas DataFrame to Numeric Type Using the pandas.to_numeric() Method

import pandas as pd

items_df = pd.DataFrame({
    'Id': [302, 504, 708, 103, 343, 565],
    'Name': ['Watch', 'Camera', 'Phone', 'Shoes', 'Laptop', 'Bed'],
    'Cost': ["300", "400", "350", "100", "1000", "400"],
})

print("The items DataFrame is:")
print(items_df, "\n")

print("Datatype of Cost column before type conversion:")
print(items_df["Cost"].dtypes, "\n")

items_df["Cost"] = pd.to_numeric(items_df["Cost"])
print("Datatype of Cost column after type conversion:")
print(items_df["Cost"].dtypes)

Output:

The items DataFrame is:
    Id    Name  Cost
0  302   Watch   300
1  504  Camera   400
2  708   Phone   350
3  103   Shoes   100
4  343  Laptop  1000
5  565     Bed   400 

Datatype of Cost column before type conversion:
object 

Datatype of Cost column after type conversion:
int64

The call to pd.to_numeric() changes the data type of the Cost column of items_df from object to int64.
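When several columns hold numeric strings, the same conversion can be applied to all of them at once with DataFrame.apply(). The following is a sketch under assumptions: the Stock column and the decimal value in Cost are invented for illustration.

```python
import pandas as pd

# Hypothetical frame: "Stock" is an extra column invented for illustration.
df = pd.DataFrame({
    "Name": ["Watch", "Camera", "Phone"],
    "Cost": ["300", "400.5", "350"],
    "Stock": ["12", "7", "30"],
})

# apply() calls pd.to_numeric once per selected column.
numeric_cols = ["Cost", "Stock"]
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric)

print(df.dtypes)
```

Each column gets its own dtype: a column containing a decimal value becomes a float column, while a column of whole numbers becomes an integer column.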

Convert String Values of Pandas DataFrame to Numeric Type With Other Characters in It

If a column's values contain non-numeric characters, such as a currency symbol, pandas.to_numeric() raises ValueError: Unable to parse string. In such cases, we can remove the non-numeric characters first and then perform the type conversion.

import pandas as pd

items_df = pd.DataFrame({
    'Id': [302, 504, 708, 103, 343, 565],
    'Name': ['Watch', 'Camera', 'Phone', 'Shoes', 'Laptop', 'Bed'],
    'Cost': ["$300", "$400", "$350", "$100", "$1000", "$400"],
})

print("The items DataFrame is:")
print(items_df, "\n")

print("Datatype of Cost column before type conversion:")
print(items_df["Cost"].dtypes, "\n")

# regex=False treats '$' as a literal character rather than a regex anchor
items_df["Cost"] = pd.to_numeric(items_df["Cost"].str.replace('$', '', regex=False))
print("Datatype of Cost column after type conversion:")
print(items_df["Cost"].dtypes, "\n")

print("DataFrame after Type Conversion:")
print(items_df)

Output:

The items DataFrame is:
    Id    Name   Cost
0  302   Watch   $300
1  504  Camera   $400
2  708   Phone   $350
3  103   Shoes   $100
4  343  Laptop  $1000
5  565     Bed   $400 

Datatype of Cost column before type conversion:
object 

Datatype of Cost column after type conversion:
int64 

DataFrame after Type Conversion:
    Id    Name  Cost
0  302   Watch   300
1  504  Camera   400
2  708   Phone   350
3  103   Shoes   100
4  343  Laptop  1000
5  565     Bed   400

It removes the $ character from the values of the Cost column using str.replace() and then converts these values to a numeric type using the pandas.to_numeric() method.
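When the unwanted characters vary from row to row, one possible approach is to strip everything except digits, the decimal point, and the minus sign with a regular expression, and to coerce whatever still cannot be parsed. This is a sketch, not a general-purpose parser: the sample prices are hypothetical, and the pattern assumes that a comma is a thousands separator and a dot is the decimal point.

```python
import pandas as pd

# Hypothetical messy prices: currency symbol, thousands separator,
# unit suffix, and one entry with no number at all.
prices = pd.Series(["$1,000", "$400.50", "350 USD", "unknown"])

# Keep only digits, the decimal point, and a minus sign.
cleaned = prices.str.replace(r"[^0-9.\-]", "", regex=True)

# Entries left without a parseable number become NaN.
numbers = pd.to_numeric(cleaned, errors="coerce")
print(numbers)
```

The resulting column is a float column because it contains both a decimal value and a missing value.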
