How to change the data type of columns in Pandas

  1. to_numeric() method to convert columns to numeric values in Pandas
  2. astype() method to convert one type to any other data type
  3. infer_objects() method to convert columns datatype to a more specific type

We will introduce methods to change the data type of columns in a Pandas DataFrame: to_numeric(), astype(), and infer_objects(). We will also discuss how to use the downcast option with to_numeric().

to_numeric() method to convert columns to numeric values in Pandas

to_numeric() is a convenient way to convert one or more columns of a DataFrame to numeric values. It tries to convert non-numeric objects (such as strings) into integers or floating-point numbers as appropriate. The input to to_numeric() can be a Series, such as a single column of a DataFrame. By default (errors='raise'), a value that cannot be converted to a numeric type raises an error; passing errors='coerce' forces such values to NaN instead.

Example code:

# python 3.x
import pandas as pd
s = pd.Series([
    '12', '12', '4.7', 'asad', '3.0'])
print(s)
print('------------------------------')
print(pd.to_numeric(s, errors='coerce'))

Output:

0      12
1      12
2     4.7
3    asad
4     3.0
dtype: object
------------------------------
0    12.0
1    12.0
2     4.7
3     NaN
4     3.0
dtype: float64
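To convert several DataFrame columns at once, one option is to pass to_numeric() to DataFrame.apply(), which calls it on each selected column. The following is a minimal sketch; the DataFrame and its column names (name, price, qty) are made up for illustration.

```python
# python 3.x
import pandas as pd

# Hypothetical data: numeric values stored as strings, with one bad entry.
df = pd.DataFrame({
    'name': ['x', 'y', 'z'],
    'price': ['1.5', 'n/a', '3'],
    'qty': ['10', '20', '30'],
})

# apply() calls pd.to_numeric on each selected column and forwards
# errors='coerce', so unparseable strings become NaN.
cols = ['price', 'qty']
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

print(df)
print(df.dtypes)
```

Here price should come out as a floating-point column with NaN in place of 'n/a', while qty becomes an integer column because every value in it parses as a whole number.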

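to_numeric() returns either an int64 or float64 dtype by default. The downcast parameter accepts 'integer', 'signed', 'unsigned', or 'float' and asks Pandas to convert the result to the smallest numeric dtype of that kind that can hold the data: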

# python 3.x
import pandas as pd
s = pd.Series([-3, 1, -5])
print(s)
print(pd.to_numeric(s, downcast='integer'))

Output:

0   -3
1    1
2   -5
dtype: int64
0   -3
1    1
2   -5
dtype: int8
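The other downcast values behave in a similar way. The sketch below, using made-up Series, tries downcast='float' and downcast='unsigned', and also shows that 'unsigned' leaves the data unchanged when negative values are present.

```python
# python 3.x
import pandas as pd

# Values chosen so that they are exactly representable in float32.
floats = pd.Series([1.5, 2.25, 3.0])
print(pd.to_numeric(floats, downcast='float').dtype)  # float32

# 300 does not fit in uint8, so the next smallest unsigned type is used.
non_negative = pd.Series([1, 2, 300])
print(pd.to_numeric(non_negative, downcast='unsigned').dtype)  # uint16

# With negative values, an unsigned downcast is not possible,
# so the original dtype is kept.
mixed_sign = pd.Series([-3, 1, -5])
print(pd.to_numeric(mixed_sign, downcast='unsigned').dtype)  # int64
```

For floats, Pandas does not downcast below float32.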

astype() method to convert one type to any other data type

The astype() method lets us be explicit about the dtype we want to convert to. We can go from one data type to another by passing the target dtype as an argument to astype().

Consider the following code:

# python 3.x
import pandas as pd
c = [['x', '1.23', '14.2'], 
     ['y', '20', '0.11'],
     ['z', '3', '10']]
df = pd.DataFrame(
    c, 
    columns=['first', 'second', 'third'])
print(df)
# astype() returns a new object, so assign the result back
df[['second', 'third']] = df[['second', 'third']].astype(float)
print('Converting..................')
print('............................')
print(df)

Output:

  first second third
0     x   1.23  14.2
1     y     20  0.11
2     z      3    10
Converting..................
............................
  first  second  third
0     x    1.23  14.20
1     y   20.00   0.11
2     z    3.00  10.00
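astype() also accepts a dictionary that maps column names to dtypes, which is handy when each column needs a different target type. The sketch below uses a made-up DataFrame and also shows that, unlike to_numeric() with errors='coerce', astype() raises an error when a value cannot be converted.

```python
# python 3.x
import pandas as pd

# Hypothetical data: whole numbers in 'second', decimals in 'third'.
df = pd.DataFrame({
    'first': ['x', 'y', 'z'],
    'second': ['1', '20', '3'],
    'third': ['14.2', '0.11', '10'],
})

# Map each column to its own target dtype; unlisted columns are untouched.
converted = df.astype({'second': 'int64', 'third': 'float64'})
print(converted.dtypes)

# astype() does not coerce bad values; it raises instead.
conversion_failed = False
try:
    pd.Series(['1', 'asad']).astype(float)
except ValueError as exc:
    conversion_failed = True
    print('astype failed:', exc)
```

If the data may contain values that cannot be parsed, to_numeric() with errors='coerce' is the more forgiving choice.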

infer_objects() method to convert columns datatype to a more specific type

The infer_objects() method, introduced in Pandas version 0.21.0, converts object-dtype columns of a DataFrame to a more specific data type (a soft conversion).

Consider the following code:

# python 3.x
import pandas as pd
df = pd.DataFrame({
    'a': [3, 12, 5], 
    'b': [3.0,2.6,1.1]}, 
     dtype='object')
print(df.dtypes)
df = df.infer_objects()
print('Inferring..................')
print('............................')
print(df.dtypes)

Output:

a    object
b    object
dtype: object
Inferring..................
............................
............................
a      int64
b    float64
dtype: object
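Because the conversion is soft, infer_objects() only recognizes values that are already numbers; it does not parse strings. A minimal sketch with made-up data shows the difference, along with to_numeric() as the follow-up step for the string column.

```python
# python 3.x
import pandas as pd

# Column 'a' holds real integers, column 'b' holds numeric-looking strings.
df = pd.DataFrame({
    'a': [3, 12, 5],
    'b': ['3', '12', '5'],
}, dtype='object')

inferred = df.infer_objects()
print(inferred.dtypes)  # 'a' becomes an integer column, 'b' is not numeric

# Strings need an explicit conversion such as to_numeric().
inferred['b'] = pd.to_numeric(inferred['b'])
print(inferred.dtypes)
```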