How to Convert DataFrame Column to Datetime in Pandas

  1. Pandas to_datetime (pd.to_datetime()) function to convert DataFrame column to Pandas datetime
  2. DataFrame apply Method to Convert DataFrame Column to Datetime
  3. Methods to Convert DataFrame Column to Datetime Performance Comparison

We will introduce methods to convert Pandas DataFrame column to Python Pandas datetime.

  • Pandas to_datetime() function
  • DataFrame apply method to operate on elements in column

We use the same DataFrame below in this article.

>>> from datetime import datetime, timedelta
>>> from pandas import DataFrame
>>> df = DataFrame.from_dict(
    {'Alfa': [1, 2, 3], 
     'Bravo': [4, 5, 6], 
     'Datetime': [datetime.strftime(datetime.now()-timedelta(days=_), "%m/%d/%Y, %H:%M:%S") for _ in range(3)]}, 
    orient='index', 
    columns=['A', 'B', 'C']).T
>>> df
  Alfa Bravo              Datetime
A    1     4  12/07/2019, 14:08:55
B    2     5  12/06/2019, 14:08:55
C    3     6  12/05/2019, 14:08:55

Pandas to_datetime (pd.to_datetime()) function to convert DataFrame column to Pandas datetime

Pandas pd.to_datetime() function converts the given argument to datetime.

pandas.to_datetime(param, format="")

Pandas to_datetime format parameter specifies the pattern of the datetime string. It is the same with the format in stftime or strptime in Python datetime module.

pd.to_datetime() converts Pandas column to Pandas datetime Example

>>> df
  Alfa Bravo              Datetime
A    1     4  12/07/2019, 14:08:55
B    2     5  12/06/2019, 14:08:55
C    3     6  12/05/2019, 14:08:55
>>> df['Datetime'] = pd.to_datetime(df['Datetime'], format="%m/%d/%Y, %H:%M:%S")
>>> df
  Alfa Bravo            Datetime
A    1     4 2019-12-07 14:08:55
B    2     5 2019-12-06 14:08:55
C    3     6 2019-12-05 14:08:55

pd.to_datetime() function doesn’t modify the DataFrame data in-place; therefore we need to assign the returned Pandas Series to the specific DataFrame column.

Pandas to_datetime() (pd.to_datetime()) Function Is Smart to Convert to Datetime

pandas.to_datetime() function could do the conversion to datetime in a smart way without being given the datetime format string. It will find the string pattern automatically and smartly.

>>> df['Datetime'] = pd.to_datetime(df['Datetime'])
>>> df
  Alfa Bravo            Datetime
A    1     4 2019-12-07 14:08:55
B    2     5 2019-12-06 14:08:55
C    3     6 2019-12-05 14:08:55
Warning

Although pd.to_datetime could do its job without giving the format smartly, the conversion speed is much lower than that when the format is given.

We could set the option infer_datetime_format of to_datetime to be True to switch the conversion to a faster mode if the format of the datetime string could be inferred without giving the format string.

It could increase the parsing speed by 5~6 times.

Options When the Input Argument is Not A Valid DateTime String

to_datetime() has the errors parameter to specify the behavior if the given input is not a valid datetime string to be parsed.

errors Option Behaviour
raise An exception will be raised. Default option
coerce NaT is set
ignore invalid parsing returns the input

raise

It raises an exception when the errors option is raise, or it is omitted because raise is the default option.

import pandas as pd
from datetime import datetime, timedelta
from pandas import DataFrame
df = DataFrame.from_dict(
    {'Alfa': [1, 2, 3], 
     'Bravo': [4, 5, 6], 
     'Datetime': [datetime.strftime(datetime.now()-timedelta(days=_), "%m/%d/%Y, %H:%M:%S") for _ in range(3)]}, 
    orient='index', 
    columns=['A', 'B', 'C']).T

df['Datetime'][0] = '32/09/2020, 09:27:02'

df['Datetime'] = pd.to_datetime(df['Datetime'],
                                format="%m/%d/%Y, %H:%M:%S",
                                errors='raise')
print(df)
>>> 
....
....
  File "pandas\_libs\tslibs\strptime.pyx", line 157, in pandas._libs.tslibs.strptime.array_strptime
ValueError: time data '32/09/2020, 09:27:02' does not match format '%m/%d/%Y, %H:%M:%S' (match)
>>> 

coerce

import pandas as pd
from datetime import datetime, timedelta
from pandas import DataFrame
df = DataFrame.from_dict(
    {'Alfa': [1, 2, 3], 
     'Bravo': [4, 5, 6], 
     'Datetime': [datetime.strftime(datetime.now()-timedelta(days=_), "%m/%d/%Y, %H:%M:%S") for _ in range(3)]}, 
    orient='index', 
    columns=['A', 'B', 'C']).T

df['Datetime'][0] = '32/09/2020, 09:27:02'

df['Datetime'] = pd.to_datetime(df['Datetime'],
                                format="%m/%d/%Y, %H:%M:%S",
                                errors='coerce')
print(df)

Output:

  Alfa Bravo            Datetime
A    1     4                 NaT
B    2     5 2020-05-08 09:37:55
C    3     6 2020-05-07 09:37:55

The invalid item is set to be NaT and others are converted correctly.

ignore

If errors is set to be ignore, when any of the items in the column is not valid, then the input column will be returned, even other items are valid datetime string.

import pandas as pd
from datetime import datetime, timedelta
from pandas import DataFrame
df = DataFrame.from_dict(
    {'Alfa': [1, 2, 3], 
     'Bravo': [4, 5, 6], 
     'Datetime': [datetime.strftime(datetime.now()-timedelta(days=_), "%m/%d/%Y, %H:%M:%S") for _ in range(3)]}, 
    orient='index', 
    columns=['A', 'B', 'C']).T

df['Datetime'][0] = '32/09/2020, 09:27:02'

df['Datetime'] = pd.to_datetime(df['Datetime'],
                                format="%m/%d/%Y, %H:%M:%S",
                                errors='ignore')
print(df)

Output:

  Alfa Bravo              Datetime
A    1     4  32/09/2020, 09:27:02
B    2     5  05/08/2020, 09:41:28
C    3     6  05/07/2020, 09:41:28

As shown above, the whole Datetime column is not converted or is ignored.

DataFrame apply Method to Convert DataFrame Column to Datetime

apply(func, *args, **kwds)

apply method of DataFrame applies the function func to each column or row.

We could use lambda function in the place of func for simplicity.

>>> df['Datetime'] = df['Datetime'].apply(lambda _: datetime.strptime(_,"%m/%d/%Y, %H:%M:%S"))
           
>>> df
           
  Alfa Bravo            Datetime
A    1     4 2019-12-07 14:44:35
B    2     5 2019-12-06 14:44:35
C    3     6 2019-12-05 14:44:35

Methods to Convert DataFrame Column to Datetime Performance Comparison

Let’s use timeit to compare the timing performance of both methods introduced in this article.

In [1]: %timeit pd.to_datetime(df['Datetime'], format="%m/%d/%Y, %H:%M:%S")
452 µs ± 85.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [2]: %timeit df['Datetime'].apply(lambda _: datetime.strptime(_,"%m/%d/%Y, %H:%M:%S"))
377 µs ± 58.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

The apply method is even slightly better than Pandas native to_datetime method, with around 80% of the execution time of to_datetime function.

Related Article - Pandas DataFrame

  • How to Sort Pandas DataFrame by One Column's Values
  • How to Convert DataFrame Column to String in Pandas
  • Related Article - Pandas DataFrame Column

  • How to rename columns in Pandas DataFrame
  • How to set value for particular cell in pandas DataFrame using index
  • Related Article - Python Datetime

  • How to Delete Pandas DataFrame Column
  • How to count the NaN occurrences in a column in Pandas Dataframe
  • comments powered by Disqus