Convert Pandas Column to Datetime
-
Pandas
to_datetime
(pd.to_datetime()
) Function to Convert DataFrame Column to Pandas Datetime -
Use the
apply()
Method to Convert Pandas DataFrame Column to Datetime -
Use the
apply()
Method to Convert Pandas Multiple Columns to Datetime -
Use
Series.astype()
Method to Convert Pandas DataFrame Column to Datetime
We will introduce methods to convert a Pandas column to datetime
. We use the same DataFrame below in the following examples.
from datetime import datetime, timedelta
from pandas import DataFrame
df = DataFrame.from_dict(
{'Alfa': [1, 2, 3],
'Bravo': [4, 5, 6],
'Datetime': [datetime.strftime(datetime.now()-timedelta(days=_),
"%m/%d/%Y, %H:%M:%S") for _ in range(3)]},
orient='index',
columns=['A', 'B', 'C']).T
print(df)
Output:
Alfa Bravo Datetime
A 1 4 12/07/2019, 14:08:55
B 2 5 12/06/2019, 14:08:55
C 3 6 12/05/2019, 14:08:55
Pandas to_datetime
(pd.to_datetime()
) Function to Convert DataFrame Column to Pandas Datetime
Pandas pd.to_datetime()
function converts the given argument to datetime
.
pandas.to_datetime(param, format="")
The format
parameter in the Pandas to_datetime
function specifies the pattern of the datetime
string. It is the same with the format
in stftime
or strptime
in Python datetime
module.
Example of pd.to_datetime()
Converting Pandas Column to datetime
import pandas as pd
from datetime import datetime, timedelta
from pandas import DataFrame
df = DataFrame.from_dict(
{'Alfa': [1, 2, 3],
'Bravo': [4, 5, 6],
'Datetime': [datetime.strftime(datetime.now()-timedelta(days=_),
"%m/%d/%Y, %H:%M:%S") for _ in range(3)]},
orient='index',
columns=['A', 'B', 'C']).T
df['Datetime'] = pd.to_datetime(df['Datetime'], format="%m/%d/%Y, %H:%M:%S")
print(df)
Output:
pd.to_datetime()
function doesn’t modify the DataFrame
in-place; therefore we need to assign the returned Pandas Series
to the specific DataFrame
column.
Pandas to_datetime()
(pd.to_datetime()
) Function Is Smart to Convert to Datetime
pandas.to_datetime()
function could do the conversion to datetime
in a smart way without being given the datetime
format string. It will find the string pattern automatically and smartly.
>>> df['Datetime'] = pd.to_datetime(df['Datetime'])
>>> df
Alfa Bravo Datetime
A 1 4 2019-12-07 14:08:55
B 2 5 2019-12-06 14:08:55
C 3 6 2019-12-05 14:08:55
Although pd.to_datetime
could do its job without given the format
smartly, the conversion speed is much lower than when the format
is given.
We could set the option infer_datetime_format
of to_datetime
to be True
to switch the conversion to a faster mode if the format of the datetime
string could be inferred without giving the format
string.
It could increase the parsing speed by 5~6 times.
Options When the Input Argument Is Not a Valid DateTime
String
to_datetime()
has the errors
parameter to specify the behavior if the given input is not a valid datetime
string to be parsed.
errors Option |
Behaviour |
---|---|
raise |
An exception will be raised. Default option |
coerce |
NaT is set |
ignore |
invalid parsing returns the input |
raise
It raises an exception when the errors
option is raise
or is omitted because raise
is the default option.
import pandas as pd
from datetime import datetime, timedelta
from pandas import DataFrame
df = DataFrame.from_dict(
{'Alfa': [1, 2, 3],
'Bravo': [4, 5, 6],
'Datetime': [datetime.strftime(datetime.now()-timedelta(days=_),
"%m/%d/%Y, %H:%M:%S") for _ in range(3)]},
orient='index',
columns=['A', 'B', 'C']).T
df['Datetime'][0] = '32/09/2020, 09:27:02'
df['Datetime'] = pd.to_datetime(df['Datetime'],
format="%m/%d/%Y, %H:%M:%S",
errors='raise')
print(df)
>>>
....
....
File "pandas\_libs\tslibs\strptime.pyx", line 157, in pandas._libs.tslibs.strptime.array_strptime
ValueError: time data '32/09/2020, 09:27:02' does not match format '%m/%d/%Y, %H:%M:%S' (match)
>>>
coerce
import pandas as pd
from datetime import datetime, timedelta
from pandas import DataFrame
df = DataFrame.from_dict(
{'Alfa': [1, 2, 3],
'Bravo': [4, 5, 6],
'Datetime': [datetime.strftime(datetime.now()-timedelta(days=_),
"%m/%d/%Y, %H:%M:%S") for _ in range(3)]},
orient='index',
columns=['A', 'B', 'C']).T
df['Datetime'][0] = '32/09/2020, 09:27:02'
df['Datetime'] = pd.to_datetime(df['Datetime'],
format="%m/%d/%Y, %H:%M:%S",
errors='coerce')
print(df)
Output:
Alfa Bravo Datetime
A 1 4 NaT
B 2 5 2020-05-08 09:37:55
C 3 6 2020-05-07 09:37:55
The invalid item is set to be NaT
, and others are converted correctly.
ignore
If errors
is set to be ignore
, when any of the column items is not valid, then the input column will be returned, even other items are valid datetime
string.
import pandas as pd
from datetime import datetime, timedelta
from pandas import DataFrame
df = DataFrame.from_dict(
{'Alfa': [1, 2, 3],
'Bravo': [4, 5, 6],
'Datetime': [datetime.strftime(datetime.now()-timedelta(days=_),
"%m/%d/%Y, %H:%M:%S") for _ in range(3)]},
orient='index',
columns=['A', 'B', 'C']).T
df['Datetime'][0] = '32/09/2020, 09:27:02'
df['Datetime'] = pd.to_datetime(df['Datetime'],
format="%m/%d/%Y, %H:%M:%S",
errors='ignore')
print(df)
Output:
Alfa Bravo Datetime
A 1 4 32/09/2020, 09:27:02
B 2 5 05/08/2020, 09:41:28
C 3 6 05/07/2020, 09:41:28
As shown above, the whole Datetime
column is not converted or is ignored.
Use the apply()
Method to Convert Pandas DataFrame Column to Datetime
apply(func, *args, **kwds)
apply
method of Pandas Series
applies the function func
to each column or row.
We could use the lambda
function in the place of func
for simplicity.
>>> df['Datetime'] = df['Datetime'].apply(lambda _: datetime.strptime(_,"%m/%d/%Y, %H:%M:%S"))
>>> df
Alfa Bravo Datetime
A 1 4 2019-12-07 14:44:35
B 2 5 2019-12-06 14:44:35
C 3 6 2019-12-05 14:44:35
Use the apply()
Method to Convert Pandas Multiple Columns to Datetime
If we need to convert Pandas DataFrame multiple columns to datetiime
, we can still use the apply()
method as shown above.
Suppose we have two columns DatetimeA
and DatetimeB
that are datetime
strings.
The function passed to the apply()
method is the pd.to_datetime
function introduced in the first section.
Example code:
import pandas as pd
from datetime import datetime, timedelta
from pandas import DataFrame
df = DataFrame.from_dict(
{'Alfa': [1, 2, 3],
'Bravo': [4, 5, 6],
'DatetimeA': [datetime.strftime(datetime.now()-timedelta(days=_),
"%m/%d/%Y, %H:%M:%S") for _ in range(3)],
'DatetimeB': [datetime.strftime(datetime.now()-timedelta(days=_+3),
"%m/%d/%Y, %H:%M:%S") for _ in range(3)]},
orient='index',
columns=['A', 'B', 'C']).T
print(df)
Output:
Use Series.astype()
Method to Convert Pandas DataFrame Column to Datetime
astype()
method of the Pandas Series
converts the column to another data type. The data type of the datetime
in Pandas is datetime64[ns]
; therefore, datetime64[ns]
shall be given as the parameter in the astype()
method to convert the DataFrame column to datetime
.
import pandas as pd
from datetime import datetime, timedelta
from pandas import DataFrame
df = DataFrame.from_dict(
{'Alfa': [1, 2, 3],
'Bravo': [4, 5, 6],
'Datetime': [datetime.strftime(datetime.now()-timedelta(days=_),
"%m/%d/%Y, %H:%M:%S") for _ in range(3)]},
orient='index',
columns=['A', 'B', 'C']).T
df['Datetime'] = df['Datetime'].astype('datetime64[ns]')
print(df)
Output:
Alfa Bravo Datetime
A 1 4 2020-12-14 22:34:54
B 2 5 2020-12-13 22:34:54
C 3 6 2020-12-12 22:34:54