Skip to content

Work Around for Pandas Error “datetime64 type does not support sum operations”

This week, I needed to update an older python script and in attempting to run the script, I ran into the error shown below.

This works in PRD because it’s running on pandas version 1.1.5.

This is a known bug introduced after pandas 1.5.3, which came out in 2023. The current version is version 2.3.3, so this problem has been around for a while.

What Causes It

With this bug in pandas, you cannot use the group by function on a data frame that contains datetime columns. It doesn’t matter if you are not using the datetime column in the group by. The mere presence of a datetime field enrages pandas and causes it to error.

Workaround

You have a couple of options depending on what you need and don’t need in the data set.

  1. I was converting both the period_key and production_month fields to date time in lines 164 and 165. I don’t need to do that in order for the group by to work, so I can just move the data type conversion to after the group by or comment it out.
  2. If you don’t need the fields at all, dropping them (as shown in line 173) prior to the group by will also work.
  3. You can convert the datetime to a string using, and depending on future transformations, you might need to change it back after the group by. The second dataset I worked with, I went with this method using a loop because my dataset had 95 column.
  4. Or, if you need to use the datetime in the group by, you can try the following….(annoying but it will work).
    • Break the date time fields into year, month, day
    • Drop the original fields
    • Use the components in the group by
    • Recreate the original fields and add back to the data frame

#break into pieces
accruals[‘year’] = pd.DatetimeIndex(accruals[‘period_key’]).year
accruals[‘month’] = pd.DatetimeIndex(accruals[‘period_key’]).month
accruals[‘day’] = pd.DatetimeIndex(accruals[‘period_key’]).day

#drop original
accruals = accruals.drop(columns=[‘account’, ‘period_key’])

#use pieces in group by
accruals = accruals.groupby([‘budget_node’, ‘year’, ‘month’, ‘day’, ‘category’]).sum().reset_index().copy(deep=True)

#recreate the period_key field
accruals[‘period_key’] = accruals.apply(lambda row: f”{row[‘year’]:04d}-{row[‘month’]:02d}-{row[‘day’]:02d}”, axis=1)

Hopefully, one of these works for you.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.