Extracting just Month and Year from Pandas Datetime column

I have a Dataframe, df, with the following column:

    df['ArrivalDate'] =
    936   2012-12-31
    938   2012-12-29
    965   2012-12-31
    966   2012-12-31
    967   2012-12-31
    968   2012-12-31
    969   2012-12-31
    970   2012-12-29
    971   2012-12-31
    972   2012-12-29
    973   2012-12-29

The elements of the column are pandas.tslib.Timestamp.

I want to keep just the year and month. I thought there would be a simple way to do it, but I can't figure it out.

Here's what I've tried:

    df['ArrivalDate'].resample('M', how = 'mean')

I got the following error:

    Only valid with DatetimeIndex or PeriodIndex

Then I tried:


I got the following error:

    'Timestamp' object has no attribute '__getitem__'

Any suggestions?

Edit: I sort of figured it out.

    df.index = df['ArrivalDate']

Then, I can resample another column using the index.

But I'd still like a method for reconfiguring the entire column. Any ideas?

You can directly access the year and month attributes, or request a datetime.datetime:

    In [15]: t = pandas.tslib.Timestamp.now()

    In [16]: t
    Out[16]: Timestamp('2014-08-05 14:49:39.643701', tz=None)

    In [17]: t.to_datetime()
    Out[17]: datetime.datetime(2014, 8, 5, 14, 49, 39, 643701)

    In [18]: t.day
    Out[18]: 5

    In [19]: t.month
    Out[19]: 8

    In [20]: t.year
    Out[20]: 2014

One way to combine year and month is to encode them as a single integer, such as 201408 for August 2014. Across a whole column, you could do this as:

    df['YearMonth'] = df['ArrivalDate'].map(lambda x: 100*x.year + x.month)

or many variants thereof.
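One such variant, sketched here on the assumption that the column has dtype datetime64[ns], uses the `.dt` accessor so the arithmetic runs on the whole column instead of through a per-element lambda. The sample frame and the `Year` and `Month` column names are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data mirroring the ArrivalDate column in the question.
df = pd.DataFrame({
    'ArrivalDate': pd.to_datetime(['2012-12-31', '2012-12-29', '2013-01-02'])
})

# The .dt accessor exposes year and month for the whole column at once.
df['Year'] = df['ArrivalDate'].dt.year
df['Month'] = df['ArrivalDate'].dt.month

# Same integer encoding as the lambda version: 100 * year + month.
df['YearMonth'] = df['Year'] * 100 + df['Month']
```

This should give the same values as the `map` version, and the separate `Year` and `Month` columns are handy if you want to group on them.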

I'm not a big fan of doing this, though, since it makes date alignment and arithmetic painful later, especially for others who come upon your code or data without knowing the convention. A better way is to choose a day-of-month convention, such as the final non-US-holiday weekday or the first day of the month, and leave the data in a date/time format with that convention.
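As a rough sketch of the "first day of the month" convention, assuming a datetime64[ns] column, you can go through a monthly period and back to a timestamp. The sample data and the `ArrivalMonth` and `MonthStart` column names are invented for this example:

```python
import pandas as pd

# Hypothetical sample data mirroring the ArrivalDate column in the question.
df = pd.DataFrame({
    'ArrivalDate': pd.to_datetime(['2012-12-31', '2012-12-29', '2013-01-02'])
})

# A monthly Period keeps only year and month but still sorts and
# compares like a date.
df['ArrivalMonth'] = df['ArrivalDate'].dt.to_period('M')

# Convert back to datetime64 using the first day of each month
# (to_timestamp defaults to the start of the period).
df['MonthStart'] = df['ArrivalMonth'].dt.to_timestamp()
```

Either column can then be used for grouping, and `MonthStart` stays a regular datetime column for anything that expects one.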

The calendar module is useful for obtaining the day number of certain days, such as the final weekday of a month. Then you could do something like:

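    import calendar
    import datetime
    # The last row of monthcalendar is the final week (Mon..Sun, 0 = not in month);
    # the max of its first five entries is the last weekday of the month.
    df['AdjustedDateToEndOfMonth'] = df['ArrivalDate'].map(
        lambda x: datetime.datetime(
            x.year,
            x.month,
            max(calendar.monthcalendar(x.year, x.month)[-1][:5])
        )
    )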
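To check which day that calendar expression picks, here is a minimal standalone sketch of the same trick. The helper name `last_weekday_of_month` is hypothetical, it ignores holidays, and it assumes the calendar module's default Monday-first week:

```python
import calendar
import datetime

def last_weekday_of_month(year, month):
    """Return the last Monday-to-Friday date of the month (holidays ignored)."""
    # monthcalendar returns one list per week, [Mon, ..., Sun], with 0 for
    # days that fall outside the month. The last list is the final week.
    last_week = calendar.monthcalendar(year, month)[-1]
    # With Monday-first weeks, the final week always contains a Monday that
    # belongs to the month, so the max over Mon..Fri is a real day number.
    return datetime.date(year, month, max(last_week[:5]))

print(last_weekday_of_month(2012, 12))  # 2012-12-31, a Monday
print(last_weekday_of_month(2013, 6))   # 2013-06-28, a Friday (the 30th is a Sunday)
```

If you need real holiday handling, you would have to layer a holiday calendar on top of this; the sketch only skips weekends.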

If you are only looking to solve the simpler problem of formatting the datetime column as strings, you can use the strftime method of datetime.datetime, like this:

    In [5]: df
    0 2014-10-17 22:00:03

    In [6]: df.date_time
    0   2014-10-17 22:00:03
    Name: date_time, dtype: datetime64[ns]

    In [7]: df.date_time.map(lambda x: x.strftime('%Y-%m-%d'))
    0    2014-10-17
    Name: date_time, dtype: object
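For the year-and-month case from the question, a sketch along the same lines would use the format `'%Y-%m'`. This version assumes a datetime64[ns] column and uses the `.dt.strftime` accessor instead of a lambda; the one-row frame and the `year_month` column name are made up for illustration:

```python
import pandas as pd

# Hypothetical one-row frame matching the date_time column shown above.
df = pd.DataFrame({'date_time': pd.to_datetime(['2014-10-17 22:00:03'])})

# Format the whole column as 'YYYY-MM' strings in one call.
df['year_month'] = df['date_time'].dt.strftime('%Y-%m')
```

Keep in mind the result is a column of plain strings, so it sorts correctly as text but no longer supports date arithmetic.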

From: stackoverflow.com/q/25146121