Pandas sum by groupby, but exclude certain columns

What is the best way to do a groupby on a Pandas dataframe, but exclude some columns from the aggregation? E.g. I have the following dataframe:

    Code    Country Item_Code   Item    Ele_Code    Unit    Y1961   Y1962   Y1963
    2   Afghanistan 15          Wheat   5312        Ha      10       20      30
    2   Afghanistan 25          Maize   5312        Ha      10       20      30
    4   Angola      15          Wheat   7312        Ha      30       40      50
    4   Angola      25          Maize   7312        Ha      30       40      50

I want to group by the columns Country and Item_Code and compute the sum only over the columns Y1961, Y1962 and Y1963. The resulting dataframe should look like this:

    Code    Country Item_Code   Item    Ele_Code    Unit    Y1961   Y1962   Y1963
        2   Afghanistan 15      C3      5312        Ha      20       40      60
        4   Angola      25      C4      7312        Ha      60       80      100

Right now, I am doing this:

    df.groupby('Country').sum()

However, this adds up the values in the Item_Code column as well. Is there any way I can specify which columns to include in the sum() operation and which ones to exclude?

You can select the columns to aggregate on the groupby object before calling sum():

    In [11]: df.groupby(['Country', 'Item_Code'])[["Y1961", "Y1962", "Y1963"]].sum()
    Out[11]:
                           Y1961  Y1962  Y1963
    Country     Item_Code
    Afghanistan 15            10     20     30
                25            10     20     30
    Angola      15            30     40     50
                25            30     40     50
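For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same column selection. The variable names (`year_cols`, `by_country_item`, `by_country`) are my own, and the second call, which groups by Country alone with `as_index=False`, is an assumption about what the question's expected output is after (its totals are per-country sums).

```python
import pandas as pd

# Sample data transcribed from the question.
df = pd.DataFrame({
    "Code": [2, 2, 4, 4],
    "Country": ["Afghanistan", "Afghanistan", "Angola", "Angola"],
    "Item_Code": [15, 25, 15, 25],
    "Item": ["Wheat", "Maize", "Wheat", "Maize"],
    "Ele_Code": [5312, 5312, 7312, 7312],
    "Unit": ["Ha", "Ha", "Ha", "Ha"],
    "Y1961": [10, 10, 30, 30],
    "Y1962": [20, 20, 40, 40],
    "Y1963": [30, 30, 50, 50],
})

# Only these columns take part in the sum.
year_cols = ["Y1961", "Y1962", "Y1963"]

# Same call as in the answer: one row per (Country, Item_Code) pair.
by_country_item = df.groupby(["Country", "Item_Code"])[year_cols].sum()

# Assumed variant: group by Country only and keep it as a regular column.
by_country = df.groupby("Country", as_index=False)[year_cols].sum()

print(by_country_item)
print(by_country)
```

Selecting the columns before calling `sum()` means Item_Code and the other non-year columns are never aggregated, so nothing needs to be dropped afterwards.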

Note that the list passed must be a subset of the dataframe's columns; otherwise you'll see a KeyError.
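To illustrate the KeyError, here is a small sketch on a cut-down frame where Y1963 is deliberately missing. Filtering the requested names against `df.columns` first is just one possible guard, not something from the question; the names `wanted`, `present` and `totals` are hypothetical.

```python
import pandas as pd

# Cut-down sample: note there is no Y1963 column here.
df = pd.DataFrame({
    "Country": ["Afghanistan", "Afghanistan", "Angola"],
    "Item_Code": [15, 25, 15],
    "Y1961": [10, 10, 30],
    "Y1962": [20, 20, 40],
})

wanted = ["Y1961", "Y1962", "Y1963"]  # Y1963 is absent from df
grouped = df.groupby("Country")

# Selecting a column that does not exist raises KeyError.
try:
    grouped[wanted].sum()
    raised = False
except KeyError as err:
    raised = True
    print("KeyError:", err)

# One possible guard (an assumption, not part of the original answer):
# keep only the requested columns that are actually present.
present = [col for col in wanted if col in df.columns]
totals = grouped[present].sum()
print(totals)
```

Whether silently skipping a missing column is acceptable depends on your data; letting the KeyError surface is often the safer default.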

From: stackoverflow.com/q/32751229
