Pandas percentage of total with groupby

This is obviously simple, but as a numpy newbie I'm getting stuck.

I have a CSV file that contains 3 columns, the State, the Office ID, and the Sales for that office.

I want to calculate the percentage of sales per office in a given state (total of all percentages in each state is 100%).

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'state': ['CA', 'WA', 'CO', 'AZ'] * 3,
                       'office_id': list(range(1, 7)) * 2,
                       'sales': [np.random.randint(100000, 999999)
                                 for _ in range(12)]})

    df.groupby(['state', 'office_id']).agg({'sales': 'sum'})

This returns:

                      sales
    state office_id        
    AZ    2          839507
          4          373917
          6          347225
    CA    1          798585
          3          890850
          5          454423
    CO    1          819975
          3          202969
          5          614011
    WA    2          163942
          4          369858
          6          959285

I can't seem to figure out how to "reach up" to the state level of the groupby to total up the sales for the entire state to calculate the fraction.

Paul H's answer is right that you will have to make a second groupby object, but you can calculate the percentage in a simpler way: group state_office by its first index level (the state) and divide the sales column by its sum within each group. Copying the beginning of Paul H's answer:

    # From Paul H
    import numpy as np
    import pandas as pd
    np.random.seed(0)
    df = pd.DataFrame({'state': ['CA', 'WA', 'CO', 'AZ'] * 3,
                       'office_id': list(range(1, 7)) * 2,
                       'sales': [np.random.randint(100000, 999999)
                                 for _ in range(12)]})
    state_office = df.groupby(['state', 'office_id']).agg({'sales': 'sum'})
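    # Change: group state_office by state (index level 0) and divide by the state total.
    # group_keys=False keeps the original (state, office_id) index on newer pandas.
    state_pcts = state_office.groupby(level=0, group_keys=False).apply(
        lambda x: 100 * x / x.sum())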

Returns:

                         sales
    state office_id           
    AZ    2          16.981365
          4          19.250033
          6          63.768601
    CA    1          19.331879
          3          33.858747
          5          46.809373
    CO    1          36.851857
          3          19.874290
          5          43.273852
    WA    2          34.707233
          4          35.511259
          6          29.781508
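
As an alternative to apply, the same "divide by the state total" idea can be sketched with transform, which broadcasts each state's sum back onto every office row. This is only a sketch under the same setup as above; the names office_sales, state_totals, pct_of_state and check are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({'state': ['CA', 'WA', 'CO', 'AZ'] * 3,
                   'office_id': list(range(1, 7)) * 2,
                   'sales': [np.random.randint(100000, 999999)
                             for _ in range(12)]})

# Sum sales per (state, office_id); the result is a Series with a MultiIndex.
office_sales = df.groupby(['state', 'office_id'])['sales'].sum()

# transform('sum') returns a Series aligned with office_sales, where every
# row holds the total for that row's state.
state_totals = office_sales.groupby(level='state').transform('sum')

# Element-wise division gives each office's share of its state's sales.
state_pcts = (100 * office_sales / state_totals).rename('pct_of_state')

# Sanity check: the percentages within each state should add up to 100.
check = state_pcts.groupby(level='state').sum()
print(state_pcts)
print(check)
```

Because transform keeps the original index, no lambda is needed and the result lines up row for row with the grouped sales.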

From: stackoverflow.com/q/23377108
