Get list from pandas DataFrame column

I have an Excel document that looks like this:

    cluster load_date   budget  actual  fixed_price
    A   1/1/2014    1000    4000    Y
    A   2/1/2014    12000   10000   Y
    A   3/1/2014    36000   2000    Y
    B   4/1/2014    15000   10000   N
    B   4/1/2014    12000   11500   N
    B   4/1/2014    90000   11000   N
    C   7/1/2014    22000   18000   N
    C   8/1/2014    30000   28960   N
    C   9/1/2014    53000   51200   N

I want to return the contents of the first column, cluster, as a list, so I can run a for loop over it and create an Excel worksheet for every cluster.

Is it also possible to return the contents of a whole row as a list? E.g.

    list = [], list[column1] or list[df.ix(row1)]

Pandas DataFrame columns are pandas Series when you pull them out, and you can call x.tolist() on a Series to turn it into a Python list. Alternatively, you can cast it with list(x).

    import pandas as pd

    d = {'one' : pd.Series([1., 2., 3.],     index=['a', 'b', 'c']),
        'two' : pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}

    df = pd.DataFrame(d)

    print("Starting with this dataframe\n", df)

    print("The first column is a", type(df['one']), "\nconsisting of\n", df['one'])

    dfToList = df['one'].tolist()

    dfList = list(df['one'])

    dfValues = df['one'].values

    print("dfToList is", dfToList, "and it's a", type(dfToList))
    print("dfList is  ", dfList,   "and it's a", type(dfList))
    print("dfValues is", dfValues, "and it's a", type(dfValues))

The last lines return:

    dfToList is [1.0, 2.0, 3.0, nan] and it's a <class 'list'>
    dfList is   [1.0, 2.0, 3.0, nan] and it's a <class 'list'>
    dfValues is [ 1.  2.  3. nan] and it's a <class 'numpy.ndarray'>

This question might be helpful. And the Pandas docs are actually quite good once you get your head around their style.

So in your case you could:

    my_list = df["cluster"].tolist()

and then go from there.
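
For the per-cluster loop and the "whole row as a list" part of the question, here is a minimal sketch. It assumes the spreadsheet has already been loaded into a DataFrame; the data below is a hand-typed subset of the sample in the question, and the variable names are only illustrative. Note that `.ix` has been removed from recent pandas releases, so the row is selected by position with `.iloc` instead.

```python
import pandas as pd

# A few rows typed in from the question's sample (assumption: the real
# data would come from the Excel file instead).
df = pd.DataFrame({
    "cluster": ["A", "A", "B", "B", "C"],
    "load_date": ["1/1/2014", "2/1/2014", "4/1/2014", "4/1/2014", "7/1/2014"],
    "budget": [1000, 12000, 15000, 12000, 22000],
    "actual": [4000, 10000, 10000, 11500, 18000],
    "fixed_price": ["Y", "Y", "N", "N", "N"],
})

# Every value in the column, duplicates included.
all_clusters = df["cluster"].tolist()

# One entry per distinct cluster, in order of first appearance.
unique_clusters = df["cluster"].unique().tolist()

# A whole row as a list, selected by position.
first_row = df.iloc[0].tolist()

# Loop over the distinct clusters and pick out the matching rows.
for name in unique_clusters:
    subset = df[df["cluster"] == name]
    print(name, "has", len(subset), "rows")
```

Looping over the unique values rather than the raw column list avoids visiting the same cluster several times. Inside the loop, each `subset` could then be written to its own worksheet.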

