Shuffling/permuting a DataFrame in pandas

What's a simple and efficient way to shuffle a dataframe in pandas, by rows or by columns? I.e. how to write a function shuffle(df, n, axis=0) that takes a dataframe, a number of shuffles n, and an axis (axis=0 is rows, axis=1 is columns) and returns a copy of the dataframe that has been shuffled n times.

Edit: the key is to do this without destroying the row/column labels of the dataframe. If you just shuffle df.index, that loses all that information. I want the resulting df to be the same as the original except with the order of rows or the order of columns different.

Edit 2: My question was unclear. When I say shuffle the rows, I mean shuffle the row order of each column independently. So if you have two columns a and b, I want each column shuffled on its own, so that you don't keep the same associations between a and b as you would if you just re-ordered whole rows. Something like:

    for 1...n:
      for each col in df: shuffle column
    return new_df

But hopefully more efficient than naive looping. This does not work for me:

    def shuffle(df, n, axis=0):
        shuffled_df = df.copy()
        for k in range(n):
            ...
        return shuffled_df

    df = pandas.DataFrame({'A':range(10), 'B':range(10)})
    shuffle(df, 5)

Answer: apply np.random.shuffle along the chosen axis of a copy:

    In [16]: def shuffle(df, n=1, axis=0):
        ...:     df = df.copy()
        ...:     for _ in range(n):
        ...:         df.apply(np.random.shuffle, axis=axis)
        ...:     return df

    In [17]: df = pd.DataFrame({'A':range(10), 'B':range(10)})

    In [18]: shuffle(df)
    Out[18]:
       A  B
    0  8  5
    1  1  7
    2  7  3
    3  6  2
    4  3  4
    5  0  1
    6  9  0
    7  4  6
    8  2  8
    9  5  9
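
The approach above depends on np.random.shuffle modifying, in place, the object that apply passes to it, which may not hold on newer pandas versions. As a hedged alternative, here is a minimal sketch that shuffles the underlying values with NumPy's Generator.permuted (NumPy 1.20 or later) and rebuilds the frame with the original labels. The seed parameter is an assumption added for reproducibility, not part of the question's signature. The sketch also assumes the columns share a dtype; with mixed dtypes the result falls back to object columns.

```python
import numpy as np
import pandas as pd


def shuffle(df, n=1, axis=0, seed=None):
    """Return a copy of df with values permuted independently along `axis`.

    axis=0 shuffles the values within each column; axis=1 shuffles the
    values within each row. Index and column labels are left untouched.
    """
    # `seed` is a hypothetical extra parameter, added only for reproducibility.
    rng = np.random.default_rng(seed)
    values = df.to_numpy(copy=True)
    for _ in range(n):
        # permuted() shuffles every 1-D slice along `axis` independently,
        # so associations between columns (or rows) are broken.
        values = rng.permuted(values, axis=axis)
    return pd.DataFrame(values, index=df.index, columns=df.columns)


df = pd.DataFrame({'A': range(10), 'B': range(10)})
shuffled = shuffle(df, n=5, seed=0)  # df itself is not modified
```

A single pass already produces a uniformly random permutation, so n > 1 is kept only to match the signature asked for. If the goal is instead to reorder whole rows or columns while keeping their labels attached, df.sample(frac=1, axis=axis) does that directly.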

