Drop all duplicate rows in Python Pandas

The pandas drop_duplicates function is great for "uniquifying" a dataframe. However, it only accepts take_last=True or take_last=False, which keeps either the first or the last occurrence of each duplicate. Instead, I would like to drop all rows which are duplicates across a subset of columns. Is this possible?

        A   B   C
    0   foo 0   A
    1   foo 1   A
    2   foo 1   B
    3   bar 1   A

As an example, I would like to drop rows which match on columns A and C, so this should drop rows 0 and 1.

This is much easier in pandas now with drop_duplicates and the keep parameter.

    import pandas as pd
    df = pd.DataFrame({"A":["foo", "foo", "foo", "bar"], "B":[0,1,1,1], "C":["A","A","B","A"]})
    df.drop_duplicates(subset=['A', 'C'], keep=False)
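For contrast, here is a short sketch of how the three `keep` options behave on the example dataframe (`keep='first'` is the default):

```python
import pandas as pd

df = pd.DataFrame({"A": ["foo", "foo", "foo", "bar"],
                   "B": [0, 1, 1, 1],
                   "C": ["A", "A", "B", "A"]})

# Rows 0 and 1 are duplicates on (A, C): both are ("foo", "A").

# keep='first' (default): keep the first row of each duplicate group
first = df.drop_duplicates(subset=["A", "C"])               # rows 0, 2, 3

# keep='last': keep the last row of each duplicate group
last = df.drop_duplicates(subset=["A", "C"], keep="last")   # rows 1, 2, 3

# keep=False: drop every row that has a duplicate on (A, C)
none = df.drop_duplicates(subset=["A", "C"], keep=False)    # rows 2, 3
```

Only `keep=False` removes both members of the duplicate pair, which is what the question asks for.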

From: stackoverflow.com/q/23667369
