Pandas DataFrame: replace all values in a column, based on condition
I have a simple DataFrame like the following:
I want to select all values from the 'First Season' column and replace those that are over 1990 by 1. In this example, only Baltimore Ravens would have the 1996 replaced by 1 (keeping the rest of the data intact).
I have used the following:
df.loc[(df['First Season'] > 1990)] = 1
But, it replaces all the values in that row by 1, and not just the values in the 'First Season' column.
How can I replace just the values from that column?
You need to select that column:
In : df.loc[df['First Season'] > 1990, 'First Season'] = 1 df Out: Team First Season Total Games 0 Dallas Cowboys 1960 894 1 Chicago Bears 1920 1357 2 Green Bay Packers 1921 1339 3 Miami Dolphins 1966 792 4 Baltimore Ravens 1 326 5 San Franciso 49ers 1950 1003
So the syntax here is:
df.loc[<mask>(here mask is generating the labels to index) , <optional column(s)> ]
If you want to generate a boolean indicator then you can just use the boolean condition to generate a boolean Series and cast the dtype to
int this will convert
In : df['First Season'] = (df['First Season'] > 1990).astype(int) df Out: Team First Season Total Games 0 Dallas Cowboys 0 894 1 Chicago Bears 0 1357 2 Green Bay Packers 0 1339 3 Miami Dolphins 0 792 4 Baltimore Ravens 1 326 5 San Franciso 49ers 0 1003
★ Back to homepage or read more recommendations: