Pandas conditional creation of a series/dataframe column

I have a dataframe along the lines of the below:

        Type       Set
    1    A          Z
    2    B          Z
    3    B          X
    4    C          Y

I want to add another column to the dataframe (or generate a series) of the same length as the dataframe (an equal number of records/rows) that sets the colour to 'green' if Set == 'Z' and to 'red' otherwise.

What's the best way to do this?

If you only have two choices to select from, use np.where:

    df['color'] = np.where(df['Set']=='Z', 'green', 'red')

For example,

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'Type':list('ABBC'), 'Set':list('ZZXY')})
    # 'green' where the condition is True, 'red' everywhere else
    df['color'] = np.where(df['Set']=='Z', 'green', 'red')
    print(df)

yields

      Set Type  color
    0   Z    A  green
    1   Z    B  green
    2   X    B    red
    3   Y    C    red
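
The question also asks about generating a standalone series, so here is a minimal sketch of that variant. np.where returns a plain NumPy array rather than a Series, so wrapping it in pd.Series with the dataframe's index keeps the labels aligned with the rows. The variable name `color` is only an illustrative choice.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Type': list('ABBC'), 'Set': list('ZZXY')})

# np.where returns a NumPy array, not a Series, so wrap it explicitly
# and reuse df's index so the result lines up with the dataframe rows.
color = pd.Series(np.where(df['Set'] == 'Z', 'green', 'red'),
                  index=df.index, name='color')
print(color)
```

The series can still be attached later with df['color'] = color, since it shares the dataframe's index.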

If you have more than two conditions, then use np.select. For example, if you want color to be

  • yellow when (df['Set'] == 'Z') & (df['Type'] == 'A')
  • otherwise blue when (df['Set'] == 'Z') & (df['Type'] == 'B')
  • otherwise purple when (df['Type'] == 'B')
  • otherwise black,

then use

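    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'Type':list('ABBC'), 'Set':list('ZZXY')})
    # Conditions are checked in order; the first True one picks the choice.
    conditions = [
        (df['Set'] == 'Z') & (df['Type'] == 'A'),
        (df['Set'] == 'Z') & (df['Type'] == 'B'),
        (df['Type'] == 'B')]
    choices = ['yellow', 'blue', 'purple']
    # Rows matching none of the conditions get the default.
    df['color'] = np.select(conditions, choices, default='black')
    print(df)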

which yields

      Set Type   color
    0   Z    A  yellow
    1   Z    B    blue
    2   X    B  purple
    3   Y    C   black
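
The "otherwise" chain works because np.select takes, for each row, the choice belonging to the first condition that is True. Here is a small sketch of why the order matters. Row 1 (Set 'Z', Type 'B') satisfies both the 'blue' and the 'purple' conditions, so moving the broader Type == 'B' test to the front changes its result. The names `is_z`, `is_a`, `is_b`, `original` and `reordered` are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Type': list('ABBC'), 'Set': list('ZZXY')})

is_z = df['Set'] == 'Z'
is_a = df['Type'] == 'A'
is_b = df['Type'] == 'B'

# Original order: the specific (Z and B) test comes before the broad B test.
original = np.select([is_z & is_a, is_z & is_b, is_b],
                     ['yellow', 'blue', 'purple'], default='black')

# Reordered: the broad B test comes first and shadows the (Z and B) test,
# so 'blue' can never be selected.
reordered = np.select([is_b, is_z & is_a, is_z & is_b],
                      ['purple', 'yellow', 'blue'], default='black')

print(list(original))
print(list(reordered))
```

Listing the most specific conditions first, as in the answer, is what keeps the broader ones from swallowing them.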

From: stackoverflow.com/q/19913659
