Compare two columns using pandas

Using this as a starting point:

    a = [['10', '1.2', '4.2'], ['15', '70', '0.03'], ['8', '5', '0']]
    df = pd.DataFrame(a, columns=['one', 'two', 'three'])

    Out[8]: 
      one  two three
    0   10  1.2   4.2
    1   15  70   0.03
    2    8   5     0

I want to use something like an if statement within pandas.

    if df['one'] >= df['two'] and df['one'] <= df['three']:
        df['que'] = df['one']

Basically, check each row via the if statement, create new column.

The docs say to use .all but there is no example...

You could use np.where. If cond is a boolean array, and A and B are arrays, then

    C = np.where(cond, A, B)

defines C to be equal to A where cond is True, and B where cond is False.

    import numpy as np
    import pandas as pd

    a = [['10', '1.2', '4.2'], ['15', '70', '0.03'], ['8', '5', '0']]
    df = pd.DataFrame(a, columns=['one', 'two', 'three'])

    df['que'] = np.where((df['one'] >= df['two']) & (df['one'] <= df['three'])
                         , df['one'], np.nan)

yields

      one  two three  que
    0  10  1.2   4.2   10
    1  15   70  0.03  NaN
    2   8    5     0  NaN

If you have more than one condition, then you could use np.select instead. For example, if you wish df['que'] to equal df['two'] when df['one'] < df['two'], then

    conditions = [
        (df['one'] >= df['two']) & (df['one'] <= df['three']), 
        df['one'] < df['two']]

    choices = [df['one'], df['two']]

    df['que'] = np.select(conditions, choices, default=np.nan)

yields

      one  two three  que
    0  10  1.2   4.2   10
    1  15   70  0.03   70
    2   8    5     0  NaN

If we can assume that df['one'] >= df['two'] when df['one'] < df['two'] is False, then the conditions and choices could be simplified to

    conditions = [
        df['one'] < df['two'],
        df['one'] <= df['three']]

    choices = [df['two'], df['one']]

(The assumption may not be true if df['one'] or df['two'] contain NaNs.)

Note that

    a = [['10', '1.2', '4.2'], ['15', '70', '0.03'], ['8', '5', '0']]
    df = pd.DataFrame(a, columns=['one', 'two', 'three'])

defines a DataFrame with string values. Since they look numeric, you might be better off converting those strings to floats:

    df2 = df.astype(float)

This changes the results, however, since strings compare character-by-character, while floats are compared numerically.

    In [61]: '10' <= '4.2'
    Out[61]: True

    In [62]: 10 <= 4.2
    Out[62]: False

From: stackoverflow.com/q/27474921