Boolean Indexing with multiple conditions
I have a pandas DataFrame (DF) where I need to filter out some rows that contain values == 0 for feature 'a' and feature 'b'.
In order to inspect the values, I run the following:
DF1 = DF[DF['a'] == 0]
Which returns the right values. Similarly, by doing this:
DF2 = DF[DF['b'] == 0]
I can see the 0 values for feature 'b'.
However, if I try to combine these two in a single line of code using the OR operator:
DF3 = DF[DF['a'] == 0 | DF['b'] == 0]
I get this:
TypeError: cannot compare a dtyped [float64] array with a scalar of type [bool]
What's happening here?
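The problem is operator precedence. In Python, | binds more tightly than ==, so DF['a'] == 0 | DF['b'] == 0 is parsed as DF['a'] == (0 | DF['b']) == 0. pandas then tries to evaluate 0 | DF['b'] on a float64 column, which is what raises the TypeError. Rather than converting the columns to another dtype, an easier solution that preserves the data type of your features is to wrap each comparison in parentheses: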
DF3 = DF[(DF['a'] == 0) | (DF['b'] == 0)]
More generally, a common operation is the use of boolean vectors to filter the data. The operators are: | for or, & for and, and ~ for not. Each condition must be grouped in parentheses, because these operators bind more tightly than comparison operators such as ==.
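To see all three operators side by side, here is a minimal sketch. The sample values are made up for illustration (only the column names 'a' and 'b' come from the question), and the `.eq()` variant at the end is an alternative I am adding, not something the question used.

```python
import pandas as pd

# Hypothetical sample data; only the column names 'a' and 'b' come from the question.
df = pd.DataFrame({"a": [0.0, 1.5, 2.0, 0.0], "b": [3.0, 0.0, 4.0, 0.0]})

# Without parentheses, Python evaluates `0 | df["b"]` first, which fails on floats.
try:
    df[df["a"] == 0 | df["b"] == 0]
    unparenthesized_failed = False
except (TypeError, ValueError):
    unparenthesized_failed = True

# OR: rows where either column is 0.
either_zero = df[(df["a"] == 0) | (df["b"] == 0)]

# AND: rows where both columns are 0.
both_zero = df[(df["a"] == 0) & (df["b"] == 0)]

# NOT: rows where neither column is 0.
neither_zero = df[~((df["a"] == 0) | (df["b"] == 0))]

# Alternative: Series.eq() is a method call, so no extra parentheses are needed.
either_zero_eq = df[df["a"].eq(0) | df["b"].eq(0)]

print(either_zero)
print(both_zero)
print(neither_zero)
```

If you prefer fewer parentheses, the `.eq()` form reads a bit cleaner, since a method call is resolved before | is applied.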