Drop rows containing empty cells from a pandas DataFrame

I have a pd.DataFrame that was created by parsing some Excel spreadsheets. One of its columns has empty cells. For example, below is the frequency output for that column; 32320 records have missing values for Tenant.

       In [67]: value_counts(Tenant,normalize=False)
       Out[67]:
                                  32320
       Thunderhead                8170
       Big Data Others            5700
       Cloud Cruiser              5700
       Partnerpedia               5700
       Comcast                    5700
       SDP                        5700
       Agora                      5700
       dtype: int64

I am trying to drop rows where Tenant is missing, but isnull() does not recognize the missing values.

       In [71]: df['Tenant'].isnull().sum()
       Out[71]: 0

The column has data type "object". What is happening in this case? How can I drop records where Tenant is missing?

Pandas will recognise a value as null if it is an np.nan object, which will print as NaN in the DataFrame. Your missing values are probably empty strings, which Pandas doesn't recognise as null. To fix this, you can convert the empty strings (or whatever is in your empty cells) to np.nan objects using replace(), and then call dropna() on your DataFrame to delete rows with null tenants.
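As a quick check of this behaviour, the sketch below builds a small made-up Series (the values are illustrative and not taken from the question's data) and compares isnull() with an explicit test for empty strings:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: one empty string and one genuinely missing value.
tenant = pd.Series(['Comcast', '', np.nan, 'Agora'], dtype=object)

# isnull() flags only the np.nan entry; the empty string counts as data.
n_null = int(tenant.isnull().sum())

# Comparing against '' finds the empty string instead.
n_empty = int((tenant == '').sum())

print(n_null, n_empty)
```

If the count of empty strings in your own column matches the blank entry in the value_counts() output, empty strings are the likely cause.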

To demonstrate, we create a DataFrame with some random values and some empty strings in a Tenant column:

    >>> import pandas as pd
    >>> import numpy as np
    >>> 
    >>> df = pd.DataFrame(np.random.randn(10, 2), columns=list('AB'))
    >>> df['Tenant'] = np.random.choice(['Babar', 'Rataxes', ''], 10)
    >>> print(df)

              A         B   Tenant
    0 -0.588412 -1.179306    Babar
    1 -0.008562  0.725239         
    2  0.282146  0.421721  Rataxes
    3  0.627611 -0.661126    Babar
    4  0.805304 -0.834214         
    5 -0.514568  1.890647    Babar
    6 -1.188436  0.294792  Rataxes
    7  1.471766 -0.267807    Babar
    8 -1.730745  1.358165  Rataxes
    9  0.066946  0.375640

Now we replace any empty strings in the Tenant column with np.nan objects, like so:

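    >>> # Assign the result back; an in-place replace on df['Tenant'] may not update df in recent pandas versions
    >>> df['Tenant'] = df['Tenant'].replace('', np.nan)
    >>> print(df)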

              A         B   Tenant
    0 -0.588412 -1.179306    Babar
    1 -0.008562  0.725239      NaN
    2  0.282146  0.421721  Rataxes
    3  0.627611 -0.661126    Babar
    4  0.805304 -0.834214      NaN
    5 -0.514568  1.890647    Babar
    6 -1.188436  0.294792  Rataxes
    7  1.471766 -0.267807    Babar
    8 -1.730745  1.358165  Rataxes
    9  0.066946  0.375640      NaN

Now we can drop the null values:

    >>> df.dropna(subset=['Tenant'], inplace=True)
    >>> print(df)

              A         B   Tenant
    0 -0.588412 -1.179306    Babar
    2  0.282146  0.421721  Rataxes
    3  0.627611 -0.661126    Babar
    5 -0.514568  1.890647    Babar
    6 -1.188436  0.294792  Rataxes
    7  1.471766 -0.267807    Babar
    8 -1.730745  1.358165  Rataxes
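
Cells read from Excel sometimes hold only spaces rather than a truly empty string, in which case replacing '' alone would miss them. A minimal sketch, assuming that is the situation (the sample frame `df2` and the regex pattern are illustrative choices, not something the question confirms):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one empty cell and one whitespace-only cell.
df2 = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'Tenant': ['Babar', '', '   ', 'Rataxes'],
})

# The pattern ^\s*$ matches empty strings and strings made only of whitespace.
df2['Tenant'] = df2['Tenant'].replace(r'^\s*$', np.nan, regex=True)

# Drop the rows whose Tenant is now NaN.
cleaned = df2.dropna(subset=['Tenant'])
print(cleaned)
```

Whether this is needed depends on what the blank cells actually contain; inspecting a few of them with repr() is one way to find out.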

From: stackoverflow.com/q/29314033