Adding multiple columns to a pandas DataFrame simultaneously

I'm new to pandas and trying to figure out how to add multiple columns to a pandas DataFrame simultaneously. Any help is appreciated. Ideally I would like to do this in one step rather than several repeated steps.

    import numpy as np
    import pandas as pd

    df = {'col_1': [0, 1, 2, 3],
          'col_2': [4, 5, 6, 7]}
    df = pd.DataFrame(df)

    df[['column_new_1', 'column_new_2', 'column_new_3']] = [np.nan, 'dogs', 3]  # thought this would work here...

I would have expected your syntax to work too. The problem arises because, when you create new columns with the column-list syntax (df[[new1, new2]] = ...), pandas requires the right-hand side to be a DataFrame. (It doesn't actually matter whether the columns of that DataFrame have the same names as the columns you are creating.)

Your syntax works fine for assigning scalar values to existing columns, and pandas is also happy to assign a scalar value to a new column using the single-column syntax (df[new1] = ...). So the solution is either to convert this into several single-column assignments, or to create a suitable DataFrame for the right-hand side.
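The three behaviours that do work can be checked with a minimal sketch. The right-hand-side frame `rhs` and its column names `x` and `y` are made up for illustration; they are deliberately different from the target column names to show that values are matched by position.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col_1': [0, 1, 2, 3], 'col_2': [4, 5, 6, 7]})

# Right-hand side is a DataFrame whose column names ('x', 'y') differ from
# the new column names; columns are paired up by position.
rhs = pd.DataFrame({'x': [10, 11, 12, 13], 'y': list('abcd')}, index=df.index)
df[['column_new_1', 'column_new_2']] = rhs

# A scalar assigned to several *existing* columns with the column-list syntax.
df[['col_1', 'col_2']] = 0

# A scalar assigned to one *new* column with the single-column syntax.
df['column_new_3'] = 3
```

Note that `rhs` shares the index of `df`, so every row receives a value.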

Here are several approaches that will work:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame({
        'col_1': [0, 1, 2, 3],
        'col_2': [4, 5, 6, 7]
    })

Then one of the following:

(1) Technically this is three steps, but it looks like one step:

    df['column_new_1'], df['column_new_2'], df['column_new_3'] = [np.nan, 'dogs', 3]

(2) The DataFrame constructor conveniently expands a single row to match the index, so you can do this:

    df[['column_new_1', 'column_new_2', 'column_new_3']] = pd.DataFrame([[np.nan, 'dogs', 3]], index=df.index)

(3) This works well if you build a temporary data frame holding the new columns and then combine it with the original data frame:

    df = pd.concat(
        [
            df,
            pd.DataFrame(
                [[np.nan, 'dogs', 3]], 
                index=df.index, 
                columns=['column_new_1', 'column_new_2', 'column_new_3']
            )
        ], axis=1
    )

(4) Similar to the previous, but using join instead of concat (may be less efficient):

    df = df.join(pd.DataFrame(
        [[np.nan, 'dogs', 3]], 
        index=df.index, 
        columns=['column_new_1', 'column_new_2', 'column_new_3']
    ))

(5) This is a more "natural" way to create the new data frame than the previous two, but the new columns will be sorted alphabetically on older setups (dict insertion order is only preserved from Python 3.6 onward, with pandas 0.23 or later):

    df = df.join(pd.DataFrame(
        {
            'column_new_1': np.nan,
            'column_new_2': 'dogs',
            'column_new_3': 3
        }, index=df.index
    ))

(6) I like this variant on @zero's answer a lot, but like the previous one, the new columns will be sorted alphabetically on older setups (before Python 3.6 and pandas 0.23, where keyword-argument order is not preserved):

    df = df.assign(column_new_1=np.nan, column_new_2='dogs', column_new_3=3)
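If the new column names are held in variables, or are not valid Python identifiers, the same `assign` call can take an unpacked dict instead of literal keywords. This is a small sketch; the names in `new_cols` are hypothetical and chosen only to show names that could not be typed as keyword arguments.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col_1': [0, 1, 2, 3], 'col_2': [4, 5, 6, 7]})

# Hypothetical names and values, held in variables rather than typed as keywords.
new_cols = ['column new 1', 'column-new-2', 'column_new_3']
new_vals = [np.nan, 'dogs', 3]

# assign() returns a new DataFrame; each scalar is broadcast down its column.
df = df.assign(**dict(zip(new_cols, new_vals)))
```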

(7) This is interesting (based on https://stackoverflow.com/a/44951376/3830997), but I don't know when it would be worth the trouble:

    new_cols = ['column_new_1', 'column_new_2', 'column_new_3']
    new_vals = [np.nan, 'dogs', 3]
    df = df.reindex(columns=df.columns.tolist() + new_cols)   # add empty cols
    df[new_cols] = new_vals  # multi-column assignment works for existing cols

(8) In the end it's hard to beat this:

    df['column_new_1'] = np.nan
    df['column_new_2'] = 'dogs'
    df['column_new_3'] = 3
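When the list of new columns is long or built at run time, the same single-column assignments can be driven by a loop. This is only a sketch; the `new_columns` mapping is a hypothetical name introduced for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col_1': [0, 1, 2, 3], 'col_2': [4, 5, 6, 7]})

# Hypothetical mapping of new column name -> constant value.
new_columns = {'column_new_1': np.nan, 'column_new_2': 'dogs', 'column_new_3': 3}

# One single-column assignment per entry; this modifies df in place.
for name, value in new_columns.items():
    df[name] = value
```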

Note: many of these options have already been covered in other answers: "Add multiple columns to DataFrame and set them equal to an existing column"; "Is it possible to add several columns at once to a pandas DataFrame?"; "Pandas: Add multiple empty columns to DataFrame".

From: stackoverflow.com/q/39050539
