How to store a dataframe using Pandas
Right now I'm importing a fairly large CSV as a dataframe every time I run the script. Is there a good solution for keeping that dataframe constantly available in between runs so I don't have to spend all that time waiting for the script to run?
One simple option is to pickle it with to_pickle:
df.to_pickle(file_name)  # where to save it, usually as a .pkl
Then you can load it back using:
df = pd.read_pickle(file_name)
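To tie this back to the question, here is a minimal sketch of a "parse once, reuse afterwards" pattern built on those two calls. The helper name load_cached and the file names data.csv and data.pkl are made up for illustration; they are not part of pandas. The sketch also assumes the CSV does not change between runs, since it never checks whether the cache is stale.

```python
import os
import tempfile

import pandas as pd


def load_cached(csv_path, cache_path):
    """Return the pickled DataFrame if it exists, else parse the CSV and pickle it.

    Hypothetical helper for illustration; it does not detect a changed CSV.
    """
    if os.path.exists(cache_path):
        return pd.read_pickle(cache_path)
    df = pd.read_csv(csv_path)
    df.to_pickle(cache_path)
    return df


# Small demo in a temporary directory so nothing real is overwritten.
workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "data.csv")
cache_path = os.path.join(workdir, "data.pkl")
pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]}).to_csv(csv_path, index=False)

first = load_cached(csv_path, cache_path)   # parses the CSV and writes the pickle
second = load_cached(csv_path, cache_path)  # reads the pickle, skips CSV parsing
```

If the CSV can change, delete the .pkl file (or compare modification times) before relying on the cached copy. Also, only unpickle files you created yourself, since loading a pickle can execute arbitrary code.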
Note: before 0.11.1, save and load were the only way to do this (they are now deprecated in favor of to_pickle and read_pickle, respectively).
Another option is HDF5 through HDFStore (this requires the PyTables package):
store = pd.HDFStore('store.h5')
store['df'] = df  # save it
store['df']  # load it
More advanced strategies are discussed in the cookbook.