How to import data from MongoDB to pandas?

I have a large amount of data in a MongoDB collection that I need to analyze. How do I import that data into pandas?

I am new to pandas and numpy.

EDIT: The MongoDB collection contains sensor values tagged with a date and time. The sensor values are floats.

Sample Data:

    {
    "_cls" : "SensorReport",
    "_id" : ObjectId("515a963b78f6a035d9fa531b"),
    "_types" : [
        "SensorReport"
    ],
    "Readings" : [
        {
            "a" : 0.958069536790466,
            "_types" : [
                "Reading"
            ],
            "ReadingUpdatedDate" : ISODate("2013-04-02T08:26:35.297Z"),
            "b" : 6.296118156595,
            "_cls" : "Reading"
        },
        {
            "a" : 0.95574014778624,
            "_types" : [
                "Reading"
            ],
            "ReadingUpdatedDate" : ISODate("2013-04-02T08:27:09.963Z"),
            "b" : 6.29651468650064,
            "_cls" : "Reading"
        },
        {
            "a" : 0.953648289182713,
            "_types" : [
                "Reading"
            ],
            "ReadingUpdatedDate" : ISODate("2013-04-02T08:27:37.545Z"),
            "b" : 7.29679823731148,
            "_cls" : "Reading"
        },
        {
            "a" : 0.955931884300997,
            "_types" : [
                "Reading"
            ],
            "ReadingUpdatedDate" : ISODate("2013-04-02T08:28:21.369Z"),
            "b" : 6.29642922525632,
            "_cls" : "Reading"
        },
        {
            "a" : 0.95821381,
            "_types" : [
                "Reading"
            ],
            "ReadingUpdatedDate" : ISODate("2013-04-02T08:41:20.801Z"),
            "b" : 7.28956613,
            "_cls" : "Reading"
        },
        {
            "a" : 4.95821335,
            "_types" : [
                "Reading"
            ],
            "ReadingUpdatedDate" : ISODate("2013-04-02T08:41:36.931Z"),
            "b" : 6.28956574,
            "_cls" : "Reading"
        },
        {
            "a" : 9.95821341,
            "_types" : [
                "Reading"
            ],
            "ReadingUpdatedDate" : ISODate("2013-04-02T08:42:09.971Z"),
            "b" : 0.28956488,
            "_cls" : "Reading"
        },
        {
            "a" : 1.95667927,
            "_types" : [
                "Reading"
            ],
            "ReadingUpdatedDate" : ISODate("2013-04-02T08:43:55.463Z"),
            "b" : 0.29115237,
            "_cls" : "Reading"
        }
    ],
    "latestReportTime" : ISODate("2013-04-02T08:43:55.463Z"),
    "sensorName" : "56847890-0",
    "reportCount" : 8
    }

pymongo can help here. This is the code I use:

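    import pandas as pd
    from urllib.parse import quote_plus

    from pymongo import MongoClient


    def _connect_mongo(host, port, username, password, db):
        """Connect to MongoDB and return a handle to the database `db`."""

        if username and password:
            # Percent-escape the credentials so characters such as '@' or ':'
            # do not break the connection URI.
            mongo_uri = 'mongodb://%s:%s@%s:%s/%s' % (
                quote_plus(username), quote_plus(password), host, port, db)
            conn = MongoClient(mongo_uri)
        else:
            conn = MongoClient(host, port)

        return conn[db]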


    def read_mongo(db, collection, query=None, host='localhost', port=27017, username=None, password=None, no_id=True):
        """Read a MongoDB collection into a DataFrame."""

        # Connect to MongoDB
        db = _connect_mongo(host=host, port=port, username=username, password=password, db=db)

        # Query the collection; an empty filter matches every document
        cursor = db[collection].find(query or {})

        # Exhaust the cursor and construct the DataFrame
        df = pd.DataFrame(list(cursor))

        # Drop the _id column (it is absent if the query matched nothing)
        if no_id and '_id' in df.columns:
            del df['_id']

        return df
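
With documents shaped like the sample in the question, the DataFrame built above has one row per sensor report, and the `Readings` column holds a list of dicts. For analysis you probably want one row per reading. A minimal sketch of one way to do that, assuming pandas 1.0 or later (for `pd.json_normalize`): the `docs` list below is a hypothetical stand-in for `list(cursor)`, trimmed to two readings, with the timestamps written as ISO strings rather than the `datetime` objects pymongo would return.

```python
import pandas as pd

# Hypothetical stand-in for list(cursor): one document shaped like the sample,
# trimmed to two readings, with timestamps as ISO strings.
docs = [
    {
        "sensorName": "56847890-0",
        "reportCount": 2,
        "Readings": [
            {"a": 0.958069536790466, "b": 6.296118156595,
             "ReadingUpdatedDate": "2013-04-02T08:26:35.297Z"},
            {"a": 0.95574014778624, "b": 6.29651468650064,
             "ReadingUpdatedDate": "2013-04-02T08:27:09.963Z"},
        ],
    }
]

# One row per element of "Readings"; sensorName is repeated on every row.
readings = pd.json_normalize(docs, record_path="Readings", meta=["sensorName"])

# Parse the timestamps and use them as the index for time-series work.
readings["ReadingUpdatedDate"] = pd.to_datetime(readings["ReadingUpdatedDate"])
readings = readings.set_index("ReadingUpdatedDate").sort_index()

print(readings)
```

Against a real collection you would pass `list(cursor)` in place of `docs`. The values pymongo returns for `ReadingUpdatedDate` are already `datetime` objects, so the `pd.to_datetime` call should then be harmless but unnecessary.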

From: stackoverflow.com/q/16249736
