Pandas' get_dummies vs. sklearn's OneHotEncoder() :: What are the pros and cons?

I'm learning different methods to convert categorical variables to numeric for machine-learning classifiers. I came across pd.get_dummies and sklearn.preprocessing.OneHotEncoder(), and I wanted to see how they differ in terms of performance and usage.

I found a tutorial on how to use OneHotEncoder() at https://xgdgsc.wordpress.com/2015/03/20/note-on-using-onehotencoder-in-scikit-learn-to-work-on-categorical-features/, since the sklearn documentation wasn't too helpful on this feature. I have a feeling I'm not doing it correctly.

Can someone explain the pros and cons of using pd.get_dummies over sklearn.preprocessing.OneHotEncoder() and vice versa? I know that OneHotEncoder() gives you a sparse matrix, but other than that I'm not sure how it is used and what the benefits are over the pandas method. Am I using it inefficiently?

    import pandas as pd
    import numpy as np
    import seaborn as sns
    from sklearn.datasets import load_iris
    sns.set()

    %matplotlib inline

    #Iris dataset
    iris = load_iris()
    n_samples, m_features = iris.data.shape

    #Load Data
    X, y = iris.data, iris.target
    D_target_dummy = dict(zip(np.arange(iris.target_names.shape[0]), iris.target_names))

    DF_data = pd.DataFrame(X,columns=iris.feature_names)
    DF_data["target"] = pd.Series(y).map(D_target_dummy)
    #   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  target
    #0                5.1               3.5                1.4               0.2   setosa
    #1                4.9               3.0                1.4               0.2   setosa
    #2                4.7               3.2                1.3               0.2   setosa
    #3                4.6               3.1                1.5               0.2   setosa
    #4                5.0               3.6                1.4               0.2   setosa
    #5                5.4               3.9                1.7               0.4   setosa

    DF_dummies = pd.get_dummies(DF_data["target"])
    #   setosa  versicolor  virginica
    #0       1           0          0
    #1       1           0          0
    #2       1           0          0
    #3       1           0          0
    #4       1           0          0
    #5       1           0          0

    from sklearn.preprocessing import OneHotEncoder, LabelEncoder

    def f1(DF_data):
        Enc_ohe, Enc_label = OneHotEncoder(), LabelEncoder()
        #Map the string labels to integers first (pre-0.20 OneHotEncoder needs numeric input)
        DF_data["Dummies"] = Enc_label.fit_transform(DF_data["target"])
        #fit_transform returns a sparse matrix; densify it to build a DataFrame
        DF_dummies2 = pd.DataFrame(Enc_ohe.fit_transform(DF_data[["Dummies"]]).todense(),
                                   columns=Enc_label.classes_)
        return DF_dummies2

    %timeit pd.get_dummies(DF_data["target"])
    #1000 loops, best of 3: 777 µs per loop

    %timeit f1(DF_data)
    #100 loops, best of 3: 2.91 ms per loop
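Part of the gap here is the .todense() call: OneHotEncoder returns a scipy.sparse matrix by default, which is exactly what you want once a feature has many categories. A minimal sketch of inspecting that sparse output, reusing DF_data from the snippet above (the variable names are just for illustration):

    from sklearn.preprocessing import LabelEncoder, OneHotEncoder

    #Integer-encode the targets, then one-hot encode; keep the sparse result
    codes = LabelEncoder().fit_transform(DF_data["target"]).reshape(-1, 1)
    sparse_dummies = OneHotEncoder().fit_transform(codes)
    print(type(sparse_dummies))  #a scipy.sparse matrix, not a dense array
    print(sparse_dummies.nnz)    #150 stored values instead of 150 x 3 = 450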

OneHotEncoder cannot process string values directly, at least in scikit-learn versions before 0.20. If your nominal features are strings, then you need to first map them into integers (e.g. with LabelEncoder, as in f1 above).
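For what it's worth, newer scikit-learn releases (0.20+) lift this restriction and accept string columns directly. A minimal sketch, assuming a recent scikit-learn and hypothetical toy data:

    import numpy as np
    from sklearn.preprocessing import OneHotEncoder

    #Hypothetical single string-valued nominal feature (input must be 2-D)
    colors = np.array([["red"], ["green"], ["blue"], ["green"]])

    #scikit-learn >= 0.20: no LabelEncoder pre-step needed for strings
    enc = OneHotEncoder()
    one_hot = enc.fit_transform(colors).toarray()
    print(enc.categories_)  #[array(['blue', 'green', 'red'], dtype='<U5')]
    print(one_hot)          #rows are one-hot vectors in that category order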

pandas.get_dummies is kind of the opposite: by default it only converts string (object or category dtype) columns into a one-hot representation and passes numeric columns through untouched, unless you name them explicitly via the columns argument.
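A minimal sketch of that default, on a hypothetical mixed-dtype frame (note that recent pandas versions emit boolean dummy columns rather than the 0/1 integers shown here):

    import pandas as pd

    df = pd.DataFrame({"color": ["red", "green", "blue"],  #object dtype
                       "size": [1, 2, 3]})                 #numeric dtype

    #Only the string column is expanded; "size" passes through unchanged
    print(pd.get_dummies(df))
    #   size  color_blue  color_green  color_red
    #0     1           0            0          1
    #1     2           0            1          0
    #2     3           1            0          0

    #Numeric columns are encoded only when listed explicitly
    print(pd.get_dummies(df, columns=["size"]))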

From: stackoverflow.com/q/36631163