# pandas iloc vs ix vs loc explanation; how are they different?

Can someone explain how these three methods of slicing are different?

I've seen the docs, and I've seen these answers, but I still find myself unable to explain how the three are different. To me, they seem interchangeable in large part, because they are at the lower levels of slicing.

For example, say we want to get the first five rows of a `DataFrame`. How is it that all three of these work?

```
df.loc[:5]
df.ix[:5]
df.iloc[:5]
```

Can someone present three cases where the distinction in uses is clearer?

*Note: in pandas version 0.20.0 and above, ix is deprecated and the use of loc and iloc is encouraged instead. I have left the parts of this answer that describe ix intact as a reference for users of earlier versions of pandas. Examples have been added below showing alternatives to ix*.

First, here's a recap of the three methods:

- `loc` gets rows (or columns) with particular *labels* from the index.
- `iloc` gets rows (or columns) at particular *positions* in the index (so it only takes integers).
- `ix` usually tries to behave like `loc` but falls back to behaving like `iloc` if a label is not present in the index.

It's important to note some subtleties that can make `ix` slightly tricky to use:

- if the index is of integer type, `ix` will only use label-based indexing and not fall back to position-based indexing. If the label is not in the index, an error is raised.
- if the index does not contain *only* integers, then given an integer, `ix` will immediately use position-based indexing rather than label-based indexing. If however `ix` is given another type (e.g. a string), it can use label-based indexing.

To illustrate the differences between the three methods, consider the following Series:

```
>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series(np.nan, index=[49,48,47,46,45, 1, 2, 3, 4, 5])
>>> s
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
```

We'll look at slicing with the integer value `3`.

In this case, `s.iloc[:3]` returns the first 3 rows (since it treats 3 as a position) and `s.loc[:3]` returns the first 8 rows (since it treats 3 as a label):

```
>>> s.iloc[:3] # slice the first three rows
49 NaN
48 NaN
47 NaN
>>> s.loc[:3] # slice up to and including label 3
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN
2 NaN
3 NaN
>>> s.ix[:3] # the integer is in the index so s.ix[:3] works like loc
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN
2 NaN
3 NaN
```

Notice `s.ix[:3]` returns the same Series as `s.loc[:3]` since it looks for the label first rather than working on the position (and the index for `s` is of integer type).
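For readers on a pandas version where `ix` is no longer available, the `loc`/`iloc` half of this comparison can still be checked with a short script. This is a minimal sketch that rebuilds the same Series `s`; the variable names `by_position` and `by_label` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Same Series as above: integer labels that do not match the positions.
s = pd.Series(np.nan, index=[49, 48, 47, 46, 45, 1, 2, 3, 4, 5])

by_position = s.iloc[:3]  # positions 0, 1 and 2
by_label = s.loc[:3]      # everything up to and including the label 3

print(by_position.index.tolist())  # [49, 48, 47]
print(by_label.index.tolist())     # [49, 48, 47, 46, 45, 1, 2, 3]
```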

What if we try with an integer label that isn't in the index (say `6`)?

Here `s.iloc[:6]` returns the first 6 rows of the Series as expected. However, `s.loc[:6]` raises a KeyError since `6` is not in the index.

```
>>> s.iloc[:6]
49 NaN
48 NaN
47 NaN
46 NaN
45 NaN
1 NaN
>>> s.loc[:6]
KeyError: 6
>>> s.ix[:6]
KeyError: 6
```

As per the subtleties noted above, `s.ix[:6]` now raises a KeyError because it tries to work like `loc` but can't find a `6` in the index. Because our index is of integer type, `ix` doesn't fall back to behaving like `iloc`.
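It may help to see that the `KeyError` from `loc` here also depends on the index being unsorted. The sketch below assumes a reasonably recent pandas and reuses the same Series; the names `raised` and `sorted_labels` are hypothetical helpers for this illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.nan, index=[49, 48, 47, 46, 45, 1, 2, 3, 4, 5])

# The index is not sorted, so loc cannot tell where the missing label 6
# would fall, and the slice raises KeyError.
try:
    s.loc[:6]
    raised = False
except KeyError:
    raised = True
print(raised)  # True

# Once the index is sorted (monotonic), the same slice is well defined:
# it returns every label less than or equal to 6.
sorted_labels = s.sort_index().loc[:6].index.tolist()
print(sorted_labels)  # [1, 2, 3, 4, 5]
```

So a missing slice bound is only an error for `loc` when pandas has no ordering to fall back on.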

If, however, our index were of mixed type, then given an integer, `ix` would behave like `iloc` immediately instead of raising a KeyError:

```
>>> s2 = pd.Series(np.nan, index=['a','b','c','d','e', 1, 2, 3, 4, 5])
>>> s2.index.is_mixed() # index is mix of different types
True
>>> s2.ix[:6] # now behaves like iloc given integer
a NaN
b NaN
c NaN
d NaN
e NaN
1 NaN
```

Keep in mind that `ix` can still accept non-integers and behave like `loc`:

```
>>> s2.ix[:'c'] # behaves like loc given non-integer
a NaN
b NaN
c NaN
```

As general advice, if you're only indexing using labels, or only indexing using integer positions, stick with `loc` or `iloc` to avoid unexpected results - try not to use `ix`.
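This advice also applies to the default integer index used in the question, where `loc` and `iloc` look interchangeable but are not. A minimal sketch, assuming a hypothetical ten-row frame with the default `RangeIndex`:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with the default RangeIndex (labels 0 through 9).
df = pd.DataFrame({"value": np.arange(10)})

first_by_label = df.loc[:5]      # labels 0 through 5, endpoint included
first_by_position = df.iloc[:5]  # positions 0 through 4, endpoint excluded

print(len(first_by_label))     # 6
print(len(first_by_position))  # 5
```

Both calls succeed, but only `iloc[:5]` returns exactly five rows; `loc[:5]` includes the row labelled `5` as well.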

### Combining position-based and label-based indexing

Sometimes given a DataFrame, you will want to mix label and positional indexing methods for the rows and columns.

For example, consider the following DataFrame. How best to slice the rows up to and including 'c' *and* take the first four columns?

```
>>> df = pd.DataFrame(np.nan,
...                   index=list('abcde'),
...                   columns=['x','y','z', 8, 9])
>>> df
    x   y   z   8   9
a NaN NaN NaN NaN NaN
b NaN NaN NaN NaN NaN
c NaN NaN NaN NaN NaN
d NaN NaN NaN NaN NaN
e NaN NaN NaN NaN NaN
```

In earlier versions of pandas (before 0.20.0) `ix` lets you do this quite neatly - we can slice the rows by label and the columns by position (note that for the columns, `ix` will default to position-based slicing since `4` is not a column name):

```
>>> df.ix[:'c', :4]
    x   y   z   8
a NaN NaN NaN NaN
b NaN NaN NaN NaN
c NaN NaN NaN NaN
```

In later versions of pandas, we can achieve this result using `iloc` and the help of another method:

```
>>> df.iloc[:df.index.get_loc('c') + 1, :4]
    x   y   z   8
a NaN NaN NaN NaN
b NaN NaN NaN NaN
c NaN NaN NaN NaN
```

`get_loc()` is an index method meaning "get the position of the label in this index". Note that since slicing with `iloc` is exclusive of its endpoint, we must add 1 to this value if we want row 'c' as well.
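The `get_loc()` approach is not the only way to mix the two styles without `ix`. The sketch below compares it with two alternatives that, as far as I can tell, give the same result: chaining `loc` and `iloc`, and turning the column positions into labels with `df.columns[:4]`. The integer cell values are an assumption made here so that the results are easy to compare; the labels match the DataFrame above.

```python
import numpy as np
import pandas as pd

# Same row and column labels as above; the integer values are an
# assumption for illustration (the original frame holds NaN).
df = pd.DataFrame(np.arange(25).reshape(5, 5),
                  index=list("abcde"),
                  columns=["x", "y", "z", 8, 9])

# 1. Convert the row label to a position, then use iloc for both axes.
via_get_loc = df.iloc[:df.index.get_loc("c") + 1, :4]

# 2. Chain the indexers: rows by label first, then columns by position.
via_chain = df.loc[:"c"].iloc[:, :4]

# 3. Convert the column positions to labels, then use loc for both axes.
via_labels = df.loc[:"c", df.columns[:4]]

all_equal = via_get_loc.equals(via_chain) and via_get_loc.equals(via_labels)
print(all_equal)  # True
```

Chaining is convenient for reading data, but for assignment a single `loc` or `iloc` call (options 1 and 3) is the safer choice, since assigning through a chained selection may modify a temporary copy.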

There are further examples in the pandas documentation.

From: stackoverflow.com/q/31593201