# What is the role of TimeDistributed layer in Keras?

I am trying to grasp what the TimeDistributed wrapper does in Keras.

I get that TimeDistributed "applies a layer to every temporal slice of an input."

But I ran an experiment and got results that I cannot understand.

In short, when placed after an LSTM layer, TimeDistributed(Dense) and a plain Dense layer give the same results.

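```
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

# Model 1: Dense wrapped in TimeDistributed
model = Sequential()
model.add(LSTM(5, input_shape=(10, 20), return_sequences=True))
model.add(TimeDistributed(Dense(1)))
print(model.output_shape)

# Model 2: plain Dense applied directly to the LSTM sequence output
model = Sequential()
model.add(LSTM(5, input_shape=(10, 20), return_sequences=True))
model.add(Dense(1))
print(model.output_shape)
```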

For both models, I got an output shape of **(None, 10, 1)**.

Can anyone explain the difference between TimeDistributed and a Dense layer after an RNN layer?

In `keras`, when building a sequential model, the second dimension (the one after the sample dimension) is usually the `time` dimension. This means that if, for example, your data is `5-dim` with shape `(sample, time, width, length, channel)`, you can use `TimeDistributed` to apply a convolutional layer (which expects `4-dim` input of shape `(sample, width, length, channel)`) along the time dimension, applying the same layer to each time slice, in order to obtain a `5-d` output.
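As a minimal sketch of that convolutional case: the input shape (10 frames of 32x32 RGB images) and the `Conv2D` settings below are made up for illustration, and the `import keras` line assumes a recent Keras installation.

```python
import keras

# Hypothetical input: each sample is 10 time steps of 32x32 RGB frames,
# i.e. (time, width, length, channel) once the sample dimension is dropped.
model = keras.Sequential([
    keras.Input(shape=(10, 32, 32, 3)),
    # The same Conv2D layer (shared weights) is applied to each of the 10 frames.
    keras.layers.TimeDistributed(keras.layers.Conv2D(filters=8, kernel_size=3)),
])

print(model.output_shape)  # (None, 10, 30, 30, 8)
```

The time dimension (10) passes through unchanged, while each frame shrinks from 32x32 to 30x30 because the convolution uses the default `valid` padding.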

As for `Dense`: in `keras` from version 2.0 onward, `Dense` is by default applied only to the last dimension (e.g. if you apply `Dense(10)` to an input with shape `(n, m, o, p)`, you'll get an output with shape `(n, m, o, 10)`), so in your case `Dense` and `TimeDistributed(Dense)` are equivalent.
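The equivalence can be checked numerically. The following is a NumPy sketch of the arithmetic only, not Keras's actual implementation; the shapes, the random weights, and all variable names are assumptions chosen to resemble the LSTM output in the question.

```python
import numpy as np

rng = np.random.default_rng(0)

batch, time_steps, features, units = 4, 10, 5, 1
# Stand-in for an LSTM output with return_sequences=True: (batch, time, features).
x = rng.normal(size=(batch, time_steps, features))
kernel = rng.normal(size=(features, units))  # one weight matrix shared by all time steps
bias = rng.normal(size=(units,))

# Dense on the last axis: a single matmul that broadcasts over batch and time.
dense_out = x @ kernel + bias

# TimeDistributed-style: apply the same weights to each time slice, then restack.
per_step = [x[:, t, :] @ kernel + bias for t in range(time_steps)]
time_distributed_out = np.stack(per_step, axis=1)

print(dense_out.shape)  # (4, 10, 1)
print(np.allclose(dense_out, time_distributed_out))  # True
```

Both routes use one shared weight matrix across all time steps, which is why the two models in the question report the same output shape.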

From: stackoverflow.com/q/47305618