How to apply gradient clipping in TensorFlow?

Considering the example code.

I would like to know How to apply gradient clipping on this network on the RNN where there is a possibility of exploding gradients.

    tf.clip_by_value(t, clip_value_min, clip_value_max, name=None)

This is an example that could be used but where do I introduce this ? In the def of RNN

        lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0)
        # Split data because rnn cell needs a list of inputs for the RNN inner loop
        _X = tf.split(0, n_steps, _X) # n_steps
    tf.clip_by_value(_X, -1, 1, name=None)

But this doesn't make sense as the tensor _X is the input and not the grad what is to be clipped?

Do I have to define my own Optimizer for this or is there a simpler option?

Gradient clipping needs to happen after computing the gradients, but before applying them to update the model's parameters. In your example, both of those things are handled by the AdamOptimizer.minimize() method.

In order to clip your gradients you'll need to explicitly compute, clip, and apply them as described in this section in TensorFlow's API documentation. Specifically you'll need to substitute the call to the minimize() method with something like the following:

    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
    gvs = optimizer.compute_gradients(cost)
    capped_gvs = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gvs]
    train_op = optimizer.apply_gradients(capped_gvs)

From: stackoverflow.com/q/36498127

Back to homepage or read more recommendations: