"freeze" some variables/scopes in tensorflow: stop_gradient vs passing variables to minimize
I am trying to implement an adversarial NN, which requires 'freezing' one or the other part of the graph during alternating training minibatches. That is, there are two sub-networks: G and D.
G( Z ) -> Xz
D( X ) -> Y
where loss function of
G depends on
First I need to train the parameters in D with all G parameters fixed, and then the parameters in G with the parameters in D fixed. The loss function in the first case is the negative of the loss function in the second case, and each update must apply only to the parameters of either the first or the second subnetwork.
I saw that TensorFlow has a tf.stop_gradient function. For the purpose of training the D (downstream) subnetwork, I can use this function to block the gradient flow:
Z -> [ G ] -> tf.stop_gradient(Xz) -> [ D ] -> Y
The documentation for tf.stop_gradient is very terse and has no in-line example (and the example in seq2seq.py is too long and not that easy to read), but it looks like it must be called during graph creation. Does this imply that if I want to block/unblock gradient flow in alternating batches, I need to re-create and re-initialize the graph model?
Also it seems that one cannot block the gradient flowing through the G (upstream) network by means of
As an alternative, I saw that one can pass a list of variables to the optimizer call as opt_op = opt.minimize(cost, <list of variables>), which would be an easy solution if one could get all the variables in the scope of each subnetwork. Can one get a <list of variables> for a tf.scope?
The easiest way to achieve this, as you mention in your question, is to create two optimizer operations using separate calls to opt.minimize(cost, ...). By default, the optimizer will use all of the variables in tf.trainable_variables(). If you want to filter the variables to a particular scope, you can use the optional scope argument to tf.get_collection() as follows:
optimizer = tf.train.AdagradOptimizer(0.01)

# Train op that updates only the variables under the first scope prefix.
first_train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                     "scope/prefix/for/first/vars")
first_train_op = optimizer.minimize(cost, var_list=first_train_vars)

# Train op that updates only the variables under the second scope prefix.
second_train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES,
                                      "scope/prefix/for/second/vars")
second_train_op = optimizer.minimize(cost, var_list=second_train_vars)
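To make the effect of var_list concrete, here is a framework-free sketch in plain Python. It is not TensorFlow code and does not reflect how the optimizer is implemented: the helpers vars_in_scope, numeric_grad and minimize_step, the toy cost, and the parameter names "G/w", "D/w" and "D/b" are all hypothetical, chosen only for illustration. It assumes parameters are stored in a dict whose keys carry the scope as a prefix, and shows that a step restricted to one group leaves the other group untouched, which is the "freezing" behaviour asked about.

```python
# Illustrative only: a dict keyed by "scope/name" stands in for scoped variables.

def vars_in_scope(params, scope):
    """Return the names of parameters whose name starts with '<scope>/'."""
    prefix = scope.rstrip("/") + "/"
    return [name for name in params if name.startswith(prefix)]


def numeric_grad(cost, params, name, eps=1e-6):
    """Central-difference estimate of d cost / d params[name]."""
    original = params[name]
    params[name] = original + eps
    up = cost(params)
    params[name] = original - eps
    down = cost(params)
    params[name] = original  # restore the exact original value
    return (up - down) / (2 * eps)


def minimize_step(cost, params, var_list, lr=0.1):
    """One gradient-descent step that modifies only the names in var_list."""
    grads = {name: numeric_grad(cost, params, name) for name in var_list}
    for name, grad in grads.items():
        params[name] -= lr * grad


def cost(p):
    """Toy cost that depends on both groups of parameters."""
    return (p["G/w"] * p["D/w"] - 1.0) ** 2 + 0.1 * p["D/b"] ** 2


params = {"G/w": 0.5, "D/w": 1.0, "D/b": 1.0}
g_vars = vars_in_scope(params, "G")
d_vars = vars_in_scope(params, "D")

before = dict(params)
minimize_step(cost, params, d_vars)  # "train D": G parameters stay fixed
after_d = dict(params)
minimize_step(cost, params, g_vars)  # "train G": D parameters stay fixed
after_g = dict(params)
```

In a real training loop the two steps would alternate over minibatches, just as first_train_op and second_train_op would be run in turn; nothing has to be rebuilt between steps because each step simply ignores the variables outside its list.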