I would like to set up a network in which all of the parameters of one of the linear layers are hard-coded and do not change during training. In other libraries such as PyTorch, one can do this by clearing the requires_grad flag on the parameters one wishes to hold fixed. I can't find any equivalent in the dfdx documentation, nor any mention of "non-trainable" or similar terms.
Does dfdx support this at all? If so, how does one set this up?
I'm not entirely sure, but I believe you can create a wrapper struct that defines how the forward_mut method behaves (assuming you want to implement a Module). In that method, for the linear layers you don't intend to train, you'd call forward instead of forward_mut. I'm not sure how you'd need to handle the Tapes on the input data, though; maybe they can be kept the same.
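A minimal, self-contained sketch of that wrapper idea, assuming simplified stand-ins for dfdx's Module / ModuleMut traits (the real dfdx trait signatures differ and have changed between versions); Frozen and Scale are hypothetical names used only to illustrate the delegation pattern, where the wrapper's mutable forward pass calls the inner module's immutable forward:

```rust
/// Simplified stand-in for dfdx's immutable forward pass.
trait Module<Input> {
    type Output;
    fn forward(&self, input: Input) -> Self::Output;
}

/// Simplified stand-in for dfdx's mutable (training-time) forward pass.
trait ModuleMut<Input> {
    type Output;
    fn forward_mut(&mut self, input: Input) -> Self::Output;
}

/// Wrapper marking the inner module as frozen: even when the surrounding
/// network runs its training-time forward pass, the inner module only ever
/// sees the immutable `forward`.
struct Frozen<M>(M);

impl<Input, M: Module<Input>> Module<Input> for Frozen<M> {
    type Output = M::Output;
    fn forward(&self, input: Input) -> Self::Output {
        self.0.forward(input)
    }
}

impl<Input, M: Module<Input>> ModuleMut<Input> for Frozen<M> {
    type Output = M::Output;
    fn forward_mut(&mut self, input: Input) -> Self::Output {
        // Delegate to the immutable forward: the wrapped layer never takes
        // the training-time code path.
        self.0.forward(input)
    }
}

/// Toy layer standing in for a real dfdx linear layer.
struct Scale(f32);

impl Module<f32> for Scale {
    type Output = f32;
    fn forward(&self, input: f32) -> f32 {
        input * self.0
    }
}

fn main() {
    let mut frozen = Frozen(Scale(2.0));
    // The "training" forward pass still only calls the inner immutable forward.
    assert_eq!(frozen.forward_mut(3.0), 6.0);
}
```

As the comment notes, this alone may not stop gradients from being recorded on the tape for the frozen layer; combining it with the approach in the next comment (giving the optimizer only the trainable parts) is probably the more reliable way to keep those parameters fixed.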
If it's possible to isolate the trainable parts of the model, you can just make the optimizer take only the trainable parts as input. E.g. if you are only training the last layer, you can make the optimizer only take the last layer.
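A minimal sketch of that idea, using toy stand-ins rather than dfdx's real types (ToyLinear, ToySgd, and Model are hypothetical names): the optimizer is constructed over and applied to model.head only, so model.frozen can never be updated, regardless of what gradients were computed.

```rust
/// Toy layer with a single weight, standing in for a real dfdx linear layer.
struct ToyLinear {
    weight: f32,
}

/// Toy SGD optimizer: it can only update the module it is handed.
struct ToySgd {
    lr: f32,
}

impl ToySgd {
    fn update(&self, layer: &mut ToyLinear, grad: f32) {
        layer.weight -= self.lr * grad;
    }
}

struct Model {
    /// Hard-coded layer: never handed to the optimizer.
    frozen: ToyLinear,
    /// Trainable layer: the only thing the optimizer ever touches.
    head: ToyLinear,
}

fn main() {
    let mut model = Model {
        frozen: ToyLinear { weight: 1.0 },
        head: ToyLinear { weight: 0.5 },
    };
    let opt = ToySgd { lr: 0.1 };

    // Suppose backprop produced a gradient for the head; only the head is
    // updated, because only the head is ever passed to the optimizer.
    let head_grad = 0.2;
    opt.update(&mut model.head, head_grad);

    assert_eq!(model.frozen.weight, 1.0); // frozen layer untouched
    assert!((model.head.weight - 0.48).abs() < 1e-6);
}
```

With dfdx's real optimizers the same shape should apply: construct the optimizer from, and call its update on, just the sub-module you want trained. Check the exact constructor and update signatures for the dfdx version you're using, since they have changed between releases.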