I would like to set up a network in which all of the parameters of one of the linear layers are hard-coded and do not change during training. In other libraries such as PyTorch, one can do this by clearing the requires_grad flag on the parameters one wishes to hold fixed. I can't find any equivalent in the dfdx documentation, nor any mention of "non-trainable" or similar terms.
Does dfdx support this at all? If so, how does one set this up?
I'm not entirely sure, but I believe you can create a wrapper struct that defines how the forward_mut method behaves (assuming you want to implement a Module). In that method, for the linear layers you don't intend to train, you'd call forward instead of forward_mut. I'm not sure how you'd need to handle the Tapes on the input data, though; maybe they can be kept the same.
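A minimal, self-contained sketch of that wrapper idea, assuming simplified stand-ins for dfdx's Module / ModuleMut traits (the real dfdx trait signatures differ and have changed between versions); Frozen and Scale are hypothetical names used only to illustrate the delegation pattern, where the wrapper's mutable forward pass calls the inner module's immutable forward:

```rust
/// Simplified stand-in for dfdx's immutable forward pass.
trait Module<Input> {
    type Output;
    fn forward(&self, input: Input) -> Self::Output;
}

/// Simplified stand-in for dfdx's mutable (training-time) forward pass.
trait ModuleMut<Input> {
    type Output;
    fn forward_mut(&mut self, input: Input) -> Self::Output;
}

/// Wrapper marking the inner module as frozen: even when the surrounding
/// network runs its training-time forward pass, the inner module only ever
/// sees the immutable `forward`.
struct Frozen<M>(M);

impl<Input, M: Module<Input>> Module<Input> for Frozen<M> {
    type Output = M::Output;
    fn forward(&self, input: Input) -> Self::Output {
        self.0.forward(input)
    }
}

impl<Input, M: Module<Input>> ModuleMut<Input> for Frozen<M> {
    type Output = M::Output;
    fn forward_mut(&mut self, input: Input) -> Self::Output {
        // Delegate to the immutable forward: the wrapped layer never takes
        // the training-time code path.
        self.0.forward(input)
    }
}

/// Toy layer standing in for a real dfdx linear layer.
struct Scale(f32);

impl Module<f32> for Scale {
    type Output = f32;
    fn forward(&self, input: f32) -> f32 {
        input * self.0
    }
}

fn main() {
    let mut frozen = Frozen(Scale(2.0));
    // The "training" forward pass still only calls the inner immutable forward.
    assert_eq!(frozen.forward_mut(3.0), 6.0);
}
```

As the comment notes, this alone may not stop gradients from being recorded on the tape for the frozen layer; combining it with the approach in the next comment (giving the optimizer only the trainable parts) is probably the more reliable way to keep those parameters fixed.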
If it's possible to isolate the trainable parts of the model, you can just make the optimizer take only the trainable parts as input. E.g. if you are only training the last layer, you can make the optimizer only take the last layer.
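A minimal sketch of that idea, using toy stand-ins rather than dfdx's real types (ToyLinear, ToySgd, and Model are hypothetical names): the optimizer is constructed over and applied to model.head only, so model.frozen can never be updated, regardless of what gradients were computed.

```rust
/// Toy layer with a single weight, standing in for a real dfdx linear layer.
struct ToyLinear {
    weight: f32,
}

/// Toy SGD optimizer: it can only update the module it is handed.
struct ToySgd {
    lr: f32,
}

impl ToySgd {
    fn update(&self, layer: &mut ToyLinear, grad: f32) {
        layer.weight -= self.lr * grad;
    }
}

struct Model {
    /// Hard-coded layer: never handed to the optimizer.
    frozen: ToyLinear,
    /// Trainable layer: the only thing the optimizer ever touches.
    head: ToyLinear,
}

fn main() {
    let mut model = Model {
        frozen: ToyLinear { weight: 1.0 },
        head: ToyLinear { weight: 0.5 },
    };
    let opt = ToySgd { lr: 0.1 };

    // Suppose backprop produced a gradient for the head; only the head is
    // updated, because only the head is ever passed to the optimizer.
    let head_grad = 0.2;
    opt.update(&mut model.head, head_grad);

    assert_eq!(model.frozen.weight, 1.0); // frozen layer untouched
    assert!((model.head.weight - 0.48).abs() < 1e-6);
}
```

With dfdx's real optimizers the same shape should apply: construct the optimizer from, and call its update on, just the sub-module you want trained. Check the exact constructor and update signatures for the dfdx version you're using, since they have changed between releases.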