Gradient-based algorithms are the default training algorithms for ANNs. Hence, providing support for such algorithms (SGD, Adam, RMSProp, etc.) is critical in order to provide out-of-the-box benchmarking capabilities. Our suggestion is to proceed as follows:
Create a new class in the base package (e.g. class derivable : public solution {}) that inherits from the solution class. derivable should have an array of floats (e.g. derivable::m_df) that represents the derivative of the solution::fitness() function with respect to each parameter in solution::get_params(). Hence, the size of derivable::get_df() will be solution::size().
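A minimal sketch of how such a class could be declared follows; the float element type and the exact solution interface (size(), get_params()) are assumptions for illustration, not the library's confirmed API:

```cpp
// Sketch only: assumes solution already exposes size() and get_params().
class derivable : public solution
{
protected:
  // Derivative of fitness() with respect to each parameter in get_params(),
  // hence it holds exactly solution::size() entries.
  float* m_df = nullptr;
};
```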
Define a getter in the derivable class (e.g. derivable::df()) that, in case the solution was modified, calculates the derivative of the fitness function and stores the result in the derivable::m_df array. In case the solution is not modified, it simply returns derivable::m_df. The implementation of this method should follow the same pattern as the current solution::fitness() method.
As in the case of solution::fitness() and solution::calculate_fitness(), consider an implementation of derivable::df() together with a protected virtual derivable::calculate_df() = 0 method in derivable. It is probably a good idea not to create derivable::m_df before the first derivable::df() call, in case the derivative is never used.
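A possible shape for this getter, mirroring the fitness()/calculate_fitness() caching pattern, is sketched below. The m_df_modified dirty flag is a hypothetical name, since the issue does not specify how the "solution was modified" state is tracked; adapt it to whatever mechanism solution::fitness() already uses.

```cpp
// Assumes the derivable class declares:
//   virtual void calculate_df() = 0;  (protected, implemented by each solution)
//   float* m_df = nullptr;            (the derivative array)
//   bool   m_df_modified = true;      (hypothetical dirty flag, see above)
float* derivable::df()
{
  // Allocate the derivative array lazily, on the first call only,
  // so solutions that never ask for the gradient pay no memory cost.
  if (m_df == nullptr)
  {
    m_df = new float[size()];
  }

  // Recompute only when the solution changed since the last call,
  // mirroring the caching behaviour of solution::fitness().
  if (m_df_modified)
  {
    calculate_df();
    m_df_modified = false;
  }

  return m_df;
}
```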
Each child of derivable in the solutions package should re-implement its own version of virtual derivable::calculate_df() = 0 according to its fitness function (only if the fitness function is differentiable, of course). This means that the network class should inherit from derivable instead of solution and implement virtual derivable::calculate_df() = 0.
The network::calculate_df() implementation will call a layer::backprop() method defined in the layer class, passing the position in the derivable::m_df array where the layer will store the derivative of its corresponding parameters. The layer::backprop() method should be similar to the current layer::prop() method.
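As an illustration of the offset bookkeeping only, something along these lines could work; m_layers and layer::size() are assumed names, and the error propagation between layers (which layer::backprop() would need, analogously to layer::prop()) is omitted here:

```cpp
void network::calculate_df()
{
  int offset = 0;

  // Let each layer write the derivative of its own parameters into the
  // slice of m_df reserved for it; offset advances by the number of
  // parameters the layer owns (assumed to be layer::size()).
  for (auto* l : m_layers)  // m_layers is an assumed container of layer*
  {
    l->backprop(m_df + offset);
    offset += l->size();
  }
}
```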
Each child of layer in the layers package should re-implement its own version of virtual layer::backprop() = 0. Currently there should be a single layer, fc (fully connected layer), implemented in the library.
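For fc, the override could look roughly as follows; the cached input (m_in), the per-output error term (m_delta), in_size()/out_size(), and the weight-then-bias layout inside the derivative slice are all assumptions made for the sake of the example:

```cpp
// Writes dL/dW and dL/db for this layer into the slice of m_df passed in.
// Assumed layout: first all weight derivatives, then the bias derivatives.
void fc::backprop(float* df)
{
  for (int j = 0; j < out_size(); ++j)
  {
    for (int i = 0; i < in_size(); ++i)
    {
      // Derivative of the loss w.r.t. weight (i, j): delta_j * input_i.
      df[j * in_size() + i] = m_delta[j] * m_in[i];
    }

    // Derivative of the loss w.r.t. bias j: delta_j.
    df[out_size() * in_size() + j] = m_delta[j];
  }
}
```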
Create a new class in the algorithms package (e.g. class sgd : public algorithm) that, using the derivative and the fitness function of a derivable solution, implements the Stochastic Gradient Descent algorithm.
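A minimal sketch of such an optimizer, assuming the fitness is to be minimized, that get_params() returns a writable pointer, and that the algorithm base class allows an iteration hook named here optimize() (an assumed name, not a confirmed part of the interface):

```cpp
class sgd : public algorithm
{
public:
  explicit sgd(derivable* solution, float learning_rate = 0.01f)
  : m_solution(solution), m_learning_rate(learning_rate)
  {
  }

  // One SGD step: move every parameter against its derivative.
  void optimize()
  {
    float* params = m_solution->get_params();
    float* df = m_solution->df();

    for (int i = 0; i < m_solution->size(); ++i)
    {
      params[i] -= m_learning_rate * df[i];
    }

    // The solution should be flagged as modified at this point so that
    // fitness() and df() recompute on the next call (mechanism omitted).
  }

protected:
  derivable* m_solution;   // the solution being trained
  float m_learning_rate;   // step size of each update
};
```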