
Stochastic Gradient Descent algorithm #3

Open · 7 tasks
jairodelgado opened this issue Jun 4, 2019 · 0 comments

jairodelgado commented Jun 4, 2019

Gradient-based algorithms are the default training algorithms for ANNs. Hence, supporting such algorithms (SGD, Adam, RMSProp, etc.) is critical in order to provide out-of-the-box benchmarking capabilities. Our suggestion is to proceed as follows:

  • Create a new class in the base package (e.g. class derivable : public solution {}) that inherits from the solution class. derivable should have an array of floats (e.g. derivable::m_df) that represents the derivative of solution::fitness() with respect to each parameter in solution::get_params(). Hence, the array returned by derivable::get_df() will have solution::size() elements (a rough sketch of this class is given after the task list).

  • Define a getter in the derivable class (e.g. derivable::df()) that, in case the solution was modified, calculates the derivative of the fitness function and stores the result in the derivable::m_df array. In case the solution was not modified, it simply returns derivable::m_df. The implementation of this method should follow the same pattern as the current solution::fitness() method.

  • As in the case of solution::fitness() and solution::calculate_fitness(), consider an implementation of derivable::df() together with a protected virtual derivable::calculate_df() = 0 method in derivable. It is probably a good idea not to allocate derivable::m_df before the first derivable::df() call, just in case the derivative is never used.

  • Each child of derivable in the solutions package should implement its own version of virtual derivable::calculate_df() = 0 according to its fitness function (only if the fitness function is differentiable, of course). This means that the network class should inherit from derivable instead of solution and implement virtual derivable::calculate_df() = 0.

  • The network::calculate_df() implementation will call a layer::backprop() method defined in the layer class, passing the position in the derivable::m_df array where the layer will store the derivative of its corresponding parameters (see the layer/network sketch after the list). The layer::backprop() method should be similar to the current layer::prop() method.

  • Each child of layer in the layers package should implement its own version of virtual layer::backprop() = 0. Currently there should be a single layer, fc (fully connected layer), implemented in the library.

  • Create a new class in the algorithms package (e.g. class sgd : public algorithm) that, using the derivative and the fitness function of a derivable solution, implements the Stochastic Gradient Descent algorithm (a sketch of the update loop follows below).
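
A minimal sketch of the derivable class described in the first three tasks could look as follows. The solution members shown here (fitness(), get_params(), size(), and a modified flag with a set_modified() hook) are assumptions based on this description, not the library's actual declarations:

```cpp
// Sketch only: the solution interface below is an assumption, not the real API.
#include <cstddef>
#include <vector>

class solution {
public:
  virtual ~solution() = default;
  virtual float fitness() = 0;                    // assumed existing interface
  virtual std::vector<float>& get_params() = 0;   // assumed existing interface
  virtual std::size_t size() const = 0;           // number of parameters
  void set_modified(bool v) { m_modified = v; }   // assumed dirty-flag hook
protected:
  bool m_modified = true;
};

class derivable : public solution {
public:
  // Lazily (re)computes the gradient only when the solution has changed,
  // mirroring the caching behaviour described for solution::fitness().
  const std::vector<float>& df() {
    if (m_df.empty() || m_modified) {
      m_df.resize(size());        // m_df is only allocated on first use
      calculate_df();
      m_modified = false;
    }
    return m_df;
  }

protected:
  // Each concrete solution fills m_df[i] with d(fitness)/d(param_i).
  virtual void calculate_df() = 0;
  std::vector<float> m_df;        // same length as get_params()
};
```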
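
Building on that sketch, the layer::backprop() hook and network::calculate_df() from the next three tasks might be wired up like this. The params() helper, the exact backprop() signature, and the fc internals are illustrative guesses; only the idea of passing each layer its offset into m_df comes from the tasks above:

```cpp
// Continues the derivable sketch above; layer/fc details are assumptions.
#include <memory>
#include <vector>

class layer {
public:
  virtual ~layer() = default;
  virtual std::size_t params() const = 0;   // number of trainable parameters
  // Analogous to the existing layer::prop(): writes d(fitness)/d(param) for
  // this layer into df, starting at offset.
  virtual void backprop(std::vector<float>& df, std::size_t offset) = 0;
};

class fc : public layer {                    // fully connected layer
public:
  std::size_t params() const override { return m_weights.size(); }
  void backprop(std::vector<float>& df, std::size_t offset) override {
    for (std::size_t i = 0; i < m_weights.size(); ++i)
      df[offset + i] = 0.0f;                 // placeholder: real per-weight gradient goes here
  }
private:
  std::vector<float> m_weights;
};

// network now inherits from derivable instead of solution.
class network : public derivable {
public:
  float fitness() override { return 0.0f; }  // placeholder for the existing forward pass
  std::vector<float>& get_params() override { return m_params; }
  std::size_t size() const override { return m_params.size(); }

protected:
  void calculate_df() override {
    std::size_t offset = 0;
    for (auto& l : m_layers) {
      l->backprop(m_df, offset);             // each layer fills its own slice of m_df
      offset += l->params();
    }
  }

private:
  std::vector<float> m_params;
  std::vector<std::unique_ptr<layer>> m_layers;
};
```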
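
Finally, a rough sketch of the sgd class from the last task, again assuming the derivable interface above. The algorithm base class, its interface, and any mini-batch handling are not specified in this issue, so only the core parameter update is shown:

```cpp
// Core SGD update loop on top of derivable; algorithm base class omitted.
#include <cstddef>
#include <vector>

class sgd /* : public algorithm */ {
public:
  sgd(float learning_rate, std::size_t iterations)
      : m_lr(learning_rate), m_iterations(iterations) {}

  void apply(derivable& s) {
    for (std::size_t it = 0; it < m_iterations; ++it) {
      const std::vector<float>& grad = s.df();   // cached until the solution changes
      std::vector<float>& params = s.get_params();
      for (std::size_t i = 0; i < params.size(); ++i)
        params[i] -= m_lr * grad[i];             // theta <- theta - lr * d(fitness)/d(theta)
      s.set_modified(true);                      // invalidate cached fitness and gradient
    }
  }

private:
  float m_lr;
  std::size_t m_iterations;
};
```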

jairodelgado changed the title from "Support for gradient based optimization algorithms" to "Stochastic Gradient Descent algorithm" on Jan 26, 2020
jairodelgado added this to the v.1.1 milestone on Feb 8, 2020
jairodelgado assigned ghost on Feb 9, 2020