Releases: ahrefs/ocannl
The "device memory" concept for multicore
Treats the C function stack of the monolithic update step as "device memory". There is no explicit synchronization; instead, we implement "update on host" where needed: updates that would affect other tasks are applied directly to the host's value of a tensor cell (e.g. adding to it) rather than to the task-local copy, which might be stale.
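A minimal sketch of the "update on host" idea, with hypothetical names (`host`, `make_local`, `update_on_host`) that are not OCANNL's actual API: reads go through a possibly stale task-local copy, while cross-task updates are applied directly to the host value under a lock.

```ocaml
(* Hypothetical illustration, not OCANNL's implementation. *)
let host = Array.make 4 0.0
let host_mutex = Mutex.create ()

(* Task-local snapshot: may go stale; used for local reads only. *)
let make_local () = Array.copy host

(* An update other tasks must observe: apply it on the host directly. *)
let update_on_host i delta =
  Mutex.lock host_mutex;
  host.(i) <- host.(i) +. delta;
  Mutex.unlock host_mutex

let local = make_local ()

let () =
  update_on_host 0 1.5;
  update_on_host 0 2.5;
  (* The host sees both updates; the stale local copy sees neither. *)
  assert (host.(0) = 4.0);
  assert (local.(0) = 0.0);
  print_endline "ok"
```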
Parallel computations (multicore SGD)
An attempt at parallelizing for multicore; it failed in that the Gccjit backend computations are bottlenecked by memory accesses. Further work in this direction would need to, e.g., copy the relevant sub-tensors for each of the parallel tasks.
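The suggested follow-up can be sketched as below: give each task a private copy of its sub-tensor so the hot loop never touches shared memory, writing back only once at the end. This is a hypothetical illustration using OCaml 5 domains, not OCANNL code.

```ocaml
(* Hypothetical sketch: per-task sub-tensor copies to avoid memory
   contention between parallel tasks. Requires OCaml 5 (Domain). *)
let n_tasks = 2
let host = Array.init 8 float_of_int

let run_task task_id =
  let chunk = Array.length host / n_tasks in
  let lo = task_id * chunk in
  (* Private copy of this task's sub-tensor: no sharing while computing. *)
  let local = Array.sub host lo chunk in
  for i = 0 to chunk - 1 do
    local.(i) <- local.(i) *. 2.0
  done;
  (* Single write-back at the end of the task. *)
  Array.blit local 0 host lo chunk

let () =
  List.init n_tasks (fun id -> Domain.spawn (fun () -> run_task id))
  |> List.iter Domain.join;
  assert (host.(0) = 0.0 && host.(7) = 14.0);
  print_endline "ok"
```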
Virtual nodes and constants inlining
CPU single thread with inlining optimizations.

- Operators: arithmetic, power (non-differentiable exponent), ReLU.
- Shape inference: pointwise; transpose; compose; extended einsum (arbitrary permuting and summing-out of individual or matched axes, pointwise ellipsis, broadcasting); dynamic indexing with inner-product-like (pointwise) and outer-product-like variants.
- Backends: interpreter with tracing; compiled by ocamlopt; compiled in-process by gccjit.
- Optimizations: virtual nodes -- when the cells of a tensor are not "recurrent" (accessed across steps) and are not accessed too many times, the defining computation is inlined and the tensor is not materialized; scalar constant subexpression elimination -- for 1D constant tensors, the subexpression is computed at compile time and the value is inlined.
- Text-based visualization: tensors with up to 5 varying axes (other axes fixed), computation graphs with side-by-side subtree layout, plotting "line" graphs and decision boundaries, benchmark tables.