(B) bin/moons_benchmark with the cc backend crashes with half-prec overflow {cm:2024-11-24}
(B) remove syncing from the data parallel algo: stream-to-stream syncing is now automatic {cm:2024-11-23}
(A) cuda backend crashes in bin/moons_benchmark {cm:2024-11-22}
(B) figure out why cuda backend parallelism slows down in later epochs {cm:2024-11-25}
clean up event hashtables when a stream or device gets synchronized {cm:2024-12-03}