v0.3.0

jon-tow released this 08 Dec 08:34

62ca184

HuggingFace Datasets Integration

This release makes the HuggingFace datasets library the core dataset management interface and removes the previous custom dataset downloaders.
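As a rough illustration of what this means for a task definition, the sketch below shows a task that only names a dataset and delegates fetching to a loader. The class names (`SketchTask`, `SketchSwag`), the `loader` parameter, and the `"swag"` / `"regular"` identifiers are assumptions made for this example, not the harness's actual code; the only real library call is `datasets.load_dataset`.

```python
from typing import Any, Callable, Optional


def _hf_load_dataset(path: str, name: Optional[str] = None) -> Any:
    # Imported lazily so the sketch can be read and exercised without the
    # `datasets` package installed or network access.
    import datasets

    return datasets.load_dataset(path, name)


class SketchTask:
    """Hypothetical task base class (illustration only)."""

    # Subclasses declare which dataset they need instead of shipping a
    # custom downloader.
    DATASET_PATH: str = ""
    DATASET_NAME: Optional[str] = None

    def __init__(self, loader: Callable[..., Any] = _hf_load_dataset):
        # `loader` is injectable (an assumption of this sketch) so that a
        # stub can replace the real download in tests.
        self.dataset = loader(self.DATASET_PATH, self.DATASET_NAME)

    def validation_docs(self):
        # Splits are looked up by name on the loaded dataset object.
        return self.dataset["validation"]


class SketchSwag(SketchTask):
    # Assumed dataset identifiers, used here only as an example.
    DATASET_PATH = "swag"
    DATASET_NAME = "regular"
```

With this shape, adding a task that already exists on the hub reduces to setting two class attributes, which is presumably why the custom downloaders could be dropped.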

What's Changed

Refactor Task downloading to use HuggingFace.datasets by @jon-tow in #300
Add templates and update docs by @jon-tow in #308
Add dataset features to TriviaQA by @jon-tow in #305
Add SWAG by @jon-tow in #306
Fixes for using lm_eval as a library by @dirkgr in #309
Researcher2 by @researcher2 in #261
Suggested updates for the task guide by @StephenHogg in #301
Add pre-commit by @Mistobaan in #317
Decontam import fix by @jon-tow in #321
Add bootstrap_iters kwarg by @Muennighoff in #322
Update decontamination.md by @researcher2 in #331
Fix key access in squad evaluation metrics by @konstantinschulz in #333
Fix make_disjoint_window for tail case by @richhankins in #336
Manually concat tokenizer revision with subfolder by @jon-tow in #343
[deps] Use minimum versioning for numexpr by @jon-tow in #352
Remove custom datasets that are in HF by @jon-tow in #330
Add TextSynth API by @jon-tow in #299
Add the original LAMBADA dataset by @jon-tow in #357

New Contributors

@dirkgr made their first contribution in #309
@Mistobaan made their first contribution in #317
@konstantinschulz made their first contribution in #333
@richhankins made their first contribution in #336

Full Changelog: v0.2.0...v0.3.0

Contributors

Mistobaan, richhankins, and 6 other contributors
