v0.3.0
HuggingFace Datasets Integration
This release integrates HuggingFace datasets as the core dataset management interface, removing previous custom downloaders.
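As a rough illustration of what this change means for a task definition, the sketch below loads a task's data through `datasets.load_dataset` instead of a hand-written downloader. The class `SketchTask`, its attribute names, and the injectable `loader` argument are assumptions made for this example and are not taken from the harness. Only `datasets.load_dataset(path=..., name=...)` is a real library call.

```python
from typing import Any, Callable, Optional


class SketchTask:
    """Hypothetical task whose data comes from the HuggingFace Hub."""

    # Assumed attribute names: Hub dataset path and optional configuration name.
    DATASET_PATH: str = "swag"
    DATASET_NAME: Optional[str] = "regular"

    def __init__(self, loader: Optional[Callable[..., Any]] = None) -> None:
        if loader is None:
            # Imported lazily so the sketch can be exercised without the library.
            from datasets import load_dataset

            loader = load_dataset
        # A single call replaces any custom download and extraction logic.
        self.dataset = loader(path=self.DATASET_PATH, name=self.DATASET_NAME)

    def validation_docs(self) -> Any:
        # Splits are addressed by name on the object the loader returns.
        return self.dataset["validation"]
```

Calling `SketchTask()` with no arguments would fetch SWAG on first use and needs network access. Passing a stub `loader` keeps the sketch testable offline.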
What's Changed
- Refactor Task downloading to use HuggingFace.datasets by @jon-tow in #300
- Add templates and update docs by @jon-tow in #308
- Add dataset features to TriviaQA by @jon-tow in #305
- Add SWAG by @jon-tow in #306
- Fixes for using lm_eval as a library by @dirkgr in #309
- Researcher2 by @researcher2 in #261
- Suggested updates for the task guide by @StephenHogg in #301
- Add pre-commit by @Mistobaan in #317
- Decontam import fix by @jon-tow in #321
- Add bootstrap_iters kwarg by @Muennighoff in #322
- Update decontamination.md by @researcher2 in #331
- Fix key access in squad evaluation metrics by @konstantinschulz in #333
- Fix make_disjoint_window for tail case by @richhankins in #336
- Manually concat tokenizer revision with subfolder by @jon-tow in #343
- [deps] Use minimum versioning for numexpr by @jon-tow in #352
- Remove custom datasets that are in HF by @jon-tow in #330
- Add TextSynth API by @jon-tow in #299
- Add the original LAMBADA dataset by @jon-tow in #357
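These notes do not describe the `bootstrap_iters` kwarg from #322 any further. Assuming it sets how many bootstrap resamples are drawn when estimating a metric's standard error, the idea can be sketched in plain Python as follows. The function `bootstrap_stderr`, its defaults, and the sample data are hypothetical and are not the harness's implementation.

```python
import random
import statistics
from typing import Callable, Sequence


def bootstrap_stderr(
    values: Sequence[float],
    metric: Callable[[Sequence[float]], float] = statistics.mean,
    bootstrap_iters: int = 1000,
    seed: int = 1234,
) -> float:
    """Estimate the standard error of `metric` by resampling with replacement."""
    if bootstrap_iters < 2:
        raise ValueError("bootstrap_iters must be at least 2")
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(bootstrap_iters):
        # Draw n items with replacement and recompute the metric on them.
        resample = rng.choices(values, k=n)
        estimates.append(metric(resample))
    # The spread of the metric across resamples approximates its standard error.
    return statistics.stdev(estimates)


# Made-up per-example accuracy scores, for illustration only.
per_example_accuracy = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0]
stderr = bootstrap_stderr(per_example_accuracy, bootstrap_iters=2000)
```

More iterations give a steadier estimate but take proportionally longer, which is the usual reason to expose the count as a parameter.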
New Contributors
- @dirkgr made their first contribution in #309
- @Mistobaan made their first contribution in #317
- @konstantinschulz made their first contribution in #333
- @richhankins made their first contribution in #336
Full Changelog: v0.2.0...v0.3.0