In the code, it looks like that when `use_cache` is enabled, the task results for a given model are saved only after the evaluation completes successfully.
The `save_to_cache` function is called after the evaluation finishes, but if the run ends abnormally, there seems to be no way to continue the evaluation from where it left off.
For example, if you are evaluating 10 tasks and the run fails at 80% progress, you have to start the evaluation from the beginning again.
I would greatly appreciate it if you could let me know whether such a resume feature exists.
Hi! We do actually have this implemented: `--use_cache <DIR>` caches the model results while evaluating and skips previously evaluated samples on resumption. Caching is rank-dependent though, so restart with the same GPU count if interrupted! We also have `--cache_requests`, which saves the dataset preprocessing steps so evaluation can resume more quickly.
I should update the README to make these more prominent!
Hi! For example, if you run with `lm_eval --model hf --model_args pretrained=EleutherAI/pythia-14m --tasks gsm8k,arc_easy --use_cache test_dir`, the model outputs will be cached as they are generated. If the evaluation ends prematurely, or you just want to recalculate the metrics, the next time you run the command with `--use_cache test_dir` it will check which samples have already been completed and skip them.
Thank you for the kind guidance. 😊
It seems there is a bug in the `save_to_cache` function located in `caching/cache.py`: if `file_name` contains a space (`' '`) or a slash (`'/'`), the `.pickle` file is not created.
In my case, the issue occurred because the `cache_key` generation picked up special characters from `tokenizer_name`.
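For reference, here is a minimal sketch of the kind of sanitization that would avoid this. `sanitize_cache_key` is a hypothetical helper for illustration, not part of the harness:

```python
import re

def sanitize_cache_key(key: str) -> str:
    # Hypothetical helper (not in lm-evaluation-harness): map characters
    # that are unsafe in file names (spaces, slashes, etc.) to underscores
    # so that f"{key}.pickle" can actually be created on disk.
    return re.sub(r"[^A-Za-z0-9._-]", "_", key)

# A tokenizer name like "EleutherAI/pythia-14m" contains '/', which would
# otherwise be treated as a directory separator in the pickle path.
print(sanitize_cache_key("EleutherAI/pythia-14m"))  # EleutherAI_pythia-14m
```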
You're welcome. Will take a look, but note that `save_to_cache` is used to cache pre-processing steps (mostly data) so the program can start quicker next time around. Results caching is handled by `CachingLM`.
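To illustrate the distinction, here is a rough sketch of the per-sample results-caching pattern described above, under the assumption that each completed output is persisted as it is produced. Class and method names are hypothetical; this is not the actual `CachingLM` implementation:

```python
import hashlib
import os
import pickle

class CachingWrapper:
    """Illustrative sketch only: wrap a model so each request's output is
    written to disk as soon as it is produced, letting an interrupted run
    skip samples that were already completed."""

    def __init__(self, model, cache_dir: str):
        self.model = model
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)

    def _path(self, request: str) -> str:
        # Hash the request so the file name is always filesystem-safe.
        digest = hashlib.sha256(request.encode()).hexdigest()
        return os.path.join(self.cache_dir, f"{digest}.pickle")

    def generate(self, request: str):
        path = self._path(request)
        if os.path.exists(path):  # already evaluated: load and skip
            with open(path, "rb") as f:
                return pickle.load(f)
        result = self.model.generate(request)  # run the model
        with open(path, "wb") as f:  # persist immediately, not at the end
            pickle.dump(result, f)
        return result
```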
Thank you for your kind reply.
Originally posted by @baberabb in #2533 (comment)