[Feature] Reward EOS close to max_tokens #2694

Open
2 tasks done
komninoschatzipapas opened this issue Jan 1, 2025 · 0 comments

Checklist

Motivation

When sampling with a constrained max_tokens, responses often get cut off. I believe that a penalizer which rewards the EOS token more and more as generation approaches max_tokens would help LLMs produce answers that are both complete and within the token limit.

The reward would start at 0% at token 0 and reach 100% at the max_tokens-th token. We would presumably want to scale it exponentially, so that the reward stays small for most of the generation and does not produce responses much shorter than max_tokens.
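
To make the idea concrete, here is a minimal sketch of such a schedule in plain Python. It is not tied to any existing penalizer API: the function names, the `sharpness` and `max_bias` parameters, and the choice to express the "reward" as an additive bias on the EOS logit are all assumptions made for illustration.

```python
import math


def eos_reward_fraction(step: int, max_tokens: int, sharpness: float = 5.0) -> float:
    """Exponential ramp from 0.0 at step 0 to 1.0 at step == max_tokens.

    `sharpness` is a hypothetical knob: larger values keep the reward
    near zero for longer and concentrate it close to max_tokens.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if sharpness <= 0:
        raise ValueError("sharpness must be positive")
    # Normalized position in the generation, clamped to [0, 1].
    t = min(max(step / max_tokens, 0.0), 1.0)
    # (e^(k*t) - 1) / (e^k - 1) is 0 at t = 0 and 1 at t = 1.
    return math.expm1(sharpness * t) / math.expm1(sharpness)


def apply_eos_reward(
    logits: list[float],
    eos_token_id: int,
    step: int,
    max_tokens: int,
    max_bias: float = 20.0,
    sharpness: float = 5.0,
) -> list[float]:
    """Return a copy of `logits` with the EOS logit boosted by the ramp.

    `max_bias` (an assumption) is the bias added at 100% reward; how
    "100%" should map onto logits is left open in the request above.
    """
    biased = list(logits)
    biased[eos_token_id] += max_bias * eos_reward_fraction(step, max_tokens, sharpness)
    return biased
```

With the default `sharpness`, the reward is still under 10% halfway through the budget, so the EOS token is only nudged noticeably near the end. A real implementation would apply the same bias to the logits tensor inside the sampling loop, once per request and decoding step.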

Unfortunately my Python skills are not sufficient to implement this so I'm creating a feature request instead.

Related resources

No response
