Motivation
A lot of the time when you're sampling with a constrained `max_tokens` budget, you end up with cut-off responses. If a penalizer rewarded the EOS token more and more strongly as generation approaches `max_tokens`, it would help LLMs produce answers that are both complete and within the token limit.
The reward would start at 0% at the 0th token and reach 100% at the `max_tokens`-th token. We would presumably want to scale this exponentially, so the reward stays near zero for most of the budget and short responses well under `max_tokens` are not encouraged.
Unfortunately my Python skills are not sufficient to implement this, so I'm creating a feature request instead.
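For illustration, a minimal sketch of the ramp described above. Everything here is an assumption rather than an existing sglang API: the function names, the `max_bias` and `sharpness` parameters, and the choice to add a bias in logit space rather than literally rewarding a probability.

```python
import math

def eos_ramp_bias(step: int, max_tokens: int,
                  max_bias: float = 10.0, sharpness: float = 5.0) -> float:
    """Bias to add to the EOS-token logit at decoding step `step`.

    0.0 at step 0, rising exponentially to `max_bias` at step
    `max_tokens`, so the pull toward EOS is negligible for most of
    the budget and only kicks in near the limit.
    """
    progress = min(step / max_tokens, 1.0)  # fraction of the budget used
    return max_bias * (math.exp(sharpness * progress) - 1.0) / (math.exp(sharpness) - 1.0)

def apply_eos_ramp(logits, eos_token_id: int, step: int, max_tokens: int):
    """Apply the ramp to one step's next-token logits.

    `logits` is the 1-D logit vector for the next token (e.g. a
    torch.Tensor or numpy array); modified in place and returned.
    """
    logits[eos_token_id] = logits[eos_token_id] + eos_ramp_bias(step, max_tokens)
    return logits
```

Biasing the logit rather than multiplying probabilities matches how samplers typically apply penalties, and the `sharpness` knob would control how abruptly the ramp kicks in near the limit.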
Related resources
No response