
[Feature] Add docs for pass in token ids directly #2661

Open · 2 tasks done
zhaochenyang20 opened this issue Dec 30, 2024 · 9 comments
Labels: documentation · good first issue · RLHF

@zhaochenyang20 (Collaborator)

Checklist

Motivation

In most RLHF frameworks, the prompts are pre-tokenized during data processing, so token ids can be passed directly to the SGLang engine rather than text prompts. We should add docs on how to do this and how to get tokens back directly.

Related resources

None.

zhaochenyang20 added the documentation and good first issue labels on Dec 30, 2024
@shuaills (Contributor)

The --skip-tokenizer-init flag allows direct input of token IDs instead of text prompts.

```python
if self.server_args.skip_tokenizer_init:
    json_data = {
        "input_ids": prompt,
        "sampling_params": sampling_params,
        "stream": True,
    }
```

```python
parser.add_argument(
    "--skip-tokenizer-init",
    action="store_true",
    help="If set, skip init tokenizer and pass input_ids in generate request",
)
```
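For example, a minimal offline sketch of the flow (assuming the offline Engine accepts `skip_tokenizer_init` and an `input_ids` argument to `generate`, mirroring the server snippets above; the model path is illustrative):

```python
import sglang as sgl
from transformers import AutoTokenizer

MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # illustrative model path

# Tokenize offline, as an RLHF data pipeline would.
tokenizer = AutoTokenizer.from_pretrained(MODEL)
input_ids = tokenizer("The capital of France is")["input_ids"]

# skip_tokenizer_init tells the engine not to load a tokenizer;
# requests must then carry token ids instead of text.
engine = sgl.Engine(model_path=MODEL, skip_tokenizer_init=True)
out = engine.generate(input_ids=input_ids, sampling_params={"temperature": 0})

# With no tokenizer in the engine, the output carries token ids, not text.
print(out)
engine.shutdown()
```

The same payload shape should work against the HTTP server, as the `json_data` snippet above suggests.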

@shuaills (Contributor)

Also, clarify the naming when skip_tokenizer_init is on: prompt -> prompt tokens.

zhaochenyang20 self-assigned this on Dec 30, 2024
zhaochenyang20 added the RLHF label on Dec 30, 2024
@zhaochenyang20 (Collaborator, Author)

@shuaills Also, check whether these two parameters conflict.

Add an example on (a rough sketch follows this list):

  1. launch a server with clear arguments.
  2. pass in token ids instead of prompts.
  3. get output tokens and input tokens from model outputs.
  4. assert that the input tokens returned by the engine are the same as the token ids passed in.
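A rough HTTP-level sketch of these four steps (the payload shape follows the `json_data` snippet above; the names of the token-id fields in the response are assumptions to verify against the actual server output):

```python
import requests
from transformers import AutoTokenizer

# 1. Launch the server separately with explicit arguments, e.g.:
#    python -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct \
#        --skip-tokenizer-init --port 30000

MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # illustrative
tokenizer = AutoTokenizer.from_pretrained(MODEL)
prompt_ids = tokenizer("Hello, world!")["input_ids"]

# 2. Pass token ids instead of a text prompt.
resp = requests.post(
    "http://localhost:30000/generate",
    json={
        "input_ids": prompt_ids,
        "sampling_params": {"temperature": 0, "max_new_tokens": 16},
    },
).json()

# 3. Read input and output tokens from the response; these field names
#    are assumed and should be checked against the real schema.
returned_input_ids = resp["meta_info"]["input_token_ids"]
output_ids = resp["output_ids"]

# 4. The input tokens echoed back must equal what we passed in.
assert returned_input_ids == prompt_ids
```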

I think we should change this:

https://github.com/sgl-project/sglang/blob/main/test/srt/test_engine_token_ids.py

And add examples in:

https://github.com/sgl-project/sglang/tree/main/examples/runtime

And change docs accordingly.

@zhaochenyang20 (Collaborator, Author)

Also, please take special consideration of special tokens and the chat template. I assume that, given a string prompt = "xxxxxxxxxxxxxxx":

  1. launch an engine, pass in prompt, and get the input tokens A1 and output tokens B1.
  2. launch an engine that accepts passed-in token ids. First tokenize prompt with a Hugging Face tokenizer to get input tokens A2. Pass A2 to the engine and get the input tokens A3 and output tokens B2 from the engine.

We should have:

A1 == A2 == A3

Also, the sampling parameters may introduce some randomness, so B1 == B2 may not hold exactly; see:

https://sgl-project.github.io/references/faq.html#the-results-are-not-deterministic-even-with-a-temperature-of-0

Maybe you can link to this in the unit tests.

Also, do not remove any of the current test cases in [test_engine_token_ids.py](https://github.com/sgl-project/sglang/blob/main/test/srt/test_engine_token_ids.py). Only add new tests to it.
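As a sketch, the A1 == A2 == A3 check could look like this (offline engine API; the input_token_ids field name is a placeholder, not a confirmed schema):

```python
import sglang as sgl
from transformers import AutoTokenizer

MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # illustrative
prompt = "xxxxxxxxxxxxxxx"

# A2: tokenize the same string with the Hugging Face tokenizer.
tokenizer = AutoTokenizer.from_pretrained(MODEL)
A2 = tokenizer(prompt)["input_ids"]

# Engine 1: text in; read back A1 (field name is a placeholder).
engine = sgl.Engine(model_path=MODEL)
out1 = engine.generate(prompt, sampling_params={"temperature": 0})
A1 = out1["meta_info"]["input_token_ids"]  # placeholder field name
engine.shutdown()

# Engine 2: tokens in; read back A3.
engine = sgl.Engine(model_path=MODEL, skip_tokenizer_init=True)
out2 = engine.generate(input_ids=A2, sampling_params={"temperature": 0})
A3 = out2["meta_info"]["input_token_ids"]  # placeholder field name
engine.shutdown()

# Tokenization must agree regardless of which side tokenizes.
assert A1 == A2 == A3
# B1 == B2 may still differ slightly; see the FAQ link above.
```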

@shuaills (Contributor)

Thanks for the clarification. Sounds good.

@zhaochenyang20 (Collaborator, Author)

  1. When skip is turned on, only tokens are involved (see the sketch after this list).
  • If return is turned on, both input tokens and output tokens are returned. => Used in unit tests to ensure that the passed-in tokens match the returned input tokens.
  • Without return, only output tokens are returned. => Assert that there are no input tokens in the return value; returning them is not wrong, but it adds a small overhead.
  2. When skip is turned off, only strings are allowed.
  • If return is turned on, both input tokens and output tokens are returned. => Ensure that tokenizer.tokenize(prompt) == output["input_token_ids"] and that the output tokens are in the return value.
  • If return is turned off, only strings are input and output. => Ensure that there are no tokens in the return value but there are token counts.
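A pytest-style sketch of these four cases (skip stands for skip_tokenizer_init; the return flag and the exact output field names are hypothetical placeholders):

```python
def check_return_fields(out, skip: bool, ret: bool):
    """Assert the output schema for each (skip, return) combination.

    Field names like input_token_ids are placeholders to align with
    whatever the engine actually returns.
    """
    meta = out["meta_info"]
    if skip and ret:
        # Tokens in, tokens back: passed-in ids must round-trip.
        assert "input_token_ids" in meta and "output_token_ids" in meta
    elif skip and not ret:
        # Only output tokens; echoing input ids would be wasted overhead.
        assert "input_token_ids" not in meta
    elif not skip and ret:
        # Text in: returned input ids must match tokenizer(prompt).
        assert "input_token_ids" in meta and "output_token_ids" in meta
    else:
        # Text in, text out: token counts only, no raw token ids.
        assert "prompt_tokens" in meta and "completion_tokens" in meta
        assert "input_token_ids" not in meta
```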

@zhaochenyang20 (Collaborator, Author)

Be careful about the chat template and special tokens.
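To make the concern concrete, a small sketch of how the chat template and special tokens change the ids for the same text (standard Hugging Face API; the model name is illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
prompt = "What is the capital of France?"

# Plain tokenization: add_special_tokens controls BOS/EOS insertion.
plain_ids = tokenizer(prompt, add_special_tokens=True)["input_ids"]

# Chat-template tokenization wraps the prompt in role markers and
# special tokens, producing different ids for the same text.
chat_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    tokenize=True,
)

# Pre-tokenizing with one convention while the engine's text path uses
# the other would silently break the A1 == A2 == A3 check above.
assert plain_ids != chat_ids
```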

@zhaochenyang20 (Collaborator, Author)

What's the usage of special tokens and chat templates?

@zhaochenyang20 (Collaborator, Author)

@shuaills And take care of multi-modal models.
