[FIX] Use the new entry point so the /metrics endpoint works. #1522

Draft: 3 commits to be merged into base branch main


xiangshen-dk
Contributor

Description

This PR changes the entrypoint command to use vllm.entrypoints.openai.api_server. The old entrypoint, vllm.entrypoints.api_server, doesn't expose the /metrics route.
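For reviewers who want to confirm the route is exposed after the change, here is a minimal sketch of a check. It assumes the server is reachable at http://localhost:8000 (the same address as the example in this description) and that vLLM's Prometheus metric names start with `vllm:`. The function names are hypothetical helpers, not part of this PR.

```shell
# Assumption: the server started by the new entrypoint listens here.
BASE_URL="${BASE_URL:-http://localhost:8000}"

# Print the HTTP status code returned for a path on the server.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "${BASE_URL}$1"
}

# Succeed if stdin (Prometheus text format) has at least one line
# starting with "vllm:" (assumed metric-name prefix).
has_vllm_metrics() {
  grep -q '^vllm:'
}

# Succeed if /metrics answers 200 and serves vLLM metrics.
metrics_ok() {
  [ "$(http_status /metrics)" = "200" ] &&
    curl -s "${BASE_URL}/metrics" | has_vllm_metrics
}
```

With the server running, `metrics_ok && echo "metrics route is up"` should print the message. Against the old entrypoint the status check would be expected to fail, since the route is not registered there.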

Once it's merged, the example in the doc needs to be updated to the following:

USER_PROMPT="I'm new to coding. If you could only recommend one programming language to start with, what would it be and why?"

curl -X POST http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
    "model": "google/gemma-2-2b-it",
    "prompt": "<start_of_turn>user\n${USER_PROMPT}<end_of_turn>\n",
    "temperature": 0.90,
    "top_p": 1.0,
    "max_tokens": 128
}
EOF

The gradio.yaml file needs to be updated as well.
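Clients that read the reply may also need adjusting, because the OpenAI-style completions response carries the generated text under `choices[0].text`. A small sketch of pulling that field out of the curl output, assuming `python3` is available on the machine running the request (`extract_text` is a hypothetical helper name):

```shell
# Read an OpenAI-style completions response on stdin and print the
# generated text of the first choice.
extract_text() {
  python3 -c 'import json, sys; print(json.load(sys.stdin)["choices"][0]["text"])'
}
```

Piping the curl command from the example above into `extract_text` (with `-s` added to curl to silence progress output) would print only the model's answer.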

Tasks

  • The contributing guide has been read and followed.
  • The samples added / modified have been fully tested.
  • Workflow files have been added / modified, if applicable.
  • Region tags have been properly added, if new samples.
  • All dependencies are set to up-to-date versions, as applicable.
  • Merge this pull-request for me once it is approved.
