I'm trying to use a model that I've already downloaded to my local machine with the TGI Docker image, so that the container doesn't have to download it again on the first run. I understand that TGI normally downloads the model only once and then reuses the cached copy on subsequent runs. However, when I point the Docker image at the local directory containing the model, the path is not accepted.
From what I've read, this works when running TGI through the CLI by passing the model directory as a parameter, and with Docker the model directory can be passed via --model-id $model_dir. When I do that, however, the launch fails.
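For context, this is roughly how I set the shell variables used in the command below. The host paths are placeholders from my own setup (not anything TGI-specific), and the token value is redacted:

# Shell variables used in the docker run command below (host paths abbreviated).
token=<my_hf_token>          # Hugging Face Hub token (redacted)
volume=$PWD/data             # host directory mounted into the container at /data
# Path to the locally downloaded snapshot of Llama-2-7b-hf; <model_path> is the local prefix:
model=<model_path>/models--meta-llama--Llama-2-7b-hf/snapshots/6f45f2e6456576ff2481f2241aaee423345565b5b0bbb9/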
I'm sharing the error snippet below for reference:
(base)user1@vm-machine:~/TGI$ docker run --gpus all --shm-size 1g -e HUGGING_FACE_HUB_TOKEN=$token -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.1.0 --model-id $model --num-shard 1
2023-10-19T08:19:34.654400Z INFO text_generation_launcher: Args { model_id: "<model_path>/models--meta-llama--Llama-2-7b-hf/snapshots/6f45f2e6456576ff2481f2241aaee423345565b5b0bbb9/", revision: None, validation_workers: 2, sharded: None, num_shard: Some(1), quantize: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_top_n_tokens: 5, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, hostname: "4e9fcf7ce908", port: 80, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, rope_scaling: None, rope_factor: None, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false }
2023-10-19T08:19:34.654519Z INFO download: text_generation_launcher: Starting download process.
2023-10-19T08:20:30.602751Z ERROR download: text_generation_launcher: Download encountered an error: Traceback (most recent call last):
File "/opt/conda/bin/text-generation-server", line 8, in <module>
sys.exit(app())
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 115, in download_weights
utils.weight_files(model_id, revision, extension)
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 96, in weight_files
filenames = weight_hub_files(model_id, revision, extension)
File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 25, in weight_hub_files
info = api.model_info(model_id, revision=revision)
File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 110, in _inner_fn
validate_repo_id(arg_value)
File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 158, in validate_repo_id
raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '<model_path>/models--meta-llama--Llama-2-7b-hf/snapshots/6f45f2e6456576ff2481f2241aaee423345565b5b0bbb9/'. Use repo_type argument if needed.
Please correct me if I am doing something wrong or missing anything. Any insights or suggestions on how to resolve this issue would be greatly appreciated.