Deploy grants tagger with sage #4

Open
nsorros opened this issue Jun 1, 2023 · 12 comments

@nsorros
Contributor

nsorros commented Jun 1, 2023

We might need to update the README, but it is worth checking that deployment works for you first, @ArneRobben. Note that deploying custom containers like the wellcome bert mesh model takes a couple of extra steps. These are out of scope for this tool but should probably nevertheless be either in the documentation or provided as example scripts.

sage deploy IMAGE_URI text-classification ROLE --endpoint-name test-wellcome-bert-mesh

You need to first build and push a container to pass the IMAGE_URI

Build

API

#!/usr/bin/env python

from transformers import AutoModel, AutoTokenizer
from fastapi import FastAPI
from pydantic import BaseModel


class Input(BaseModel):
    text: str


app = FastAPI()
tokenizer = AutoTokenizer.from_pretrained(
    "Wellcome/WellcomeBertMesh"
)
model = AutoModel.from_pretrained(
    "Wellcome/WellcomeBertMesh",
    trust_remote_code=True
)

@app.post("/invocations")
async def predict(input: Input):
    inputs = tokenizer([input.text], padding="max_length")
    labels = model(**inputs, return_labels=True)
    return labels[0]


@app.get("/ping")
def health():
    return "\n"
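
As a rough sketch, the `/invocations` route above can be called from Python using only the standard library. The helper names `build_invocation_request` and `predict` are made up for illustration, and the default base URL assumes the API is listening on localhost port 8080, matching the port exposed in the Dockerfile.

```python
import json
import urllib.request


def build_invocation_request(text, base_url="http://localhost:8080"):
    """Build a POST request whose JSON body matches the `Input` model."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/invocations",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text, base_url="http://localhost:8080", timeout=30):
    """Send the request and decode the JSON response (needs a running API)."""
    request = build_invocation_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API running, `predict("This grant is about malaria")` should return whatever the endpoint returns as JSON; nothing is sent until `predict` is called.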

Dockerfile

FROM python:3.8.10-slim-buster

COPY requirements.txt .
RUN pip install -r requirements.txt
RUN pip install torch==2.0.0 --index-url https://download.pytorch.org/whl/cpu

COPY api.py .

COPY run_api.sh /usr/bin/serve
RUN chmod 755 /usr/bin/serve

EXPOSE 8080

Build

#!/bin/bash

# Amd64 machine (most common)
#docker build -t custom-container:latest .

# Apple M1 machine or non-amd64 machine
docker buildx build --platform linux/amd64 -t custom-container:latest .

Push

Note that you need to create an ECR repository first; in this example it is called wellcome-custom-containers

aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin WELLCOME_ACCOUNT_ID.dkr.ecr.eu-west-1.amazonaws.com
docker tag custom-container:latest WELLCOME_ACCOUNT_ID.dkr.ecr.eu-west-1.amazonaws.com/wellcome-custom-containers
docker push WELLCOME_ACCOUNT_ID.dkr.ecr.eu-west-1.amazonaws.com/wellcome-custom-containers

Test

sage predict test-wellcome-bert-mesh "This grant is about malaria"
@ArneRobben
Contributor

Hi Nick,

Back from paternity leave! Happy to try this on my side too.

Two things:

  • the Dockerfile installs dependencies from a requirements file, but the project installs its requirements using Poetry. I tried adding the following lines:
# using an app folder to install the poetry environment
RUN mkdir /sage
COPY /sage /sage
COPY pyproject.toml poetry.lock /sage
WORKDIR /sage

# install poetry environment
RUN apt-get update && apt-get -y install python3-pip
RUN pip install poetry==1.3.0

RUN poetry config virtualenvs.create false
RUN poetry install $(test "$YOUR_ENV" == production && echo "--no-dev") --no-interaction --no-ansi

But I get a setuptools error. I'll copy my error log below. Could you try on your side?

  • what's in your run_api.sh file? A quick look online suggests it might simply need to be
# Start the FastAPI server using Uvicorn
uvicorn api:app --host 0.0.0.0 --port 8080

Anything else in there?

Here is the error log from trying to install with Poetry in Docker:

docker build --no-cache .
[+] Building 67.0s (14/18)                                                                                                                                            
 => [internal] load build definition from Dockerfile                                                                                                             0.0s
 => => transferring dockerfile: 719B                                                                                                                             0.0s
 => [internal] load .dockerignore                                                                                                                                0.0s
 => => transferring context: 2B                                                                                                                                  0.0s
 => [internal] load metadata for docker.io/library/python:3.10-slim-bullseye                                                                                     0.6s
 => [internal] load build context                                                                                                                                0.0s
 => => transferring context: 558B                                                                                                                                0.0s
 => CACHED [ 1/14] FROM docker.io/library/python:3.10-slim-bullseye@sha256:a704ba2899da520dacf80a8b048de02e37476842539a95abc2883fed935a6746                      0.0s
 => [ 2/14] RUN mkdir /sage                                                                                                                                      0.2s
 => [ 3/14] COPY /sage /sage                                                                                                                                     0.0s
 => [ 4/14] COPY pyproject.toml poetry.lock /sage                                                                                                                0.0s
 => [ 5/14] WORKDIR /sage                                                                                                                                        0.0s
 => [ 6/14] RUN pip wheel --no-deps -w dist sage                                                                                                                 1.7s
 => [ 7/14] RUN apt-get update && apt-get -y install python3-pip                                                                                                35.0s
 => [ 8/14] RUN pip install poetry==1.3.0                                                                                                                       12.3s 
 => [ 9/14] RUN poetry config virtualenvs.create false                                                                                                           0.6s 
 => ERROR [10/14] RUN poetry install $(test "$YOUR_ENV" == production && echo "--no-dev") --no-interaction --no-ansi                                            16.4s 
------                                                                                                                                                                
 > [10/14] RUN poetry install $(test "$YOUR_ENV" == production && echo "--no-dev") --no-interaction --no-ansi:                                                        
#14 0.266 /bin/sh: 1: test: unexpected operator                                                                                                                       
#14 0.538 Skipping virtualenv creation, as specified in config file.                                                                                                  
#14 0.613 Installing dependencies from lock file
#14 0.804 
#14 0.804 Package operations: 37 installs, 10 updates, 0 removals
#14 0.804 
#14 0.805   • Installing jmespath (1.0.1)
#14 0.806   • Installing python-dateutil (2.8.2)
#14 0.807   • Updating urllib3 (1.26.16 -> 1.26.15)
#14 1.388   • Installing botocore (1.29.126)
#14 1.389   • Installing dill (0.3.6)
#14 1.391   • Installing mdurl (0.1.2)
#14 3.508   • Updating attrs (23.1.0 -> 22.2.0)
#14 3.511   • Installing contextlib2 (21.6.0)
#14 3.512   • Updating distlib (0.3.7 -> 0.3.6)
#14 3.514   • Updating filelock (3.12.2 -> 3.12.0)
#14 3.516   • Installing markdown-it-py (2.2.0)
#14 3.518   • Installing multiprocess (0.70.14)
#14 3.524   • Installing numpy (1.24.3)
#14 3.529   • Updating platformdirs (2.6.2 -> 3.5.0)
#14 3.532   • Installing pox (0.3.2)
#14 3.538   • Installing ppft (1.7.6.6)
#14 3.801   • Installing protobuf (3.20.3)
#14 4.528   • Installing pygments (2.15.1)
#14 4.634   • Installing pyrsistent (0.19.3)
#14 4.645   • Installing pytz (2023.3)
#14 4.682   • Installing s3transfer (0.6.0)
#14 4.784   • Updating setuptools (65.5.1 -> 67.7.2)
#14 4.787   • Installing tzdata (2023.3)
#14 4.929   • Updating zipp (3.16.2 -> 3.15.0)
#14 8.533   • Installing boto3 (1.26.126)
#14 8.533   • Installing cfgv (3.3.1)
#14 8.536   • Updating charset-normalizer (3.2.0 -> 3.1.0)
#14 8.537   • Installing click (8.1.3)
#14 8.540   • Installing cloudpickle (2.2.1)
#14 8.540   • Installing colorama (0.4.6)
#14 8.544   • Installing google-pasta (0.2.0)
#14 8.545   • Installing identify (2.5.24)
#14 8.547   • Updating importlib-metadata (6.8.0 -> 4.13.0)
#14 8.555   • Updating jsonschema (4.18.4 -> 4.17.3)
#14 9.187   • Installing nodeenv (1.7.0)
#14 9.321   • Installing pandas (2.0.1)
#14 9.377   • Installing pathos (0.3.0)
#14 9.393   • Installing protobuf3-to-dict (0.1.5)
#14 9.444   • Installing pyyaml (5.4.1)
#14 9.470   • Installing rich (13.3.5)
#14 9.500   • Installing schema (0.7.5)
#14 9.505   • Installing smdebug-rulesconfig (1.0.1)
#14 9.547   • Installing tblib (1.7.0)
#14 9.560   • Installing typing-extensions (4.5.0)
#14 15.37 
#14 15.37   CalledProcessError
#14 15.37 
#14 15.37   Command '['/usr/local/bin/python', '-m', 'pip', 'install', '--use-pep517', '--disable-pip-version-check', '--isolated', '--no-input', '--prefix', '/usr/local', '--no-deps', '/root/.cache/pypoetry/artifacts/d0/fd/70/01fcd24b796aff9a841f2181b2fa2c3a9083005fcb023ce39fcf59cd04/PyYAML-5.4.1.tar.gz']' returned non-zero exit status 1.
#14 15.37 
#14 15.37   at /usr/local/lib/python3.10/subprocess.py:526 in run
#14 15.40        522│             # We don't call process.wait() as .__exit__ does that for us.
#14 15.40        523│             raise
#14 15.40        524│         retcode = process.poll()
#14 15.40        525│         if check and retcode:
#14 15.40     →  526│             raise CalledProcessError(retcode, process.args,
#14 15.40        527│                                      output=stdout, stderr=stderr)
#14 15.40        528│     return CompletedProcess(process.args, retcode, stdout, stderr)
#14 15.40        529│ 
#14 15.40        530│ 
#14 15.40 
#14 15.40 The following error occurred when trying to handle this error:
#14 15.40 
#14 15.40 
#14 15.40   EnvCommandError
#14 15.40 
#14 15.40   Command ['/usr/local/bin/python', '-m', 'pip', 'install', '--use-pep517', '--disable-pip-version-check', '--isolated', '--no-input', '--prefix', '/usr/local', '--no-deps', '/root/.cache/pypoetry/artifacts/d0/fd/70/01fcd24b796aff9a841f2181b2fa2c3a9083005fcb023ce39fcf59cd04/PyYAML-5.4.1.tar.gz'] errored with the following return code 1, and output: 
#14 15.40   Processing /root/.cache/pypoetry/artifacts/d0/fd/70/01fcd24b796aff9a841f2181b2fa2c3a9083005fcb023ce39fcf59cd04/PyYAML-5.4.1.tar.gz
#14 15.40     Installing build dependencies: started
#14 15.40     Installing build dependencies: finished with status 'done'
#14 15.40     Getting requirements to build wheel: started
#14 15.40     Getting requirements to build wheel: finished with status 'error'
#14 15.40     error: subprocess-exited-with-error
#14 15.40     
#14 15.40     × Getting requirements to build wheel did not run successfully.
#14 15.40     │ exit code: 1
#14 15.40     ╰─> [62 lines of output]
#14 15.40         /tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/config/setupcfg.py:293: _DeprecatedConfig: Deprecated config in `setup.cfg`
#14 15.40         !!
#14 15.40         
#14 15.40                 ********************************************************************************
#14 15.40                 The license_file parameter is deprecated, use license_files instead.
#14 15.40         
#14 15.40                 By 2023-Oct-30, you need to update your project and remove deprecated calls
#14 15.40                 or your builds will no longer be supported.
#14 15.40         
#14 15.40                 See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
#14 15.40                 ********************************************************************************
#14 15.40         
#14 15.40         !!
#14 15.40           parsed = self.parsers.get(option_name, lambda x: x)(value)
#14 15.40         running egg_info
#14 15.40         writing lib3/PyYAML.egg-info/PKG-INFO
#14 15.40         writing dependency_links to lib3/PyYAML.egg-info/dependency_links.txt
#14 15.40         writing top-level names to lib3/PyYAML.egg-info/top_level.txt
#14 15.40         Traceback (most recent call last):
#14 15.40           File "/usr/local/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
#14 15.40             main()
#14 15.40           File "/usr/local/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
#14 15.40             json_out['return_val'] = hook(**hook_input['kwargs'])
#14 15.40           File "/usr/local/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
#14 15.40             return hook(config_settings)
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 341, in get_requires_for_build_wheel
#14 15.40             return self._get_build_requires(config_settings, requirements=['wheel'])
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 323, in _get_build_requires
#14 15.40             self.run_setup()
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 338, in run_setup
#14 15.40             exec(code, locals())
#14 15.40           File "<string>", line 271, in <module>
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/__init__.py", line 107, in setup
#14 15.40             return distutils.core.setup(**attrs)
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
#14 15.40             return run_commands(dist)
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
#14 15.40             dist.run_commands()
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
#14 15.40             self.run_command(cmd)
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
#14 15.40             super().run_command(command)
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
#14 15.40             cmd_obj.run()
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/command/egg_info.py", line 314, in run
#14 15.40             self.find_sources()
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/command/egg_info.py", line 322, in find_sources
#14 15.40             mm.run()
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/command/egg_info.py", line 551, in run
#14 15.40             self.add_defaults()
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/command/egg_info.py", line 589, in add_defaults
#14 15.40             sdist.add_defaults(self)
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/command/sdist.py", line 104, in add_defaults
#14 15.40             super().add_defaults()
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/sdist.py", line 251, in add_defaults
#14 15.40             self._add_defaults_ext()
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/sdist.py", line 336, in _add_defaults_ext
#14 15.40             self.filelist.extend(build_ext.get_source_files())
#14 15.40           File "<string>", line 201, in get_source_files
#14 15.40           File "/tmp/pip-build-env-rmrwaxra/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 107, in __getattr__
#14 15.40             raise AttributeError(attr)
#14 15.40         AttributeError: cython_sources
#14 15.40         [end of output]
#14 15.40     
#14 15.40     note: This error originates from a subprocess, and is likely not a problem with pip.
#14 15.40   error: subprocess-exited-with-error
#14 15.40   
#14 15.40   × Getting requirements to build wheel did not run successfully.
#14 15.40   │ exit code: 1
#14 15.40   ╰─> See above for output.
#14 15.40   
#14 15.40   note: This error originates from a subprocess, and is likely not a problem with pip.
#14 15.40   
#14 15.40 
#14 15.40   at /usr/local/lib/python3.10/site-packages/poetry/utils/env.py:1540 in _run
#14 15.44       1536│                 output = subprocess.check_output(
#14 15.44       1537│                     command, stderr=subprocess.STDOUT, env=env, **kwargs
#14 15.44       1538│                 )
#14 15.44       1539│         except CalledProcessError as e:
#14 15.44     → 1540│             raise EnvCommandError(e, input=input_)
#14 15.44       1541│ 
#14 15.44       1542│         return decode(output)
#14 15.44       1543│ 
#14 15.44       1544│     def execute(self, bin: str, *args: str, **kwargs: Any) -> int:
#14 15.44 
#14 15.44 The following error occurred when trying to handle this error:
#14 15.44 
#14 15.44 
#14 15.44   PoetryException
#14 15.44 
#14 15.44   Failed to install /root/.cache/pypoetry/artifacts/d0/fd/70/01fcd24b796aff9a841f2181b2fa2c3a9083005fcb023ce39fcf59cd04/PyYAML-5.4.1.tar.gz
#14 15.44 
#14 15.44   at /usr/local/lib/python3.10/site-packages/poetry/utils/pip.py:58 in pip_install
#14 15.44        54│ 
#14 15.44        55│     try:
#14 15.44        56│         return environment.run_pip(*args)
#14 15.44        57│     except EnvCommandError as e:
#14 15.44     →  58│         raise PoetryException(f"Failed to install {path.as_posix()}") from e
#14 15.44        59│ 
#14 15.44 
------
executor failed running [/bin/sh -c poetry install $(test "$YOUR_ENV" == production && echo "--no-dev") --no-interaction --no-ansi]: exit code: 1

@nsorros
Contributor Author

nsorros commented Jul 24, 2023

Which repo did you try this on? Grants tagger uses a requirements.txt: https://github.com/wellcometrust/grants_tagger

In general, the idea is that you can use sage to deploy a model of your choice. For many models you simply point the tool to the model path and that works, but for grants tagger you need to follow the custom route, which is:

  • create an api
  • create a docker container that has the api and any packages needed for the model and the api to work
  • push the container to your container registry
  • use sage to deploy
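
The custom route above can be sketched as a list of commands assembled in Python. This is only an illustration: the helper names `ecr_image_uri` and `deployment_commands`, the account ID `123456789012`, and the `ROLE` placeholder are assumptions, and the ECR login step is left out because it pipes one command into another. The `sage deploy` arguments follow the command shown at the top of this issue.

```python
import shlex


def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Compose the ECR image URI for a repository and tag."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def deployment_commands(account_id, region, repository, role, endpoint_name,
                        local_image="custom-container:latest"):
    """Return the build, tag, push and deploy commands as argument lists."""
    image_uri = ecr_image_uri(account_id, region, repository)
    return [
        # Build for amd64 so the image also works when built on an Apple M1.
        ["docker", "build", "--platform", "linux/amd64", "-t", local_image, "."],
        ["docker", "tag", local_image, image_uri],
        ["docker", "push", image_uri],
        ["sage", "deploy", image_uri, "text-classification", role,
         "--endpoint-name", endpoint_name],
    ]


# Placeholder values for illustration only; nothing is executed here.
for command in deployment_commands(
    "123456789012", "eu-west-1", "wellcome-custom-containers",
    "ROLE", "test-wellcome-bert-mesh",
):
    print(shlex.join(command))
```

Printing the commands rather than running them keeps the sketch safe to try; each list could be passed to `subprocess.run(command, check=True)` once the values are real.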

@nsorros
Contributor Author

nsorros commented Jul 24, 2023

If on the other hand the model you want to deploy uses poetry for managing dependencies, here is a minimal Dockerfile that works for installing those deps

FROM python:3.8.10-slim-buster

# Install Poetry
RUN pip install poetry

# Install deps
COPY pyproject.toml poetry.lock ./
RUN poetry install

@ArneRobben
Contributor

I tested the deployment

  • to build the grants_tagger image, I got an error that pip couldn't install torch==2.0.1+cpi, so I modified the requirements file to torch==2.0.1

  • after that, the docker build worked and I could docker push to ECR

  • deployment with sage didn't work yet though. I got the error: Error hosting endpoint test-wellcome-bert-mesh: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint. Looking at the CloudWatch log for that endpoint, I see lots of: exec /usr/bin/serve: exec format error, so I'm not sure if there's something in the api.py file that doesn't work. Also, what's in your run_api.sh file?

Let me know what you think

@nsorros
Contributor Author

nsorros commented Jul 27, 2023

For testing purposes I am using the following files

requirements.txt

transformers==4.29.2
pydantic
fastapi
torch
uvicorn

run_api.sh

#!/bin/bash

uvicorn api:app --port 8080 --host 0.0.0.0

The first test is to ensure the API works outside of the Docker container. Run

bash run_api.sh

The second test is to ensure the API works inside the container

I am running

docker build -t mesh-container .

and then

docker run mesh-container serve

The final test is to push to ECR and try with SageMaker

@ArneRobben
Contributor

🥳 The API works locally!

curl -X POST -H "Content-Type: application/json" -d '{"text":"checkidiecheck"}' http://localhost:8080/invocations

results in ["Humans"]

😢 I did the docker run [image-name] serve, which seems to work fine:

INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)

But the above curl command results in: curl: (7) Failed to connect to localhost port 8080 after 0 ms: Connection refused. But I think this is likely my doing.

🥳 🥳 because I also tried to deploy with sage to SageMaker and, lo and behold, it works:

sage predict test-wellcome-bert-mesh "this is a test that contains the word malaria"

returns

Result: b'["Antimalarials","Humans","Malaria"]'

@ArneRobben
Contributor

I think fastapi & uvicorn weren't part of my requirements file previously (I previously just used the requirements from grants_tagger)

@ArneRobben
Contributor

I also made an sh file to help with the build and push:

this is my build_and_push.sh:

# The name of our algorithm/ECR repo
algorithm_name=sagemaker-grants-tagger

account=$(aws sts get-caller-identity --query Account --output text)

# Get the region defined in the current configuration (default to eu-west-1 if none defined)
region=$(aws configure get region)
region=${region:-eu-west-1}

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

# If the repository doesn't exist in ECR, create it.
aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

# Get the login command from ECR and execute it directly
aws ecr get-login-password --region ${region}|docker login --username AWS --password-stdin ${fullname}

docker tag custom-container:latest ${account}.dkr.ecr.eu-west-1.amazonaws.com/wellcome-custom-containers

# # Build the docker image locally with the image name and then push it to ECR
# # with the full name.

docker build --platform=linux/amd64 -t ${algorithm_name} -f Dockerfile.sagemaker .
docker tag ${algorithm_name} ${fullname}

docker push ${fullname}

FYI: we have to include sagemaker in the name of the ECR repo, otherwise we can't push. It's the way our IAM roles are set up 🙈

@ArneRobben
Contributor

I also think it's useful to use a separate requirements file just for the docker build containing just the requirements you have mentioned above

@nsorros
Contributor Author

nsorros commented Jul 28, 2023

You need to be careful with a different requirements file, to ensure the project works with the versions you use, but in this project there is definitely a need for a lighter set of requirements. I think the current set of requirements, based on https://github.com/wellcometrust/grants_tagger/blob/main/unpinned_requirements.txt, is relatively light.

@nsorros
Contributor Author

nsorros commented Jul 28, 2023

Some points to keep in mind when deploying with custom containers

  • you need to name the entry point in the Dockerfile /usr/bin/serve and make it executable (chmod 755)
  • the api file needs to have a shebang #!/usr/bin/env python
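
These two points can be checked locally before building the image. The following is a minimal sketch with a hypothetical helper, `check_entry_point`; it only inspects a local file. The mode check matters mainly when the Dockerfile does not run chmod itself, and it assumes a POSIX file system.

```python
import os
import stat


def check_entry_point(path):
    """Return a list of problems with a local script (empty if none found)."""
    problems = []
    # An executable script needs a shebang as its very first bytes.
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        problems.append("missing shebang line")
    # Mode 755 means the execute bit is set for user, group and others.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if (mode & 0o111) != 0o111:
        problems.append("not executable by user, group and others")
    return problems
```

Calling `check_entry_point("run_api.sh")` or `check_entry_point("api.py")` returns an empty list when the file starts with a shebang and has its execute bits set.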

@nsorros
Contributor Author

nsorros commented Jul 28, 2023

Regarding the error, you most probably needed a port forward in your docker run, i.e. adding -p 8080:8080
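
Once the container is started with the port published (for example docker run -p 8080:8080 mesh-container serve), a small poll of the /ping route can confirm it is reachable from the host. This is a sketch using only the standard library; `wait_for_ping` and its parameters are hypothetical names, not part of sage or SageMaker.

```python
import time
import urllib.request


def wait_for_ping(base_url="http://localhost:8080", attempts=10, delay=1.0):
    """Poll GET /ping until it answers with HTTP 200 or attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(f"{base_url}/ping", timeout=2) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers connection refused, timeouts and HTTP error statuses.
            pass
        time.sleep(delay)
    return False
```

If `wait_for_ping()` returns False while the container logs show Uvicorn running, a missing port mapping is a likely cause.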
