This module contains the Dockerfiles used to build the djl-serving docker image. The image is compatible with SageMaker hosting.

We provide a docker compose file to simplify the building experience. Just run:
```shell
cd serving/docker
export DJL_VERSION=$(awk -F '=' '/djl / {gsub(/ ?"/, "", $2); print $2}' ../../gradle/libs.versions.toml)
docker compose build --build-arg djl_version=${DJL_VERSION} <compose-target>
```
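The `awk` one-liner above extracts the DJL version from the gradle version catalog. As a quick sanity check, here is how it behaves on a sample line in the same `key = "value"` format used by `libs.versions.toml`:

```shell
# Sample line in the same format as the `djl` entry in libs.versions.toml
printf 'djl = "0.28.0"\n' |
  awk -F '=' '/djl / {gsub(/ ?"/, "", $2); print $2}'
# prints: 0.28.0
```

Splitting on `=` leaves ` "0.28.0"` in `$2`; the `gsub` strips the quotes and the leading space, leaving only the bare version string.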
You can find the available compose targets, such as `cpu` and `lmi`, in `docker-compose.yml`.
You can find the latest DJL release docker images on DockerHub. DJLServing also publishes nightly builds to DockerHub. You can pull the image you need from there.
djl-serving loads all models stored in the `/opt/ml/model` folder. You only need to download your model files and mount the model folder to `/opt/ml/model` in the docker container.
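djl-serving expects one model per sub-folder (or archive) under the mounted directory, and the folder or archive name becomes the model name. A minimal sketch of the layout (the `my_model` name and the `serving.properties` contents here are illustrative):

```shell
# Each sub-folder under the mounted directory becomes one model;
# the folder name is used as the model name.
mkdir -p models/my_model
# Optional per-model configuration file (the engine value is illustrative)
printf 'engine=PyTorch\n' > models/my_model/serving.properties
ls models
# prints: my_model
```

Mounting this `models` directory to `/opt/ml/model` would serve the model under the name `my_model`.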
Here are a few examples of running the djl-serving docker image:
CPU:

```shell
docker pull deepjavalibrary/djl-serving:0.28.0
mkdir models
cd models
curl -O https://resources.djl.ai/test-models/pytorch/bert_qa_jit.tar.gz
docker run -it --rm -v $PWD:/opt/ml/model -p 8080:8080 deepjavalibrary/djl-serving:0.28.0
```
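Once the CPU container is up, you can send an inference request to the model, which is named after the archive (here `bert_qa_jit`). The question/paragraph payload shape below is an assumption based on typical DJL BERT Q&A examples; the `/ping` health check simply skips the request when no server is running:

```shell
# Hypothetical Q&A payload; the field names follow common DJL BERT QA examples.
payload='{"question": "How is the weather?", "paragraph": "The weather is nice today."}'

# Only send the request if a djl-serving instance answers on port 8080.
if curl -sf http://localhost:8080/ping > /dev/null 2>&1; then
  curl -s -X POST http://localhost:8080/predictions/bert_qa_jit \
       -H "Content-Type: application/json" \
       -d "$payload"
fi
```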
GPU:

```shell
docker pull deepjavalibrary/djl-serving:0.28.0-pytorch-gpu
mkdir models
cd models
curl -O https://resources.djl.ai/test-models/pytorch/bert_qa_jit.tar.gz
docker run -it --runtime=nvidia --shm-size 2g -v $PWD:/opt/ml/model -p 8080:8080 deepjavalibrary/djl-serving:0.28.0-pytorch-gpu
```
AWS Inferentia (inf2):

```shell
docker pull deepjavalibrary/djl-serving:0.28.0-pytorch-inf2
mkdir models
cd models
curl -O https://resources.djl.ai/test-models/pytorch/resnet18_inf2_2_4.tar.gz
docker run --device /dev/neuron0 -it --rm -v $PWD:/opt/ml/model -p 8080:8080 deepjavalibrary/djl-serving:0.28.0-pytorch-inf2
```
aarch64:

```shell
docker pull deepjavalibrary/djl-serving:0.28.0-aarch64
mkdir models
cd models
curl -O https://resources.djl.ai/test-models/pytorch/bert_qa_jit.tar.gz
docker run -it --rm -v $PWD:/opt/ml/model -p 8080:8080 deepjavalibrary/djl-serving:0.28.0-aarch64
```
You can pass command line arguments to djl-serving directly when using `docker run`:

```shell
docker run -it --rm -p 8080:8080 deepjavalibrary/djl-serving:0.28.0 djl-serving -m "djl://ai.djl.huggingface.pytorch/sentence-transformers/all-MiniLM-L6-v2"
```