NYC Uber data analysis, from the Streamlit gallery

Troubleshooting Docker for Streamlit Web Apps

I am almost on the verge of ditching Dash for Streamlit, though Streamlit still lacks a few features I need. But that is not today's topic. For the last few days I have been trying to do a test deployment of a Streamlit application on AWS ECS with a Fargate serverless container. So my first step was to dockerize a Streamlit app on my local machine, which is where all the problems began. This post is a compilation of all the community suggestions I tried out and the one that worked for me in a corporate environment.

The first problem I faced when trying to create an image at my office was the proxy server used by my company (we will discuss later how to resolve it on a Mac). The second problem was that I assumed it would be similar to deploying a Gunicorn-based application: just pip install some packages and use an sh file to spin up a few worker threads.

In this post I am not going to explain why some approaches work and others do not. I will only describe how I got it up and running. If you know why some approach does not work, feel free to put it in a comment, so that others (including me) can learn about it.

I have recently set up a few standards for myself when dockerizing any Python app, mostly about the tool chain. For now I will just assume you are already using Pytest and Flake8 in most of your projects, so I am not covering them in my dockerization. These are the standards:

  • Use a Pipfile and pipenv instead of requirements.txt for better internal dependency management of packages
  • No `RUN echo > something` shell commands in the Dockerfile when creating the image
  • No project-related config creation in the Dockerfile
  • Move any optional CLI flags from ENTRYPOINT/CMD to a config file if the framework allows it (e.g. server.port=8080)
  • Use a `.dockerignore` file to keep unnecessary files out of the Docker build context
  • Always set a working directory in the Docker image

You may like them or not, but I prefer to follow these; let me know if something is missing here. So here is my Pipfile:

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
pandas = "*"
streamlit = "*"
watchdog = "*"

[dev-packages]

[requires]
python_version = "3.8"

Small and simple. I actually do have some dev packages, but I have not listed them here.
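For completeness, the dev packages mentioned above (Pytest and Flake8, from my standards list) would sit in the `[dev-packages]` section. A sketch of what that could look like (not my actual list):

```toml
[dev-packages]
pytest = "*"
flake8 = "*"
```

With pipenv, these are installed with `pipenv install --dev` and stay out of the production image, since the Dockerfiles below install with `--deploy` from the default packages only.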

You might need to build and rebuild things again and again the first time, and I do not want to type the same commands into the terminal every time. So here is my Makefile (I stole that trick from one of my ex-colleagues, Petr J.).

.PHONY: build serve serve_detach kill kill_mac remove

app_name = dossier-client:latest
docker_file = DockerfileLocal

build:
	docker build -t $(app_name) -f $(docker_file) .

serve:
	docker run -p 8080:8080 $(app_name)

serve_detach:
	docker run --detach -p 8080:8080 $(app_name)

kill:
	@echo 'Killing container...'
	docker ps | grep $(app_name) | awk '{print $$1}' | xargs -r docker stop

kill_mac:
	@echo 'Killing container...'
	docker ps | grep $(app_name) | awk '{print $$1}' | xargs docker stop

remove:
	docker ps -a | grep $(app_name) | awk '{print $$1}' | xargs docker rm

If you are not sure what a Makefile is, you can skip this part and keep using the terminal directly. Now every time I run `make build` it will start building the Docker image. A few things to notice here:

  • The Dockerfile is named `DockerfileLocal` because a few environment variables are needed only on my local machine to get through the proxy server; otherwise my company network does not allow me to download anything without specifying the proxy
  • On Mac the `-r` flag of xargs is not supported, so it is removed there; but on an AWS Linux EC2 machine or SageMaker it is needed, so I keep the target in two versions
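The `kill` target's pipeline simply greps the `docker ps` listing for the image name and hands the first column (the container ID) to `docker stop`. You can sanity-check the text processing on a fake `docker ps` line (the ID and image name here are made up):

```shell
# Simulate one line of `docker ps` output and extract the container ID,
# the same way the Makefile's kill target does before piping to docker stop
printf 'abc123def456   dossier-client:latest   "streamlit run startup.py"\n' \
  | grep 'dossier-client' | awk '{print $1}'
# prints: abc123def456
```

In the Makefile itself the `$1` is written as `$$1` only to escape it from Make's own variable expansion.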

Here is the `.dockerignore` file content:

.gitignore
.idea
*__pycache__
*.pyc

Now here comes the first version of the Dockerfile, which does not work:

# DockerfileLocal
FROM python:3.8-slim
ENV REPO_URL=MyApp
COPY . $REPO_URL
WORKDIR $REPO_URL
RUN pip install --upgrade pip
RUN pip install pipenv
RUN pipenv install --system --deploy --ignore-pipfile
EXPOSE 8080
ENTRYPOINT [ "streamlit", "run", "startup.py" ]

The first problem was the proxy server. I tried different ways of adding the proxy information to the Docker daemon. On Mac you can do it from the Docker icon via Preferences > Proxies, adding the HTTP and HTTPS proxies and enabling manual proxy configuration, or by running a docker command to add the proxy to the Docker config (check the references). But even after adding it there, it did not work for me. Once I added the proxy information as ENV variables in my Docker image, it started building. The second revision:

FROM python:3.8-slim
ENV REPO_URL=MyApp
COPY . $REPO_URL
WORKDIR $REPO_URL
ENV HTTP_PROXY="http://<my office proxy>.net:2021"
ENV HTTPS_PROXY="http://<my office proxy>.net:2021"
RUN pip install --upgrade pip
RUN pip install pipenv
RUN pipenv install --system --deploy --ignore-pipfile
RUN unset HTTP_PROXY
RUN unset HTTPS_PROXY
EXPOSE 8080
ENTRYPOINT [ "streamlit", "run", "startup.py" ]

In fact the `unset` has no effect: each RUN instruction executes in its own shell, so it cannot remove the ENV values, which stay baked into the image metadata. I know keeping the proxy in the Dockerfile is not a safe way, but I have not pushed my local Dockerfile to git, as it is just for test purposes.
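The scoping is easy to demonstrate in plain bash: `unset` inside a RUN behaves like an `unset` in a child process and cannot touch the parent's (the image's) environment:

```shell
# Stand-in for ENV HTTP_PROXY=... in the Dockerfile
export HTTP_PROXY="http://example:2020"
# Stand-in for RUN unset HTTP_PROXY -- runs in a separate shell
bash -c 'unset HTTP_PROXY'
# The variable is still set in the parent environment
echo "$HTTP_PROXY"   # prints http://example:2020
```

A cleaner alternative for build-time-only proxies is Docker's predefined proxy build args (`docker build --build-arg HTTP_PROXY=... --build-arg HTTPS_PROXY=...`), which are available during RUN steps but not persisted in the resulting image. I am noting this as a suggestion; it is not what I tested here.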

After that the image built successfully, but the app kept loading forever when I hit localhost:8080. I searched the community for possible solutions and found some Streamlit-specific config. So I created a `.streamlit` folder in the root of the project and added these files:

A config.toml file:

# config.toml
[server]
enableCORS=false
enableXsrfProtection=false
port=8080

It is probably not safe to disable CORS in production, but I could not make it work without disabling it. Let me know if you have a better solution.
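One related server option, not used in this post but commonly recommended when running Streamlit in a container, is headless mode, which stops Streamlit from trying to open a browser inside the container. It goes in the same config.toml:

```toml
[server]
headless = true
```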

A credentials.toml file:

[general]
email=""

A lot of people echo these files into the image on the fly in the Dockerfile, but I do not like that approach:

RUN bash -c 'echo -e "\
[general]\n\
email = \"\"\n\
" > /root/.streamlit/credentials.toml'

Here is revision three of the local Dockerfile:

FROM python:3.8-slim
ENV REPO_URL=MyApp
COPY . $REPO_URL
WORKDIR $REPO_URL
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8
ENV HTTP_PROXY="http://<my office proxy>:2020"
ENV HTTPS_PROXY="http://<my office proxy>:2020"
RUN pip install --upgrade pip
RUN pip install pipenv
RUN pipenv install --system --deploy --ignore-pipfile
RUN unset HTTP_PROXY
RUN unset HTTPS_PROXY
EXPOSE 8080
ENTRYPOINT [ "streamlit", "run", "startup.py" ]

That’s it. But when deploying the app in production I will be using AWS, where there is no proxy problem, so the production Docker image does not have any proxy environment variables. Here is what the production file looks like:

FROM python:3.8-slim
ENV REPO_URL=MyApp
COPY . $REPO_URL
WORKDIR $REPO_URL
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8
RUN pip install --upgrade pip
RUN pip install pipenv
RUN pipenv install --system --deploy --ignore-pipfile
EXPOSE 8080
ENTRYPOINT [ "streamlit", "run", "startup.py" ]

It built successfully with CodeCommit, CodeBuild and ECR on AWS.
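One small optional refinement to any of the Dockerfiles above (my suggestion, not something these deployments used): collapsing the three pip/pipenv RUN instructions into a single layer keeps the image history shorter without changing what gets installed:

```dockerfile
RUN pip install --upgrade pip pipenv && \
    pipenv install --system --deploy --ignore-pipfile
```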

References:

  1. Configure Docker to Use Proxy
  2. How to Use Streamlit in Docker

Data Science and Data Analytics Developer