Libraries and Dependencies

Your Python script or notebook may depend on external libraries. Datapane lets you add local folders and files which are deployed alongside your script, include pip requirements which are made available to your script when it runs on Datapane, and specify a Docker container for your script to run in. All of these are configured in your datapane.yaml.
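For example, a datapane.yaml that uses all three options might look roughly like the following; each setting is covered in its own section below, and the package, file, and image names are placeholders.

datapane.yaml
requirements:
- yfinance
include:
- stock_scaler.py
container_image_name: your-image-name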

Python dependencies

If we were building a reporting tool to pull down financial data, we might want to use the yfinance library in Python. To do this, we can add it to the requirements list in our datapane.yaml:

datapane.yaml
...
requirements:
- yfinance
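Once deployed, the package can be imported in your script like any other dependency. As a rough sketch (the ticker symbol and period here are purely illustrative):

dp-script.py
import yfinance as yf

# Pull the last month of daily prices for a ticker
prices = yf.Ticker("GOOG").history(period="1mo")
print(prices.head())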

Additional files and folders

We may also want to include a local Python file or folder. Imagine we have a separate Python file, stock_scaler.py, which helps us scale the values of our stocks and which we want to use in our script, or a folder of SQL scripts which we want to use in Python.

~/C/d/d/my-new-proj> ls
dp-script.py datapane.yaml stock_scaler.py

To include this in the deployment, we can add it to the include list:

datapane.yaml
...
include:
- stock_scaler.py
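Once included, stock_scaler.py is deployed alongside dp-script.py and can be imported like any other module. A minimal sketch, assuming stock_scaler.py defines a hypothetical scale() helper:

dp-script.py
import yfinance as yf
import stock_scaler  # local file deployed via the include list

prices = yf.Ticker("GOOG").history(period="1mo")
# scale() is a hypothetical helper defined in stock_scaler.py
scaled_prices = stock_scaler.scale(prices)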

Docker dependencies

By default, scripts on Datapane run in our standard Docker container, which includes the following libraries:

seaborn == 0.10.*
altair-recipes ~= 0.8.0
git+https://github.com/altair-viz/altair_pandas@master#egg=altair-pandas
git+https://github.com/altair-viz/pdvega@master#egg=pdvega
scipy == 1.4.*
scikit-learn == 0.22.*
patsy ~= 0.5.1
lightgbm ~= 2.2.3
lifetimes == 0.11.*
lifelines == 0.23.*
./wheels/fbprophet-0.5-py3-none-any.whl
adtk ~= 0.6.2
# data access
sqlalchemy ~= 1.3
psycopg2-binary ~= 2.8
PyMySQL ~= 0.9.3
google-cloud-bigquery[pandas, pyarrow] ~= 1.17
boto3 ~= 1.12.6
requests ~= 2.23.0
ftpretty ~= 0.3.2
pymongo ~= 3.10.1
# misc
dnspython ~= 1.16.0 # requirement for pymongo to connect to certain instances
sh ~= 1.13.0

If you need additional dependencies or a different environment, you can specify your own Docker container for your script to run in.

Currently only public Docker images are supported; support for private repositories is coming in the near future. With that in mind, for anything you don't want to be public, we recommend continuing to upload private files and folders through the regular include mechanism, and specifying OS-level requirements or a pip requirements.txt in your Dockerfile.

Although you can use any base for your Docker image, we recommend inheriting from ours. To do this, create a Docker image which is based on our image (nstack/datapane-python-runner) and adds your required dependencies:

FROM nstack/datapane-python-runner:latest
COPY requirements.txt .
RUN pip3 install --user -r requirements.txt
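You can then build and push the image with the standard Docker CLI; the tag below is a placeholder for your own Docker Hub repository:

docker build -t your-image-name .
docker push your-image-name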

If you build this image and push it to Docker Hub, you can then specify it in your datapane.yaml as follows:

container_image_name: your-image-name

When you run your script, it will run inside this Docker container. Note that the first run may take a little longer, as the image needs to be pulled from Docker Hub; once pulled, it is cached for future runs.