In the previous example, we deployed a single script and provided configuration through command-line arguments. This works well for simple scripts, but scripts often need other configuration, such as parameter definitions, other files or folders to deploy, and pip requirements.
Datapane allows you to provide a configuration file called datapane.yaml. When you deploy, Datapane looks for this file automatically. Before we continue, create a project structure with the script init command, which creates a sample datapane.yaml and a simple script.
~/C/d/d/my-new-proj> datapane script init
Created script 'my-new-proj', edit as needed and upload
~/C/d/d/my-new-proj> ls
datapane.yaml dp-script.py
We already have a script, so we can delete the sample dp-script.py and copy in our own. Because we're replacing the default script, we should specify this in our datapane.yaml using the script field. Whilst we're there, we can also add the name.
datapane.yaml
name: stock_plotter
script: financial_report.py # this could also be ipynb if it was a notebook
See the API reference for all the available configuration fields.
Your Python script or notebook may depend on external libraries or local files. Datapane lets you add local folders and files, which are deployed alongside your script; include pip requirements, which are made available to your script when it runs on Datapane; and specify a Docker container for your script to run in. These are all configured in your datapane.yaml.
In the above example, we may want to use the yfinance library in Python, instead of manually reading a CSV from Yahoo. To do this, we can add it to a requirements list in our datapane.yaml.
Additionally, we may want to include a local Python folder or file. Imagine we have a separate Python file, stock_scaler.py, which helps us scale the values of our stocks, and which we want to use in our notebook.
~/C/d/d/my-new-proj> ls
dp-script.ipynb datapane.yaml stock_scaler.py
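As an illustration, stock_scaler.py might hold a small helper like the one below. This is only a minimal sketch: the function name scale_to_base and its behaviour (rebasing a price series so that its first value equals 100) are assumptions made for this example, not part of Datapane or of the original script.

```python
# stock_scaler.py (hypothetical contents, for illustration only)
from typing import List, Sequence


def scale_to_base(prices: Sequence[float], base: float = 100.0) -> List[float]:
    """Rebase a price series so that its first value equals `base`.

    This makes stocks with very different prices comparable on one chart.
    """
    if not prices:
        return []
    first = prices[0]
    if first == 0:
        raise ValueError("cannot rebase a series that starts at zero")
    return [p / first * base for p in prices]


if __name__ == "__main__":
    # Made-up closing prices, just to show the call.
    print(scale_to_base([50.0, 55.0, 45.0]))
```

Once the file is deployed alongside it, the script or notebook should be able to use the helper with a regular import, for example from stock_scaler import scale_to_base.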
To include this in the deploy, we can add it to our datapane.yaml.
By default, scripts on Datapane run in our standard Docker container, which includes the following libraries.
seaborn == 0.10.*
altair-recipes ~= 0.8.0
git+https://github.com/altair-viz/[email protected]#egg=altair-pandas
git+https://github.com/altair-viz/[email protected]#egg=pdvega
scipy == 1.4.*
scikit-learn == 0.22.*
patsy ~= 0.5.1
lightgbm ~= 2.2.3
lifetimes == 0.11.*
lifelines == 0.23.*
./wheels/fbprophet-0.5-py3-none-any.whl
adtk ~= 0.6.2
# data access
sqlalchemy ~= 1.3
psycopg2-binary ~= 2.8
PyMySQL ~= 0.9.3
google-cloud-bigquery[pandas, pyarrow] ~= 1.17
boto3 ~= 1.12.6
requests ~= 2.23.0
ftpretty ~= 0.3.2
pymongo ~= 3.10.1
# misc
dnspython ~= 1.16.0 # requirement for pymongo to connect to certain instances
sh ~= 1.13.0
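Because the container pins specific versions, it can be handy to confirm from inside a script which version of a library is actually installed before adding a requirement of your own. The sketch below uses only the standard library's importlib.metadata (Python 3.8+). The helper name installed_version is an assumption chosen for illustration, and the output depends entirely on the environment the script runs in.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names taken from the list above; results depend on the environment.
    for name in ("seaborn", "scipy", "requests"):
        found = installed_version(name)
        print(f"{name}: {found if found is not None else 'not installed'}")
```

If a library is missing, or installed at a version that does not suit your script, that is a sign it belongs in your own requirements or in a custom image.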
If you would rather run your script in your own Docker container, you can specify one.
Although you can use any base for your Docker image, we recommend inheriting from ours. To do this, create a Docker image which inherits from our base image (nstack/datapane-python-runner) and adds your required dependencies.
FROM nstack/datapane-python-runner:latest
COPY requirements.txt .
RUN pip3 install --user -r requirements.txt
If you build this and push it to Docker Hub, you can then specify it in your datapane.yaml.
When you run a script, it will run inside this Docker container. Note that the first run may take a bit longer, as it needs to pull the image from Docker Hub. Once pulled, the image is cached for future runs.