Automation
Automating app creation, including via GitHub Actions.
Introduction
Once you have an app you're happy with, you often need to generate it automatically, for instance on a schedule or when triggered through an API. To make this easier, Datapane provides a GitHub Action that runs your Python script automatically to generate and publish a new app.
To learn more about GitHub Actions, see the documentation.
Configuring your job
Info
This tutorial assumes you have a basic understanding of how to use GitHub Actions. For more information, please refer to GitHub's documentation.
Your GitHub Action needs access to your Datapane API token, which you can find on your settings page once you have logged in to Datapane. Do not store the token in plain text; add it to your repository's secrets instead. The examples on this page read it from a secret named TOKEN.
The Datapane action requires that the repository contains a Python script which publishes an app. Once you have added your token to your repository's secrets, you can add the Datapane action as a job, including the path to the Python script and a reference to your secret token.
jobs:
  build_app:
    runs-on: ubuntu-latest
    name: Run end-of-week Datapane apps
    steps:
      - uses: actions/[email protected]
      - uses: actions/[email protected]
        with:
          python-version: 3.8
      - uses: datapane/[email protected]
        with:
          script: "apps/end_of_week.py"
          token: ${{ secrets.TOKEN }}
Running your action
Running on a schedule
To run your app on a schedule, use the schedule option under your workflow's on: key, which takes one or more cron expressions.
Manual running
Manual runs can be triggered via the workflow_dispatch option. If your app has user-configurable parameters, you can define them as workflow inputs and enter them in the GitHub Actions UI when manually triggering the workflow.
Inputs entered in the GitHub Actions UI are all strings; Datapane converts them to primitive values as needed, e.g. the string false becomes the Python boolean False. Workflow inputs are described in the docs. The inputs must be converted explicitly into the JSON string expected by the parameters option of the Datapane build-action, as follows.
on:
  workflow_dispatch:
    inputs:
      company:
        description: "Company stock name"
        required: true
        default: "GOOG"
      market:
        description: "Country to report for"
        required: false
jobs:
  build_app:
    runs-on: ubuntu-latest
    name: Run Parameterised Datapane app
    steps:
      - uses: actions/[email protected]
      - uses: actions/[email protected]
        with:
          python-version: 3.8
      - uses: datapane/[email protected]
        with:
          script: "apps/financials.py"
          token: ${{ secrets.TOKEN }}
          parameters: ${{ toJson(github.event.inputs) }}
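To make the string-to-primitive conversion more concrete, here is a small Python sketch. It builds the kind of JSON string that toJson(github.event.inputs) yields for the two inputs defined in the workflow, then coerces each string value to a primitive. The coerce_value helper is hypothetical and only illustrates the idea; it is not Datapane's actual implementation, and the real conversion rules may differ.

```python
import json


def coerce_value(raw: str):
    """Hypothetical helper: turn a string input into a Python primitive.

    Illustrative only; Datapane's own conversion rules may differ.
    """
    try:
        # json.loads maps "false" -> False, "42" -> 42, "1.5" -> 1.5
        return json.loads(raw)
    except json.JSONDecodeError:
        # Anything that is not valid JSON (e.g. GOOG) stays a plain string
        return raw


# The kind of JSON string the workflow passes on: every value is a string
parameters_json = json.dumps({"company": "GOOG", "market": "UK"})

raw_params = json.loads(parameters_json)
params = {name: coerce_value(value) for name, value in raw_params.items()}
print(params)
```

Under this sketch, ticker-like values such as GOOG pass through unchanged, while values such as false or 42 arrive in the script as a boolean and an integer.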
Once you have committed your workflow, you can trigger it in the following ways:
Trigger by GitHub UI
Info
For more information, see the GitHub docs on running a parameterised Datapane workflow using the GitHub Actions UI.
The GitHub Actions UI provides an interface for running your workflow with parameters.
Trigger by API/Webhook
Info
For more information, see the GitHub docs on running a parameterised Datapane workflow via an API call.
To trigger your app generation through an API, send a POST request to /repos/{owner}/{repo}/actions/workflows/{workflow_name}/dispatches. For instance, for a repo called acme/reporting with a workflow as above called financial_app, you could trigger it as follows:
$ curl \
    -u GH_USERNAME:GH_TOKEN \
    -X POST \
    -H "Accept: application/vnd.github.v3+json" \
    https://api.github.com/repos/acme/reporting/actions/workflows/financial_app/dispatches \
    -d '{"ref":"ref", "inputs": { "company": "APPL", "market": "UK"} }'
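If you would rather trigger the workflow from Python than from curl, the same request can be sketched with the standard library. The function name build_dispatch_request is hypothetical, and GH_USERNAME, GH_TOKEN and the ref value are placeholders carried over from the curl example. Note that GitHub's REST API documents the workflow path segment as the workflow's ID or its file name, so depending on your setup you may need to pass the workflow's file name rather than financial_app.

```python
import base64
import json
import urllib.request


def build_dispatch_request(owner, repo, workflow, ref, inputs, username, token):
    """Build (but do not send) the workflow_dispatch POST request."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow}/dispatches"
    )
    body = json.dumps({"ref": ref, "inputs": inputs}).encode("utf-8")
    # Equivalent of curl's -u GH_USERNAME:GH_TOKEN (HTTP basic auth)
    credentials = base64.b64encode(f"{username}:{token}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github.v3+json",
            "Authorization": f"Basic {credentials}",
        },
    )


request = build_dispatch_request(
    owner="acme",
    repo="reporting",
    workflow="financial_app",
    ref="ref",  # placeholder from the curl example; use a real branch or tag
    inputs={"company": "APPL", "market": "UK"},
    username="GH_USERNAME",  # placeholder
    token="GH_TOKEN",  # placeholder; load a real token from the environment
)

# To actually trigger the workflow, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it keeps the sketch safe to run as-is and makes the URL and payload easy to inspect before anything is dispatched.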
Advanced Usage
Caching
Caching pip
Pip dependencies can be cached via actions/cache. The cache key should contain the requirements and version input parameters, if they're used.
An example workflow on Ubuntu with caching is shown below:
env:
  version: "==0.8.0"
  requirements: '["networkx"]'
jobs:
  build_app:
    runs-on: ubuntu-latest
    name: Build Datapane app
    steps:
      - uses: actions/[email protected]
      - uses: actions/[email protected]
        with:
          python-version: 3.8
      - uses: actions/[email protected]
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ env.requirements }}-${{ env.version }}
      - uses: datapane/[email protected]
        with:
          script: "apps/financials.py"
          token: ${{ secrets.TOKEN }}
          version: "${{ env.version }}"
          requirements: "${{ env.requirements }}"
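To see why the key should include both values, the following Python sketch mimics how the key template in the workflow above expands. The cache_key function is a hypothetical stand-in for GitHub's expression evaluation, the alternative requirements and version values are made up for illustration, and Linux is the value runner.os takes on ubuntu-latest.

```python
def cache_key(runner_os: str, requirements: str, version: str) -> str:
    """Mimic ${{ runner.os }}-pip-${{ env.requirements }}-${{ env.version }}."""
    return f"{runner_os}-pip-{requirements}-{version}"


key_a = cache_key("Linux", '["networkx"]', "==0.8.0")
key_b = cache_key("Linux", '["networkx", "pandas"]', "==0.8.0")  # extra requirement
key_c = cache_key("Linux", '["networkx"]', "==0.8.1")  # different version

print(key_a)
print(key_a != key_b and key_a != key_c)
```

Because any change to the requirements or the version yields a different key, the stale cache is skipped and a fresh one is saved under the new key.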
Caching packages
It's also possible to cache the installed packages themselves, which speeds up each run, by creating a venv first, activating it, and caching it between runs.
env:
  version: "==0.8.0"
  requirements: '["networkx==2.5", "pandas==1.0.5"]'
jobs:
  build_app:
    runs-on: ubuntu-latest
    name: Build Datapane app
    steps:
      - uses: actions/[email protected]
      - uses: actions/[email protected]
        with:
          python-version: 3.8
      - uses: actions/[email protected]
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ env.requirements }}-${{ env.version }}
      - uses: actions/[email protected]
        with:
          path: ~/.venv
          key: ${{ runner.os }}-pip-${{ env.requirements }}-${{ env.version }}
      - name: Create and activate venv
        run: |
          python3 -m venv ~/.venv
          echo "~/.venv/bin" >> $GITHUB_PATH
      - uses: datapane/[email protected]
        with:
          script: "apps/dp_script.py"
          token: ${{ secrets.TOKEN }}
          requirements: "${{ env.requirements }}"
When doing this, pin your package versions explicitly in your requirements; otherwise you may get cache hits for old versions of your packages.
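One way to guard against stale cache hits is to check the requirements string before committing it. The sketch below assumes the requirements input is a JSON list of pip requirement strings, as in the workflows above. The unpinned_requirements helper is a hypothetical name, and treating only == as a pin is a deliberate simplification.

```python
import json


def unpinned_requirements(requirements_json: str) -> list:
    """Return the requirements that lack an exact '==' version pin."""
    requirements = json.loads(requirements_json)
    return [req for req in requirements if "==" not in req]


pinned = '["networkx==2.5", "pandas==1.0.5"]'
loose = '["networkx", "pandas==1.0.5"]'

print(unpinned_requirements(pinned))
print(unpinned_requirements(loose))
```

An empty result means every package is pinned, so the cache key changes whenever a version is bumped.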