Blobs

Blobs are files and objects which you can store on Datapane and use in your scripts.

It is often necessary to make use of non-code assets such as datasets, models, or files when generating reports. In many situations, deploying these alongside your script is not ideal, for example:

  1. If they are deployed on a different cadence to your script; for instance, you want to make use of a model which is trained on a daily cadence, even though the code of your script remains static.

  2. If they are deployed from a different environment than your script; for instance, you may train a model on Sagemaker and want to use it in your script.

  3. If they are large, and re-uploading them each time you deploy your script is cumbersome.

For these use cases, Datapane provides a Blob API which allows you to upload files from any Python or CLI environment, and access them inside your scripts or through the CLI.

CLI

upload

Upload a file. The command returns an ID and a URL which you can use to retrieve the blob.

datapane blob upload <name> <filename>

download

Download a blob and save it to a file.

datapane blob download <name> <filename> [--version=version]
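If you need to call these commands from a Python job (for example, a training pipeline), one option is to build the argument lists and hand them to `subprocess`. The sketch below is only an illustration: the helper names `blob_upload_cmd`, `blob_download_cmd`, and `run` are hypothetical, and passing the version as a string is an assumption. Only the command syntax shown above is taken from the documentation.

```python
import subprocess
from typing import List, Optional


def blob_upload_cmd(name: str, filename: str) -> List[str]:
    """Build the argument list for `datapane blob upload <name> <filename>`."""
    return ["datapane", "blob", "upload", name, filename]


def blob_download_cmd(
    name: str, filename: str, version: Optional[str] = None
) -> List[str]:
    """Build the argument list for `datapane blob download`.

    The optional --version flag is appended only when a version is given.
    """
    cmd = ["datapane", "blob", "download", name, filename]
    if version is not None:
        cmd.append(f"--version={version}")
    return cmd


def run(cmd: List[str]) -> None:
    """Run the command and raise CalledProcessError on a non-zero exit status."""
    subprocess.run(cmd, check=True)
```

With the Datapane CLI installed and logged in, `run(blob_upload_cmd("my_ds", "my_dataset.csv"))` would execute the upload command shown above. Passing a list rather than a single string avoids shell quoting problems with file names that contain spaces.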

Python

upload_df, upload_file, upload_obj

Parameters

All upload methods take the object to upload as the first parameter. Depending on the method, this can be a file path, DataFrame, or a Python object.

All methods have the additional parameters:

| Parameter | Description | Required |
| --- | --- | --- |
| name | The name of your blob | True |
| visibility | The visibility setting (ORG, PRIVATE, or PUBLIC) | False |

If you want other people in your organisation to use blobs you created in their own scripts, you must set visibility to ORG.

import datapane as dp
# Upload a DataFrame
b = dp.Blob.upload_df(df, name='my_df')
# Upload a file
b = dp.Blob.upload_file("~/my_dataset.csv", name='my_ds')
# Upload an object
b = dp.Blob.upload_obj([1,2,3], name='my_list')

download_df, download_file, download_obj

Download a DataFrame, file, or object. All download operations have the following parameters:

| Parameter | Description | Required |
| --- | --- | --- |
| name | The name of your blob | True |
| version | The version of the blob to retrieve | False |
| owner | The owner of the blob | False |

If you want other people inside your organisation to run your scripts which access a blob you created, you must specify yourself as the owner in this method. When someone runs your script, it runs under their name. If you do not explicitly specify the owner, the lookup is made under their name and fails.

dp.Blob.get(name='foo', owner='linus')

import datapane as dp
# Get the blob by name
blob = dp.Blob.get(name="blob_id")
# Download a DataFrame
b = blob.download_df()
# Download a file
b = blob.download_file("~/my_dataset.csv")
# Download an object
b = blob.download_obj()
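To see why the owner matters when someone else runs your script, the lookup can be pictured with a toy model. This is not Datapane's implementation: `BLOBS`, `get_blob`, and `current_user` are hypothetical names, and the in-memory dictionary only stands in for the real blob store. It assumes, as described above, that a lookup without an explicit owner falls back to the user running the script.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for the blob store, keyed by (owner, blob name).
BLOBS: Dict[Tuple[str, str], bytes] = {("linus", "foo"): b"model-bytes"}


def get_blob(name: str, current_user: str, owner: Optional[str] = None) -> bytes:
    """Look up a blob, defaulting the owner to whoever is running the script."""
    effective_owner = owner if owner is not None else current_user
    try:
        return BLOBS[(effective_owner, name)]
    except KeyError:
        raise LookupError(
            f"no blob {name!r} owned by {effective_owner!r}"
        ) from None
```

In this model, `get_blob("foo", current_user="linus")` succeeds because linus owns the blob, while the same call with a different `current_user` raises `LookupError` unless `owner="linus"` is passed. Hard-coding the owner in the script keeps the lookup independent of who runs it.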

Share your Blob

You may wish to share your blob with others, such as your teammates, so that your team can work on the same DataFrame, object, or file.

To enable sharing with the public, set visibility to PUBLIC when uploading your DataFrame, file, or object as a blob.

dp.Blob.upload_df(df, name='myblob', visibility='PUBLIC')

When others want to access your blob, they can retrieve it by specifying the name of the blob and your account name as the owner:

blob = dp.Blob.get(name='myblob', owner='khuyentran')
# Retrieve blob
b = blob.download_df() # Or download_file(), download_obj()

Now others can use your blob in their code! If you want to share your blob privately within your organisation, follow the same process, but set the visibility of your blob to ORG.
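The organisation-only variant of the upload can be sketched as a small wrapper. The function `upload_shared_df` and the constant `ALLOWED_VISIBILITY` are hypothetical helpers, not part of Datapane. The sketch assumes you pass `dp.Blob` as `blob_api` and that `upload_df` accepts the `name` and `visibility` keywords as shown in the examples above.

```python
from typing import Any

# The visibility values listed in the upload parameters.
ALLOWED_VISIBILITY = ("ORG", "PRIVATE", "PUBLIC")


def upload_shared_df(blob_api: Any, df: Any, name: str, visibility: str = "ORG") -> Any:
    """Upload a DataFrame through `blob_api`, defaulting to organisation-wide sharing.

    `blob_api` is expected to be `dp.Blob`; it is passed in as an argument so the
    helper can be exercised without a Datapane login.
    """
    if visibility not in ALLOWED_VISIBILITY:
        raise ValueError(f"visibility must be one of {ALLOWED_VISIBILITY}")
    return blob_api.upload_df(df, name=name, visibility=visibility)
```

A call such as `upload_shared_df(dp.Blob, df, "myblob")` would then correspond to `dp.Blob.upload_df(df, name='myblob', visibility='ORG')`. Checking the visibility value up front catches typos before any data is uploaded.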