Please see the Blob API Reference for more details.
It is often necessary to use non-code assets such as datasets, models, or files when generating reports. Deploying these alongside your script is not ideal in situations such as the following:
They are deployed on a different cadence than your script; for instance, you use a model that is retrained daily, even though the code of your script remains static.
They are deployed from a different environment than your script; for instance, you train a model on SageMaker and want to use it in your script.
They are large, and re-uploading them each time you deploy your script is cumbersome.
For these use cases, Datapane provides a Blob API that lets you upload files from any Python or CLI environment and access them inside your scripts or through the CLI. See the Python API Docs for more information on using Blobs.
Upload a file and return an id and a url which you can use to retrieve the blob.
datapane blob upload <name> <filename>
Download a blob and save it to a file.
datapane blob download <name> <filename> [--version=version]
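If you need to drive these CLI commands from a Python process (for example, a training job that publishes its output), one option is to build the argument lists and hand them to `subprocess`. This is a minimal sketch: the helper names `blob_upload_cmd`, `blob_download_cmd`, and `run` are hypothetical, and only the command syntax shown above is taken from the documentation.

```python
import subprocess


def blob_upload_cmd(name, filename):
    """Build the argument list for `datapane blob upload <name> <filename>`."""
    return ["datapane", "blob", "upload", name, filename]


def blob_download_cmd(name, filename, version=None):
    """Build the argument list for `datapane blob download`, adding the
    optional --version flag only when a version is given."""
    cmd = ["datapane", "blob", "download", name, filename]
    if version is not None:
        cmd.append(f"--version={version}")
    return cmd


def run(cmd):
    """Run a command built above; raises CalledProcessError on a non-zero exit.
    Assumes the `datapane` CLI is installed and already authenticated."""
    return subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when names or file paths contain spaces.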
All upload methods take the object to upload as the first parameter. Depending on the method, this can be a file path, DataFrame, or a Python object.
All methods have the additional parameters:
Parameter | Description | Required |
name | The name of your blob | True |
visibility | The visibility setting, such as ORG or PUBLIC | False |
If you want other people in your organization to use blobs you created in their scripts, you must set visibility to ORG.
import datapane as dp

# Upload a DataFrame
b = dp.Blob.upload_df(df, name='my_df')

# Upload a file
b = dp.Blob.upload_file("~/my_dataset.csv", name='my_ds')

# Upload an object
b = dp.Blob.upload_obj([1, 2, 3], name='my_list')
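Because each upload method expects a different kind of first argument, a script that handles mixed inputs may want to choose the method automatically. The sketch below is one way to do that. `pick_upload_method` and `upload_any` are hypothetical helpers, not part of Datapane, and treating only `pathlib.Path` values as files (so that plain strings are uploaded as objects) is an assumption made to keep the dispatch unambiguous.

```python
from pathlib import Path


def pick_upload_method(obj):
    """Return the name of the dp.Blob upload method that suits `obj`."""
    if isinstance(obj, Path):
        return "upload_file"
    try:
        import pandas as pd  # optional; only needed to recognise DataFrames
    except ImportError:
        pd = None
    if pd is not None and isinstance(obj, pd.DataFrame):
        return "upload_df"
    return "upload_obj"


def upload_any(obj, name):
    """Upload `obj` with the method chosen above.
    Requires datapane to be installed and logged in; imported lazily so the
    dispatch logic can be used without it."""
    import datapane as dp

    method = getattr(dp.Blob, pick_upload_method(obj))
    # The file example in the docs passes a string path, so convert Path to str.
    payload = str(obj) if isinstance(obj, Path) else obj
    return method(payload, name=name)
```

For example, `upload_any(Path("my_dataset.csv"), name="my_ds")` would go through `upload_file`, while `upload_any([1, 2, 3], name="my_list")` would go through `upload_obj`.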
Download a DataFrame, file, or object. All download operations have the following parameters:
Parameter | Description | Required |
name | The name of your blob | True |
version | The version of the blob to retrieve | False |
owner | The owner of the blob | False |
If you want other people in your organization to run scripts of yours that access a blob you created, you must specify yourself as the owner in this method. When someone runs your script, it runs under their name, so if you do not explicitly specify the owner, it will look for the blob under their name and fail.
dp.Blob.get(name='foo', owner='linus')
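The owner rule described above can be illustrated with a toy in-memory model. This is not Datapane's implementation and it ignores visibility settings entirely; `BlobStore`, `runner`, and the user name `ada` are hypothetical names used only to show why a lookup without an explicit owner fails when someone else runs the script.

```python
class BlobStore:
    """Toy stand-in for a blob service, keyed by (owner, name)."""

    def __init__(self):
        self._blobs = {}

    def upload(self, owner, name, data):
        self._blobs[(owner, name)] = data

    def get(self, name, runner, owner=None):
        # With no explicit owner, the lookup falls back to whoever runs the script.
        effective_owner = owner if owner is not None else runner
        try:
            return self._blobs[(effective_owner, name)]
        except KeyError:
            raise LookupError(
                f"no blob {name!r} owned by {effective_owner!r}"
            ) from None


store = BlobStore()
store.upload(owner="linus", name="foo", data=[1, 2, 3])

# linus runs his own script: the fallback owner matches, so the lookup works.
own_run = store.get("foo", runner="linus")

# ada runs linus's script with the owner set explicitly: the lookup also works.
shared_run = store.get("foo", runner="ada", owner="linus")
```

Calling `store.get("foo", runner="ada")` without an owner raises `LookupError` in this model, which mirrors the failure described above.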
import datapane as dp

# Retrieve the blob by name
blob = dp.Blob.get(name="blob_id")

# Download a DataFrame
b = blob.download_df()

# Download a file
b = blob.download_file("~/my_dataset.csv")

# Download an object
b = blob.download_obj()
You may want to share your blob with others, such as your teammates, so that your team can work on the same DataFrame, object, or file.
To enable sharing with the public, set visibility='PUBLIC' when uploading your DataFrame, file, or object to a blob.
dp.Blob.upload_df(df, name='myblob', visibility='PUBLIC')
When others want to access your blob, they can retrieve it by specifying the name of the blob and your account name as owner:
blob = dp.Blob.get(name='myblob', owner='khuyentran')

# Retrieve blob
b = blob.download_df()  # Or download_file(), download_obj()
Now others can use your blob in their code! If you want to share your blob privately within your organization, follow the same process, but set the visibility of your blob to ORG.