It is often necessary to use non-code assets such as datasets, models, or files when generating reports. Deploying these alongside your script is not ideal in the following situations:
They are deployed on a different cadence from your script. For instance, you want to use a model that is retrained daily, even though the code of your script remains static.
They are deployed from a different environment than your script. For instance, you may train a model on SageMaker and want to use it in your script.
They are large, and re-uploading them each time you deploy your script is cumbersome.
For these use cases, Datapane provides a Blob API that lets you upload files from any Python or CLI environment and access them inside your scripts or through the CLI.
Upload a file and return an id and a url which you can use to retrieve the blob.
datapane blob upload <name> <filename>
Download a blob and save it to a file.
datapane blob download <name> <filename> [--version=version]
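When the upload happens from an automated job (for example, a training pipeline in another environment), it can be convenient to drive the two CLI commands above from Python. The following is a minimal sketch, not part of Datapane itself: the helper names `blob_command` and `run_blob_command`, and the blob and file names, are hypothetical. It only assembles the argument lists shown above and assumes the `datapane` executable is on your `PATH` when you actually run them.

```python
import subprocess


def blob_command(action, name, filename, version=None):
    """Build the argument list for `datapane blob upload` or `datapane blob download`."""
    if action not in ("upload", "download"):
        raise ValueError(f"unsupported action: {action!r}")
    args = ["datapane", "blob", action, name, filename]
    if version is not None:
        # The --version flag is only documented for the download command.
        if action != "download":
            raise ValueError("--version only applies to download")
        args.append(f"--version={version}")
    return args


def run_blob_command(args):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(args, check=True)


# Hypothetical blob and file names, for illustration only.
upload_args = blob_command("upload", "daily_model", "model.pkl")
download_args = blob_command("download", "daily_model", "model.pkl", version="3")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.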
All upload methods take the object to upload as the first parameter. Depending on the method, this can be a file path, DataFrame, or a Python object.
All upload methods also accept the following parameters:
name: The name of your blob
visibility: The visibility setting of the blob, for example 'PUBLIC'
import datapane as dp

# Upload a DataFrame
b = dp.Blob.upload_df(df, name='my_df')

# Upload a file
b = dp.Blob.upload_file("~/my_dataset.csv", name='my_ds')

# Upload an object
b = dp.Blob.upload_obj([1,2,3], name='my_list')
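Because each upload method expects a different kind of first argument, a small dispatcher can pick the method for you. This is only a sketch under stated assumptions: `upload_any` is a hypothetical helper, `blob_api` is expected to be `dp.Blob` (it is passed in so the helper can be tried without a Datapane login), and only `pathlib.Path` values are treated as file paths, since a plain string could equally be an object you want to store.

```python
from pathlib import Path


def _is_dataframe(obj):
    """Return True if obj is a pandas DataFrame (False when pandas is not installed)."""
    try:
        import pandas as pd
    except ImportError:
        return False
    return isinstance(obj, pd.DataFrame)


def upload_any(blob_api, obj, name):
    """Upload obj under the given name, choosing the method from the type of obj."""
    if _is_dataframe(obj):
        return blob_api.upload_df(obj, name=name)
    if isinstance(obj, Path):
        # Expand "~" explicitly rather than relying on the library to do it.
        return blob_api.upload_file(str(obj.expanduser()), name=name)
    # Anything else is uploaded as a generic Python object.
    return blob_api.upload_obj(obj, name=name)
```

With Datapane installed and logged in, a call would look like `upload_any(dp.Blob, Path("~/my_dataset.csv"), name="my_ds")`.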
Download a DataFrame, file, or object. All download operations have the following parameters:
name: The name of your blob
version: The version of the blob to retrieve
owner: The owner of the blob
import datapane as dp

# Retrieve the blob by name
blob = dp.Blob.get(name="blob_id")

# Download a DataFrame
b = blob.download_df()

# Download a file
b = blob.download_file("~/my_dataset.csv")

# Download an object
b = blob.download_obj()
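When a script sometimes pins a version or names an owner and sometimes does not, it can help to assemble the lookup arguments in one place. The helper below is a hypothetical sketch, not part of Datapane. It assumes that `dp.Blob.get` accepts `version` and `owner` as keyword arguments: `owner` is used that way in the sharing example in this document, while `version` as a keyword is an assumption based on the parameter list above.

```python
def blob_get_kwargs(name, version=None, owner=None):
    """Assemble keyword arguments for dp.Blob.get, omitting options that are not set."""
    if not name:
        raise ValueError("name is required")
    kwargs = {"name": name}
    if version is not None:
        kwargs["version"] = version  # assumed keyword, see the note above
    if owner is not None:
        kwargs["owner"] = owner
    return kwargs


# Example (requires datapane and a logged-in session):
#   import datapane as dp
#   blob = dp.Blob.get(**blob_get_kwargs("my_df", owner="khuyentran"))
#   df = blob.download_df()
```

Leaving unset options out of the call, rather than passing `None`, lets the library apply its own defaults.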
You may want to share a blob with others, such as your teammates, so that everyone can work with the same DataFrame, object, or file.
To share a blob with the public, set visibility='PUBLIC' when uploading your DataFrame, file, or object.
dp.Blob.upload_df(df, name='myblob', visibility='PUBLIC')
When others want to access your blob, they can retrieve it by passing the name of the blob and your account name (as owner) to dp.Blob.get:
# Retrieve the blob
blob = dp.Blob.get(name='myblob', owner='khuyentran')
b = blob.download_df()  # Or download_file(), download_obj()
Now others can use your blob for their code! If you want to share your blob privately in your organisation, follow the same process, but set the visibility of your blob to