PetalData

Export data science-ready datasets from cloud apps like Stripe, HubSpot, and Metabase.

Do you want to patch together yet another script to export and transform your data? Of course not! PetalData makes it easy to export data from cloud apps into a data science-ready format. Wow your teammates - you'll deliver insights in minutes.

There's no ETL pipeline to manage, no data warehouse, and no forced demos. We're about fast access to your data for a lot less 💰. You don't even need to sign up!

Try these examples👇

# Stripe: download up to 5 invoices
import petaldata
petaldata.datasets.stripe.api_key = "[YOUR STRIPE API KEY]"

invoices = petaldata.datasets.stripe.Invoices()
invoices.download(limit=5)
# Access downloaded invoices via a pandas DataFrame
df = invoices.df

# HubSpot: download up to 5 contacts
import petaldata
petaldata.datasets.hubspot.api_key = "[YOUR HUBSPOT API KEY]"

contacts = petaldata.datasets.hubspot.Contacts()
contacts.download(limit=5)
# Access downloaded contacts via a pandas DataFrame
df = contacts.df

Install our Python package via pip install petaldata.


No ETL pipelines. No data warehouse.

Running a full ETL (extract, transform, load) pipeline and data warehouse is expensive and time-consuming to maintain. If you aren't ready to make this investment (and aren't storing terabytes of data in your cloud apps ... ahem - that's most of us), PetalData is your secret sauce. Data scientists and data hobbyists can quickly explore the data in the cloud apps you already use without assembling several expensive parts.


How PetalData Works

There are two components to PetalData: our Python library for downloading datasets and our web service.

While we provide a vanilla HTTPS API, you'll primarily interact with PetalData through the petaldata Python package. This library standardizes the process of downloading, saving, and updating datasets across cloud apps. Datasets can be stored locally on your computer or in Amazon S3.
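To make the "download, save, update" cycle concrete, here is a minimal sketch of the general pattern using only Python's standard library. It is not the petaldata implementation: the helper names (merge_rows, save_rows, load_rows), the "id" key column, and the CSV file format are all assumptions chosen for illustration.

```python
import csv
from pathlib import Path


def merge_rows(existing, fresh, key="id"):
    """Return existing rows updated with fresh rows, matched on `key`.

    Hypothetical helper: rows are plain dicts, and a freshly downloaded
    row replaces a saved row that has the same key.
    """
    merged = {row[key]: row for row in existing}
    for row in fresh:
        merged[row[key]] = row  # the newer download wins
    return list(merged.values())


def save_rows(path, rows):
    """Write rows (dicts sharing the same keys) to a local CSV file."""
    if not rows:
        return
    with Path(path).open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def load_rows(path):
    """Read a previously saved CSV file back into a list of dicts."""
    with Path(path).open(newline="") as fh:
        return list(csv.DictReader(fh))
```

An update then amounts to save_rows(path, merge_rows(load_rows(path), fresh)): rows that changed upstream are overwritten in place, and new rows are appended.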

The most brittle parts of the download process run inside our web service, accessible at petaldata.app. The Python library sends requests to this web service. Server-side, we contact the appropriate cloud app, fetch your data, format it, and stream it back to the client. Data exporting is a brittle process that requires continual debugging, tweaks, and maintenance. If our web service didn't do this hard work, every fix would mean debugging on your machine and another library release, which would be painful for everyone.
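Here is a rough idea of what consuming a streamed response can look like on the client side. This sketch assumes the body arrives as CSV in arbitrary byte chunks. The wire format PetalData actually uses isn't specified here, and iter_lines and iter_rows are hypothetical names, not part of the petaldata package.

```python
import csv


def iter_lines(chunks, encoding="utf-8"):
    """Yield complete text lines from an iterable of byte chunks.

    Chunk boundaries can fall anywhere, so bytes are buffered until a
    newline shows up and only whole lines are decoded.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line.decode(encoding) + "\n"
    if buffer:
        # Final line without a trailing newline
        yield buffer.decode(encoding)


def iter_rows(chunks):
    """Parse a streamed CSV body into dicts, one row at a time."""
    return csv.DictReader(iter_lines(chunks))
```

Because rows are produced as the chunks arrive, the client never has to hold the whole response in memory before it starts working with the data.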


Security

We're just as paranoid about leaking data as you are. All network traffic to and from PetalData is via HTTPS. PetalData does not store or log credentials for your cloud services on our servers.
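You can be just as careful on your side by keeping API keys out of your scripts and notebooks. A minimal sketch, assuming you export the key as an environment variable; the helper api_key_from_env and the variable name STRIPE_API_KEY are illustrative choices, not something the petaldata package requires.

```python
import os


def api_key_from_env(name):
    """Read an API key from the environment, failing loudly if it is unset."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key


# Usage with the api_key attribute shown in the examples above
# (STRIPE_API_KEY is an assumed variable name):
#   petaldata.datasets.stripe.api_key = api_key_from_env("STRIPE_API_KEY")
```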


Supported Cloud Apps

You can find the cloud apps we currently support in the sidebar under "Datasets". We're adding more, and we plan to open-source much of this logic so we can support as many cloud apps as possible. Watch our GitHub repo to be notified of new datasets. You can request additional datasets by opening an issue on the repo.