Usage

Basic Usage

Downloading Datasets

Simple Download

from delong_datasets import download_dataset

# Minimal usage
data = download_dataset("<dataset_id>", "your-token")
print(data)

With Options

from delong_datasets import download_dataset, DownloadOptions

# Configure download options
opts = DownloadOptions(
    columns=["patient_id", "diagnosis", "age"],  # Download only these columns
    limit=1000,                                    # Return at most 1000 rows
    stream=False,                                  # Streaming mode disabled
)

data = download_dataset("<dataset_id>", "your-token", opts)

Streaming Large Datasets

Working with Data

Convert to Pandas

Convert to PyArrow

Access as NumPy

Exporting Data

Export to CSV
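
The export code for this section is not shown on this page. As a minimal sketch using only Python's standard csv module, and assuming the downloaded data can be represented as a list of row dictionaries (an assumption, since the return type of download_dataset is not documented here), rows could be written out as follows. The helper rows_to_csv and the sample rows are hypothetical and are not part of delong_datasets.

```python
import csv
import io


def rows_to_csv(rows, fileobj, columns=None):
    """Write a list of row dicts to an open text file object as CSV.

    Returns the number of data rows written. The column order comes from
    `columns` if given, otherwise from the keys of the first row.
    """
    if not rows:
        return 0
    fieldnames = columns or list(rows[0].keys())
    # extrasaction="ignore" drops keys that are not in fieldnames
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


# Hypothetical sample standing in for downloaded data
sample_rows = [
    {"patient_id": "p1", "diagnosis": "A", "age": 41},
    {"patient_id": "p2", "diagnosis": "B", "age": 57},
]

buffer = io.StringIO()
written = rows_to_csv(sample_rows, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead of an in-memory buffer, open the target with open("out.csv", "w", newline="") as the csv module documentation recommends, and pass that file object.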

Export to Parquet

Export to JSON
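
The export code for this section is not shown on this page. The sketch below uses the standard json module and again assumes the data is available as a list of row dictionaries (an assumption about the return type). The helper rows_to_json is hypothetical.

```python
import json


def rows_to_json(rows, indent=2):
    """Serialize a list of row dicts to a JSON string.

    default=str converts values json cannot encode natively
    (for example dates) to their string form.
    """
    return json.dumps(rows, indent=indent, ensure_ascii=False, default=str)


# Hypothetical sample standing in for downloaded data
sample_rows = [
    {"patient_id": "p1", "diagnosis": "A", "age": 41},
    {"patient_id": "p2", "diagnosis": "B", "age": 57},
]

json_text = rows_to_json(sample_rows)

# Round trip to confirm nothing was lost
restored = json.loads(json_text)
```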

CLI Export


Advanced Features

Column Filtering

Request only the columns you need by passing a columns list to DownloadOptions, as in the With Options example. This reduces bandwidth and improves performance.

Benefits:

  • Reduced network bandwidth
  • Faster downloads
  • Lower memory usage
  • Privacy: don't access columns you don't need

Pagination

Handle large datasets efficiently with pagination:

Streaming Mode

For very large datasets that don't fit in memory:
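
What download_dataset returns when stream is enabled is not shown on this page. The sketch below assumes, purely for illustration, that a streamed download can be consumed as an iterable of batches, each batch being a list of row dictionaries. It shows the usual approach: compute a running aggregate batch by batch instead of collecting every row. The fake_stream generator is a hypothetical stand-in for the real stream.

```python
def mean_age(batches):
    """Compute the mean of the "age" field without holding all rows in memory."""
    total = 0
    count = 0
    for batch in batches:
        for row in batch:
            age = row.get("age")
            if age is not None:
                total += age
                count += 1
    return total / count if count else None


def fake_stream(n_rows, batch_size):
    """Hypothetical stand-in for a streamed download: yields lists of row dicts."""
    batch = []
    for i in range(n_rows):
        batch.append({"patient_id": f"p{i}", "age": 20 + i % 50})
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch
        yield batch


result = mean_age(fake_stream(n_rows=100, batch_size=32))
```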

Custom Timeout and Retries
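
The library's own timeout and retry settings are not shown on this page. As a generic alternative, the sketch below wraps any callable in a retry loop with exponential backoff. The helper call_with_retries, its parameters, and the choice of exceptions to retry on are all assumptions for illustration. It does not enforce a timeout, which would need support from the underlying client.

```python
import time


def call_with_retries(func, *, attempts=3, base_delay=1.0,
                      retry_on=(ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Call func(), retrying on the given exceptions with exponential backoff.

    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    The last failure is re-raised unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Demonstration with a stand-in that fails twice, then succeeds
calls = {"n": 0}


def flaky_download():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []  # record requested delays instead of actually sleeping
outcome = call_with_retries(flaky_download, attempts=5, base_delay=0.5,
                            sleep=delays.append)
```

In real use, func could be a lambda that calls download_dataset with the dataset ID and token, following the signature shown under Simple Download.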

Working with Multiple Datasets
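
No example for this section is shown on this page. One common approach is to fetch several datasets concurrently with a thread pool from the standard library. In the sketch below, download_many and fake_download are hypothetical. Whether delong_datasets is safe to call from multiple threads is an assumption that should be checked before using this pattern.

```python
from concurrent.futures import ThreadPoolExecutor


def download_many(dataset_ids, download_one, max_workers=4):
    """Download several datasets concurrently.

    Returns a dict mapping each dataset ID to its result, or to the
    exception raised for that dataset, so one failure does not
    discard the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            dataset_id: pool.submit(download_one, dataset_id)
            for dataset_id in dataset_ids
        }
        for dataset_id, future in futures.items():
            try:
                results[dataset_id] = future.result()
            except Exception as exc:
                results[dataset_id] = exc
    return results


def fake_download(dataset_id):
    """Hypothetical stand-in for a real download call."""
    if dataset_id == "missing":
        raise KeyError(dataset_id)
    return [{"dataset": dataset_id, "row": 0}]


results = download_many(["ds-a", "ds-b", "missing"], fake_download)
```

With the real library, download_one could be a lambda that calls download_dataset with the dataset ID and token, following the signature shown under Simple Download.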
