Usage
Basic Usage
Downloading Datasets
Simple Download
from delong_datasets import download_dataset
# Minimal usage
data = download_dataset("<dataset_id>", "your-token")
print(data)
With Options
from delong_datasets import download_dataset, DownloadOptions
# Configure download options
opts = DownloadOptions(
    columns=["patient_id", "diagnosis", "age"],  # Column filtering
    limit=1000,  # Row limit
    stream=False,  # Streaming mode
)
data = download_dataset("<dataset_id>", "your-token", opts)
Streaming Large Datasets
Working with Data
Convert to Pandas
Convert to PyArrow
Access as NumPy
Exporting Data
Export to CSV
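A minimal sketch of a CSV export using only the standard library. It assumes the downloaded data can be represented as a list of row dictionaries; this section does not state the return type of `download_dataset`, so `rows` below is a stand-in and `rows_to_csv` is a hypothetical helper, not part of the package.

```python
import csv
import io


def rows_to_csv(rows, fileobj):
    """Write a list of row dictionaries to an open text file as CSV.

    Returns the number of data rows written.
    """
    if not rows:
        return 0
    # Column order comes from the first row (assumption: all rows share keys).
    writer = csv.DictWriter(fileobj, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return len(rows)


# Stand-in for downloaded data; the real return type may differ.
rows = [
    {"patient_id": 1, "diagnosis": "A", "age": 34},
    {"patient_id": 2, "diagnosis": "B", "age": 58},
]

buffer = io.StringIO()
written = rows_to_csv(rows, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead, pass a file opened with `open("export.csv", "w", newline="")` in place of the in-memory buffer.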
Export to Parquet
Export to JSON
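A minimal sketch of a JSON export with the standard library, again assuming the data is a list of row dictionaries with JSON-serializable values. The `rows_to_json` helper is hypothetical and only illustrates the idea.

```python
import json


def rows_to_json(rows, indent=2):
    """Serialize a list of row dictionaries to a JSON string."""
    # ensure_ascii=False keeps non-ASCII text readable in the output.
    return json.dumps(rows, indent=indent, ensure_ascii=False)


# Stand-in for downloaded data; the real return type may differ.
rows = [
    {"patient_id": 1, "diagnosis": "A", "age": 34},
    {"patient_id": 2, "diagnosis": "B", "age": 58},
]

json_text = rows_to_json(rows)
```

Values such as dates or decimals are not JSON-serializable by default and would need converting to strings or numbers first.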
CLI Export
Advanced Features
Column Filtering
Request only the columns you need to reduce bandwidth and improve performance:
Benefits:
Reduced network bandwidth
Faster downloads
Lower memory usage
Privacy: don't access columns you don't need
Pagination
Handle large datasets efficiently with pagination:
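The pagination loop can be sketched generically as follows. The only paging-related option shown in this section is `limit`, so the `fetch_page(offset, limit)` callable here is an assumption standing in for whatever paging interface the library actually exposes.

```python
from typing import Callable, Iterator, List


def iter_pages(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = 1000,
) -> Iterator[List[dict]]:
    """Yield pages of rows until the source is exhausted."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            break
        yield page
        # A short page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        offset += len(page)


# Fake data source used only to demonstrate the loop.
_all_rows = [{"patient_id": i} for i in range(2500)]


def fake_fetch_page(offset: int, limit: int) -> List[dict]:
    return _all_rows[offset:offset + limit]


pages = list(iter_pages(fake_fetch_page, page_size=1000))
page_sizes = [len(page) for page in pages]
```

Processing each page as it arrives, rather than collecting them all in a list as the demonstration does, keeps memory use proportional to the page size.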
Streaming Mode
For very large datasets that don't fit in memory:
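A minimal sketch of consuming streamed data without holding it all in memory. This section shows a `stream` option but not what a streaming download returns, so the example assumes an iterable of row batches; `fake_stream` and `mean_age` are hypothetical names used for illustration.

```python
from typing import Iterable, List


def mean_age(batches: Iterable[List[dict]]) -> float:
    """Compute a running mean over batches, keeping only two counters."""
    total = 0
    count = 0
    for batch in batches:
        for row in batch:
            total += row["age"]
            count += 1
    return total / count if count else 0.0


def fake_stream():
    """Stand-in for a streaming download that yields batches of rows."""
    yield [{"age": 30}, {"age": 40}]
    yield [{"age": 50}]


average = mean_age(fake_stream())
```

The key point is that each batch is discarded after it is processed, so peak memory depends on the batch size rather than the dataset size.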
Custom Timeout and Retries
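The library's own timeout and retry settings are not shown in this section. As a generic alternative, a retry wrapper with exponential backoff can be placed around any download call. The `with_retries` helper below is a hypothetical sketch, not part of the package.

```python
import time


def with_retries(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying on any exception with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:
            last_error = exc
            # Wait before the next attempt, but not after the final one.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error


# Demonstration with a function that fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []
# Record the delays instead of actually sleeping.
result = with_retries(flaky, attempts=3, base_delay=0.5, sleep=delays.append)
```

In real use, `func` could be a lambda wrapping the `download_dataset` call, and catching a narrower exception type than `Exception` would avoid retrying on errors that cannot succeed, such as an invalid token.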
Working with Multiple Datasets
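A minimal sketch of downloading several datasets in one pass while collecting failures instead of stopping at the first error. The `download` argument is any callable that takes a dataset ID; `download_many` and `fake_download` are hypothetical names used for illustration.

```python
def download_many(dataset_ids, download):
    """Download each dataset, returning (results, errors) keyed by ID."""
    results = {}
    errors = {}
    for dataset_id in dataset_ids:
        try:
            results[dataset_id] = download(dataset_id)
        except Exception as exc:
            # Keep going so one failure does not block the others.
            errors[dataset_id] = exc
    return results, errors


def fake_download(dataset_id):
    """Stand-in for a real download call."""
    if dataset_id == "missing":
        raise KeyError(dataset_id)
    return [{"dataset": dataset_id, "row": 1}]


results, errors = download_many(["ds-a", "missing", "ds-b"], fake_download)
```

With the function shown earlier in this section, `download` could be `lambda ds: download_dataset(ds, "your-token")`.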

