# Getting Started

This guide walks you through installing DDR and running your first routing experiment.
## Prerequisites

Before installing DDR, ensure you have:

- **Python 3.11+**: DDR requires Python 3.11 or later
- **uv**: the fast Python package manager (see its install instructions)
- **Git**: for cloning the repository

For GPU support (optional but recommended):

- **CUDA 12.4+**: required for GPU acceleration
- **CuPy**: installed via the `cuda` dependency group
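A quick sanity check of these prerequisites can be run from a terminal (binary names may differ on your system; the CuPy line applies to GPU installs only):

```shell
# Verify the interpreter meets the 3.11+ requirement
python3 --version

# uv and git, if installed, report their versions
command -v uv >/dev/null && uv --version || echo "uv not found"
command -v git >/dev/null && git --version || echo "git not found"

# GPU installs only: confirm CuPy can see the CUDA runtime
# python3 -c "import cupy; print(cupy.cuda.runtime.runtimeGetVersion())"
```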
## Installation

### Clone the Repository
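A typical clone step looks like the following; the URL here is a placeholder, so substitute the actual DDR repository (or your fork):

```shell
# Placeholder URL -- replace with the real DDR repository
git clone https://github.com/<org>/ddr.git
cd ddr
```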
### Install Dependencies

DDR uses uv for dependency management. The repository is organized as a workspace with three packages:
| Package | Description |
|---|---|
| `ddr` | Core routing library |
| `ddr-engine` | Geospatial data preparation tools |
| `ddr-benchmarks` | Benchmarking tools for model comparison |
Choose the appropriate installation based on your needs:
The full workspace is recommended for development and paper verification. Use core-only for production routing. GPU support requires an NVIDIA GPU with CUDA 12.4+.
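The uv invocations for those options might look like the sketch below; the exact package and group names (aside from the `cuda` group mentioned above) are assumptions, so check `pyproject.toml` for the authoritative names:

```shell
# Full workspace (development / paper verification)
uv sync

# Core routing library only (production) -- package name assumed
uv sync --package ddr

# GPU support via the cuda dependency group
uv sync --group cuda
```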
### Verify Installation
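A minimal check that the install succeeded, assuming the package exposes a `ddr` import and the CLI entry point described later in this guide:

```shell
# Import check (a version attribute may or may not be exposed)
python -c "import ddr; print('ddr imported OK')"

# The CLI entry point should be on PATH after installation
ddr --help
```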
## Data Preparation

Before training, you need to create the sparse adjacency matrices that define the river network topology.
### Step 1: Download Geospatial Data

DDR requires a geospatial fabric defining the river network. Currently supported:

**NOAA-OWP Hydrofabric v2.2** (recommended for CONUS):

- Download from Lynker-Spatial
- File: `conus_nextgen.gpkg`

**MERIT Hydro** (global coverage):

- Download from the MERIT Hydro website
- Or use the Google Drive mirror
### Step 2: Prepare Gauge Information

Create a CSV file with gauge information. Required columns:
| Column | Description | Example |
|---|---|---|
| `STAID` | USGS station ID (7-8 digits, not zero-padded) | `1563500` |
| `DRAIN_SQKM` | Drainage area in km² | `2847.5` |
| `LAT_GAGE` | Latitude of gauge | `40.2345` |
| `LNG_GAGE` | Longitude of gauge | `-76.8901` |
| `STANAME` | Station name (optional) | `Susquehanna River` |
NOTE: To use MERIT, your gauge CSV must also include a COMID column mapping each river gauge to its MERIT reach.
You can find pre-prepared gauge lists in the references repository.
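As a concrete sketch, a minimal gauge file with the required columns could be written like this (the station values are the examples from the table above):

```shell
# Write a one-gauge CSV using the required column names
cat > training_gages.csv <<'EOF'
STAID,DRAIN_SQKM,LAT_GAGE,LNG_GAGE,STANAME
1563500,2847.5,40.2345,-76.8901,Susquehanna River
EOF

head -2 training_gages.csv
```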
### Step 3: Build Adjacency Matrices

Run the engine script to create the sparse network matrices:
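The exact entry point lives in the `ddr-engine` package; the invocation below is purely illustrative (the module name and flags are hypothetical), so check that package's documentation for the real command:

```shell
# HYPOTHETICAL invocation -- the real script name and flags live in ddr-engine
uv run python -m ddr_engine.build_adjacency \
    --gpkg /path/to/conus_nextgen.gpkg \
    --gages /path/to/training_gages.csv \
    --out /path/to/output/
```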
This creates two zarr stores:
| File | Description |
|---|---|
| `*_conus_adjacency.zarr` | Full CONUS river network in sparse COO format |
| `*_gages_conus_adjacency.zarr` | Per-gauge upstream subnetworks |
These zarr stores contain COO matrices serialized according to the binsparse-python specification.
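Because these are ordinary zarr groups, you can peek inside one with the zarr Python library; the store path here is illustrative:

```shell
# List the arrays and attributes in a generated adjacency store
python -c "
import zarr
g = zarr.open_group('hydrofabric_v2.2_conus_adjacency.zarr', mode='r')
print(list(g.array_keys()))
print(dict(g.attrs))
"
```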
## Configuration

DDR uses Hydra for configuration management. Configuration files are written in YAML.
### Configuration Structure

The most important part of your config is the block below: it defines the data sources DDR needs in order to run.
```yaml
mode: training  # training, testing, or routing
geodataset: lynker_hydrofabric  # the geodataset used to determine river connectivity
name: ddr-v${oc.env:DDR_VERSION,dev}-${geodataset}-${mode}  # the name of the training run
data_sources:
  attributes: "s3://mhpi-spatial/hydrofabric_v2.2_attributes/"  # the path to your geodataset attributes
  geospatial_fabric_gpkg: /path/to/conus_nextgen.gpkg  # the path to your geodataset for river connectivity
  conus_adjacency: /path/to/hydrofabric_v2.2_conus_adjacency.zarr  # the output of engine/ using your geodataset
  gages: /path/to/training_gages.csv  # the training gages used
  gages_adjacency: /path/to/hydrofabric_v2.2_gages_conus_adjacency.zarr  # the output of engine/ for gage subgraph connectivity
  streamflow: "s3://mhpi-spatial/hydrofabric_v2.2_dhbv_retrospective"  # the unit-catchment streamflow predictions you'll be using
  observations: "s3://mhpi-spatial/usgs_streamflow_observations/"  # the USGS observations to train your model against
target_catchments:
  - 1234     # the river ID for the catchment you want to route upstream of (MERIT example)
  - wb-1234  # the river ID for the catchment you want to route upstream of (Lynker Hydrofabric example)
```
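Because DDR uses Hydra, you can print the fully composed configuration without running anything via Hydra's standard `--cfg job` flag (the config name here is illustrative):

```shell
# Show the resolved config Hydra would hand to DDR, then exit
ddr train --config-name my_training --cfg job
```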
### Key Configuration Options

#### Training

| Option | Description | Default |
|---|---|---|
| `mode` | Operating mode: `training`, `testing`, or `routing` | Required |
| `geodataset` | Dataset type: `lynker_hydrofabric` or `merit` | Required |
| `experiment.rho` | Time-window length for training (days) | `None` (full period) |
| `experiment.warmup` | Days excluded from the loss calculation while the model warms up | `3` |
| `experiment.batch_size` | Number of gauges per batch | `1` |
| `kan.hidden_size` | KAN hidden-layer size (recommend `2n+1`, where `n` = number of input features) | `21` |
| `device` | GPU ID or `"cpu"` | `0` |
#### Testing/Routing (Inference)

| Option | Description | Default |
|---|---|---|
| `mode` | Operating mode: `training`, `testing`, or `routing` | Required |
| `geodataset` | Dataset type: `lynker_hydrofabric` or `merit` | Required |
| `experiment.warmup` | Days excluded from the loss calculation while the model warms up | `3` |
| `experiment.batch_size` | Number of days included in the batch | `1` |
| `kan.hidden_size` | KAN hidden-layer size (recommend `2n+1`, where `n` = number of input features) | `21` |
| `device` | GPU ID or `"cpu"` | `0` |
NOTE: Support for more geodatasets is coming soon.
## Running Your First Model

### Quick Start

DDR provides a CLI entry point after installation. Copy a config template and customize it for your data:
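For example (the template filename below is a guess; list `config/templates/` to see what is actually available):

```shell
# Copy a template into place and edit it for your data
# (filename is hypothetical -- run `ls config/templates/` for the real ones)
cp config/templates/lynker_hydrofabric_training.yaml config/my_training.yaml
```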
Then run using the ddr CLI:
```shell
# Training
ddr train --config-name my_training

# Testing
ddr test --config-name my_testing

# Routing a trained model over specified catchments / the whole dataset
ddr route --config-name my_routing

# Checking the baseline unit-catchment metrics
ddr summed-q-prime --config-name my_config
```
You can also invoke the underlying scripts directly instead of going through the CLI.
Config templates are available in config/templates/ for both MERIT and Lynker Hydrofabric datasets, covering training and routing modes.
NOTE: Please change the config to match what mode/geodataset/method you need to work with. Templates use ${oc.env:DDR_DATA_DIR,./data} so you can set the DDR_DATA_DIR environment variable or edit paths directly.
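Setting the environment variable looks like this (the path is illustrative):

```shell
# Point the templates' ${oc.env:DDR_DATA_DIR,./data} interpolation at your data
export DDR_DATA_DIR=/data/ddr
echo "$DDR_DATA_DIR"
```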
### Monitoring

Training progress is logged to the output directory, and model checkpoints are saved to the `params.save_path` directory.
### Expected Model Outputs

After running DDR, you'll have:

```
output/ddr-{version}-{geodataset}-{mode}/YYYY-MM-DD_HH-MM-SS/
├── model/                  # KAN model states
├── plots/                  # any DDR-generated plots
├── saved_models/
│   ├── ddr_{version}-{geodataset}-{mode}_epoch_1_mb_0.pt   # checkpoint file
│   ├── ...
│   └── ddr_{version}-{geodataset}-{mode}_epoch_5_mb_42.pt  # checkpoint file
├── pydantic_config.yaml    # validated configuration
└── ddr-{version}-{geodataset}-{mode}.log  # log of all information generated during training
```
## Next Steps

- **Model Training**: detailed training guide
- **Model Testing**: evaluate your trained model
- **Routing**: run inference with trained weights
- **Engine**: learn about data preparation