Configuration Reference¶
The benchmark configuration extends DDR's standard Hydra configuration with additional options for model comparison.
Example Config¶
An example configuration template is provided at benchmarks/config/example_benchmark.yaml. Copy it and customize the paths for your environment.
Full Configuration¶
# benchmarks/config/benchmark.yaml
defaults:
  - _self_
  - hydra: settings

# === Standard DDR Configuration ===
# Only MERIT is currently supported for benchmarking.
mode: testing
geodataset: merit
name: benchmarks-v${oc.env:BENCHMARKS_VERSION,dev}-${geodataset}
device: 0

data_sources:
  attributes: /path/to/attributes.nc
  geospatial_fabric_gpkg: /path/to/river_network.shp
  conus_adjacency: /path/to/conus_adjacency.zarr
  gages_adjacency: /path/to/gages_adjacency.zarr
  statistics: /path/to/statistics
  streamflow: /path/to/streamflow
  observations: /path/to/observations
  gages: /path/to/gages.csv  # Required for gauge maps (needs STAID, LAT_GAGE, LNG_GAGE)

params:
  parameter_ranges:
    n: [0.02, 0.2]
    q_spatial: [0.0, 1.0]
    top_width: [1.0, 6000.0]
    side_slope: [0.5, 50.0]
  log_space_parameters: [top_width, side_slope]
  defaults:
    p_spatial: 21
  tau: 3

experiment:
  batch_size: 365
  start_time: 1995/10/01
  end_time: 2010/09/30
  warmup: 3
  checkpoint: /path/to/trained_model.pt

kan:
  hidden_size: 21
  input_var_names:
    - SoilGrids1km_clay
    - aridity
    - meanelevation
    - meanP
    - NDVI
    - meanslope
    - log10_uparea
    - SoilGrids1km_sand
    - ETPOT_Hargr
    - Porosity
  num_hidden_layers: 2
  learnable_parameters:
    - n
    - q_spatial
    - top_width
    - side_slope
  grid: 50
  k: 2

# === DiffRoute Configuration ===
diffroute:
  enabled: true
  irf_fn: muskingum
  max_delay: 100
  dt: 0.0416667
  k: 0.1042
  x: 0.3

# === Optional Baseline ===
summed_q_prime: null  # or /path/to/summed_q_prime.zarr
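The `name` field combines an environment-variable lookup with a default (`oc.env:BENCHMARKS_VERSION,dev`) and a reference to `geodataset`. The sketch below mimics that resolution in plain Python to show what the run name becomes; `resolve_run_name` is a hypothetical helper for illustration and is not part of DDR or OmegaConf.

```python
import os
from typing import Mapping, Optional


def resolve_run_name(geodataset: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Mimic name: benchmarks-v${oc.env:BENCHMARKS_VERSION,dev}-${geodataset}."""
    env = os.environ if env is None else env
    # Fall back to "dev" when BENCHMARKS_VERSION is not set.
    version = env.get("BENCHMARKS_VERSION", "dev")
    return f"benchmarks-v{version}-{geodataset}"


# Variable unset: the default "dev" is used.
print(resolve_run_name("merit", env={}))
# Variable set: the version appears in the run name.
print(resolve_run_name("merit", env={"BENCHMARKS_VERSION": "0.1.0"}))
```

With `BENCHMARKS_VERSION=0.1.0` this yields `benchmarks-v0.1.0-merit`, which matches the output directory name shown later on this page.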
DiffRoute Options¶
diffroute.enabled¶
Type: bool
Default: true
Enable or disable DiffRoute comparison. Set to false to run DDR-only benchmarks (useful on CPU-only systems since DiffRoute requires CUDA).
diffroute.irf_fn¶
Type: str
Default: "muskingum"
Impulse Response Function model to use. Options:
| IRF | Parameters | Description |
|---|---|---|
| muskingum | k, x | Classic Muskingum routing |
| linear_storage | tau | Single linear reservoir |
| nash_cascade | tau, n | Cascade of n linear reservoirs |
| pure_lag | delay | Pure time delay |
| hayami | D, L, c | Diffusive wave approximation |
diffroute.max_delay¶
Type: int
Default: 100
Maximum number of timesteps for the LTI router's impulse response. Larger values allow longer travel times but increase memory usage.
diffroute.dt¶
Type: float
Default: 0.0416667 (1 hour in days)
Timestep size in days. Must match DDR's internal timestep (1 hour).
Common values:
| Timestep | dt (days) |
|---|---|
| 15 min | 0.0104167 |
| 1 hour | 0.0416667 |
| 3 hours | 0.125 |
| 1 day | 1.0 |
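The values in the table are the timestep in seconds divided by 86,400. The following sketch reproduces them and also estimates the time span covered by `max_delay` timesteps, assuming the impulse-response window is simply `max_delay * dt`. The helper names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def seconds_to_days(seconds: float) -> float:
    """Convert a timestep in seconds to the day units used by diffroute.dt."""
    return seconds / SECONDS_PER_DAY


def response_window_days(max_delay: int, dt_days: float) -> float:
    """Time span covered by max_delay timesteps (assumed to be max_delay * dt)."""
    return max_delay * dt_days


timesteps = [("15 min", 900), ("1 hour", 3600), ("3 hours", 10_800), ("1 day", 86_400)]
for label, seconds in timesteps:
    print(f"{label:>8}: dt = {seconds_to_days(seconds):.7f} days")

dt = seconds_to_days(3600)
print(f"max_delay=100 at dt=1 hour covers {response_window_days(100, dt):.2f} days")
```

Under that assumption, the default `max_delay` of 100 with an hourly timestep spans a little over four days of travel time.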
diffroute.k¶
Type: float or null
Default: 0.1042 (9000s = 2.5 hours, RAPID default)
Muskingum k parameter (wave travel time through reach) in days.
- Must be in the same units as dt
- For stability: k >= dt / (2*(1-x))
- Physical interpretation: k = reach_length / wave_celerity
If null, defaults to 0.1042 days (RAPID default of 9000 seconds).
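The unit conversion and the stability bound above can be checked before a run. This is a minimal sketch using only the relations stated in this section; `min_stable_k` and `k_from_reach` are hypothetical helpers, and the reach length and celerity in the example are made-up values.

```python
SECONDS_PER_DAY = 86_400
RAPID_DEFAULT_K_DAYS = 9000 / SECONDS_PER_DAY  # 9000 s, about 0.1042 days


def min_stable_k(dt: float, x: float) -> float:
    """Lower bound on k from k >= dt / (2 * (1 - x)); same units as dt."""
    if not 0.0 <= x <= 0.5:
        raise ValueError("x must be in [0.0, 0.5]")
    return dt / (2.0 * (1.0 - x))


def k_from_reach(reach_length_m: float, wave_celerity_m_per_s: float) -> float:
    """k = reach_length / wave_celerity, converted from seconds to days."""
    return reach_length_m / wave_celerity_m_per_s / SECONDS_PER_DAY


dt, x = 0.0416667, 0.3  # defaults from this page
k = RAPID_DEFAULT_K_DAYS
print(f"k = {k:.4f} days, minimum stable k = {min_stable_k(dt, x):.4f} days")
print("satisfies bound:", k >= min_stable_k(dt, x))

# Illustrative reach: 9 km long with a 1 m/s wave celerity (made-up numbers).
print(f"k from reach = {k_from_reach(9000.0, 1.0):.4f} days")
```

With the defaults on this page, the bound is roughly 0.03 days, so the default `k` of 0.1042 days satisfies it.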
diffroute.x¶
Type: float
Default: 0.3
Muskingum x parameter (weighting factor).
- Range: 0.0 to 0.5
- x = 0: Pure reservoir (maximum attenuation)
- x = 0.5: Pure translation (no attenuation)
- Typical values: 0.1 - 0.3
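To illustrate the two limits of x, the sketch below routes a unit pulse with the textbook finite-difference Muskingum scheme. This is the classical formulation, not DiffRoute's impulse-response implementation, and setting `k` equal to `dt` is an assumption chosen only to make the contrast easy to see.

```python
from typing import List, Tuple


def muskingum_coefficients(k: float, x: float, dt: float) -> Tuple[float, float, float]:
    """Classical Muskingum coefficients (c0, c1, c2); they sum to 1."""
    denom = 2.0 * k * (1.0 - x) + dt
    c0 = (dt - 2.0 * k * x) / denom
    c1 = (dt + 2.0 * k * x) / denom
    c2 = (2.0 * k * (1.0 - x) - dt) / denom
    return c0, c1, c2


def route(inflow: List[float], k: float, x: float, dt: float) -> List[float]:
    """Route an inflow series: O[t] = c0*I[t] + c1*I[t-1] + c2*O[t-1]."""
    c0, c1, c2 = muskingum_coefficients(k, x, dt)
    outflow = [inflow[0]]  # start from steady state
    for t in range(1, len(inflow)):
        outflow.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[-1])
    return outflow


pulse = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
dt = 0.0416667
reservoir = route(pulse, k=dt, x=0.0, dt=dt)    # x = 0: attenuated peak
translation = route(pulse, k=dt, x=0.5, dt=dt)  # x = 0.5: peak shifted, not damped

print("x = 0.0 peak:", round(max(reservoir), 3))
print("x = 0.5 peak:", round(max(translation), 3))
```

In this setup the x = 0 case spreads the pulse and lowers its peak, while the x = 0.5 case passes the pulse through one timestep later at full height.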
Summed Q' Baseline¶
summed_q_prime¶
Type: str or null
Default: null
Path to a pre-computed summed Q' zarr store for baseline comparison. The store is generated by scripts/summed_q_prime.py and contains lateral inflow sums at each gage (i.e., streamflow without routing).
When provided, the benchmark will:
- Load the summed Q' predictions from the zarr store
- Align gage IDs between the benchmark and the store
- Compute NSE/KGE/RMSE metrics for the summed Q' baseline
- Include summed Q' in all comparison plots alongside DDR and DiffRoute
This is useful for quantifying how much value routing adds over raw lateral inflow summation.
# Generate summed Q' store
python scripts/summed_q_prime.py

# Use in benchmark
cd benchmarks
uv run python scripts/benchmark.py \
    summed_q_prime=/path/to/summed_q_prime.zarr
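The baseline steps listed above (align gage IDs, then score the unrouted sums against observations) can be sketched in plain Python. The in-memory dictionaries stand in for the zarr store and the observation data, the flow values are made up, and `align_gages` and `nse` are hypothetical helpers rather than the benchmark's own functions.

```python
from statistics import fmean
from typing import Iterable, List, Sequence


def align_gages(benchmark_ids: Iterable[str], store_ids: Iterable[str]) -> List[str]:
    """Keep benchmark gage IDs that also exist in the store, preserving order."""
    available = set(store_ids)
    return [gage for gage in benchmark_ids if gage in available]


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    mean_obs = fmean(obs)
    numerator = sum((o - s) ** 2 for o, s in zip(obs, sim))
    denominator = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - numerator / denominator


# Made-up data: observed flow and summed lateral inflow keyed by gage ID.
observations = {
    "01234567": [1.0, 2.0, 3.0, 4.0],
    "07654321": [2.0, 2.5, 3.5, 3.0],
}
summed_q_prime = {
    "01234567": [1.5, 2.5, 2.5, 3.5],
    "09999999": [0.5, 0.6, 0.7, 0.8],
}

shared = align_gages(observations, summed_q_prime)
for gage in shared:
    print(gage, round(nse(observations[gage], summed_q_prime[gage]), 3))
```

Only gages present on both sides are scored, so a gage missing from the store drops out of the baseline comparison rather than raising an error in this sketch.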
Command-Line Overrides¶
Override any configuration option from the command line:
cd benchmarks

# Change DiffRoute parameters
uv run python scripts/benchmark.py diffroute.k=0.1 diffroute.x=0.2

# Change experiment settings
uv run python scripts/benchmark.py \
    experiment.start_time=2000/10/01 \
    experiment.end_time=2005/09/30

# Use different checkpoint
uv run python scripts/benchmark.py \
    experiment.checkpoint=/path/to/other_model.pt

# Run on specific GPU
uv run python scripts/benchmark.py device=0

# Disable DiffRoute (DDR only)
uv run python scripts/benchmark.py diffroute.enabled=false

# Include summed Q' baseline
uv run python scripts/benchmark.py summed_q_prime=/path/to/summed_q_prime.zarr
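Each override is a dotted key path into the configuration followed by `=` and a value. The sketch below shows how such a path maps onto a nested structure. It is a simplified stand-in and not Hydra's parser: values are kept as strings here, whereas Hydra applies its own typed parsing, and `apply_override` is a hypothetical helper.

```python
from typing import Any, Dict


def apply_override(config: Dict[str, Any], override: str) -> Dict[str, Any]:
    """Set a nested key from a 'a.b.c=value' string (value kept as a string)."""
    dotted_key, _, raw_value = override.partition("=")
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        # Walk down the tree, creating intermediate sections when missing.
        node = node.setdefault(part, {})
    node[leaf] = raw_value
    return config


config: Dict[str, Any] = {
    "device": "0",
    "diffroute": {"enabled": "true", "k": "0.1042", "x": "0.3"},
}

for item in ["diffroute.k=0.1", "diffroute.x=0.2", "diffroute.enabled=false"]:
    apply_override(config, item)

print(config["diffroute"])
```

Keys that are not overridden, such as `device` here, keep the value from the configuration file.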
Output Directory¶
Results are saved to:
output/benchmarks-v0.1.0-merit/2026-02-06_12-00-00/
├── plots/
│   ├── nse_cdf_comparison.png          # CDF of NSE across all gauges
│   ├── kge_cdf_comparison.png          # CDF of KGE across all gauges
│   ├── metric_boxplot_comparison.png   # 6-panel boxplot (bias, rmse, fhv, flv, nse, kge)
│   ├── gauge_map_ddr_NSE.png           # Map colored by DDR NSE
│   ├── gauge_map_diffroute_NSE.png     # Map colored by DiffRoute NSE
│   ├── gauge_map_sqp_NSE.png           # Map colored by summed Q' NSE (if enabled)
│   └── hydrographs/                    # Per-gage time series with all models overlaid
│       ├── 01234567.png
│       └── ...
├── benchmark_results.zarr
└── .hydra/
    ├── config.yaml
    └── overrides.yaml