Configuration Reference¶

The benchmark configuration extends DDR's standard Hydra configuration with additional options for model comparison.

Example Config¶

An example configuration template is provided at benchmarks/config/example_benchmark.yaml. Copy it and customize the paths for your environment:

cp benchmarks/config/example_benchmark.yaml benchmarks/config/benchmark.yaml

Full Configuration¶

# benchmarks/config/benchmark.yaml

defaults:
  - _self_
  - hydra: settings

# === Standard DDR Configuration ===
# Only MERIT is currently supported for benchmarking.

mode: testing
geodataset: merit
name: benchmarks-v${oc.env:BENCHMARKS_VERSION,dev}-${geodataset}
device: 0

data_sources:
  attributes: /path/to/attributes.nc
  geospatial_fabric_gpkg: /path/to/river_network.shp
  conus_adjacency: /path/to/conus_adjacency.zarr
  gages_adjacency: /path/to/gages_adjacency.zarr
  statistics: /path/to/statistics
  streamflow: /path/to/streamflow
  observations: /path/to/observations
  gages: /path/to/gages.csv   # Required for gauge maps (needs STAID, LAT_GAGE, LNG_GAGE)

params:
  parameter_ranges:
    n: [0.02, 0.2]
    q_spatial: [0.0, 1.0]
    top_width: [1.0, 6000.0]
    side_slope: [0.5, 50.0]
  log_space_parameters: [top_width, side_slope]
  defaults:
    p_spatial: 21
  tau: 3

experiment:
  batch_size: 365
  start_time: 1995/10/01
  end_time: 2010/09/30
  warmup: 3
  checkpoint: /path/to/trained_model.pt

kan:
  hidden_size: 21
  input_var_names:
    - SoilGrids1km_clay
    - aridity
    - meanelevation
    - meanP
    - NDVI
    - meanslope
    - log10_uparea
    - SoilGrids1km_sand
    - ETPOT_Hargr
    - Porosity
  num_hidden_layers: 2
  learnable_parameters:
    - n
    - q_spatial
    - top_width
    - side_slope
  grid: 50
  k: 2

# === DiffRoute Configuration ===

diffroute:
  enabled: true
  irf_fn: muskingum
  max_delay: 100
  dt: 0.0416667
  k: 0.1042
  x: 0.3

# === Optional Baseline ===

summed_q_prime: null  # or /path/to/summed_q_prime.zarr
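
The `name` field above combines an OmegaConf environment lookup (`oc.env`, with `dev` as the fallback) and a reference to `geodataset`. As a rough illustration of how that string resolves, the sketch below mimics the interpolation in plain Python. The helper `resolve_run_name` is hypothetical and is not part of DDR or Hydra.

```python
import os


def resolve_run_name(geodataset: str = "merit") -> str:
    """Mimic `benchmarks-v${oc.env:BENCHMARKS_VERSION,dev}-${geodataset}`.

    Hypothetical helper for illustration only; Hydra/OmegaConf performs
    the real interpolation when the config is loaded.
    """
    # oc.env uses the value after the comma when the variable is unset.
    version = os.environ.get("BENCHMARKS_VERSION", "dev")
    return f"benchmarks-v{version}-{geodataset}"


print(resolve_run_name())
```

Exporting `BENCHMARKS_VERSION=0.1.0` before running the benchmark would therefore give a run name like the `benchmarks-v0.1.0-merit` directory shown under Output Directory.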

DiffRoute Options¶

`diffroute.enabled`¶

Type: bool Default: true

Enable or disable DiffRoute comparison. Set to false to run DDR-only benchmarks (useful on CPU-only systems since DiffRoute requires CUDA).

`diffroute.irf_fn`¶

Type: str Default: "muskingum"

Impulse Response Function model to use. Options:

| IRF | Parameters | Description |
| --- | --- | --- |
| `muskingum` | k, x | Classic Muskingum routing |
| `linear_storage` | tau | Single linear reservoir |
| `nash_cascade` | tau, n | Cascade of n linear reservoirs |
| `pure_lag` | delay | Pure time delay |
| `hayami` | D, L, c | Diffusive wave approximation |

`diffroute.max_delay`¶

Type: int Default: 100

Maximum number of timesteps for the LTI router's impulse response. Larger values allow longer travel times but increase memory usage.
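
To get a feel for how large `max_delay` needs to be, the sketch below estimates how much of an impulse response fits inside the window. It assumes the continuous single linear reservoir response h(t) = (1/tau) * exp(-t/tau), which is a textbook form and not necessarily how DiffRoute discretizes its IRFs. The function name `captured_fraction` is illustrative.

```python
import math


def captured_fraction(tau_days: float, max_delay: int = 100,
                      dt_days: float = 1 / 24) -> float:
    """Fraction of a linear-reservoir impulse response inside the window.

    Assumes h(t) = (1/tau) * exp(-t/tau); illustrative only.
    """
    window_days = max_delay * dt_days
    # Integral of h(t) from 0 to window_days.
    return 1.0 - math.exp(-window_days / tau_days)


for tau in (0.5, 1.0, 3.0):
    print(f"tau={tau} d -> {captured_fraction(tau):.3f} of the response retained")
```

Under this assumption, slower reservoirs (larger tau) lose more of their response to truncation at a fixed `max_delay`, which is the trade-off against memory described above.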

`diffroute.dt`¶

Type: float Default: 0.0416667 (1 hour in days)

Timestep size in days. Must match DDR's internal timestep (1 hour).

Common values:

| Timestep | dt (days) |
| --- | --- |
| 15 min | 0.0104167 |
| 1 hour | 0.0416667 |
| 3 hours | 0.125 |
| 1 day | 1.0 |
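
The values in the table are plain unit conversions. A minimal sketch using the standard library, with a hypothetical helper name `timestep_in_days`:

```python
from datetime import timedelta


def timestep_in_days(step: timedelta) -> float:
    """Express a timestep as a fraction of one day."""
    # Dividing two timedeltas yields a float ratio.
    return step / timedelta(days=1)


steps = {
    "15 min": timedelta(minutes=15),
    "1 hour": timedelta(hours=1),
    "3 hours": timedelta(hours=3),
    "1 day": timedelta(days=1),
}
for label, step in steps.items():
    print(f"{label}: dt={timestep_in_days(step):.7f}")
```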

`diffroute.k`¶

Type: float or null Default: 0.1042 (9000s = 2.5 hours, RAPID default)

Muskingum k parameter (wave travel time through reach) in days.

Must be in the same units as dt (days)
For stability: k >= dt / (2*(1-x))
Physical interpretation: k = reach_length / wave_celerity

If null, defaults to 0.1042 days (RAPID default of 9000 seconds).
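
The unit conversion and the bound on k stated above can be checked with a few lines of Python. This is a sketch using the defaults from this page; the helper names are illustrative, and the 9 km reach with a 1 m/s celerity is a made-up example that happens to reproduce 9000 seconds.

```python
SECONDS_PER_DAY = 86_400


def k_days_from_seconds(k_seconds: float) -> float:
    """Convert a Muskingum k given in seconds to days."""
    return k_seconds / SECONDS_PER_DAY


def k_days_from_reach(length_m: float, celerity_m_s: float) -> float:
    """k = reach_length / wave_celerity, converted to days."""
    return k_days_from_seconds(length_m / celerity_m_s)


def min_k(dt_days: float, x: float) -> float:
    """Lower bound on k from the condition k >= dt / (2 * (1 - x))."""
    return dt_days / (2.0 * (1.0 - x))


dt = 1 / 24                      # 1 hour in days
k = k_days_from_seconds(9000)    # RAPID default
x = 0.3
print(f"k = {k:.4f} days, lower bound = {min_k(dt, x):.4f} days")
assert k >= min_k(dt, x)
```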

`diffroute.x`¶

Type: float Default: 0.3

Muskingum x parameter (weighting factor).

Range: 0.0 to 0.5
x = 0: Pure reservoir (maximum attenuation)
x = 0.5: Pure translation (no attenuation)
Typical values: 0.1 - 0.3
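
The effect of x is easiest to see by routing a single pulse through one reach. The sketch below uses the classic textbook Muskingum difference scheme, which is an assumption for illustration and may differ from how DiffRoute builds its Muskingum IRF. The function names are hypothetical.

```python
def muskingum_coefficients(k: float, x: float, dt: float):
    """Textbook Muskingum coefficients (C0, C1, C2); they sum to 1."""
    denom = 2.0 * k * (1.0 - x) + dt
    c0 = (dt - 2.0 * k * x) / denom
    c1 = (dt + 2.0 * k * x) / denom
    c2 = (2.0 * k * (1.0 - x) - dt) / denom
    return c0, c1, c2


def route(inflow, k: float, x: float, dt: float):
    """Route an inflow series through one reach."""
    c0, c1, c2 = muskingum_coefficients(k, x, dt)
    outflow = [inflow[0]]
    for t in range(1, len(inflow)):
        outflow.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[-1])
    return outflow


dt = 1 / 24
pulse = [0.0, 10.0, 0.0, 0.0, 0.0, 0.0]
translated = route(pulse, k=dt, x=0.5, dt=dt)  # pulse shifted, peak kept
attenuated = route(pulse, k=dt, x=0.0, dt=dt)  # peak reduced and spread out
print("x=0.5:", [round(q, 2) for q in translated])
print("x=0.0:", [round(q, 2) for q in attenuated])
```

With k equal to dt, x = 0.5 moves the pulse one step downstream without changing its peak, while x = 0 lowers the peak and spreads it over several steps.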

Summed Q' Baseline¶

`summed_q_prime`¶

Type: str or null Default: null

Path to a pre-computed summed Q' zarr store for baseline comparison. The store is generated by scripts/summed_q_prime.py and contains lateral inflow sums at each gage (i.e., streamflow without routing).

When provided, the benchmark will:

Load the summed Q' predictions from the zarr store
Align gage IDs between the benchmark and the store
Compute NSE/KGE/RMSE metrics for the summed Q' baseline
Include summed Q' in all comparison plots alongside DDR and DiffRoute

This is useful for quantifying how much value routing adds over raw lateral inflow summation.

# Generate summed Q' store
python scripts/summed_q_prime.py

# Use in benchmark
cd benchmarks
uv run python scripts/benchmark.py \
    summed_q_prime=/path/to/summed_q_prime.zarr
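
The alignment and metric steps listed above can be sketched in plain Python. This is an illustration under simple assumptions (one list of flows per gage ID, no missing values), not the benchmark's actual implementation; the function names and the gage IDs other than `01234567` are made up. KGE is left out for brevity.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


def rmse(obs, sim):
    """Root-mean-square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def score_baseline(observed: dict, baseline: dict) -> dict:
    """Score only the gages present in both the benchmark and the store."""
    shared = sorted(observed.keys() & baseline.keys())
    return {
        gage: {"nse": nse(observed[gage], baseline[gage]),
               "rmse": rmse(observed[gage], baseline[gage])}
        for gage in shared
    }


observed = {"01234567": [1.0, 2.0, 3.0, 4.0], "07654321": [2.0, 2.5, 3.0, 2.0]}
baseline = {"01234567": [1.0, 2.0, 3.0, 4.0], "09999999": [5.0, 5.0, 5.0, 5.0]}
scores = score_baseline(observed, baseline)
print(scores)
```

Gages that appear on only one side are dropped, so the baseline is compared on the same set of gages as the routed models.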

Command-Line Overrides¶

Override any configuration option from the command line:

cd benchmarks

# Change DiffRoute parameters
uv run python scripts/benchmark.py diffroute.k=0.1 diffroute.x=0.2

# Change experiment settings
uv run python scripts/benchmark.py \
    experiment.start_time=2000/10/01 \
    experiment.end_time=2005/09/30

# Use different checkpoint
uv run python scripts/benchmark.py \
    experiment.checkpoint=/path/to/other_model.pt

# Run on a specific GPU (index 1 here; the default is 0)
uv run python scripts/benchmark.py device=1

# Disable DiffRoute (DDR only)
uv run python scripts/benchmark.py diffroute.enabled=false

# Include summed Q' baseline
uv run python scripts/benchmark.py summed_q_prime=/path/to/summed_q_prime.zarr
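
When comparing several DiffRoute settings, the override strings can be generated rather than typed by hand. The sketch below only builds and prints the commands; it does not run them. `sweep_commands` is a hypothetical helper, and the k and x values are arbitrary examples.

```python
import itertools
import shlex

BASE = ["uv", "run", "python", "scripts/benchmark.py"]


def sweep_commands(k_values, x_values):
    """Yield one benchmark command per (k, x) combination."""
    for k, x in itertools.product(k_values, x_values):
        yield BASE + [f"diffroute.k={k}", f"diffroute.x={x}"]


for cmd in sweep_commands([0.05, 0.1042], [0.1, 0.3]):
    # shlex.join quotes arguments safely for pasting into a shell.
    print(shlex.join(cmd))
```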

Output Directory¶

Results are saved to:

output/benchmarks-v0.1.0-merit/2026-02-06_12-00-00/
├── plots/
│   ├── nse_cdf_comparison.png            # CDF of NSE across all gauges
│   ├── kge_cdf_comparison.png            # CDF of KGE across all gauges
│   ├── metric_boxplot_comparison.png     # 6-panel boxplot (bias, rmse, fhv, flv, nse, kge)
│   ├── gauge_map_ddr_NSE.png            # Map colored by DDR NSE
│   ├── gauge_map_diffroute_NSE.png      # Map colored by DiffRoute NSE
│   ├── gauge_map_sqp_NSE.png            # Map colored by summed Q' NSE (if enabled)
│   └── hydrographs/                     # Per-gage time series with all models overlaid
│       ├── 01234567.png
│       └── ...
├── benchmark_results.zarr
└── .hydra/
    ├── config.yaml
    └── overrides.yaml
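
Because each run directory is named with a sortable timestamp, the most recent results can be located with a short script. This is a sketch assuming the layout shown above; `latest_run` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def latest_run(output_root: Path, run_name: str) -> Optional[Path]:
    """Return the newest timestamped run directory, or None if absent."""
    base = output_root / run_name
    if not base.is_dir():
        return None
    # Names like 2026-02-06_12-00-00 sort chronologically as strings.
    runs = sorted(p for p in base.iterdir() if p.is_dir())
    return runs[-1] if runs else None


run = latest_run(Path("output"), "benchmarks-v0.1.0-merit")
if run is not None:
    for plot in sorted((run / "plots").glob("*.png")):
        print(plot.name)
```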
