Configuration Reference

The benchmark configuration extends DDR's standard Hydra configuration with additional options for model comparison.

Example Config

An example configuration template is provided at benchmarks/config/example_benchmark.yaml. Copy it and customize the paths for your environment:

cp benchmarks/config/example_benchmark.yaml benchmarks/config/benchmark.yaml

Full Configuration

# benchmarks/config/benchmark.yaml

defaults:
  - _self_
  - hydra: settings

# === Standard DDR Configuration ===
# Only MERIT is currently supported for benchmarking.

mode: testing
geodataset: merit
name: benchmarks-v${oc.env:BENCHMARKS_VERSION,dev}-${geodataset}
device: 0

data_sources:
  attributes: /path/to/attributes.nc
  geospatial_fabric_gpkg: /path/to/river_network.shp
  conus_adjacency: /path/to/conus_adjacency.zarr
  gages_adjacency: /path/to/gages_adjacency.zarr
  statistics: /path/to/statistics
  streamflow: /path/to/streamflow
  observations: /path/to/observations
  gages: /path/to/gages.csv   # Required for gauge maps (needs STAID, LAT_GAGE, LNG_GAGE)

params:
  parameter_ranges:
    n: [0.02, 0.2]
    q_spatial: [0.0, 1.0]
    top_width: [1.0, 6000.0]
    side_slope: [0.5, 50.0]
  log_space_parameters: [top_width, side_slope]
  defaults:
    p_spatial: 21
  tau: 3

experiment:
  batch_size: 365
  start_time: 1995/10/01
  end_time: 2010/09/30
  warmup: 3
  checkpoint: /path/to/trained_model.pt

kan:
  hidden_size: 21
  input_var_names:
    - SoilGrids1km_clay
    - aridity
    - meanelevation
    - meanP
    - NDVI
    - meanslope
    - log10_uparea
    - SoilGrids1km_sand
    - ETPOT_Hargr
    - Porosity
  num_hidden_layers: 2
  learnable_parameters:
    - n
    - q_spatial
    - top_width
    - side_slope
  grid: 50
  k: 2

# === DiffRoute Configuration ===

diffroute:
  enabled: true
  irf_fn: muskingum
  max_delay: 100
  dt: 0.0416667
  k: 0.1042
  x: 0.3

# === Optional Baseline ===

summed_q_prime: null  # or /path/to/summed_q_prime.zarr
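
The `name` field combines an environment-variable lookup with a reference to `geodataset`. As a rough illustration of how that string resolves, the sketch below mimics the interpolation with the standard library only. `resolve_run_name` is a hypothetical helper, not part of DDR or Hydra; the real resolution is done by OmegaConf's `oc.env` resolver.

```python
import os
from typing import Mapping


def resolve_run_name(geodataset: str, env: Mapping[str, str] = os.environ) -> str:
    """Mimic `benchmarks-v${oc.env:BENCHMARKS_VERSION,dev}-${geodataset}`."""
    # The second argument to oc.env ("dev") is the fallback for an unset variable.
    version = env.get("BENCHMARKS_VERSION", "dev")
    return f"benchmarks-v{version}-{geodataset}"


# Without the variable set, the fallback "dev" is used.
print(resolve_run_name("merit", env={}))
# With BENCHMARKS_VERSION=0.1.0, the name matches the output directory shown later.
print(resolve_run_name("merit", env={"BENCHMARKS_VERSION": "0.1.0"}))
```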

DiffRoute Options

diffroute.enabled

Type: bool Default: true

Enable or disable the DiffRoute comparison. Set to false to run DDR-only benchmarks, which is useful on CPU-only systems because DiffRoute requires CUDA.

diffroute.irf_fn

Type: str Default: "muskingum"

Impulse Response Function model to use. Options:

| IRF | Parameters | Description |
|---|---|---|
| muskingum | k, x | Classic Muskingum routing |
| linear_storage | tau | Single linear reservoir |
| nash_cascade | tau, n | Cascade of n linear reservoirs |
| pure_lag | delay | Pure time delay |
| hayami | D, L, c | Diffusive wave approximation |

diffroute.max_delay

Type: int Default: 100

Maximum number of timesteps for the LTI router's impulse response. Larger values allow longer travel times but increase memory usage.
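
To make the trade-off concrete, here is a minimal sketch of a truncated impulse response for a single linear reservoir, followed by a plain causal convolution. It is an illustration under stated assumptions, not DiffRoute's implementation: `linear_storage_irf`, `route`, and the value `tau = 1.0` are all hypothetical. The sum of the weights shows how much of the response survives truncation at a given `max_delay`.

```python
import math


def linear_storage_irf(tau: float, dt: float, max_delay: int) -> list[float]:
    """Discrete weights of h(t) = exp(-t / tau) / tau, truncated at max_delay steps."""
    # Weight j is the exact integral of h(t) over the interval [j*dt, (j+1)*dt].
    return [
        math.exp(-j * dt / tau) - math.exp(-(j + 1) * dt / tau)
        for j in range(max_delay)
    ]


def route(inflow: list[float], irf: list[float]) -> list[float]:
    """Causal convolution of an inflow series with the IRF weights."""
    return [
        sum(irf[j] * inflow[t - j] for j in range(min(t + 1, len(irf))))
        for t in range(len(inflow))
    ]


dt = 0.0416667  # 1 hour in days, as in the config above
tau = 1.0       # hypothetical reservoir time constant, in days

irf_short = linear_storage_irf(tau, dt, max_delay=24)
irf_long = linear_storage_irf(tau, dt, max_delay=100)

# Fraction of the full response kept by each truncation (1.0 means nothing lost).
print(f"max_delay=24:  {sum(irf_short):.4f}")
print(f"max_delay=100: {sum(irf_long):.4f}")
```

A longer window keeps more of the slow tail, at the cost of storing and convolving more weights per reach.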

diffroute.dt

Type: float Default: 0.0416667 (1 hour in days)

Timestep size in days. Must match DDR's internal timestep (1 hour).

Common values:

| Timestep | dt (days) |
|---|---|
| 15 min | 0.0104167 |
| 1 hour | 0.0416667 |
| 3 hours | 0.125 |
| 1 day | 1.0 |
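
The values in this table are the timestep length in seconds divided by 86,400. A small sketch of that conversion (`timestep_in_days` is a hypothetical helper, not part of the benchmark code):

```python
SECONDS_PER_DAY = 86_400


def timestep_in_days(seconds: float) -> float:
    """Convert a timestep length from seconds to days."""
    return seconds / SECONDS_PER_DAY


for label, seconds in [
    ("15 min", 900),
    ("1 hour", 3_600),
    ("3 hours", 10_800),
    ("1 day", 86_400),
]:
    print(f"{label:>8}: dt = {timestep_in_days(seconds):.7f}")
```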

diffroute.k

Type: float or null Default: 0.1042 (9000s = 2.5 hours, RAPID default)

Muskingum k parameter (wave travel time through reach) in days.

  • Must be in the same units as dt
  • For stability: k >= dt / (2*(1-x))
  • Physical interpretation: k = reach_length / wave_celerity

If null, defaults to 0.1042 days (RAPID default of 9000 seconds).
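
The unit conversion and the stability bound above can be checked with a few lines of arithmetic. The sketch below is illustrative only: `k_in_days` and `min_stable_k` are hypothetical helpers, and the 9 km reach with a 1 m/s celerity is an invented example chosen so that it reproduces the 9000-second default.

```python
SECONDS_PER_DAY = 86_400


def k_in_days(reach_length_m: float, wave_celerity_m_per_s: float) -> float:
    """k = reach_length / wave_celerity, converted from seconds to days."""
    return reach_length_m / wave_celerity_m_per_s / SECONDS_PER_DAY


def min_stable_k(dt: float, x: float) -> float:
    """Lower bound on k from the condition k >= dt / (2 * (1 - x))."""
    return dt / (2.0 * (1.0 - x))


dt, x = 0.0416667, 0.3               # defaults from the config above
k_default = 9000 / SECONDS_PER_DAY   # RAPID default of 9000 s, in days
k_floor = min_stable_k(dt, x)

print(f"k_default = {k_default:.4f} days, lower bound = {k_floor:.4f} days")
print("default satisfies the bound:", k_default >= k_floor)

# Hypothetical reach: 9 km long with a wave celerity of 1 m/s.
k_reach = k_in_days(reach_length_m=9_000.0, wave_celerity_m_per_s=1.0)
print(f"k for the example reach = {k_reach:.4f} days")
```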

diffroute.x

Type: float Default: 0.3

Muskingum x parameter (weighting factor).

  • Range: 0.0 to 0.5
  • x = 0: Pure reservoir (maximum attenuation)
  • x = 0.5: Pure translation (no attenuation)
  • Typical values: 0.1 - 0.3
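
The two limiting cases can be seen in the textbook finite-difference form of Muskingum routing, O[t+1] = c0*I[t+1] + c1*I[t] + c2*O[t]. The sketch below uses that classic formulation as an assumption for illustration; DiffRoute's `muskingum` IRF may be computed differently, and `muskingum_coefficients` is a hypothetical helper.

```python
def muskingum_coefficients(k: float, x: float, dt: float) -> tuple[float, float, float]:
    """Textbook coefficients for O[t+1] = c0*I[t+1] + c1*I[t] + c2*O[t]."""
    if not 0.0 <= x <= 0.5:
        raise ValueError("x must lie in [0.0, 0.5]")
    denom = 2.0 * k * (1.0 - x) + dt
    c0 = (dt - 2.0 * k * x) / denom
    c1 = (dt + 2.0 * k * x) / denom
    c2 = (2.0 * k * (1.0 - x) - dt) / denom
    return c0, c1, c2


# x = 0.5 with k = dt: the outflow is the inflow shifted by one step.
translation = muskingum_coefficients(k=1.0, x=0.5, dt=1.0)
print(translation)  # (0.0, 1.0, 0.0)

# x = 0: the outflow blends current inflow, previous inflow, and previous outflow.
reservoir = muskingum_coefficients(k=1.0, x=0.0, dt=1.0)
print(reservoir)
```

In both cases the three coefficients sum to 1, so the scheme conserves volume.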

Summed Q' Baseline

summed_q_prime

Type: str or null Default: null

Path to a pre-computed summed Q' zarr store for baseline comparison. The store is generated by scripts/summed_q_prime.py and contains lateral inflow sums at each gage (i.e., streamflow without routing).

When provided, the benchmark will:

  1. Load the summed Q' predictions from the zarr store
  2. Align gage IDs between the benchmark and the store
  3. Compute NSE/KGE/RMSE metrics for the summed Q' baseline
  4. Include summed Q' in all comparison plots alongside DDR and DiffRoute

This is useful for quantifying how much value routing adds over raw lateral inflow summation.
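
The alignment and scoring steps above can be sketched in plain Python. This is a simplified illustration, not the benchmark's code: the gage IDs and flow values are invented, the store is represented as a dictionary rather than a zarr array, and the metric functions use common textbook definitions of NSE, KGE, and RMSE.

```python
import math
from statistics import fmean, pstdev


def nse(obs: list[float], sim: list[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    mean_obs = fmean(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


def rmse(obs: list[float], sim: list[float]) -> float:
    """Root-mean-square error."""
    return math.sqrt(fmean([(o - s) ** 2 for o, s in zip(obs, sim)]))


def kge(obs: list[float], sim: list[float]) -> float:
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    mo, ms = fmean(obs), fmean(sim)
    cov = fmean([(o - mo) * (s - ms) for o, s in zip(obs, sim)])
    r = cov / (pstdev(obs) * pstdev(sim))
    alpha = pstdev(sim) / pstdev(obs)
    beta = ms / mo
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical data: observations keyed by gage ID, and a summed Q' "store".
observations = {
    "01234567": [1.0, 3.0, 2.0, 5.0, 4.0],
    "02345678": [2.0, 4.0, 6.0, 8.0],
    "03456789": [1.0, 2.0, 3.0],
}
summed_q_prime = {
    "02345678": [3.0, 3.0, 7.0, 7.0],
    "01234567": [1.0, 3.0, 2.0, 5.0, 4.0],
    "09999999": [0.5, 0.5, 0.5],
}

# Keep only gages present in both, in the benchmark's order.
common_ids = [gage for gage in observations if gage in summed_q_prime]

results = {
    gage: {
        "nse": nse(observations[gage], summed_q_prime[gage]),
        "kge": kge(observations[gage], summed_q_prime[gage]),
        "rmse": rmse(observations[gage], summed_q_prime[gage]),
    }
    for gage in common_ids
}
for gage, metrics in results.items():
    print(gage, {name: round(value, 3) for name, value in metrics.items()})
```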

# Generate summed Q' store
python scripts/summed_q_prime.py

# Use in benchmark
cd benchmarks
uv run python scripts/benchmark.py \
    summed_q_prime=/path/to/summed_q_prime.zarr

Command-Line Overrides

Override any configuration option from the command line:

cd benchmarks

# Change DiffRoute parameters
uv run python scripts/benchmark.py diffroute.k=0.1 diffroute.x=0.2

# Change experiment settings
uv run python scripts/benchmark.py \
    experiment.start_time=2000/10/01 \
    experiment.end_time=2005/09/30

# Use a different checkpoint
uv run python scripts/benchmark.py \
    experiment.checkpoint=/path/to/other_model.pt

# Run on a specific GPU
uv run python scripts/benchmark.py device=0

# Disable DiffRoute (DDR only)
uv run python scripts/benchmark.py diffroute.enabled=false

# Include summed Q' baseline
uv run python scripts/benchmark.py summed_q_prime=/path/to/summed_q_prime.zarr
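
Each override is a dotted key path followed by a value. As a rough mental model, the sketch below applies such strings to a nested dictionary. It is a deliberately simplified stand-in, not Hydra's override parser (which has its own grammar and type handling); `parse_value` and `apply_override` are hypothetical helpers.

```python
def parse_value(raw: str):
    """Very rough scalar parsing: booleans, null, numbers, otherwise a string."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    if lowered == "null":
        return None
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def apply_override(config: dict, override: str) -> None:
    """Set a nested key from a `dotted.key=value` string, in place."""
    dotted_key, _, raw_value = override.partition("=")
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = parse_value(raw_value)


config = {
    "device": 0,
    "diffroute": {"enabled": True, "k": 0.1042, "x": 0.3},
    "summed_q_prime": None,
}
for override in [
    "diffroute.k=0.1",
    "diffroute.x=0.2",
    "diffroute.enabled=false",
    "experiment.start_time=2000/10/01",
    "summed_q_prime=/path/to/summed_q_prime.zarr",
]:
    apply_override(config, override)

print(config)
```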

Output Directory

Results are saved to:

output/benchmarks-v0.1.0-merit/2026-02-06_12-00-00/
├── plots/
│   ├── nse_cdf_comparison.png            # CDF of NSE across all gauges
│   ├── kge_cdf_comparison.png            # CDF of KGE across all gauges
│   ├── metric_boxplot_comparison.png     # 6-panel boxplot (bias, rmse, fhv, flv, nse, kge)
│   ├── gauge_map_ddr_NSE.png             # Map colored by DDR NSE
│   ├── gauge_map_diffroute_NSE.png       # Map colored by DiffRoute NSE
│   ├── gauge_map_sqp_NSE.png             # Map colored by summed Q' NSE (if enabled)
│   └── hydrographs/                      # Per-gage time series with all models overlaid
│       ├── 01234567.png
│       └── ...
├── benchmark_results.zarr
└── .hydra/
    ├── config.yaml
    └── overrides.yaml
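
Because each run directory is named with a timestamp, the most recent run can be found by sorting directory names. The sketch below assumes the layout shown above and uses only `pathlib`; `latest_run` and `hydrograph_gage_ids` are hypothetical helpers, and the `output` root and run name are taken from the example tree rather than from any guaranteed default.

```python
from pathlib import Path
from typing import Optional


def latest_run(output_root: Path, run_name: str) -> Optional[Path]:
    """Return the newest timestamped run directory, or None if there is none."""
    run_dirs = [p for p in (output_root / run_name).glob("*") if p.is_dir()]
    # Names like 2026-02-06_12-00-00 sort chronologically as plain strings.
    return max(run_dirs, key=lambda p: p.name, default=None)


def hydrograph_gage_ids(run_dir: Path) -> list[str]:
    """Gage IDs recovered from plots/hydrographs/<gage_id>.png file names."""
    return sorted(p.stem for p in (run_dir / "plots" / "hydrographs").glob("*.png"))


run_dir = latest_run(Path("output"), "benchmarks-v0.1.0-merit")
if run_dir is None:
    print("No benchmark runs found")
else:
    print(run_dir, hydrograph_gage_ids(run_dir))
```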