Hot-Start Initialization¶
When DDR begins routing, it needs an initial discharge value at every node in the river network. This page explains the hot start mechanism that computes a physically reasonable initial state.
The Problem¶
Each routing batch requires an initial discharge \(Q_0(i)\) at every segment \(i\) before the first timestep can be routed. A naive approach sets \(Q_0(i) = Q'_0(i)\) — the local lateral inflow at time \(t{=}0\). This is a poor estimate because downstream segments should carry the accumulated flow from all upstream tributaries, not just their own local contribution.
Consider a simple linear network with 5 reaches, each contributing 2 m\(^3\)/s of lateral inflow:
| Node | Naive \(Q_0\) | Correct \(Q_0\) |
|---|---|---|
| 0 (headwater) | 2 | 2 |
| 1 | 2 | 4 |
| 2 | 2 | 6 |
| 3 | 2 | 8 |
| 4 (outlet) | 2 | 10 |
The naive approach underestimates discharge at every non-headwater node, creating an artificial "ramp-up" period at the start of each training window.
The Solution: Topological Accumulation¶
The hot-start computes accumulated discharge by solving:
Where:
- \(\mathbf{N}\) is the adjacency matrix (lower triangular, \(N_{ij} = 1\) if flow goes from \(j\) to \(i\))
- \(\mathbf{Q}'_0\) is the lateral inflow at \(t{=}0\)
- \(\mathbf{Q}_0\) is the initial discharge we want
Expanding for a single node \(i\):
Because nodes are indexed in topological order (headwaters first, outlets last) and \(\mathbf{I} - \mathbf{N}\) is lower triangular, this system is solved efficiently via forward substitution using triangular_sparse_solve — the same sparse solver used for each routing timestep.
When It Applies¶
| Scenario | Initialization |
|---|---|
| Training (every batch) | Hot-start via topological accumulation |
| Inference (first batch) | Hot-start via topological accumulation |
| Inference (subsequent batches) | State carried from previous batch (carry_state=True) |
Training¶
Each training batch samples a random time window. There is no state to carry between batches, so every batch uses the hot-start. Combined with the warmup period (cfg.experiment.warmup), this gives the model a realistic starting point to route from.
Inference¶
The first batch uses the hot-start. All subsequent batches pass carry_state=True, which preserves the discharge state from the end of the previous batch — maintaining physical continuity across the full simulation period.
Implementation¶
The hot-start is implemented in compute_hotstart_discharge() in src/ddr/routing/mmc.py:
from ddr.routing.mmc import compute_hotstart_discharge
discharge_t0 = compute_hotstart_discharge(
q_prime_t0=q_prime[0], # lateral inflow at t=0
mapper=mapper, # PatternMapper from the network
discharge_lb=discharge_lb,
device=device,
)
The function:
- Constructs the \((\mathbf{I} - \mathbf{N})\) matrix using the same
PatternMapperandfill_opinfrastructure asroute_timestep - Solves the lower-triangular system via
triangular_sparse_solve - Clamps the result to
discharge_lb(physical lower bound)
Differentiability¶
triangular_sparse_solve has a custom backward pass, so gradients flow through the hot-start initialization during training.
Relationship to Summed Q'¶
The hot-start solves the same mathematical problem as Summed Lateral Flow, but at runtime rather than as a preprocessing step. The sparse solve approach avoids needing to precompute and store accumulated flows in the dataset, and naturally handles arbitrary subnetworks extracted for gauge-based training.