Martin Gauch


2022

The Great Lakes Runoff Intercomparison Project Phase 4: The Great Lakes (GRIP-GL)
Juliane Mai, Helen C. Shen, Bryan A. Tolson, Étienne Gaborit, Richard Arsenault, James R. Craig, Vincent Fortin, Lauren M. Fry, Martin Gauch, Daniel Klotz, Frederik Kratzert, Nicole O'Brien, Daniel Princz, Sinan Rasiya Koya, Tirthankar Roy, Frank Seglenieks, Narayan Kumar Shrestha, André Guy Tranquille Temgoua, Vincent Vionnet, Jonathan W. Waddell
Hydrology and Earth System Sciences

Abstract. Model intercomparison studies are carried out to test and compare the simulated outputs of various model setups over the same study domain. The Great Lakes region is such a domain of high public interest as it not only resembles a challenging region to model with its trans-boundary location, strong lake effects, and regions of strong human impact but is also one of the most densely populated areas in the United States and Canada. This study brought together a wide range of researchers setting up their models of choice in a highly standardized experimental setup using the same geophysical datasets, forcings, common routing product, and locations of performance evaluation across the 1 million square kilometer study domain. The study comprises 13 models covering a wide range of model types from Machine Learning based, basin-wise, subbasin-based, and gridded models that are either locally or globally calibrated or calibrated for one of each of six predefined regions of the watershed. Unlike most hydrologically focused model intercomparisons, this study not only compares models regarding their capability to simulate streamflow (Q) but also evaluates the quality of simulated actual evapotranspiration (AET), surface soil moisture (SSM), and snow water equivalent (SWE). The latter three outputs are compared against gridded reference datasets. The comparisons are performed in two ways: either by aggregating model outputs and the reference to basin-level or by regridding all model outputs to the reference grid and comparing the model simulations at each grid-cell. The main results of this study are: (1) The comparison of models regarding streamflow reveals the superior quality of the Machine Learning based model in all experiments; even for the most challenging spatio-temporal validation, the ML model outperforms any other physically based model.
(2) While the locally calibrated models lead to good performance in calibration and temporal validation (even outperforming several regionally calibrated models), they lose performance when they are transferred to locations the model has not been calibrated on. This is likely to be improved with more advanced strategies to transfer these models in space. (3) The regionally calibrated models – while losing less performance in spatial and spatio-temporal validation than locally calibrated models – exhibit low performances in highly regulated and urban areas as well as agricultural regions in the US. (4) Comparisons of additional model outputs (AET, SSM, SWE) against gridded reference datasets show that aggregating model outputs and the reference dataset to basin scale can lead to different conclusions than a comparison at the native grid scale. This is especially true for variables with large spatial variability such as SWE. (5) A multi-objective-based analysis of the model performances across all variables (Q, AET, SSM, SWE) reveals overall excellent performing locally calibrated models (i.e., HYMOD2-lumped) as well as regionally calibrated models (i.e., MESH-SVS-Raven and GEM-Hydro-Watroute) due to varying reasons. The Machine Learning based model was not included here as it is not set up to simulate AET, SSM, and SWE. (6) All basin-aggregated model outputs and observations for the model variables evaluated in this study are available on an interactive website that enables users to visualize results and download data and model outputs.

The Great Lakes Runoff Intercomparison Project Phase 4: the Great Lakes (GRIP-GL)
Juliane Mai, Helen C. Shen, Bryan A. Tolson, Étienne Gaborit, Richard Arsenault, James R. Craig, Vincent Fortin, Lauren M. Fry, Martin Gauch, Daniel Klotz, Frederik Kratzert, Nicole O'Brien, Daniel Princz, Sinan Rasiya Koya, Tirthankar Roy, Frank Seglenieks, Narayan Kumar Shrestha, André Guy Tranquille Temgoua, Vincent Vionnet, Jonathan W. Waddell
Hydrology and Earth System Sciences, Volume 26, Issue 13

Abstract. Model intercomparison studies are carried out to test and compare the simulated outputs of various model setups over the same study domain. The Great Lakes region is such a domain of high public interest as it not only resembles a challenging region to model with its transboundary location, strong lake effects, and regions of strong human impact but is also one of the most densely populated areas in the USA and Canada. This study brought together a wide range of researchers setting up their models of choice in a highly standardized experimental setup using the same geophysical datasets, forcings, common routing product, and locations of performance evaluation across the 1 × 10⁶ km² study domain. The study comprises 13 models covering a wide range of model types from machine-learning-based, basin-wise, subbasin-based, and gridded models that are either locally or globally calibrated or calibrated for one of each of the six predefined regions of the watershed. Unlike most hydrologically focused model intercomparisons, this study not only compares models regarding their capability to simulate streamflow (Q) but also evaluates the quality of simulated actual evapotranspiration (AET), surface soil moisture (SSM), and snow water equivalent (SWE). The latter three outputs are compared against gridded reference datasets. The comparisons are performed in two ways – either by aggregating model outputs and the reference to basin level or by regridding all model outputs to the reference grid and comparing the model simulations at each grid-cell. The main results of this study are as follows: The comparison of models regarding streamflow reveals the superior quality of the machine-learning-based model in the performance of all experiments; even for the most challenging spatiotemporal validation, the machine learning (ML) model outperforms any other physically based model.
While the locally calibrated models lead to good performance in calibration and temporal validation (even outperforming several regionally calibrated models), they lose performance when they are transferred to locations that the model has not been calibrated on. This is likely to be improved with more advanced strategies to transfer these models in space. The regionally calibrated models – while losing less performance in spatial and spatiotemporal validation than locally calibrated models – exhibit low performances in highly regulated and urban areas and agricultural regions in the USA. Comparisons of additional model outputs (AET, SSM, and SWE) against gridded reference datasets show that aggregating model outputs and the reference dataset to the basin scale can lead to different conclusions than a comparison at the native grid scale. The latter is deemed preferable, especially for variables with large spatial variability such as SWE. A multi-objective-based analysis of the model performances across all variables (Q, AET, SSM, and SWE) reveals overall well-performing locally calibrated models (i.e., HYMOD2-lumped) and regionally calibrated models (i.e., MESH-SVS-Raven and GEM-Hydro-Watroute) due to varying reasons. The machine-learning-based model was not included here as it is not set up to simulate AET, SSM, and SWE. All basin-aggregated model outputs and observations for the model variables evaluated in this study are available on an interactive website that enables users to visualize results and download the data and model outputs.
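
The point about basin-scale versus grid-scale comparison can be illustrated with a toy example. The sketch below is not the evaluation code of the study: it assumes equal-area grid cells, uses the mean absolute error as a stand-in metric, and all function names and values are hypothetical. It only shows how spatial errors that cancel in a basin average remain visible at the grid-cell level.

```python
from typing import List

Grid = List[List[float]]  # rows of equal-area grid cells (assumption)


def basin_mean(grid: Grid) -> float:
    """Aggregate a gridded field to one basin value (unweighted mean)."""
    cells = [value for row in grid for value in row]
    return sum(cells) / len(cells)


def basin_level_error(sim: Grid, ref: Grid) -> float:
    """Absolute error after aggregating both fields to the basin scale."""
    return abs(basin_mean(sim) - basin_mean(ref))


def grid_cell_error(sim: Grid, ref: Grid) -> float:
    """Mean absolute error computed cell by cell on the shared grid."""
    diffs = [
        abs(s - r)
        for sim_row, ref_row in zip(sim, ref)
        for s, r in zip(sim_row, ref_row)
    ]
    return sum(diffs) / len(diffs)


# Hypothetical SWE-like fields: the simulation puts the snow in the wrong cells.
reference = [[0.0, 10.0], [0.0, 10.0]]
simulated = [[10.0, 0.0], [10.0, 0.0]]

print(basin_level_error(simulated, reference))  # errors cancel in the basin mean
print(grid_cell_error(simulated, reference))    # errors remain at each grid cell
```

In this constructed case the basin-level comparison reports a perfect match while the grid-cell comparison does not, which is the kind of disagreement the abstract describes for spatially variable fields such as SWE.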

2021

Great Lakes Runoff Intercomparison Project Phase 3: Lake Erie (GRIP-E)
Juliane Mai, Bryan A. Tolson, Helen C. Shen, Étienne Gaborit, Vincent Fortin, Nicolas Gasset, Hervé Awoye, Tricia A. Stadnyk, Lauren M. Fry, Emily A. Bradley, Frank Seglenieks, André Guy Tranquille Temgoua, Daniel Princz, Shervan Gharari, Amin Haghnegahdar, Mohamed Elshamy, Saman Razavi, Martin Gauch, Jimmy Lin, Xiaojing Ni, Yongping Yuan, Meghan McLeod, N. B. Basu, Rohini Kumar, Oldřich Rakovec, Luis Samaniego, Sabine Attinger, Narayan Kumar Shrestha, Prasad Daggupati, Tirthankar Roy, Sungwook Wi, Timothy Hunter, James R. Craig, Alain Pietroniro
Journal of Hydrologic Engineering, Volume 26, Issue 9

Abstract. Hydrologic model intercomparison studies help to evaluate the agility of models to simulate variables such as streamflow, evaporation, and soil moisture. This study is the third in a sequen...

The proper care and feeding of CAMELS: How limited training data affects streamflow prediction
Martin Gauch, Juliane Mai, Jimmy Lin
Environmental Modelling & Software, Volume 135

Accurate streamflow prediction largely relies on historical meteorological records and streamflow measurements. For many regions, however, such data are only scarcely available. Facing this problem, many studies simply trained their machine learning models on the region's available data, leaving possible repercussions of this strategy unclear. In this study, we evaluate the sensitivity of tree- and LSTM-based models to limited training data, both in terms of geographic diversity and different time spans. We feed the models meteorological observations disseminated with the CAMELS dataset, and individually restrict the training period length, number of training basins, and input sequence length. We quantify how additional training data improve predictions and how many previous days of forcings we should feed the models to obtain best predictions for each training set size. Further, our findings show that tree- and LSTM-based models provide similarly accurate predictions on small datasets, while LSTMs are superior given more training data.

Rainfall–runoff prediction at multiple timescales with a single Long Short-Term Memory network
Martin Gauch, Frederik Kratzert, Daniel Klotz, Grey Nearing, Jimmy Lin, Sepp Hochreiter
Hydrology and Earth System Sciences, Volume 25, Issue 4

Abstract. Long Short-Term Memory (LSTM) networks have been applied to daily discharge prediction with remarkable success. Many practical applications, however, require predictions at more granular timescales. For instance, accurate prediction of short but extreme flood peaks can make a lifesaving difference, yet such peaks may escape the coarse temporal resolution of daily predictions. Naively training an LSTM on hourly data, however, entails very long input sequences that make learning difficult and computationally expensive. In this study, we propose two multi-timescale LSTM (MTS-LSTM) architectures that jointly predict multiple timescales within one model, as they process long-past inputs at a different temporal resolution than more recent inputs. In a benchmark on 516 basins across the continental United States, these models achieved significantly higher Nash–Sutcliffe efficiency (NSE) values than the US National Water Model. Compared to naive prediction with distinct LSTMs per timescale, the multi-timescale architectures are computationally more efficient with no loss in accuracy. Beyond prediction quality, the multi-timescale LSTM can process different input variables at different timescales, which is especially relevant to operational applications where the lead time of meteorological forcings depends on their temporal resolution.
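
For readers unfamiliar with the metric mentioned above, the Nash–Sutcliffe efficiency is commonly defined as NSE = 1 − Σ(sim − obs)² / Σ(obs − mean(obs))². The following is a minimal sketch of that standard definition, not the evaluation code used in the paper; the function name and the error handling are illustrative choices.

```python
from typing import Sequence


def nse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency of a simulated series against observations.

    1.0 is a perfect fit; 0.0 means the simulation is as good as always
    predicting the mean of the observations; negative values are worse.
    """
    if len(observed) == 0 or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Squared error of the simulation.
    numerator = sum((s - o) ** 2 for o, s in zip(observed, simulated))
    # Variance of the observations around their mean (unnormalized).
    denominator = sum((o - mean_obs) ** 2 for o in observed)
    if denominator == 0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - numerator / denominator


print(nse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # perfect simulation
print(nse([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # always predicts the observed mean
```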

2020

Rainfall–Runoff Prediction at Multiple Timescales with a Single Long Short-Term Memory Network
Martin Gauch, Frederik Kratzert, Daniel Klotz, Grey Nearing, Jimmy Lin, Sepp Hochreiter

Abstract. Long Short-Term Memory Networks (LSTMs) have been applied to daily discharge prediction with remarkable success. Many practical scenarios, however, require predictions at more granular timescales. For instance, accurate prediction of short but extreme flood peaks can make a life-saving difference, yet such peaks may escape the coarse temporal resolution of daily predictions. Naively training an LSTM on hourly data, however, entails very long input sequences that make learning hard and computationally expensive. In this study, we propose two Multi-Timescale LSTM (MTS-LSTM) architectures that jointly predict multiple timescales within one model, as they process long-past inputs at a single temporal resolution and branch out into each individual timescale for more recent input steps. We test these models on 516 basins across the continental United States and benchmark against the US National Water Model. Compared to naive prediction with a distinct LSTM per timescale, the multi-timescale architectures are computationally more efficient with no loss in accuracy. Beyond prediction quality, the multi-timescale LSTM can process different input variables at different timescales, which is especially relevant to operational applications where the lead time of meteorological forcings depends on their temporal resolution.

An Open-Source Interface to the Canadian Surface Prediction Archive
Martin Gauch, James Bai, Juliane Mai, Jimmy Lin
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020

Data-intensive research and decision-making continue to gain adoption across diverse organizations. As researchers and practitioners increasingly rely on analyzing large data products both to answer scientific questions and to meet operational needs, data acquisition and pre-processing become critical tasks. For environmental science, the Canadian Surface Prediction Archive (CaSPAr) facilitates easy access to custom subsets of numerical weather predictions. We demonstrate a new open-source interface for CaSPAr that provides easy-to-use map-based querying capabilities and automates data ingestion into the CaSPAr batch processing server.