2021
The Abuse of Popular Performance Metrics in Hydrologic Modeling
Martyn Clark,
Richard M. Vogel,
Jonathan Lamontagne,
Naoki Mizukami,
Wouter Knoben,
Guoqiang Tang,
Shervan Gharari,
Jim Freer,
Paul H. Whitfield,
Kevin Shook,
Simon Michael Papalexiou
Water Resources Research, Volume 57, Issue 9
The goal of this commentary is to critically evaluate the use of popular performance metrics in hydrologic modeling. We focus on the Nash-Sutcliffe Efficiency (NSE) and the Kling-Gupta Efficiency (KGE) metrics, which are both widely used in hydrologic research and practice around the world. Our specific objectives are: (a) to provide tools that quantify the sampling uncertainty in popular performance metrics; (b) to quantify sampling uncertainty in popular performance metrics across a large sample of catchments; and (c) to prescribe the further research that is needed to improve the estimation, interpretation, and use of popular performance metrics in hydrologic modeling. Our large-sample analysis demonstrates that there is substantial sampling uncertainty in the NSE and KGE estimators. This occurs because the probability distribution of squared errors between model simulations and observations has heavy tails, meaning that performance metrics can be heavily influenced by just a few data points. Our results highlight obvious (yet ignored) abuses of performance metrics that contaminate the conclusions of many hydrologic modeling studies: It is essential to quantify the sampling uncertainty in performance metrics when justifying the use of a model for a specific purpose and when comparing the performance of competing models.
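To make the idea of sampling uncertainty in NSE and KGE concrete, the following is a minimal sketch and not the tool provided by the authors. It assumes the standard NSE definition and the 2009 form of KGE, and it uses a naive bootstrap that resamples individual time steps with replacement. That resampling treats time steps as independent, which real streamflow series generally are not, so the intervals are illustrative only. The function names (`nse`, `kge`, `bootstrap_interval`) and the synthetic data are hypothetical.

```python
import numpy as np


def nse(sim, obs):
    """Nash-Sutcliffe Efficiency: 1 - SSE / sum of squared deviations of obs from its mean."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)


def kge(sim, obs):
    """Kling-Gupta Efficiency (2009 form) from correlation, variability ratio, and bias ratio."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    r = np.corrcoef(sim, obs)[0, 1]      # linear correlation
    alpha = sim.std() / obs.std()        # ratio of standard deviations
    beta = sim.mean() / obs.mean()       # ratio of means
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def bootstrap_interval(metric, sim, obs, n_boot=1000, level=0.90, seed=0):
    """Percentile interval for `metric` from a naive bootstrap over time steps.

    Assumption: (sim, obs) pairs are resampled independently, ignoring
    serial correlation and seasonality.
    """
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    rng = np.random.default_rng(seed)
    n = len(obs)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample indices with replacement
        stats[b] = metric(sim[idx], obs[idx])
    lower, upper = np.quantile(stats, [(1 - level) / 2, (1 + level) / 2])
    return lower, upper


if __name__ == "__main__":
    # Synthetic, heavy-tailed "flows" used purely for illustration.
    rng = np.random.default_rng(42)
    obs = rng.lognormal(mean=0.0, sigma=1.0, size=500)
    sim = obs * rng.lognormal(mean=0.0, sigma=0.3, size=500)
    print("NSE:", nse(sim, obs), "interval:", bootstrap_interval(nse, sim, obs))
    print("KGE:", kge(sim, obs), "interval:", bootstrap_interval(kge, sim, obs))
```

Because squared errors are heavy-tailed, whether a few of the largest flows are drawn in a given resample can shift the metric noticeably, which is the effect the abstract describes.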
2018
Prewhitening, the process of eliminating or reducing short-term stochastic persistence to enable detection of deterministic change, has been extensively applied to time series analysis of a range of geophysical variables. Despite the controversy around its utility, methodologies for prewhitening time series continue to be a critical feature of a variety of analyses, including trend detection of hydroclimatic variables and reconstruction of climate and/or hydrology through proxy records such as tree rings. With a focus on the latter, this paper presents a generalized approach to exploring the impact of a wide range of stochastic structures of short- and long-term persistence on the variability of hydroclimatic time series. Through this approach, we examine the impact of prewhitening on the inferred variability of time series across time scales. We document how a focus on prewhitened, residual time series can be misleading, as it can drastically distort (or remove) the structure of variability across time scales. Through examples with actual data, we show how such loss of information in prewhitened time series of tree rings (so-called “residual chronologies”) can lead to the underestimation of extreme conditions in climate and hydrology, particularly droughts, reconstructed for centuries preceding the historical period.
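As a rough illustration of what prewhitening does, the sketch below removes lag-1 persistence by fitting a first-order autoregressive, AR(1), model and keeping the residuals. This is one common and simple form of prewhitening, assumed here for illustration. It is not necessarily the procedure examined in the paper, which considers a wider range of short- and long-term persistence structures. The function names are hypothetical.

```python
import numpy as np


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a series."""
    x = np.asarray(x, float)
    d = x - x.mean()
    return np.sum(d[1:] * d[:-1]) / np.sum(d ** 2)


def prewhiten_ar1(x):
    """Return AR(1) residuals e_t = d_t - r1 * d_{t-1} and the estimated r1.

    d_t is the series with its mean removed; the residual series is one
    value shorter than the input.
    """
    x = np.asarray(x, float)
    r1 = lag1_autocorr(x)
    d = x - x.mean()
    return d[1:] - r1 * d[:-1], r1


if __name__ == "__main__":
    # Synthetic persistent series (AR(1) with coefficient 0.7), for illustration only.
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(2000)
    x = np.empty(2000)
    x[0] = noise[0]
    for t in range(1, 2000):
        x[t] = 0.7 * x[t - 1] + noise[t]

    resid, r1 = prewhiten_ar1(x)
    print("estimated lag-1 autocorrelation:", r1)
    print("variance before and after:", x.var(), resid.var())
```

In this sketch the residual series has lower variance than the original, since the persistent component has been removed. This is consistent with the abstract's concern that residual series can understate low-frequency variability such as multi-year droughts.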