Source: R/paretoTimeAnalysis.R (documented in paretoTimeAnalysis.Rd)
Description

Identify dominated model realisations: those that are inferior to another model in all simulation periods, optionally across multiple catchments. Summary results are then produced to help assess how the performance of model realisations and model structures varies across simulation periods.
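The dominance test is simple to state: a model realisation is dominated if some other realisation performs better in every simulation period. As a minimal sketch (not the package's implementation), assuming a hypothetical matrix perf with one row per model realisation, one column per simulation period, and higher values better:

perf <- rbind(A = c(0.9, 0.8), B = c(0.7, 0.6), C = c(0.8, 0.9))
## a realisation is dominated if another is strictly better in all periods
dominated <- apply(perf, 1, function(p)
  any(apply(perf, 1, function(q) all(q > p))))
dominated
#>     A     B     C
#> FALSE  TRUE FALSE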
Arguments

...
    Arguments passed on to paretoTimeAnalysis.crossvalidation or
    paretoTimeAnalysis.data.frame.

crossvalidation runlist
    A list of fitted model objects produced by crossValidate.

show.models
    If TRUE, print a table of models with an indication of whether each
    is dominated, as produced by paretoTimeAnalysis_areModelsDominated.
    If not NA, the value is used as a filename prefix and the table is
    written to "<show.models>_isdominated_models_catchments.csv".

objectives
    Vector of column names containing the performance measures used to
    determine whether models are dominated across time periods. Higher
    values are assumed to be better; measures where lower is better
    should be transformed prior to use (see the sketch following this
    argument list).

qoi
    Quantities of interest to calculate, as interpreted by objFunVal.
    The defaults are given as examples: 90th percentile runoff (a scalar
    prediction) and R Squared using log-transformed data (a performance
    statistic).
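As an illustration of the objectives argument, the following hedged sketch transforms a measure where lower is better so that it fits the higher-is-better convention; the stat table and its RMSE column are hypothetical, and the commented call shows intended usage rather than a verified result. The qoi specification likewise shows only an assumed shape: hmadstat("r.sq.log") is a hydromad summary statistic, and objFunVal accepts objective functions of the form function(Q, X, ...).

## hypothetical long-format results: one row per model realisation
## and simulation period
stat <- data.frame(Catchment = "X",
                   calib.period = rep(c("P1", "P2"), each = 2),
                   Model.str = rep(c("M1", "M2"), each = 2),
                   sim.period = rep(c("P1", "P2"), 2),
                   E = c(0.80, 0.70, 0.75, 0.85),
                   RMSE = c(1.2, 1.5, 1.3, 1.1))
## RMSE is better when lower, so negate it before use
stat$negRMSE <- -stat$RMSE
## paretoTimeAnalysis(stat, objectives = c("E", "negRMSE"))

## assumed shape of a qoi specification, interpreted by objFunVal
qoi <- list(Q90 = function(Q, X, ...) quantile(X, 0.9, na.rm = TRUE),
            r.sq.log = hmadstat("r.sq.log"))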
Value

For paretoTimeAnalysis, there is no return value; the function is used
for its side effect of printing text, and optionally writes csv files
(see the argument show.models).

paretoTimeAnalysis_areModelsDominated produces a wide-format data.frame
with id variable columns, a column for each sim.period value, and a
column dominated indicating whether another model is better in all
simulation periods.
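For example, the non-dominated models can be extracted from that data.frame by filtering on the dominated column (a minimal usage sketch, using the YeAl97 dataset from the examples below):

data(YeAl97)
isdom <- paretoTimeAnalysis_areModelsDominated(
  subset(YeAl97, Catchment == "Salmon"), objectives = "E")
## keep only the non-dominated model realisations
subset(isdom, !dominated)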
See also

crossValidate, for an example use of paretoTimeAnalysis.crossvalidation.

Examples
## Dataset consisting of results for two simulation periods,
## obtained by calibration in the same periods with different
## model structures.
data(YeAl97)
## For one catchment, produce a table indicating whether models defined by
## their calib.period and Model.str are dominated according to the objective E
paretoTimeAnalysis_areModelsDominated(subset(YeAl97, Catchment == "Salmon"), objectives = "E")
#> Catchment calib.period Model.str objective First 5Y Second 5Y dominated
#> 1 Salmon First 5Y GSFB E -Inf -Inf TRUE
#> 2 Salmon First 5Y IHACRES E 0.865 0.774 TRUE
#> 3 Salmon First 5Y LASCAM E 0.892 0.905 FALSE
#> 4 Salmon Second 5Y GSFB E -Inf -Inf TRUE
#> 5 Salmon Second 5Y IHACRES E -0.847 0.819 TRUE
#> 6 Salmon Second 5Y LASCAM E 0.795 0.930 FALSE
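## The dominance flags can be verified by hand from the table above:
## IHACRES calibrated on "First 5Y" scores (0.865, 0.774), while LASCAM
## calibrated on the same period scores (0.892, 0.905), better in both
## simulation periods, so that IHACRES realisation is dominated:
all(c(0.892, 0.905) > c(0.865, 0.774))
#> [1] TRUE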
## For all catchments, run the full Pareto analysis of performance
paretoTimeAnalysis(YeAl97, objectives = "E")
#>
#> Cross-validation Pareto analysis
#> Which models cannot be rejected, due to dataset uncertainty/non-stationarity?
#>
#> == Eliminating Pareto-dominated models ==
#> (worse than another model in all periods)
#>
#> How many models are dominated (and therefore eliminated)
#> Canning Salmon Stones
#> FALSE 3 2 2
#> TRUE 3 4 4
#>
#> Which model structures are non-dominated in each catchment?
#> Proportion of model instances that are non-dominated
#> GSFB IHACRES LASCAM
#> Canning 0 0.5 1
#> Salmon 0 0.0 1
#> Stones 0 0.0 1
#>
#> Specify show.models=TRUE to show non-dominated and dominated models
#> Specify show.models=prefix to obtain csv of whether models are dominated
#>
#> == Performance across all periods ==
#> What is the range of non-dominated performance (RNDP) across all periods?
#> Is it large - is changing datasets causing problems?
#> Catchment value.min value.max RNDP
#> Canning 0.746 0.958 0.212
#> Salmon 0.795 0.930 0.135
#> Stones 0.824 0.886 0.062
#>
#> == Performance in each period ==
#> What is the RNDP in each period?
#> Is it low even though total RNDP is high? Why?
#> Is there reason to believe the objective function is not comparable over time?
#> Catchment sim.period value.min value.max RNDP
#> Canning First 5Y 0.746 0.958 0.212
#> Canning Second 5Y 0.824 0.945 0.121
#> Salmon First 5Y 0.795 0.892 0.097
#> Salmon Second 5Y 0.905 0.930 0.025
#> Stones First 5Y 0.824 0.886 0.062
#> Stones Second 5Y 0.869 0.883 0.014
#>
#> == Worst non-dominated models in each period ==
#> Do any non-dominated models have unacceptable performance?
#> Which non-dominated model has the worst performance in each period? Why?
#> Is it consistently the same dataset? Is there reason for that dataset to be problematic?
#> Is it consistently the same model structure? Should another model structure have been selected?
#> Is it consistently the same calibration objective function? Is it overfitting part of the record?
#>
#> Canning
#> sim.period worst.performance calib.period Model.str First.5Y Second.5Y
#> First 5Y 0.746 Second 5Y LASCAM 0.746 0.945
#> Second 5Y 0.824 First 5Y LASCAM 0.958 0.824
#>
#> Salmon
#> sim.period worst.performance calib.period Model.str First.5Y Second.5Y
#> First 5Y 0.795 Second 5Y LASCAM 0.795 0.930
#> Second 5Y 0.905 First 5Y LASCAM 0.892 0.905
#>
#> Stones
#> sim.period worst.performance calib.period Model.str First.5Y Second.5Y
#> First 5Y 0.824 Second 5Y LASCAM 0.824 0.883
#> Second 5Y 0.869 First 5Y LASCAM 0.886 0.869
#>
#> == Variation in inferred internal behaviour - Range of non-dominated parameters ==
#> Is the range of parameters of non-dominated models large for any single model structure?
#> Does the difference in performance correspond to different internal model behaviour?
#>
#> argument 'pars' missing, skipping section
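## As suggested in the output above, re-run with show.models = TRUE to
## list the non-dominated and dominated models, or pass a character
## prefix to write them to "<prefix>_isdominated_models_catchments.csv":
paretoTimeAnalysis(YeAl97, objectives = "E", show.models = TRUE)
## The parameter-range section was skipped because 'pars' was not
## supplied; a data.frame of matching parameter values (a hypothetical
## 'YeAl97.pars' here) would enable it:
## paretoTimeAnalysis(YeAl97, objectives = "E", pars = YeAl97.pars)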