| Type: | Package |
| Title: | Extreme Value Modeling for r-Largest Order Statistics |
| Version: | 0.1.0 |
| Description: | Tools for extreme value modeling based on the r-largest order statistics framework. The package provides functions for parameter estimation via maximum likelihood, return level estimation with standard errors, profile likelihood-based confidence intervals, random sample generation, and entropy difference tests for selecting the number of order statistics r. Several r-largest order statistics models are implemented, including the four-parameter kappa (rK4D), generalized logistic (rGLO), generalized Gumbel (rGGD), logistic (rLD), and Gumbel (rGD) distributions. The rK4D methodology is described in Shin et al. (2022) <doi:10.1016/j.wace.2022.100533>, the rGLO model in Shin and Park (2024) <doi:10.1007/s00477-023-02642-7>, and the rGGD model in Shin and Park (2025) <doi:10.1038/s41598-024-83273-y>. The underlying distributions are related to the kappa distribution of Hosking (1994) <doi:10.1017/CBO9780511529443>, the generalized logistic distribution discussed by Ahmad et al. (1988) <doi:10.1016/0022-1694(88)90015-7>, and the generalized Gumbel distribution of Jeong et al. (2014) <doi:10.1007/s00477-014-0865-8>. Penalized likelihood approaches for extreme value estimation follow Martins and Stedinger (2000) <doi:10.1029/1999WR900330> and Coles and Dixon (1999) <doi:10.1023/A:1009905222644>. Selection of r is supported using methods discussed in Bader et al. (2017) <doi:10.1007/s11222-016-9697-3>. The package is intended for hydrological, climatological, and environmental extreme value analysis. |
| License: | GPL-3 |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.3 |
| Depends: | R (≥ 4.1.0) |
| Imports: | eva, graphics, lmomco, numDeriv, Rsolnp, stats |
| Suggests: | testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| LazyData: | true |
| URL: | https://github.com/yire-shin/evmr |
| BugReports: | https://github.com/yire-shin/evmr/issues |
| NeedsCompilation: | no |
| Packaged: | 2026-03-25 01:20:07 UTC; user |
| Author: | Yire Shin |
| Maintainer: | Yire Shin <shinyire87@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-03-29 16:40:08 UTC |
Bangkok Rainfall Data
Description
Annual top five daily rainfall events recorded in Bangkok, Thailand, from 1961 to 2018. The dataset contains the five largest daily rainfall amounts observed each year.
Usage
bangkok
Format
A data frame with 58 rows and 5 columns:
- X1
Largest daily rainfall in the year (mm)
- X2
Second largest daily rainfall (mm)
- X3
Third largest daily rainfall (mm)
- X4
Fourth largest daily rainfall (mm)
- X5
Fifth largest daily rainfall (mm)
Details
The data are commonly used for extreme value analysis based on r-largest order statistics.
Each row corresponds to one year from 1961 to 2018 and contains the five largest daily rainfall observations recorded in that year.
Source
Rain gauge station records from Bangkok, Thailand.
References
Shin, Y. and Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
Examples
data(bangkok)
head(bangkok)
Bevern Stream Flow Data
Description
Annual r-largest stream flow observations from the Bevern River in the UK. The dataset contains the three largest daily stream flow values recorded in each year.
Usage
bevern
Format
A data frame with 52 rows and 4 columns:
- Year
Year of observation
- r1
Largest daily stream flow in the year
- r2
Second largest daily stream flow
- r3
Third largest daily stream flow
Details
This dataset is commonly used for extreme value analysis based on r-largest order statistics.
The data represent annual r-largest daily stream flow observations from the Bevern River. Each row corresponds to one year and contains the three largest daily stream flow measurements recorded in that year.
Source
United Kingdom hydrological records of daily stream flow observations.
References
Shin, Y. and Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
Examples
data(bevern)
head(bevern)
Fit and Compare r-Largest Order Statistics Models
Description
Fit multiple extreme value models for r-largest order statistics and return a combined summary table including parameter estimates, standard errors, and return levels.
Usage
evmr(data, models = c("rk4d", "rglo", "rggd", "rgd", "rld"), num_inits = 100)
Arguments
data |
A vector, matrix, or data frame containing r-largest order statistics. |
models |
Character vector specifying models to fit. |
num_inits |
Number of random initial values used in optimization. |
Value
A data frame summarizing fitted models.
Examples
x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
evmr(x$rmat)
data(bangkok)
evmr(bangkok)
Oykel River Stream Flow Data
Description
Annual r-largest daily stream flow observations from the Oykel River in the United Kingdom. The dataset contains the three largest daily stream flow values recorded in each year.
Usage
oykel
Format
A data frame with 42 rows and 4 variables:
- Year
Year of observation
- r1
Largest daily stream flow in the year
- r2
Second largest daily stream flow
- r3
Third largest daily stream flow
Details
The data are used for extreme value analysis based on r-largest order statistics models.
Each row represents one year and contains the three largest
daily stream flow observations recorded in that year.
Missing observations are represented by NA.
Source
United Kingdom hydrological records of daily stream flow observations.
References
Shin, Y. and Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y
Examples
data(oykel)
head(oykel)
Quantile Function of the Gumbel Distribution
Description
Computes the quantiles of the Gumbel distribution with location
parameter loc and scale parameter scale.
Usage
qgd(p, loc = 0, scale = 1)
Arguments
p |
A numeric vector of probabilities. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
Details
The quantile function of the Gumbel distribution is
Q(p) = \mu - \sigma \log(-\log(p)),
where \mu is the location parameter and \sigma > 0
is the scale parameter.
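The formula above can be written directly in base R. This is a minimal sketch for illustration, not the package's own implementation; the helper name `gumbel_quantile` is hypothetical.

```r
# Hypothetical helper: Gumbel quantile Q(p) = mu - sigma * log(-log(p)).
gumbel_quantile <- function(p, loc = 0, scale = 1) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  loc - scale * log(-log(p))
}

p <- c(0.1, 0.5, 0.9)
q <- gumbel_quantile(p, loc = 0, scale = 1)

# Round trip through the Gumbel distribution function F(x) = exp(-exp(-x)).
exp(-exp(-q))
```

Applying the distribution function to the computed quantiles should recover the input probabilities, which is a quick consistency check on the formula.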
Value
A numeric vector of quantiles corresponding to p.
Examples
qgd(0.5, loc = 0, scale = 1)
qgd(c(0.1, 0.5, 0.9), loc = 0, scale = 1)
Quantile Function of the Generalized Gumbel Distribution
Description
Computes the quantiles of the generalized Gumbel distribution
with location parameter loc, scale parameter scale,
and shape parameter shape.
Usage
qggd(p, loc = 0, scale = 1, shape = 0)
Arguments
p |
A numeric vector of probabilities. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
shape |
A numeric value specifying the shape parameter. |
Details
The quantile function is computed as
Q(p) = \mu - \sigma \log \left( \frac{1 - p^h}{h} \right), \quad h \neq 0,
with the limiting case
Q(p) = \mu - \sigma \log(-\log p), \quad h = 0,
where \mu is the location parameter, \sigma > 0 is the
scale parameter, and h is the shape parameter.
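A base R sketch of the two cases may help show how the h = 0 limit connects to the general formula. The helper name `ggd_quantile` and the tolerance used to detect h = 0 are assumptions made for this illustration, not part of the package.

```r
# Hypothetical helper: generalized Gumbel quantile function.
# For h != 0: Q(p) = mu - sigma * log((1 - p^h) / h)
# For h == 0: Q(p) = mu - sigma * log(-log(p))   (Gumbel limit)
ggd_quantile <- function(p, loc = 0, scale = 1, shape = 0, tol = 1e-8) {
  stopifnot(all(p > 0 & p < 1), scale > 0, length(shape) == 1)
  if (abs(shape) < tol) {
    return(loc - scale * log(-log(p)))
  }
  loc - scale * log((1 - p^shape) / shape)
}

p <- c(0.1, 0.5, 0.9)
ggd_quantile(p, shape = 0.1)

# As the shape parameter shrinks, the quantiles approach the Gumbel case.
ggd_quantile(p, shape = 1e-6) - ggd_quantile(p, shape = 0)
```

The term (1 - p^h)/h is positive for both positive and negative h, so the logarithm is defined on either side of zero.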
Value
A numeric vector of quantiles corresponding to p.
References
Jeong, B.-Y., Murshed, M. S., Seo, Y. A., and Park, J.-S. (2014). A three-parameter kappa distribution with hydrologic application: a generalized Gumbel distribution. Stochastic Environmental Research and Risk Assessment, 28(8), 2063–2074.
Examples
qggd(0.5, loc = 0, scale = 1, shape = 0.1)
qggd(c(0.1, 0.5, 0.9), loc = 0, scale = 1, shape = 0.1)
Quantile Function of the Generalized Logistic Distribution
Description
Computes the quantiles of the generalized logistic distribution
with location parameter loc, scale parameter scale,
and shape parameter shape.
Usage
qglo(p, loc = 0, scale = 1, shape = 0)
Arguments
p |
A numeric vector of probabilities. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
shape |
A numeric value specifying the shape parameter. |
Details
The quantile function is computed as
Q(p) = \mu + \frac{\sigma}{\xi}\left[1 - \left(\frac{1-p}{p}\right)^{\xi}\right], \quad \xi \neq 0,
with the limiting case
Q(p) = \mu - \sigma \log\left(\frac{1-p}{p}\right), \quad \xi = 0,
where \mu is the location parameter, \sigma > 0 is the
scale parameter, and \xi is the shape parameter.
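The following base R sketch evaluates both branches of the formula. The helper name `glo_quantile` and the zero-shape tolerance are assumptions for illustration only.

```r
# Hypothetical helper: generalized logistic quantile function.
# For xi != 0: Q(p) = mu + sigma / xi * (1 - ((1 - p) / p)^xi)
# For xi == 0: Q(p) = mu - sigma * log((1 - p) / p)
glo_quantile <- function(p, loc = 0, scale = 1, shape = 0, tol = 1e-8) {
  stopifnot(all(p > 0 & p < 1), scale > 0, length(shape) == 1)
  odds <- (1 - p) / p
  if (abs(shape) < tol) {
    return(loc - scale * log(odds))
  }
  loc + scale / shape * (1 - odds^shape)
}

glo_quantile(c(0.1, 0.5, 0.9), loc = 0, scale = 1, shape = 0.1)

# At p = 0.5 the odds term equals 1, so the median is the location parameter.
glo_quantile(0.5, loc = 3, scale = 2, shape = 0.1)
```

With a zero shape parameter the expression reduces to the ordinary logistic quantile, which base R provides as `qlogis()`.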
Value
A numeric vector of quantiles corresponding to p.
References
Ahmad, M. I., Sinclair, C. D., and Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7
Examples
qglo(0.5, loc = 0, scale = 1, shape = 0.1)
qglo(c(0.1, 0.5, 0.9), loc = 0, scale = 1, shape = 0.1)
Quantile Function of the Four-Parameter Kappa Distribution
Description
Computes the quantiles of the four-parameter kappa distribution
with location parameter loc, scale parameter scale,
and shape parameters shape1 and shape2.
Usage
qk4d(p, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
Arguments
p |
A numeric vector of probabilities. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
shape1 |
A numeric value specifying the first shape parameter. |
shape2 |
A numeric value specifying the second shape parameter. |
Details
The quantile function of the four-parameter kappa distribution is
Q(p) = \mu + \frac{\sigma}{\xi}\left[1 - \left(\frac{1-p^h}{h}\right)^\xi \right],
where \mu is the location parameter, \sigma > 0 is the
scale parameter, and \xi and h are shape parameters.
For numerical stability, the limiting cases \xi = 0 and/or
h = 0 are handled separately.
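One way to handle the limiting cases is sketched below in base R. The helper name `k4d_quantile`, the tolerance, and the mapping of `shape1` to \xi and `shape2` to h are assumptions made for this illustration; the package may organise the computation differently.

```r
# Hypothetical helper: four-parameter kappa quantile with limiting cases.
# Assumed mapping: shape1 = xi, shape2 = h.
k4d_quantile <- function(p, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1,
                         tol = 1e-8) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  # Inner term: (1 - p^h) / h, which tends to -log(p) as h -> 0.
  y <- if (abs(shape2) < tol) -log(p) else (1 - p^shape2) / shape2
  # Outer term: (1 - y^xi) / xi, which tends to -log(y) as xi -> 0.
  if (abs(shape1) < tol) {
    loc - scale * log(y)
  } else {
    loc + scale / shape1 * (1 - y^shape1)
  }
}

p <- c(0.1, 0.5, 0.9)
k4d_quantile(p, shape1 = 0.1, shape2 = 0.1)
k4d_quantile(p, shape1 = 0, shape2 = 0)     # both limits: Gumbel quantiles
k4d_quantile(p, shape1 = 0.1, shape2 = -1)  # h = -1: generalized logistic form
```

Setting h = -1 turns the inner term into (1 - p)/p, so the formula coincides with the generalized logistic quantile, and setting both shapes to zero gives the Gumbel quantile.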
Value
A numeric vector of quantiles corresponding to p.
References
Shin, Y., and Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
Hosking, J. R. M. (1994). The four-parameter Kappa distribution. Cambridge University Press.
Examples
qk4d(0.5, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
qk4d(c(0.1, 0.5, 0.9), loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
Quantile Function of the Logistic Distribution
Description
Computes the quantiles of the logistic distribution with location
parameter loc and scale parameter scale.
Usage
qld(p, loc = 0, scale = 1)
Arguments
p |
A numeric vector of probabilities. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
Details
The quantile function of the logistic distribution is
Q(p) = \mu + \sigma \log\left(\frac{p}{1-p}\right),
where \mu is the location parameter and \sigma > 0
is the scale parameter.
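The formula can be checked against base R's `qlogis()`, which uses the same location and scale parametrisation. The helper name `logistic_quantile` is hypothetical and is only used for this comparison.

```r
# Hypothetical helper: logistic quantile Q(p) = mu + sigma * log(p / (1 - p)).
logistic_quantile <- function(p, loc = 0, scale = 1) {
  stopifnot(all(p > 0 & p < 1), scale > 0)
  loc + scale * log(p / (1 - p))
}

p <- c(0.1, 0.5, 0.9)
logistic_quantile(p, loc = 2, scale = 3)

# Should agree with the built-in logistic quantile function.
all.equal(logistic_quantile(p, loc = 2, scale = 3),
          qlogis(p, location = 2, scale = 3))
```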
Value
A numeric vector of quantiles corresponding to p.
Examples
qld(0.5, loc = 0, scale = 1)
qld(c(0.1, 0.5, 0.9), loc = 0, scale = 1)
Fit the Gumbel Distribution to r-Largest Order Statistics
Description
Fits the Gumbel distribution to r-largest order statistics using
maximum likelihood estimation. Stationary and non-stationary models are
supported through generalized linear modelling of the location and scale
parameters.
Usage
rgd.fit(
xdat,
r = NULL,
ydat = NULL,
mul = NULL,
sigl = NULL,
mulink = identity,
siglink = identity,
num_inits = 100,
muinit = NULL,
siginit = NULL,
show = TRUE,
method = "Nelder-Mead",
maxit = 10000,
...
)
Arguments
xdat |
A numeric vector, matrix, or data frame of observations.
Each row should contain decreasing order statistics for a given year
or block. The first column therefore contains block maxima. Only the
first |
r |
The number of largest order statistics to use in the fitted model.
If |
ydat |
A matrix or data frame of covariates for non-stationary modelling
of the parameters, or |
mul, sigl |
Integer vectors indicating which columns of |
mulink, siglink |
Inverse link functions for the location and scale parameters, respectively. |
num_inits |
The number of initial parameter sets used in the optimization. |
muinit, siginit |
Numeric vectors giving initial values for the location
and scale parameters. If |
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for |
... |
Additional control arguments passed to the optimizer. |
Value
A list with components including:
trans |
Logical; |
model |
A list containing |
link |
A character string describing the inverse link functions. |
conv |
The convergence code returned by the optimizer. A value of 0
indicates successful convergence for |
nllh |
The negative log-likelihood evaluated at the fitted parameters. |
data |
The data used in the fit. |
mle |
The maximum likelihood estimates. |
cov |
The estimated covariance matrix. |
se |
The estimated standard errors. |
vals |
A matrix containing fitted values of the location and scale parameters at each observation. |
r |
The number of order statistics used in the fitted model. |
References
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y
Examples
x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rgd.fit(x$rmat)
Profile Likelihood for Return Levels under the rGD Model
Description
Computes and plots the profile log-likelihood for a return level under
a stationary r-largest Gumbel distribution model fitted by rgd.fit().
Usage
rgd.prof(z, m, xlow, xup, conf = 0.95, nint = 100)
Arguments
z |
An object returned by rgd.fit. |
m |
A return period greater than 1. The profile likelihood is computed
for the corresponding return level exceeded with probability 1/m. |
xlow, xup |
The lower and upper bounds of the return level grid over which the profile likelihood is evaluated. |
conf |
A numeric vector of confidence levels for profile likelihood confidence intervals. |
nint |
The number of grid points used to evaluate the profile likelihood. |
Details
The function evaluates the profile log-likelihood over a grid of return
level values and plots the resulting curve. Horizontal and vertical lines
are added to indicate profile likelihood confidence intervals for the
confidence levels specified in conf.
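The grid evaluation described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: the function names are hypothetical, the likelihood is the zero-shape case of the r-largest likelihood in Coles (2001), only a single confidence level is handled, and the plot is omitted.

```r
# Negative log-likelihood of the r-largest Gumbel model.
# x is a matrix whose rows hold decreasing order statistics.
rl_gumbel_nll <- function(mu, sigma, x) {
  if (sigma <= 0) return(Inf)
  z <- (x - mu) / sigma
  sum(exp(-z[, ncol(x)])) + length(x) * log(sigma) + sum(z)
}

profile_return_level <- function(x, m, xlow, xup, conf = 0.95, nint = 100) {
  x <- as.matrix(x)
  p <- 1 - 1 / m
  grid <- seq(xlow, xup, length.out = nint)
  sigma_range <- c(0.05, 10) * sd(as.vector(x))  # assumed search range
  # For a fixed return level xm, the location is a function of sigma:
  # xm = mu - sigma * log(-log(p))  =>  mu = xm + sigma * log(-log(p))
  pl <- vapply(grid, function(xm) {
    nll_sigma <- function(sigma) {
      rl_gumbel_nll(xm + sigma * log(-log(p)), sigma, x)
    }
    -optimize(nll_sigma, interval = sigma_range)$objective
  }, numeric(1))
  # Keep grid points whose profile log-likelihood lies within
  # qchisq(conf, 1) / 2 of the maximum.
  inside <- grid[pl >= max(pl) - qchisq(conf, df = 1) / 2]
  list(grid = grid, pl = pl, lower = min(inside), upper = max(inside))
}

# Simulated r-largest Gumbel data (n = 50, r = 2, loc = 10, scale = 2).
set.seed(123)
w <- t(apply(matrix(runif(100), ncol = 2), 1, cumprod))
x <- 10 - 2 * log(-log(w))

prof <- profile_return_level(x, m = 100, xlow = 12, xup = 30)
c(lower = prof$lower, upper = prof$upper)
```

If either limit coincides with `xlow` or `xup`, the grid is too narrow and the interval is truncated, so the bounds should be widened and the profile recomputed.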
Value
A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.
Examples
x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rgd.fit(x$rmat)
rgd.prof(fit, m = 100, xlow = 12, xup = 25)
Return Levels for the Gumbel Distribution
Description
Computes return levels and their standard errors for a stationary
Gumbel model fitted by rgd.fit.
Usage
rgd.rl(z, year = c(20, 50, 100, 200), show = TRUE)
Arguments
z |
An object returned by rgd.fit. |
year |
A numeric vector of return periods for which return levels are to be computed. |
show |
Logical. If |
Details
For a return period T, the return level is defined as the quantile
exceeded with probability 1/T. Under the Gumbel distribution, the
return level is
x_T = \mu - \sigma \log\{-\log(1 - 1/T)\}.
Standard errors are obtained using the delta method.
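As a worked illustration of the delta method for this formula, the gradient of x_T with respect to (\mu, \sigma) is (1, -\log\{-\log(1 - 1/T)\}), and the variance of the return level is the quadratic form of that gradient with the parameter covariance matrix. The sketch below is base R only; the function name and the numerical estimates and covariance matrix are made up for the example.

```r
# Hypothetical helper: Gumbel return levels with delta-method standard errors.
gumbel_return_levels <- function(mle, cov, year = c(20, 50, 100, 200)) {
  y <- -log(-log(1 - 1 / year))   # reduced variate for each return period
  rl <- mle[1] + mle[2] * y       # x_T = mu - sigma * log(-log(1 - 1/T))
  grad <- cbind(1, y)             # one gradient row per return period
  rlse <- sqrt(rowSums((grad %*% cov) * grad))
  data.frame(year = year, rl = rl, rlse = rlse)
}

# Illustrative values only, not output from a real fit.
mle <- c(10, 2)
cov <- matrix(c(0.09, 0.02,
                0.02, 0.04), nrow = 2, byrow = TRUE)

gumbel_return_levels(mle, cov)
```

Because the reduced variate grows with T, the contribution of the scale variance to the standard error increases for longer return periods.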
Value
The input object z with two additional components:
rl |
A numeric vector of estimated return levels. |
rlse |
A numeric vector of standard errors of the estimated return levels. |
Examples
x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rgd.fit(x$rmat)
out <- rgd.rl(fit, year = c(20, 50, 100, 200))
Summary of Fitted rGD Models over Different Values of r
Description
Summarizes fitted Gumbel distribution models for r-largest order
statistics over r = 1, \dots, R. For each value of r,
the function fits the model using rgd.fit and computes
return levels using rgd.rl.
Usage
rgd.summary(
data,
r = NULL,
ydat = NULL,
mul = NULL,
sigl = NULL,
mulink = identity,
siglink = identity,
num_inits = 100,
muinit = NULL,
siginit = NULL,
show = FALSE,
method = "Nelder-Mead",
maxit = 10000,
...
)
Arguments
data |
A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period. |
r |
Optional integer giving the maximum number of order statistics
to summarize. If |
ydat |
A matrix or data frame of covariates for generalized linear
modelling of the parameters, or |
mul, sigl |
Integer vectors indicating which columns of |
mulink, siglink |
Inverse link functions for the location and scale parameters, respectively. |
num_inits |
Number of initial parameter sets used in optimization. |
muinit, siginit |
Optional initial values for the location and scale parameters. |
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for optimization. |
... |
Additional arguments passed to |
Value
A data frame containing:
- r: number of order statistics used
- nllh: negative log-likelihood
- mu, sigma: parameter estimates
- mu.se, sigma.se: standard errors
- rl20, rl50, rl100, rl200: return levels
- rl20.se, rl50.se, rl100.se, rl200.se: standard errors of return levels
Examples
x <- rgdr(n = 50, r = 2, loc = 10, scale = 2)
rgd.summary(x$rmat)
Random Generation from the Gumbel Distribution for r-Largest Order Statistics
Description
Generates random samples from the Gumbel distribution for
r-largest order statistics.
Usage
rgdr(n, r, loc = 0, scale = 1)
Arguments
n |
A positive integer specifying the number of observations. |
r |
A positive integer specifying the number of order statistics for each observation. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
Details
The function first generates independent uniform random variables and then
constructs decreasing variables through cumulative products. These are
transformed using the Gumbel quantile function qgd.
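The three steps can be sketched in base R as shown below. This is one reading of the description, not the package source: the function name is hypothetical, and the contents assigned to `umat`, `wmat`, and `rmat` are assumptions based on the steps listed above.

```r
# Hypothetical sketch of r-largest Gumbel simulation.
sim_rlargest_gumbel <- function(n, r, loc = 0, scale = 1) {
  stopifnot(n >= 1, r >= 1, scale > 0)
  # Step 1: independent uniform random variables.
  umat <- matrix(runif(n * r), nrow = n, ncol = r)
  # Step 2: row-wise cumulative products, which decrease along each row.
  wmat <- umat
  if (r > 1) {
    for (j in 2:r) wmat[, j] <- wmat[, j - 1] * umat[, j]
  }
  # Step 3: Gumbel quantile transform Q(w) = loc - scale * log(-log(w)).
  rmat <- loc - scale * log(-log(wmat))
  list(umat = umat, wmat = wmat, rmat = rmat)
}

set.seed(1)
x <- sim_rlargest_gumbel(n = 10, r = 3, loc = 10, scale = 2)
x$rmat
```

Since each cumulative product is no larger than the previous one and the quantile function is increasing, every row of `rmat` is in decreasing order, as required for r-largest order statistics.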
Value
A list with components:
umat |
An |
wmat |
An |
rmat |
An |
Examples
x <- rgdr(n=10, r=3, loc = 0, scale = 1)
x$rmat
Fit the Generalized Gumbel Distribution to r-Largest Order Statistics
Description
Fits the generalized Gumbel distribution to r-largest order statistics
using maximum likelihood estimation. Stationary and non-stationary models
are supported through generalized linear modelling of the location, scale,
and shape parameters.
Usage
rggd.fit(
xdat,
r = NULL,
ydat = NULL,
mul = NULL,
sigl = NULL,
hl = NULL,
mulink = identity,
siglink = identity,
hlink = identity,
num_inits = 100,
muinit = NULL,
siginit = NULL,
hinit = NULL,
show = TRUE,
method = "Nelder-Mead",
maxit = 10000,
...
)
Arguments
xdat |
A numeric vector, matrix, or data frame of observations.
Each row should contain decreasing order statistics for a given year
or block. The first column therefore contains block maxima. Only the
first |
r |
The number of largest order statistics to use in the fitted model.
If |
ydat |
A matrix or data frame of covariates for non-stationary modelling
of the parameters, or |
mul, sigl, hl |
Integer vectors indicating which columns of |
mulink, siglink, hlink |
Inverse link functions for the location, scale, and shape parameters, respectively. |
num_inits |
The number of initial parameter sets used in the optimization. |
muinit, siginit, hinit |
Numeric vectors giving initial values for the
location, scale, and shape parameters. If |
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for |
... |
Additional control arguments passed to the optimizer. |
Value
A list with components including:
trans |
Logical; |
model |
A list containing |
link |
A character vector describing the inverse link functions. |
conv |
The convergence code returned by the optimizer. |
nllh |
The negative log-likelihood evaluated at the fitted parameters. |
data |
The data used in the fit. |
mle |
The maximum likelihood estimates. |
cov |
The estimated covariance matrix when available. |
se |
The estimated standard errors when available. |
vals |
A matrix containing fitted values of the location, scale, and shape parameters at each observation. |
r |
The number of order statistics used in the fitted model. |
References
Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y
Examples
x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat)
Profile Likelihood for Return Levels under the rGGD Model
Description
Computes and plots the profile log-likelihood for a return level under
a stationary r-largest generalized Gumbel distribution (rGGD) model
fitted by rggd.fit.
Usage
rggd.prof(z, m, xlow, xup, conf = 0.95, nint = 100)
Arguments
z |
An object returned by rggd.fit. |
m |
A return period greater than 1. The profile likelihood is computed
for the corresponding return level exceeded with probability 1/m. |
xlow, xup |
Lower and upper bounds of the return level grid over which the profile likelihood is evaluated. |
conf |
A numeric vector of confidence levels for profile likelihood confidence intervals. |
nint |
The number of grid points used to evaluate the profile likelihood. |
Details
The function evaluates the profile log-likelihood over a grid of return
level values and plots the resulting curve. Horizontal and vertical lines
are added to indicate profile likelihood confidence intervals for the
confidence levels specified in conf.
References
Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y
Value
A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.
Examples
x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat)
rggd.prof(fit, m = 100, xlow = 12, xup = 30)
Return Levels for the Generalized Gumbel Distribution
Description
Computes return levels and their standard errors for a stationary
generalized Gumbel model fitted by rggd.fit.
Usage
rggd.rl(z, year = c(20, 50, 100, 200), show = TRUE)
Arguments
z |
An object returned by rggd.fit. |
year |
A numeric vector of return periods for which return levels are to be computed. |
show |
Logical. If |
Details
For a return period T, the return level is defined as the quantile
exceeded with probability 1/T. Under the generalized Gumbel
distribution, the return level is
x_T = \mu - \sigma \log\left(\frac{1-(1-1/T)^h}{h}\right), \quad h \neq 0.
Standard errors are obtained using the delta method.
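A base R sketch of the return level and its delta-method standard error is given below. It uses a central finite-difference gradient for simplicity, which is an assumption of this example rather than a statement about how the package differentiates. All function names and the numerical inputs are hypothetical.

```r
# Hypothetical helper: generalized Gumbel return level for one return period.
# theta = c(mu, sigma, h)
ggd_return_level <- function(theta, period) {
  p <- 1 - 1 / period
  h <- theta[3]
  y <- if (abs(h) < 1e-8) -log(p) else (1 - p^h) / h
  theta[1] - theta[2] * log(y)
}

# Delta method: se = sqrt(g' V g), with g approximated by central differences.
ggd_return_level_se <- function(theta, cov, period, eps = 1e-6) {
  grad <- vapply(seq_along(theta), function(i) {
    step <- replace(numeric(length(theta)), i, eps)
    (ggd_return_level(theta + step, period) -
       ggd_return_level(theta - step, period)) / (2 * eps)
  }, numeric(1))
  sqrt(drop(t(grad) %*% cov %*% grad))
}

# Illustrative values only, not output from a real fit.
theta <- c(10, 2, 0.1)
cov <- diag(c(0.09, 0.04, 0.01))

ggd_return_level(theta, period = 100)
ggd_return_level_se(theta, cov, period = 100)
```

When h is set to zero the return level reduces to the Gumbel expression \mu - \sigma \log\{-\log(1 - 1/T)\}.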
Value
The input object z with two additional components:
rl |
A numeric vector of estimated return levels. |
rlse |
A numeric vector of standard errors of the estimated return levels. |
Examples
x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat)
out <- rggd.rl(fit, year = c(20, 50, 100, 200))
Summary of Fitted rGGD Models over Different Values of r
Description
Summarizes fitted generalized Gumbel distribution models for
r-largest order statistics over r = 1, \dots, R. For each value
of r, the function fits the model using rggd.fit
and computes return levels using rggd.rl.
Usage
rggd.summary(
data,
r = NULL,
ydat = NULL,
mul = NULL,
sigl = NULL,
hl = NULL,
mulink = identity,
siglink = identity,
hlink = identity,
num_inits = 100,
muinit = NULL,
siginit = NULL,
hinit = NULL,
show = FALSE,
method = "Nelder-Mead",
maxit = 10000,
...
)
Arguments
data |
A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period. |
r |
Optional integer giving the maximum number of order statistics
to summarize. If |
ydat |
A matrix or data frame of covariates for generalized linear
modelling of the parameters, or |
mul, sigl, hl |
Integer vectors indicating which columns of
|
mulink, siglink, hlink |
Inverse link functions for the location, scale, and shape parameters, respectively. |
num_inits |
Number of initial parameter sets used in optimization. |
muinit, siginit, hinit |
Optional initial values for the location, scale, and shape parameters. |
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for optimization. |
... |
Additional arguments passed to |
Value
A data frame containing:
- r: number of order statistics used
- nllh: negative log-likelihood
- mu, sigma, h: parameter estimates
- mu.se, sigma.se, h.se: standard errors
- rl20, rl50, rl100, rl200: return levels
- rl20.se, rl50.se, rl100.se, rl200.se: standard errors of return levels
Examples
x <- rggdr(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
rggd.summary(x$rmat)
Entropy Difference Test for rGGD Models
Description
Performs the entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized Gumbel distribution (rGGD) model.
Usage
rggdEd(data)
Arguments
data |
A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest. |
Details
The test compares the entropy of models fitted with r and
r-1 order statistics and evaluates whether the additional order
statistic provides significant information.
This function fits the rGGD model using rggd.fit and then
computes the entropy difference test statistic by comparing the fitted
likelihood contributions from models with r and r-1 order
statistics.
Value
A list containing:
- statistics: the entropy difference test statistic
- p.value: the two-sided p-value
- theta: the estimated parameter vector of the rGGD model
- ybar: the sample mean entropy difference
References
Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y
Bader, B., Yan, J., & Zhang, X. (2017).
Automated selection of r for the r-largest order statistics approach.
Statistics and Computing.
doi:10.1007/s11222-016-9697-3
Examples
x <- rggdr(n=50, r=3, loc = 10, scale = 2, shape = 0.1)
rggdEd(x$rmat)
Sequential Entropy Difference Test for rGGD Models
Description
Performs the sequential entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized Gumbel distribution (rGGD) model.
Usage
rggdEdtest(data)
Arguments
data |
A numeric matrix or data frame containing the r-largest order statistics. Each row represents one observation (or block), and columns must be ordered from largest to smallest. |
Details
The columns of data must represent decreasing order statistics
within each row, with the first column containing the block maximum.
The function applies the entropy difference test (rggdEd) sequentially
for r = 2, \dots, R. The resulting p-values are then adjusted using the
ForwardStop and StrongStop stopping rules, which control the false
discovery rate and help determine an appropriate value of r.
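For orientation, the two stopping rules can be sketched in base R from a vector of sequential p-values, using the formulas as they are usually stated for these rules (see Bader et al., 2017). This is an illustration only: the function names are hypothetical, and the sketch makes no claim about how the package computes its ForwardStop and StrongStop columns.

```r
# Hypothetical helper: adjusted values for m ordered p-values p_1, ..., p_m.
# ForwardStop at k: -(1 / k) * sum_{i <= k} log(1 - p_i)
# StrongStop at k:  exp(sum_{j >= k} log(p_j) / j) * m / k
seq_stop_adjust <- function(p) {
  stopifnot(all(p > 0 & p < 1))
  m <- length(p)
  k <- seq_len(m)
  forward <- cumsum(-log(1 - p)) / k
  strong <- rev(exp(cumsum(rev(log(p) / k)))) * m / k
  data.frame(k = k, p.value = p, ForwardStop = forward, StrongStop = strong)
}

# Largest index whose adjusted value is at or below alpha (0 if none).
stop_index <- function(adjusted, alpha = 0.05) {
  hits <- which(adjusted <= alpha)
  if (length(hits) > 0) max(hits) else 0
}

p <- c(0.01, 0.02, 0.40, 0.75)   # made-up p-values for illustration
adj <- seq_stop_adjust(p)
adj
stop_index(adj$ForwardStop)
stop_index(adj$StrongStop)
```

Each rule rejects the first k hypotheses in the sequence, where k is the index returned by `stop_index`; how that index maps onto a selected value of r depends on how the sequence of tests is set up.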
Value
A data frame containing:
- r: Value of r tested
- p.values: Raw p-values from the entropy difference tests
- statistic: Test statistics for each value of r
- est.loc: Estimated location parameter
- est.scale: Estimated scale parameter
- est.shape: Estimated shape parameter
- ybar: Mean entropy difference
- ForwardStop: Adjusted values from the ForwardStop rule
- StrongStop: Adjusted values from the StrongStop rule
References
Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y
Bader, B., Yan, J., & Zhang, X. (2017).
Automated selection of r for the r-largest order statistics approach.
Statistics and Computing.
doi:10.1007/s11222-016-9697-3
Examples
x <- rggdr(n=50, r=3, loc = 10, scale = 2, shape = 0.1)
rggdEdtest(x$rmat)
data(bangkok)
rggdEdtest(bangkok)
Negative Log-Likelihood for the rGGD Model
Description
Computes the negative log-likelihood for the r-largest generalized Gumbel distribution (rGGD) model.
Usage
rggdLh(data, par)
Arguments
data |
A numeric vector, matrix, or data frame of observations. If a vector is supplied, it is treated as a one-column matrix. If a matrix or data frame is supplied, each row is treated as one observation and columns represent decreasing order statistics. |
par |
A numeric vector of length 3 giving the location, scale, and shape parameters, respectively. |
Details
This function is intended for internal likelihood evaluation in optimization.
Invalid parameter combinations return Inf rather than stopping with
an error, which makes the function more robust when used inside optimizers
such as optim.
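The pattern of returning Inf for invalid parameters can be illustrated with a simpler stand-in likelihood. The sketch below uses the r-largest Gumbel model (the zero-shape case of the r-largest likelihood in Coles, 2001) rather than the rGGD likelihood, and the function name is hypothetical.

```r
# Stand-in example: r-largest Gumbel negative log-likelihood with guards.
# par = c(mu, sigma); rows of data hold decreasing order statistics.
nll_rlargest_gumbel <- function(par, data) {
  mu <- par[1]
  sigma <- par[2]
  # Guard: invalid parameters return Inf instead of raising an error.
  if (!all(is.finite(par)) || sigma <= 0) return(Inf)
  x <- as.matrix(data)
  z <- (x - mu) / sigma
  nll <- sum(exp(-z[, ncol(x)])) + length(x) * log(sigma) + sum(z)
  # Guard: numerical overflow or NaN is also mapped to Inf.
  if (is.finite(nll)) nll else Inf
}

# Simulated r-largest Gumbel data (n = 50, r = 2, loc = 10, scale = 2).
set.seed(42)
w <- t(apply(matrix(runif(100), ncol = 2), 1, cumprod))
x <- 10 - 2 * log(-log(w))

nll_rlargest_gumbel(c(10, -1), x)   # invalid scale, gives Inf

start <- c(mean(x), sd(as.vector(x)))
fit <- optim(start, nll_rlargest_gumbel, data = x, method = "Nelder-Mead")
fit$par
```

Nelder-Mead in `optim()` accepts Inf at trial points as long as the starting value gives a finite objective, so the search simply moves away from the invalid region.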
References
Shin, Y., & Park, J.-S. (2025). Generalized Gumbel model for r-largest order statistics with application to peak streamflow. Scientific Reports. doi:10.1038/s41598-024-83273-y
Value
A single numeric value giving the negative log-likelihood.
If the parameter combination is invalid, the function returns Inf.
Examples
x <- rggdr(n=50, r=2, loc = 10, scale = 2, shape = 0.1)
fit <- rggd.fit(x$rmat, num_inits = 5)
rggdLh(data=fit$data,par=fit$mle)
Random Generation from the Generalized Gumbel Distribution for r-Largest Order Statistics
Description
Generates random samples from the generalized Gumbel distribution for
r-largest order statistics.
Usage
rggdr(n, r, loc = 0, scale = 1, shape = 0.1)
Arguments
n |
A positive integer specifying the number of observations. |
r |
A positive integer specifying the number of order statistics for each observation. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
shape |
A numeric value specifying the shape parameter. |
Details
The function first generates independent uniform random variables and then
constructs decreasing variables through recursive transformations depending
on the shape parameter. These are transformed using the generalized Gumbel
quantile function qggd.
For valid generation, the shape parameter must satisfy
1 - (j-1)h > 0 for j = 2, \dots, r, which implies
h < 1/(r-1) when r > 1.
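The constraint is easy to check before simulating. The helper below is a hypothetical base R illustration of the inequality, not a function exported by the package.

```r
# Hypothetical helper: TRUE when 1 - (j - 1) * h > 0 for all j = 2, ..., r.
ggd_shape_ok <- function(shape, r) {
  stopifnot(length(shape) == 1, r >= 1)
  if (r == 1) return(TRUE)        # no constraint when only the maximum is used
  j <- 2:r
  all(1 - (j - 1) * shape > 0)    # equivalent to shape < 1 / (r - 1)
}

ggd_shape_ok(0.1, r = 3)    # 0.1 < 1/2, so valid
ggd_shape_ok(0.5, r = 3)    # 1 - 2 * 0.5 = 0 is not positive, so invalid
ggd_shape_ok(-0.3, r = 5)   # negative shapes always satisfy the constraint
```

The binding case is j = r, so the admissible range for h narrows as more order statistics are generated.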
Value
A list with components:
umat |
An |
wmat |
An |
rmat |
An |
Examples
x <- rggdr(n=10, r=3, loc = 10, scale = 2, shape = 0.1)
x$rmat
Fit the Generalized Logistic Distribution to r-Largest Order Statistics
Description
Fits the generalized logistic distribution to r-largest order
statistics using maximum likelihood estimation. Stationary and
non-stationary models are supported through generalized linear modelling
of the location, scale, and shape parameters.
Usage
rglo.fit(
xdat,
r = NULL,
ydat = NULL,
mul = NULL,
sigl = NULL,
shl = NULL,
mulink = identity,
siglink = identity,
shlink = identity,
num_inits = 100,
muinit = NULL,
siginit = NULL,
shinit = NULL,
show = TRUE,
method = "Nelder-Mead",
maxit = 10000,
...
)
Arguments
xdat |
A numeric vector, matrix, or data frame of observations.
Each row should contain decreasing order statistics for a given year
or block. The first column therefore contains block maxima. Only the
first |
r |
The number of largest order statistics to use in the fitted model.
If |
ydat |
A matrix or data frame of covariates for non-stationary
modelling of the parameters, or |
mul, sigl, shl |
Integer vectors indicating which columns of
|
mulink, siglink, shlink |
Inverse link functions for the location, scale, and shape parameters, respectively. |
num_inits |
The number of initial parameter sets used in the optimization. |
muinit, siginit, shinit |
Numeric vectors giving initial values for the
location, scale, and shape parameters. If |
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for |
... |
Additional control arguments passed to the optimizer. |
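The covariate and link arguments can be illustrated with a small base R sketch. It assumes that each parameter is built as an inverse link applied to an intercept plus the selected covariate columns; the helper `linear_predictor` and the coefficient vector `beta` are hypothetical, and the package's internal code may differ.

```r
# Hypothetical illustration:
# parameter = invlink(intercept + selected covariates %*% coefficients)
linear_predictor <- function(ydat, cols, beta, invlink = identity) {
  X <- cbind(1, ydat[, cols, drop = FALSE])   # intercept plus chosen covariates
  stopifnot(length(beta) == ncol(X))
  drop(invlink(X %*% beta))
}

ydat <- matrix(seq(0, 1, length.out = 5), ncol = 1)  # e.g. a scaled time trend

# Location with a linear trend (identity inverse link)
mu <- linear_predictor(ydat, cols = 1, beta = c(10, 2))

# Scale kept positive through an exponential inverse link
sigma <- linear_predictor(ydat, cols = 1, beta = c(log(2), 0.1), invlink = exp)
```

An exponential inverse link for the scale is a common choice because it keeps the fitted scale positive for any coefficient values.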
Value
A list with components including:
trans |
Logical; |
model |
A list containing |
link |
A character vector describing the inverse link functions. |
conv |
The convergence code returned by the optimizer. |
nllh |
The negative log-likelihood evaluated at the fitted parameters. |
data |
The data used in the fit. |
mle |
The maximum likelihood estimates. |
cov |
The estimated covariance matrix when available. |
se |
The estimated standard errors when available. |
vals |
A matrix containing fitted values of the location, scale, and shape parameters at each observation. |
r |
The number of order statistics used in the fitted model. |
References
Ahmad, M. I., Sinclair, C. D., & Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
See Also
Examples
x <- rglor(n = 50, r = 3, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat, num_inits = 5)
Profile Likelihood for Return Levels under the rGLO Model
Description
Computes and plots the profile log-likelihood for a return level under
a stationary r-largest generalized logistic distribution (rGLO) model
fitted by rglo.fit.
Usage
rglo.prof(z, m, xlow, xup, conf = 0.95, nint = 100)
Arguments
z |
An object returned by |
m |
A return period greater than 1. The profile likelihood is computed
for the corresponding return level exceeded with probability |
xlow, xup |
Lower and upper bounds of the return level grid over which the profile likelihood is evaluated. |
conf |
A numeric vector of confidence levels for profile likelihood confidence intervals. |
nint |
The number of grid points used to evaluate the profile likelihood. |
Details
The function evaluates the profile log-likelihood over a grid of return
level values and plots the resulting curve. Horizontal and vertical lines
are added to indicate profile likelihood confidence intervals for the
confidence levels specified in conf.
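How an interval is read off a profile curve can be sketched in base R. This assumes the usual chi-squared cutoff for profile likelihood intervals (Coles, 2001); the helper `profile_ci` is hypothetical, handles one confidence level at a time, and uses a toy quadratic curve rather than a fitted rGLO model.

```r
# Hypothetical helper: read a profile likelihood interval off a grid.
# `grid` holds candidate return levels, `pl` the profile log-likelihood at each.
profile_ci <- function(grid, pl, conf = 0.95) {
  cutoff <- max(pl) - 0.5 * qchisq(conf, df = 1)   # height of the horizontal line
  inside <- grid[pl >= cutoff]                     # grid points not rejected
  data.frame(conf = conf, lower = min(inside), upper = max(inside),
             width = max(inside) - min(inside))
}

# Toy profile: a quadratic log-likelihood peaked at 20 (illustration only)
grid <- seq(15, 25, length.out = 1001)
pl <- -0.5 * (grid - 20)^2
profile_ci(grid, pl, conf = 0.95)
```

The sketch assumes the curve crosses the cutoff inside the grid on both sides; if it does not, the bounds xlow and xup need to be widened.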
Value
A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.
References
Ahmad, M. I., Sinclair, C. D., & Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
See Also
Examples
x <- rglor(n = 50, r = 2, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat)
rglo.prof(fit, m = 100, xlow = 12, xup = 25)
Return Levels for the Generalized Logistic Distribution
Description
Computes return levels and their standard errors for a stationary
generalized logistic model fitted by rglo.fit.
Usage
rglo.rl(z, year = c(20, 50, 100, 200), show = TRUE)
Arguments
z |
An object returned by |
year |
A numeric vector of return periods for which return levels are to be computed. |
show |
Logical. If |
Details
For a return period T, the return level is defined as the quantile
exceeded with probability 1/T. Under the generalized logistic
distribution, the return level is
x_T = \mu + \frac{\sigma}{\xi} \left[1 - \left(\frac{1 - 1/T}{1/T}\right)^{-\xi}\right],
which is equivalently written in the implementation as
x_T = \mu + \frac{\sigma}{\xi} - \frac{\sigma}{\xi}
\left(\frac{1/T}{1 - 1/T}\right)^{\xi}.
Standard errors are obtained using the delta method.
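The formula and the delta method can be sketched in base R as follows. The function names, the finite-difference gradient, and the covariance matrix `V` are assumptions made for illustration; the package may compute the gradient analytically, and the sketch does not handle a shape parameter equal to zero.

```r
# Return level under the generalized logistic distribution, using the
# second form of the formula given above. `par` is c(mu, sigma, xi).
glo_rl <- function(par, period) {
  mu <- par[1]; sigma <- par[2]; xi <- par[3]
  p <- 1 / period
  mu + sigma / xi - (sigma / xi) * (p / (1 - p))^xi
}

# Delta-method standard error with a central finite-difference gradient.
# `V` is the covariance matrix of the estimates.
glo_rl_se <- function(par, V, period, eps = 1e-6) {
  grad <- vapply(seq_along(par), function(k) {
    step <- replace(numeric(length(par)), k, eps)
    (glo_rl(par + step, period) - glo_rl(par - step, period)) / (2 * eps)
  }, numeric(1))
  sqrt(drop(t(grad) %*% V %*% grad))
}

par <- c(10, 2, 0.1)               # illustrative estimates
V <- diag(c(0.09, 0.04, 0.01))     # illustrative covariance, not from a real fit
glo_rl(par, period = c(20, 50, 100, 200))
glo_rl_se(par, V, period = 100)
```

For a return period of 2 the ratio (1/T)/(1 - 1/T) equals 1, so the return level reduces to the location parameter, which gives a quick sanity check.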
Value
The input object z with two additional components:
rl |
A numeric vector of estimated return levels. |
rlse |
A numeric vector of standard errors of the estimated return levels. |
References
Ahmad, M. I., Sinclair, C. D., & Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
See Also
Examples
x <- rglor(n = 50, r = 3, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat)
out <- rglo.rl(fit, year = c(20, 50, 100, 200))
Summary of Fitted rGLO Models over Different Values of r
Description
Summarizes fitted generalized logistic distribution models for
r-largest order statistics over r = 1, \dots, R. For each value
of r, the function fits the model using rglo.fit
and computes return levels using rglo.rl.
Usage
rglo.summary(
data,
r = NULL,
ydat = NULL,
mul = NULL,
sigl = NULL,
shl = NULL,
mulink = identity,
siglink = identity,
shlink = identity,
num_inits = 100,
muinit = NULL,
siginit = NULL,
shinit = NULL,
show = FALSE,
method = "Nelder-Mead",
maxit = 10000,
...
)
Arguments
data |
A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period. |
r |
Optional integer giving the maximum number of order statistics
to summarize. If |
ydat |
A matrix or data frame of covariates for generalized linear
modelling of the parameters, or |
mul, sigl, shl |
Integer vectors indicating which columns of
|
mulink, siglink, shlink |
Inverse link functions for the location, scale, and shape parameters, respectively. |
num_inits |
Number of initial parameter sets used in optimization. |
muinit, siginit, shinit |
Optional initial values for the location, scale, and shape parameters. |
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for optimization. |
... |
Additional arguments passed to |
Value
A data frame containing:
- r: number of order statistics used
- nllh: negative log-likelihood
- mu, sigma, xi: parameter estimates
- mu.se, sigma.se, xi.se: standard errors
- rl20, rl50, rl100, rl200: return levels
- rl20.se, rl50.se, rl100.se, rl200.se: standard errors of return levels
References
Ahmad, M. I., Sinclair, C. D., & Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
Examples
x <- rglor(n = 50, r = 3, loc = 10, scale = 2, shape = 0.1)
rglo.summary(x$rmat, num_inits = 5)
Entropy Difference Test for rGLO Models
Description
Performs the entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized logistic distribution (rGLO) model.
Usage
rgloEd(data, par = NULL)
Arguments
data |
A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest. |
par |
An optional numeric vector of length 3 giving the location,
scale, and shape parameters. If |
Details
The test compares the entropy of models fitted with r and
r-1 order statistics and evaluates whether the additional order
statistic provides significant information.
This function applies the entropy difference test to the r-largest
generalized logistic model. If par is not supplied, the model
parameters are first estimated using rglo.fit.
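The final step of such a test can be sketched in base R. This follows the general form described by Bader et al. (2017) as understood here: per-block differences of log-likelihood contributions are compared with their expected value under the null hypothesis. The helper `ed_z_test` and its inputs `y` and `eta` are hypothetical; how the package obtains the expected value for the rGLO model is not shown.

```r
# Hypothetical helper: z-type test on the mean of per-block differences.
# y   : per-block differences of log-likelihood contributions (r versus r - 1)
# eta : expected value of those differences under the null hypothesis
ed_z_test <- function(y, eta) {
  n <- length(y)
  ybar <- mean(y)
  statistic <- sqrt(n) * (ybar - eta) / sd(y)
  p.value <- 2 * pnorm(-abs(statistic))   # two-sided normal p-value
  list(statistic = statistic, p.value = p.value, ybar = ybar)
}

set.seed(1)
y <- rnorm(50, mean = -1.2, sd = 0.8)     # simulated differences, illustration only
ed_z_test(y, eta = -1.2)
```

A small p-value indicates that the observed mean difference is far from what the model with r order statistics predicts.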
Value
A list containing:
- statistics: the entropy difference test statistic
- p.value: the two-sided p-value
- theta: the estimated or supplied parameter vector
- ybar: the sample mean entropy difference
References
Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3
Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
See Also
Examples
x <- rglor(n=50, r=3, loc = 10, scale = 2, shape = 0.1)
rgloEd(x$rmat)
Sequential Entropy Difference Test for rGLO Models
Description
Performs the sequential entropy difference (ED) test for selecting the number of order statistics in the r-largest generalized logistic distribution (rGLO) model.
Usage
rgloEdtest(data, par = NULL)
Arguments
data |
A numeric matrix or data frame containing the r-largest order statistics. Each row represents one observation (or block), and columns must be ordered from largest to smallest. |
par |
An optional numeric vector of length 3 giving the location,
scale, and shape parameters. If |
Details
The function applies the entropy difference test (rgloEd) sequentially
for r = 2, \dots, R. The resulting p-values are then adjusted using the
ForwardStop and StrongStop stopping rules, which control the false
discovery rate, to help determine an appropriate value of r.
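The two stopping rules can be sketched in base R from an ordered vector of raw p-values. The formulas below follow the ForwardStop and StrongStop rules as summarized in Bader et al. (2017), as understood here; the helper `seq_stop_adjust` is hypothetical and is not the package's own implementation.

```r
# Hypothetical helper: adjusted values for an ordered sequence of p-values.
seq_stop_adjust <- function(p) {
  m <- length(p)
  k <- seq_len(m)
  # ForwardStop: running mean of -log(1 - p_i) over the first k tests
  forward <- cumsum(-log(1 - p)) / k
  # StrongStop: exp of the tail sum over j >= k of log(p_j) / j, scaled by m / k
  tail_sum <- rev(cumsum(rev(log(p) / k)))
  strong <- exp(tail_sum) * m / k
  data.frame(p.values = p, ForwardStop = forward, StrongStop = strong)
}

adj <- seq_stop_adjust(c(0.01, 0.03, 0.20, 0.60))
alpha <- 0.05
# Number of leading hypotheses rejected by each rule (0 if none)
n_forward <- max(c(0, which(adj$ForwardStop <= alpha)))
n_strong <- max(c(0, which(adj$StrongStop <= alpha)))
```

Each rule rejects the leading hypotheses up to the largest index whose adjusted value does not exceed the chosen level, so the two rules can suggest different values of r.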
Value
A data frame containing:
- r: value of r tested
- p.values: raw p-values from the entropy difference tests
- statistic: test statistics for each value of r
- est.loc: estimated location parameter
- est.scale: estimated scale parameter
- est.shape: estimated shape parameter
- ybar: mean entropy difference
- ForwardStop: adjusted values from the ForwardStop rule
- StrongStop: adjusted values from the StrongStop rule
References
Ahmad, M. I., Sinclair, C. D., & Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7
Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3
Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
See Also
Examples
x <- rglor(n=50, r=3, loc = 10, scale = 2, shape = 0.1)
rgloEdtest(x$rmat)
data(bangkok)
rgloEdtest(bangkok)
Log-Likelihood Contributions for the rGLO Model
Description
Computes the observation-wise log-likelihood contributions for the r-largest generalized logistic distribution (rGLO) model.
Usage
rgloLh(data, par)
Arguments
data |
A numeric vector, matrix, or data frame of observations. If a vector is supplied, it is treated as a one-column matrix. If a matrix or data frame is supplied, each row is treated as one observation and columns represent decreasing order statistics. |
par |
A numeric vector of length 3 giving the location, scale, and shape parameters, respectively. |
Details
This function is mainly intended for internal likelihood evaluation.
Invalid parameter combinations return Inf, which is often more
robust than stopping with an error when used inside iterative procedures.
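The benefit of returning Inf can be illustrated with a toy example in base R. The sketch assumes the value is used as a cost to be minimized, and it uses a normal negative log-likelihood rather than the rGLO likelihood; the function `toy_nllh` is hypothetical.

```r
# Toy negative log-likelihood (normal model, not the rGLO likelihood).
# Returns Inf instead of raising an error when the scale is not positive.
toy_nllh <- function(par, x) {
  mu <- par[1]; sigma <- par[2]
  if (!is.finite(sigma) || sigma <= 0) return(Inf)
  -sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))
}

set.seed(123)
x <- rnorm(200, mean = 10, sd = 2)

# Nelder-Mead may propose sigma <= 0; the Inf value makes it reject that point
fit <- optim(c(5, 1), toy_nllh, x = x, method = "Nelder-Mead",
             control = list(maxit = 10000))
fit$par
fit$convergence
```

With an error instead of Inf, a single invalid proposal would abort the whole optimization; the starting values themselves must still give a finite value.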
Value
A numeric vector of log-likelihood contributions, one for each row
of data. If the parameter combination is invalid, the function
returns Inf.
References
Ahmad, M. I., Sinclair, C. D., & Werritty, A. (1988). Log-logistic flood frequency analysis. Journal of Hydrology. doi:10.1016/0022-1694(88)90015-7
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
Examples
x <- rglor(n=50, r=3, loc = 10, scale = 2, shape = 0.1)
fit <- rglo.fit(x$rmat, num_inits = 5)
rgloLh(data = fit$data, par = fit$mle)
Random Generation from the Generalized Logistic Distribution for r-Largest Order Statistics
Description
Generates random samples from the generalized logistic distribution for
r-largest order statistics.
Usage
rglor(n, r, loc = 0, scale = 1, shape = 0.1)
Arguments
n |
A positive integer specifying the number of observations. |
r |
A positive integer specifying the number of order statistics for each observation. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
shape |
A numeric value specifying the shape parameter. |
Details
The function first generates independent uniform random variables and then
constructs decreasing variables through recursive transformations. These
are transformed using the generalized logistic quantile function
qglo.
Value
A list with components:
umat |
An |
wmat |
An |
rmat |
An |
References
Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
Examples
x <- rglor(10, 3, loc = 0, scale = 1, shape = 0.1)
x$rmat
Fit the Four-Parameter Kappa Distribution to r-Largest Order Statistics
Description
Fits the four-parameter kappa distribution to r-largest order
statistics using maximum likelihood estimation. Stationary and
non-stationary models are supported through generalized linear modelling
of the location, scale, and two shape parameters.
Usage
rk4d.fit(
xdat,
r = NULL,
penk = NULL,
penh = NULL,
ydat = NULL,
mul = NULL,
sigl = NULL,
shl = NULL,
hl = NULL,
mulink = identity,
siglink = identity,
shlink = identity,
hlink = identity,
num_inits = 100,
muinit = NULL,
siginit = NULL,
shinit = NULL,
hinit = NULL,
show = TRUE,
method = "Nelder-Mead",
maxit = 10000,
...
)
Arguments
xdat |
A numeric vector, matrix, or data frame of observations.
Each row should contain decreasing order statistics for a given year
or block. The first column therefore contains block maxima. Only the
first |
r |
The number of largest order statistics to use in the fitted model.
If |
penk |
Optional penalty for the first shape parameter. Supported values
include |
penh |
Optional penalty for the second shape parameter. Supported values
include |
ydat |
A matrix or data frame of covariates for non-stationary
modelling of the parameters, or |
mul, sigl, shl, hl |
Integer vectors indicating which columns of
|
mulink, siglink, shlink, hlink |
Inverse link functions for the location, scale, first shape, and second shape parameters, respectively. |
num_inits |
The number of initial parameter sets used in the optimization. |
muinit, siginit, shinit, hinit |
Numeric vectors giving initial values for
the location, scale, first shape, and second shape parameters. If
|
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for |
... |
Additional control arguments passed to the optimizer. |
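For orientation, the two penalties cited in the references can be sketched in base R in their published forms. This is an assumption-laden illustration: the functions `ms_penalty` and `cd_penalty` are hypothetical, they are not taken from the package source, and how the package maps them onto the two shape parameters is not shown here.

```r
# Literature forms of two penalties, written for a shape parameter k in the
# sign convention where k < 0 corresponds to a heavy upper tail.
# Assumption: this is NOT taken from the package source.

# Martins and Stedinger (2000): Beta(p, q) density on [-0.5, 0.5]
ms_penalty <- function(k, p = 6, q = 9) {
  inside <- k > -0.5 & k < 0.5
  out <- numeric(length(k))               # zero outside the support
  out[inside] <- dbeta(k[inside] + 0.5, p, q)
  out
}

# Coles and Dixon (1999), with alpha = lambda = 1 as defaults.
# Their shape has the opposite sign, so xi = -k.
cd_penalty <- function(k, alpha = 1, lambda = 1) {
  xi <- -k
  out <- rep(1, length(xi))               # no penalty for xi <= 0
  mid <- xi > 0 & xi < 1
  out[mid] <- exp(-lambda * (1 / (1 - xi[mid]) - 1)^alpha)
  out[xi >= 1] <- 0
  out
}

ms_penalty(c(-0.3, -0.1, 0.2))
cd_penalty(c(-0.5, 0.1))
```

In a penalized fit of this kind, the negative log-likelihood is typically increased by minus the logarithm of the penalty, which discourages implausibly heavy tails in small samples.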
Value
A list with components including:
trans |
Logical; |
model |
A list containing |
link |
A character vector describing the inverse link functions. |
conv |
The convergence code returned by the optimizer. |
nllh |
The negative log-likelihood evaluated at the fitted parameters. |
data |
The data used in the fit. |
mle |
The maximum likelihood estimates. |
cov |
The estimated covariance matrix when available. |
se |
The estimated standard errors when available. |
vals |
A matrix containing fitted values of the location, scale, first shape, and second shape parameters at each observation. |
r |
The number of order statistics used in the fitted model. |
References
Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.
Martins, E. S., & Stedinger, J. R. (2000). Generalized maximum-likelihood generalized extreme-value quantile estimators for hydrologic data. Water Resources Research, 36(3), 737–744. doi:10.1029/1999WR900330
Coles, S., & Dixon, M. (1999). Likelihood-based inference for extreme value models. Extremes, 2(1), 5–23. doi:10.1023/A:1009905222644
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
See Also
Examples
x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat, num_inits = 5)
Profile Likelihood for Return Levels under the rK4D Model
Description
Computes and plots the profile log-likelihood for a return level under
a stationary r-largest four-parameter kappa distribution (rK4D) model
fitted by rk4d.fit.
Usage
rk4d.prof(z, m, xlow, xup, conf = 0.95, nint = 100)
Arguments
z |
An object returned by |
m |
A return period greater than 1. The profile likelihood is computed
for the corresponding return level exceeded with probability |
xlow, xup |
Lower and upper bounds of the return level grid over which the profile likelihood is evaluated. |
conf |
A numeric vector of confidence levels for profile likelihood confidence intervals. |
nint |
The number of grid points used to evaluate the profile likelihood. |
Details
The function evaluates the profile log-likelihood over a grid of return
level values and plots the resulting curve. Horizontal and vertical lines
are added to indicate profile likelihood confidence intervals for the
confidence levels specified in conf.
Value
A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.
References
Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
See Also
Examples
x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat, num_inits = 100)
rk4d.prof(fit, m = 100, xlow = 12, xup = 25)
Return Levels for the Four-Parameter Kappa Distribution
Description
Computes return levels and their standard errors for a stationary
four-parameter kappa model fitted by rk4d.fit.
Usage
rk4d.rl(z, year = c(20, 50, 100, 200), show = TRUE)
Arguments
z |
An object returned by |
year |
A numeric vector of return periods for which return levels are to be computed. |
show |
Logical. If |
Details
For a return period T, the return level is defined as the quantile
exceeded with probability 1/T. Under the four-parameter kappa
distribution, the return level is
x_T = \mu + \frac{\sigma}{\xi} - \frac{\sigma}{\xi}
\left(\frac{1-(1-1/T)^h}{h}\right)^\xi,
and standard errors are obtained using the delta method.
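The return level formula can be written directly in base R. The function names below are hypothetical, and the sketch assumes both shape parameters are nonzero and that the term raised to the power xi is positive.

```r
# Return level under the four-parameter kappa distribution, following the
# formula above. `par` is c(mu, sigma, xi, h).
k4d_rl <- function(par, period) {
  mu <- par[1]; sigma <- par[2]; xi <- par[3]; h <- par[4]
  prob <- 1 - 1 / period                  # non-exceedance probability
  mu + sigma / xi - (sigma / xi) * ((1 - prob^h) / h)^xi
}

k4d_rl(c(10, 2, 0.1, 0.1), period = c(20, 50, 100, 200))

# With h = -1 the expression reduces to the generalized logistic form
glo_rl <- function(par, period) {
  p <- 1 / period
  par[1] + par[2] / par[3] - (par[2] / par[3]) * (p / (1 - p))^par[3]
}
all.equal(k4d_rl(c(10, 2, 0.1, -1), 100), glo_rl(c(10, 2, 0.1), 100))
```

The comparison at the end reflects the fact that (1 - F^h)/h equals (1 - F)/F when h = -1, so the generalized logistic distribution is a special case of the four-parameter kappa distribution.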
Value
The input object z with two additional components:
- rl: a numeric vector of estimated return levels
- rlse: a numeric vector of standard errors of the estimated return levels
References
Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3
Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
See Also
Examples
x <- rk4dr(n = 50, r = 2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat, num_inits = 5)
out <- rk4d.rl(fit, year = c(20, 50, 100, 200))
Summary of Fitted rK4D Models over Different Values of r
Description
Summarizes fitted four-parameter kappa distribution models for
r-largest order statistics over r = 1, \dots, R. For each value
of r, the function fits the model using rk4d.fit
and computes return levels using rk4d.rl.
Usage
rk4d.summary(
data,
r = NULL,
penk = NULL,
penh = NULL,
ydat = NULL,
mul = NULL,
sigl = NULL,
shl = NULL,
hl = NULL,
mulink = identity,
siglink = identity,
shlink = identity,
hlink = identity,
num_inits = 100,
muinit = NULL,
siginit = NULL,
shinit = NULL,
hinit = NULL,
show = FALSE,
method = "Nelder-Mead",
maxit = 10000,
...
)
Arguments
data |
A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period. |
r |
Optional integer giving the maximum number of order statistics
to summarize. If |
penk |
Penalty function for the |
penh |
Penalty function for the |
ydat |
A matrix or data frame of covariates for generalized linear
modelling of the parameters, or |
mul, sigl, shl, hl |
Integer vectors indicating which columns of
|
mulink, siglink, shlink, hlink |
Inverse link functions for the location, scale, first shape, and second shape parameters, respectively. |
num_inits |
Number of initial parameter sets used in optimization. |
muinit, siginit, shinit, hinit |
Optional initial values for the location, scale, first shape, and second shape parameters. |
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for optimization. |
... |
Additional arguments passed to |
Value
A data frame containing:
- r: number of order statistics used
- nllh: negative log-likelihood
- mu, sigma, xi, h: parameter estimates
- mu.se, sigma.se, xi.se, h.se: standard errors
- rl20, rl50, rl100, rl200: return levels
- rl20.se, rl50.se, rl100.se, rl200.se: standard errors of return levels
References
Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3
Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
Examples
x <- rk4dr(n = 50, r = 3, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
rk4d.summary(x$rmat, num_inits = 5)
# fit with penalty functions on both shape parameters
rk4d.summary(x$rmat, penk = "CD", penh = "MS", num_inits = 5)
Entropy Difference Test for rK4D Models
Description
Performs the entropy difference (ED) test for selecting the number of order statistics in the r-largest four-parameter kappa distribution (rK4D) model.
Usage
rk4dEd(data)
Arguments
data |
A numeric matrix or data frame containing the r-largest order statistics. Each row represents one block or observation, and columns must be ordered from largest to smallest. |
Details
The test compares the entropy of models fitted with r and
r-1 order statistics and evaluates whether the additional order
statistic provides significant information.
This function fits the rK4D model using rk4d.fit and then
computes the entropy difference test statistic by comparing the fitted
likelihood contributions from models with r and r-1 order
statistics.
Value
A list containing:
- statistics: the entropy difference test statistic
- p.value: the two-sided p-value
- theta: the estimated parameter vector of the rK4D model
- ybar: the sample mean entropy difference
References
Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3
Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
See Also
Examples
x <- rk4dr(n=50, r=2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
rk4dEd(x$rmat)
Sequential Entropy Difference Test for rK4D Models
Description
Performs the sequential entropy difference (ED) test for selecting the number of order statistics in the r-largest four-parameter kappa distribution (rK4D) model.
Usage
rk4dEdtest(data)
Arguments
data |
A numeric matrix or data frame containing the r-largest order statistics. Each row represents one observation (or block), and columns must be ordered from largest to smallest. |
Details
The function applies the entropy difference test (rk4dEd) sequentially
for r = 2, \dots, R. The resulting p-values are then adjusted using the
ForwardStop and StrongStop stopping rules, which control the false
discovery rate, to help determine an appropriate value of r.
Value
A data frame containing:
- r: value of r tested
- p.values: raw p-values from the entropy difference tests
- statistic: test statistics for each value of r
- est.loc: estimated location parameter
- est.scale: estimated scale parameter
- est.shape1: estimated first shape parameter
- est.shape2: estimated second shape parameter
- ybar: mean entropy difference
- ForwardStop: adjusted values from the ForwardStop rule
- StrongStop: adjusted values from the StrongStop rule
References
Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3
Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
See Also
Examples
x <- rk4dr(n=50, r=2, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
rk4dEdtest(x$rmat)
data(bangkok)
rk4dEdtest(bangkok)
Log-Likelihood Contributions for the rK4D Model
Description
Computes the observation-wise log-likelihood contributions for the r-largest four-parameter kappa distribution (rK4D) model.
Usage
rk4dLh(data, par)
Arguments
data |
A numeric vector, matrix, or data frame of observations. If a vector is supplied, it is treated as a one-column matrix. If a matrix or data frame is supplied, each row is treated as one observation and columns represent decreasing order statistics. |
par |
A numeric vector of length 4 giving the location, scale, first shape, and second shape parameters. |
Value
A numeric vector of log-likelihood contributions for each row
of data. If invalid parameter combinations occur, the function
returns a large penalty value.
References
Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
Examples
x <- rk4dr(n=50, r=3, loc = 10, scale = 2, shape1 = 0.1, shape2 = 0.1)
fit <- rk4d.fit(x$rmat, num_inits = 5)
rk4dLh(data = fit$data, par = fit$mle)
Random Generation from the Four-Parameter Kappa Distribution for r-Largest Order Statistics
Description
Generates random samples from the four-parameter kappa distribution for
r-largest order statistics.
Usage
rk4dr(n, r, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
Arguments
n |
A positive integer specifying the number of observations. |
r |
A positive integer specifying the number of order statistics for each observation. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
shape1 |
A numeric value specifying the first shape parameter. |
shape2 |
A numeric value specifying the second shape parameter. |
Details
The function first generates independent uniform random variables and then
constructs decreasing transformed variables recursively using the second
shape parameter. These are transformed by the four-parameter kappa quantile
function qk4d.
For valid generation with r > 1, the second shape parameter should
satisfy shape2 < 1/(r-1).
Value
A list with components:
- umat: an n x r matrix of independent uniform random numbers
- wmat: an n x r matrix of transformed uniform variables
- rmat: an n x r matrix of simulated r-largest order statistics
References
Hosking, J. R. M. (1994). The four-parameter kappa distribution. IBM Journal of Research and Development, 38(3), 251–258.
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3
Shin, Y., & Park, J.-S. (2023). Modeling climate extremes using the four-parameter kappa distribution for r-largest order statistics. Weather and Climate Extremes. doi:10.1016/j.wace.2022.100533
Examples
x <- rk4dr(n=50, r=3, loc = 0, scale = 1, shape1 = 0.1, shape2 = 0.1)
x$rmat
Fit the Logistic Distribution to r-Largest Order Statistics
Description
Fits the logistic distribution to r-largest order statistics
using maximum likelihood estimation. Stationary and non-stationary models
are supported through generalized linear modelling of the location and
scale parameters.
Usage
rld.fit(
xdat,
r = NULL,
ydat = NULL,
mul = NULL,
sigl = NULL,
mulink = identity,
siglink = identity,
num_inits = 100,
muinit = NULL,
siginit = NULL,
show = TRUE,
method = "Nelder-Mead",
maxit = 10000,
...
)
Arguments
xdat |
A numeric vector, matrix, or data frame of observations.
Each row should contain decreasing order statistics for a given year
or block. The first column therefore contains block maxima. Only the
first |
r |
The number of largest order statistics to use in the fitted model.
If |
ydat |
A matrix or data frame of covariates for non-stationary
modelling of the parameters, or |
mul, sigl |
Integer vectors indicating which columns of
|
mulink, siglink |
Inverse link functions for the location and scale parameters, respectively. |
num_inits |
The number of initial parameter sets used in the optimization. |
muinit, siginit |
Numeric vectors giving initial values for the
location and scale parameters. If |
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for |
... |
Additional control arguments passed to the optimizer. |
Value
A list with components including:
- trans: logical; TRUE if a non-stationary model is fitted
- model: a list containing mul and sigl
- link: a character vector describing the inverse link functions
- conv: the convergence code returned by the optimizer
- nllh: the negative log-likelihood evaluated at the fitted parameters
- data: the data used in the fit
- mle: the maximum likelihood estimates
- cov: the estimated covariance matrix when available
- se: the estimated standard errors when available
- vals: a matrix containing fitted values of the location and scale
- r: the number of order statistics used in the fitted model
References
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J.-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
Examples
x <- rldr(n = 50, r = 3, loc = 10, scale = 2)
fit <- rld.fit(x$rmat, num_inits = 5)
Profile Likelihood for Return Levels under the rLD Model
Description
Computes and plots the profile log-likelihood for a return level under
a stationary r-largest logistic distribution (rLD) model fitted by
rld.fit.
Usage
rld.prof(z, m, xlow, xup, conf = 0.95, nint = 100)
Arguments
z |
An object returned by rld.fit. |
m |
A return period greater than 1. The profile likelihood is computed
for the corresponding return level, which is exceeded with probability 1/m. |
xlow, xup |
Lower and upper bounds of the return level grid over which the profile likelihood is evaluated. |
conf |
A numeric vector of confidence levels for profile likelihood confidence intervals. |
nint |
The number of grid points used to evaluate the profile likelihood. |
Details
The function evaluates the profile log-likelihood over a grid of return
level values and plots the resulting curve. Horizontal and vertical lines
are added to indicate profile likelihood confidence intervals for the
confidence levels specified in conf.
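The grid evaluation described above can be sketched in base R. This is a simplified illustration, not the package's implementation: it assumes block maxima only (r = 1), uses the ordinary logistic density as a stand-in for the rLD likelihood, and reparameterizes the location as mu = x_m - sigma * log(m - 1) so that the return level x_m becomes a model parameter. The confidence limits use the usual chi-squared cutoff with one degree of freedom.

```r
set.seed(1)
x <- rlogis(50, location = 10, scale = 2)  # stand-in for block maxima (r = 1)
m <- 100                                   # return period
conf <- 0.95

# Negative log-likelihood with the return level xm held fixed
nll <- function(sigma, xm) {
  mu <- xm - sigma * log(m - 1)
  -sum(dlogis(x, location = mu, scale = sigma, log = TRUE))
}

# Profile log-likelihood: maximize over sigma at each grid value
grid <- seq(14, 30, length.out = 100)
prof <- vapply(grid, function(xm) {
  -optimize(nll, interval = c(0.01, 20), xm = xm)$objective
}, numeric(1))

# Grid values whose profile log-likelihood lies above the cutoff
cutoff <- max(prof) - 0.5 * qchisq(conf, df = 1)
ci <- range(grid[prof >= cutoff])
rl_hat <- grid[which.max(prof)]
```

Plotting prof against grid and adding a horizontal line at cutoff gives the kind of display described above; the interval end points are where the curve crosses that line.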
Value
A data frame containing the return period, estimated return level, confidence level, lower confidence limit, upper confidence limit, and interval width. A profile likelihood plot is also produced.
References
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
Examples
x <- rldr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rld.fit(x$rmat)
rld.prof(fit, m = 100, xlow = 12, xup = 25)
Return Levels for the Logistic Distribution
Description
Computes return levels and their standard errors for a stationary
logistic model fitted by rld.fit.
Usage
rld.rl(z, year = c(20, 50, 100, 200), show = TRUE)
Arguments
z |
An object returned by rld.fit. |
year |
A numeric vector of return periods for which return levels are to be computed. |
show |
Logical. If |
Details
For a return period T, the return level is defined as the quantile
exceeded with probability 1/T. Under the logistic distribution,
the return level is
x_T = \mu + \sigma \log\left(\frac{1}{\exp(-\log(1-1/T)) - 1}\right) = \mu + \sigma \log(T - 1),
and standard errors are obtained using the delta method.
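The return level formula and its delta-method standard error can be checked in base R. In the sketch below the function names rl_logistic and rl_se_logistic, the parameter values, and the covariance matrix are hypothetical, and the covariance is assumed to be ordered as (mu, sigma). Since the formula simplifies to mu + sigma * log(T - 1), the gradient with respect to (mu, sigma) is (1, log(T - 1)).

```r
# Return level for a given return period under a logistic(mu, sigma) model
rl_logistic <- function(mu, sigma, period) {
  mu + sigma * log(1 / (exp(-log(1 - 1 / period)) - 1))
}

# Delta-method standard error; cov is assumed ordered as (mu, sigma)
rl_se_logistic <- function(cov, period) {
  vapply(period, function(p) {
    g <- c(1, log(p - 1))  # gradient of the return level
    sqrt(drop(t(g) %*% cov %*% g))
  }, numeric(1))
}

year  <- c(20, 50, 100, 200)
mu    <- 10
sigma <- 2
cov   <- matrix(c(0.24, 0.01, 0.01, 0.06), 2, 2)  # hypothetical covariance

rl   <- rl_logistic(mu, sigma, year)
rlse <- rl_se_logistic(cov, year)

# The formula agrees with the logistic quantile at probability 1 - 1/T
check <- all.equal(rl, qlogis(1 - 1 / year, location = mu, scale = sigma))
```

Because the gradient grows with log(T - 1), the standard error increases with the return period whenever the scale estimate is uncertain.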
Value
The input object z with two additional components:
- rl: a numeric vector of estimated return levels
- rlse: a numeric vector of standard errors of the estimated return levels
References
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
Examples
x <- rldr(n = 50, r = 2, loc = 10, scale = 2)
fit <- rld.fit(x$rmat, num_inits = 5)
out <- rld.rl(fit, year = c(20, 50, 100, 200))
Summary of Fitted rLD Models over Different Values of r
Description
Summarizes fitted logistic distribution models for r-largest order
statistics over r = 1, \dots, R. For each value of r,
the function fits the model using rld.fit and computes
return levels using rld.rl.
Usage
rld.summary(
data,
r = NULL,
ydat = NULL,
mul = NULL,
sigl = NULL,
mulink = identity,
siglink = identity,
num_inits = 100,
muinit = NULL,
siginit = NULL,
show = FALSE,
method = "Nelder-Mead",
maxit = 10000,
...
)
Arguments
data |
A numeric vector, matrix, or data frame containing the r-largest order statistics. Each row should contain decreasing order statistics for one block or time period. |
r |
Optional integer giving the maximum number of order statistics
to summarize. If |
ydat |
A matrix or data frame of covariates for generalized linear
modelling of the parameters, or |
mul, sigl |
Integer vectors indicating which columns of
|
mulink, siglink |
Inverse link functions for the location and scale parameters, respectively. |
num_inits |
Number of initial parameter sets used in optimization. |
muinit, siginit |
Optional initial values for the location and scale parameters. |
show |
Logical. If |
method |
Optimization method passed to |
maxit |
Maximum number of iterations for optimization. |
... |
Additional arguments passed to |
Value
A data frame containing:
- r: number of order statistics used
- nllh: negative log-likelihood
- mu, sigma: parameter estimates
- mu.se, sigma.se: standard errors
- rl20, rl50, rl100, rl200: return levels
- rl20.se, rl50.se, rl100.se, rl200.se: standard errors of return levels
References
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Shin, Y., & Park, J-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
Examples
x <- rldr(n = 50, r = 2, loc = 10, scale = 2)
rld.summary(x$rmat, num_inits = 5)
Random Generation from the Logistic Distribution for r-Largest Order Statistics
Description
Generates random samples from the logistic distribution for
r-largest order statistics.
Usage
rldr(n, r, loc = 0, scale = 1)
Arguments
n |
A positive integer specifying the number of observations. |
r |
A positive integer specifying the number of order statistics for each observation. |
loc |
A numeric value specifying the location parameter. |
scale |
A positive numeric value specifying the scale parameter. |
Details
The function first generates independent uniform random variables and then
constructs decreasing transformed variables recursively. These are
transformed by the logistic quantile function qld.
Value
A list with components:
- umat: an n x r matrix of independent uniform random numbers
- wmat: an n x r matrix of transformed uniform variables
- rmat: an n x r matrix of simulated r-largest order statistics
References
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Bader, B., Yan, J., & Zhang, X. (2017). Automated selection of r for the r-largest order statistics approach. Statistics and Computing. doi:10.1007/s11222-016-9697-3
Shin, Y., & Park, J-S. (2024). Generalized logistic model for r-largest order statistics with hydrological application. Stochastic Environmental Research and Risk Assessment. doi:10.1007/s00477-023-02642-7
Examples
x <- rldr(n = 50, r = 3, loc = 0, scale = 1)
x$rmat