---
title: "Introduction"
output: rmarkdown::html_vignette
bibliography: references.bib
link-citations: true
vignette: >
%\VignetteIndexEntry{Introduction}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
set.seed(1234) #For reproducibility
```
## Outline
This package implements the anytime-valid sequential p-value estimation procedure as described in @paper. References to equations and sections in this vignette pertain to this paper.
The package is centered around the function `avseqmc()`. The function can be used both to start the sequential procedure, as well as to sharpen an earlier estimate.
This vignette is structured in two parts. We start by showcasing _basic usage_ in Part I, and then detail some _advanced usage_ of the package in Part II.
## Part I: Basic Usage
### Starting Estimation
To start the sequential estimation procedure, the first two arguments of `avseqmc()` are most important:
- a function `sample_G` which, when called, returns a binary sample (or a batch of samples --- see [Part II](#batches)) from the Monte-Carlo distribution $G^*(\mathbf{X})$ as described in Equation (5) in the main paper (@paper).
- a number `epsilon` corresponding to the desired value of the risk of overestimated significance as described in Definition 5 in the main paper.
A self-contained minimal example usage (with default settings) is given below.
```{r example usage}
library(avseqmc)
G1 <- function(){runif(1) < 0.01} # A mock MC function to demonstrate functionality
R1 <- avseqmc(sample_G = G1, epsilon = 0.001)
print(R1)
```
We have not specified a stopping time, prompting `avseqmc()` to use its default stopping criterion: "when inference can be drawn at 5\% significance" as described in Section 4.1 and in Equation (11) in the main paper. As a result, the sequential estimation has terminated automatically after `r R1$n` samples and provided an (intermediate) p-value estimate of $\tilde p_\tau(\mathbf{x}) =$ `r round(R1$ptilde[length(R1$ptilde)],4)`. The output is collected in an object `avseqmc_progress` that has a custom plotting function visualizing the sequential trajectory:
```{r visualization , fig.width=5, fig.height=5, out.width="50%"}
plot(R1)
```
To avoid potentially infinite runtimes, `avseqmc()` also stops after a pre-specified number of samples which defaults at 1000:
```{r maximum samples}
G2 <- function(){runif(1) < 0.05}
R2 <- avseqmc(sample_G = G2, epsilon = 0.001)
print(R2)
```
The maximum number of samples can be adjusted via the `max_samples` argument. A minimum number of samples can also be selected via `min_samples` if preferred.
### Resuming Earlier Estimation
An important advantage of the proposed anytime-valid sequential methodology is the option to resume earlier estimation while retaining important validity guarantees. To facilitate this, the function `avseqmc()` returns an object of class `avseqmc_progress`, which in turn can be used as an _argument_ to function `avseqmc()` to resume sampling based on that earlier progress. In case this `avseqmc_progress` object is available within your own R script resuming is straightforward:
```{r resuming computation}
G3 <- function(){runif(1) < 0.03}
R3a <- avseqmc(sample_G = G3, epsilon = 0.001)
print(R3a)
R3b <- avseqmc(R3a)
print(R3b)
```
The case where the `avseqmc_progress` object is not directly available (for example, because the earlier sequential estimate is computed by a different analyst) is discussed in [Part II](#printedres).
### Using Built-in Stopping Times
The function `avseqmc()` continues sampling until the stopping criterion specified via `stopcrit` is reached (or the maximum `max_samples` is reached). There are two built-in stopping criteria (which are also described in Section 4.1 in the main paper). To use these built-in criteria, `stopcrit` should be specified as a list with the following format: `stopcrit = list("type"=...,"param"=...)`, where the values of the elipses depend on the criterion chosen:
- For the stopping time "when inference can be drawn at level $\alpha$" the `stopcrit` argument should be specified as `list("type"="futility","param"=...)` where `"param"` corresponds to the significance level $\alpha$:
```{r stopping for futility}
G4 <- function(){runif(1) < 0.04}
R4 <- avseqmc(sample_G = G4,
epsilon = 0.001,
stopcrit = list("type" = "futility",
param = 0.1)) # Stop when inference can be drawn at 10% significance
print(R4)
```
- For the stopping time "when estimates are no longer decreasing at a sufficient rate" the `stopcrit` argument should be specified as `list("type"="convergence","param"=...)` where `"param"` corresponds to a vector of two numbers: the first should corresponds to the value $\gamma$, the second to an (integer) value of $n_0$ used in the convergence criterion:
```{r stopping for convergence}
G5 <- function(){runif(1) < 0.04}
R5 <- avseqmc(sample_G = G5,
epsilon = 0.001,
stopcrit = list("type" = "convergence",
param = c(1e-5, 100)))
print(R5)
```
The actual value of $n_0$ used may be slightly different when drawing Monte-Carlo samples in batches: this is discussed in [Part II](#batches).
Custom stopping times can also be used, which is exemplified in [Part II](#customstop).
### Computing Lower Confidence Bounds
It may be informative to also obtain lower bounds of the confidence sequence on which the sequential estimation is based. You can request to compute lower bounds based on the construction by Robbins (1970) by setting `compute_lower=TRUE` in the function `avseqmc()`. The lower bounds are computed by default when the futility stopping criterion is used (i.e. `stopcrit = list("type"="futility","param"=...)`) since the stopping criterion is dependent on this value (serving as $\tilde L_n(\mathbf{X})$ in Equation (11) in the main paper).
The lower bounds are available in the output object `avseqmc_progress`, specifically in the `$Ltilde` element of the returned list.
### Inspecting Output
As stated, the function `avseqmc()` returns an object of class `avseqmc_progress` which is a list with the following elements:
- `$epsilon`: risk of overestimated significance used in the sequential estimation;
- `$sample_G`: function that samples from the Monte-Carlo distribution $G^*(\mathbf{X})$;
- `$ptilde`: sequence of sequential $p$-value estimates. The final value in this sequence is the most recent estimate of the $p$-value;
- `$Ltilde`: sequence of lower bounds of the confidence sequence based on the construction by Robbins (1970). Contains NA values if these were not computed by default through `stopcrit = list("type"="futility","param"=...)` or requested using `compute_lower=TRUE`;
- `$n`: total number of samples drawn from the MC sampler;
- `$S`: total number of ones observed from the MC sampler;
- `$B`: sequence of number of ones observed at each sampling timepoint (which can be greater than 1 if `sample_G` samples in batches: see [Part II](#batches));
- `$Bn`: sequence of number of samples drawn from MC sampler at each timepoint (which can be greater than 1 if `sample_G` samples in batches: see [Part II](#batches)).
As shown earlier, the object can be used as an argument to `avseqmc()` to resume sampling, and the object has a custom printing and plotting function.
## Part II: Advanced Usage
### Sampling Batches {#batches}
Sequentially drawing _single_ samples from the Monte-Carlo distribution $G^*(\mathbf{X})$ may induce computational overhead that can be partly alleviated through sampling in _batches_. The function `avseqmc()` automatically allows for the specified `sample_G` object to return a batch of samples in the form of a vector (of arbitrary length) of zeroes and ones:
```{r batch sampling}
G6 <- function(){runif(15) < 0.04}
R6 <- avseqmc(sample_G = G6, epsilon = 0.001)
print(R6)
```
The function sequentially samples a new batch until the total amount of samples drawn within the run exceeds the `max_samples` argument. As is the case in the above example, if `max_samples` is not a multiple of the batch size, this may cause the final amount of samples to exceed the `max_samples` argument (by at most the batch size minus one).
For the built-in stopping criterion "convergence", the actual value of $n_0$ used in the criterion may slightly differ from the user-specified value when sampling in batches. Specifically, when sampling in batches of size $M$, if $n_0$ is not a multiple of the batch size, then the actual value of $n_0$ used in the stopping criterion is $n_0$ is $\lceil M/n_0 \rceil n_0$.
### Resuming Based on Earlier Printed Results {#printedres}
In case you would like to resume an earlier estimation for which the sequential estimate $\tilde p_\tau(\mathbf{x})$ and the risk of overestimated significance $\varepsilon$ is known, but the `avseqmc_progress` object is not available, you can build the object yourself via the function `init_avseqmc_progress()`:
```{r manual resuming}
G7 <- function(){runif(1) < 0.04}
R7a <- init_avseqmc_progress(sample_G = G7,
epsilon = 0.001,
ptilde = 0.0819,
n = 1000)
R7b <- avseqmc(R7a)
```
The number of previously observed zeroes and ones from $G^*(\mathbf{x})$ can be computed based on the values of $\varepsilon$, $n$, and $\tilde p_n(\mathbf{x})$ under the assumption that $\tilde p_n(\mathbf{x})$ is constructed using the confidence sequence of Definition 6 (as is done in the function `avseqmc()`). However, note that if $\tilde p_n(\mathbf{x})$ is input as rounded value, this may induce errors in the inversion, and a warning message is output to notify the user of this. If available, it is safer to input the total number of ones observed from `sample_G` directly:
```{r safe manual resuming}
G7 <- function(){runif(1) < 0.04}
R7a <- init_avseqmc_progress(sample_G = G7,
epsilon = 0.001,
ptilde = 0.0813,
n = 1000,
S = 44)
```
A warning message indicates that the value of ptilde is not verified based on the value of `n` and `S`.
### Custom Stopping Times {#customstop}
You can also use a fully custom stopping time by supplying `stopcrit` with a function. This function should take a single argument with class `avseqmc_progress` and output either `FALSE` to continue sampling, or `TRUE` to stop sampling. For example, to stop sampling when, in the last 50 batches, at most a single 1 is observed, we may specify a custom stopping rule as:
```{r custom stopping time, message=TRUE}
G8 <- function(){runif(1) < 0.04}
customstop <- function(R){
if(length(R$ptilde) > 50){
if(sum(R$B[(length(R$B)-49):length(R$B)]) <= 1 ){
return(TRUE)
}
}
return(FALSE)
}
R8 <- avseqmc(sample_G = G8, epsilon = 0.001, stopcrit = customstop)
print(R8)
```
When you run this code, a message notifies you that if your custom stopping criterion is based on the lower bound of the confidence sequence R$Ltilde, then this should be requested via setting the argument `compute_lower=TRUE`. In the example above, the lower bound is not required to evaluate the stopping criterion, so we leave it as the default `compute_lower=FALSE`.