---
title: "Using 'buildmer' to automatically find & compare maximal (mixed) models"
author: "Cesko C. Voeten"
date: "19 August 2021"
bibliography: bibliography.bib
csl: apa.csl
output:
  html_document
header-includes:
   - \usepackage{tipa}
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
  %\VignetteIndexEntry{Using 'buildmer' to automatically find & compare maximal (mixed) models}
---

# Introduction
@keepitmaximal suggest that for valid statistical inference, a regression model must control for all possible confounding factors, specifically those coming from random effects such as subjects and items. @parsimonious suggest that this proposed strategy leads to overfitting and that an appropriately-parsimonious model must be chosen, preferably based on theory but possibly also using stepwise elimination [@stepwise]. Both strategies require a maximal model to be identified (for @keepitmaximal, this is the final model; for @stepwise, this is the basis for backward stepwise elimination), but for many psycholinguistic experiments, the _truly_ maximal model will fail to converge and a reasonable subset model needs to be chosen.

The `buildmer` package aims to automate the procedures identifying the maximal model that can still converge & performing backward stepwise elimination based on a variety of criteria (change in log-likelihood, AIC, BIC, change in explained deviance). The package does not contain any model-fitting code, but functions as an administrative loop around other packages by simply building up a maximal formula object and passing it along. Currently, the package supports models that can be fitted by `(g)lm`, `(g)lmer` (package `lme4`), `gls`, `lme` (package `nlme`), `bam`, `gam`, `gamm` (package `mgcv`), `gamm4` (package `gamm4`), `glmmTMB` (package `glmmTMB`), `multinom` (package `nnet`), `lmertree` and `glmertree` (package `glmertree`), `mixed_model` (package `GLMMadaptive`), `clmm` (package `ordinal`), and any other package if you provide your own wrapper functions.

# A vowel study
To illustrate what `buildmer` can do for you, the package comes with a particularly pathological dataset called `vowels`. It looks like this:

```{r,echo=FALSE}
options(width=110)
```

```{r}
library(buildmer)
head(vowels)
```

This is a pilot study that I conducted when I was just starting my PhD, and attempted to analyze in probably the worst way possible. The research question was whether vowel diphthongization in the Dutch vowels /\textipa{e:,\o:,o:,Ei,\oe y}/ was affected by syllable structure, such that an /\textipa{l}/ within the same syllable would block diphthongization but an /\textipa{l}/ in the onset of the next syllable would permit it. In plain English, the question was whether these five vowels in Dutch were pronounced like the vowel in English 'fear', with the tongue held constant for the duration of the vowel, or like the vowel in English 'fade', which has an upward tongue movement towards the position of the vowel in English 'fit'. The position of the tongue can be measured in a simple word-list reading experiment by measuring the speech signal's so-called 'first formant', labeled `f1` in this dataset, where lower F1 = higher tongue. Thus, the research question is if the F1 either changes or remains stable for the duration of each vowel depending on whether the following consonant is an 'l' in the same syllable (coded as `lCda` in column `following`) or in the next syllable (coded as `lOns`). Additionally, I wanted to control for the factors `neighborhood` (a measure of entropy: 'if only one sound is changed anywhere in this word, how many new words could be generated?'), `information` (another measure of entropy derived from the famous Shannon information measure), and `stress` (a dummy encoding whether the vowel was stressed or unstressed).

An entirely reasonable way to analyze these data, and the approach I ultimately pursued later in my PhD, would be to take samples from each vowel at 75\% realization and at 25\% realization, subtract these two, and use this 'delta score' as dependent variable: if this score is non-zero, the vowel changes over time, if it is approximately zero, the vowel was stable. In this dataset, however, I instead took as many samples as were present in the part of the wave file corresponding to these vowels, and wanted to fit a linear regression line through all of these samples as a function of the sample number. This number, scaled from 0 to 1 per token, is listed in column `timepoint`. To make the model even more challenging to fit, only six participants were tested in this pilot study, making it very difficult to find an optimum when including a full random-slope structure.

In `lme4` syntax, the fully maximal model would be given by the following formula:

```{r}
f <- f1 ~ vowel*timepoint*following * neighborhood*information*stress + 
	 (vowel*timepoint*following * neighborhood+information+stress | participant) +
	 (timepoint | word)
```

It should go without saying that this is a completely unreasonable model that will never converge. A first step towards reducing the model structure could be to reason that effects of neighborhood, information, and stress, which are all properties of the individual words in this dataset, could be subsumed into the random effects by words. This reduces the maximal model to:

```{r}
f <- f1 ~ vowel*timepoint*following +
	 (vowel*timepoint*following | participant) +
	 (timepoint | word)
```

This model is still somewhat on the large side, so we will now use `buildmer` to check:
 - if this model is capable of converging at all;
 - if all of these terms are really necessary.

# Finding the maximal _feasible_ model & doing stepwise elimination from it

To illustrate `buildmer`'s modular capabilities, we'll fit this model in two steps. We start by identifying the maximal model that is still capable of converging. We do this by running `buildmer`, including an optional `buildmerControl` argument in which we set the `direction` parameter to `'order'`. We also set `lme4`s optimizer to `bobyqa`, as this manages to get much further than the default `nloptwrap`. Note how control parameters intended for `lmer` are specified in the `args` list inside `buildmerControl`. For backward-compatibility reasons, they can also be passed to `buildmer` itself, but this is deprecated now.

```{r,eval=FALSE}
library(lme4)
m <- buildmer(f,data=vowels,buildmerControl=buildmerControl(direction='order',
	      args=list(control=lmerControl(optimizer='bobyqa'))))
```
```{r,echo=FALSE}
cat('Determining predictor order
Fitting via lm: f1 ~ 1
Currently evaluating LRT for: following, timepoint, vowel
Fitting via lm: f1 ~ 1 + following
Fitting via lm: f1 ~ 1 + timepoint
Fitting via lm: f1 ~ 1 + vowel
Updating formula: f1 ~ 1 + vowel
Currently evaluating LRT for: following, timepoint
Fitting via lm: f1 ~ 1 + vowel + following
Fitting via lm: f1 ~ 1 + vowel + timepoint
Updating formula: f1 ~ 1 + vowel + timepoint
Currently evaluating LRT for: following, vowel:timepoint
Fitting via lm: f1 ~ 1 + vowel + timepoint + following
Fitting via lm: f1 ~ 1 + vowel + timepoint + vowel:timepoint
Updating formula: f1 ~ 1 + vowel + timepoint + vowel:timepoint
Currently evaluating LRT for: following
Fitting via lm: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following
Updating formula: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following
Currently evaluating LRT for: timepoint:following, vowel:following
Fitting via lm: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following + timepoint:following
Fitting via lm: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following + vowel:following
Updating formula: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following + timepoint:following
Currently evaluating LRT for: vowel:following
Fitting via lm: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following + timepoint:following +
    vowel:following
Updating formula: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following + timepoint:following +
    vowel:following
Currently evaluating LRT for: vowel:timepoint:following
Fitting via lm: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following + timepoint:following +
    vowel:following + vowel:timepoint:following
Updating formula: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following + timepoint:following +
    vowel:following + vowel:timepoint:following
Fitting via gam, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following
Currently evaluating LRT for: 1 | participant, 1 | word
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 | participant)
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 | word)
Updating formula: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following + timepoint:following +
    vowel:following + vowel:timepoint:following + (1 | participant)
Currently evaluating LRT for: following | participant, timepoint | participant, vowel |
    participant, 1 | word
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + following |
    participant)
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + timepoint |
    participant)
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + vowel | participant)
boundary (singular) fit: see ?isSingular
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 | participant) + (1 |
    word)
Updating formula: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following + timepoint:following +
    vowel:following + vowel:timepoint:following + (1 | participant) + (1 | word)
Currently evaluating LRT for: following | participant, timepoint | participant, vowel |
    participant, timepoint | word
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + following |
    participant) + (1 | word)
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + timepoint |
    participant) + (1 | word)
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + vowel | participant)
    + (1 | word)
boundary (singular) fit: see ?isSingular
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 | participant) + (1 +
    timepoint | word)
Updating formula: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following + timepoint:following +
    vowel:following + vowel:timepoint:following + (1 | participant) + (1 + timepoint | word)
Currently evaluating LRT for: following | participant, timepoint | participant, vowel |
    participant
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + following |
    participant) + (1 + timepoint | word)
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + timepoint |
    participant) + (1 + timepoint | word)
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + vowel | participant)
    + (1 + timepoint | word)
boundary (singular) fit: see ?isSingular
Updating formula: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following + timepoint:following +
    vowel:following + vowel:timepoint:following + (1 + timepoint | participant) + (1 + timepoint |
    word)
Currently evaluating LRT for: following | participant, vowel | participant
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + timepoint + following
    | participant) + (1 + timepoint | word)
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + timepoint + vowel |
    participant) + (1 + timepoint | word)
boundary (singular) fit: see ?isSingular
Updating formula: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following + timepoint:following +
    vowel:following + vowel:timepoint:following + (1 + timepoint + following | participant) + (1 +
    timepoint | word)
Currently evaluating LRT for: timepoint:following | participant, vowel | participant
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + timepoint + following
    + timepoint:following | participant) + (1 + timepoint | word)
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + timepoint + following
    + vowel | participant) + (1 + timepoint | word)
boundary (singular) fit: see ?isSingular
Updating formula: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following + timepoint:following +
    vowel:following + vowel:timepoint:following + (1 + timepoint + following + timepoint:following
    | participant) + (1 + timepoint | word)
Currently evaluating LRT for: vowel | participant
Fitting via lmer, with REML: f1 ~ 1 + vowel + timepoint + vowel:timepoint + following +
    timepoint:following + vowel:following + vowel:timepoint:following + (1 + timepoint + following
    + timepoint:following + vowel | participant) + (1 + timepoint | word)
boundary (singular) fit: see ?isSingular
Ending the ordering procedure due to having reached the maximal feasible model - all higher models
    failed to converge. The types of convergence failure are: Singular fit
Finalizing by converting the model to lmerTest')
```

The `order` step is useful if the maximal model includes random effects: `buildmer` will start out with an empty model and keeps adding terms to this model until convergence can no longer be achieved. The `order` step adds terms in order of their contribution to a certain criterion, such that the most important random slopes will be included first; this criterion is controlled by the `crit` argument. The default criterion is the significance of the change in log-likelihood (`LRT`: terms which provide lower chi-square $p$ values are considered more important), but other options are also supported. These are the raw log-likelihood (`LL`: terms which provide the largest increase in the log-likelihood; this measure will favor categorical predictors with many levels), AIC (`AIC`), BIC (`BIC`), explained deviance (`deviance`), and for GAMMs fitted using package `mgcv` the change in R-squared (`F`). You can select among them by passing e.g.\ `crit='LRT'` within the `buildmerControl` argument. The default `direction` is `c('order','backward')`, i.e.\ proceeding directly to backward stepwise elimination, but *for illustration purposes* we separate those steps here. (The `crit` argument also accepts vectors, such that e.g.\ `direction=c('order','backward'),crit=c('LL','LRT')` is allowed.)

After a lot of model fits, the model converges onto the following maximal model:

```{r,include=F}
library(lme4)
#hack for consistency with actual output without actually fitting the model every time I change something in the vignette
m <- buildmer:::mkBuildmer(model=list(formula=(function () as.formula('f1 ~ following + vowel + timepoint + vowel:timepoint + following:timepoint + following:vowel + following:vowel:timepoint + (1 + timepoint + following + timepoint:following | participant) + (1 + timepoint | word)',.GlobalEnv))()))
```
```{r}
(f <- formula(m@model))
```

The maximal _feasible_ model, i.e.\ the maximal model that is actually capable of converging, is one excluding random slopes for vowels by participants. This is not optimal for inference purposes, but for now it will do; we will see below that taking out the correlation parameters in the random effects makes it possible to include random slopes for vowels as well.
We now proceed to the next step: stepwise elimination. This could also be done using e.g.\ `lmerTest`, but since the machinery was needed for `direction='order'` anyway it came at very little cost to also implement stepwise elimination in `buildmer` (both forward and backward are supported). This uses the same elimination criterion as could be specified previously; if left unspecified, it defaults to `crit='LRT'`, for the likelihood-ratio test. This is the preferred test for mixed models in @stepwise.

```{r,eval=FALSE}
m <- buildmer(f,data=vowels,buildmerControl=list(direction='backward',
	      args=list(control=lmerControl(optimizer='bobyqa'))))
```
```{r,echo=FALSE}
cat('Fitting ML and REML reference models
Fitting via lmer, with REML: f1 ~ following + vowel + timepoint + vowel:timepoint +
    following:timepoint + following:vowel + following:vowel:timepoint + (1 + timepoint + following
    + timepoint:following | participant) + (1 + timepoint | word)

Fitting via lmer, with REML: f1 ~ following + vowel + timepoint + vowel:timepoint +
    following:timepoint + following:vowel + following:vowel:timepoint + (1 + timepoint + following
    + timepoint:following | participant) + (1 + timepoint | word)
Testing terms
Fitting via lmer, with ML: f1 ~ 1 + following + vowel + timepoint + vowel:timepoint +
    following:timepoint + following:vowel + (1 + timepoint + following + timepoint:following |
    participant) + (1 + timepoint | word)
Fitting via lmer, with REML: f1 ~ 1 + following + vowel + timepoint + vowel:timepoint +
    following:timepoint + following:vowel + following:vowel:timepoint + (1 + timepoint + following
    | participant) + (1 + timepoint | word)
Fitting via lmer, with REML: f1 ~ 1 + following + vowel + timepoint + vowel:timepoint +
    following:timepoint + following:vowel + following:vowel:timepoint + (1 + timepoint + following
    + timepoint:following | participant) + (1 | word)
      grouping                      term                              block Iteration           LRT
1         <NA>                         1                            NA NA 1         1            NA
2         <NA>                 following                    NA NA following         1            NA
3         <NA>                     vowel                        NA NA vowel         1            NA
4         <NA>                 timepoint                    NA NA timepoint         1            NA
5         <NA>           vowel:timepoint              NA NA vowel:timepoint         1            NA
6         <NA>       following:timepoint          NA NA following:timepoint         1            NA
7         <NA>           following:vowel              NA NA following:vowel         1            NA
8         <NA> following:vowel:timepoint    NA NA following:vowel:timepoint         1  3.609316e-30
9  participant                         1                   NA participant 1         1            NA
10 participant                 timepoint           NA participant timepoint         1            NA
11 participant                 following           NA participant following         1            NA
12 participant       timepoint:following NA participant timepoint:following         1  1.013211e-10
13        word                         1                          NA word 1         1            NA
14        word                 timepoint                  NA word timepoint         1 2.198802e-153
All terms are significant
Finalizing by converting the model to lmerTest')
```
```{r,echo=FALSE,message=FALSE}
f2 <- as.formula('f1 ~ following + vowel + timepoint + vowel:timepoint + following:timepoint + (1 + timepoint | word) + (1 + timepoint + following + timepoint:following | participant)',.GlobalEnv)
m <- buildmer(f2,vowels,buildmerControl=list(direction=NULL,args=list(control=lmerControl(optimizer='bobyqa'))))
```

It appears that in this example, all terms were significant in backward-stepwise elimination.

By default, `buildmer` automatically calculates summary and ANOVA statistics based on Wald $z$-scores (summary) or Wald $\chi^2$ tests (ANOVA). For answering our research question, we look at the summary:

```{r}
summary(m)
```

The significant effect for `followinglOns:timepoint` shows that if the following /l/ is in the onset of the next syllable, there is a much larger change in F1 compared to the reference condition of having the following /l/ in the coda of the same syllable.

# Diagonal random-effect covariances
One hidden feature that is present in `buildmer` but that has not yet been discussed is the ability to group terms together in blocks for ordering and stepwise-elimination purposes. While the first argument to `buildmer` functions is normally a formula, it is also possible to pass a 'buildmer terms list'. This is a data frame as generated by `tabulate.formula`:

```{r}
tabulate.formula(f)
```

This is an internal `buildmer` data structure, but it is rather self-explanatory in how it is used. It is possible to modify the `block` column to force terms to be evaluated as a single group, rather than separately, by giving these terms the same `block` value. These values are not used in any other way than this purpose of selecting terms to be grouped together, which can be exploited to fit models with diagonal random-effect structures. The first step is to create explicit columns for the factor `vowel`; if this is not done, only random-effect correlations between vowels and _other_ random slopes will be eliminated and those between the vowels themselves will remain.

```{r}
vowels <- cbind(vowels,model.matrix(~vowel,vowels))
```

We next create a formula for this modified dataset. To make it easier to type, we do not explicitly diagonalize the formula ourselves, but use `buildmer`'s `diag()` method for `formula` objects. We then call `tabulate.formula()` on the new formula, providing a regular expression that matches terms belonging to the same vowel. Note that we *cannot* use the simple `vowel` factor in the fixed-effects part of the formula, as this will break `buildmer`'s marginality checks when considering which terms are eligible for inclusion or removal.

```{r}
form <- diag(f1 ~ (vowel1+vowel2+vowel3+vowel4)*timepoint*following + 
	     ((vowel1+vowel2+vowel3+vowel4)*timepoint*following | participant) +
	     (timepoint | word))
terms <- tabulate.formula(form,group='vowel[^:]')
```

Finally, we can instruct `buildmer` to use this specially-crafted `terms` object by simply passing it along instead of a regular formula. `buildmer` will recognize what is going on, and use the variable name specified in the `dep` control argument as the dependent variable in the data frame; this variable name should be provided as a character string.

```{r,eval=FALSE}
m <- buildmer(terms,data=vowels,buildmerControl=buildmerControl(dep='f1',
	      args=list(control=lmerControl(optimizer='bobyqa'))))
## (output not shown)
```

This approach allows random slopes for `vowel` and for `vowel:timepoint` to make it in, both of which significantly improve model fit. This model seems much more adequate for statistical inference.

# Other options
Because `buildmer` does not do any model fitting by itself but is only an administrative formula processor around pre-existing modeling fuctions, it was straightforward to extend it beyond its original purpose of mixed-effects models. The logical extension of `buildmer` to GAMMs is fully supported, with appropriate safeguards against using likelihood-based tests for `bam` and `gamm` models in the generalized case, which use PQL (penalized quasi-likelihood). Relevant functions are available as `buildbam`, `buildgam`, `buildgamm`, and `buildgamm4`; for `buildbam` and `buildgam`, random effects in `lme4` form are converted to `s(...,bs='re')` form automatically. `glmmTMB` models are also supported via function `buildglmmTMB`, although their syntax for covariance structures (e.g. `diag(timepoint | participant)`) is not; these models are still useful for their ability to handle autocorrelation, zero-inflation, and to use REML for GLMMs. From package `nlme`, `gls` models are supported via `buildgls`, `lme` models are supported via `buildlme`. At the request of Willemijn Heeren, `buildmer` was also extended to handle multinomial-logistic-regression models fitted by function `multinom` from package `nnet`; see function `buildmultinom`. `buildmertree` makes it possible to do term ordering and backward elimination of the random-effects part of `glmertree` models. `buildGLMMadaptive` works with function `mixed_model` from package `GLMMadaptive`. `buildclmm` uses functions `clm` and `clmm` from package `ordinal`. Finally, `buildcustom` allows you to write your own wrapper functions, making it possible to use the buildmer machinery with anything that accepts a formula.

# References