---
title: "Getting the Most out of DAGassist Using Parameters"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Get Started}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

```

```{r dev-load, include=FALSE}
# Prefer source build when available (works in RStudio, pkgdown, or local render)
if (requireNamespace("devtools", quietly = TRUE) && file.exists(file.path("..","DESCRIPTION"))) {
  # Don't error on CRAN/build machines that don't have devtools or the source path
  try(devtools::load_all("..", quiet = TRUE), silent = TRUE)
}

# If we've already loaded from source, avoid re-attaching a different installed build later
from_source <- try({
  "DAGassist" %in% loadedNamespaces() &&
    grepl(normalizePath(".."), getNamespaceInfo(asNamespace("DAGassist"), "path"), fixed = TRUE)
}, silent = TRUE)
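from_source <- isTRUE(from_source)

# Otherwise fall back to the installed build so later chunks can call DAGassist()
if (!from_source) library(DAGassist)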

# Feature gates (computed *after* attempting load_all)
has_show <- tryCatch({
  "show" %in% names(formals(DAGassist::DAGassist))
}, error = function(e) FALSE)

# Robust check: dev build defines a private .report_dotwhisker helper
has_dotwhisker <- tryCatch({
  exists(".report_dotwhisker", envir = asNamespace("DAGassist"), inherits = FALSE)
}, error = function(e) FALSE)
```

```{r ex-dag, include=FALSE}
library(dagitty)
library(ggdag)

dag_model <- dagify(
  Y ~ X + M + Z + A + B,
  X ~ Z,
  C ~ X + Y,
  M ~ X,
  exposure = "X",
  outcome  = "Y"
)

set.seed(42)
n <- 2000

#exogenous variables
A <- rnorm(n, 0, 1)
B <- rnorm(n, 0, 1)
Z <- rnorm(n, 0, 1)

#structural equations
# X ~ Z
beta_zx <- 0.8
X <- beta_zx * Z + rnorm(n, 0, 1)

# M ~ X
beta_xm <- 0.9
M <- beta_xm * X + rnorm(n, 0, 1)

# Y ~ X + M + Z + A + B
bX <- 0.7; bM <- 0.6; bZ <- 0.3; bA <- 0.2; bB <- -0.1
Y <- bX*X + bM*M + bZ*Z + bA*A + bB*B + rnorm(n, 0, 1)

# C ~ X + Y 
bXC <- 0.5; bYC <- 0.4
C <- bXC*X + bYC*Y + rnorm(n, 0, 1)

reg_levels <- c("North", "South", "East", "West")
region <- factor(sample(reg_levels, n, replace = TRUE))

df <- data.frame(A, B, Z, X, M, Y, C, region)
```

# Introduction

`DAGassist()` is meant to be simple to use, and most of its features are available through a simple two-argument call:
```{r example, eval=FALSE}
library(DAGassist)
library(dagitty)

DAGassist(
  dag = your_dag_model,
  formula = your_regression_call
)
```

But it also offers several parameters for more specific applications. They control how the DAG is evaluated (`imply`, `eval_all`), how results print (`show`, `labels`, `omit_factors`, `omit_intercept`, `bivariate`, `verbose`), which modeling engine to use (`engine`, `engine_args`), and which output format to write (`type`, `out`). This vignette walks through the evaluation and display parameters with small examples; the reference table at the end summarizes the rest.

# Core Arguments
## `dag` and `formula`

`formula` can be a standard regression call that already contains the formula and data, from which `DAGassist` infers the necessary information, or a plain formula accompanied by separate `data` and `engine` arguments.

```{r formula, eval=FALSE}
# formula inferred from a regression call
DAGassist(
  # infers the exposure and outcome from the dagitty object
  dag = dag_model,
  # infers the engine, formula, and data from the regression call
  formula = lm(Y ~ X + C, data = df)
)

# plain formula with explicit arguments
DAGassist(
  dag = dag_model,
  engine = stats::lm, # stats::lm is the default engine
  formula = Y ~ X + C,
  data = df,
  exposure = "X",
  outcome = "Y"
)
```
The two calls above print identical output.
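
As a rough illustration of why a single regression call is enough, the sketch below uses base R to pull the engine, formula, and data out of an unevaluated call. `split_call()` is a hypothetical helper written for this example, not a DAGassist function, and DAGassist's own parsing may differ.

```{r sketch-split-call, eval=FALSE}
# Hypothetical helper (not part of DAGassist): split an unevaluated
# regression call into its engine, formula, and data pieces.
split_call <- function(expr) {
  cl <- substitute(expr)                    # capture the call without running it
  stopifnot(is.call(cl))
  engine <- eval(cl[[1]], parent.frame())   # e.g. the function stats::lm
  cl <- match.call(engine, cl)              # name positional arguments
  list(
    engine  = deparse(cl[[1]]),
    formula = eval(cl$formula, parent.frame()),
    data    = cl$data                       # still a symbol, not the data itself
  )
}

parts <- split_call(lm(Y ~ X + C, data = df))
parts$engine             # name of the modeling function
all.vars(parts$formula)  # variables mentioned in the formula
parts$data               # name of the data object
```

Because the call is captured rather than evaluated, the model is never fitted here and `df` does not even need to exist.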

# Scope Flags

## `imply`: evaluate on only mentioned variables vs the full DAG

 - `imply = FALSE` (default): prune the DAG to just exposure, outcome, and your RHS variables; roles/sets are computed on this pruned graph.
 - `imply = TRUE`: evaluate on the full DAG and allow DAG-implied controls to enter minimal/canonical sets (you’ll be told what’s added).

```{r imply-demo}
#pruned-to-formula DAG
DAGassist(dag = dag_model, formula = Y ~ X + C, data = df, imply = FALSE, show = "roles")

#full-DAG evaluation
DAGassist(dag = dag_model, formula = Y ~ X + C, data = df, imply = TRUE,  show = "roles")
```
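
For intuition about what full-DAG evaluation can add, the adjustment sets of the example DAG can be queried directly with `dagitty::adjustmentSets()`. This sketch is independent of DAGassist and only shows the sets that dagitty reports; it rebuilds the example DAG under the illustrative name `g` so that it runs on its own.

```{r sketch-adjustment-sets, eval=FALSE}
library(dagitty)

# Same structure as the example DAG used in this vignette
g <- dagitty("dag {
  X [exposure]
  Y [outcome]
  Z -> X
  Z -> Y
  X -> M
  M -> Y
  X -> Y
  A -> Y
  B -> Y
  X -> C
  Y -> C
}")

# Smallest set(s) that close every back-door path from X to Y
min_sets <- adjustmentSets(g, type = "minimal", effect = "total")

# Canonical set: the valid controls among the ancestors of X and Y
can_sets <- adjustmentSets(g, type = "canonical", effect = "total")

min_sets
can_sets
```

The formula `Y ~ X + C` mentions none of `Z`, `A`, or `B`, which is why they are candidates to be added only under `imply = TRUE`.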

## `eval_all`: keep non-DAG RHS terms in derived models

Sometimes your RHS has terms that aren’t DAG nodes (e.g., fixed effects via `i(region)`, factor expansions, interactions, splines). `eval_all` decides whether these non-DAG terms are kept in minimal/canonical formulas.

 - `eval_all = FALSE` (default): drop RHS terms not present as DAG nodes from the derived formulas.
 - `eval_all = TRUE`: keep all original RHS terms that aren’t DAG nodes (e.g., fixed effects), in addition to the DAG-based controls.

```{r omit, eval=FALSE}
DAGassist(
  dag = dag_model,
  formula = fixest::feols(Y ~ X + C + fixest::i(region), data = df),
  imply = TRUE,
  eval_all = TRUE
)
```
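
The distinction between DAG nodes and other RHS terms can be sketched in base R. This is an illustration of the behavior described above, not DAGassist's implementation; the node list and the choice of `Z` as the DAG-based control are assumptions made for the example.

```{r sketch-eval-all, eval=FALSE}
# Original specification: two DAG nodes plus two terms that are not DAG nodes
f <- Y ~ X + C + i(region) + X:Z
dag_nodes <- c("A", "B", "C", "M", "X", "Y", "Z")

rhs     <- attr(terms(f), "term.labels")
non_dag <- rhs[!rhs %in% dag_nodes]   # terms with no matching DAG node

# Assumed DAG-based specification for illustration: exposure X plus control Z
dag_terms <- c("X", "Z")

f_drop <- reformulate(dag_terms, response = "Y")              # like eval_all = FALSE
f_keep <- reformulate(c(dag_terms, non_dag), response = "Y")  # like eval_all = TRUE
f_drop
f_keep
```

Note that `terms()` only parses the formula, so `i()` does not have to be available for the sketch to run.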

# Display and Labeling

## `show`: sub-reports
 - `"all"` (default): roles grid + model comparison
 - `"roles"`: just the roles/flags table
 - `"models"`: just the model comparison

```{r show-demo, eval=FALSE}
# just the roles table
DAGassist(dag = dag_model, formula = Y ~ X + Z + C, data = df, show = "roles")
# just the model comparison
DAGassist(dag = dag_model, formula = Y ~ X + Z + C, data = df, show = "models")
```

## `labels`: human-readable names

Provide a named character vector or a small data frame. Note that the `labels` parameter uses `modelsummary()` `coef_rename` logic, so an incomplete label list will not throw an error.

```{r labels}
labs <- c(
  X = "Exposure",
  Y = "Outcome",
  C = "Collider"
)

DAGassist(
  dag = dag_model, formula = lm(Y ~ X + C, data = df),
  show = "roles", labels = labs
)
```
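
The rename-by-name idea behind partial labels can be mimicked in a few lines of base R. This is only a sketch of the concept, not how `modelsummary` or DAGassist implement it, and `terms_shown` is a made-up vector of term names.

```{r sketch-partial-labels, eval=FALSE}
labs        <- c(X = "Exposure", Y = "Outcome")   # no label supplied for C
terms_shown <- c("(Intercept)", "X", "C")

# Replace only the terms that have a label; leave the rest as they are
matched <- terms_shown %in% names(labs)
display <- terms_shown
display[matched] <- labs[terms_shown[matched]]
display
```

Terms without a matching label simply pass through unchanged, so a partial label list is harmless.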

## `omit_intercept` and `omit_factors`: output-only filters

These flags only suppress rows in the printed model comparison; they do not remove terms from estimation. `omit_factors` is particularly useful for keeping reports compact, since a report that lists every factor level can run to hundreds of rows.

```{r omit-demo, eval=FALSE}
DAGassist(
  dag = dag_model,
  formula = fixest::feols(Y ~ X + Z + i(region), data = df),
  omit_intercept = TRUE, omit_factors = TRUE # both TRUE by default
)
```
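
The difference between hiding rows and dropping terms can be seen with a plain `lm()` fit. The sketch below uses simulated data and a base R filter on the coefficient table; it illustrates the principle only and makes no claim about how DAGassist filters its output.

```{r sketch-output-filter, eval=FALSE}
set.seed(1)
d <- data.frame(
  x      = rnorm(200),
  region = factor(sample(c("North", "South", "East", "West"), 200, replace = TRUE))
)
d$y <- 0.5 * d$x + rnorm(200)

fit <- lm(y ~ x + region, data = d)   # intercept and region dummies are estimated
tab <- coef(summary(fit))

# Hide rows for display only; the fitted model is untouched
hide <- rownames(tab) == "(Intercept)" | grepl("^region", rownames(tab))
tab[!hide, , drop = FALSE]

length(coef(fit))   # all estimated coefficients are still in the model
```

Only the row for `x` is displayed, while the intercept and the region dummies remain part of the fit.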

## `bivariate`: include a no-covariate comparison column

Include a `Y ~ X` column for readers who want the raw association. `bivariate = FALSE` by default.

```{r bivariate}
DAGassist(
  dag = dag_model, 
  formula = lm(Y ~ X + C, data = df),
  show = "models",
  bivariate = TRUE
)
```
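
To see why the raw association can differ from an adjusted estimate, the sketch below simulates data with a confounder `Z` and a mediator `M` and compares the two fits with base R. The coefficients are assumptions chosen for illustration, and the code does not call DAGassist.

```{r sketch-bivariate, eval=FALSE}
set.seed(123)
n <- 5000
Z <- rnorm(n)
X <- 0.8 * Z + rnorm(n)
M <- 0.9 * X + rnorm(n)
Y <- 0.7 * X + 0.6 * M + 0.3 * Z + rnorm(n)

b_biv <- coef(lm(Y ~ X))["X"]       # raw association, confounded by Z
b_adj <- coef(lm(Y ~ X + Z))["X"]   # adjusted for the confounder Z

# Total effect implied by the coefficients: direct 0.7 plus 0.9 * 0.6 via M
c(bivariate = unname(b_biv), adjusted = unname(b_adj), implied = 0.7 + 0.9 * 0.6)
```

The bivariate estimate absorbs the back-door path through `Z`, so it sits above the adjusted estimate, which should land near the implied total effect.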


## `verbose`: printing formulas & notes

`verbose = TRUE` (default) prints helpful notes (what was added or dropped, derived formulas). Set it to `FALSE` for a quieter console.

```{r verbose-demo, eval=FALSE}
DAGassist(dag = dag_model, formula = Y ~ X + Z + C, data = df, verbose = FALSE)
```

# Parameter Reference Table

| Parameter        | Type                     | Default   | What it does |
|:----------------:|:-------------------------:|:---------:|:--------------|
| `dag`            | dagitty object           | —         | The DAG to validate and evaluate. |
| `formula`        | formula or single call   | —         | Either `Y ~ X + ...` or a single engine call like `feols(...)`. |
| `data`           | data.frame               | —         | Required unless supplied in engine call. |
| `engine`         | function                 | `stats::lm` | Modeling function (ignored if `formula` is a call). |
| `engine_args`    | named list               | `list()`  | Extra args for `engine(...)`; merged with call args (call wins). |
| `verbose`        | logical                  | `TRUE`    | Print formulas & notes in console. |
| `type`           | string                   | `"console"` | One of `"console"`, `"latex"`, `"docx"/"word"`, `"xlsx"/"excel"`, `"text"/"txt"`. |
| `out`            | path                     | —         | Output path for non-console types. |
| `imply`          | logical                  | `FALSE`   | Scope: pruned-to-formula vs full-DAG evaluation. |
| `labels`         | named chr / data.frame   | `NULL`    | Rename coefficients (modelsummary `coef_rename` logic). |
| `omit_intercept` | logical                  | `TRUE`    | Hide intercept in printed comparison. |
| `omit_factors`   | logical                  | `TRUE`    | Hide factor levels in printed comparison. |
| `bivariate`      | logical                  | `FALSE`   | Add a bivariate `Y ~ X` column to the model comparison. |
| `show`           | string                   | `"all"`   | `"all"`, `"roles"`, or `"models"`. |
| `eval_all`       | logical                  | `FALSE`   | Keep non-DAG RHS terms (FEs, splines, interactions) in derived models. |
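
The "call wins" rule noted for `engine_args` in the table can be pictured with `modifyList()`, where the later list overrides the earlier one on name clashes. Whether DAGassist merges arguments this way internally is an assumption; the sketch only demonstrates the precedence rule with `stats::lm`.

```{r sketch-engine-args, eval=FALSE}
engine_args <- list(method = "qr", singular.ok = TRUE)  # values passed via engine_args
call_args   <- list(singular.ok = FALSE)                # argument given in the call itself

# The later list wins on name clashes, so the call's value overrides engine_args
merged <- modifyList(engine_args, call_args)
str(merged)

d   <- data.frame(x = c(1, 2, 3, 4, 5), y = c(1.1, 1.9, 3.2, 3.9, 5.1))
fit <- do.call(stats::lm, c(list(formula = y ~ x, data = d), merged))
coef(fit)
```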