--- title: "HD2022" author: "Victor Navarro" output: rmarkdown::html_vignette bibliography: references.bib csl: apa.csl vignette: > %\VignetteIndexEntry{HD2022} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- # The mathematics behind HeiDI The HeiDI model has four major components: 1) the acquisition of reciprocal associations between stimuli, 2) the pooling of those associations into stimulus activations, 3) the distribution of those activations into stimulus-specific response units, and 4) the generation of responses. ## 1 - Acquiring reciprocal associations Whenever a trial is given, HeiDI learns associations among stimuli. The association between two stimuli, $i$ and $j$ is denoted via $v_{i,j}$. The association $v_{i,j}$ represents a directional expectation: the expectation of $j$ after being presented with $i$. Furthermore, its value represents the nature of the effect that $i$ has over the representation of $j$. If positive, the presentation of $i$ "excites" the representation of $j$. If negative, the presentation of $i$ "inhibits" the representation of $j$. HeiDI not only learns "forward" associations between stimuli, but also their reciprocal, or "backward" associations. Thus, if organisms are presented with $i \rightarrow j$, organisms not only learn about $v_{i,j}$, but also about $v_{j, i}$, or the expectation of receiving $i$ after being presented with $j$. Note that, for the sake of brevity, the learning equations below are only specified for forward associations. ### 1.1 - The stimulus expectation rule HeiDI generates expectations about stimuli. The expectation of stimulus $j$ ($e_j$) is expressed as $$ \tag{Eq. 1} e_j = \sum_{k}^{K}x_kv_{k,j} $$ where $K$ is the set containing all stimuli in the experiment, and $x_k$ is a quantity denoting the presence or absence of stimulus $k$ (1 or 0, respectively)[^note1]. ### 1.2 - Learning rule HeiDI learns the appropriate expectations via error-correction mechanisms. After trial $t$, the association between stimuli $i$ and $j$ is expressed as $$ \tag{Eq. 2} v_{i,j, t} = v_{i,j, t-1} + \Delta v_{i,j, t} $$ where $v_{j,i, t-1}$ is the forward association between $i$ and $j$ on trial $t-1$, and $\Delta v_{i,j, t}$ is the change in that association as a result of trial $t$. That delta term uses a pooled error term and is expressed as $$ \tag{Eq. 3} \Delta v_{i,j} = x_i\alpha_i(x_jc\alpha_j - e_j) $$ where $\alpha_i$ and $\alpha_j$ are parameters representing the salience of stimuli $i$ and $j$, respectively ($0 \le \alpha \le 1$), $c$ is a scaling constant ($c = 1$). Note that the term denoting the trial, $t$ has been omitted here for simplicity. ## 2 - Pooling the strength of associations HeiDI pools its stimulus associations to activate stimulus-specific representations. The activation of the representation for stimulus $j$, $a_j$, is defined as: $$ \tag{Eq. 4} a_{j,M} = o_{j,M} + h_{j,M} $$ where $o_{j,M}$ denotes the c**o**mbined associative strength towards stimulus $j$ in presence of stimuli $M$, and $h_{j,M}$ denotes the c**h**ained associative strength towards stimulus $j$ in presence of stimuli $M$. ### 2.1 - Combined associative strength The quantity $o_{j,M}$ is the result of combining the associative strength of forward and backward associations to and from stimulus $j$ as $$ \tag{Eq. 5} o_{j,M} = \sum_{m \neq j}^{M}v_{m,j} + \left(\frac{\sum_{m \neq j}^{M}v_{m,j} \sum_{m \neq j}^{M}v_{j,m}}{c}\right) $$ where each of the sums above run over all stimuli $M$ presented in the trial, different from stimulus $j$.^[An alternative formulation of this equation could be $\sum_{m \neq j}^{M} v_{m,j} + (v_{m,j} v_{j,m})$ but, although this alternative formulation is positively related to Eq. 5, we have not compared their behavior exhaustively.] The left-hand term describes how the forward associations from stimuli $M$ to $j$ affect the representation of $j$, whereas the right-hand term describes how the backward associations that $j$ has with stimuli $M$ affect its representation (although these are modulated by the forward associations themselves). ### 2.2 - Chained associative strength The quantity $h_{j,M}$ captures the indirect associative strength that the stimuli $M$ have with $j$, via absent stimuli. As such, $h_{j,M}$ is defined as $$ \tag{Eq. 6a} h_{j,M} = \sum_{m \neq j}^{M} \sum_{n}^{N}\frac{v_{m,n}o_{j,n}}{c} $$ where N are the stimuli not presented on the trial (i.e., K-M). Note the re-use of $o$, the quantity defined in Eq. 5. This equation allows absent stimuli $N$ to influence the representation of stimulus $j$, as long as they have an association with present stimuli $M$. In Honey and Dwyer (2022), the authors specify a similarity-based mechanism that modulates the effect of associative chains according to the similarity of the salience of nominal and retrieved stimuli^[This mechanism is in model `HD2022` but not in model `HDI2020`]. As such, Eq. 6a is expanded as: $$ \tag{Eq. 6b} h_{j,M} = \sum_{m \neq j}^{M} \sum_{n}^{N}S(\alpha_{n}, \alpha'_n)\frac{v_{m,n}o_{j,n}}{c} $$ where $S$ is a similarity function that takes the nominal salience of stimulus n, $\alpha_n$ (as perceived when $n$ is presented on a trial) and its retrieved salience, $\alpha'_n$ (as perceived when $n$ is retrieved via other stimuli M, see ahead). This function is defined as: $$ \tag{Eq. 7} S(\alpha_n, \alpha'_n) = \frac{\alpha_n}{\alpha_n + |\alpha_n-\alpha'_n|} \times \frac{\alpha'_n}{\alpha'_n+ |\alpha_n-\alpha'_n|} $$ Notably, whenever there is more than one nominal salience for a given stimulus, then $\alpha_n$ is the arithmetic mean among all nominal values (see "heidi_similarity" vignette). ## 3 - Distributing strength into stimulus-specific response units HeiDI then distributes the pooled stimulus-specific strength among all $K$ stimuli, according to their relative salience. The activation of response unit $j$, $R_j$ is expressed as $$ \tag{Eq. 8} R_{j,k} = \frac{\theta(j)}{\sum_{k}^{K}\theta(k)}a_{k,M} $$ where $j \in K$. As $K$ can include both present and absent stimuli, the $\theta$ function above depends on whether the stimulus $k$ is absent (i.e., $k \in N$) or not (i.e., $k \in M$), as: $$ \tag{Eq. 9} \theta(k) = \begin{cases} \left |\sum_{m}^{M}\left( v_{m,k}+\sum_{n \neq k}^{N}\frac{v_{m,n}v_{n,k}}{c}\right) \right|,& \text{if } k \in N\\ \alpha_k, & \text{otherwise} \end{cases} $$ Note that the quantity for absent stimuli is absolute, to prevent negative $\theta$ values due to inhibitory associations^[An alternative and perhaps more naturalistic parametrization of this rule would be to use $min[0,\theta(n)]$, where $min$ is the minimum function and $n$ is an absent stimulus; ReLUs are extensively used in neural networks. Another alternative that avoids the use of absolute values or a rectifying mechanism would be to use quantities of $e^{\theta(k)}$ instead of $\theta(k)$.]. Also, note a summation term is used on the left-hand side of the expression for an absent stimulus. It implies that all the present stimuli $M$ contribute to the salience of stimulus $k$. Finally, note on the right-hand side of the same expression that the present stimuli contribute not only via the direct association each of them has with $k$, $v_{m,k}$ but also through associative chains with other absent stimuli (c.f., Eq. 6a). ## 4 - Generating responses Finally, HeiDI responds. The response-generating mechanisms in HeiDI are currently underspecified. In its current version, HeiDI's responses are the product of the activation of stimulus-specific response units and the connection that those units have with specific motor units. As such, the activation of motor unit $q$, $r_q$, is given by $$ \tag{Eq. 10} r_q = R_jw_{j,q} $$ where $w_{j,q}$ is a weight representing the association between stimulus-specific unit $j$ and motor unit $q$. [^note1]: We go the extra length of specifying $x$ quantities because the stimulus expectation and learning rules can be vectorized, as $\textbf{e} = \textbf{x}V$ and $\Delta V = (\textbf{x}\odot\textbf{a})' (c(\textbf{x}\odot\textbf{a})-\textbf{e})$, respectively. Here, the matrix $V$ contains all associations between each pair of stimuli, the row vectors $\textbf x$ and $\textbf a$ denote the presence and salience of all stimuli $K$, the $\odot$ symbol specifies element-wise multiplication, and the $'$ symbol denotes transposition. Note further that the $\Delta V$ matrix must be made hollow before summing it to $V$.