At Mid-Phon 17, I was talking to a colleague about mixed-effects (i.e., multilevel) modeling, and he stated very matter-of-factly that when a random-intercepts-by-subjects model is estimated, no intercepts are estimated for the individual subjects. Rather, he continued, the model specifies a probability distribution over the subject-specific intercepts, so you get an estimate of the variance of the intercepts (with, though it wasn’t mentioned explicitly during this conversation, the group-level intercept providing the mean of the distribution).
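For concreteness (my notation, not anything from the conversation), a random-intercepts-by-subjects model for observation $i$ from subject $j$ can be written as:

```latex
y_{ij} = \beta_0 + u_j + \varepsilon_{ij}, \qquad
u_j \sim \mathcal{N}(0,\, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0,\, \sigma_\varepsilon^2)
```

The claim, then, was that fitting the model yields estimates of $\beta_0$, $\sigma_u^2$, and $\sigma_\varepsilon^2$, but not of the individual $u_j$.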
I didn’t know how to respond, since this seemed obviously wrong to me. It seemed wrong to me because it is wrong. The next talk was starting, though, so there was but a brief moment of awkward silence as I tried to process these assertions prior to returning to my seat.
The topic didn’t come up again at the meeting, though shortly thereafter I talked to a (different) colleague (and friend) of mine who is knowledgeable about such models. He immediately asked if the first colleague uses SPSS. I didn’t know the answer, but it’s certainly possible, even plausible. Colleague #2 asked because, apparently, SPSS makes it exceptionally difficult (maybe impossible) to see the estimated subject-specific (or item-specific, or whatever-grouping-variable-specific) parameters in a fitted model.
So maybe SPSS users have a strange, and limited, view of how random effects work, I thought. Until today.
Today, I was finishing up reading Florian Jaeger’s 2008 paper (pdf) arguing that mixed-effects logistic regression models are superior to ANOVA when doing categorical data analysis. I came to more or less the same conclusion sometime around 2008, though without reading Jaeger’s paper. I say more or less the same conclusion because while I agree with Jaeger that (the traditional conception of) ANOVA isn’t appropriate for categorical data analysis, I think that logistic regression is just one of a number of more appropriate models (GRT being another such model, at least for certain cases).
In any case, I finally got around to starting Jaeger’s paper recently, and I picked it back up today only to find this quote (p. 443):
The only parameter the model fits for the random effects is their variance (see also Baayen et al., 2008; for details on the implementation, see Bates & Sarkar, 2007).
Jaeger is discussing these models in the context of R and lme4, so I am kind of flabbergasted at this assertion. I’m flabbergasted in part because when you fit a multilevel regression model (logistic or otherwise) using lmer, the ranef() function, with the lmer output object as the argument, returns the estimated random effects parameters. In fact, I just happened to have used it to make a figure illustrating the distribution of random item intercepts in a mixed-effects logistic regression model a few days ago (for one of my ICA/ASA/CAA proceedings papers):
When a mixed-effects model is fitted to a data set, its set of estimated parameters includes the coefficients for the fixed effects on the one hand, and the standard deviations and correlations for the random effects on the other hand. The individual values of the adjustments made to intercepts and slopes are calculated once the random-effects parameters have been estimated. Formally, these adjustments, referenced as Best Linear Unbiased Predictors (or BLUPs), are not parameters of the model.
I don’t know much about BLUPs, so maybe I’m way off-base to say that it strikes me as rather silly to treat estimated random effects as substantively distinct from parameters. They’re certainly not data; we haven’t, indeed we can’t in principle, observe them. And they enter into the equation from which predicted values of the dependent variable are calculated in, as far as I can tell, exactly the same way that the ‘fixed effects’ parameters do. In addition, statistical bias is a property of a model’s parameter(s), so to call the random effects unbiased non-parameters seems paradoxical. All of which makes them sound an awful lot like parameters to me. The fact that the variance and covariance parameters governing the random effects are estimated prior to estimating the random effects themselves is neither here nor there with respect to what we call the latter.
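To see how the estimation order can work without making the random effects any less parameter-like, here’s a minimal sketch in Python (all names and simulated data are mine, and it uses simple moment-based estimates for the balanced one-way case rather than whatever lmer actually does internally): given estimates of the variance components, each subject’s BLUP is just that subject’s mean shrunk toward the grand mean.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a balanced random-intercepts design:
# 20 subjects, 50 observations each.
n_subj, n_obs = 20, 50
beta0, sigma_u, sigma_e = 2.0, 1.0, 2.0
u = rng.normal(0, sigma_u, n_subj)                      # true subject intercepts
y = beta0 + u[:, None] + rng.normal(0, sigma_e, (n_subj, n_obs))

# Step 1: estimate the variance components (moment-based
# estimators, valid for the balanced case only).
grand_mean = y.mean()
subj_means = y.mean(axis=1)
var_e_hat = y.var(axis=1, ddof=1).mean()                # within-subject variance
var_u_hat = subj_means.var(ddof=1) - var_e_hat / n_obs  # between-subject variance

# Step 2: given those estimates, the BLUP for each subject's
# intercept deviation is the subject mean shrunk toward the
# grand mean by a factor that depends on the variance ratio.
shrink = var_u_hat / (var_u_hat + var_e_hat / n_obs)
blups = shrink * (subj_means - grand_mean)

# One estimated number per subject, entering the model equation
# exactly where the true u_j would.
```

Whatever we call the resulting per-subject numbers, they are computed, reported, and plugged into the model equation just like any other estimate.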
Indeed, it seems very odd to me that the variance and covariance parameters are estimated first, since my intuition is that there would need to be something varying and covarying in order to estimate these. I come at all of this from a Bayesian perspective, by which I mean that the first few multilevel models I fit to data were built and estimated using BUGS and JAGS. In these cases, I don’t see any way that the ‘random effects’ could fail to be parameters while the means, variances, and covariances governing them are. These bits aren’t data, but they are part of the model equation – without them, you can’t calculate the likelihood function (see, e.g., the ‘theta construction’ section of this model or the role that the Bsp and Bsf arrays play in this model).
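The Bayesian formulation makes this point directly. Here’s a toy sketch (my own made-up example, in Python rather than BUGS/JAGS): the subject intercepts u enter as an explicit argument, i.e., a parameter vector, without which the joint (log) density of the model simply can’t be evaluated.

```python
import numpy as np

def log_joint(y, subj, beta0, u, sigma_u, sigma_e):
    """Log joint density of data and subject intercepts.

    y     : observations
    subj  : subject index for each observation
    beta0 : group-level intercept (the mean of the intercept
            distribution is absorbed into beta0, so u ~ N(0, sigma_u))
    u     : one intercept deviation per subject -- a parameter
            vector in this formulation
    """
    mu = beta0 + u[subj]  # the per-observation mean ('theta construction')
    log_lik = np.sum(-0.5 * ((y - mu) / sigma_e) ** 2 - np.log(sigma_e))
    log_prior_u = np.sum(-0.5 * (u / sigma_u) ** 2 - np.log(sigma_u))
    return log_lik + log_prior_u

# Made-up data: 5 subjects, 10 observations each.
rng = np.random.default_rng(1)
subj = np.repeat(np.arange(5), 10)
u_true = rng.normal(0, 1.0, 5)
y = 2.0 + u_true[subj] + rng.normal(0, 0.5, subj.size)

# No u, no mu, no likelihood: the subject intercepts are as much a
# part of the model as beta0 or the variance parameters.
value = log_joint(y, subj, 2.0, u_true, 1.0, 0.5)
```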
So, okay, it’s still possible that SPSS users have a weird, limited understanding of mixed-effects modeling. But there’s something else going on, too, and it’s not entirely clear to me what it is. I assume that the D. M. Bates who is the third author on the Baayen et al. paper is the same D. Bates who co-developed lme4, so I’m perfectly willing to grant that, with respect to (penalized maximum likelihood) estimation, the random effects, on the one hand, and the fixed effects and the variances/covariances governing the random effects, on the other, are treated differently. But it seems confusing and confused, at best, to insist that the former are not parameters.