Hierarchical Linear Modeling

Date Posted:

Friendly reminder that I am a student, and not a professional in the field. Please review the resources below before implementing the model. If there are any errors, please leave a comment or contact me and I will get them resolved immediately.

Introduction

Data often have an inherent hierarchical structure, in which units at one level are nested within units at a higher level.

A common example is student education. The data could be broken up into the following levels (example from Woltman et al., 2012):

  1. Individual students (IQ, GPA, Gender)
  2. Classroom (Teacher experience, class size)
  3. School (Geographical location, budget)

Here, students (level 1) are nested within classrooms (level 2), which are then nested within schools (level 3). One can immediately imagine this extending to an even greater number of levels if we consider school district, country region, or other grouping variables.

From the paper, the following sample hypothesis was provided:

What school-, classroom-, and student-related factors influence students’ Grade Point Average?

With Hierarchical Linear Modeling (HLM), we can account for and investigate the relationships between and within hierarchical levels.
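To make this a little more concrete, here is a rough sketch of how a two-level random-intercept model is commonly written. The symbols are my own assumptions for illustration (Y is a student outcome such as GPA, X a student-level predictor, W a classroom-level predictor), so check them against the notation in the resources before relying on them.

```latex
% Level 1: student i in classroom j
Y_{ij} = \beta_{0j} + \beta_{1} X_{ij} + r_{ij}, \qquad r_{ij} \sim N(0, \sigma^{2})

% Level 2: the intercept of classroom j depends on a classroom-level predictor
\beta_{0j} = \gamma_{00} + \gamma_{01} W_{j} + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00})

% Combined form, after substituting level 2 into level 1
Y_{ij} = \gamma_{00} + \gamma_{01} W_{j} + \beta_{1} X_{ij} + u_{0j} + r_{ij}
```

The term u_{0j} is what lets each classroom have its own intercept, while r_{ij} captures the leftover student-level variation.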

Naive approach

What if we just pretend the hierarchical nature doesn’t exist, and we just continue with a basic regression?

If there are substantial differences at higher levels, ignoring the hierarchy can produce behavior very similar to Simpson's paradox: a naive model may show a strong positive relationship, while a model that accounts for the grouping shows a strong negative one. Just as adding the lurking variable reveals the true relationship in Simpson's paradox, HLM allows a separate intercept for each higher-level group (at least in 2-level models).

Additionally, suppose that individuals within the same higher-level group are correlated with each other (for example, students who share a teacher). This immediately violates the independence assumption of simple regression, which tends to make the estimated standard errors too small.
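A small simulation can illustrate the sign flip. This is only a sketch on made-up data: the five "classrooms", the slopes, and the helper `ols_slope` are all assumptions chosen for illustration, not anything taken from the paper.

```python
import random

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

random.seed(0)

# Hypothetical data: 5 classrooms with 40 students each.
# Inside every classroom the true slope is -1, but classrooms whose
# x values are larger also sit on a much higher baseline for y.
groups = {}
for g in range(5):
    center = 10.0 * g
    xs = [center + random.gauss(0, 2) for _ in range(40)]
    ys = [3.0 * center - 1.0 * (x - center) + random.gauss(0, 1) for x in xs]
    groups[g] = (xs, ys)

# Naive approach: pool everything and ignore the classrooms.
all_x = [x for xs, _ in groups.values() for x in xs]
all_y = [y for _, ys in groups.values() for y in ys]
pooled_slope = ols_slope(all_x, all_y)

# Grouped approach: fit a separate line inside each classroom.
within_slopes = [ols_slope(xs, ys) for xs, ys in groups.values()]

print("pooled slope:", round(pooled_slope, 2))
print("within-classroom slopes:", [round(s, 2) for s in within_slopes])
```

The pooled slope comes out positive because it is dominated by the differences between classrooms, while every within-classroom slope is negative.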

If one decides to aggregate the data to a higher level (e.g. averaging individual data up to the classroom level instead of using it directly), we lose the variability between individuals, which can completely change the apparent relationships between variables.
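To get a feel for how much information aggregation can throw away, here is a minimal sketch on simulated scores. The six classrooms, the score scale, and the spread values are hypothetical numbers picked for illustration.

```python
import random
from statistics import mean, pvariance

random.seed(1)

# Hypothetical data: 6 classrooms with 30 students each.
# Each classroom gets a small shared effect; students vary a lot more.
classrooms = []
for _ in range(6):
    class_effect = random.gauss(0, 1)
    scores = [70 + class_effect + random.gauss(0, 5) for _ in range(30)]
    classrooms.append(scores)

# Individual-level data versus data aggregated to classroom means.
all_scores = [s for scores in classrooms for s in scores]
class_means = [mean(scores) for scores in classrooms]

total_var = pvariance(all_scores)     # variability across all students
between_var = pvariance(class_means)  # variability left after aggregating

print("observations before/after:", len(all_scores), len(class_means))
print("variance before:", round(total_var, 2))
print("variance after: ", round(between_var, 2))
```

After averaging, only the spread between classroom means remains, and the number of data points drops from 180 to 6, so any relationship that lives at the student level is no longer visible.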

We do note that if the higher-level grouping has no significant effect on the outcome, a simple regression should be sufficient. This can be checked with an ANOVA that compares the outcome across groups.
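As a rough sketch of that check, the code below computes a one-way ANOVA F statistic by hand, together with the intraclass correlation (ICC) derived from the same mean squares. The helpers `one_way_anova` and `simulate` and all the data are hypothetical, and the formulas assume equally sized groups. For an actual p-value, a library routine such as `scipy.stats.f_oneway` would be the more sensible choice.

```python
import random
from statistics import mean

def one_way_anova(groups):
    """Return (F statistic, ICC) for a list of equally sized groups."""
    k = len(groups)     # number of groups (e.g. classrooms)
    n = len(groups[0])  # observations per group
    grand = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]
    ss_between = n * sum((m - grand) ** 2 for m in group_means)
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (k * (n - 1))
    f_stat = ms_between / ms_within
    # One-way ICC: share of variance attributable to the grouping.
    icc = (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)
    return f_stat, icc

def simulate(group_sd, k=10, n=25):
    """Simulate k groups of n values with a shared per-group effect."""
    groups = []
    for _ in range(k):
        effect = random.gauss(0, group_sd)
        groups.append([effect + random.gauss(0, 1) for _ in range(n)])
    return groups

random.seed(2)
f_strong, icc_strong = one_way_anova(simulate(group_sd=2.0))
f_none, icc_none = one_way_anova(simulate(group_sd=0.0))

print("strong grouping: F =", round(f_strong, 2), "ICC =", round(icc_strong, 2))
print("no grouping:     F =", round(f_none, 2), "ICC =", round(icc_none, 2))
```

A large F (and an ICC well above zero) suggests the grouping matters and a multilevel model is worth considering. An F near 1 with an ICC near zero suggests a simple regression may be enough.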