
When to apply multi-level modelling

Author: Dr Simon Moss


Multi-level modelling refers to a set of techniques in which the data can be measured at several levels, such as individuals, classes, teams, organizations, and so forth. To illustrate, consider a researcher who wants to identify the main predictors of burnout. Suppose the researcher assesses three predictors:

  1. Work hours: the average number of hours the person works every day
  2. Laughter: the average number of times the person laughs every day
  3. Size of organization: the number of employees who work at the organization.

The researcher distributes surveys to 500 employees across 25 organizations. In this instance, the variables are measured at two 'levels': level 1 and level 2:

  1. Level 1 variables include burnout, work hours, and laughter - in which each individual generates his or her own score
  2. Level 2 variables include size of organization - in which each organization, rather than each individual, generates its own score.
Person  Organization  Burnout  Work Hours  Laughter  Size of Organization
1       1             9        53          6         1943
2       1             9        56          7         1943
3       1             8        51          6         1943
4       2             1        38          1         55
5       2             2        32          4         55
6       2             1        31          3         55
..      ..            ..       ..          ..        ..
100     10            5        45          1         10

Hierarchical linear models (Raudenbush & Bryk, 2002), random coefficient models (Kreft & de Leeuw, 1998), variance component models (Longford, 1989), or multilevel random coefficient models (Nezlek, 2001, 2003) are all forms of multi-level modelling and are, essentially, equivalent to one another.

Other examples

Typically, multi-level modelling is applied when the participants belong to specific groups, such as organizations. However, multi-level modelling might need to be applied in many other instances as well. Richter (2006) provided an excellent example in the literature on text comprehension. Often, several participants must read a few sentences, and the reading time for each sentence is recorded. In this instance, the sentences represent Level 1, whereas the participants represent Level 2.

Alternatives to multi-level modelling

Neglect of Level 2 variables

In some instances, researchers do not want to examine Level 2 variables, such as size of organizations. They might, for example, merely want to examine Level 1 variables, such as the relationship between burnout and work hours. In this instance, many researchers merely disregard the organizations from which the individuals were selected.

Two problems arise in this instance. First, an aggregation bias might emerge. To illustrate, perhaps when organizations are large, burnout and work hours tend to be elevated. When organizations are small, burnout and work hours might be reduced. The following table epitomizes this possibility.

Person  Organization  Burnout  Work Hours  Laughter  Size of Organization
1       1             9        53          6         1943
2       1             9        56          7         1943
3       1             8        51          6         1943
4       2             1        38          1         55
5       2             2        32          4         55
6       2             1        31          3         55
..      ..            ..       ..          ..        ..
100     10            5        45          1         10

Hence, if both sets of organizations are included, the sample will include a set of individuals with elevated levels of burnout and work hours and a set of individuals with reduced levels of burnout and work hours. These two variables will seem to be related. In this example, for instance, high levels of burnout seem to coincide with high levels of work hours.

However, at any specific organization, these two variables might not be related. In other words, although the variables seem to coincide with each other, work hours might not increase burnout within one organization.
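The contrast between the pooled relationship and the relationship within organizations can be checked directly. The following is a minimal sketch in Python that uses only the six complete rows shown in the table; the function names are illustrative, and six cases are far too few for any real inference.

```python
from math import sqrt

# The six complete rows from the table: (organization, burnout, work hours).
rows = [
    (1, 9, 53), (1, 9, 56), (1, 8, 51),
    (2, 1, 38), (2, 2, 32), (2, 1, 31),
]

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def centre_within(rows, col):
    """Subtract each organization's mean from the scores of its members."""
    by_org = {}
    for row in rows:
        by_org.setdefault(row[0], []).append(row[col])
    means = {org: sum(v) / len(v) for org, v in by_org.items()}
    return [row[col] - means[row[0]] for row in rows]

# Correlation that ignores organizations altogether.
pooled_r = pearson([r[1] for r in rows], [r[2] for r in rows])

# Correlation after differences between organizations are removed.
within_r = pearson(centre_within(rows, 1), centre_within(rows, 2))

print(f"pooled r = {pooled_r:.2f}")
print(f"within-organization r = {within_r:.2f}")
```

For these six rows, the pooled correlation is about 0.96, whereas the correlation within organizations is about 0.09. The strong pooled relationship therefore reflects differences between the two organizations rather than a relationship among individuals at the same organization.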

Second, standard errors tend to be underestimated, and thus Type I errors often arise, especially if intra-class correlations are elevated. Intra-class correlations, in this instance, refer to the extent to which burnout, work hours, or laughter differ across the organizations. If these variables differ across the organizations, the standard errors will be underestimated (Kreft & de Leeuw, 1998; Snijders & Bosker, 1999).

Least-squares dummy variable approach

Rather than conduct multi-level modelling, some individuals use a technique called the least-squares dummy variable approach or fixed-effects approach to clustering (see Richter, 2006). In essence, individuals conduct typical regression analyses. However, in addition, they might include dummy variables to represent each organization.

That is, each dummy variable corresponds to one organization. The variable is coded as 1 if that person corresponds to that organization and 0 otherwise. In this instance, persons 1, 2 and 3 are employed at Company 1. Persons 4, 5 and 6 are employed at Company 2, and so forth; more dummy variables would need to be included to represent the other companies.

Person  Burnout  Work Hours  Laughter  Company 1  Company 2
1       9        53          6         1          0
2       9        56          7         1          0
3       8        51          6         1          0
4       1        38          1         0          1
5       2        32          4         0          1
6       1        31          3         0          1
..      ..       ..          ..        ..         ..
100     5        45          1         0          0

This technique accommodates differences in the mean value of the criterion, such as burnout, across organizations. The technique is suitable if researchers want to examine Level 1 variables, such as burnout and work hours, that are nested within Level 2 units, such as organizations. The technique is not suitable, however, if researchers want to examine Level 2 variables, such as size of organizations, because each Level 2 variable is constant within an organization and is thus perfectly collinear with the dummy variables. Nor is the technique practical to examine whether the relationship between Level 1 variables, such as burnout and work hours, varies across Level 2 units, such as organizations, because too many interaction terms would need to be included.
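The dummy coding described above can be sketched in a few lines of Python. The function name dummy_code and the drop_first option are hypothetical, introduced only for illustration. The option reflects the usual practice, when the regression includes an intercept, of omitting one organization as the reference category.

```python
def dummy_code(org_ids, drop_first=False):
    """Return {organization: [0/1 for each person]} for a list of organization ids."""
    levels = sorted(set(org_ids))
    if drop_first:
        # The first organization becomes the reference category.
        levels = levels[1:]
    return {org: [1 if o == org else 0 for o in org_ids] for org in levels}

# Organization of persons 1 to 6, as in the table.
organization = [1, 1, 1, 2, 2, 2]
dummies = dummy_code(organization)

for person, org in enumerate(organization, start=1):
    codes = [dummies[level][person - 1] for level in sorted(dummies)]
    print(person, codes)
```

Each resulting column would then be entered as a predictor alongside work hours and laughter in an ordinary regression analysis.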

Two-step procedures

Another alternative to multi-level modelling is to undertake a regression analysis for each Level 2 unit, such as organization, separately. Then, additional analyses are conducted to ascertain whether the B values that emerge differ across these Level 2 units.

The researcher, for example, could undertake one regression analysis to examine whether burnout is related to work hours and laughter in the first organization. The researcher could then conduct another regression analysis to examine whether burnout is related to work hours and laughter in the second organization, and so forth. These analyses might generate the following equations:

For Company 1: Burnout = 4.4 + 1.4 x Work hours - 0.5 x Laughter + Error
For Company 2: Burnout = -1.3 + 0.8 x Work hours - 0.2 x Laughter + Error

Subsequently, they can examine whether the B coefficients that emerge differ across the organizations. An ANOVA, for example, could be conducted to explore whether or not the B coefficients associated with work hours or laughter vary across the organizations. Alternatively, a multiple regression could be conducted to examine whether size of organizations is related to the B coefficients associated with work hours or laughter.
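The first step of this procedure can be sketched as follows. This is a minimal illustration in Python, assuming a single predictor (work hours) and only the three persons shown for each company in the earlier table; with so few cases the coefficients are merely illustrative and will not match the equations above. The function name fit_line is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Work hours and burnout for the persons shown in the table, by company.
data = {
    1: {"hours": [53, 56, 51], "burnout": [9, 9, 8]},
    2: {"hours": [38, 32, 31], "burnout": [1, 2, 1]},
}

# Step 1: one regression analysis for each Level 2 unit.
slopes = {org: fit_line(d["hours"], d["burnout"])[1] for org, d in data.items()}

for org, b in slopes.items():
    print(f"Company {org}: B for work hours = {b:.3f}")
```

In the second step, the values in slopes would become the dependent variable of an ANOVA or regression analysis across organizations.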

Conceptually, this approach resembles multi-level modelling. Nevertheless, this two-step approach is not as effective. First, because each organization is examined separately, the standard errors are often elevated and hence the estimates of B values are not as reliable (Richter, 2006).

Second, the standard error or reliability of these B values differs across Level 2 units - in this instance, across the organizations. Unlike multi-level modelling, this two-step approach seldom accommodates this variability in precision across organizations. That is, the second step should assign more weight to the B values that correspond to lower standard errors. Conventional regression analysis at the second step, however, does not fulfil this objective. Fortunately, weighted least squares, a variation of regression analysis, can solve this problem.
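To illustrate the principle of assigning more weight to the more precise B values, the following minimal Python sketch computes a precision-weighted mean, in which each B value is weighted by the inverse of its squared standard error. The B values are the work hours coefficients from the two equations above; the standard errors are hypothetical, invented solely for this illustration.

```python
def precision_weighted_mean(estimates, std_errors):
    """Mean of the estimates, each weighted by 1 / (standard error squared)."""
    weights = [1 / se ** 2 for se in std_errors]
    return sum(w * b for w, b in zip(weights, estimates)) / sum(weights)

b_work_hours = [1.4, 0.8]   # B values for Company 1 and Company 2
std_errors = [0.2, 0.4]     # hypothetical standard errors

unweighted = sum(b_work_hours) / len(b_work_hours)
weighted = precision_weighted_mean(b_work_hours, std_errors)

print(f"unweighted mean B = {unweighted:.2f}")
print(f"precision-weighted mean B = {weighted:.2f}")
```

Because the B value for Company 1 is assumed to be more precise, the weighted mean (1.28) lies closer to 1.4 than does the unweighted mean (1.10). Weighted least squares applies the same principle when predictors, such as size of organization, are included at the second step.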

Third, in contrast to multi-level modelling, this two-step approach does not appropriately separate the variance associated with the two levels - in this instance, individuals and organizations. This problem arises because these variances are considered in sequence, not in parallel (van der Leeden, 1998).

Benefits of multi-level modelling

In short, multi-level modelling presents several benefits over conventional forms of regression analysis. In particular, multi-level models can be applied to examine whether the relationship between variables at one level, such as burnout and work hours, depends on a variable at another level, such as the size of organizations. Perhaps burnout and work hours are related only when the organizations are large. Conventional techniques cannot appropriately examine interactions between variables measured at different levels.
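As a sketch, this cross-level interaction can be written as a two-level model. The notation below follows a common convention for hierarchical linear models and is an assumption rather than notation taken from this article: i indexes individuals, j indexes organizations, e and u denote residuals at Level 1 and Level 2 respectively.

```latex
\begin{align}
\text{Level 1:}\quad \text{Burnout}_{ij} &= \beta_{0j} + \beta_{1j}\,\text{WorkHours}_{ij} + e_{ij} \\
\text{Level 2:}\quad \beta_{0j} &= \gamma_{00} + \gamma_{01}\,\text{Size}_{j} + u_{0j} \\
\beta_{1j} &= \gamma_{10} + \gamma_{11}\,\text{Size}_{j} + u_{1j}
\end{align}
```

In this formulation, the coefficient \gamma_{11} represents the extent to which the relationship between work hours and burnout depends on the size of the organization.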

Criteria that indicate that multi-level modelling is suitable

Intra-class correlations

Some researchers argue that multi-level modelling might not be necessary when the intra-class correlation (ICC) is low, approaching zero. When the ICC is low, the variables do not differ appreciably across the higher-level units, such as teams or organizations. For instance, in the previous example, perhaps burnout, work hours, and laughter do not differ across the organizations. Hence, according to some researchers, multi-level modelling would not be regarded as essential in this circumstance.

To calculate the intra-class correlation in this instance, conduct an ANOVA. The dependent variable could be burnout, work hours, or laughter. The random factor is the organization. A significant p value at a lenient alpha, such as .25, indicates the variable might differ across the organizations and thus multi-level modelling is essential.

The intra-class correlation is computed from the mean squares of this ANOVA (see also Snijders & Bosker, 1999):
ICC = (Between-subject mean square - Within-subject mean square) / (Between-subject mean square + (n - 1) x Within-subject mean square),
where n is the number of individuals in each group. In SPSS, the line associated with the organization and Hypothesis is called the between-subject mean square. The line associated with the organization and Error is called the within-subject mean square. Unfortunately, the intra-class correlation is difficult to calculate when the number of individuals varies across groups.
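The calculation of the intra-class correlation from the two mean squares can be sketched in Python as follows. The sketch assumes every group comprises the same number of individuals and uses the burnout scores of the six persons shown in the earlier table; the function name icc_from_groups is hypothetical.

```python
def icc_from_groups(groups):
    """Intra-class correlation from one-way ANOVA mean squares (equal group sizes)."""
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes the same number of individuals in every group")
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / (k * n)
    means = [sum(g) / n for g in groups]
    # Between-subject mean square: variability of the group means.
    ms_between = n * sum((m - grand_mean) ** 2 for m in means) / (k - 1)
    # Within-subject mean square: variability of individuals around their group mean.
    ms_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (k * (n - 1))
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)

# Burnout scores for the three persons shown at each of two organizations.
burnout_by_org = [[9, 9, 8], [1, 2, 1]]
icc = icc_from_groups(burnout_by_org)
print(f"ICC for burnout = {icc:.2f}")
```

For these scores, the intra-class correlation is about 0.99, because burnout differs markedly between the two organizations but hardly at all within each organization.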

In addition, according to Nezlek (2008), even when the ICC approaches 0, multi-level modelling might still be essential. For example, even when the ICC approaches 0, the relationship between two variables - such as burnout and work hours - might vary across the groups or organizations. Yet, if the researchers do not conduct multi-level modelling, the analyses they conduct tend to assume these relationships do not vary across groups or organizations.

Design effect

Rather than focus on the intra-class correlation alone, many researchers calculate the design effect. Conceptually, the design effect is the squared standard error when a multilevel design is used divided by the squared standard error when a standard design is used. In this instance, the design effect equals (Snijders & Bosker, 1999):
DE = 1 + (Average number of individuals in each group - 1) x Intra-class correlation.
Many researchers undertake multi-level modelling when the design effects exceed 2 (for Monte Carlo data, see Muthen & Satorra, 1995).
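As a worked example, the formula can be applied to the design described earlier, in which 500 employees are drawn from 25 organizations. The intra-class correlation of .10 is hypothetical, chosen only to illustrate the arithmetic.

```python
def design_effect(avg_group_size, icc):
    """Design effect: 1 + (average group size - 1) x intra-class correlation."""
    return 1 + (avg_group_size - 1) * icc

avg_group_size = 500 / 25   # 20 employees in each organization on average
hypothetical_icc = 0.10     # assumed value, not taken from any data

de = design_effect(avg_group_size, hypothetical_icc)
print(f"design effect = {de:.1f}")
```

Here the design effect is 1 + 19 x .10 = 2.9, which exceeds 2. Hence even a modest intra-class correlation can justify multi-level modelling when groups are reasonably large.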

Number of groups and individuals in each group

Multi-level modelling might not be applicable if the number of groups - that is, teams, organizations, regions, and so forth - is too low. For example, suppose the researcher had examined only 3 organizations. In this instance, the researcher has not collected enough information to form inferences that generalize across organizations. Accordingly, multi-level modelling is clearly unsuitable.

Nezlek (2008) suggests that 10, or even fewer, groups or organizations might be sufficient to generalize across organizations. This number, however, might not be sufficient to ensure that power is adequate.

Indeed, several authors discuss the minimum number of groups, as well as the minimum number of individuals that should be assessed in each group, to ensure power is sufficient (e.g., Maas & Hox, 2005; Richter, 2006). Richter (2006), for example, refers to the 30-30 rule, according to which at least 30 groups, each comprising at least 30 individuals, are needed to ensure sufficient power. This rule is applicable when the design comprises two levels and interactions across levels, such as between work hours and size of organizations, need to be examined.

More specifically, Mok (1995) examined whether researchers should attempt to maximize the number of Level 1 or Level 2 units - that is, whether they should maximize the number of individuals in each group or the number of groups. As simulation studies demonstrated, researchers should attempt to maximize the number of groups rather than the number of individuals in each group. This approach is especially likely to increase power, particularly for estimates of variance components rather than fixed effects.

As a consequence of these observations, the 30-30 rule has often been challenged. Hox (1998), for example, champions the 50-20 rule - at least 50 organizations or units at Level 2 and at least 20 individuals or units at Level 1 within each organization. Indeed, according to Hox (1998), if researchers need to estimate variance components accurately, a 100-10 rule might apply.


Goldstein, H. (2003). Multilevel statistical models (3rd ed.). New York: Oxford University Press.

Hox, J. J. (1995). Applied multilevel analysis. Amsterdam: TT-Publikaties.

Hox, J. J. (1998). Multilevel modeling: When and why. In I. Balderjahn, R. Mathar, & M. Schader (Eds.), Classification, data analysis and data highways (pp. 147-154). New York: Springer.

Kenny, D. A., Mannetti, L., Pierro, A., Livi, S., & Kashy, D. A. (2002). The statistical analysis of data from small groups. Journal of Personality and Social Psychology, 83, 126-137.

Kreft, I. G. G., & de Leeuw, J. (1998). Introducing multilevel modeling. Newbury Park, CA: Sage Publications.

Maas, C. J. M., & Hox, J. J. (2005). Sufficient sample sizes for multilevel modeling. Methodology, 1, 86-92.

Mok, M. (1995). Sample-size requirements for 2-level designs in educational research. Multilevel Modelling Newsletter, 7, 11-15.

Muthen, B. O., & Satorra, A. (1995). Complex sample data in Structural Equation Modeling. Sociological Methodology, 25, 267-316.

Nezlek, J. B. (2001). Multilevel random coefficient analyses of event and interval contingent data in social and personality psychology research. Personality and Social Psychology Bulletin, 27, 771-785.

Nezlek, J. B. (2003). Using multilevel random coefficient modeling to analyze social interaction diary data. Journal of Social and Personal Relationships, 20, 437-469.

Nezlek, J. B. (2008). An introduction to multilevel modeling for Social and Personality Psychology. Social and Personality Psychology Compass, 2, 842-860.

Nezlek, J. B., & Zyzniewski, L. E. (1998). Using hierarchical linear modeling to analyze grouped data. Group Dynamics: Theory, Research, and Practice, 2, 313-320.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical Linear Models (2nd ed.). Newbury Park, CA: Sage Publications.

Richter, T. (2006). What is wrong with ANOVA and multiple regression? Analyzing sentence reading times with hierarchical linear models. Discourse Processes, 41, 221-250.

Snijders, T., & Bosker, R. (1999). Multilevel Analysis. London: Sage Publications.

van der Leeden, R. (1998). Multilevel analysis of repeated measures data. Quality and Quantity, 32, 15-29.


Last Update: 6/17/2016