Anova With Repeated Measures In R

ANOVA with Repeated Measures in R: A Comprehensive Guide

Analyzing data with repeated measures is crucial in many research fields. Repeated measures ANOVA (rANOVA) is a powerful statistical technique used when the same subjects are measured multiple times under different conditions or at different time points. This guide provides a comprehensive walkthrough of performing repeated measures ANOVA in R, covering everything from the underlying assumptions to interpreting the results and handling violations.

Understanding Repeated Measures ANOVA

Repeated measures ANOVA is a statistical test that examines differences between the means of three or more conditions or time points when the same participants are measured at each one. (With only two measurements, the analysis reduces to a paired t-test.) Unlike independent measures ANOVA, which compares separate groups of participants, repeated measures ANOVA accounts for the correlation between measurements taken from the same subject. Because stable differences between subjects are separated from the error term, the test is usually more sensitive to real effects.

Key Advantages of Repeated Measures ANOVA:

Increased Power: By controlling for individual differences, it's more likely to detect significant effects compared to independent measures ANOVA.
Reduced Error Variance: Variability due to stable differences between subjects is removed from the error term, so the effect is tested against within-subject variability only.
Efficiency: Fewer participants are needed compared to independent measures ANOVA to achieve the same level of power.
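
The gain in power comes from how the total variability is partitioned. The following base R sketch illustrates this on a small simulated dataset; the means, standard deviations, and variable names are made up for illustration and are not part of the worked example later in this guide. It splits the total sum of squares into subject, time, and residual parts, then compares the F-statistic built on the repeated measures error term with the one a between-subjects analysis of the same numbers would produce.

```r
# Illustrative simulation (assumed values): 20 subjects, 3 time points,
# each subject has a stable baseline that shifts all of their scores.
set.seed(1)
n <- 20
k <- 3
baseline    <- rnorm(n, mean = 0, sd = 3)       # individual differences
time_effect <- c(10, 12, 15)                    # assumed condition means
scores <- outer(baseline, time_effect, "+") +
  matrix(rnorm(n * k, sd = 1), nrow = n)        # n x k matrix, one row per subject

# Partition the total sum of squares
grand      <- mean(scores)
ss_time    <- n * sum((colMeans(scores) - grand)^2)
ss_subject <- k * sum((rowMeans(scores) - grand)^2)
ss_total   <- sum((scores - grand)^2)
ss_error   <- ss_total - ss_time - ss_subject   # subject-by-time residual

# Repeated measures F: subject variability is removed from the error term
f_rm <- (ss_time / (k - 1)) / (ss_error / ((n - 1) * (k - 1)))

# Between-subjects F: subject variability stays in the error term
f_between <- (ss_time / (k - 1)) / ((ss_subject + ss_error) / (k * (n - 1)))

print(c(F_repeated = f_rm, F_between = f_between))
```

The larger the stable differences between subjects relative to the residual noise, the bigger the gap between the two F-statistics.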

When to Use Repeated Measures ANOVA:

Longitudinal Studies: Tracking changes in a variable over time within the same individuals.
Pre-Post Designs: Comparing measurements before and after an intervention.
Within-Subjects Experiments: Testing the same subjects under different conditions or treatments.

Assumptions of Repeated Measures ANOVA

Before conducting a repeated measures ANOVA, it's essential to check if the data meets the following assumptions:

Normality: The scores at each level of the within-subject factor (or, more precisely, the model residuals) should be approximately normally distributed. This can be checked using histograms, Q-Q plots, and Shapiro-Wilk tests. Repeated measures ANOVA is relatively robust to moderate violations of normality, especially with larger sample sizes.
Sphericity: This is the assumption most specific to repeated measures designs. Sphericity means that the variances of the differences between all pairs of levels of the within-subject factor are equal, so it only becomes an issue when the factor has three or more levels. Mauchly's test of sphericity is used to assess this assumption. If sphericity is violated, adjustments to the degrees of freedom (e.g., Greenhouse-Geisser or Huynh-Feldt corrections) are necessary.
Independence: Observations from different subjects should be independent of one another. Measurements within a subject are expected to be correlated; that correlation is what the design models. Strong temporal autocorrelation (e.g., measurements taken very close together in time) tends to show up as a violation of sphericity.
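
The first two checks can be sketched in base R alone. The example below is a minimal illustration on simulated wide-format data (the data frame name wide and its values are assumptions for this sketch). It runs a Shapiro-Wilk test for each time point and then Mauchly's test on a multivariate linear model of the three measures, where X = ~1 asks for the test on the within-subject contrasts.

```r
# Simulated wide-format data (illustrative values): one row per subject
set.seed(123)
wide <- data.frame(
  time1 = rnorm(20, mean = 10, sd = 2),
  time2 = rnorm(20, mean = 12, sd = 2),
  time3 = rnorm(20, mean = 15, sd = 2)
)

# Normality: Shapiro-Wilk p-value for each level of the within-subject factor
normality_p <- sapply(wide, function(x) shapiro.test(x)$p.value)
print(normality_p)

# Sphericity: Mauchly's test on a multivariate linear model of the three
# measures; X = ~1 restricts the test to the within-subject contrasts
fit_mlm <- lm(cbind(time1, time2, time3) ~ 1, data = wide)
sphericity <- mauchly.test(fit_mlm, X = ~1)
print(sphericity)
```

Small p-values in either check suggest a possible violation, although with small samples both tests have limited power, so plots are worth inspecting as well.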

Performing Repeated Measures ANOVA in R

Let's demonstrate how to conduct a repeated measures ANOVA in R using a simulated dataset. We'll create a dataset with three repeated measures (time1, time2, time3) for 20 subjects.

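# Load packages up front
library(tidyr)  # pivot_longer()
library(dplyr)  # provides the %>% pipe
library(ez)     # ezANOVA()

# Set seed for reproducibility
set.seed(123)

# Simulate data in wide format: one row per subject
data <- data.frame(
  subject = 1:20,
  time1 = rnorm(20, mean = 10, sd = 2),
  time2 = rnorm(20, mean = 12, sd = 2),
  time3 = rnorm(20, mean = 15, sd = 2)
)

# Reshape to long format: one row per subject-by-time observation
data_long <- data %>%
  pivot_longer(cols = c(time1, time2, time3),
               names_to = "time",
               values_to = "score")

# ez was written for base data frames; converting the tibble returned by
# pivot_longer() avoids column type-check errors
data_long <- as.data.frame(data_long)

# The subject ID and the within-subject factor should both be factors
data_long$subject <- factor(data_long$subject)
data_long$time <- factor(data_long$time, levels = c("time1", "time2", "time3"))

# Perform repeated measures ANOVA
model <- ezANOVA(
  data = data_long,
  dv = score,
  wid = subject,
  within = time
)

print(model)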

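This code uses the ez package, which simplifies the process significantly. The ezANOVA function takes the long-format data as input, specifying the dependent variable (score), subject ID (subject), and the within-subject factor (time). The output is a list: the ANOVA table gives the degrees of freedom, F-statistic, p-value, and generalized eta squared (ges) for the within-subject effect, and, because time has three levels, it is accompanied by Mauchly's test for sphericity and a table of sphericity corrections.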
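
As a cross-check that does not depend on any add-on package, the same design can be fitted with base R's aov function. This is a sketch rather than a replacement for the ez workflow: it rebuilds the simulated scores with the same seed and draw order (the object names long and fit_aov are choices made for this example) and uses an Error term to declare that time varies within subjects. Unlike ezANOVA, aov does not report Mauchly's test or sphericity corrections.

```r
# Rebuild the simulated data in long format using base R only
set.seed(123)
scores <- c(rnorm(20, mean = 10, sd = 2),
            rnorm(20, mean = 12, sd = 2),
            rnorm(20, mean = 15, sd = 2))
long <- data.frame(
  subject = factor(rep(1:20, times = 3)),
  time    = factor(rep(c("time1", "time2", "time3"), each = 20)),
  score   = scores
)

# Error(subject/time) tells aov() that time varies within subjects
fit_aov <- aov(score ~ time + Error(subject / time), data = long)
aov_summary <- summary(fit_aov)
print(aov_summary)

# The within-subject stratum holds the test of the time effect
within_table <- aov_summary[["Error: subject:time"]][[1]]
print(within_table[["Df"]])  # numerator and denominator degrees of freedom
```

With 20 subjects and three time points, the time effect is tested on 2 and 38 degrees of freedom, which should agree with the DFn and DFd columns of the ezANOVA table.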

Interpreting the Results

The ANOVA table in the ezANOVA output reports the test of the within-subject effect. A p-value below the chosen significance level (typically 0.05) indicates that at least one of the time-point means differs from the others. This is an omnibus result only: post-hoc tests are needed to determine which specific time points differ significantly from each other.

Post-Hoc Tests

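To perform post-hoc tests, we can use the pairwise.t.test function with a correction for multiple comparisons (e.g., Bonferroni). The function takes the response vector and the grouping factor from the long-format data, and paired = TRUE is needed because the same subjects contribute to every time point.

# Perform paired pairwise t-tests with Bonferroni correction
# (rows must be in the same subject order within each time level,
# which is the case after pivot_longer)
pairwise.t.test(data_long$score, data_long$time,
                paired = TRUE, p.adjust.method = "bonferroni")

This will provide pairwise comparisons between all three time points, adjusted for multiple comparisons.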
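
To make the adjustment concrete, the sketch below reproduces it by hand in base R: it runs one paired t-test per pair of time points and then applies p.adjust. The wide-format data frame and the names pairs, raw_p, and adj_p are assumptions made for this illustration.

```r
# Simulated wide-format data (illustrative values): one row per subject
set.seed(123)
wide <- data.frame(
  time1 = rnorm(20, mean = 10, sd = 2),
  time2 = rnorm(20, mean = 12, sd = 2),
  time3 = rnorm(20, mean = 15, sd = 2)
)

# All pairs of time points
pairs <- combn(names(wide), 2, simplify = FALSE)

# One paired t-test per pair, keeping only the p-value
raw_p <- sapply(pairs, function(p) {
  t.test(wide[[p[1]]], wide[[p[2]]], paired = TRUE)$p.value
})
names(raw_p) <- sapply(pairs, paste, collapse = " vs ")

# Bonferroni: multiply each p-value by the number of comparisons, capped at 1
adj_p <- p.adjust(raw_p, method = "bonferroni")
print(cbind(raw = raw_p, bonferroni = adj_p))
```

Bonferroni is the most conservative of the common adjustments; passing method = "holm" to p.adjust is a less conservative alternative that still controls the family-wise error rate.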

Handling Violations of Sphericity

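If Mauchly's test of sphericity indicates a violation (p < 0.05), the degrees of freedom need to be adjusted. The ezANOVA output includes a sphericity corrections table with the Greenhouse-Geisser and Huynh-Feldt epsilon estimates (GGe, HFe) and the corresponding corrected p-values (p[GG], p[HF]). Both corrections multiply the degrees of freedom by an epsilon of at most 1, so the corrected p-values are never smaller than the uncorrected one. A common rule of thumb is to use the Greenhouse-Geisser correction when epsilon is below about 0.75 and the less conservative Huynh-Feldt correction otherwise.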
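
To show what the correction does, the following sketch computes the Greenhouse-Geisser epsilon by hand from the covariance matrix of the repeated measures and applies it to the degrees of freedom. It assumes wide-format data with one column per time point; the object names are illustrative. The estimate should correspond to the GGe value that ezANOVA reports for the same data, up to rounding.

```r
# Simulated wide-format data (illustrative values): one row per subject
set.seed(123)
wide <- data.frame(
  time1 = rnorm(20, mean = 10, sd = 2),
  time2 = rnorm(20, mean = 12, sd = 2),
  time3 = rnorm(20, mean = 15, sd = 2)
)

n <- nrow(wide)   # number of subjects
k <- ncol(wide)   # number of levels of the within-subject factor

# Double-centre the covariance matrix of the repeated measures
S   <- cov(wide)
C   <- diag(k) - matrix(1 / k, nrow = k, ncol = k)
S_c <- C %*% S %*% C

# Greenhouse-Geisser epsilon: 1 under perfect sphericity,
# 1 / (k - 1) in the worst case
gg_epsilon <- sum(diag(S_c))^2 / ((k - 1) * sum(S_c^2))

# Corrected degrees of freedom for the F-test of the within-subject effect
df_num <- gg_epsilon * (k - 1)
df_den <- gg_epsilon * (k - 1) * (n - 1)

print(c(epsilon = gg_epsilon, df_num = df_num, df_den = df_den))
```

The F-statistic itself is unchanged; only the reference distribution used for the p-value shifts to the smaller degrees of freedom.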

More Complex Designs: Multiple Within-Subject Factors

Repeated measures ANOVA can easily handle more complex designs with multiple within-subject factors. For instance, you might have a study design involving both time and treatment as within-subject factors. The ezANOVA function can accommodate this with minimal changes:

# Simulate data with two within-subject factors
set.seed(123)
data2 <- data.frame(
  subject = factor(rep(1:20, each = 4)),
  treatment = factor(rep(rep(c("A", "B"), each = 2), 20)),
  time = factor(rep(c("time1", "time2"), 40)),
  score = rnorm(80, mean = 10, sd = 2) # no true effects are simulated; adjust mean and sd to add some
)

model2 <- ezANOVA(
  data = data2,
  dv = score,
  wid = subject,
  within = .(treatment, time) # .() is the form the ez documentation uses for multiple factors
)

print(model2)

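This will test the main effects of treatment and time, as well as their interaction. Because each factor has only two levels here, sphericity is not at issue and no sphericity test or correction applies.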
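
For comparison, a two-factor within-subject design can also be fitted with base R's aov. The sketch below is an illustration under assumed values (the design data frame is built with expand.grid and has no true effects). The Error term nests both factors and their interaction within subject, so each effect is tested against its own subject-by-effect residual.

```r
# Fully crossed design: every subject sees every treatment-by-time cell
set.seed(42)
design <- expand.grid(
  time      = c("time1", "time2"),
  treatment = c("A", "B"),
  subject   = 1:20
)
design$subject <- factor(design$subject)
design$score   <- rnorm(nrow(design), mean = 10, sd = 2)  # no true effects

# Each within-subject effect gets its own error stratum
fit2 <- aov(score ~ treatment * time + Error(subject / (treatment * time)),
            data = design)
fit2_summary <- summary(fit2)
print(fit2_summary)

# Cell means help interpret any interaction
cell_means <- with(design, tapply(score, list(treatment, time), mean))
print(cell_means)
```

The summary lists four error strata: subject, subject:treatment, subject:time, and subject:treatment:time. The last three carry the tests of the treatment effect, the time effect, and the interaction, respectively.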

Beyond ANOVA: Mixed-Effects Models

For more complex designs or when assumptions are severely violated, mixed-effects models are a powerful alternative. Mixed-effects models offer greater flexibility and can handle missing data more gracefully. The lme4 package in R provides tools for fitting linear mixed-effects models:

library(lme4)

model_mixed <- lmer(score ~ time + (1|subject), data = data_long)
summary(model_mixed)

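This fits a mixed-effects model with time as a fixed effect and subject as a random effect. The (1|subject) term gives each subject its own random intercept, which accounts for the correlation between repeated measures from the same subject by assuming that correlation is the same for every pair of time points. Note that summary() for lme4 models reports estimates and t-values but no p-values; the lmerTest package can add them.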
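
The point about missing data can be illustrated with a short sketch. It assumes the lme4 package is installed, and it uses its own simulated data with subject-specific baselines (the names long_missing, m0, and m1 and the choice of which observations to blank out are arbitrary). Three scores are set to NA: the model drops only those rows rather than the affected subjects, and a likelihood-ratio test still compares models with and without the time effect.

```r
library(lme4)

# Simulated long-format data with subject-specific baselines (illustrative)
set.seed(123)
baseline <- rnorm(20, mean = 0, sd = 2)
long_missing <- data.frame(
  subject = factor(rep(1:20, times = 3)),
  time    = factor(rep(c("time1", "time2", "time3"), each = 20)),
  score   = rep(baseline, times = 3) +
    c(rnorm(20, mean = 10, sd = 2),
      rnorm(20, mean = 12, sd = 2),
      rnorm(20, mean = 15, sd = 2))
)

# Remove three observations from three different subjects
long_missing$score[c(5, 27, 44)] <- NA

# Fit with maximum likelihood so the two models can be compared directly
m1 <- lmer(score ~ time + (1 | subject), data = long_missing, REML = FALSE)
m0 <- lmer(score ~ 1 + (1 | subject), data = long_missing, REML = FALSE)

print(nobs(m1))       # observations actually used in the fit
print(ngrps(m1))      # subjects retained
print(anova(m0, m1))  # likelihood-ratio test of the time effect
```

A classical repeated measures ANOVA on the same data would typically need complete cases, which means discarding every subject with a missing score.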

Conclusion

Repeated measures ANOVA is a valuable tool for analyzing data in which the same subjects are measured more than once. R, with packages like ez and lme4, offers efficient and versatile methods for performing these analyses. Check the assumptions before trusting the F-test, apply a sphericity correction when Mauchly's test calls for one, and consider mixed-effects models when assumptions are severely violated or data are missing. Interpret the results in the context of your research question and design, and report effect sizes and confidence intervals alongside p-values.
