Let’s say you were running an experiment testing whether there is a sex difference in blink reaction times to an air puff, and you needed a total sample size of n=10. But instead of collecting a single, independent observation from 10 different subjects, you recorded one male and one female 5 times each.
#Set the values for each of the parameters we'll use
I= 2 # number of groups (subjects, in this case)
J= 5 # number of observations (or trials) per group
diff= 10 # Setting how different the group means should be from each other
sd.group= 4 # Standard deviation of scores between different subjects
sd.observation= 1.5 # Standard deviation of scores between trials of the same individual
#Create each variable
subject.ID = rep( c(1:I), each=J) #each subject's ID repeated once per trial
trial.ID = rep( sequence(J), times=I)
sex= rep( c(0,1), each=J)
#Generate the mean for each group
subject.1= rep( rnorm( n= 0.5*I, mean= 40, sd= sd.group), times= J) #Female mean
subject.2= rep( rnorm (n= 0.5*I, mean= 40+diff, sd= sd.group), times= J) #Male mean
subject.1
## [1] 43.80214 43.80214 43.80214 43.80214 43.80214
subject.2
## [1] 45.83761 45.83761 45.83761 45.83761 45.83761
#Use the group means we just generated to sample again. This will generate the y values (eyeblink response times).
y.blink= rnorm(n= J*I, mean= c(subject.1, subject.2), sd= sd.observation)
y.blink
## [1] 43.59599 42.42552 44.59928 43.51229 42.18575 45.52974 45.36387
## [8] 45.84809 47.82000 43.21430
#Combine into a single dataframe
blink.data <- as.data.frame( cbind( subject.ID, trial.ID, sex, y.blink), row.names = NULL)
For consistency, I’m going to use a dataset I already have. But these data were simulated using the code above.
blink.data <- read.csv(file = "~/Documents/Stats/Pseudoreplication/eyeblink_sim.csv")
blink.data
## subject.ID trial.ID sex y.blink
## 1 1 1 0 44.05611
## 2 1 2 0 42.25860
## 3 1 3 0 41.04987
## 4 1 4 0 45.07797
## 5 1 5 0 44.80733
## 6 2 1 1 59.30283
## 7 2 2 1 57.63582
## 8 2 3 1 58.51698
## 9 2 4 1 60.00853
## 10 2 5 1 56.72297
We can plot these data.
# Plot the data
plot(y.blink ~ jitter(sex, 0.05), data= blink.data,
xlab= "Sex", ylab= "Reaction time (ms)",
cex= 1.5, cex.lab= 1.5, cex.axis= 1.75, #Text and circle size
xlim= c(-0.5,1.5), ylim= c(min(blink.data$y.blink) -5, max(blink.data$y.blink) +5), #Range of axes
xaxt= "n") #Removes default x-axis text
axis(side= 1, at= 0:1, labels= c("Female", "Male")) #Replaces it with custom labels
If we ran a linear regression on these data, treating each male and female data point as independent, we would be committing a very egregious case of pseudoreplication. Any p-value that comes out of such an analysis is unreliable, because the model is treating the data as if N=10.
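The naive model would look something like the sketch below (the object name naive.blink is just for illustration); it regresses y.blink on sex and ignores subject.ID entirely:
#Naive model: treats all 10 trials as 10 independent observations (pseudoreplication)
naive.blink <- lm(y.blink ~ sex, data= blink.data)
summary(naive.blink)$coefficients #look at the row for sex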
Note that according to this model’s p-value, we would see a significant effect of sex on y.blink times. But what we really have is a much smaller effective number of independent observations, and if we correctly acknowledged that, our test statistic and our p-value would look very different.
From this example it’s obvious that we have only two independent subjects, but we can confirm this mathematically by calculating what’s called the intraclass correlation coefficient (ICC). The ICC tells us how similar, or correlated, observations from the same group are. To calculate it, we compute the proportion of the total variance that is due to group membership.
mean.1 <- rep(mean(blink.data$y.blink[1:5]), times=J)
mean.2 <- rep(mean(blink.data$y.blink[6:10]), times=J)
group.mean <- blink.data$group.mean <- c(mean.1,mean.2) #group means: one for the female subject, one for the male subject
group.mean
## [1] 43.44998 43.44998 43.44998 43.44998 43.44998 58.43743 58.43743
## [8] 58.43743 58.43743 58.43743
within.group.resid <- blink.data$within.group.resid <- (blink.data$y.blink-group.mean) #deviation between each observation and its group mean
blink.data
## subject.ID trial.ID sex y.blink group.mean within.group.resid
## 1 1 1 0 44.05611 43.44998 0.60613486
## 2 1 2 0 42.25860 43.44998 -1.19137501
## 3 1 3 0 41.04987 43.44998 -2.40010968
## 4 1 4 0 45.07797 43.44998 1.62799653
## 5 1 5 0 44.80733 43.44998 1.35735331
## 6 2 1 1 59.30283 58.43743 0.86540135
## 7 2 2 1 57.63582 58.43743 -0.80160192
## 8 2 3 1 58.51698 58.43743 0.07955104
## 9 2 4 1 60.00853 58.43743 1.57110267
## 10 2 5 1 56.72297 58.43743 -1.71445313
(between.group.resid <- (blink.data$group.mean - mean(blink.data$y.blink)))
## [1] -7.493725 -7.493725 -7.493725 -7.493725 -7.493725 7.493725 7.493725
## [8] 7.493725 7.493725 7.493725
between.group.resid <- between.group.resid[c(1,6)] #The "[c(1,6)]" part subsets just the 1st and 6th terms in the list, so that we only get one difference value for each group mean.
#sum of squares
within.group.sumsquares <- sum(within.group.resid^2)
within.group.sumsquares
## [1] 18.84561
between.group.sumsquares <- sum(between.group.resid^2)
between.group.sumsquares
## [1] 112.3118
within.group.var <- within.group.sumsquares / ((J*I)-1) #within-group variance; here n is the total number of observations, so the denominator is n-1
within.group.var
## [1] 2.093956
between.group.var <- between.group.sumsquares / (I-1) #Here, the n is the number of groups
between.group.var
## [1] 112.3118
ICC.blink = between.group.var / (between.group.var + within.group.var)
ICC.blink
## [1] 0.9816971
Compare this answer to the result from the ICCest function in the ICC package (which uses a slightly different method of calculating the ICC, but gives a close enough approximation).
library(ICC)
ICCest(x=as.factor(subject.ID), y=y.blink, data=blink.data, alpha=0.05)
## $ICC
## [1] 0.9793715
##
## $LowerCI
## [1] 0.8591027
##
## $UpperCI
## [1] 0.9999781
##
## $N
## [1] 2
##
## $k
## [1] 5
##
## $varw
## [1] 2.355701
##
## $vara
## [1] 111.8407
Now that we know the ICC, we can figure out our effective (or corrected) sample size. Plugging the ICC into the design effect equation, design effect = 1 + (n - 1) * ICC (where n is the number of observations per group), tells us the factor by which our initial N should be adjusted.
design_effect = 1 + (J-1) * ICC.blink # The equation is 1 + (n-1) * ICC. Must be calculated per group, so n is per group
design_effect
## [1] 4.926788
N_effective <- (J*I)/design_effect #total number of observations divided by the design effect
N_effective
## [1] 2.02972
Next we will see why pseudoreplication is problematic by doing a simulation. The point of a simulation is to generate data with known parameters (we set the mean, variance, etc.), hand those data to our statistical model, and see how good a job the model does of giving us back the right answer. Let’s walk through the framework of a simulation, setting aside pseudoreplication for a moment.
We will simulate a scenario where there is no true treatment effect between two groups. To make this example more concrete, let’s say this happens when we mistakenly give both our control group and our experimental group saline, instead of giving the experimental group the drug treatment.
Since there is no treatment effect, to simulate the data for each group, we will sample N=10 twice from the same population. Let’s create a function that will independently sample two groups from the same population, run a statistical test between the two groups, and return the p-value. Again, in this first example we are sampling independently, so there is no pseudoreplication going on.
I= 2 #number of groups
J= 10 #number of observations/group
sd.observation = 1 # standard deviation of observations
animal.ID= sequence(rep(J, times= I))
treat.ID= rep(c(0,1), each= J) # i.e. Pink vs. Blue, or alternatively saline vs. the treatment we think we gave
y = rnorm(n= J*I, mean= 0, sd= sd.observation) # This simulates the data (y-values)
#Combines into a dataframe
data <- as.data.frame( cbind(animal.ID, treat.ID, y))
data
## animal.ID treat.ID y
## 1 1 0 -0.53726389
## 2 2 0 1.69756758
## 3 3 0 -1.10468293
## 4 4 0 -1.13749795
## 5 5 0 0.73370918
## 6 6 0 -0.78506314
## 7 7 0 -2.14458702
## 8 8 0 -0.38639560
## 9 9 0 -1.49755488
## 10 10 0 2.31776871
## 11 1 1 -0.03509527
## 12 2 1 0.88896966
## 13 3 1 -1.06351420
## 14 4 1 -1.49102429
## 15 5 1 0.86401135
## 16 6 1 -0.07455637
## 17 7 1 -0.02307524
## 18 8 1 0.80287129
## 19 9 1 0.40933857
## 20 10 1 0.30849836
We can run a statistical test on these data to see whether the test (erroneously) detects a difference between the two treatment groups (remember that in this example we don’t realize we’ve given both of our treatment groups saline). We check the p-value.
p<- summary(lm( y ~ treat.ID, data= data))$coefficient[8] # Run linear model, but only output the p-value
p
## [1] 0.5163407
This p-value is the outcome of a single simulated study. We want to repeat these steps 1000 times, so let’s package all of the above steps into a single function to make the process more efficient.
sample.independently <- function(x){
animal.ID= sequence(rep(J, times= I))
treat.ID= rep(c(0,1), each= 0.5*length(animal.ID))
y = rnorm(n= J*I, mean= 0, sd= sd.observation)
data <- as.data.frame(cbind(animal.ID, treat.ID, y))
p<- summary(lm( y ~ treat.ID, data= data))$coefficient[8]
return(p)
}
Every time we run this function, the final output is a single p-value from a single simulated study. Since there is no true difference between the two groups we’re sampling from, we should get a non-significant p-value in about 95% of our simulations. But this also means that about 5% of the time (our alpha=0.05), a simulated study will produce a “false positive” by chance alone: a significant p-value indicating a difference between the two groups even though we know there is no real difference.
Let’s run this function 1000 times, simulating 1000 of these studies, and save the p-value from each study in a new dataframe.
sim.results<- do.call(rbind, lapply(1:1000, function(x) sample.independently())) #dataframe of 1000 pvalues
colnames(sim.results) <- "P-value"
head(sim.results)
## P-value
## [1,] 0.7702979
## [2,] 0.6540471
## [3,] 0.9457944
## [4,] 0.3836666
## [5,] 0.3882660
## [6,] 0.7087659
Using all 1000 p-values, we can see how many of them ended up being “significant” with values of less than 0.05. Let’s look at this with a histogram:
hist(sim.results, 20, main="1000 P-values in studies with independent sampling", xlab="P-values", ylim=c(0,300))
Good! Here we see that “significant” p-values less than 0.05 (the leftmost histogram bar) occur only about 50 times out of 1000 studies. That’s 5% of the time, exactly the false-positive rate we’d expect. This shows that when we sample independently, our statistical model behaves the way we expect it to.
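We can also compute this proportion directly rather than reading it off the histogram. A one-line sketch: taking the mean of the logical comparison gives the fraction of p-values below 0.05.
mean(sim.results < 0.05) #fraction of the 1000 simulated studies with p < 0.05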
Now let’s bring pseudoreplication back into the story. If we run another simulation, but this time have our observations be clustered in some way (i.e. introduce pseudoreplication), while continuing to have no real difference between the means of the two treatment groups, we will get a very different number of “significant” p-values.
We will use a similar process as before, but this time we will introduce some non-independence among the observations in the blue and pink groups.
I=4 # 2 clusters x 2 "treatment" groups
J=5 # number of observations per cluster
sd.observation=1 # same as before
sd.group= 0.7 #This introduces the pseudoreplication (between-group σ)
sim.diff=0 #We make explicit here that there will be no difference between treatment groups
group.ID= rep( c(1:I), each= J) #4 pseudoreplicated groups
animal.ID= sequence( rep( J, times= I))
treat.ID= rep( c(0,1), each= I*J*0.5) #Assumes balanced design
(mean.0= rep( rnorm( n= 0.5*I, mean= 0, sd= sd.group), each= J)) # sd.group will specify the clustering
## [1] 0.2753621 0.2753621 0.2753621 0.2753621 0.2753621 -0.1435665
## [7] -0.1435665 -0.1435665 -0.1435665 -0.1435665
(mean.1= rep( rnorm( n= 0.5*I, mean= sim.diff, sd= sd.group), each= J)) # Note population means are the same
## [1] -0.5568677 -0.5568677 -0.5568677 -0.5568677 -0.5568677 -0.6511525
## [7] -0.6511525 -0.6511525 -0.6511525 -0.6511525
y=rnorm(n= J*I, mean= c(mean.0, mean.1), sd= sd.observation) #Creates y-values, sampling from means in previous step. See dataframe
#Combine into dataframe
data <- as.data.frame(cbind(group.ID, animal.ID, treat.ID, y))
data
## group.ID animal.ID treat.ID y
## 1 1 1 0 3.46989962
## 2 1 2 0 -0.98351120
## 3 1 3 0 0.90586512
## 4 1 4 0 0.49729416
## 5 1 5 0 -0.34850503
## 6 2 1 0 0.14269855
## 7 2 2 0 -2.48370999
## 8 2 3 0 0.23941201
## 9 2 4 0 -0.71198957
## 10 2 5 0 0.65969464
## 11 3 1 1 -0.45891437
## 12 3 2 1 0.04882456
## 13 3 3 1 -1.62790685
## 14 3 4 1 -0.18586058
## 15 3 5 1 -1.83085959
## 16 4 1 1 -0.44113010
## 17 4 2 1 -0.69872667
## 18 4 3 1 -1.73597722
## 19 4 4 1 -0.27176441
## 20 4 5 1 -0.55308577
#Run statistical model, get back only the p-value
p<- summary(lm(y~treat.ID, data=data))$coefficient[8]
As before, let’s package all of these steps into a single function so we can simulate many studies:
pseud <- function(x){
group.ID= rep( c(1:I), each= J)
animal.ID= sequence( rep( J, times= I))
treat.ID= rep( c(0,1), each= I*J*0.5)
mean.0= rep( rnorm( n= 0.5*I, mean= 0, sd= sd.group), each= J) #each= J so every cluster keeps its own mean, matching group.ID
mean.1= rep( rnorm( n= 0.5*I, mean= sim.diff, sd= sd.group), each= J)
y=rnorm(n= J*I, mean= c(mean.0, mean.1), sd= sd.observation)
data <- as.data.frame(cbind(group.ID, animal.ID, treat.ID, y), row.names = NULL)
p<- summary(lm(y~treat.ID, data=data))$coefficient[8]
return(p)
}
Now let’s see what the histogram of p-values looks like when we run that function 1000 times:
sim.results<- do.call( rbind, lapply(1:1000, function(x) pseud(J))) #dataframe of 1000 pvalues
hist( sim.results, 20, main="1000 P-values in studies with Pseudoreplication", xlab="P-values", ylim=c(0,300))
We get many more false-positive p-values in the scenario where we introduce pseudoreplication. This is bad! We know there is no true population difference between the blue and pink treatment groups, but the pseudoreplication tricks our statistical model into reporting significant differences back to us. Instead of a false-positive rate of 5%, we now have a false-positive rate of 22.4%.
proportion <- function(x){
pr <- length(x[x<0.05])/1000 #1000--> number of rows
return(pr)
}
false.positive.rate <- apply(sim.results, MARGIN=2, FUN=proportion) #Take the matrix of p-values and apply the proportion function to its column (MARGIN=2)
false.positive.rate *100
## [1] 22.4
If we increase the between-group variance while keeping everything else constant, the false-positive rate gets even worse. For example, when sd.group is increased to 0.9, the false-positive rate becomes 31.4%.
Alternatively, if we change sd.group to 0, we eliminate any clustering (any similarity among observations in the same group), and we are back in the situation where every sample is independent. Let’s illustrate this as well. Our false-positive rate should go back to being right around the expected 5%.
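The code for this run works just like the pseudoreplication simulation above; a minimal sketch (re-using the pseud() function with sd.group set to 0; the object name sim.results.indep is just for illustration) might look like this:
sd.group= 0 #no between-group variance, so no clustering
sim.results.indep <- do.call( rbind, lapply(1:1000, function(x) pseud(J)))
noquote( paste("False-positive rate (%) =", 100 * mean(sim.results.indep < 0.05)))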
## [1] False-positive rate (%) = 5.6
In some cases, pseudoreplication cannot be avoided through experimental design. An example is the valproic acid (VPA) model of autism, in which VPA is given to pregnant dams but the effects are measured in their offspring.
Let’s imagine a scenario in which we have 6 litters that have come from control dams, and 6 litters that have come from dams treated with VPA. The litters have a varying number of pups.
Below I show how you can simulate this data, but for consistency I will again read in a dataset that I’ve already simulated.
#To simulate data:
I= 12 #total number of litters
J= rep(c(2,2,4,5,6,7), 2) #number of individuals/litter
diff= 1 #treatment effect
sd.group= 1 #std deviation between litters
sd.observation= 0.5 #std deviation within litter
litter.ID= rep(c(1:I), times=J)
animal.ID= sequence(J)
treat.ID= rep(c(0,1), each=0.5*length(animal.ID)) #each group with parallel litter sizes
mean.0= rep( rnorm(n= 0.5*I, mean= 0, sd= sd.group), times= J[1:(0.5*I)]) #one mean per control litter, each repeated litter-size times
mean.1= rep( rnorm(n= 0.5*I, mean= diff, sd= sd.group), times= J[(0.5*I + 1):I]) # we ARE simulating a treatment effect
y= rnorm (n= sum(J), mean= c(mean.0, mean.1), sd= sd.observation) #sample
#Create Dataframe
data <- as.data.frame(cbind(litter.ID, animal.ID, treat.ID, y))
#Read in my data:
data <- read.csv(file="VPA1.csv")
str(data)
## 'data.frame': 52 obs. of 4 variables:
## $ litter.ID: int 1 1 2 2 3 3 3 3 4 4 ...
## $ animal.ID: int 1 2 1 2 1 2 3 4 1 2 ...
## $ treat.ID : int 0 0 0 0 0 0 0 0 0 0 ...
## $ y : num 0.19 0.705 -1.733 -1.513 -0.938 ...
head(data, 10) #Show first 10 observations
## litter.ID animal.ID treat.ID y
## 1 1 1 0 0.1904203
## 2 1 2 0 0.7052822
## 3 2 1 0 -1.7328777
## 4 2 2 0 -1.5128727
## 5 3 1 0 -0.9384431
## 6 3 2 0 -2.0351094
## 7 3 3 0 -2.0317466
## 8 3 4 0 -1.7764983
## 9 4 1 0 0.1225263
## 10 4 2 0 1.2790136
We have 52 observations (or pups in this case), with 26 in the control group, and 26 in the treatment group. Let’s check the ICC of this data set.
ICCest(x= as.factor(litter.ID), y= y, data= data)
## $ICC
## [1] 0.784802
##
## $LowerCI
## [1] 0.5881513
##
## $UpperCI
## [1] 0.9209382
##
## $N
## [1] 12
##
## $k
## [1] 4.258741
##
## $varw
## [1] 0.2239834
##
## $vara
## [1] 0.8168414
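Just as in the eyeblink example, we can plug this ICC into the design effect equation to get a rough sense of the effective sample size. A small sketch (it re-runs the ICCest call above and uses the average cluster size k it reports; the object names are just for illustration):
vpa.icc <- ICCest(x= as.factor(litter.ID), y= y, data= data) #same call as above
vpa.design.effect <- 1 + (vpa.icc$k - 1) * vpa.icc$ICC
nrow(data) / vpa.design.effect #effective N, much smaller than the 52 pups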
The ICC is pretty high. The wrong way to analyze these data would be to run a regular linear model (as below), treating all 52 observations as independent. This would be pseudoreplication.
#Plot the data
plot(data$litter.ID, data$y,
xlab="Litter ID, N=52", ylab="Mean", main= "Control vs Treated",
ylim=range(c(data$y-1), c(data$y+1)),
cex=1.5, cex.lab=1.5, cex.axis=1.5, xaxt="none") #formats axes
axis(1, seq(from = 1, to = 12, by = 1), cex.axis=1.5) #Adds custom x-axis labels
abline(v=6.5) #Adds divider line to plot
#Run the linear model
results.wrong.way <- lm(y~treat.ID, data=data)
summary(results.wrong.way)
##
## Call:
## lm(formula = y ~ treat.ID, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.9738 -0.4019 0.1424 0.4880 2.1359
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.06129 0.16738 -0.366 0.716
## treat.ID 1.01020 0.23672 4.268 8.79e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8535 on 50 degrees of freedom
## Multiple R-squared: 0.267, Adjusted R-squared: 0.2523
## F-statistic: 18.21 on 1 and 50 DF, p-value: 8.794e-05
We get a pretty low p-value, indicating a strong effect. This p-value is not correct, because observations from the same litter are not independent.
Instead, it would be better to use the average of each litter, using the litter as the unit of analysis. Taking the average per group is the easiest way to correct for any pseudoreplication. In our case this will reduce our sample to an N=12. Let’s calculate the average and place it in a new dataframe.
new<- aggregate(y ~ litter.ID, data= data, FUN= mean) #Computes the average value of y, grouped by litter
#Let's also compute the standard error of the mean, and add it as a column as well.
SEM <- function(x){
sd(x)/sqrt(length(x))
}
litter.SEM <- aggregate( y ~ litter.ID, data= data, FUN= SEM) #Computes the SEM per litter (a new name, so we don't overwrite the SEM function)
new <- merge(new, litter.SEM, by="litter.ID") #Adds it to the dataframe
new$treat.ID <- rep(c(0,1), each=I*0.5) #add column for treatment ID
colnames(new) <- c("litter.ID", "litter.avg", "SEM", "treat.ID") #Adds column names
new
## litter.ID litter.avg SEM treat.ID
## 1 1 0.44785128 0.25743095 0
## 2 2 -1.62287521 0.11000248 0
## 3 3 -1.69544935 0.25950146 0
## 4 4 0.44631236 0.23765140 0
## 5 5 -0.26525749 0.20157541 0
## 6 6 0.98545292 0.20351607 0
## 7 7 0.86661073 0.20298345 1
## 8 8 0.04961348 0.02149549 1
## 9 9 0.83713233 0.27125806 1
## 10 10 1.22128520 0.08868026 1
## 11 11 1.09066291 0.22386374 1
## 12 12 0.97716489 0.17452916 1
Let’s plot the averages and run the linear regression model.
plot(new$litter.ID, new$litter.avg,
xlab="Litter ID, N=12", ylab="Mean", main= "Control vs Treated",
ylim=range(c(data$y-1), c(data$y+1)),
pch=20, cex=1.5, cex.lab=1.5, cex.axis=1.5, xaxt="none") #formats axes
axis(1, seq(1,12,1), cex.axis=1.5) #Adds custom x-axis
abline(v=6.5)
results.better.way <- lm(litter.avg~litter.ID, data=new) #The dependent variable is now the set of 12 litter averages, not the 52 individual (non-independent) observations.
summary(results.better.way)
##
## Call:
## lm(formula = litter.avg ~ litter.ID, data = new)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.31966 -0.37379 0.03169 0.53004 1.19736
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.93637 0.48139 -1.945 0.0804 .
## litter.ID 0.18686 0.06541 2.857 0.0171 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7822 on 10 degrees of freedom
## Multiple R-squared: 0.4494, Adjusted R-squared: 0.3943
## F-statistic: 8.161 on 1 and 10 DF, p-value: 0.01705
Note that our p-value, and the effect size that goes with it, are now much more modest than when we ran our statistical model with uncorrected pseudoreplication.
When we distill the observations from each litter down to a single average value, we lose information about the litter size and the variability in that litter. This is the information that the SEM captures, which we’ve already calculated. Let’s plot the averages along with the SEMs below.
plot(new$litter.ID, new$litter.avg,
xlab="Litter ID", ylab="Mean +/- SEM", main= "Control vs Treated",
ylim=range(c(data$y-1), c(data$y+1)),
pch=20, cex=1.5, cex.lab=1.5, cex.axis=1.5, xaxt="none")
axis(1, seq(1,12,1), cex.axis=1.5) #custom x-axis
abline(v=6.5)
arrows(new$litter.ID, new$litter.avg-new$SEM,
new$litter.ID, new$litter.avg+new$SEM,
length=0.05, angle=90, code=3) #attributes of the "arrowhead"
The best way to analyze these data is with a mixed model. The mixed model includes the information about group (e.g. litter) size and variability when it tests for an effect of treatment, without committing pseudoreplication. A mixed model takes the group-level estimates and shrinks them towards the grand mean. Groups with more uncertainty (smaller groups, or ones with more variability) shrink relatively more than groups with less uncertainty.
The specific mechanics of the mixed model are beyond the scope of this tutorial, but we can use the lme function from the nlme package below to fit a linear mixed model and get the best estimate of the VPA treatment effect from our example.
library(nlme)
best <- lme(fixed= y ~ treat.ID, random= ~1 | litter.ID, data= data) #the grouping variable (litter) is entered as the random effect
summary(best)
## Linear mixed-effects model fit by REML
## Data: data
## AIC BIC logLik
## 106.6686 114.3167 -49.33432
##
## Random effects:
## Formula: ~1 | litter.ID
## (Intercept) Residual
## StdDev: 0.809853 0.4736819
##
## Fixed effects: y ~ treat.ID
## Value Std.Error DF t-value p-value
## (Intercept) -0.2669854 0.3464424 40 -0.7706488 0.4454
## treat.ID 1.1197823 0.4899435 10 2.2855335 0.0454
## Correlation:
## (Intr)
## treat.ID -0.707
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -1.8757994 -0.5728163 -0.1409780 0.5607931 2.4225246
##
## Number of Observations: 52
## Number of Groups: 12
Using the mixed model, we can be confident that our estimate and our p-value are accurately reflecting the data because the mixed model has taken into account that observations in the same group will not be completely independent from one another.
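As a quick side-by-side, we can pull the treatment-related p-value out of each of the three analyses (a small sketch using the model objects fit above; the row and column names index the term of interest in each model’s coefficient table):
p.wrong <- summary(results.wrong.way)$coefficients["treat.ID", "Pr(>|t|)"] #all 52 pups treated as independent
p.avg <- summary(results.better.way)$coefficients["litter.ID", "Pr(>|t|)"] #slope term from the litter-average model
p.mixed <- summary(best)$tTable["treat.ID", "p-value"] #mixed model
round( c(pseudoreplicated= p.wrong, litter.averaged= p.avg, mixed.model= p.mixed), 4)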
We can plot the final results:
#First extract "shrunken" mean values from model:
d<- coef(best)
shrunken.litter.means <- c(d$`(Intercept)`[1:6], d$`(Intercept)`[7:12] + d$treat.ID[7:12])
new$shrunken.litter.means <- shrunken.litter.means #adds column of shrunken means to dataframe
ctrl.mean <- mean(shrunken.litter.means[1:6]) #to use for plotting
vpa.mean <- mean(shrunken.litter.means[7:12])
#Using shrunken litter means, final results
plot(new$litter.ID, shrunken.litter.means,
xlab="Litter ID, N=12", ylab= "Shrunken Mean", main= "After Shrinkage",
ylim=range(-2, 1.5),
pch= 20, cex= 1.5, cex.lab= 1.5, cex.axis= 1.5, xaxt= "none")
axis(1, seq(1,12,1), cex.axis= 1.5)
abline(v= 6.5)
segments(x0= 0.75, y0= ctrl.mean, x1= 6.25, y1= ctrl.mean, col= "red", lty= 2, lwd= 2)
segments(x0= 6.75, y0= vpa.mean, x1= 12.25, y1= vpa.mean, col= "red", lty= 2, lwd= 2)