Horned lizards have a problem. They get eaten by birds. BUT they are horned for a reason. Those horns might well protect them from being eaten. We have a data set of lizards with their squamosal horn length. Lizards were either sampled live from the wild (Survive = 1), or from the corpses of those killed by birds (Survive = 0). Load up the data (noting that you’ll have to deal with a non-standard `na.strings`) and, using `ggplot2`, plot a histogram of the horn length of living and dead lizards.
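Here is a minimal sketch of the load-and-plot step. The real file name and its missing-value code are not given here, so the sketch reads a tiny made-up table from a string instead. The column name `horn_length` and the `"."` missing-value marker are assumptions; only `Survive` comes from the description above.

```r
library(ggplot2)

# Hypothetical stand-in for the real file: the column name horn_length,
# the "." missing-value code, and the values are all made up.
raw_text <- "horn_length,Survive
25.2,1
.,1
21.9,0
23.4,1
19.8,0
.,0"

# na.strings tells read.csv which strings should be read as NA
lizards <- read.csv(text = raw_text, na.strings = ".")

# Drop rows with a missing horn length before plotting
lizards <- lizards[!is.na(lizards$horn_length), ]

# One histogram panel per survival class
horn_plot <- ggplot(lizards, aes(x = horn_length)) +
  geom_histogram(bins = 10) +
  facet_wrap(~Survive, ncol = 1)
horn_plot
```

With the real file, you would swap `text = raw_text` for the file path and set `na.strings` to whatever code the file actually uses.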
Let’s try the simplest two-sample unpaired t-test with this data. But to meet its requirements, we’ll have to make one change first…
In its simplest form, a two-sample unpaired t-test uses sample sizes that are the same for both treatments. Using `dplyr`, make a data set that has 30 entries for each survival class. You’ll need to use both `group_by` and either `slice` or `sample_n`, depending on your approach.
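One way the balancing step might look, sketched on simulated data rather than the real lizards. The `horn_length` column name and the group sizes of 150 and 35 are assumptions for illustration.

```r
library(dplyr)

set.seed(42)  # make the random draw reproducible

# Simulated stand-in data with unequal group sizes; horn_length is a
# hypothetical column name and the values are not real measurements.
lizards <- data.frame(
  Survive = rep(c(1, 0), times = c(150, 35)),
  horn_length = rnorm(185, mean = 22, sd = 2.5)
)

# Draw 30 random rows from each survival class
balanced <- lizards %>%
  group_by(Survive) %>%
  sample_n(30) %>%
  ungroup()

# Check the group sizes
count(balanced, Survive)
```

Replacing `sample_n(30)` with `slice(1:30)` would take the first 30 rows of each group instead of a random draw, which is simpler but depends on how the file happens to be ordered.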
With this new data set, use `t.test` to run a t-test on the data. Note that the default values for the `t.test` function do not do a simple unpaired t-test with equal variances. You’ll have to look at the help file to make sure you set the arguments properly. What are the results?
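As a rough sketch of the call, again on simulated data with a hypothetical `horn_length` column and arbitrarily chosen group means: `t.test` defaults to `var.equal = FALSE`, which gives the Welch test, so the equal-variance version has to be requested explicitly.

```r
set.seed(1)

# Simulated balanced data: 30 rows per class. The column name horn_length
# and the means and sd used here are assumptions, not real measurements.
balanced <- data.frame(
  Survive = rep(c(0, 1), each = 30),
  horn_length = c(rnorm(30, mean = 22, sd = 2.5),
                  rnorm(30, mean = 24, sd = 2.5))
)

# var.equal = TRUE pools the variances (classic Student's t-test);
# paired is left at its default of FALSE, so the test is unpaired.
pooled_test <- t.test(horn_length ~ Survive, data = balanced,
                      var.equal = TRUE)
pooled_test
```

With pooled variances the degrees of freedom are n1 + n2 - 2, so 58 for two groups of 30, which is a quick way to confirm that the arguments took effect.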
OK, now that you have an answer, let’s make sure it’s the right one. Evaluate the normality of the residuals of this t-test, and make sure the residuals for each group are normally distributed and centered on 0. Do this by visualizing histograms of residuals overall and by treatment. `dplyr` might help you here.
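In a two-sample t-test the fitted value for each observation is just its group mean, so one way to get residuals is to subtract each group's mean from its own observations. The sketch below does that on simulated data; the `horn_length` column name is an assumption.

```r
library(dplyr)
library(ggplot2)

set.seed(7)

# Simulated balanced data; horn_length is a hypothetical column name
# and the values are not real measurements.
balanced <- data.frame(
  Survive = rep(c(0, 1), each = 30),
  horn_length = c(rnorm(30, mean = 22, sd = 2.5),
                  rnorm(30, mean = 24, sd = 2.5))
)

# Residual = observation minus the mean of its own group
resids <- balanced %>%
  group_by(Survive) %>%
  mutate(residual = horn_length - mean(horn_length)) %>%
  ungroup()

# Histogram of all residuals together
overall_plot <- ggplot(resids, aes(x = residual)) +
  geom_histogram(bins = 15)

# Same histogram, split into one panel per survival class
by_group_plot <- overall_plot + facet_wrap(~Survive)

overall_plot
by_group_plot
```

Residuals built this way are centered on 0 within each group by construction, so the thing to look at in the plots is the shape: skew or long tails suggest a transformation is worth trying.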
Is this test OK? Do you believe it? If not, what do you need to do to the data to meet the assumptions of the t-test?
OK, we actually have a lot more data for surviving than dead horned lizards. How do the results of your t-test differ if you use all of the data, again assuming that each population has the same variance? Apply any transformation to the data you feel appropriate given your tests of assumptions.
Now, how do the results differ if you DON’T assume equal variances in addition to using unequal sample sizes? Apply any transformation to the data you feel appropriate given your tests of assumptions.
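To see what changes between the two versions, it can help to run them side by side. This sketch uses simulated data with unequal group sizes and unequal spreads; the `horn_length` column name, the group sizes, and the means and standard deviations are all assumptions, and no transformation is applied.

```r
set.seed(3)

# Simulated unbalanced data: 150 survivors, 30 killed. horn_length is a
# hypothetical column name and the values are not real measurements.
lizards <- data.frame(
  Survive = rep(c(1, 0), times = c(150, 30)),
  horn_length = c(rnorm(150, mean = 24, sd = 2.5),
                  rnorm(30, mean = 22, sd = 3.5))
)

# Pooled-variance (Student's) t-test
pooled <- t.test(horn_length ~ Survive, data = lizards, var.equal = TRUE)

# Welch's t-test: separate variances, adjusted degrees of freedom
welch <- t.test(horn_length ~ Survive, data = lizards, var.equal = FALSE)

# Compare degrees of freedom and p-values
c(pooled_df = unname(pooled$parameter), welch_df = unname(welch$parameter))
c(pooled_p = pooled$p.value, welch_p = welch$p.value)
```

The Welch version trades the pooled degrees of freedom (n1 + n2 - 2) for a smaller, usually non-integer value, which matters most when the smaller group also has the larger variance.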