Larry,
> I question, though, whether A-C and B-C differ so little, perhaps more than the .9% and .8% that you suggest.
> Given the substantial differences between A and B in the number of voxels that they activate above the active
> baseline (C), it might well be the case that the difference between these effects is quite a bit larger.
Those numbers were simply made up for the convenience of discussion.
> What I might disagree with is your phrase “artificially dichotomizing the evidence” in the sentence, “furthermore, you
> do see a bigger cluster for effect A than B, relative to C, when artificially dichotomizing the evidence with a
> preset threshold.”
The popularly adopted approach of identifying a spatial cluster based on an overall FWE threshold of 0.05 is arbitrary in the following senses: 1) why is 0.05 so special, but not 0.04 or 0.06? 2) The current correction methods may be rigorous under one particular framework, but their efficiency is debatable if we take a different perspective.

What I'm trying to say is this: you may feel comfortable enough about the strength of statistical evidence for each surviving cluster; however, I would not take seriously the boundary, extent (or breadth), or voxel count of each cluster, because of the arbitrariness involved in the whole process. In other words, it might well be the case that most or even all of the involved regions in your experiment are activated under both conditions A and B relative to C, and you only showed those colored clusters in the attachment because you have to present the results within that "comfort zone" per the current publication filtering system. Artificially dichotomizing the evidence may be convenient for results reporting, but we should not forget this fact: statistical evidence (e.g., a t-statistic) is in essence a continuum.
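To make the arbitrariness concrete, here is a small illustrative sketch (simulated data, hypothetical degrees of freedom; nothing here comes from your experiment): the number of voxels that "survive" shifts smoothly as the p-threshold moves from 0.04 to 0.06, so nothing special happens at 0.05.

```python
# Simulated voxel-wise t-statistics: survival counts vary continuously
# with the chosen p-threshold -- the evidence itself is a continuum.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
df = 20                                           # hypothetical degrees of freedom
t_values = rng.standard_t(df, size=10_000) + 0.5  # simulated voxel t-stats with a small shift

counts = {}
for p in (0.04, 0.05, 0.06):
    t_crit = stats.t.ppf(1 - p / 2, df)           # two-sided cutoff for this p
    counts[p] = int(np.sum(np.abs(t_values) > t_crit))
    print(f"p = {p:.2f}: t_crit = {t_crit:.3f}, surviving voxels = {counts[p]}")
```

The counts creep up as the threshold loosens; no discontinuity singles out 0.05 as the "right" place to draw the boundary of a cluster.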
> I increasingly wonder whether staying in this comfort zone is causing us to miss all sorts of important
> information about what’s happening in our experiments.
I do agree with this assessment of yours, because the current correction methods strike me as overly penalizing.
> breadth of activation may be a more sensitive measure of differences between A and B than differences
> in activation intensity.
Again, the breadth of those clusters should not be taken seriously because of 1) the artificial dichotomization and 2) the arbitrariness involved.
> when we compute the voxel counts for individual participants (measuring activation breadth) and submit
> them to mixed-effects modeling in R (using lmer), we find large significant effects across participants (as
> random effects) ...
> Perhaps the best approach is to pull out voxel counts from individual participants and then test them
> externally in lmer analyses, as just described above
The number of voxels within each cluster has the same problem as the breadth of the cluster.
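For concreteness, here is a sketch (entirely hypothetical data) of the count-extraction step you describe, run at three equally defensible voxel-wise cutoffs: the per-subject counts that would feed the lmer model inherit the arbitrariness of whichever cutoff defines them.

```python
# Sketch with fake data: per-subject voxel counts depend directly on the
# voxel-wise cutoff used to define "activated", so the lmer input is not
# cutoff-free.
import numpy as np

rng = np.random.default_rng(2)
n_subjects, n_voxels = 12, 1000
t_maps = rng.normal(0.7, 1.0, size=(n_subjects, n_voxels))  # fake per-subject t-maps

counts_by_cutoff = {}
for t_cut in (1.9, 2.1, 2.3):                    # three plausible cutoffs
    counts = (np.abs(t_maps) > t_cut).sum(axis=1)  # one voxel count per subject
    counts_by_cutoff[t_cut] = counts
    print(f"cutoff {t_cut}: mean per-subject count = {counts.mean():.1f}")
```

Whichever cutoff you pick, the downstream mixed-effects test is really a test about that cutoff's dichotomization, not about the continuous evidence.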
As far as I can tell, A-B is still the robust way to go. Increasing the sample size (number of subjects) is probably too costly for you at the moment. When you set a voxel-wise two-sided p-threshold of 0.1 (forgetting about FWE correction) for A-B, do you see at least half of the voxels surviving the thresholding within the anatomical regions (not statistically defined clusters) you're interested in?
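That check might look like the following sketch (the t-map, ROI mask, and degrees of freedom are all made up here; in practice they would come from your A-B contrast and an anatomical atlas):

```python
# Hypothetical example: fraction of an anatomical ROI's voxels whose A-B
# t-statistic passes a lenient two-sided p < 0.1, with no FWE correction.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
t_map = rng.normal(0.8, 1.0, size=(16, 16, 16))   # fake A-B t-stat volume
roi_mask = np.zeros((16, 16, 16), dtype=bool)     # fake anatomical ROI
roi_mask[4:12, 4:12, 4:12] = True

df = 18                                           # hypothetical degrees of freedom
t_crit = stats.t.ppf(1 - 0.1 / 2, df)             # two-sided p = 0.1 cutoff

roi_t = np.abs(t_map[roi_mask])
frac_surviving = float(np.mean(roi_t > t_crit))
print(f"fraction of ROI voxels with |t| > {t_crit:.3f}: {frac_surviving:.2f}")
```

If roughly half or more of the ROI's voxels pass even this lenient cutoff, that is a hint that the A-B effect is spatially broad within the region rather than confined to the thresholded clusters.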
Gang
Edited 1 time(s). Last edit at 05/17/2019 06:03PM by Gang.