The difference I noticed was actually in the stats for the individual conditions, not in the pairwise contrasts. For the pairwise contrasts I used the -paired flag, and yes, the stats are identical to the 3dANOVA2 stats (when there are only two conditions). But the major difference I found was between the one-sample 3dttest for one of the conditions and the 3dANOVA2 amean t-stat for the same condition.
This is beginning to make some sense to me now (although it is still cloudy), in that the additional information -- knowing that the two different conditions were measured in the same subjects -- gives greater statistical power, even when we only want to know about one of the conditions. And adding more conditions, going from 2 to 6 but maintaining the number of subjects, seems to improve the significance further (not a huge amount, but noticeable).
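A minimal sketch of the idea for a single voxel, in plain Python. Everything here is an assumption for illustration: the data are made up, the function names are hypothetical, and the pooled error term (the subject-by-condition interaction mean square) is just one possible choice. It is not meant to reproduce what 3dttest or 3dANOVA2 actually compute; it only shows how an error term built from all conditions carries more degrees of freedom than a test on one condition alone.

```python
import math
from statistics import mean, stdev

# Hypothetical data for ONE voxel: rows are subjects, columns are conditions.
DATA = [
    [1.2, 0.8, 1.5],
    [0.9, 1.1, 1.3],
    [1.6, 1.0, 1.9],
    [0.7, 0.5, 1.0],
]

def one_sample_t(data, cond):
    """t-stat and df for one condition, using only that condition's values."""
    values = [row[cond] for row in data]
    n = len(values)
    return mean(values) / (stdev(values) / math.sqrt(n)), n - 1

def pooled_error_t(data, cond):
    """t-stat and df for one condition's mean, with an error term pooled
    over all conditions (ASSUMPTION: subject-by-condition interaction
    mean square). Needs at least two subjects and two conditions."""
    n, a = len(data), len(data[0])
    grand = mean([x for row in data for x in row])
    row_means = [mean(row) for row in data]
    col_means = [mean([row[j] for row in data]) for j in range(a)]
    # Interaction residual: what is left after removing the subject
    # and condition main effects from each observation.
    ss = sum(
        (data[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(a)
    )
    df = (n - 1) * (a - 1)
    ms_error = ss / df
    return col_means[cond] / math.sqrt(ms_error / n), df

t_single, df_single = one_sample_t(DATA, 0)
t_pooled, df_pooled = pooled_error_t(DATA, 0)
print(f"one-sample:   t = {t_single:.3f}, df = {df_single}")
print(f"pooled error: t = {t_pooled:.3f}, df = {df_pooled}")
```

Under this assumed error term the degrees of freedom are (subjects - 1) x (conditions - 1), so with two conditions they equal those of the one-sample test and grow as conditions are added with the same subjects. That is at least consistent with the significance changing when going from 2 to 6 conditions, although the t values themselves also differ because the two error estimates measure different things.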
I had another thought about ANOVA. Why not treat the voxels as fixed levels of a third factor in a three-way ANOVA? That way you could take into account the inter-voxel variance, inter-condition variance, and inter-subject variance. Doing the statistical tests separately on each voxel seems to disregard the information we have about how noisy a single subject's data is. Some subjects will have better signal-to-noise, but the test can't take that into account unless you measure the inter-voxel variance. So is there some reason why this would be wrong to do? Or would it just be impractical and computationally prohibitive?