It is not clear to me what this sentence means:
The averaged results are not what I expected and they look similar to me.
I take the last clause to mean that the averaged results from fnirt and 3dQwarp are similar. But does the first clause mean that you didn't expect them to be similar, or that the results are not what you expected for some other reason?
Assuming that it is the similarity between the fnirt results and the 3dQwarp results that puzzles you, I can suggest two reasons for this. The first would be the use of the MNI152 template. This template shows very little detail outside of the major sulci (such as the Sylvian fissure), so there is little information present to help align finer structures. Almost any nonlinear registration method should end up giving similar average results when trying to align to such an ambiguous target.
The second would be the level of detail allowed in the warp. If you did use a template target with more detail, then you might also want to add options to 3dQwarp (or fnirt) to tell it how detailed a warp to use. In your case, you have a 2 mm grid template, and the default refinement level of 3dQwarp stops at 25 voxels = 50 mm patches. This is fairly crude, and cannot register finer details. You would get more precise alignments if you used a non-blurry template at a 1 mm resolution and continued down to '-minpatch 19'.
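
To make the patch-size arithmetic concrete, here is a minimal Python sketch. It assumes a patch's physical width is simply the patch size in voxels multiplied by the grid spacing in mm, which is how the 25 voxels = 50 mm figure comes about. The function name patch_size_mm is made up for illustration and is not part of AFNI.

```python
def patch_size_mm(minpatch_voxels, voxel_size_mm):
    """Physical edge length (mm) of the smallest warp patch.

    Assumption: patch width in mm = patch width in voxels * grid spacing.
    This is a hypothetical helper for illustration, not an AFNI function.
    """
    return minpatch_voxels * voxel_size_mm


# Default 3dQwarp refinement (25 voxels) on a 2 mm template grid
default_patch = patch_size_mm(25, 2.0)

# '-minpatch 19' on a 1 mm template grid
finer_patch = patch_size_mm(19, 1.0)

print(f"default on 2 mm grid:      {default_patch:.0f} mm patches")
print(f"-minpatch 19 on 1 mm grid: {finer_patch:.0f} mm patches")
```

Under that assumption, the smallest patch shrinks from 50 mm to 19 mm, which is why the finer template and smaller minimum patch can capture more local detail.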