These kinds of differences are somewhat expected. Differences in the acquisition protocols change the contrast, but anatomical variability is the bigger effect here, and more than the nonlinear warping can accommodate. While the overall alignment in your images is very good, you can see lots of small areas that don't match completely, so regions transformed from meticulously placed locations in the native space won't end up exactly in the right places. There are a couple of things you can try to improve the match.
1. Add the -nopenalty option to the nonlinear warping. That allows for more distortion, which can give a better match, but it also allows for distortion you don't want. @animal_warper uses -extra_qw_opts to pass extra options like that through to 3dQwarp, which does the nonlinear warping.
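As a sketch of option 1, the call might look like this; the dataset, template, and output directory names here are placeholders for your own files:

```shell
# Rerun @animal_warper, passing -nopenalty through to 3dQwarp so the
# nonlinear warp is allowed more distortion (for better and worse).
@animal_warper                        \
    -input   sub01_anat+orig          \
    -base    my_template+tlrc         \
    -outdir  aw_results_nopenalty     \
    -extra_qw_opts '-nopenalty'
```

Compare the new warped dataset against the previous run before trusting it; the extra freedom can distort structures you actually wanted to keep rigid.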
2. Limit the alignment to the small area you want. This will probably work best as a second alignment pass. The initial affine alignment fits the overall volumes together, so local differences aren't as important there. Nonlinear warping, however, is smooth across the whole dataset, so alignment in one part of the brain affects alignment in another. The neighborhoods used in the nonlinear warping limit part of that interaction, so the effect isn't too bad, but you can work around it by aligning just a subsection of the dataset. If you start off close from a previous alignment, you should be able to align the data, provided there is sufficient contrast in both the template and the subject's image and you know the approximate part of the template you want.
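A minimal sketch of that second, region-limited pass, assuming a previous run already produced an aligned anatomical and its warp dataset; every file name here is a placeholder, and the region box is something you would define yourself around your area of interest:

```shell
# Crop both the template and the previously aligned subject down to a
# box around the region of interest, using a small "master" dataset
# that covers just that box.
3dZeropad -master region_box+tlrc -prefix template_crop  my_template+tlrc
3dZeropad -master region_box+tlrc -prefix subj_crop      anat_warped+tlrc

# Refine the alignment within just that region. Starting from the
# existing warp (-iniwarp) at a later level (-inilev) means 3dQwarp
# only has to make small, local adjustments.
3dQwarp -base    template_crop+tlrc     \
        -source  subj_crop+tlrc         \
        -iniwarp anat_warped_WARP+tlrc  \
        -inilev  5                      \
        -prefix  subj_region_qw
```

The -iniwarp/-inilev combination is the usual way to resume a 3dQwarp alignment from a prior result rather than starting from scratch; the specific inilev value is only illustrative and worth experimenting with.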