Intraclass correlation coefficients (ICCs) are a reasonable way of measuring test-retest reliability. The maths behind them isn't too difficult, and they could be computed voxelwise using basic commands like 3dcalc. Alternatively, if you have specific regions of interest (ROIs), you could extract % signal change from each ROI and calculate ICCs offline using standard stats software (an approach we used with the amygdala).
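For the offline ROI route, here is a minimal sketch of the calculation in Python with NumPy. It assumes one % signal change value per subject per session, arranged as a subjects x sessions array, and uses the standard two-way ANOVA formulas for the single-measure ICCs (absolute agreement and consistency). The function name `icc_two_way` and the numbers in `roi_psc` are made up for illustration; they are not from any real dataset or package.

```python
import numpy as np

def icc_two_way(data):
    """Return (ICC(2,1), ICC(3,1)) for a subjects x sessions array.

    ICC(2,1): single-measure absolute agreement (session offsets count as error).
    ICC(3,1): single-measure consistency (session offsets are ignored).
    """
    x = np.asarray(data, dtype=float)
    n, k = x.shape                      # n subjects, k sessions
    grand = x.mean()

    # Two-way ANOVA sums of squares, one observation per cell
    ss_subj = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_sess = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_total = np.sum((x - grand) ** 2)
    ss_err = ss_total - ss_subj - ss_sess

    ms_subj = ss_subj / (n - 1)
    ms_sess = ss_sess / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    icc_agreement = (ms_subj - ms_err) / (
        ms_subj + (k - 1) * ms_err + k * (ms_sess - ms_err) / n
    )
    icc_consistency = (ms_subj - ms_err) / (ms_subj + (k - 1) * ms_err)
    return icc_agreement, icc_consistency

# Hypothetical ROI % signal change: 5 subjects, 2 sessions (illustrative values only)
roi_psc = [[0.21, 0.25],
           [0.40, 0.38],
           [0.10, 0.15],
           [0.33, 0.30],
           [0.27, 0.29]]

agreement, consistency = icc_two_way(roi_psc)
print(f"ICC(2,1) = {agreement:.3f}, ICC(3,1) = {consistency:.3f}")
```

The same arithmetic applies voxelwise if you stack the per-session maps. Which form of ICC is appropriate depends on whether a systematic shift between sessions should count against reliability.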
A more general form of the ICC is a variance components analysis, which shows how much of the variance in a given contrast is attributable to each of the factors and interactions in an experiment (or across multiple experiments). This includes the session and session-by-subject factors, both of which affect test-retest reliability.
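As a rough illustration of the idea, the sketch below estimates variance components by the method of moments from a two-way (subjects x sessions) layout. It assumes a single measurement per subject per session, in which case the session-by-subject interaction cannot be separated from residual error and the two are reported together. The function name `variance_components` and the example values are hypothetical, and a full analysis with more factors would normally be done with a mixed-model routine in a stats package.

```python
import numpy as np

def variance_components(data):
    """Method-of-moments variance components for a subjects x sessions array.

    Returns a dict with 'subject', 'session' and 'residual' variances.
    With one value per cell, 'residual' includes the session-by-subject
    interaction. Negative estimates are truncated at zero.
    """
    x = np.asarray(data, dtype=float)
    n, k = x.shape                      # n subjects, k sessions
    grand = x.mean()

    ss_subj = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_sess = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_subj - ss_sess

    ms_subj = ss_subj / (n - 1)
    ms_sess = ss_sess / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return {
        "subject": max((ms_subj - ms_err) / k, 0.0),
        "session": max((ms_sess - ms_err) / n, 0.0),
        "residual": max(ms_err, 0.0),
    }

# Hypothetical contrast values: 5 subjects, 2 sessions (illustrative values only)
contrast = [[0.21, 0.25],
            [0.40, 0.38],
            [0.10, 0.15],
            [0.33, 0.30],
            [0.27, 0.29]]

vc = variance_components(contrast)
total = sum(vc.values())
for name, value in vc.items():
    print(f"{name:>8}: {value:.5f} ({100 * value / total:.1f}% of total)")
```

The share of the total taken by the subject component corresponds to an absolute-agreement ICC, which is the sense in which the variance components view generalises it.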
The count of suprathreshold voxels is perhaps not such a good measure, because the thresholding is arbitrary and categorical. Two sessions could give almost identical activation, but if it fell 0.000001% above threshold in one session and 0.000001% below threshold in the next, the apparent repeatability in terms of the number of "activated" voxels would be very low.
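A toy example makes the point concrete. The sketch below builds two synthetic statistic maps that differ by a tiny amount around an arbitrary threshold, then compares the suprathreshold voxel counts and their Dice overlap. The threshold, the map values and the helper name `dice_overlap` are all invented for illustration.

```python
import numpy as np

def dice_overlap(mask_a, mask_b):
    """Dice coefficient between two boolean masks (0.0 if both are empty)."""
    size_sum = mask_a.sum() + mask_b.sum()
    if size_sum == 0:
        return 0.0
    return 2.0 * np.logical_and(mask_a, mask_b).sum() / size_sum

threshold = 3.1                     # arbitrary statistic threshold (assumed)

# 1000 synthetic voxels; the first 200 form the "activated" region
session1 = np.zeros(1000)
session2 = np.zeros(1000)
session1[:200] = 3.1001             # just above threshold
session2[:200] = 3.0999             # just below threshold

mask1 = session1 > threshold
mask2 = session2 > threshold

max_diff = np.max(np.abs(session1 - session2))
print("largest voxelwise difference:", max_diff)
print("suprathreshold voxels:", int(mask1.sum()), "vs", int(mask2.sum()))
print("Dice overlap:", dice_overlap(mask1, mask2))
```

The two maps are practically the same, yet the count-based measures report no agreement at all, whereas a measure computed on the unthresholded values would not be affected in this way.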