I think that averaging (or summing) across the lags you are most interested in (typically 4-8 seconds poststimulus) is pretty safe, especially if you confirm it by looking at the deconvolved waveforms. The issue of negative betas cancelling out is more relevant if you are averaging over the entire duration of a hemodynamic response, in which case you might combine the positive peak and the poststimulus undershoot (and possibly the seldom-seen initial dip). Of course, if the task has a complex temporal design, such as delayed match-to-sample, then all bets are off: you really need to look at the waveforms before deciding which parameter of the response is most important to measure. It's not cheating; it's a reasonable approach to the analysis of complex quantitative data.
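To make the cancellation point concrete, here is a rough sketch in Python. The beta values are made up for illustration (they are not from any real dataset), and the TR of 2 seconds and the `window_mean` helper are my own assumptions, not part of any analysis package. It only compares the mean beta over the 4-8 second lags with the mean over the whole estimated response.

```python
from statistics import mean

# Hypothetical deconvolved betas, one per lag (made-up numbers, not real data).
# Assumed TR of 2 s, so the lags are 0, 2, 4, ..., 18 s poststimulus.
TR = 2.0
betas = [0.0, 0.2, 0.9, 1.0, 0.6, 0.0, -0.6, -0.8, -0.6, -0.3]
lags = [i * TR for i in range(len(betas))]


def window_mean(betas, lags, start, stop):
    """Mean beta over the lags falling in [start, stop] seconds, inclusive."""
    selected = [b for b, t in zip(betas, lags) if start <= t <= stop]
    return mean(selected)


# Average over the lags of interest (the positive peak).
peak_mean = window_mean(betas, lags, 4.0, 8.0)

# Average over the entire response: peak and undershoot largely cancel.
full_mean = mean(betas)

print(f"mean beta, 4-8 s lags: {peak_mean:.2f}")
print(f"mean beta, all lags:   {full_mean:.2f}")
```

With numbers like these, the 4-8 second window gives a clearly positive value while the whole-response average comes out near zero, which is why it is worth plotting the waveform before choosing the window.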
Can't resist the temptation to plug my own work: we recently published a paper on exactly this issue, showing that the poststimulus undershoot can cancel out the positive peak of the BOLD response, even in a block design. It includes some illustrations of possible data analysis approaches that may be helpful to you.
-Jed
The reference:
Meltzer, J.A., Negishi, M., Constable, R.T. Biphasic hemodynamic responses influence deactivation and may mask activation in block-design fMRI paradigms. Human Brain Mapping, 2008 Apr; 29(4):385-99.