If subjects are speaking in the scanner, you would expect large movements at specific times. If using the movement parameters as regressors isn't effective, you could try censoring the timepoints where the subjects are speaking. Since the hemodynamic response is delayed by several seconds, the speech usually happens before the response it evokes, so the speech-related motion may make up a large part of what you're calling "baseline". If your baseline is actually contaminated by motion artifact, you might see apparent deactivation. Rasmus Birn has done work on separating the motion due to speech from the fMRI signal.
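
As a rough illustration of the censoring idea, here is a minimal Python sketch that builds a per-volume censor vector from speech timing. It assumes you know when each utterance started and how long it lasted (in seconds from the start of the run) and that volumes are acquired at a constant TR. The function name, its parameters, and the example timings are all hypothetical, and the 1 = keep / 0 = censor convention is an assumption; check what your analysis package expects.

```python
import math

def speech_censor_vector(n_vols, tr, speech_onsets, speech_durations, pad=0.0):
    """Return a list with one entry per volume: 1 = keep, 0 = censor.

    A volume is censored if its acquisition window [v*tr, (v+1)*tr)
    overlaps any speech interval, optionally widened by `pad` seconds
    on each side. All names here are illustrative, not from any package.
    """
    keep = [1] * n_vols
    for onset, dur in zip(speech_onsets, speech_durations):
        start = max(0.0, onset - pad)
        stop = onset + dur + pad
        first = int(math.floor(start / tr))   # first overlapping volume
        last = int(math.ceil(stop / tr))      # one past the last overlapping volume
        for v in range(first, min(last, n_vols)):
            keep[v] = 0
    return keep

# Hypothetical run: 10 volumes, TR = 2 s, one utterance from 4 s to 7 s.
censor = speech_censor_vector(n_vols=10, tr=2.0,
                              speech_onsets=[4.0], speech_durations=[3.0])
print(censor)
```

With these made-up timings, the volumes acquired at 4-6 s and 6-8 s are the ones flagged. The pad argument is there in case the movement starts a little before or lingers a little after the recorded speech.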
Sally