Scaling and Percent Signal Change
Whether to scale the data before individual subject analysis with 3dDeconvolve or to convert regression coefficients to percent signal change afterwards is a recurring issue in fMRI analysis. There is probably no ideal definition of percent signal change that everybody can agree upon. However, since either way the operation amounts to a rescaling, the difference between the approaches is usually minimal, and one can adopt whichever approach one feels comfortable with.
Practically, it is easier and simpler to scale the signal by its mean before individual subject analysis than to divide the regression coefficients by the baseline after the regression analysis with 3dDeconvolve. If there are multiple runs of data, conversion after individual subject analysis is vulnerable to cross-run variation in baseline. For example, with two runs of unscaled data there are two different baseline estimates. The problem is, how should the percent signal change be calculated from the two baseline values? Even more troublesome, cross-run variation could make the regression model of the unscaled signal inaccurate. The original BOLD signal values do not carry any absolute meaning or scale, so if multiple runs of unscaled EPI data are combined, an analysis without proper scaling would be pretty shaky, to say the least.
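The cross-run problem can be illustrated with a minimal Python sketch. The two runs below are made-up numbers (not from any real dataset): the second run is the first multiplied by 0.8, as if the scanner gain had changed between runs. The function name scale_by_mean is hypothetical.

```python
# Two hypothetical runs of the same voxel; run2 is run1 times 0.8,
# mimicking an arbitrary change of raw intensity scale between runs.
run1 = [2350.0, 2350.0, 2450.0, 2450.0]
run2 = [1880.0, 1880.0, 1960.0, 1960.0]


def scale_by_mean(run):
    """Scale a run so that its temporal mean becomes 100."""
    mean = sum(run) / len(run)
    return [100.0 * value / mean for value in run]


scaled1 = scale_by_mean(run1)
scaled2 = scale_by_mean(run2)

# Effect = (value during the condition) - (value at rest), per run.
raw_effects = (run1[2] - run1[0], run2[2] - run2[0])
scaled_effects = (scaled1[2] - scaled1[0], scaled2[2] - scaled2[0])

print("raw effects:   ", raw_effects)
print("scaled effects:", scaled_effects)
```

In raw units the same response looks different in the two runs, while after per-run scaling the two effects agree, so a single regression coefficient can describe both runs.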
On the other hand, it can be argued that it is not accurate to scale the signal by its mean, especially when the BOLD response to the stimuli is strong. Here I will use a concrete example to demonstrate what the difference is and how big it can be.
Suppose the signal intensity at a voxel has a mean value of 2400 (average signal intensity across time in one run of data), and at some time point corresponding to a condition the intensity is 2450.
First let's scale the data before individual subject analysis. The scaled value of the above voxel would be (2450/2400)*100 ~ 102.083. For simplicity let's assume there is no drift across time. Suppose the real baseline value is 2350 (which is understandably a little below the mean value of 2400). The corresponding scaled baseline would be (2350/2400)*100 ~ 97.917. Then the percent signal change of the condition is estimated as
102.083 - 97.917 ~ 4.167%,
which is presumably the regression coefficient you would get out of 3dDeconvolve.
Now we analyze the data without scaling. The percent signal change of the condition under this approach would be calculated as
100*(2450-2350)/2350 ~ 4.255%
Considering the accuracy of fMRI analysis, the difference between 4.167% and 4.255% is really negligible.
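The arithmetic of this example can be reproduced with a short Python sketch. The numbers are the ones used in the text; the function names are hypothetical and only serve the illustration.

```python
mean_intensity = 2400.0    # temporal mean of the run
condition_value = 2450.0   # intensity at a time point during the condition
true_baseline = 2350.0     # assumed real baseline, with no drift


def percent_change_scaled(value, baseline, mean):
    """Scale by the mean first (mean -> 100), then take the difference."""
    scaled_value = 100.0 * value / mean
    scaled_baseline = 100.0 * baseline / mean
    return scaled_value - scaled_baseline


def percent_change_unscaled(value, baseline):
    """Work in raw units, then divide the change by the baseline."""
    return 100.0 * (value - baseline) / baseline


p_scaled = percent_change_scaled(condition_value, true_baseline, mean_intensity)
p_unscaled = percent_change_unscaled(condition_value, true_baseline)

print(round(p_scaled, 3), round(p_unscaled, 3))  # 4.167 4.255
print(round(p_scaled / p_unscaled, 4))           # baseline / mean
```

The ratio of the two estimates equals baseline/mean (2350/2400 here), which is why the gap stays small as long as the baseline is close to the mean.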
Keep in mind that if there is only one run of data, the fit of the regression does not depend on whether you scale your data or not. Scaling has no effect on the statistics either, because t and F statistics are dimensionless, but it does have a slight effect on the accuracy of the percent signal change calculation. The discrepancy is really small in most cases. The following theoretical analysis provides a closer look at the situation.
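A small sketch can make the single-run claim concrete. It fits a deliberately simplified model, y = a0 + b*s with no drift term, by ordinary least squares in plain Python; this is not what 3dDeconvolve does internally, and the data and the function name fit_simple_regression are made up for illustration.

```python
import math


def fit_simple_regression(s, y):
    """Ordinary least squares for y = a0 + b*s.

    Returns (a0, b, t), where t is the t statistic of b.
    """
    n = len(y)
    s_mean = sum(s) / n
    y_mean = sum(y) / n
    sxx = sum((si - s_mean) ** 2 for si in s)
    sxy = sum((si - s_mean) * (yi - y_mean) for si, yi in zip(s, y))
    b = sxy / sxx
    a0 = y_mean - b * s_mean
    residual_ss = sum((yi - a0 - b * si) ** 2 for si, yi in zip(s, y))
    standard_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return a0, b, b / standard_error


# Hypothetical single run: stimulus off/on indicator and noisy raw signal.
s = [0, 0, 1, 1, 0, 0, 1, 1]
y = [2352.0, 2347.0, 2451.0, 2449.0, 2349.0, 2353.0, 2448.0, 2452.0]

m = sum(y) / len(y)
y_scaled = [100.0 * value / m for value in y]

a0_raw, b_raw, t_raw = fit_simple_regression(s, y)
a0_scaled, b_scaled, t_scaled = fit_simple_regression(s, y_scaled)

print("t statistics:", t_raw, t_scaled)
print("percent change, unscaled then divided by a0:", 100.0 * b_raw / a0_raw)
print("percent change, scaled first:               ", b_scaled)
```

The two t statistics agree up to floating-point rounding, while the two percent signal change estimates differ slightly, by the factor a0/m.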
Assume that the BOLD signal intensity at a voxel is y(t) and, for simplicity, that there is only one stimulus, whose regressor is s(t). We can model the signal with the following regression equation:
y(t) = a0 + a1t + b s(t) + e(t)
where a0 is the baseline constant, a1 is the slope of the linear trend, b is the regression coefficient corresponding to the stimulus s(t), and e(t) is white noise.
One approach to obtaining the percent signal change due to stimulus s(t) at the above voxel is to run the regression without any normalization and then divide the fitted model by the baseline constant a0, which gives
100y(t)/a0 = 100 + (100a1/a0)t + (100b/a0) s(t) + e1(t)
so that the percent signal change is
p1 = 100b/a0
It might be more accurate to take the linear trend a1 into account, but with the newer versions of 3dDeconvolve this is no longer necessary, because the default baseline model is a set of Legendre polynomials whose constant term is well centered.
Another approach is to scale before regression: calculate the mean m of the signal for each run and divide the signal by it, which gives
100y(t)/m = 100a0/m + (100a1/m)t + (100b/m) s(t) + e2(t)
with the corresponding percent signal change
p2 = 100b/m = (100b/a0) (a0/m) = (a0/m) p1
As seen above, the two approaches differ by a factor of a0/m. Usually this factor is close to 1. Only if it is far from 1 (e.g., the baseline constant, 100*a0/m, of the 2nd approach is 120) should it raise some concern. In the more typical case where the baseline lies below the mean (a0 < m), the 2nd approach underestimates percent signal changes by a factor of m/a0; if a0 > m, it overestimates them instead.
So if you scale your data before running 3dDeconvolve, a cautionary examination of the baseline constants (how far away they are from 100) should give you some idea about the accuracy of the regression coefficients serving as percent signal changes. And if you are really concerned, the safest and most accurate approach is indeed to run the regression analysis without normalization and convert regression coefficients to percent signal change afterwards (caveat: this might be true only when there is one run of data; see below for more details).
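The check described here can be sketched in Python for a single voxel. It relies only on the relation p2 = (a0/m) p1 derived above; the function names and the tolerance of 10 are arbitrary choices for illustration, not AFNI defaults.

```python
def recover_unscaled_percent_change(p2, scaled_baseline):
    """Convert p2 (scaled-first estimate) back to p1 = p2 * m/a0.

    scaled_baseline is the baseline constant of the scaled fit, 100*a0/m.
    """
    return p2 * 100.0 / scaled_baseline


def baseline_is_suspicious(scaled_baseline, tolerance=10.0):
    """Flag baseline constants far from 100 (tolerance is hypothetical)."""
    return abs(scaled_baseline - 100.0) > tolerance


# (p2, scaled baseline constant) pairs; the first matches the example
# with mean 2400 and baseline 2350, the second is a made-up extreme case.
examples = [
    (100.0 * 100.0 / 2400.0, 100.0 * 2350.0 / 2400.0),
    (4.0, 120.0),
]

for p2, constant in examples:
    p1 = recover_unscaled_percent_change(p2, constant)
    flag = baseline_is_suspicious(constant)
    print(f"baseline={constant:.3f} p2={p2:.3f} p1={p1:.3f} suspicious={flag}")
```

For a baseline constant near 100 the correction is minor, whereas a constant of 120 changes the estimate by a sixth, which is the kind of case worth a second look.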
Regarding scaling before 3dDeconvolve, refer to HowTo#5
The following 3dcalc script can be employed to convert regression coefficients into percent signal change after 3dDeconvolve if you are concerned about the accuracy of the percent signal change obtained through scaling. Here -a is the baseline constant (or the averaged constant if multiple runs exist) and -b is the sub-brick with the regression coefficient. The step function sets the result to 0 wherever abs(b/a) is 1 or more, which keeps values from blowing up where the baseline is close to 0.
3dcalc \
-a "FileName+orig" \
-b "FileName+orig" \
-expr "100 * b/a * step(1 - abs(b/a))"
With the default Legendre polynomial fitting for the baseline and drift in 3dDeconvolve, the baseline coefficient is essentially the mean of the whole baseline plus drift, since all the drift terms are centered around 0. However, if there are multiple runs of data, the above 3dcalc script has to be modified to average all the baselines, and the accuracy would unfortunately be further compromised.
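For readers who want to see what the 3dcalc expression does to a single voxel, including the multi-run averaging of baselines, here is a minimal Python sketch. It mirrors the expression on plain numbers rather than on datasets. The function names are hypothetical, and returning 0 for a zero baseline is a choice made for this sketch.

```python
def step(x):
    """Mirror of the step function in the expression: 1 if x > 0, else 0."""
    return 1.0 if x > 0 else 0.0


def percent_signal_change(beta, baselines):
    """Apply 100 * b/a * step(1 - abs(b/a)) with a = mean of the baselines.

    baselines holds one baseline constant per run (a single-element list
    for one run).
    """
    a = sum(baselines) / len(baselines)
    if a == 0:
        return 0.0  # sketch-specific guard against division by zero
    ratio = beta / a
    return 100.0 * ratio * step(1.0 - abs(ratio))


print(percent_signal_change(100.0, [2350.0]))          # one run
print(percent_signal_change(100.0, [2350.0, 2250.0]))  # two runs, averaged
print(percent_signal_change(50.0, [10.0]))             # baseline near 0
```

The last call shows the role of the step function: when the coefficient exceeds the baseline in magnitude, the result is set to 0 instead of an implausibly large percentage.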
Last modified 2008-11-10 09:22