Group Analysis with AFNI - Hands On
The following sample group analysis comes from ÒHow-to #5 -- Group Analysis: AFNI 3dANOVA2Ó, described in full detail on the AFNI website:
http://afni.nimh.gov/pub/dist/HOWTO/howto/ht05_group/html
Brief description of experiment :
Design:
Rapid event-related
Independent Variable Þ ÒStimulus ConditionÓ with 4 levels:
TM = Tool Movie
HM = Human Movie
TP = Tool Point Light
HP = Human Point Light

"Data Collected:"
Data Collected:
1 Anatomical (SPGR) dataset for each subject
124 sagittal slices
10 Time Series (EPI) datasets for each subject
23 axial slices x 138 volumes = 3174 volumes/timepoints per run
note: each run consists of random presentations of rest and all 4 stimulus condition levels
TR = 2 sec; voxel dimensions = 3.75 x 3.75 x 5 mm
Sample size, n=3 (subjects ED, EE, EF)
 note: original study had n=10
Analysis Steps:
 Part I: Process data for each subject first
Pre-process subjectsÕ data Þ many steps involved hereÉ
Run deconvolution analysis on each subjectÕs dataset --- 3dDeconvolve
 Part II: Run group analysis
2-way Analysis of Variance (ANOVA) --- 3dANOVA2
i.e., Stimulus Condition (4) x Subjects (3) = 2-way ANOVA

"PART I Þ Process Data..."
PART I Þ Process Data for each Subject First:
Hands-on example: Subject ED
We will begin with EDÕs anat dataset and 10 time-series (3D+time) datasets:
EDspgr+orig, EDspgr+tlrc, ED_r1+orig, ED_r2+orig É ED_r10+orig
Below is EDÕs ED_r1+orig (3D+time) dataset.  Notice the first two time points of the time series have relatively high intensity*.  We will need to remove them later:

"STEP 1:"
 STEP 1: Volume Register and time shift the voxel time series for each     3D+time dataset using AFNI program 3dvolreg
We will also remove the first 2 time points at this step
Use the ÒforeachÓ loop to do all 10 of EDÕs 3D+time datasets at once:
foreach run (1 2 3 4 5 6 7 8 9 10)
3dvolreg  -base ED_r1+origÕ[2]Õ \
     -tshift 0 \
     -prefix ED_r{$run}_vr     \
     ED_r{$run}+origÕ[2..137]Õ
end
-base: Timepoint 2 is our base/target volume to which the remaining timepoints (3-137) will be aligned.  We are ignoring timepoints 0 and 1
-tshift 0: For each run, the slices within each volume are being time shifted prior to volume registration.  This is done to correct for the different sampling times for each slice
-prefix gives our output files a new name, e.g., ED_r1_vr+orig
ED_r{$run}+origÕ[2..137]Õ refers to our input datasets (runs 1-10) that will be time shifted and volume registered.  Notice that we are removing timepoints 0 and 1

"Subject EDÕs newly created time..."
Subject EDÕs newly created time shifted and volume registered datasets:
ED_r1_vr+orig.HEAD ED_r1_vr+orig.BRIK
        É         É
ED_r10_vr+orig.HEAD ED_r10_vr+orig.BRIK
Below is run 1 of EDÕs time shifted and volume registered datasets, ED_r1_vr+orig:

"STEP 2:"
 STEP 2: Smooth 3D+time datasets with AFNI 3dmerge
The result of spatial blurring (filtering) is somewhat cleaner, more contiguous activation blobs
Spatial blurring will be done on EDÕs time shifted, volume registered datasets:
foreach run (1 2 3 4 5 6 7 8 9 10)
3dmerge -1blur_rms 4          \
  -doall           \
  -prefix ED_r{$run}_vr_bl   \
  ED_r{$run}_vr+orig
end
-1blur_rms 4 sets the Gaussian filter to have a root mean square deviation of 4mm (You decide the width of the filter)
-doall applies the editing option (in this case the Gaussian filter) to all sub-bricks uniformly in each dataset

"Result from 3dmerge:"
Result from 3dmerge:

"STEP 3:"
 STEP 3: Normalizing the Data - Calculating Percent Change
This particular step is a bit more involved, because it is comprised of three parts.  Each part will be described in detail:
Set all background values (outside of the volume) to zero with 3dAutomask
Do a voxel-by-voxel calculation of the mean intensity value with 3dTstat
Do a voxel-by-voxel calculation of the percent signal change with 3dcalc
Why should we normalize our data?
Normalization becomes an important issue when comparing data across subjects, because baseline/rest states will vary from subject to subject
The amount of activation in response to a stimulus event will also vary from subject to subject
As a result, the baseline Ideal Response Function (IRF) and the stimulus IRF will vary from subject to subject -- we must account for this variability
By converting to percent change, we can compare the activation calibrated with the relative change of signal, instead of the arbitrary baseline of FMRI signal

"For example:"
For example:
Subject 1 - Signal in hippocampus goes from 1000 (baseline) to 1050     (stimulus condition)
Difference = 50 IRF units
Subject 2 - Signal in hippocampus goes from 500 (baseline) to 525     (stimulus condition)
Difference = 25 IRF units
Conclusion:
Subject 1 shows twice as much activation in response to the stimulus condition than does Subject 2 --- WRONG!!
If ANOVA were run on these difference scores, the change in baseline from subject to subject would add variance to the analysis
We must control for these differences in baseline across subjects by somehow normalizing the baseline so that a reliable comparison between subjects can be made

"Solution:"
Solution:
Compute Percent Signal Change
 i.e., by what percent does the Ideal Response Function increase with presentation of the stimulus condition, relative to baseline?
Percent Change Calculation:
If A = Stimulus IRF
If B = Baseline IRF
Percent Signal Change  =  (A/B) * 100%

"Subject 1 -- Stimulus (..."
Subject 1 -- Stimulus (A) = 1050, Baseline (B) = 1000
(1050/1000) * 100%  = 105% or   5% increase in IRF
Subject 2 -- Stimulus (A) = 525, Baseline (B) = 500
(525/500) * 100%  = 105% or       5% increase in IRF
Conclusion:
Both subjects show a 5% increase  in signal change from baseline to stimulus condition
Therefore, no significant difference in signal change between these two subjects

"STEP 3A:"
 STEP 3A: Ignore any background values in a dataset by creating a         mask with 3dAutomask
Values in the background have very low baseline values, which can lead to artificially large percent signal change values.  LetÕs remove them altogether by creating a mask of our dataset, where values inside the brain are assigned a value of Ò1Ó and values outside of the brain (e.g., noise) are assigned a value of Ò0Ó
This mask will be used later when the percent signal change in each voxel is calculated.  A percent change will be computed only for voxels inside the mask
A mask will be created for each of Subject EDÕs time shifted, volume registered, and blurred 3D+time datasets:
foreach run (1 2 3 4 5 6 7 8 9 10)
3dAutomask -prefix mask_r{$run}   \
    ED_r{$run}_vr_bl+orig
end
Output of 3dAutomask: A mask dataset for each 3D+time dataset:
mask_r1+orig, mask_r2+orig É mask_r10+orig

"Now letÕs take those 10..."
Now letÕs take those 10 masks (we donÕt need 10 separate masks) and combine them to make one master or Òfull maskÓ, which will be used to calculate the percent signal change only for values inside the mask (i.e., inside the brain).
3dcalc -- one of the most versatile AFNI programs -- is used to combine the 10 masks into one:
3dcalc -a mask_r1+orig -b mask_r2+orig -c mask_r3+orig \
-d mask_r4+orig -e mask_r5+orig -f mask_r6+orig \
-g mask_r7+orig -h mask_r8+orig -i mask_r9+orig \
-k mask_r10+orig    \
-expr Ôstep(a+b+c+d+e+f+g+h+i+j)Õ    \
-prefix full_mask
Output: full_mask+orig:

"STEP 3B:"
 STEP 3B: Create a voxel-by-voxel mean for each timeseries dataset with         3dTstat
For each voxel, add the intensity values of the 136 time points and divide by 136
The resulting mean will be inserted into the ÒBÓ slot of our percent signal change equation (A/B*100%)
foreach run (1 2 3 4 5 6 7 8 9 10)
3dTstat -prefix mean_r{$run}  \
ED_r{$run}_vr_bl+orig
end
Unless otherwise specified, the default statistic for 3dTstat is to compute a voxel-by-voxel mean
Other statistics run by 3dTstat include a voxel-by-voxel standard deviation, slope, median, etcÉ

"The end result will be..."
The end result will be a dataset consisting of a single mean value in each voxel.  Below is a graph of a 3x3 voxel matrix from subject EDÕs dataset mean_r1+orig:

"STEP 3C:"
 STEP 3C: Calculate a voxel-by-voxel percent signal change with         3dcalc
Take the 136 intensity values within each voxel, divide each one by the mean intensity value for that voxel (that we calculated in Step 3B), and multiply by 100 to get a percent signal change at each timepoint
This is where the A/B*100 equation comes into play
foreach run (1 2 3 4 5 6 7 8 9 10)
3dcalc -a ED_r{$run}_vr_bl+orig \
-b mean_r{$run}+orig   \
-c full_mask+orig   \
-expr Ò(a/b * 100) * cÓ  \
-prefix scaled_r{$run}
end
Output of 3dcalc: 10 normalized datasets for Subject ED, where the signal intensity value at each timepoint has now been replaced with a percent signal change value
scaled_r1+orig, scaled_r2+orig É scaled_r10+orig

"E.g., Timepoint #..."
E.g., Timepoint #18 above shows a percent signal change value of 101.7501
i.e., relative to the baseline (of 100), the stimulus presentation (and noise too) resulted in a percent signal change of 1.7501% at that specific timepoint

"STEP 4:"
 STEP 4: Concatenate EDÕs 10 normalized datasets into one big      dataset with 3dTcat
3dTcat -prefix ED_all_runs \
scaled_r1+orig \
scaled_r2+orig \
scaled_r3+orig \
scaled_r4+orig \
scaled_r5+orig \
scaled_r6+orig \
scaled_r7+orig \
scaled_r8+orig \
scaled_r9+orig \
scaled_r10+orig \
The output from 3dTcat is one big dataset -- ED_all_runs+orig -- which consists of 1360 volumes (i.e., 10 runs x 136 timepoints).  Every voxel in this large dataset contains percent signal change values
This output file will be inserted into the 3dDeconvolve program

"STEP 5:"
 STEP 5: Perform a deconvolution analysis on Subject EDÕs data with      3dDeconvolve
What is the difference between regular linear regression and deconvolution analysis?
With linear regression, the hemodynamic response is already assumed (we can get a fixed hemodynamic model by running the AFNI waver program)
With deconvolution analysis, the hemodynamic response is not assumed.  Instead, it is computed by 3dDeconvolve from the data
Once the HRF is modeled by 3dDeconvolve, the program then runs a linear regression on the data
To compute the hemodynamic response function with 3dDeconvolve, we include the ÒminlagÓ and ÒmaxlagÓ options on the command line
The user (you) must determine the lag time of an input stimulus
1 lag = 1 TR = 2 seconds
In this example, the lag time of the input stimulus has been determined to be about 15 lags
As such, we will add a ÒminlagÓ of 0 and a ÒmaxlagÓ of 14 in our 3dDeconvolve command

"3dDeconvolve -input ED_all..."
3dDeconvolve -input ED_all_runs+orig -num_stimts 4
-stim_file 1  all_stims.1DÕ[0]Õ -stim_label 1 ToolMovie    \
-stim_minlag 1 0  -stim_maxlag 1 14 -stim_nptr 1 2    \
-stim_file 2  all_stims.1DÕ[1]Õ -stim_label 2 HumanMovie    \
-stim_minlag 2 0  -stim_maxlag 2 14 -stim_nptr 2 2    \
-stim_file 3  all_stims.1DÕ[2]Õ -stim_label 3 ToolPoint    \
-stim_minlag 3 0  -stim_maxlag 3 14 -stim_nptr 3 2    \
-stim_file 4  all_stims.1DÕ[3]Õ -stim_label 4 HumanPoint    \
-stim_minlag 4 0  -stim_maxlag 4 14 -stim_nptr 4 2    \
-glt 4 contrast1.1D -glt_label 1 FullF    \
-glt 1 contrast2.1D -glt_label 2 HvsT    \
-glt 1 contrast2.1D -glt_label 3 MvsP    \
-glt 1 contrast2.1D -glt_label 4 HMvsHP    \
-glt 1 contrast2.1D -glt_label 5 TMvsTP    \
-glt 1 contrast2.1D -glt_label 6 HPvsTP    \ -glt 1 contrast2.1D -glt_label 7 HMvsTM    \
-iresp 1 TMirf -iresp 2 HMirf -iresp 3 TPirf -iresp 4 HPirf  \
-full_first  -fout  -tout  -nobout  -polort 2    \
-concat runs.1D    \
-progress 1000    \
-bucket ED_func

"-iresp 1 TMirf"
 -iresp 1 TMirf
 -iresp 2 HMirf
 -iresp 3 TPirf
 -iresp 4 HPirf
These output files are important because they contain the estimated Impulse Response Function for each stimulus type
The percent signal change is shown at each time lag
Below is the estimated IRF for Subject EDÕs ÒHuman MoviesÓ (HM) condition:

"Focusing on a single voxel..."
Focusing on a single voxel (from EDÕs HMirf+orig dataset), we can see that the IRF is made up of 15 time lags (0-14).  Recall that this lag duration was determined in the 3dDeconvolve command
Each time lag consists of a percent signal change value:

"To run an ANOVA,"
To run an ANOVA, only one data point can exist in each voxel
As such, the percent signal change values in the 15 lags must be averaged
In the voxel displayed below, the mean percent signal change =  1.557%

"Before averaging the values in..."
Before averaging the values in each voxel, one option is to remove any ÒuninterestingÓ time lags and average only those time lag points that matter
In this example, ÒuninterestingÓ refers to time lags whose percent signal change value is close to zero
If we included these uninteresting time lags in our mean calculation, they would pull the mean closer to zero, making it more difficult to pick up any statistically significant results in the ANOVA
In this example, the IRF is fairly small from time lags 0-3, but begins to spike up at time lag 4.  It peaks at time lag 7, declines at 9, drops sharply by 10, and decreases steadily from there.
Given this information, one option                                                             is to compute the mean percent                                                                                   signal change for time lags 4-9                                                              only

"If the mean is computed..."
If the mean is computed only for time lags 4-9, then mean percent signal change in the voxel below increases dramatically, from 1.557% (all lags) to 3.086%:

"STEP 5:"
 STEP 5: Compute a voxel-by-voxel mean percent signal change --      including only time lags 4-9 -- with AFNI 3dTstat
The following 3dTstat command will focus only on time lags 4-9, and compute the mean for those lags
This will be done on a voxel-by-voxel basis, and for each IRF dataset, of which we have four: TMirf, HMirf, TPirf, HPirf
foreach cond (TM HM TP HP)
3dTstat -prefix ED_{$cond}_irf_mean  \
{$cond}irf+origÕ[4..9]Õ
end

"The output from 3dTstat will..."
The output from 3dTstat will be four irf_mean datasets, one for each stimulus type.  Below are subject EDÕs averaged IRF datasets:
ED_TM_irf_mean+orig ED_HM_irf_mean+orig
ED_TP_irf_mean+orig ED_HP_irf_mean+orig
Each voxel will now contain a single number (i.e., the mean percent signal change).  For example:

"STEP 6:"
 STEP 6: Resample the mean IRF datasets for each subject to the same      grid as their Talairached anatomical datasets with adwarp
For statistical comparisons made across subjects, all datasets -- including functional overlays -- should be standardized (e.g., Talairach format) to control for variability in brain shape and size
foreach cond (TM HM TP HP)
adwarp -apar EDspgr+tlrc     \
-dpar ED_{$cond}_irf_mean+orig  \
end
The output of adwarp will be four Talairach transformed IRF datasets.
ED_TM_irf_mean+tlrc ED_HM_irf_mean+tlrc
ED_TP_irf_mean+tlrc ED_HP_irf_mean+tlrc
We are now done with Part 1-- Process Individual SubjectsÕ Data -- for Subject ED
Go back and follow the same steps for Subjects EE and EF
We can now move on to Part 2 -- RUN GROUP ANALYSIS (ANOVA)

"PART 2 Þ"
PART 2 Þ  Run Group Analysis (ANOVA2):
In our sample experiment, we have two factors (or Independent Variables) for our analysis of variance: ÒStimulus ConditionÓ and ÒSubjectsÓ
IV 1: STIMULUS CONDITION Þ 4 levels
Tool Movies (TM)
Human Movies (HM)
Tool Points (TP)
Human Points (HP)
IV 2: SUBJECTS Þ 3 levels
Subject ED
Subject EE
Subject EF
The mean IRF datasets from each subject will be needed for the ANOVA:
ED_TM_irf_mean+tlrc  EE_TM_irf_mean+tlrc  EF_TM_irf_mean+tlrc
ED_HM_irf_mean+tlrc  EE_HM_irf_mean+tlrc  EF_HM_irf_mean+tlrc
ED_TP_irf_mean+tlrc  EE_TP_irf_mean+tlrc  EF_TP_irf_mean+tlrc
ED_HP_irf_mean+tlrc  EE_HP_irf_mean+tlrc  EF_HP_irf_mean+tlrc

"3dANOVA2 -type 3 -..."
    3dANOVA2 -type 3 -alevels 4 -blevels 3 \
-dset 1 1 ED_TM_irf_mean+tlrc \
-dset 2 1 ED_HM_irf_mean+tlrc \
-dset 3 1 ED_TP_irf_mean+tlrc \
-dset 4 1 ED_HP_irf_mean+tlrc \
-dset 1 2 EE_TM_irf_mean+tlrc \
-dset 2 2 EE_HM_irf_mean+tlrc \
-dset 3 2 EE_TP_irf_mean+tlrc \
-dset 4 2 EE_HP_irf_mean+tlrc \
-dset 1 3 EF_TM_irf_mean+tlrc \
-dset 2 3 EF_HM_irf_mean+tlrc \
-dset 3 3 EF_TP_irf_mean+tlrc \
-dset 4 3 EF_HP_irf_mean+tlrc \
-amean 1 TM   \
-amean 2 HM   \
-amean 3 TP   \
-amean 4 HP   \

"-acontr"
-acontr  1  1  1  1 AllAct \
-acontr -1  1 -1  1 HvsT \
-acontr  1  1 -1 -1 MvsP \
-acontr  0  1  0 -1 HMvsHP \
-acontr  1  0 -1  0 TMvsTP \
-acontr  0  0  1 -1 TPvsHP \
-acontr -1  1  0  0 TMvsHM \
-fa StimEffect \
-bucket AvgANOVAv1

"Explanation of 3dANOVA2 options:"
Explanation of 3dANOVA2 options:
-type 3: Tells 3dANOVA2 what type of ANOVA model is to be used.  Type 3 refers to a mixed effects model (a = fixed, b=random)
Type 1 = fixed effects model (a and b are fixed)
Type 2 = random effects model (a and b are random)
-alevels 4: Our first IV ÒStimulus ConditionÓ has 4 levels -- TM,HM,TP,HP
-blevels 3: Our second IV ÒSubjectsÓ has 3 levels -- ED, EE, EF
-dset a b filename: Sets up all the factor level combinations
For example:
             Subject (b)
ED EE EF
TM 1,1 1,2 1,3
Stim Cond (a) HM 2,1 2,2 2,3
TP 3,1 3,2 3,3
HP 4,1 4,2 4,3

"-amean:"
-amean: Estimates the mean for every level of factor ÔaÕ (collapsed across factor ÔbÕ) and provides a t-statistic for the null hypothesis, Ho: mean = 0
In this example, the mean percent signal change for each stimulus condition type is averaged across subjects on a voxel-by-voxel basis
For example:
Mean Percent Signal Change at Voxel ÒxÓ
            Subject (b)
ED EE EF
TM 4.1% 3.8% 4.5%  Þ M = 4.13%
Stim Cond (a) HM 3.2% 2.5% 2.8%  Þ M = 2.83%
TP 4.6% 4.1% 4.9%  Þ M = 4.53%
HP 1.7% 2.0% 1.1%  Þ M = 1.60%
Allows you to determine if the percent signal change in a voxel is significantly different from zero.  If so, appears as colored activation  blobs

"In class -- LetÕs run..."
In class -- LetÕs run the ANOVA together:
cd AFNI_data2/group_data
the group_data/ directory contains a script called @anova2 that will run 3dANOVA2
This script can be viewed with a text editor, e.g., emacs @anova2
./@anova2
execute the ANOVA script from the command line
ls
result from ANOVA script is a bucket dataset AvgANOVAv1+tlrc, stored in the group_data/ directory
afni &
launch AFNI to view the results

"Brain areas corresponding to Ò..."
Brain areas corresponding to ÒHuman MoviesÓ
 -amean 2 HM

"-acontr:"
-acontr: Estimates contrasts among levels of factor ÔaÕ
E.g., t-tests are conducted among levels of the factor ÒStimulus ConditionÓ to determine if their percent signal changes differ significantly from each other throughout the brain
Here are some contrasts we performed:
TM    HM TP       HP         voxel-by-voxel t-tests:
0      1 0         -1      Þ Human Movies vs. Human Points
1      0 -1          0      Þ Tool Movies vs. Tool Points
0      0 1         -1      Þ    Tool Points vs. Human Points
1     -1 0          0       Þ    Tool Movies vs. Human Movies
-1      1         -1         1      Þ Tools vs. Humans
1      1 -1         -1      Þ Movies vs. Points
For simple contrasts, you could also use the -adiff option                    (e.g., -adiff 2 4)
 Tools vs. Humans contrast collapses across Animation Type               (i.e., Movies and Points)
 Movies vs. Points contrast collapses across Object Type                      (i.e., Tools and Humans)

"Brain areas corresponding to Ò..."
Brain areas corresponding to ÒMoviesÓ (reds) vs. ÒPoint-light displaysÓ (blues)
 -acontr 1 1 -1 -1

"-fa:"
-fa: Produces a main effect for factor ÔaÕ
In this example, -fa determines which voxels show a percent signal change that is significantly different from zero when any level of factor ÒStimulus ConditionÓ is presented
 -fa StimEffect:

"Many thanks to Mike Beauchamp..."
Many thanks to Mike Beauchamp for donating the data used in this lecture and in the how-to#5
For a full review of the experiment described in this lecture, see
Beauchamp, M.S., Lee, K.E., Haxby, J.V., & Martin, A. (2003). FMRI responses to video and point-light displays of moving humans and manipulable objects.  Journal of Cognitive Neuroscience, 15:7, 991-1001.
For more information on AFNI ANOVA programs, download Doug WardÕs paper on Analysis of Variance for FMRI data, located on the AFNI website at:
http//afni.nimh.gov/pub/dist/doc   Þ  3dANOVA.pdf or 3dANOVA.ps