Chen et al. (2018). A tail of two sides: Artificially doubled false positive rates in neuroimaging due to the sidedness choice with t-tests¶
Chen GC, Cox RW, Glen DR, Rajendra JK, Reynolds RC, Taylor PA (2018). A tail of two sides: Artificially doubled false positive rates in neuroimaging due to the sidedness choice with t-tests
See also
Tag |
Label |
---|---|
FMRI paradigm: |
task-block |
FMRI dset: |
EPI |
Anatomical dset: |
MPRAGE |
Subject population: |
human |
Subject characteristic: |
control |
Subject age: |
adult |
Template space: |
MNI |
Template align method: |
nonlinear |
Tissue segmentation: |
|
Tissue regression: |
|
Comments: |
An example of group level analyses with two-tailed testing (using 3dMEMA, 3dClustSim and 3dClusterize, among others) |
These scripts describe different approaches for processing FMRI data with AFNI. Please read the comments at the tops of the scripts carefully, as well as the bioRxiv papers associated with each, in order to understand the steps.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 | #!/bin/tcsh
# ----------------------------------------------------------------------
# Script: s.nimh_group_level_02_mema_bisided.tcsh
# Run on: openfmri/ds001_R2.0.4
# Date : May, 2018
#
# Take a group of subjects processed for task-based FMRI, run 3dMEMA
# on their final statistical results. Clustering is also performed
# here, using 3dClustSim (with the ACF blur estimation). Some
# individual volumes of t-statistics and effect estimate maps are also
# output. This provides an example of using the newest clusterizing
# program in AFNI, 3dClusterize (an update version of 3dclust).
#
# Used as an example of proper two-sided testing in:
#
# A tail of two sides: Artificially doubled false positive rates in
# neuroimaging due to the sidedness choice with t-tests
#
# by Chen G, Glen DR, Rajendra JK, Reynolds RC, Cox RW, Taylor PA.
# Scientific and Statistical Computing Core, NIMH/NIH/DHHS, USA.
#
# ... as applied to an earlier processed dataset from *this* study:
#
# Some comments and corrections on FMRI processing with AFNI in
# "Exploring the Impact of Analysis Software on Task fMRI Results"
#
# by Taylor PA, Chen G, Glen DR, Rajendra JK, Reynolds RC, Cox RW.
# Scientific and Statistical Computing Core, NIMH/NIH/DHHS, USA.
#
# NOTE: The processing described in "Some comments..." includes sets
# of processing commands correcting/improving upon some sets of
# commands run in an earlier study by a different group (see therein
# for more details). However, even the "NIMH" set of commands still
# includes some non-ideal features, as described there, for the
# purposes of comparison with the other group's processing stream.
# Please read the associated text carefully before using/adapting
# those scripts.
#
# ----------------------------------------------------------------------
#
# To run for a single subject, for example:
#
# tcsh -ef s.nimh_group_level_02_mema_bisided.tcsh
#
# On a cluster you might want to use scratch disk space for faster
# writing, and a lot of memory and CPUs, depending on the
# data/processing choices. Syntax might be:
#
# sbatch \
# --partition=SOME_NAMES \
# --cpus-per-task=32 \
# --mem=64g \
# --gres=lscratch:80 \
# s.nimh_group_level_02_mema_bisided.tcsh
#
# ----------------------------------------------------------------------
# ================ Set up paths and many params ======================
set here = $PWD
set p_0 = /data/NIMH_SSCC
set path_main = ${p_0}/openfmri/ds001_R2.0.4
set path_proc = ${path_main}/data_proc_nimh
set odir = GROUP_LEVEL_nimh
set path_mema = ${path_proc}/${odir}
# Running 3dMEMA
set script_mema = do_group_level_nimh.tcsh # generated command file name
set omema = mema_results.nii.gz # output effect+stats filename
set groupid = controls # volume label for stat result
# Cluster parameters
set csim_neigh = 1 # neighborhood; could be NN=1,2,3
set csim_NN = "NN${csim_neigh}" # other form of neigh
set csim_sided = "bisided" # test type; could be 1sided, 2sided or bisided
set csim_pthr = 0.001 # voxelwise thr (was higher, 0.01, in orig study)
set csim_alpha = 0.05 # nominal FWE
set csim_pref = "clust_${csim_sided}" # prefix for outputting stuff
# ================== Optional: Biowulf cluster =========================
# This section for possibly running on Biowulf cluster.
#### The following 2 lines could be uncommented to load these modules;
#### that would likely not be necessary outside of NIH's Biowulf cluster.
# source /etc/profile.d/modules.csh
# module load afni R
# set thread count if we are running SLURM
if ( $?SLURM_CPUS_PER_TASK ) then
setenv OMP_NUM_THREADS $SLURM_CPUS_PER_TASK
endif
# Set temporary output directory; then requires using something like
# this on the swarm command line: --sbatch '--gres=lscratch:50'.
# These variables used again *after* afni_proc.py command, if running
# on Biowulf.
if( $?SLURM_JOBID )then
set tempdir = /lscratch/$SLURM_JOBID/${odir}
set usetemp = 1
else
set tempdir = $path_mema
set usetemp = 0
endif
mkdir -p $tempdir
# ====================== Voxelwise modeling ==========================
# Create and execute 3dMEMA command. Note the inclusion of the
# "-options -missing_data 0" flags, to avoid some numerical badness of
# possible FOV edge effects, etc.
gen_group_command.py \
-command 3dMEMA \
-write_script ${tempdir}/${script_mema} \
-prefix ${tempdir}/${omema} \
-dsets ${path_proc}/sub*/stats.sub*_REML+tlrc.HEAD \
-set_labels "$groupid" \
-subs_betas 'pumps_demean_vs_ctrl_demean#0_Coef' \
-subs_tstats 'pumps_demean_vs_ctrl_demean#0_Tstat' \
-options -missing_data 0
tcsh -ef ${tempdir}/${script_mema}
echo "++ MEMAed!"
cd ${tempdir}
echo "++ Now in the group stat dir: $PWD"
# ====================== Make group mask ==========================
# Make a group level mask for reporting results. This also determines
# the volume over which multiple comparisons corrections for voxelwise
# stats are done.
# The "-frac 1.0" is to make an intersection mask.
3dmask_tool \
-prefix mask.nii.gz \
-input `ls ${path_proc}/sub-*/mask_epi_anat.*.HEAD` \
-frac 1.0
# ==================== Calc average smoothness =======================
# Get the line from each blur estimate file that has our desired
# parameters; we are running REML here and using the ACF smoothness
# estimation, so the gpattern variable reflects both. For N subjects,
# the output *1D file contains N rows of the 3 ACF parameters.
set gpattern = 'err_reml ACF'
grep -h "$gpattern" ${path_proc}/sub*/blur_est* \
| cut -d\ -f1-3 \
> group_ACF_ests.1D
# Get the group average ACF parameters.
set blur_est = ( `3dTstat -mean -prefix - group_ACF_ests.1D\'` )
echo "++ The group average ACF params are: $blur_est"
# ==================== Cluster simulations =======================
# Simulations for FWE corrected cluster-size inference: make a cluster
# table based on the simulations, and *that* gets parsed and applied
# to the actual data.
3dClustSim \
-both \
-mask mask.nii.gz \
-acf $blur_est \
-prefix ClustSim
# Get the volume threshold value from the ClustSim results, based on
# user's choice of parameters as set above. The "-verb 0" means that
# just the threshold number of voxels is returned, so we can save that
# as a variable.
set clust_thrvol = `1d_tool.py -verb 0 \
-infile ClustSim.${csim_NN}_${csim_sided}.1D \
-csim_pthr $csim_pthr \
-csim_alpha "$csim_alpha"`
# Get the statistic value equivalent to the desired voxelwise p-value,
# for thresholding purposes. Using the same p-value and sidedness
# that were selected in the ClustSim results. This program also gets
# the number of degrees of freedom (DOF) from the header of the volume
# containing the statistic. The "-quiet" means that only the
# associated statistic value is returned, so we can save it to a
# variable.
#
# Note: actually, you can do this calculation within 3dClusterize
# directly now, specifying the threshold as a p-value.
set voxstat_thr = `p2dsetstat -quiet \
-pval $csim_pthr \
"-${csim_sided}" \
-inset "${omema}[${groupid}:t]"`
echo "++ The final cluster volume threshold is: $clust_thrvol"
echo "++ The voxelwise stat value threshold is: $voxstat_thr"
# ================== Make cluster maps =====================
# Run the 'clusterize' program to make maps of the clusters (sorted by
# size) and to output a cluster-masked map of the effect estimate (EE)
# data, in this case %BOLD fluctuation values-- i.e., the stuff we
# should report.
#
# The main inputs are: dataset contain the statistic volume to be
# thresholded, a mask within which to find clusters, and the ClustSim
# parameters+outputs. A common addition to this list of inputs is to
# specify the location of the EE volume in the input dset, for
# reporting the EE info within any found clusters.
#
# SPECS:
# 3dClusterize \
# -inset :input dset to get info from; like both stat and eff data
# -ithr :str label of stat vol in inset; or, use vol index: 1
# -idat :(opt) str label EE vol in inset; or, use vol index: 0
# -mask :same mask input to 3dClustSim (here, whole brain)
# -bisided :specify *type* of test; followed by threshold limit info
# -NN :what neighborhood definition was used?
# -clust_nvox :cluster size threshold from 3dClustSim
# -pref_map :name for cluster map output vol
# -pref_dat :name for cluster-masked EE output vol
# > ${csim_pref}_report.txt :dump text table of cluster report to file
#
# Note: one-sided testing would require slightly different syntax:
# '-1sided ...' would not be followed by 2 numbers, as used here for
# '-bisided ...', but instead by specifying a tail (i.e.,
# "RIGHT_TAIL") and then a single number (its threshold value). Also,
# if you use the "p=..." form of specifying a threshold value in
# 3dClusterize, then the syntax would be a bit different, just using
# that single argument after even '-bisided ...', for example. Please
# see the help file for a more full description of the option(s).
3dClusterize \
-inset ${omema} \
-ithr "${groupid}:t" \
-idat "${groupid}:b" \
-mask mask.nii.gz \
-${csim_sided} -$voxstat_thr $voxstat_thr \
-NN ${csim_neigh} \
-clust_nvox ${clust_thrvol} \
-pref_map ${csim_pref}_map.nii.gz \
-pref_dat ${csim_pref}_EE.nii.gz \
> ${csim_pref}_report.txt
echo "++ Clusterizing complete."
cd $here
# ===================================================================
# Again, Biowulf-running considerations: if processing went fine and
# we were using a temporary directory, copy data back.
if( $usetemp && -d $tempdir )then
echo "Copying data from $tempdir to $path_mema"
mkdir -p $path_mema
\cp -pr $tempdir/* $path_mema/
endif
# ===================================================================
echo "\n++ DONE with group level stats + clustering!\n"
time ; exit 0
|