================== Welcome to 3dICC ==================
AFNI Program for IntraClass Correlation (ICC) Analysis
#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Version 1.0, Oct 4, 2023
Author: Gang Chen (gangchen@mail.nih.gov)
Website - ATM
SSCC/NIMH, National Institutes of Health, Bethesda MD 20892, USA
#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Usage:
------
Intraclass correlation (ICC) measures the extent of consistency, agreement or
reliability of an effect (e.g., BOLD response) across two or more measures.
3dICC is a program that computes whole-brain voxel-wise ICC when each subject
has two or more effect estimates (e.g., sessions, scanners, etc.). All three
typical types of ICC are available through proper model specification:
ICC(1,1), ICC(2,1) and ICC(3,1). The latter two types are popular in
neuroimaging because ICC(1,1) is usually applicable only to scenarios such as
twin studies.
The program can be applied to even wider situations (e.g., incorporation of
confounding effects or more than two random-effects variables). The modeling
approaches are laid out in the following paper:
Chen, G., Taylor, P.A., Haller, S.P., Kircanski, K., Stoddard, J., Pine, D.S.,
Leibenluft, E., Brotman, M.A., Cox, R.W., 2018. Intraclass correlation:
Improved modeling approaches and applications for neuroimaging. Human Brain
Mapping 39, 1187–1206. https://doi.org/10.1002/hbm.23909
Currently the output contains the ICC value and the corresponding
F-statistic at each voxel. In the future, inferences for the intercept and
covariates may be added.
Input files for 3dICC can be in AFNI, NIfTI, or surface (niml.dset) format.
Two input scenarios are considered: 1) effect estimates only, and 2) effect
estimates plus their t-statistic values, which are used for weighting based
on the precision contained in the t-statistic.
In addition to R itself, the following R packages need to be installed
before running 3dICC: "lme4", "blme" and "metafor". The "snow" package is
also needed if one wants to take advantage of parallel computing. To install
these packages, run the following command at the terminal:
rPkgsInstall -pkgs "blme,lme4,metafor,snow"
Alternatively you may install them in R:
install.packages("blme")
install.packages("lme4")
install.packages("metafor")
install.packages("snow")
Once the 3dICC command script is constructed, it can be run by copying and
pasting it into the terminal. Alternatively (and probably better), save the
script as a text file, for example ICC.txt, and execute it with the
following (assuming the tc shell):
nohup tcsh -x ICC.txt &
or,
nohup tcsh -x ICC.txt > diary.txt &
nohup tcsh -x ICC.txt |& tee diary.txt &
The advantage of the latter commands is that the progress is saved into
the text file diary.txt and, if anything goes awry, can be examined later.
Example 1 --- Compute ICC(2,1) values between two sessions. With the option
-bounds, values beyond [-2, 2] will be treated as outliers and considered
as missing. If you want to set a range, choose the bounds that make sense
with your input data.
-------------------------------------------------------------------------
3dICC -prefix ICC2 -jobs 12 \
-mask myMask+tlrc \
-model '1+(1|session)+(1|Subj)' \
-bounds -2 2 \
-dataTable \
Subj session InputFile \
s1 one s1_1+tlrc'[pos#0_Coef]' \
s1 two s1_2+tlrc'[pos#0_Coef]' \
...
s21 two s21_2+tlrc'[pos#0_Coef]' \
...
Example 2 --- Compute ICC(3,1) values between two sessions. With the option
-bounds, values beyond [-2, 2] will be treated as outliers and considered
as missing. If you want to set a range, choose the bounds that make sense
with your input data.
-------------------------------------------------------------------------
3dICC -prefix ICC3 -jobs 12 \
-mask myMask+tlrc \
-model '1+session+(1|Subj)' \
-bounds -2 2 \
-dataTable \
Subj session InputFile \
s1 one s1_1+tlrc'[pos#0_Coef]' \
s1 two s1_2+tlrc'[pos#0_Coef]' \
...
s21 two s21_2+tlrc'[pos#0_Coef]' \
...
Example 3 --- Compute ICC(3,1) values between two sessions with both effect
estimates and their t-statistics as input. The subject column is explicitly
declared because it is named differently from the default ('Subj').
-------------------------------------------------------------------------
3dICC -prefix ICC3 -jobs 12 \
-mask myMask+tlrc \
-model '1+age+session+(1|Subj)' \
-bounds -2 2 \
-Subj 'subject' \
-tStat 'tFile' \
-dataTable \
subject age session tFile InputFile \
s1 21 one s1_1+tlrc'[pos#0_tstat]' s1_1+tlrc'[pos#0_Coef]' \
s1 21 two s1_2+tlrc'[pos#0_tstat]' s1_2+tlrc'[pos#0_Coef]' \
...
s21 28 two s21_2+tlrc'[pos#0_tstat]' s21_2+tlrc'[pos#0_Coef]' \
...
Example 4 --- Compute ICC(2,1) values between two sessions while controlling
for age effect. With the option -bounds, values beyond [-2, 2] will be
treated as outliers and considered as missing. If you want to set a range,
choose the bounds that make sense with your input data.
-------------------------------------------------------------------------
3dICC -prefix ICC2a -jobs 12 \
-mask myMask+tlrc \
-model '1+age+(1|session)+(1|Subj)' \
-bounds -2 2 \
-Subj 'subject' \
-IF 'inputfile' \
-dataTable \
subject age session inputfile \
s1 21 one s1_1+tlrc'[pos#0_Coef]' \
s1 21 two s1_2+tlrc'[pos#0_Coef]' \
...
s21 28 two s21_2+tlrc'[pos#0_Coef]' \
...
Options in alphabetical order:
------------------------------
-bounds lb ub: This option is for outlier removal. Two numbers are expected from
the user: the lower bound (lb) and the upper bound (ub). The input data will
be confined within [lb, ub]: any values in the input data that are beyond
the bounds will be removed and treated as missing. Make sure the first number
is less than the second. You do not have to use this option to censor your data!
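To make the [lb, ub] rule concrete, the following sketch applies it to a few made-up numbers with plain sh and awk. It only illustrates the rule as described above and is not the code 3dICC runs; the helper name apply_bounds and the NA marker are hypothetical.

```shell
#!/bin/sh
# Illustration only: mimic the -bounds rule on a list of numbers.
# apply_bounds is a hypothetical helper, not part of AFNI.
apply_bounds() {
    lb=$1; ub=$2; shift 2
    # Print each value unchanged, or NA when it lies outside [lb, ub].
    printf '%s\n' "$@" | awk -v lb="$lb" -v ub="$ub" \
        '{ if ($1 + 0 < lb + 0 || $1 + 0 > ub + 0) print "NA"; else print $1 }'
}

# Made-up effect estimates checked against the bounds used in the examples.
result=$(apply_bounds -2 2 -3.1 0.5 2 2.7 | tr '\n' ' ')
echo "$result"
```

Values equal to a bound are kept here because the interval is written as closed.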
-cio: Use AFNI's C io functions, which is the default. Alternatively -Rio
can be used.
-dataTable TABLE: List the data structure with a header as the first line.
NOTE:
1) This option has to occur last; that is, no other options are
allowed thereafter. Each line should end with a backslash except for
the last line.
2) By default the first column is reserved for the label 'Subj' and the
last for 'InputFile' (see -Subj and -IF for using other column names).
Each row should contain only one
effect estimate in the table of long format (cf. wide format) as
defined in R. The level labels of a factor should contain at least
one character. Input files can be in AFNI, NIfTI or surface format.
AFNI files can be specified with sub-brick selector (square brackets
[] within quotes) specified with a number or label.
3) It is fine to have variables (or columns) in the table that are
not modeled in the analysis.
4) The content of the table can be saved as a separate file, e.g.,
called table.txt. In the 3dICC script, specify the data with
'-dataTable @table.txt'. Do NOT put any quotes around the square
brackets for each sub-brick; otherwise, the program cannot properly
read the files. This option is useful: (a) when there are many input
files so that the program complains with an 'Arg list too long' error;
(b) when you want to try different models with the same dataset.
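When there are many input files, the table file described in note 4 can be generated rather than typed. The sketch below writes table.txt for 21 subjects and two sessions, assuming the file naming pattern of Example 1; the subject count and file names are illustrative, so adjust them to your data. It writes no trailing backslashes, on the assumption that they serve only as shell line continuations on the command line; verify this against your AFNI installation.

```shell
#!/bin/sh
# Sketch: write a long-format table for use with -dataTable @table.txt.
# Assumes the naming of Example 1 (s<N>_<session index>+tlrc) and
# 21 subjects with two sessions each.
table=table.txt

# Header line first.
printf '%s %s %s\n' Subj session InputFile > "$table"

n=1
while [ "$n" -le 21 ]; do
    idx=1
    for ses in one two; do
        # No quotes around the sub-brick selector inside a table file.
        printf 's%d %s s%d_%d+tlrc[pos#0_Coef]\n' "$n" "$ses" "$n" "$idx" >> "$table"
        idx=$((idx + 1))
    done
    n=$((n + 1))
done

# Report the number of lines written (header plus one row per input file).
wc -l < "$table"
```

The resulting file would then be passed as '-dataTable @table.txt' as the last option of the 3dICC command.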
-dbgArgs: This option will enable R to save the parameters in a
file called .3dICC.dbg.AFNI.args in the current directory
so that debugging can be performed.
-help: this help message
-IF var_name: var_name specifies the name of the last column, which is designated for
the input files of effect estimates. The default (when this option is not invoked)
is 'InputFile', in which case the column header has to be exactly 'InputFile'.
-jobs NJOBS: On a multi-processor machine, parallel computing will speed
up the program significantly.
Choose 1 for a single-processor computer.
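One possible way to choose NJOBS is to query the number of online processors. The sketch below uses getconf, which is available on Linux and macOS; leaving one processor free is a judgment call made here for illustration, not a 3dICC recommendation.

```shell
#!/bin/sh
# Sketch: derive a value for -jobs from the number of online processors.
# Falls back to 1 when getconf cannot report a count.
ncpu=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Leave one processor free for the rest of the system (an assumption).
if [ "$ncpu" -gt 1 ]; then
    njobs=$((ncpu - 1))
else
    njobs=1
fi
echo "-jobs $njobs"
```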
-mask MASK: Process voxels inside this mask only.
Default is no masking.
-model FORMULA: Specify the model structure for all the variables. The
expression FORMULA with more than one variable has to be enclosed
in (single or double) quotes. Variable names in the formula
should be consistent with the ones used in the header of -dataTable.
Suppose that each subject ('subj') has two sessions ('ses'), a model
ICC(2,1) without any covariate is "1+(1|ses)+(1|subj)" while one
for ICC(3,1) is "1+ses+(1|subj)". Each random-effects factor is
specified within parentheses per formula convention in R. Any
confounding effects (quantitative or categorical variables) can be
added as fixed effects without parentheses.
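Under the usual variance-components reading of these two models, the difference comes down to whether the session variance enters the denominator. The symbols below are introduced here for illustration only; the paper cited above gives the precise definitions.

```latex
% \sigma_{subj}^2: between-subject variance, from (1|subj)
% \sigma_{ses}^2:  between-session variance, from (1|ses), present only when session is random
% \sigma_{\epsilon}^2: residual variance
\mathrm{ICC}(2,1) = \frac{\sigma_{subj}^2}{\sigma_{subj}^2 + \sigma_{ses}^2 + \sigma_{\epsilon}^2},
\qquad
\mathrm{ICC}(3,1) = \frac{\sigma_{subj}^2}{\sigma_{subj}^2 + \sigma_{\epsilon}^2}
```

With session as a fixed effect, as in "1+ses+(1|subj)", systematic session differences are removed from the denominator, which is why ICC(3,1) measures consistency rather than absolute agreement.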
-prefix PREFIX: Output file name. For AFNI format, provide prefix only,
with no view+suffix needed. Filename for NIfTI format should have
.nii attached, while file name for surface data is expected
to end with .niml.dset. The sub-brick labeled '(Intercept)',
if present, should be interpreted as the effect with each factor
at its reference level (alphabetically the lowest level) and with
each quantitative covariate at its center value.
-qVarCenters VALUES: Specify centering values for quantitative variables
identified under -qVars. Multiple centers are separated by
commas (,) without any other characters such as spaces and should
be enclosed in (single or double) quotes. The order of the
values should match that of the quantitative variables in -qVars.
Default (absence of option -qVarCenters) means centering on the
average of the variable across ALL subjects regardless of their
grouping. If within-group centering is desirable, center the
variable YOURSELF first before the values are fed into -dataTable.
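If you prefer to pass an explicit center instead of relying on the default, the across-subject average can be computed from the table beforehand. The sketch below does this with awk on a small made-up table, counting each subject once; the file name demo_table.txt and its values are hypothetical, and how 3dICC computes its default internally is not described in this help.

```shell
#!/bin/sh
# Sketch: compute the across-subject mean of a covariate to use as a center.
# The table content below is made up for illustration.
cat > demo_table.txt <<'EOF'
Subj age session InputFile
s1 21 one s1_1+tlrc[pos#0_Coef]
s1 21 two s1_2+tlrc[pos#0_Coef]
s2 25 one s2_1+tlrc[pos#0_Coef]
s2 25 two s2_2+tlrc[pos#0_Coef]
s3 29 one s3_1+tlrc[pos#0_Coef]
s3 29 two s3_2+tlrc[pos#0_Coef]
EOF

# Skip the header, count each subject once (first row seen), average column 2.
center=$(awk 'NR > 1 && !seen[$1]++ { sum += $2; n++ }
              END { if (n > 0) printf "%.2f", sum / n }' demo_table.txt)

# Print the option strings that would go into the 3dICC command.
echo "-qVars 'age' -qVarCenters '$center'"
```

Counting each subject once matters in long format, where a subject's covariate value is repeated on every row for that subject.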
-qVars variable_list: Identify quantitative variables (or covariates) with
this option. A list with more than one variable has to be
separated with commas (,) without any other characters such as
spaces and should be enclosed in (single or double) quotes.
For example, -qVars "Age,IQ"
WARNINGS:
1) Centering a quantitative variable through -qVarCenters is
critical when other fixed effects are of interest.
2) Between-subjects covariates are generally acceptable.
However EXTREME caution should be taken when the groups
differ significantly in the average value of the covariate.
3) Within-subject covariates are better modeled with 3dICC.
-Rio: Use R's io functions. The alternative is -cio.
-show_allowed_options: list of allowed options
-Subj var_name: var_name is used to specify the column name that is designated
as the measuring entity variable (usually subject). The default (when this
option is not invoked) is 'Subj', in which case the column header has to be
exactly 'Subj'.
-tStat col_name: col_name is used to specify the column name that is designated
as the t-statistic. The default (when this option is not invoked) is 'NA',
in which case no t-stat is provided as part of the input; otherwise declare
the t-stat column name with this option.