:orphan:

.. _ahelp_3dMSS:

*****
3dMSS
*****

.. contents:: :local:

|

.. code-block:: none

              ================== Welcome to 3dMSS ==================
           Program for Voxelwise Multilevel Smoothing Spline (MSS) Analysis
    #+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    Version 0.0.16, Aug 13, 2022
    Author: Gang Chen (gangchen@mail.nih.gov)
    Website - https://afni.nimh.nih.gov/gangchen_homepage
    SSCC/NIMH, National Institutes of Health, Bethesda MD 20892, USA
    #+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    Introduction
    ------

    Multilevel Smoothing-Spline (MSS) Modeling

    The linearity assumption surrounding a quantitative variable in common
    practice may be a reasonable approximation, especially when the variable is
    confined within a narrow range, but it can be inappropriate when the
    variable's effect is non-monotonic or tortuous. As a more flexible and
    adaptive approach, multilevel smoothing splines (MSS) offer a more powerful
    analytical tool for population-level neuroimaging data analysis that
    involves one or more quantitative predictors. More theoretical discussion
    can be found in

    Chen et al. (2020). Beyond linearity: Capturing nonlinear relationships
    in neuroimaging. https://doi.org/10.1101/2020.11.01.363838

    To be able to run 3dMSS, one needs to have the following R packages
    installed: "gamm4" and "snow". To install these R packages, run the
    following command at the terminal:

    rPkgsInstall -pkgs "gamm4,snow"

    Alternatively you may install them in R:

    install.packages("gamm4")
    install.packages("snow")

    It is best to go through all the examples below to get the hang of the MSS
    scripting interface. Once the 3dMSS script is constructed, it can be run by
    copying and pasting it into the terminal.
    Alternatively (and probably better), you can save the script as a text
    file, for example MSS.txt, and execute it with the following (assuming the
    tc shell),

    nohup tcsh -x MSS.txt &

    or,

    nohup tcsh -x MSS.txt > diary.txt &

    or,

    nohup tcsh -x MSS.txt |& tee diary.txt &

    The advantage of the latter two commands is that the progress is saved in
    the text file diary.txt and, if anything goes awry, can be examined later.

    Example 1 --- simplest case: one group of subjects with a between-subject
    quantitative variable that does not vary within subject. MSS analysis is
    set up to model the trajectory or trend along age, and can be specified
    through the option -mrr, which is solved via a model formulation of ridge
    regression. Again, the following exemplary script assumes that 'age' is
    a between-subjects variable (not varying within subject):

    3dMSS -prefix MSS -jobs 16     \
          -mrr 's(age)'            \
          -qVars 'age'             \
          -mask myMask.nii         \
          -bounds -2 2             \
          -prediction @pred.txt    \
          -dataTable  @data.txt

    The function 's(age)' indicates that 'age' is modeled via a smooth curve.
    No empty space is allowed in the model formulation. With the option
    -bounds, values beyond [-2, 2] will be treated as outliers and considered
    as missing. If you want to set a range, choose one that makes sense for
    your specific input data.

    The file pred.txt lists all the explanatory variables (excluding
    lower-level variables such as subject) for prediction. The file should be
    in a data.frame format as below:

    label  age
    t1     1
    t2     2
    t3     3
    ...
    t8     8
    t9     9
    t10    10
    ...

    The file data.txt stores the information for all the variables and input
    data in a data.frame format. For example:

    Subj  age  InputFile
    S1    1    ~/alex/MSS/S1.nii
    S2    2    ~/alex/MSS/S2.nii
    ...

    In the output the first sub-brick shows the statistical evidence in the
    form of a chi-square distribution with 2 degrees of freedom (the 2 DFs do
    not mean anything; they are used only for the convenience of information
    coding). This sub-brick is the statistical evidence for the trajectory of
    the group.
    If you want to estimate the trend at the population level, use the option
    -prediction with a table that codes the ages at which you would like to
    track the trend. In the output there is one predicted value for each age
    plus the associated uncertainty (standard error). For example, with 10 age
    values, there will be 10 predicted values plus 10 standard errors. The
    sub-bricks for prediction and standard errors are interleaved.

    Example 2 --- Largely the same as Example 1, but with 'age' as a
    within-subject quantitative variable (varying within each subject). The
    model is better specified by replacing the line of -mrr in Example 1 with
    the following two lines:

          -mrr 's(age)+s(Subj,bs="re")'  \
          -vt  Subj 's(Subj)'            \

    The second term 's(Subj,bs="re")' in the model specification means that
    each subject is allowed to have a varying intercept or random effect
    ('re'). To estimate the smooth trajectory through the option -prediction,
    the option -vt has to be included in this case to indicate the varying term
    (usually subjects). That is, if prediction is desired, one has to
    explicitly declare the variable (e.g., Subj) that is associated with the
    varying term (e.g., s(Subj)). No empty space is allowed in the model
    formulation and the varying term.

    The full script version is

    3dMSS -prefix MSS -jobs 16              \
          -mrr 's(age)+s(Subj,bs="re")'     \
          -vt  Subj 's(Subj)'               \
          -qVars 'age'                      \
          -mask myMask.nii                  \
          -bounds -2 2                      \
          -prediction @pred.txt             \
          -dataTable  @data.txt

    All the rest remains the same as Example 1.

    Alternatively, this model with a varying subject-level intercept can be
    specified with

          -lme 's(age)'               \
          -ranEff 'list(Subj=~1)'     \

    which is solved through the linear mixed-effect (lme) platform. With -lme,
    the option -vt is not needed when making predictions through the option
    -prediction. The two specifications, -mrr and -lme, would render similar
    results, but the runtime may differ depending on the amount of data and
    model complexity.

    Example 3 --- two groups and one quantitative variable (age).
    MSS analysis is set up to compare the trajectory or trend along age between
    the two groups, which are quantitatively coded as -1 and 1. For example, if
    the two groups are females and males, you can code females as -1 and males
    as 1. The following script applies to the situation when the quantitative
    variable does not vary within subject,

    3dMSS -prefix MSS -jobs 16            \
          -mrr 's(age)+s(age,by=grp)'     \
          -qVars 'age'                    \
          -mask myMask.nii                \
          -bounds -2 2                    \
          -prediction @pred.txt           \
          -dataTable  @data.txt

    On the other hand, go with the script below when the quantitative variable
    varies within subject,

    3dMSS -prefix MSS -jobs 16                            \
          -mrr 's(age)+s(age,by=grp)+s(Subj,bs="re")'     \
          -vt  Subj 's(Subj)'                             \
          -qVars 'age'                                    \
          -mask myMask.nii                                \
          -bounds -2 2                                    \
          -prediction @pred.txt                           \
          -dataTable  @data.txt

    or an LME version:

    3dMSS -prefix MSS -jobs 16            \
          -lme 's(age)+s(age,by=grp)'     \
          -ranEff 'list(Subj=~1)'         \
          -qVars 'age'                    \
          -mask myMask.nii                \
          -bounds -2 2                    \
          -prediction @pred.txt           \
          -dataTable  @data.txt

    Options in alphabetical order:
    ------------------------------

    -bounds lb ub: This option is for outlier removal. Two numbers are expected
             from the user: the lower bound (lb) and the upper bound (ub). The
             input data will be confined within [lb, ub]: any values in the
             input data that are beyond the bounds will be removed and treated
             as missing. Make sure the first number is less than the second.
             You do not have to use this option to censor your data!

    -cio: Use AFNI's C io functions, which is the default. Alternatively, -Rio
             can be used.

    -dataTable TABLE: List the data structure with a header as the first line.

             NOTE:

             1) This option has to occur last in the script; that is, no other
             options are allowed thereafter. Each line should end with a
             backslash except for the last line.

             2) The order of the columns should not matter except that the last
             column has to be the one for input files, 'InputFile'. Each row
             should contain only one input file in the table of long format
             (cf. wide format) as defined in R.
             Input files can be in AFNI, NIfTI or surface format. AFNI files
             can be specified with a sub-brick selector (square brackets []
             within quotes) given as a number or label.

             3) It is fine to have variables (or columns) in the table that are
             not modeled in the analysis.

             4) When the table is part of the script, a backslash is needed at
             the end of each line to indicate the continuation to the next
             line. Alternatively, one can save the content of the table as a
             separate file, e.g., calling it table.txt, and then in the script
             specify the data with '-dataTable @table.txt'. However, when the
             table is provided as a separate file, do NOT put any quotes around
             the square brackets for each sub-brick; otherwise the program
             would not properly read the files. This is unlike the situation
             in which the table is included as part of the script, where the
             quotes are required. A backslash is also not needed at the end of
             each line, but it would not cause any problem if present. This
             option of separating the table from the script is useful: (a) when
             there are many input files so that the program complains with an
             'Arg list too long' error; (b) when you want to try different
             models with the same dataset.

    -dbgArgs: This option will enable R to save the parameters in a file called
             .3dMSS.dbg.AFNI.args in the current directory so that debugging
             can be performed.

    -help: this help message

    -IF var_name: var_name is used to specify the column name that is
             designated for input files of effect estimate. The default (when
             this option is not invoked) is 'InputFile', in which case the
             column header has to be exactly 'InputFile'. The column of input
             files for effect estimates has to be the last column.

    -jobs NJOBS: On a multi-processor machine, parallel computing will speed up
             the program significantly. Choose 1 for a single-processor
             computer.

    -lme FORMULA: Specify the fixed effect components of the model. The
             expression FORMULA with more than one variable has to be enclosed
             in (single or double) quotes.
             Variable names in the formula should be consistent with the ones
             used in the header of -dataTable. See examples in the help for
             details.

    -mask MASK: Process voxels inside this mask only. Default is no masking.

    -mrr FORMULA: Specify the model formulation through multilevel smoothing
             splines. The expression FORMULA with more than one variable has to
             be enclosed in (single or double) quotes. Variable names in the
             formula should be consistent with the ones used in the header of
             -dataTable. The nonlinear trajectory is specified through the
             expression s(x,k=?), where s() indicates a smooth function, x is
             a quantitative variable along which one would like to trace the
             trajectory, and k is the number of smooth splines (knots). The
             default for k (when k is missing) is 10, which is good enough most
             of the time when there are more than 10 data points of x. When
             there are fewer than 10 data points of x, choose a value of k
             slightly less than the number of data points.

    -prediction TABLE: Provide a data table so that predicted values can be
             generated for graphical illustration. Usually the table should
             have a structure similar to that of the table given to -dataTable
             except that 1) the first column is reserved for effect labels,
             which will be used as sub-brick names in the output for those
             predicted values; 2) columns for the varying smoothing terms
             (e.g., subject) and the response variable (i.e., Y) should not be
             included. Try to specify equally-spaced values with a small
             interval for the quantitative variable of the modeled trajectory
             (e.g., age) so that smooth curves can be plotted after the
             analysis. See the Examples in the help for a couple of specific
             tables used for predictions.

    -prefix PREFIX: Output file name. For AFNI format, provide the prefix only,
             with no view+suffix needed. A filename for NIfTI format should
             have .nii attached (otherwise the output would be saved in AFNI
             format).

    -qVars variable_list: Identify quantitative variables (or covariates) with
             this option.
             A list with more than one variable has to be separated with
             commas (,) without any other characters such as spaces, and should
             be enclosed in (single or double) quotes. For example,
             -qVars "Age,IQ"

             WARNINGS:

             1) Centering a quantitative variable through -qVarsCenters is
             critical when other fixed effects are of interest.

             2) Between-subjects covariates are generally acceptable. However,
             EXTREME caution should be taken when the groups differ
             substantially in the average value of the covariate.

    -ranEff FORMULA: Specify the random effect components of the model. The
             expression FORMULA with more than one variable has to be enclosed
             in (single or double) quotes. Variable names in the formula should
             be consistent with the ones used in the header of -dataTable. In
             the MSS context the simplest model is "list(Subj=~1)", in which
             the varying or random effect from each subject is incorporated in
             the model. Each random-effects factor is specified within
             parentheses per formula convention in R.

    -Rio: Use R's io functions. The alternative is -cio.

    -show_allowed_options: list of allowed options

    -vt var formulation: This option is for specifying varying smoothing terms.
             Two components are required: the first one, 'var', indicates the
             variable (e.g., subject) around which the smoothing will vary,
             while the second component specifies the smoothing formulation
             (e.g., s(age,subject)). When there are no varying smoothing terms
             (e.g., no within-subject variables), do not use this option.
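
As a minimal sketch (not part of the program's own help output), a prediction table of the kind shown in Example 1 could be generated with a short POSIX shell loop instead of being typed by hand. The age range of 1 to 10, the step of 1, and the label prefix ``t`` are illustrative assumptions taken from the Example 1 table; adjust them to the range of the quantitative variable in your own data.

.. code-block:: shell

    #!/bin/sh
    # Sketch: write a prediction table like the one shown in Example 1.
    # Assumptions (illustrative only): ages 1 through 10 in steps of 1,
    # labels t1 ... t10, and the output file name pred.txt.
    out=pred.txt

    # Header row: the label column comes first, then the quantitative variable.
    printf 'label age\n' > "$out"

    # One row per age value at which a prediction is wanted.
    i=1
    while [ "$i" -le 10 ]; do
        printf 't%d %d\n' "$i" "$i" >> "$out"
        i=$((i + 1))
    done

The resulting file could then be passed to the program as ``-prediction @pred.txt``, as in the examples above.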