Unix help - description of the @DDmb_analyze script
from a Unix perspective
#!/bin/tcsh
#======================================================================
# start with basic variable initializations
#======================================================================
It is common practice to set general variables at the top of a shell
script, making them easy to find. In particular, variables that one
might expect to change from script use to script use should go at the
top, such as a subject's name or ID code, the number of runs, or
perhaps the directory to store the results in.
# ----------------------------------------------------------------------
# save a couple of directory names for future use
set topdir = `pwd`
set outlier_dir = $topdir/afni/outliers
The '$topdir' directory is where the script is run from. Referring to
directories with respect to '$topdir' makes it more clear where they
are located within the directory tree.
The '$outlier_dir' is where the outlier files will be placed, when they
are produced by executions of AFNI's 'to3d' program. Those text files
will then be available to read over, checking for data that might be
invalid.
Variables that are specific to a small part a script are usually
assigned at the beginning of that part. So it would certainly be
reasonable to assign '$outlier_dir' just before running 'to3d',
instead of assigning it here.
# Unix
Recall that when backward quotes are used, as in `pwd`, the command
is executed, with the output actually replacing the command. Thus,
multiple steps are taken for the execution of the line:
set topdir = `pwd`
1. The 'pwd' command is executed, resulting in, for example,
/home/afni_user/howto/ht02_DDmb
2. That output replaces the command in backward quotes.
So the actual shell command reads:
set topdir = /home/afni_user/howto/ht02_DDmb
3. The resulting command is executed. In this case, 'topdir'
is now assigned that directory name.
Now given the example of what '$topdir' might be, the assignment
of 'outlier_dir' ends up looking like:
set outlier_dir = /home/afni_user/howto/ht02_DDmb/afni/outliers
#======================================================================
# now perform various tests to make sure the world is okay
#======================================================================
After initial variables have been assigned, it is good to do some
basic checks, so at least the script won't crash for any obvious
reason. In this example, the following tests will be done:
- Verify that the script being run from the correct directory.
- Verify that the script has not been previously run, or at
least that it will not overwrite any existing files. This
script only puts new data in the 'afni' directory.
- Attempt to create a directory. This may fail, commonly when
the user does not have write permissions here.
- Verify that the AFNI programs are in the user's path. Simply
looking for one of them should be enough ('to3d').
# ----------------------------------------------------------------------
# verify that the script is being run from the proper directory
if ( ! -d DDmb_data ) then
echo ""
echo "error: this script must be run from the 'ht02_DDmb' directory"
echo " under the top level HOWTO directory"
echo "exiting..."
echo ""
exit
endif
The 'if' statement is a check to find whether there is a directory
called 'DDmb_data' here. If there is not, the script cannot be run.
So the user is informed, and the script is terminated.
# Unix
The '!' character is a logical NOT operator, and the '-d' operator is
a "directory exists" test. So if there is no such directory here, then
the script exits.
# ----------------------------------------------------------------------
# if the afni directory is already here, complain and exit
if ( -e afni ) then
echo "failure: the 'afni' directory already exists"
exit
endif
The existence of the 'afni' directory means the script has been
previously run. Since over-writing previous results is considered
unacceptable, the user must either move or remove that directory.
# Unix
Note that the '-e' operator is used here (as opposed to '-d' above).
This is a check for existence in any way, not just as a directory.
# ----------------------------------------------------------------------
# make a directory to store the outlier files from the to3d commands
mkdir -p $outlier_dir
Create the outlier directory. Recall from above that this was assigned
the value of '$topdir/afni/outliers'.
# Unix
The '-p' option is to make parent directories, as needed. It is used
here because the 'afni' directory does not yet exist. With this command,
the 'afni' directory is created, along with the sub-directory 'outliers'.
# ----------------------------------------------------------------------
# verify that the previous 'mkdir' command succeeded
if ( ! -d afni ) then
echo "failure: cannot create 'afni' directory (no write permission?)"
exit
endif
Testing for the existence of the 'afni' directory after the previous
mkdir command will verify that the command succeeded. This is a good
check to make, in case the user does not have write permission in this
directory.
# ----------------------------------------------------------------------
# verify that an AFNI program is in the PATH
which to3d >& /dev/null
if ( $status != 0 ) then
echo "failure: cannot find 'to3d' - is AFNI installed?"
exit
endif
# Unix
The 'which to3d' command requests the shell to output the location
(the full pathname) of the 'to3d' program. This will fail if 'to3d'
is not in the user's path.
In either case (success or failure), the output should not go to
the screen. Send all output to the special Unix file, /dev/null,
never to be seen again.
The second part of this test looks at the value of the '$status'
variable. This is a special shell variable that stores the result
of the previous command. Note that it must therefore be checked
or stored immediately, before any other commands are executed.
If '$status' is 0, the 'which' command must have been successful.
That means 'to3d' is in the user's path. Otherwise, if '$status'
is not 0, we inform the user and terminate the script.
#======================================================================
# begin pre-processing the data
#======================================================================
- create an AFNI dataset from the spoiled grass anatomical images
- create an AFNI dataset from EPI run 1
- create AFNI datasets from the other EPI runs
- concatenate all EPI datasets into 1 AFNI EPI dataset
- register the EPI volumes to the last one
# ----------------------------------------
# Change directories to the location of the SPGR I-files.
cd DDmb_data/SPGR_data
# ----------------------------------------------------------------------
# create the SPGR dataset, putting it in the afni directory
to3d -prefix DDSPGR -session $topdir/afni I.*
Since the input to 'to3d' is a list of I-files, it is not necessary to
specify information like alignment, orientation or grid spacing. This
information is contained in the header portion of each I-file, which is
then followed by the actual image data.
# Unix (see also 'to3d -help')
This command creates 'DDSPGR+orig.HEAD' and 'DDSPGR+orig.BRIK'.
Note the '-session' option here. In this way, 'to3d' knows to place
the resulting dataset files (the HEAD and BRIK files) in the top level
'afni' directory, which was created earlier as part of '$outlier_dir'.
# ----------------------------------------------------------------------
# put run 1 EPI data into a 3D+time AFNI dataset
# - run 1 will be the '-geomparent' for the other run data
# go to the directory containing the run data
cd ../EPI_data
to3d -session $topdir/afni -prefix DDr1 \
-save_outliers $outlier_dir/DDr1.outliers \
-orient SPR -zorigin 69 -epan \
-xSLAB 118.125S-I -ySLAB 118.125P-A -zSLAB 69R-61L \
-time:tz 110 27 2500 alt+z 3Ds:0:0:64:64:1:"DDr1*"
There are many options used here that were not needed for the creation
of the anatomical dataset, specifying the geometry and orientation of
the dataset. The difference here is that the input files, "DDr1_*",
do not contain any such information, where the SPGR data files, the
I-files, do. In fact, these files are just image files, containing
nothing but 4096 2-byte integers (they are 64x64 voxel images).
# Unix (see also 'to3d -help')
This command creates 'DDr1+orig.HEAD' and 'DDr1+orig.BRIK', and
stores them in the top 'afni' directory, where we put the SPGR
dataset, previously.
With the '-save_outliers' option, 'to3d' is told to store the outlier
information in a file under '$outlier_dir' (which was created earlier).
Note also the quotes around 'DDr1*'. Since an input image format
is specified (see 'to3d -help'), starting with '3Ds:0:0:64:64:1:',
the quotes around are filenames are actually required. Without them,
the shell will expand that expression,
3Ds:0:0:64:64:1:DDr1*
trying to find filenames that match. But the '3Ds:...' is the image
format information that we want to pass to 'to3d'. Only the 'DDr1*'
refers to the input files, and so it is put in quotes, in order to
pass that information directly to 'to3d'. The 'to3d' program must
expand the file list, not the shell.
# ----------------------------------------------------------------------
# put the other run data into 3D+time AFNI datasets
foreach run ( 2 3 4 )
to3d -geomparent $topdir/afni/DDr1+orig -session $topdir/afni \
-prefix DDr${run} -save_outliers $outlier_dir/DDr${run}.outliers \
-time:tz 110 27 2500 alt+z \
3Ds:0:0:64:64:1:"DDr${run}*"
end
Now that the geometry information has been provided for the run 1
dataset, that dataset can be used as the geometry parent for the others.
So for each of the other runs, use basically the same command as for
run 1, except replace the '-orient', '-zorigin', '-epan', '-xSLAB',
'-ySLAB' and '-zSLAB' options with '-geomparent'.
# Unix (see also 'to3d -help')
The foreach command takes a variable name and then a list of values within
parentheses. The script will process the body of the foreach loop once
for each value in parentheses. Within a single loop iteration, that
variable will take on the corresponding value.
In this case, 'run' is initially set to the value "2". Then the 'to3d'
command is executed, with 3 occurrences of that variable '$run'. So the
first iteration will execute the command:
to3d -geomparent $topdir/afni/DDr1+orig -session $topdir/afni \
-prefix DDr2 -save_outliers $outlier_dir/DDr2.outliers \
-time:tz 110 27 2500 alt+z \
3Ds:0:0:64:64:1:"DDr2*"
Note that $topdir will also be replaced by its value.
Once this command is finished, the loop starts the next iteration, where
'run' is set to the value "3". That corresponding 'to3d' command is
executed, 'run' is set to "4", the last 'to3d' command is executed,
and the loop terminates.
The body of a 'foreach' loop may contain many statements. In this case,
only a single 'to3d' command is used. The end of the 'foreach' body is
signified by the 'end' statement.
Something new to note here is the pair of '{', '}' characters around
our variable, '$run'. These are used to separate the variable from
surrounding characters. They are not always needed. In this case,
they are more of a visual separator than a syntactical one.
As an example, consider the EPI image file, DDr1_06.058 (this is for
run 1, slice 06, time point 058). Suppose variables were set for
'$run', '$slice', and '$time', and for some reason, it was desired to
print out the name of this file using them. One might try:
echo "file is: DDr$run_$slice.$time"
However, this would actually result in the error message:
run_: Undefined variable.
Since the '_' character is a valid variable name character, the shell
interprets all of '$run_' as the variable. In order to get syntactical
separation, it would be necessary to use '{' and '}':
echo "file is: DDr${run}_$slice.$time"
To get better visual separation, making it more clear, one might use:
echo "file is: DDr${run}_${slice}.${time}"
# ----------------------------------------------------------------------
# concatenate all 4 runs (time points 2 through 109) into a single dataset
# - this makes a dataset with 432 = 108*4 time points (0 through 431)
cd $topdir/afni
3dTcat -prefix DDrall \
DDr1+orig'[2..109]' DDr2+orig'[2..109]' \
DDr3+orig'[2..109]' DDr4+orig'[2..109]'
Go to the to top level 'afni' directory, where all of the datasets are
now located (via '-session' options). This includes the anatomical
dataset, and all 4 run datasets.
Put all the runs into a single 3d+time dataset. Sub-bricks 0 and 1 are
intentionally omitted from each run, leaving 108 sub-bricks from each
of the 4 run datasets. The program '3dTcat' will simply concatenate
the input datasets, resulting in a dataset with 432 time points, called
'DDrall+orig'. The data is unaltered.
# Unix (see also '3dTcat -help')
Note that the input datasets have sub-brick selectors in quotes. These
are to inform 3dTcat that, from 'DDr1+orig' for instance, the sub-bricks
from 2 through 109 should be copied into the resulting dataset, skipping
the first sub-bricks 0 and 1.
Recall that such quotes are necessary any time it is desired to prevent
the shell from processing the special characters like '[', ']' or '*',
as in the 'to3d' command, above.
# review the 3 types of quotes that have been used in this script, so far
The backward quotes, or ``, were used on the command assigning topdir.
Those quotes, and the command they contain, get replaced with the output
from that command, before the rest of the statement gets evaluated.
The single quotes, or '', were used in this '3dTcat' command. Single
quotes prevent the shell from evaluating any special character contained
between them.
The double quotes, or "", were used at the end of the 'to3d' command
within the 'foreach' loop. Double quotes are the same as single quotes,
except that they allow variable substitutions between them. At the end
of the 'to3d' command is the line:
"DDr${run}*"
This is a great example. Between the quotes can be found '${run}' and
'*'. Since '${run}' represents a variable substitution, the shell
replaces this with the corresponding value (either "2", "3" or "4",
depending on the iteration in the 'foreach' loop). However the '*'
character does not get evaluated, not within those quotes.
# ----------------------------------------------------------------------
# register each of the 432 sub-bricks (time points) to the last one
3dvolreg -dfile DDrallvrout -base 431 -prefix DDrallvr DDrall+orig
(see also '3dvolreg -help')
All of the run data is now in 'DDrall+orig', after the use of '3dTcat'.
Since the anatomical data was collected at the end of the scanning
session, after the last run, the data at the end of the final run
should be closely aligned with the anatomical data.
One thing to realize about 3-D volume registration is that it really
is a 3-D registration. In order to align one volume on top of another,
it can be rotated or shifted in any way. But it is merely moving one
3-D object around, to make it most closely aligned with another.
The DDrall+orig dataset has 432 sub-bricks, each of which is a 3-D
volume of the brain, taken within a single TR (2.5 seconds). This is
basically considered a single time point, a "snapshot" of the brain.
Each of the sub-bricks will be registered to the final one, number
431 (starting from 0). That is to say, each 3-D volume "snapshot"
will be shifted and rotated (by very small amounts) so that it is most
closely aligned with sub-brick 431.
The resulting dataset, where all sub-bricks are aligned, is 'DDrallvr',
as is specified by the '-prefix' option. One other output file is
created here, called 'DDrallvrout', which contains the resulting
parameters for every sub-brick registration.
#======================================================================
# We have our AFNI data, now analyze it (produce functional data bricks).
#======================================================================
# ----------------------------------------------------------------------
# make a directory to store the various text files
mkdir $topdir/afni/regressors
This is where the stimulus and regressor files will be stored
and processed.
# ----------------------------------------------------------------------
# go to the directory containing the permanent stimulus files
cd $topdir/stim_files
This directory holds the basic stimulus files. There are stimulus
files for each the 4 stimulus types, for each of the 4 runs (so 16
files).
For each stimulus type, create 1 long stimulus file, spanning all
4 runs. Except for the fact that the baseline may change from 1 run
to the next, the intention is to treat these 4 runs as 1 long run.
The long stimulus file (for each of the 4 stimulus types) will be
stored in the 'regressors' directory.
# ----------------------------------------------------------------------
# begin foreach loop, over the 4 stimulus types
foreach stim_type ( a t h l )
# Unix
While it is not necessary to indent the body of a 'foreach' loop, it
is highly recommended, simply to make the script easier to read.
# ----------------------------------------
# For this stimulus type, concatenate the stimulus timings
# from all 4 runs into a single file.
cat scan1$stim_type.txt scan2$stim_type.txt \
scan3$stim_type.txt scan4$stim_type.txt \
> $topdir/afni/regressors/scan1to4$stim_type.1D
# Unix
This 'cat' command is used to display the stimulus files sequentially.
Normally this data would then appear on the terminal window, except
that the output has been redirected to another file, using the '>'
symbol. So the 'scan1to4X.1D' file will be created, under the
'regressors' directory, where 'X' is the current stimulus type.
# ----------------------------------------
# also, can do:
# cat scan[1-4]$stim_type.txt \
# > $topdir/afni/regressors/scan1to4$stim_type.1D
# Unix
This alternate, commented command matches the one above. This command
shows an example of how to use '[' and ']', and what they mean to the
shell. Note that these characters are not contained within quotes, thus
allowing the shell to interpret them. The shell replaces '[]', and
whatever list is between them, with any single character in that list.
Suppose '$stim_type' is 'h' in this case. Then the command (ignoring
redirection) starts as 'cat scan[1-4]h.txt'. The shell then searches
for any filename that matches this expression, meaning any name
starting with 'scan', followed by a single matching character (either
'1', '2', '3' or '4'), followed by 'h.txt'. So the expression '[1-4]'
is replaced by any single character between '1' and '4'.
Expansion examples:
[1-4] : one of '1', '2', '3', or '4'
[1234] : one of '1', '2', '3', or '4'
[abc] : one of 'a', 'b', or 'c'
[a-c] : one of 'a', 'b', or 'c'
[qf7@] : one of 'q', 'f', '7', or '@'
# ----------------------------------------
# end foreach loop
end
# Unix
The 'end' statement terminates the 'foreach' loop. At this point, the
shell assigns 'stim_type' the next value in the control list (of 'a',
't', 'h' and 'l') and starts the loop body over.
If the entire list has been processed (i.e., if 'l' was the last
'stim_type'), then the shell moves on, past the 'end' statement.
# ----------------------------------------------------------------------
# create ideal reference function, one for each stimulus type
cd $topdir/afni/regressors
Recall that the 4 scan1to4X.1D files were placed under 'regressors'.
# ----------------------------------------
# create 'ideal' reference functions from the stimulus timing files
waver -dt 2.5 -GAM -input scan1to4a.1D > scan1to4a_hrf.1D
waver -dt 2.5 -GAM -input scan1to4t.1D > scan1to4t_hrf.1D
waver -dt 2.5 -GAM -input scan1to4h.1D > scan1to4h_hrf.1D
waver -dt 2.5 -GAM -input scan1to4l.1D > scan1to4l_hrf.1D
# Unix (see also 'waver -help')
It wouldn't save much, but this could be done with a 'foreach' loop:
foreach ST ( a t h l )
waver -dt 2.5 -GAM -input scan1to4${ST}.1D > scan1to4${ST}_hrf.1D
end
# ----------------------------------------------------------------------
# create a functional dataset, according to the stimulus timings
# and the contrasts
# ----------------------------------------
# go to the 'afni' directory, where all of the AFNI datasets are being put
cd $topdir/afni
# Unix
This is the same as 'cd ..', though it is more descriptive.
# ----------------------------------------
# actually create a functional dataset
3dDeconvolve -xout -input DDrallvr+orig -num_stimts 4 \
-stim_file 1 regressors/scan1to4a_hrf.1D -stim_label 1 Actions \
-stim_file 2 regressors/scan1to4t_hrf.1D -stim_label 2 Tool \
-stim_file 3 regressors/scan1to4h_hrf.1D -stim_label 3 HCMS \
-stim_file 4 regressors/scan1to4l_hrf.1D -stim_label 4 LCMS \
-full_first -fout -tout -concat $topdir/contrasts/runs.1D \
-glt 1 $topdir/contrasts/DDcontrv1.txt -glt_label 1 AvsT \
-glt 1 $topdir/contrasts/DDcontrv2.txt -glt_label 2 HvsL \
-glt 1 $topdir/contrasts/DDcontrv3.txt -glt_label 3 ATvsHL \
-bucket DDrallvrMRv1
# Unix (see also '3dDeconvolve -help')
Many of the input files for this command reside in different directories.
This command is being run from the top 'afni' directory, under which is
the regressors directory, which contains each of the ideal reference
function files, one per stimulus type.
The 3 contrast files are located under the 'contrasts' directory, along
with 'runs.1D'. Since this 'contrasts' directory is at the same level
as the 'afni' directory, each of the '$topdir/contrasts' directories
in this '3dDeconvolve' command could be replaced with '../contrasts',
without affecting the script.