# 3dClusterize¶

## PURPOSE¶

```
This program is for performing clusterizing: one can perform voxelwise
thresholding on a dataset (such as a statistic), and then make a map
of remaining clusters of voxels larger than a certain volume. The
main output of this program is a single volume dataset showing a map
of the cluster ROIs.
As of Apr 24, 2020, this program now behaves less (unnecessarily)
guardedly when thresholding non-stat volumes. About time, right?
This program is specifically meant to reproduce behavior of the muuuch
older 3dclust, but this new program:
+ uses simpler syntax (hopefully);
+ includes additional clustering behavior such as the '-bisided ...'
variety (essentially, two-sided testing where all voxels in a
given cluster come from either the left- or right- tail, but not
mixed);
+ a mask (such as the whole brain) can be entered in;
+ voxelwise thresholds can be input as statistic values or p-values.
This program was also written to have simpler/more direct syntax of
usage than 3dclust. Some minor options have been carried over for
similar behavior, but many of the major option names have been
altered. Please read the helps for those below carefully.
This program was cobbled together by PA Taylor (NIMH, NIH), but it
predominantly uses code written by many legends: RW Cox, BD Ward, MS
Beauchamp, ZS Saad, and more.
```

## USAGE¶

### Input:¶

```
+ A dataset of one or more bricks
+ Specify an index of the volume to threshold
+ Declare a voxelwise threshold, and optionally a cluster-volume
threshold
+ Optionally specify the index an additional 'data' brick
+ Optionally specify a mask
```

### Output:¶

```
+ A report about the clusters (center of mass, extent, volume,
etc.) that can be dumped into a text file.
+ Optional: A dataset volume containing a map of cluster ROIs
(sorted by size) after thresholding (and clusterizing, if
specified).
That is, a data set where the voxels in the largest cluster all
have a value 1, those in the next largest are all 2, etc.
+ Optional: a cluster-masked version of an input data set. That is,
the values of a selected data set (e.g., effect estimate) that fall
within a cluster are output unchanged, and those outside a cluster
are zeroed.
+ Optional: a mask.
```

### Explanation of 3dClusterize text report:¶

```
The following columns of cluster summary information are output
for quick reference (and please see the asterisked notes below
for some important details on the quantities displayed):
Nvoxel : Number of voxels in the cluster
CM RL : Center of mass (CM) for the cluster in the Right-Left
direction (i.e., the coordinates for the CM)
CM AP : Center of mass for the cluster in the
Anterior-Posterior direction
CM IS : Center of mass for the cluster in the
Inferior-Superior direction
minRL, maxRL : Bounding box for the cluster, min and max
coordinates in the Right-Left direction
minAP, maxAP : Min and max coordinates in the Anterior-Posterior
direction of the volume cluster
minIS, maxIS : Min and max coordinates in the Inferior-Superior
direction of the volume cluster
Mean : Mean value for the volume cluster
SEM : Standard Error of the Mean for the volume cluster
Max Int : Maximum Intensity value for the volume cluster
MI RL : Coordinate of the Maximum Intensity value in the
Right-Left direction of the volume cluster
MI AP : Coordinate of the Maximum Intensity value in the
Anterior-Posterior direction of the volume cluster
MI IS : Coordinate of the Maximum Intensity value in the
Inferior-Superior direction of the volume cluster
* The CM, Mean, SEM, Max Int and MI values are all calculated using
using the '-idat ..' subvolume/dataset. In general, those peaks
and weighted centers of mass will be different than those of the
'-ithr ..' dset (if those are different subvolumes).
* CM values use the absolute value of the voxel values as weights.
* The program does not work on complex- or rgb-valued datasets!
* SEM values are not realistic for interpolated data sets! A
ROUGH correction is to multiply the SEM of the interpolated data
set by the square root of the number of interpolated voxels per
original voxel.
* Some summary or 'global' values are placed at the bottoms of
report columns, by default. These include the 'global' volume,
CM of the combined cluster ROIs, and the mean+SEM of that
Pangaea.
```

## COMMAND OPTIONS¶

```
-inset III :Load in a dataset III of one or more bricks for
thresholding and clusterizing; one can choose to use
either just a single sub-brick within it for all
operations (e.g., a 'statistics' brick), or to specify
an additional sub-brick within it for the actual
clusterizing+reporting (after the mask from the
thresholding dataset has been applied to it).
-mask MMM :Load in a dataset MMM to use as a mask, within which
to look for clusters.
-mask_from_hdr :If 3dClustSim put an internal attribute into the
input dataset that describes a mask, 3dClusterize will
use this mask to eliminate voxels before clustering,
if you give this option (this is how the AFNI
Clusterize GUI works by default). If there is no
internal mask in the dataset header, then this
doesn't do anything.
-out_mask OM :specify that you wanted the utilized mask dumped out
as a single volume dataset OM. This is probably only
really useful if you are using '-mask_from_hdr'. If
not mask option is specified, there will be no output.
-ithr j :(required) Uses sub-brick [j] as the threshold source;
'j' can be either an integer *or* a brick_label string.
-idat k :Uses sub-brick [k] as the data source (optional);
'k' can be either an integer *or* a brick_label string.
If this option is used, thresholding is still done by
the 'threshold' dataset, but that threshold map is
applied to this 'data' set, which is in turn used for
clusterizing and the 'data' set values are used to
make the report. If a 'data' dataset is NOT input
with '-idat ..', then thresholding, clustering and
reporting are all done using the 'threshold' dataset.
-1sided SSS TT :Perform one-sided testing. Two arguments are required:
SSS -> either 'RIGHT_TAIL' (or 'RIGHT') or 'LEFT_TAIL'
(or 'LEFT') to specify which side of the
distribution to test.
TT -> the threshold value itself.
See 'NOTES' below to use a p-value as threshold.
-2sided LL RR :Perform two-sided testing. Two arguments are required:
LL -> the upper bound of the left tail.
RR -> lower bound of the right tail.
*NOTE* that in this case, potentially a cluster could
be made of both left- and right-tail survivors (e.g.,
both positive and negative values). For this reason,
probably '-bisided ...' is a preferable choice.
See 'NOTES' below to use a p-value as threshold.
-bisided LL RR :Same as '-2sided ...', except that the tails are tested
independently, so a cluster cannot be made of both.
See 'NOTES' below to use a p-value as threshold.
-within_range AA BB
:Perform a kind of clustering where a different kind of
thresholding is first performed, compared to the above
cases; here, one keeps values within the range [AA, BB],
INSTEAD of keeping values on the tails. Is this useful?
Who knows, but it exists.
See 'NOTES' below to use a p-value as threshold.
-NN {1|2|3} :Necessary option to specify how many neighbors a voxel
has; one MUST put one of 1, 2 or 3 after it:
1 -> 6 facewise neighbors
2 -> 18 face+edgewise neighbors
3 -> 26 face+edge+cornerwise neighbors
If using 3dClustSim (or any other method), make sure
that this NN value matches what was used there. (In
many AFNI programs, NN=1 is a default choice, but BE
SURE YOURSELF!)
-clust_nvox M :specify the minimum cluster size in terms of number
of voxels M (such as output by 3dClustSim).
-clust_vol V :specify the minimum cluster size in terms of volume V,
in microliters (requires knowing the voxel
size). Probably '-clust_nvox ...' is more useful.
-pref_map PPP :The prefix/filename of the output map of cluster ROIs.
The 'map' shows each cluster as a set of voxels with the
same integer. The clusters are ordered by size, so the
largest cluster is made up of 1s, the next largest of 2s,
etc.
(def: no map of clusters output).
-pref_dat DDD :Including this option instructs the program to output
a cluster-masked version of the 'data' volume
specified by the '-idat ..' index. That is, only data
values within the cluster ROIs are included in the
output volume. Requires specifying '-idat ..'.
(def: no cluster-masked dataset output).
-1Dformat :Write output in 1D format (now default). You can
redirect the output to a .1D file and use the file
as input to whereami_afni for obtaining Atlas-based
information on cluster locations.
See whereami_afni -help for more info.
-no_1Dformat :Do not write output in 1D format.
-summarize :Write out only the total nonzero voxel count and
volume for each dataset
-nosum :Suppress printout of the totals
-quiet :Suppress all non-essential output
-outvol_if_no_clust: flag to still output an (empty) vol if no
clusters are found. Even in this case, no report is
is produced if no clusters are found. This option is
likely used for some scripting scenarios; also, the
user would still need to specify '-pref_* ...' options
as above in order to output any volumes with this opt.
(def: no volumes output if no clusters found).
-orient OOO :in the output report table, make the coordinate
order be 'OOO' (def: RAI, the DICOM standard);
alternatively, one could set the environment variable
AFNI_ORIENT (see the file README.environment).
NB: this only affects the coordinate orientation in the
*text table*; the dset orientation of the output
cluster maps and other volumetric data will match that
of the input dataset.
-abs_table_data :(new, from Apr 29, 2021) Use the absolute value of voxel
intensities (not the raw values) for calculation of the
mean and Standard Error of the Mean (SEM) in the report
table. Prior to the cited date, this was default behavior
(with '-noabs' switching out of it) but no longer.
### -noabs :(as of Apr 29, 2021, this option is no longer needed)
Previously this option switched from using default absolute
values of voxel intensities for calculation of the mean
and Standard Error of the Mean (SEM). But this has now
changed, and the default is to just use the signed values
themselves; this option will not cause an error, but is not
needed. See '-abs_table_data' for reporting abs values.
-binary :This turns the output map of cluster ROIs into a binary
(0 or 1) mask, rather than a cluster-index mask.
If no clusters are found, the mask is not written!
(def: each cluster has separate values)
```

## NOTES¶

### Saving the text report¶

```
To save the text file report, use the redirect '>' after the
3dClusterize command and dump the text into a separate file of
your own naming.
```

### Using p-values as thresholds for statistic volumes¶

```
By default, numbers entered as voxelwise thresholds are assumed to
be appropriate statistic values that you have calculated for your
desired significance (e.g., using p2dsetstat). HOWEVER, if you
just want to enter p-values and have the program do the conversion
work for you, then do as follows: prepend 'p=' to your threshold
number.
- For one-sided tests, the *_TAIL specification is still used, so
in either case the p-value just represents the area in the
statistical distribution's tail (i.e., you don't have to worry
about doing '1-p'). Examples:
-1sided RIGHT_TAIL p=0.005
-1sided LEFT_TAIL p=0.001
- For the two-sided/bi-sided tests, the a single p-value is
entered to represent the total area under both tails in the
statistical distribution, which are assumed to be symmetric.
Examples:
-bisided p=0.001
-2sided p=0.005
If you want asymmetric tails, you will have to enter both
threshold values as statistic values (NB: you could use
p2dsetstat to convert each desired p-value to a statistic, and
then put in those stat values to this program).
You will probably NEED to have negative signs for the cases of
'-1sided LEFT_TAIL ..', and for the first entries of '-bisided ..'
or '-2sided ..'.
You cannot mix p-values and statistic values (for two-sided
things, enter either the single p-value or both stats).
You cannot use this internal p-to-stat conversion if the volume
you are thresholding is not recognized as a stat.
```

### Performing appropriate testing¶

```
Don't use a pair of one-sided tests when you *should* be using a
two-sided test!
```

## EXAMPLES¶

```
1. Take an output of FMRI testing (e.g., from afni_proc.py), whose
[1] brick contains the effect estimate from a statistical model and
whose [2] brick contains the associated statistic; use the results
of 3dClustSim run with NN=1 (here, a cluster threshold volume of 157
voxels) and perform one-sided testing with a threshold at an
appropriate value (here, 3.313).
3dClusterize \
-inset stats.FT+tlrc. \
-ithr 2 \
-idat 1 \
-mask mask_group+tlrc. \
-NN 1 \
-1sided RIGHT_TAIL 3.313 \
-clust_nvox 157 \
-pref_map ClusterMap
2. The same as Ex. 1, but using bisided testing (two sided testing
where the results of each tail can't be joined into the same
cluster). Note, the tail thresholds do NOT have to be symmetric (but
often they are). Also, here we output the cluster-masked 'data'
volume.
3dClusterize \
-inset stats.FT+tlrc. \
-ithr 2 \
-idat 1 \
-mask mask_group+tlrc. \
-NN 1 \
-bisided -3.313 3.313 \
-clust_nvox 157 \
-pref_map ClusterMap \
-pref_dat ClusterEffEst
3. The same as Ex. 2, but specifying a p-value to set the voxelwise
thresholds (in this case, tails DO have to be symmetric).
3dClusterize \
-inset stats.FT+tlrc. \
-ithr 2 \
-idat 1 \
-mask mask_group+tlrc. \
-NN 1 \
-bisided p=0.001 \
-clust_nvox 157 \
-pref_map ClusterMap \
-pref_dat ClusterEffEst
4. Threshold a non-stat dset.
3dClusterize \
-inset anat+orig \
-ithr 0 \
-idat 0 \
-NN 1 \
-within_range 500 1000 \
-clust_nvox 100 \
-pref_map ClusterMap \
-pref_dat ClusterEffEst
```