AFNI Message Board

September 26, 2022 11:17PM
Hi Anthony,

Sorry for the delay - thanks Gang for the heads-up and Brian for reinstating my message board account!

Quick answer:
------------------
The SVM decision boundary is defined by a weight vector W and a scalar bias term w_0 (svm-light, and thus 3dsvm, uses b instead of w_0). For the linear kernel case, W has the same dimensions as the input data, i.e., one value per voxel. So if you are using whole-brain data, you can overlay W as a whole-brain map. W is the SVM solution for whatever labeled training data you gave 3dsvm.
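If it helps to see that concretely, here is a minimal sketch in Python with scikit-learn standing in for 3dsvm, on made-up data (the variable names and the synthetic "volumes" are just for illustration): a linear SVM returns a W with one entry per voxel, plus a scalar bias.

# Minimal sketch (scikit-learn standing in for 3dsvm, synthetic "volumes"):
# a linear SVM returns a weight vector W with one entry per voxel, plus a
# scalar bias b (w_0, called b in svm-light/3dsvm).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_vols, n_vox = 40, 1000                  # e.g. 40 training volumes, 1000 voxels
X = rng.normal(size=(n_vols, n_vox))      # each row is one "volume"
y = np.repeat([-1, 1], n_vols // 2)       # class labels
X[y == 1, :50] += 1.0                     # make the first 50 voxels informative

clf = SVC(kernel="linear", C=1.0).fit(X, y)
W = clf.coef_.ravel()                     # weight vector, same length as a volume
b = clf.intercept_[0]                     # scalar bias term
print(W.shape, b)                         # (1000,) and one number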

A bit more:
---------------
Let's call your training volumes X_t (real-valued vectors, one value per voxel) and your class labels y_t in {-1, 1}. (Note that under the hood 3dsvm takes 0, 1 labels and maps them to -1, 1.) The weight vector is W = SUM_t(alpha_t * y_t * X_t), where * is just simple multiplication. I'm summing over t here, but you could be doing other things, like summing over subjects.

NOTE: If you specify -alpha alpha_file_name in 3dsvm, you will get a file that has a value for every t. These values are alpha_t * y_t, where alpha_t itself is non-negative. Because alpha_t is multiplied by y_t, the entries for the negative class (the smaller class label in 3dsvm) come out negative in the alpha file.
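To make that concrete, here is a rough sketch with scikit-learn rather than 3dsvm (synthetic data; scikit-learn's dual_coef_ plays the role of the alpha values, holding alpha_t * y_t for the support vectors): summing alpha_t * y_t * X_t reproduces W, and the entries for the -1 class come out negative.

# Sketch with scikit-learn in place of 3dsvm: dual_coef_ holds alpha_t * y_t
# for the support vectors (analogous to the -alpha file), and W is the
# corresponding weighted sum of training volumes.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 200))
y = np.repeat([-1, 1], 15)
X[y == 1, :20] += 1.0

clf = SVC(kernel="linear", C=1.0).fit(X, y)
alpha_y = clf.dual_coef_.ravel()          # alpha_t * y_t (support vectors only)
X_sv = clf.support_vectors_               # the corresponding X_t

W_rebuilt = (alpha_y[:, None] * X_sv).sum(axis=0)   # SUM_t(alpha_t * y_t * X_t)
print(np.allclose(W_rebuilt, clf.coef_.ravel()))    # True
print(alpha_y.min() < 0 < alpha_y.max())            # negative for the -1 class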

For fun and some intuition, what if all of the alphas were equal and we broke the summation up into class 1 and class -1?
W = SUM_+(alpha_+ * X_+) - SUM_-(alpha_- * X_-), where instead of summing over all t, _+ runs over the t's labeled +1 and _- runs over the t's labeled -1.

This is really similar to just taking the average of all of the class 1 volumes and subtracting the average of all of the class -1 volumes from it. You can try this in AFNI (e.g., with 3dcalc), and it will probably give you an OK-looking map!
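As a quick numerical sanity check of that intuition (again scikit-learn on made-up data, not 3dsvm/3dcalc on real volumes), the difference of class means points in roughly the same direction as W:

# Toy check of the intuition above: the class-mean difference map points in
# roughly the same direction as the SVM weight vector W (synthetic data,
# scikit-learn standing in for 3dsvm).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 500))
y = np.repeat([-1, 1], 30)
X[y == 1, :100] += 0.8

W = SVC(kernel="linear", C=1.0).fit(X, y).coef_.ravel()
mean_diff = X[y == 1].mean(axis=0) - X[y == -1].mean(axis=0)

cos = np.dot(W, mean_diff) / (np.linalg.norm(W) * np.linalg.norm(mean_diff))
print(f"cosine similarity between W and the mean-difference map: {cos:.2f}")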

Of course the SVM alphas are generally not the same for every X_t, so in a sense SVM is giving us a smart, weighted average. Larger magnitude alphas make that X_t “count” more. If alpha equals 0, then that X_t doesn’t “count” at all! Any X_t with an alpha > 0 (or abs(y*alpha) > 0) is a support vector.
You could verify this outside of 3dsvm by multiplying each volume by its value in the alpha file and summing the results.
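If you want to mimic that verification without AFNI at hand, here is a sketch along the same lines as the snippet above (scikit-learn, synthetic data): give every training volume an alpha_t * y_t value, with zeros for the non-support vectors as in the alpha file, multiply, and sum.

# The verification suggested above, sketched with scikit-learn: assign every
# training volume its alpha_t * y_t value (0 for non-support vectors),
# multiply each volume by that value, sum, and compare with W.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = rng.normal(size=(30, 200))
y = np.repeat([-1, 1], 15)
X[y == 1, :20] += 1.0

clf = SVC(kernel="linear", C=1.0).fit(X, y)

alpha_y = np.zeros(len(X))                       # one value per t, like the alpha file
alpha_y[clf.support_] = clf.dual_coef_.ravel()   # alpha_t * y_t for support vectors

W_check = (alpha_y[:, None] * X).sum(axis=0)     # multiply each volume, then sum
print(np.allclose(W_check, clf.coef_.ravel()))   # True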

Even more detail (sure to be either too much or not enough – could it be both simultaneously?):
-----------------------
The alphas are Lagrange multipliers for solving the SVM margin constraints. The SVM approach is to minimize the norm of W subject to y_t * ( dot(X_t, W) + b ) >= 1 for every t, where dot(u, v) is the dot product of vectors u and v. Going back to your question, the exact weighting function comes from this: the alpha_t * y_t are the weights, and any X_t with a non-zero alpha_t is a support vector.
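If you want to poke at the constraints numerically, here is a small sketch (scikit-learn on synthetic, roughly separable data): every training volume satisfies y_t * (dot(X_t, W) + b) >= 1 up to slack, and the support vectors are the ones sitting right at that margin of 1.

# Sketch of the margin constraints (scikit-learn, synthetic data):
# y_t * (dot(X_t, W) + b) should be >= 1 for every training volume (up to
# slack), and the support vectors sit at (or inside) the margin of 1.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 200))
y = np.repeat([-1, 1], 15)
X[y == 1, :20] += 1.5

clf = SVC(kernel="linear", C=10.0).fit(X, y)
W, b = clf.coef_.ravel(), clf.intercept_[0]

margins = y * (X @ W + b)                            # y_t * (dot(X_t, W) + b) for every t
print(round(margins.min(), 3))                       # ~1 or more if the data are separable
print(np.round(np.sort(margins[clf.support_]), 3))   # support vectors hug 1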

One more try:
-------------------
There are lots of good tutorials that set up the SVM quadratic programming problem, which ultimately leads to the Lagrange multipliers. I've never seen anyone call this a weighted sum of the training data, but it is one, and I hope that gives you some intuition. The other shortcoming of tutorials is that it is convenient to work in 2D and let the reader generalize to higher dimensions. In those 2D plots, think of each point as a single fMRI volume. You would have tens of thousands of dimensions instead of 2, but each volume would still be a single point in that N-dimensional space. Now also think about some extremely simple cases in 2D, with class 0 plotted as "#" and class 1 as "^":

Voxel 2          hyperplane
   |                 |
   |   #             |   ^  ^  ^
   |   #  #          |
   |   #             |       ^  ^
___|_________________|______________ Voxel 1
   |                 |
For example, the above shows 2 voxels and 9 TRs.

And


Voxel 2
   |    #    #    #    #
---|-------------------------------- hyperplane
   |    ^  ^  ^     ^      ^
___|________________________________ Voxel 1
   |


In the first case, what I am attempting to illustrate is a hyperplane that is perpendicular to the voxel 1 axis and parallel to the voxel 2 axis. Voxel 1 is doing all of the work, so it gets essentially all of the weight! (Remember you are mapping back to voxels and trying to understand how that relates to the decision boundary.) In the second case, the hyperplane is horizontal, and voxel 2 should be more prominent in a weight-vector brain map.
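Here is the first picture as a tiny numerical example (scikit-learn, made-up 2-voxel data roughly matching the sketch): the classes separate only along voxel 1, so the learned W puts essentially all of its weight on voxel 1 and practically none on voxel 2.

# The first picture as numbers (scikit-learn, made-up 2-voxel data): the
# classes separate only along voxel 1, so W ends up with a large voxel 1
# weight and a ~0 voxel 2 weight.
import numpy as np
from sklearn.svm import SVC

# columns: [voxel 1, voxel 2]; "#" points on the left, "^" points on the right
X = np.array([[1.0, 2.0], [1.5, 1.0], [1.5, 3.0], [0.8, 2.5],              # class -1, "#"
              [4.0, 1.0], [4.0, 3.0], [4.5, 2.0], [5.0, 2.8], [4.8, 1.2]])  # class +1, "^"
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1, 1])

W = SVC(kernel="linear", C=1.0).fit(X, y).coef_.ravel()
print(W)   # voxel 1 carries essentially all of the weight; voxel 2 is ~0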

Hope this helps,
Steve LaConte
Subject                        Author     Posted
3dSVM Weighting Function       AnthonyA   September 19, 2022 01:52PM
Re: 3dSVM Weighting Function   slaconte   September 26, 2022 11:17PM