If these masks have been generated based on a moving neighborhood, would it be important to have access to the intended "center" of the neighborhood?
It seems likely that if the shapes of the neighborhoods have been heavily processed, the "center" of a neighborhood might no longer be obvious when working backwards from the binary mask alone. That could be a problem when trying to create a map of the output.
So here's a different idea: if each location in space maps to at most one mask, you could create a BRIK as a 3D image plus trailing binary data. Linearly, the file would look like:
[ spatial map of offsets into mask data ][ mask data ]
The first part of the BRIK is a standard 3D AFNI image that stores offsets into the binary data that follows. Voxels that don't have an associated mask would be zero (which is obviously an invalid offset). The trailing binary mask data would be sparse lists of indices.
If the masks are stored in the same linear order as the BRIK data, then the end of each mask can be found implicitly from the offset of the next non-zero voxel in the map portion of the file. For the last mask in the file, you either accept EOF as the end of the mask or create a phantom voxel at the beginning of the mask data that stores an offset for the end of the file.
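To make the layout concrete, here is a rough sketch in Python. It is a toy byte layout, not AFNI's actual BRIK format or API. The names pack_masks and read_mask are made up for illustration, and I'm assuming little-endian uint32 values, offsets counted in bytes from the start of the file (so zero can never be a valid mask offset), and EOF as the end of the last mask.

```python
import struct

def pack_masks(n_voxels, masks):
    """Build [ offset map ][ mask data ] as bytes.

    masks maps a voxel's linear index to its list of member voxel indices.
    Hypothetical helper; the uint32 little-endian encoding is an assumption.
    """
    map_bytes = 4 * n_voxels            # size of the offset map
    offsets = [0] * n_voxels            # 0 means "no mask for this voxel"
    body = bytearray()
    for voxel in sorted(masks):         # same linear order as the map
        offsets[voxel] = map_bytes + len(body)
        members = masks[voxel]
        body += struct.pack(f"<{len(members)}I", *members)
    return struct.pack(f"<{n_voxels}I", *offsets) + bytes(body)

def read_mask(blob, n_voxels, voxel):
    """Return the index list for one voxel, or None if it has no mask."""
    offsets = struct.unpack_from(f"<{n_voxels}I", blob, 0)
    start = offsets[voxel]
    if start == 0:
        return None
    # The mask ends where the next non-zero voxel's mask begins, or at EOF.
    end = next((o for o in offsets[voxel + 1:] if o != 0), len(blob))
    count = (end - start) // 4
    return list(struct.unpack_from(f"<{count}I", blob, start))

# Example: 6 voxels, 3 of which carry a mask.
blob = pack_masks(6, {1: [0, 1, 2], 4: [3, 4], 5: [5]})
mask_for_voxel_4 = read_mask(blob, 6, 4)
```

The linear scan for the next non-zero offset is the simplest way to find the end of a mask. A real reader would probably precompute the end offsets once instead of scanning on every lookup.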
In theory you could store any voxel-wise data this way, not only masks.
--judd