NIfTI-1 Data Format
Extension and Customization
Robert W Cox, PhD
Director, Scientific and Statistical Computing Core
National Institute of Mental Health
National Institutes of Health
Department of Health and Human Services
United States of America

15 Apr 2004

Introduction
The biggest drawback to the NIfTI-1 format is the complete inability to customize or extend the header. The reasons for doing so are varied and do not need to be discussed here. This has been the major topic of discussion on the NIfTI-1 message board. This document contains my proposal (or rather, three nested proposals) for adding extra header fields to the NIfTI-1 format. At the end, I'll also address briefly some of the other issues that have been raised on the message board.

Extension and Customization of the Header
I propose a standard format for the "extra" header information that can fit between byte offset 348 and vox_offset in a *.nii file. This format is an XML-ish text format, simplified so that it is easy to parse without a full XML processor library. It is also close enough to XML that it can be extracted whole and incorporated into an XML document by adding an XML prolog and an enclosing XML element (start and end tags).
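As a rough illustration of that last point, the sketch below wraps the extended header bytes in a prolog and an enclosing element and hands the result to a stock XML parser. The names wrap_as_xml and nifti_extension, and the placeholder URI urn:example:nifti, are assumptions made up for this example; nothing in this proposal defines them. The namespace declaration is there only because a namespace-aware XML parser will reject a colon-prefixed name whose prefix has not been declared.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, invented for this sketch (not part of the proposal).
NIFTI_NS = "urn:example:nifti"

def wrap_as_xml(ext_bytes):
    """Surround raw extended-header bytes with an XML prolog and one enclosing element."""
    prolog = b'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    start = b'<nifti_extension xmlns:nifti="' + NIFTI_NS.encode("ascii") + b'">\n'
    end = b"\n</nifti_extension>\n"
    return prolog + start + ext_bytes + end

# Usage: two small elements, one with DATATEXT and one empty.
ext = (b'<diffusion_encoding nifti:type="3*float" nifti:rows="1"> 1 2 3 </diffusion_encoding>\n'
       b'<empty_flag />')
root = ET.fromstring(wrap_as_xml(ext))
names = [child.tag for child in root]
```

Any other prefixes in use (such as those suggested later for individual packages) would need their own declarations on the enclosing element.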

Character Set:
The data in the extended header space will be in 8-bit text form (no Unicode or UTF-8). The character set will have the first 128 characters (lower 7 bits) be US-ASCII (ISO-646). It would be legal to use one of the ISO-8859-* character sets for non-English languages, but this might create some portability issues.
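A minimal sketch of what this means for a reader: every byte maps to one character, and a quick check for bytes above 127 tells you whether the text is plain US-ASCII or depends on an ISO-8859-* choice. The function name decode_extension and the default of ISO-8859-1 are assumptions for illustration only.

```python
def decode_extension(raw, charset="iso-8859-1"):
    """Decode extended-header bytes and report whether they are pure US-ASCII.

    charset is an assumed default; the header itself does not say which
    ISO-8859-* set (if any) the writer had in mind.
    """
    text = raw.decode(charset)
    is_ascii = all(byte < 128 for byte in raw)
    return text, is_ascii
```

A writer concerned about portability could refuse to emit any header for which is_ascii would be False.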

Elements:
The extended header data will be formatted as a sequential collection of elements. (The term is derived from XML. Unlike XML, an element may not contain another element, and there is no prolog.) The format of an element is

<NAME ATTRIBUTEs> DATATEXT </NAME>     or     <NAME ATTRIBUTEs />
The second form can be used if the DATATEXT field is empty. The words in uppercase italics are defined below:

Rationale:
The choices above are guided by the principle that the extended header stuff should be easy to parse. For the truly lazy/amateur/hurried programmer, the following pseudo-algorithm would work to extract the element NAME and DATATEXT (on well-formed elements):

  1. Start at byte offset #348. Don't scan past byte offset vox_offset-1.
  2. Scan forward until the "<" character is found.
  3. From the current scan position, scan until a non-NAME character is found. You now have the element NAME.
  4. Scan forward until the next ">" character is found. If the character before that is "/", the data element is ended and there is no DATATEXT. Go back to 2.
  5. Scan forward until the next "<" character is found. DATATEXT is everything between the last ">" and this "<". Skip past the ">" that closes this end tag, then go back to 2.
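The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name scan_elements is made up, and the set of NAME characters used here (letters, digits, and the characters _ . - :) is a guess rather than the formal definition. The sketch steps over each end tag before resuming the scan, so that "</NAME>" is not mistaken for the start of a new element. It does not parse attributes or expand escapes.

```python
# Assumed NAME character set (a guess for this sketch).
NAME_CHARS = set(b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_.-:")
HEADER_END = 348  # first byte after the binary NIfTI-1 header

def scan_elements(buf, vox_offset, start=HEADER_END):
    """Yield (NAME, DATATEXT) pairs from well-formed elements in buf[start:vox_offset].

    DATATEXT is None for elements of the form <NAME ... />.
    """
    pos = start
    while True:
        # Step 2: find the next "<" without scanning past vox_offset-1.
        lt = buf.find(b"<", pos, vox_offset)
        if lt < 0:
            return
        # Step 3: collect NAME characters.
        end = lt + 1
        while end < vox_offset and buf[end] in NAME_CHARS:
            end += 1
        name = buf[lt + 1:end].decode("latin-1")
        # Step 4: find the ">" that closes the start tag.
        gt = buf.find(b">", end, vox_offset)
        if gt < 0:
            return
        if buf[gt - 1:gt] == b"/":
            yield name, None
            pos = gt + 1
            continue
        # Step 5: DATATEXT runs up to the "<" that opens the end tag.
        nxt = buf.find(b"<", gt + 1, vox_offset)
        if nxt < 0:
            return
        yield name, buf[gt + 1:nxt].decode("latin-1")
        # Move past the end tag before looking for the next element.
        close = buf.find(b">", nxt, vox_offset)
        if close < 0:
            return
        pos = close + 1
```

For example, list(scan_elements(data, vox_offset)) on the full contents of a *.nii file would give every NAME with its raw DATATEXT, leaving interpretation to the caller.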
Parsing attributes and expanding escapes is a little harder. Perhaps attributes should not be included in this revision. On the other hand, attributes can be very useful. See the Recap section, below, for further discussion.

I chose to propose an XML-ish format for these header extensions because I think that being able to extract this information into some external file/database may become important in the future. A simpler idea would be something of the form NAME = STRING, which would be trivial to parse. This was my first idea, and is more-or-less equivalent to the simplest level of this proposal (cf. Recap). But I actually favor including information about how to parse the STRING, which means we need attributes, which leads straight to XML-land.

Structured Elements:
As a higher level of organization, NIfTI-1 elements can contain attributes that define how to interpret the DATATEXT as a 2D table of numbers. To start with an example:

  <diffusion_encoding nifti:type="3*float" nifti:rows="4" >
       2.3 7.2 9.4
       6.2 -3.1 -1.0
       0.2 6.7 9.8
       6.6 7.9 -88.3 </diffusion_encoding>
The nifti:type attribute specifies that DATATEXT is to be interpreted as a sequence of sets of 3 floating point numbers; the nifti:rows attribute specifies that there will be 4 such sets. (The line breaks in the example are not required; arbitrary whitespace can be used to separate numbers inside DATATEXT.) A more complicated example mixes numeric types:
  <node_rgb nifti:type='int,3*float' nifti:rows='5'>
           7 0.8 0.1 0.9  11 0.0 0.1 0.9  17 0.9 0.9 0.0
          23 0.9 0.7 0.7  31 0.2 0.9 0.9
   </node_rgb>
Here, each "row" of values is to comprise 1 int and 3 floats, and there are 5 "rows".
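One way to read such a table, once the attribute values and DATATEXT are in hand, is sketched below. The function names expand_type and parse_table are hypothetical, only the int and float types are handled, and treating a missing nifti:rows as 1 is an assumption of this sketch, not something the proposal states.

```python
import re

# Only the two simplest elementary types are handled in this sketch.
CASTS = {"int": int, "float": float}

def expand_type(type_attr):
    """Expand a nifti:type value such as 'int,3*float' into one type name per column."""
    columns = []
    for part in type_attr.split(","):
        m = re.fullmatch(r"\s*(?:(\d+)\s*\*\s*)?(\w+)\s*", part)
        if m is None or m.group(2) not in CASTS:
            raise ValueError(f"unsupported nifti:type component: {part!r}")
        columns.extend([m.group(2)] * int(m.group(1) or 1))
    return columns

def parse_table(type_attr, datatext, rows_attr="1"):
    """Convert DATATEXT into a list of row tuples.

    rows_attr defaults to "1" (an assumption for elements that omit nifti:rows).
    """
    columns = expand_type(type_attr)
    nrows = int(rows_attr)
    tokens = datatext.split()  # arbitrary whitespace separates the numbers
    if len(tokens) != nrows * len(columns):
        raise ValueError("DATATEXT does not match nifti:type and nifti:rows")
    width = len(columns)
    table = []
    for r in range(nrows):
        row = tokens[r * width:(r + 1) * width]
        table.append(tuple(CASTS[kind](tok) for kind, tok in zip(columns, row)))
    return table
```

With the node_rgb example, parse_table("int,3*float", datatext, "5") would produce 5 tuples, each holding 1 int followed by 3 floats.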

The elementary types allowed in the nifti:type ATTRIBUTE would be int, float, complex, and String. This last type would be a way to encode an integer-to-string labeling table:

  <label_table nifti:type="int,String" nifti:rows="4">
     1 "Hippocampus"   2 "Amygdala"  3 "Brodmann 42"  4 "Who Knows?" </label_table>
The only reason not to allow a String type would be to minimize the parsing effort. On the other hand, if ATTRIBUTEs are included in NIfTI-1, then the code for parsing them would be reusable for parsing the DATATEXT Strings.
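To give a feel for the extra parsing effort, here is a minimal sketch that reads an int,String table into a dictionary. The names tokenize and parse_label_table are hypothetical, Strings are assumed to be delimited by double quotes as in the example, and escapes inside a String are not expanded.

```python
import re

# A token is either a double-quoted String or a run of non-whitespace characters.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')

def tokenize(datatext):
    """Split DATATEXT into tokens, keeping quoted Strings (with spaces) intact."""
    tokens = []
    for match in TOKEN.finditer(datatext):
        quoted, bare = match.groups()
        tokens.append(quoted if quoted is not None else bare)
    return tokens

def parse_label_table(datatext, nrows):
    """Read nrows (int, String) pairs into a dict mapping each int to its label."""
    tokens = tokenize(datatext)
    if len(tokens) != 2 * nrows:
        raise ValueError("DATATEXT does not hold the expected number of pairs")
    return {int(tokens[i]): tokens[i + 1] for i in range(0, len(tokens), 2)}
```

The same tokenizer could serve for quoted ATTRIBUTE values, which is the reuse argument made above.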

Standard Attribute:
A standard attribute can be defined to incorporate the information from the binary header. For example:

  <nifti_image
    nifti_type = 'NIFTI-1+'
    header_filename = 'elvis.nii'
    image_filename = 'elvis.nii'
    image_offset = '-1'
    ndim = '3'
    nx = '64' ny = '64' nz = '21'
    dx = '4' dy = '4' dz = '6'
    datatype = '4' datatype_name = 'INT16'
    nvox = '86016'
    nbyper = '2'
    byteorder = 'LSB_FIRST'
    cal_min = '3' cal_max = '25500'
    freq_dim = '1' phase_dim = '2' slice_dim = '3'
    slice_start = '0' slice_end = '20' slice_duration = '5.5' slice_code = '3'
    xyz_units = '2' time_units = '8'
    qform_code = '1' quatern_b = '0.7' quatern_c = '0' quatern_d = '0'
    qoffset_x = '0' qoffset_y = '0' qoffset_z = '0' qfac = '1'
  />
Other layouts could be designed. The image_offset value of -1 is intended to signal that the image data occurs flush to the end of the file; that is, the offset should be computed by examining the length of the file, then subtracting the number of bytes that the image data occupies (nvox*nbyper).
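The offset computation just described is small enough to write out. In this sketch the function name resolve_image_offset is hypothetical, and rejecting negative values other than -1 is my own assumption, since only -1 is given a meaning above.

```python
def resolve_image_offset(image_offset, file_size, nvox, nbyper):
    """Return the byte offset at which the image data starts.

    image_offset == -1 means the image data sits flush against the end of the file.
    """
    if image_offset >= 0:
        return image_offset
    if image_offset != -1:
        raise ValueError("only -1 has a defined meaning among negative offsets")
    data_bytes = nvox * nbyper
    if data_bytes > file_size:
        raise ValueError("file is too short to hold the image data")
    return file_size - data_bytes
```

With the values in the example (nvox = 86016, nbyper = 2), the image occupies 172032 bytes, so a file of 172384 bytes would put the image data at offset 352.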

Namespaces (of a sort):
To prevent overlap in element names between different developers, there are two non-exclusive possible ways to proceed. Both methods would make a NAME be of the form "prefix:suffix". The prefix part would be used to distinguish the originator of the NAME. (The use of a colon to separate NAMEs into distinct pieces is compatible with the XML namespace idea, and is partly where I got the idea.)

The first method is to have a central "registry" for name prefixes (presumably the nifti.nimh.nih.gov site), and start off with some pre-defined prefixes such as afni:, brainvoyager:, caret:, dicom:, freesurfer:, fsl:, mni:, nifti:, and spm: (and anything else at all popular). Anyone who wants a new prefix assigned can ask. I don't think the demand will be too high, so that maintaining the registry would be a low-key activity.

The second method is for each site to use as a prefix some unique fully qualified domain name from their institution, such as nimh.nih.gov:. The advantage of the second way is that no central registry is needed. The disadvantage is that NAMEs will be longer.

As an example, DICOM attributes could be included in a NIfTI-1 extended header with elements of the form

  <dicom:0020_0037 nifti:type="6*float">
    0.984808 0.173648 0.0 -0.173648 0.984808 0.0 </dicom:0020_0037>
(This particular example is the "Image Orientation (Patient)" DICOM field.)
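A short sketch of how a reader might take such NAMEs apart follows. The helper names are hypothetical, the prefix set simply repeats the examples suggested above rather than any actual registry, and reading the suffix as hexadecimal group and element numbers separated by an underscore is inferred from this one example.

```python
# Example prefixes taken from the suggestion above; no registry exists yet.
REGISTERED_PREFIXES = {"afni", "brainvoyager", "caret", "dicom", "freesurfer",
                       "fsl", "mni", "nifti", "spm"}

def split_name(name):
    """Split 'prefix:suffix' into (prefix, suffix); prefix is None without a colon."""
    prefix, sep, suffix = name.partition(":")
    if not sep:
        return None, name
    return prefix, suffix

def prefix_kind(prefix):
    """Classify a prefix as 'none', 'registered', 'domain', or 'unknown'."""
    if prefix is None:
        return "none"
    if prefix in REGISTERED_PREFIXES:
        return "registered"
    if "." in prefix:
        return "domain"  # looks like a fully qualified domain name
    return "unknown"

def dicom_tag(suffix):
    """Turn a suffix like '0020_0037' into a (group, element) pair of integers."""
    group, element = suffix.split("_")
    return int(group, 16), int(element, 16)
```

For instance, split_name("nimh.nih.gov:scan_notes") separates a domain-style prefix from its suffix in the same way as it does for "dicom:0020_0037".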

Recap:
There are 3 levels to this extended header proposition:

  1. Simple elements of the form <NAME> DATATEXT </NAME> or <NAME/>, with no attributes. These are easiest to parse. Namespaces are part of this level of the proposal.
  2. Add attributes, but don't have any NIfTI-1 defined meaning for any of them.
  3. Add the nifti:type and nifti:rows attributes to give a standardized way to specify the interpretation of the DATATEXT part of an element.
I can provide a C implementation of any of these levels. This proposal is a simplified version of the NIML protocol/format that is already implemented in the AFNI+SUMA package, and so incorporating it into the NIfTI-1 sample code will be relatively easy.

Finally, the extended header is designed to supplement the NIfTI-1 binary header, not replace it. Applications should be able to skip the extended header entirely and still be able to read the image data. Applications that do use the extended header should be prepared to deal with (e.g., ignore) elements with which they are not familiar.
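One simple way for an application to honor that last requirement is to dispatch on the element NAME and skip anything it has no handler for. The sketch below assumes the elements have already been extracted as (NAME, DATATEXT) pairs; the function name apply_elements and the handler table are hypothetical.

```python
def apply_elements(elements, handlers):
    """Call handlers[name](datatext) for known NAMEs; return the NAMEs that were skipped."""
    skipped = []
    for name, datatext in elements:
        handler = handlers.get(name)
        if handler is None:
            skipped.append(name)  # unfamiliar element: ignore it and carry on
            continue
        handler(datatext)
    return skipped

# Usage: this application only understands one element.
seen = []
handlers = {"fsl:comment": seen.append}
elements = [("fsl:comment", " smoothed "), ("afni:history", " 3dcalc ... ")]
skipped = apply_elements(elements, handlers)
```

Returning the skipped NAMEs lets a program log what it ignored without treating it as an error.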


Other Issues
Other issues that have come up on the NIfTI-1 message board include: