Flexible Precision Images

December 3, 2000

This document describes the FPBM (Flexible Precision Buffer Map) file format for images and animations introduced with LightWave 6.0.

Introduction

An image is a rectangular array of values. The image data generated by LightWave includes not only the red, green and blue levels of each pixel in the rendered image, but also values at each pixel for the alpha level, z-depth, shading, reflectivity, surface normal, 2D motion, and other buffers used during rendering. Most of these quantities are represented internally as floating-point numbers, and all may change over time. Existing image and animation file formats are inadequate for storing all of this information, which is the motivation for the new FPBM format.

The data channels in an FPBM are called layers, and each layer can store values as 8-bit or 16-bit integers or as 32-bit floating-point numbers. A set of these layers for a given animation time is called a frame. An FPBM containing a single frame is a still image, and one containing a time sequence of frames is an animation.

(The descriptions here of the animation features of FPBM should be considered preliminary. They haven't been implemented in LightWave yet.)

Chunks

The FPBM format is based on the metaformat for binary files described in EA IFF 85 Standard for Interchange Format Files. (See also ILBM, an earlier IFF image format.) The basic structural element in an IFF file is the chunk. A chunk consists of a four-byte ID tag, a four-byte chunk size, and size bytes of data. If the size is odd, the chunk is followed by a 0 pad byte, so that the next chunk begins on an even byte boundary. (The pad byte isn't counted in the size.)

A chunk ID is a sequence of 4 bytes containing 7-bit ASCII values, usually upper-case printable characters, used to identify the chunk's data. ID tags can be interpreted as unsigned integers for comparison purposes. They're typically constructed using macros like the following.

   #define CKID_(a,b,c,d) (((a)<<24)|((b)<<16)|((c)<<8)|(d))
   #define ID_FORM CKID_('F','O','R','M')
   #define ID_FPBM CKID_('F','P','B','M')
   ...

FPBM files start with the four bytes FORM followed by a four-byte integer giving the length of the file (minus 8) and the four-byte ID FPBM. The remainder of the file is a collection of chunks containing layer data.
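
As an illustration of the prefix check, the following sketch tests whether a memory buffer begins with FORM, a size, and FPBM. The names get_u32be and is_fpbm are hypothetical helpers, not part of any LightWave API, and the sketch assumes the first 12 bytes of the file have already been read into memory.

```c
#include <stddef.h>
#include <stdint.h>

#define CKID_(a,b,c,d) (((uint32_t)(a)<<24)|((uint32_t)(b)<<16)| \
                        ((uint32_t)(c)<<8)|(uint32_t)(d))
#define ID_FORM CKID_('F','O','R','M')
#define ID_FPBM CKID_('F','P','B','M')

/* Assemble a big-endian 32-bit value from four bytes (hypothetical helper). */
static uint32_t get_u32be( const unsigned char *p )
{
   return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
          ((uint32_t) p[2] << 8)  |  (uint32_t) p[3];
}

/* Returns 1 and stores the FORM size (file length minus 8) if buf
   starts with FORM <size> FPBM, 0 otherwise. */
int is_fpbm( const unsigned char *buf, size_t len, uint32_t *formsize )
{
   if ( len < 12 ) return 0;
   if ( get_u32be( buf ) != ID_FORM ) return 0;
   if ( get_u32be( buf + 8 ) != ID_FPBM ) return 0;
   *formsize = get_u32be( buf + 4 );
   return 1;
}
```

Assembling the IDs byte by byte keeps the comparison independent of the host's byte order.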

To be read, IFF files must be parsed. FPBM files are pretty uniform, but in general the order in which chunks can occur in an IFF file isn't fixed. You may encounter chunks or layer types that aren't defined here, which you should be prepared to skip gracefully if you don't understand them. You can do this by using the chunk size to seek to the next chunk. And you may encounter chunk sizes that differ from those implied here. Readers must respect the chunk size. Missing data should be given default values, and extra data, which the reader presumably doesn't understand, should be skipped.
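
The skipping rule can be sketched as a small search routine. It assumes the chunks that follow the FORM prefix are already in memory; find_chunk and get_u32be are hypothetical names used only for this illustration. Chunks with other IDs are stepped over using their size, plus the pad byte when the size is odd.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble a big-endian 32-bit value from four bytes (hypothetical helper). */
static uint32_t get_u32be( const unsigned char *p )
{
   return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
          ((uint32_t) p[2] << 8)  |  (uint32_t) p[3];
}

/* Search body[0..len) for the first chunk whose ID equals wanted.
   Returns a pointer to the chunk's data and stores its size, or
   returns NULL if the chunk is absent or a chunk is truncated. */
const unsigned char *find_chunk( const unsigned char *body, size_t len,
   uint32_t wanted, uint32_t *size_out )
{
   size_t pos = 0;

   while ( len - pos >= 8 ) {
      uint32_t id   = get_u32be( body + pos );
      uint32_t size = get_u32be( body + pos + 4 );
      pos += 8;

      if ( size > len - pos ) return NULL;     /* chunk overruns the buffer */
      if ( id == wanted ) {
         *size_out = size;
         return body + pos;
      }

      pos += size;                             /* skip the data we don't want */
      if (( size & 1 ) && pos < len ) pos++;   /* and the pad byte, if any */
   }
   return NULL;
}
```

A full reader would dispatch on the ID inside the loop instead of returning at the first match, but the size and pad handling would stay the same.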

Data Basics

The data in an FPBM will be described in this document using C language conventions. Chunks will be represented as structures, and the values within each structure will be defined as the C basic types short or float. As used here, a short is a signed, two's complement, 16-bit integer, and a float is a 32-bit IEEE floating-point number. All data in an FPBM is written in big-endian (Motorola, Internet) byte order. Programs running in environments (primarily Microsoft Windows) that use a different byte order must swap bytes after reading and before writing.
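
One way to avoid explicit byte swapping is to assemble each value from individual bytes, which produces the same result on any host. The sketch below shows this for the two basic types; get_i16be and get_f32be are hypothetical helper names, and the float version assumes the host's float is a 32-bit IEEE number.

```c
#include <stdint.h>
#include <string.h>

/* Read a signed, two's complement, big-endian 16-bit integer. */
short get_i16be( const unsigned char *p )
{
   int v = ( p[0] << 8 ) | p[1];
   if ( v >= 0x8000 ) v -= 0x10000;     /* restore the sign */
   return ( short ) v;
}

/* Read a big-endian 32-bit IEEE float.
   Assumes sizeof(float) == 4 and IEEE 754 layout on the host. */
float get_f32be( const unsigned char *p )
{
   uint32_t bits = ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
                   ((uint32_t) p[2] << 8)  |  (uint32_t) p[3];
   float f;
   memcpy( &f, &bits, sizeof f );       /* reinterpret the bit pattern */
   return f;
}
```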

File Structure

Structurally, FPBMs are quite simple.

   FORM formsize FPBM
   FPHD 28 FPHeader
   for each frame
      FLEX 2 numLayers
      for each layer
         LYHD 20 LayerHeader
         LAYR datasize data

The header is followed by one or more frames. Each frame begins with a layer count, and this is followed by the layers. Each layer begins with a header describing the data it contains.

The following sections describe the FPHD, FLEX and LYHD chunks.

FPHD - Flexible Precision Header

The FPHeader contains information that applies globally to all of the frames in the file. It appears first in the file, after the FORM prefix and before the first frame.

   typedef struct st_FPHeader {
      short width;
      short height;
      short numLayers;
      short numFrames;
      short numBuffers;
      short flags;
      short srcBytesPerLayerPixel;
      short pad2;
      float pixelAspect;
      float pixelWidth;
      float framesPerSecond;
   } FPHeader;
width, height
Pixel dimensions of the image.
numLayers
The maximum number of layers per frame. Some layers may not be stored for all frames, since their contents may not differ from frame to frame.
numFrames
Number of frames. For still images, this will be 1.
numBuffers
Number of animation buffers. This affects the interpretation of delta encoded layer data. When the file is written for single buffered playback, the number of buffers is 1 and the deltas are relative to the previous frame. For double buffered playback, the deltas are relative to the frame two frames back, since the new frame is drawn over the contents of the back buffer. This field is ignored for still images.
flags
One or more of the following flag values, combined using bitwise-or.
Source_Int (0 << 0)
Source_FP (1 << 0)
The natural representation of the data, or the way the data was stored before being written to the file, either integer (the default) or floating-point. This can differ from the way the data is actually stored in the file (specified for each layer in the LayerHeader flags field). Readers may wish to restore the data to its original representation using this information. It can also be used to indicate the precision of the source data.
InterlaceFlag (1 << 1)
Scanlines should be interlaced (field rendered) for playback. (The actual interlace state is stored for each layer in the LayerHeader.)
srcBytesPerLayerPixel
Use this in combination with the Source_Int and Source_FP flags to determine the natural data type for the data in the file, or the type in which the data was stored before it was written to the file. The most common values for this field are 1, 2 and 4. The actual size of a pixel in each layer is in the LayerHeader's bytesPerLayerPixel field.
pad2
Reserved for future use.
pixelAspect
Pixel aspect ratio expressed as width divided by height.
pixelWidth
Pixel width in millimeters. This fixes the size of the image for print. Because pixelAspect is width divided by height, the pixel height is pixelWidth / pixelAspect, and the horizontal and vertical DPI (dots per inch) follow as
   hdpi = 25.4 / pixelWidth
   vdpi = 25.4 * pixelAspect / pixelWidth
framesPerSecond
Number of frames per second for animations. Writers may set this to 0.0 for still images.
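
The field layout above can be read from the FPHD chunk data one value at a time, as in the following sketch. parse_fphd and the two byte-order helpers are hypothetical names. The sketch simply rejects chunks shorter than 28 bytes, which is a simplification, since this document does not list default values for missing fields. Extra bytes beyond the 28 are ignored.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct st_FPHeader {
   short width, height, numLayers, numFrames, numBuffers, flags;
   short srcBytesPerLayerPixel, pad2;
   float pixelAspect, pixelWidth, framesPerSecond;
} FPHeader;

#define Source_FP     (1 << 0)
#define InterlaceFlag (1 << 1)

static short get_i16be( const unsigned char *p )
{
   int v = ( p[0] << 8 ) | p[1];
   if ( v >= 0x8000 ) v -= 0x10000;
   return ( short ) v;
}

static float get_f32be( const unsigned char *p )   /* assumes IEEE float */
{
   uint32_t bits = ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
                   ((uint32_t) p[2] << 8)  |  (uint32_t) p[3];
   float f;
   memcpy( &f, &bits, sizeof f );
   return f;
}

/* Fill *h from the data of an FPHD chunk. Returns 1 on success,
   0 if fewer than 28 bytes are available. */
int parse_fphd( const unsigned char *d, size_t size, FPHeader *h )
{
   if ( size < 28 ) return 0;
   h->width                 = get_i16be( d + 0 );
   h->height                = get_i16be( d + 2 );
   h->numLayers             = get_i16be( d + 4 );
   h->numFrames             = get_i16be( d + 6 );
   h->numBuffers            = get_i16be( d + 8 );
   h->flags                 = get_i16be( d + 10 );
   h->srcBytesPerLayerPixel = get_i16be( d + 12 );
   h->pad2                  = get_i16be( d + 14 );
   h->pixelAspect           = get_f32be( d + 16 );
   h->pixelWidth            = get_f32be( d + 20 );
   h->framesPerSecond       = get_f32be( d + 24 );
   return 1;
}
```

Reading field by field, rather than copying the chunk over the struct, sidesteps both byte order and any padding the compiler inserts into the structure.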

FLEX - Frame Header

The FrameHeader appears at the start of each frame. This chunk may grow in the future to include other information.

   typedef struct st_FrameHeader {
      short numLayers;
   } FrameHeader;
numLayers
Number of layers in this frame.

LYHD - Layer Header

The LayerHeader appears at the start of each layer to describe the layer's contents.

   typedef struct st_LayerHeader {
      short flags;
      short layerType;
      short bytesPerLayerPixel;
      short compression;
      float blackPoint;
      float whitePoint;
      float gamma;
   } LayerHeader; 
flags
One or more of the following flags, combined using bitwise-or.
Layer_Int (0 << 0)
Layer_FP (1 << 0)
Data in the layer is integer (the default) or floating-point.
Layer_Interlace (1 << 1)
Scanlines are interlaced (field rendered).
Layer_EvenField (0 << 2)
Layer_OddField (1 << 2)
Field dominance for interlaced layers. This indicates which field is displayed first in time.
layerType
The data channel contained in the layer. Possible values include
Layer_MONO 0
Monochrome (grayscale) image channel.
Layer_RED 1
Layer_GREEN 2
Layer_BLUE 3
Layer_ALPHA 4
Color and alpha channels.
Layer_OBJECT 5
Object ID.
Layer_SURFACE 6
Surface or material ID.
Layer_COVERAGE 7
Object transparency/antialiasing.
Layer_ZDEPTH 8
Layer_WDEPTH 9
The Z depth is the distance from the camera to the nearest object visible in a pixel. Strictly speaking, this is the perpendicular distance from the plane defined by the camera's position and view vector. The W depth buffer contains the inverse of Z.
Layer_GEOMETRY 10
The values in this buffer are the dot-products of the surface normals with the eye vector (or the cosine of the angle of the surfaces to the eye). They reveal something about the underlying shape of the objects in the image. Where the value is 1.0, the surface is facing directly toward the camera, and where it's 0, the surface is edge-on to the camera.
Layer_SHADOW 11
Indicates where shadows are falling in the final image. It may also be thought of as an illumination map, showing what parts of the image are visible to the lights in the scene.
Layer_SHADING 12
A picture of the diffuse shading and specular highlights applied to the objects in the scene. This is a component of the rendering calculations that depends solely on the angle of incidence of the lights on a surface. It doesn't include the effects of explicit shadow calculations.
Layer_DFSHADING 13
Layer_SPSHADING 14
Like the Layer_SHADING buffer, but these store the amount of diffuse and specular shading (highlighting) separately, rather than adding them together.
Layer_TEXTUREU 15
Layer_TEXTUREV 16
Layer_TEXTUREW 17
Texture coordinates.
Layer_NORMALX 18
Layer_NORMALY 19
Layer_NORMALZ 20
Normal vector. This is the geometric normal of the object surface visible in each pixel.
Layer_REFLECT 21
Reflection.
Layer_MOTIONX 22
Layer_MOTIONY 23
Support for 2D vector-based motion blur. These buffers contain the pixel distance moved by the item visible in each pixel. The amount of movement depends on the camera exposure time and includes the effects of the camera's motion.
bytesPerLayerPixel
Number of bytes per pixel, usually 1, 2 or 4.
compression
One of the following compression codes.
NoCompression 0
Data is uncompressed.
HorizontalRLE 1
VerticalRLE 3
Run-length encoding (RLE). The horizontal type is identical to the byteRun1 RLE encoding used in ILBM and the output of the Macintosh PackBits function. The vertical type compresses along columns rather than rows. The compressor treats the data as a sequence of bytes, regardless of the data type of the layer's values.
HorizontalDelta 2
VerticalDelta 4
Delta encoding for animation. Only the parts of the image that differ from a previous frame are written.

The RLE and delta methods are described in more detail below.

blackPoint, whitePoint
The nominal minimum and maximum buffer levels. These define the dynamic range of the data in the layer. Typical values for RGB layers are 0.0 and 1.0.
gamma
Linearity of the data. This and the black and white points are used primarily to encode RGB levels for different display devices. The default is 1.0.
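
The LayerHeader flag bits listed above can be unpacked into separate booleans, as in this sketch. LayerFlagInfo and decode_layer_flags are hypothetical names. Reporting field dominance only for interlaced layers is an interpretation of the description above, not a rule stated by the format.

```c
#define Layer_FP        (1 << 0)
#define Layer_Interlace (1 << 1)
#define Layer_OddField  (1 << 2)

/* Hypothetical convenience structure for the decoded flag bits. */
typedef struct st_LayerFlagInfo {
   int is_float;          /* layer values are floating-point */
   int interlaced;        /* scanlines are field rendered */
   int odd_field_first;   /* odd field is displayed first in time */
} LayerFlagInfo;

LayerFlagInfo decode_layer_flags( short flags )
{
   LayerFlagInfo info;

   info.is_float   = ( flags & Layer_FP ) != 0;
   info.interlaced = ( flags & Layer_Interlace ) != 0;
   /* field dominance is only reported for interlaced layers */
   info.odd_field_first =
      info.interlaced && ( flags & Layer_OddField ) != 0;
   return info;
}
```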

Layer Data

The data for a layer is written in a LAYR chunk that immediately follows the layer's LayerHeader. The data is a rectangular array of values. The origin is the top left corner, and before compression, values are stored from left to right, and rows from top to bottom. No padding is added to the end of any row.

When the compression type is NoCompression, this is also how the layer is written in the file. The number of bytes in one row is

   rowbytes = LayerHeader.bytesPerLayerPixel * FPHeader.width;

The number of rows is FPHeader.height, and the total number of bytes of layer data (and the LAYR chunk size) is

   layerbytes = rowbytes * FPHeader.height;
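
The same arithmetic gives the byte offset of any value in an uncompressed layer. The following sketch wraps the two formulas and adds an offset calculation; the function names are hypothetical, and x and y are assumed to be zero-based with the origin at the top left.

```c
#include <stddef.h>

/* Bytes in one row of an uncompressed layer. */
size_t layer_row_bytes( int width, int bytesPerLayerPixel )
{
   return ( size_t ) bytesPerLayerPixel * ( size_t ) width;
}

/* Total bytes of uncompressed layer data (the LAYR chunk size). */
size_t layer_bytes( int width, int height, int bytesPerLayerPixel )
{
   return layer_row_bytes( width, bytesPerLayerPixel ) * ( size_t ) height;
}

/* Byte offset of the value at column x, row y. Rows are stored top
   to bottom with no padding, values left to right within a row. */
size_t sample_offset( int x, int y, int width, int bytesPerLayerPixel )
{
   return ( size_t ) y * layer_row_bytes( width, bytesPerLayerPixel )
        + ( size_t ) x * ( size_t ) bytesPerLayerPixel;
}
```

For a 640 x 480 layer with 2 bytes per value, rowbytes is 1280 and layerbytes is 614400.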

RLE Compression

The following pseudocode illustrates how RLE-compressed bytes are unpacked.

   loop
      read the next source byte into n
      if n >= 0
         copy the next n + 1 bytes literally
      else if n < 0
         replicate the next byte -n + 1 times
   until the row or column is full

The unpacker reads from the source (compressed data in a LAYR chunk) and writes to a destination (a memory buffer). For horizontal RLE, the destination pointer is incremented by 1 for each decoded byte, while for vertical RLE, the destination pointer is incremented by rowbytes bytes.

Each row (or column) is separately packed. In other words, runs never cross rows (or columns).

In the inverse routine (the packer), it's best to encode a 2 byte repeat run as a replicate run except when preceded and followed by a literal run, in which case it's best to merge the three into one literal run. Always encode 3 byte repeats as replicate runs.

Delta Compression

The delta compression method uses RLE, but it adds a mechanism for skipping bytes that haven't changed. This is used when storing animation frames. The skipped bytes retain the values stored there by a previous frame.

   loop
      read the next source byte into nc
      if nc < 0
         skip ahead -nc columns
      else
         for i = 0 to nc
            read the next source byte into nr
            if nr < 0
               skip ahead -nr rows
            else
               unpack rle encoded span of size nr + 1
   until the layer is full

Example Code

The unpackRLE function decodes RLE-compressed data. psrc is the address of the source pointer, which the function advances as it decodes the compressed bytes. dst is the destination buffer where decoded bytes are written. size is the RLE span, or the number of destination bytes that should be produced. This is typically rowbytes for horizontal RLE and FPHeader.height for vertical RLE. step is the number of bytes that the destination pointer should be moved after each decoded byte is written, typically 1 for horizontal and rowbytes for vertical. The function returns TRUE if it succeeds and FALSE otherwise. On failure the source pointer is left unchanged.

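   #ifndef TRUE
   #define TRUE  1
   #define FALSE 0
   #endif

   int unpackRLE( char **psrc, char *dst, int size, int step )
   {
      int c, n;
      char *src = *psrc;

      while ( size > 0 ) {
         /* the count byte is signed, even where plain char is unsigned */
         n = ( signed char ) *src++;

         if ( n >= 0 ) {
            /* literal run: copy the next n + 1 bytes */
            ++n;
            size -= n;
            if ( size < 0 ) return FALSE;
            while ( n-- ) {
               *dst = *src++;
               dst += step;
            }
         }
         else {
            /* replicate run: repeat the next byte -n + 1 times */
            n = -n + 1;
            size -= n;
            if ( size < 0 ) return FALSE;
            c = *src++;
            while ( n-- ) {
               *dst = c;
               dst += step;
            }
         }
      }
      *psrc = src;
      return TRUE;
   }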
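
To decode a whole layer, the span unpacker is called once per row for horizontal RLE, or once per byte column for vertical RLE. The sketch below illustrates that outer loop. So that it compiles on its own, it carries a compact stand-in for unpackRLE named unpack_span, which also checks the end of the source buffer. Both function names, and the choice to pass the compression code as a plain integer, are assumptions made for this example.

```c
#include <stddef.h>

/* Compact stand-in for unpackRLE: produces size bytes, writing every
   step bytes, and never reads at or beyond end. Returns 1 on success. */
static int unpack_span( const signed char **psrc, const signed char *end,
   unsigned char *dst, int size, int step )
{
   const signed char *src = *psrc;
   size_t pos = 0;
   int n, c;

   while ( size > 0 ) {
      if ( src >= end ) return 0;
      n = *src++;
      if ( n >= 0 ) {                       /* literal run of n + 1 bytes */
         n += 1;
         if ( n > size || end - src < n ) return 0;
         size -= n;
         while ( n-- ) { dst[ pos ] = ( unsigned char ) *src++; pos += step; }
      }
      else {                                /* replicate run of -n + 1 bytes */
         n = -n + 1;
         if ( n > size || src >= end ) return 0;
         size -= n;
         c = *src++;
         while ( n-- ) { dst[ pos ] = ( unsigned char ) c; pos += step; }
      }
   }
   *psrc = src;
   return 1;
}

/* compression: 1 = HorizontalRLE, 3 = VerticalRLE.
   dst must hold rowbytes * height bytes. */
int unpack_rle_layer( const void *packed, size_t packedsize,
   unsigned char *dst, int rowbytes, int height, int compression )
{
   const signed char *src = packed, *end = src + packedsize;
   int i;

   if ( compression == 1 ) {                /* one span per row */
      for ( i = 0; i < height; i++ )
         if ( !unpack_span( &src, end, dst + ( size_t ) i * rowbytes,
                 rowbytes, 1 )) return 0;
   }
   else if ( compression == 3 ) {           /* one span per byte column */
      for ( i = 0; i < rowbytes; i++ )
         if ( !unpack_span( &src, end, dst + i, height, rowbytes ))
            return 0;
   }
   else return 0;
   return 1;
}
```

Note that for vertical RLE the columns are columns of bytes, not of pixels, so a layer with 2 bytes per value and a width of 640 has 1280 columns.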

The packRLE function reads uncompressed bytes from the source buffer and writes encoded bytes to the destination. It returns the number of bytes written to the destination (the packed size of the source bytes).

   #define DUMP    0
   #define RUN     1
   #define MINRUN  3
   #define MAXRUN  128
   #define MAXDUMP 128

   int packRLE( char *src, char *dst, int size, int step )
   {
      char c, lastc;
      int
         mode = DUMP,
         rstart = 0,
         putsize = 0,
         sp = 1,
         i;

      lastc = *src;
      size--;

      while ( size > 0 ) {
         c = *( src + sp * step );
         sp++;
         size--;

         switch ( mode ) {
            case DUMP:
               if ( sp > MAXDUMP ) {
                  *dst++ = sp - 2;
                  for ( i = 0; i < sp - 1; i++ )
                     *dst++ = *( src + i * step );
                  putsize += sp;
                  src += ( sp - 1 ) * step;
                  sp = 1;
                  rstart = 0;
                  break;
               }

               if ( c == lastc ) {
                  if (( sp - rstart ) >= MINRUN ) {
                     if ( rstart > 0 ) {
                        *dst++ = rstart - 1;
                        for ( i = 0; i < rstart; i++ )
                           *dst++ = *( src + i * step );
                        putsize += rstart + 1;
                     }
                     mode = RUN;
                  }
                  else if ( rstart == 0 ) mode = RUN;
               }
               else rstart = sp - 1;
               break;

            case RUN:
               if (( c != lastc ) || ( sp - rstart > MAXRUN )) {
                  *dst++ = rstart + 2 - sp;
                  *dst++ = lastc;
                  putsize += 2;
                  src += ( sp - 1 ) * step;
                  sp = 1;
                  rstart = 0;
                  mode = DUMP;
               }
         }
         lastc = c;
      }

      switch ( mode ) {
         case DUMP:
            *dst++ = sp - 1;
            for ( i = 0; i < sp; i++ )
               *dst++ = *( src + i * step );
            putsize += sp + 1;
            break;

         case RUN:
            *dst++ = rstart + 1 - sp;
            *dst   = lastc;
            putsize += 2;
      }

      return putsize;
   }

The unpackDelta function decodes delta-compressed data. After skipping to a part of the layer containing changes, it calls unpackRLE.

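   int unpackDelta( char *src, char *dst, int size, int vstep,
      int hstep )
   {
      int n, nn;

      /* size is decremented once per control byte read here; bytes
         consumed inside unpackRLE are not subtracted from it */
      while ( size > 0 ) {
         /* control bytes are signed, even where plain char is unsigned */
         n = ( signed char ) *src++;
         --size;

         if ( n < 0 )
            dst += -n * vstep;           /* skip -n vstep units */
         else {
            for ( ; n >= 0; n-- ) {      /* n + 1 ops follow */
               nn = ( signed char ) *src++;
               --size;
               if ( nn < 0 )
                  nn = -nn;              /* skip -nn hstep units */
               else {
                  ++nn;                  /* RLE span of nn + 1 bytes */
                  if ( !unpackRLE( &src, dst, nn, hstep ))
                     return FALSE;
               }
               dst += nn * hstep;
            }
         }
      }

      return TRUE;
   }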