The visualization process begins with reading data in discrete form.
Transformations such as thresholding, re-scaling, normalization or clipping are then applied to the whole data set or to parts of it, converting the data in various ways to geometrical objects in a virtual scene.
The virtual scene is further transformed to an image through a pixelization-rasterization process. Operations such as depth sorting or hidden surface removal, clipping and coordinate transformations are involved in the rasterization process.
Both composing the scene and producing the output screen image can be very compute intensive. Composition of the scene is a general process that depends on the data, the user interactions and so on, and it is difficult to design a dedicated processor for that purpose alone. That job is mostly done on the CPU. The rasterization process, in contrast, is very specialized: the same operations are repeated a huge number of times, and they can be done very efficiently on dedicated graphics hardware.
Some technical terms:
Just as color can be assigned to a geometrical object, transparency or opacity can be assigned to it as well. Opacity and transparency are complements in the sense that high opacity implies low transparency. The opacity a is a normalized quantity in the range [0,1] or, in discrete form as with 8-bit hardware, an integer in the range [0,255]. It is related to the transparency t by the expression:
a = 1 - t
If an object is fully opaque (a = 1), the objects and light behind it are shielded and not visible. If at the same time the object has a nonzero color value, it is "emitting" light, so it is visible. On the other hand, if a < 1, the object is partly transparent and objects behind it are visible. If a = 0, the object is invisible whatever its colors are.
Normally we work with (r,g,b,a) tables, or preferably with (h,s,v,a) tables. On some graphics hardware the color tables are implemented in hardware, which implies an instant color-opacity update of the rendered scene after the tables are altered. On other hardware the scene must be completely re-rendered before the changes take effect, which takes some time. That process can be CPU intensive, depending on the rendering algorithm selected.
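The table lookup described above can be sketched in a few lines. This is only an illustration: the function names, the 256-entry table size and the particular hue and opacity ramps are assumptions made here, not part of any system mentioned in the text. The point is that the data values stay untouched and only the table changes when colors or opacities are edited.

```python
import colorsys

def build_hsva_table(n=256):
    """Hypothetical (h, s, v, a) table: hue sweeps from blue to red,
    opacity ramps linearly from 0 to 1."""
    table = []
    for i in range(n):
        x = i / (n - 1)
        hue = (2.0 / 3.0) * (1.0 - x)   # 2/3 is blue, 0 is red
        table.append((hue, 1.0, 1.0, x))
    return table

def to_rgba8(hsva_table):
    """Convert an (h, s, v, a) table to 8-bit (r, g, b, a) entries."""
    out = []
    for h, s, v, a in hsva_table:
        r, g, b = colorsys.hsv_to_rgb(h, s, v)
        out.append(tuple(round(255 * c) for c in (r, g, b, a)))
    return out

def classify(data, rgba_table):
    """Map 8-bit data values to colors by table lookup."""
    return [rgba_table[d] for d in data]

rgba_table = to_rgba8(build_hsva_table())
colors = classify([0, 255], rgba_table)
```

Editing the table and calling `classify` again mimics the software path; with hardware tables the same change would show up without re-rendering.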
Figure: A transparent isosurface of the pressure field surrounding a delta wing, rendered in pink. The color editor and material editor widgets are also shown (system: IRIS Explorer). The light setting for the isosurface is given by the following values: ambient light is 1/3, diffuse light is 1, specular reflection is 1/2 and the emissivity is close to 0. The shininess is 1/3 and the transparency is 1/3. The color of the ambient, specular and emissive light is white, while the saturation and value of the diffuse light are 1/2 and 1.
Exercise: Use IRIS Explorer or Inventor to check out the effects of varying the light settings.
As we have seen in the previous chapter, there is a rendering model called ray tracing. The color and opacity algorithms used in ray tracing are discussed here. Imagine a colored background light coming in from the right with color (Rb,Gb,Bb), penetrating a surface with color and opacity (Rs,Gs,Bs,as). Integrating from the left, the emerging "light" is then given by:
For more than one surface, the above formula can be applied recursively, which gives explicitly for a three-surface scene:
This expression does not commute, in the sense that swapping the planes gives a different result. It is therefore important that the depth sorting is done in the correct order by software and hardware. Best results are obtained when sorting the objects from front to back.
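A minimal sketch of the recursive blending, assuming the common rule C = as*Cs + (1 - as)*Cb for a surface with color Cs and opacity as in front of light of color Cb. The rule is the standard "over" blend and is an assumption here; the function names `over` and `composite` are illustrative only. Surfaces are listed in viewing order and the rule is applied starting from the one closest to the background.

```python
def over(surface_rgba, behind_rgb):
    """Blend one surface over the light arriving from behind it."""
    r, g, b, a = surface_rgba
    return tuple(a * c + (1.0 - a) * cb
                 for c, cb in zip((r, g, b), behind_rgb))

def composite(surfaces_front_to_back, background_rgb):
    """Apply the blend recursively, starting at the background."""
    color = background_rgb
    for surface in reversed(surfaces_front_to_back):
        color = over(surface, color)
    return color

red = (1.0, 0.0, 0.0, 0.5)
green = (0.0, 1.0, 0.0, 0.5)
black = (0.0, 0.0, 0.0)

print(composite([red, green], black))   # (0.5, 0.25, 0.0)
print(composite([green, red], black))   # (0.25, 0.5, 0.0)
```

Swapping the two half-transparent surfaces changes the result, which is the non-commutativity noted above.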
Several algorithms can be used for hidden surface removal. Ray tracing uses a front-to-back sorting for each ray. Another method, the painter's algorithm, sorts the polygons from back to front. These algorithms are slow and they can introduce errors. Another technique, called Z (depth) buffering, uses dedicated hardware for hidden surface removal. Z-buffering can be done in software, but with hardware support it is very efficient. The depth comparison is done during the scan conversion/rasterization process.
The Z-buffer gives high-quality hidden surface removal during rendering without sorting the polygons. It takes advantage of the depth value along the direction of projection. Before a pixel is drawn, its z-value is compared against the z-value currently stored for that pixel position. If the new pixel is in front of the current one, it is drawn using its own color value, and the stored z-value is updated. Otherwise the current pixel remains and the new pixel values are ignored. The largest value that can be stored in the z-buffer represents the z of the front clipping plane. The z-buffer is usually initialized to zero, the z-value of the back clipping plane, at the same time as the frame buffer is initialized to the background color. Polygons are scan-converted into the frame buffer in arbitrary order. During the scan conversion process, if the polygon point being scan-converted is no farther from the viewer than the point whose color and depth are currently in the buffers, then the new point's color and depth replace the old values. If the new point is farther away, nothing happens. For more information see Foley et al.
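The depth test described above can be sketched as follows. The sketch follows the convention of the text (zero is the back clipping plane, larger z is closer to the viewer); the function names and the use of axis-aligned rectangles in place of real scan-converted polygons are simplifications assumed here.

```python
def make_buffers(width, height, background):
    """Frame buffer set to the background color, z-buffer set to 0
    (the z-value of the back clipping plane)."""
    frame = [[background] * width for _ in range(height)]
    zbuf = [[0] * width for _ in range(height)]
    return frame, zbuf

def plot(frame, zbuf, x, y, z, color):
    """Write the point only if it is no farther away than the stored one."""
    if z >= zbuf[y][x]:
        frame[y][x] = color
        zbuf[y][x] = z
        return True
    return False

def fill_rect(frame, zbuf, x0, y0, x1, y1, z, color):
    """Stand-in for scan conversion: a constant-depth rectangle."""
    for y in range(y0, y1):
        for x in range(x0, x1):
            plot(frame, zbuf, x, y, z, color)

frame, zbuf = make_buffers(4, 4, "bg")
fill_rect(frame, zbuf, 1, 1, 4, 4, 2, "blue")   # far rectangle
fill_rect(frame, zbuf, 0, 0, 2, 2, 5, "red")    # near rectangle
```

Drawing the two rectangles in the opposite order leaves the same image, which is why no polygon sorting is needed.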
The Z-buffer requires a large memory to store the z-value for each pixel. Depth range and resolution depend on the hardware configuration and are limited by the number of bits available in the dedicated z-buffer memory. Most systems use 24 or 32 bits. The SGI InfiniteReality offers a 23-bit floating-point z-buffer whose resolution varies with distance from the eye point, so that the available depth resolution is used in an optimal way.
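To give a feeling for how the number of bits limits depth resolution, here is a small sketch that assumes a uniform fixed-point depth buffer between two clipping planes. That uniform layout is an assumption made for simplicity; real hardware, including the floating-point buffer mentioned above, distributes the resolution differently. The function names are hypothetical.

```python
def depth_step(near, far, bits):
    """Smallest depth difference a uniform buffer of `bits` bits can resolve."""
    return (far - near) / (2 ** bits - 1)

def quantize_depth(z, near, far, bits):
    """Integer depth code stored for a depth z between near and far."""
    levels = 2 ** bits - 1
    t = (z - near) / (far - near)
    return round(t * levels)

# Two surfaces 0.3 units apart in a depth range of 255 units:
same_8 = quantize_depth(10.0, 0.0, 255.0, 8) == quantize_depth(10.3, 0.0, 255.0, 8)
same_24 = quantize_depth(10.0, 0.0, 255.0, 24) == quantize_depth(10.3, 0.0, 255.0, 24)
```

With 8 bits the two surfaces receive the same depth code and cannot be ordered; with 24 bits they are separated.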
For good rendering, z-buffering must be combined with anti-aliasing.
Figure: The z-buffer. A pixel's shade is shown by its color, its z value is shown as a number. The z-buffer is initialized to zero, representing the z-value at the back clipping plane, and the frame buffer is initialized to the background color.
A drawback of raster display systems arises from the nature of the raster itself. The raster system can display mathematically smooth lines, polygons, and boundaries of curved primitives such as circles and ellipses only by approximating them with pixels on the raster grid. This can cause the problem of "jaggies" or "stair-casing" as shown in the figure below. This visual artifact is a manifestation of a sampling error called aliasing. Such artifacts occur when a function of a continuous variable that contains sharp changes in intensity is approximated with discrete samples. The techniques for reducing or removing "jaggies" or "stair-casing" are referred to as anti-aliasing, and they are a central concern of modern computer graphics.
Figure: Aliased lines caused by the finite pixel size (left) and anti-aliased lines (right). Notice that the anti-aliasing algorithm produces a somewhat "blurred" image.
Several algorithms have been developed for anti-aliasing. The most direct approach is to increase the resolution. Doubling the resolution in the horizontal and vertical directions makes the jags half as large and doubles their number. They will look smoother, but this quadruples the use of graphics memory, which is expensive. There are other and cheaper ways to handle the problem. The simplest is called unweighted area sampling. A line is infinitely thin, but it can be approximated by a one-pixel-wide quadrilateral. The pixel values are then given by the coverage of the quadrilateral: for each pixel, the displayed value is the line's full value A multiplied by the fraction of the pixel area that the quadrilateral covers.
Weighted area sampling is a slightly better anti-aliasing algorithm. Here a small area close to the pixel center has greater influence than one at a greater distance, in contrast to unweighted area sampling, where the only contribution is the degree of coverage. For further reading see Foley et al.
Figure: Intensity proportional to area covered, that is unweighted area sampling.
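Both sampling schemes can be approximated numerically by testing a grid of subsample points inside each pixel. In the sketch below the line is treated as a one-pixel-wide band around its center line. The 8x8 subsample grid, the cone-shaped weighting function and all names are assumptions chosen for illustration; Foley et al. describe the exact filters.

```python
import math

def distance_to_line(x, y, x0, y0, x1, y1):
    """Distance from (x, y) to the infinite line through two points."""
    dx, dy = x1 - x0, y1 - y0
    return abs(dy * (x - x0) - dx * (y - y0)) / math.hypot(dx, dy)

def pixel_intensity(px, py, line, n=8, weighted=False):
    """Fraction of full intensity for the pixel [px, px+1] x [py, py+1].

    Unweighted: fraction of subsamples inside the one-pixel-wide band.
    Weighted: the same, but subsamples near the pixel center count more.
    """
    covered = 0.0
    total = 0.0
    for j in range(n):
        for i in range(n):
            sx = px + (i + 0.5) / n
            sy = py + (j + 0.5) / n
            if weighted:
                d = math.hypot(sx - (px + 0.5), sy - (py + 0.5))
                w = max(0.0, 1.0 - d)    # cone filter of radius 1 pixel
            else:
                w = 1.0
            total += w
            if distance_to_line(sx, sy, *line) <= 0.5:
                covered += w
    return covered / total

line = (0.0, 1.25, 10.0, 1.25)           # horizontal line at y = 1.25
unweighted = pixel_intensity(3, 0, line)  # band covers the top quarter: 0.25
weighted = pixel_intensity(3, 0, line, weighted=True)
```

For this pixel the covered area lies far from the pixel center, so the weighted value comes out smaller than the unweighted 0.25.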
Anti-aliasing is a compute-intensive operation. To obtain anti-aliasing in real time over the entire computer screen, dedicated hardware is required. In the high-performance anti-aliasing hardware of the SGI InfiniteReality, memory and processors are put in the "Raster Manager" subsystem to give multi-sampled anti-aliasing at the highest available image quality without the usual performance penalties. This anti-aliasing technique needs no data sorting and works with the Z-buffer for superior hidden surface removal. In this implementation of multi-sampling, subpixel information is retained for all vertices in the pipeline as they are transformed and converted to raster primitives. Each primitive is computed with 8x8 subpixel accuracy. Thus, there are 64 possible subpixel locations within each pixel rendered. When deciding how to color a pixel, the raster system samples each primitive at four, eight or sixteen of the subpixel locations in each pixel it touches. It then calculates color and depth information for the subpixels covered. This information is sent to the frame buffer, where Z-buffering is performed for each subpixel, and a final color for each pixel is formed from the accumulated data from the set of subpixels within it. In multi-sample anti-aliasing mode, Z-information is kept for every subsample, giving hidden-surface accuracy down to the subpixel level. When transparency is used, the alpha value determines how much to blend the color of the closer pixel with the color of the farther pixel at the same location in the subpixel grid.
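The per-subsample depth test and the final blend can be sketched as below. This is a simplified software model and not the InfiniteReality implementation: the class name, the choice of four samples and the plain average used to form the pixel color are assumptions. Depth follows the earlier convention that larger z is closer to the viewer.

```python
class MultisamplePixel:
    """One display pixel holding a color and a depth per subsample."""

    def __init__(self, n_samples, background):
        self.colors = [background] * n_samples
        self.depths = [0] * n_samples      # 0 = back clipping plane

    def add_fragment(self, covered, depth, color):
        """Depth-test a primitive at each subsample index it covers."""
        for s in covered:
            if depth >= self.depths[s]:
                self.depths[s] = depth
                self.colors[s] = color

    def resolve(self):
        """Average the subsample colors into the final pixel color."""
        n = len(self.colors)
        return tuple(sum(c[k] for c in self.colors) / n for k in range(3))

pixel = MultisamplePixel(4, (0.0, 0.0, 0.0))
pixel.add_fragment([0, 1], 5, (1.0, 0.0, 0.0))      # near red edge
pixel.add_fragment([1, 2, 3], 3, (0.0, 0.0, 1.0))   # farther blue surface
print(pixel.resolve())   # (0.5, 0.0, 0.5)
```

Subsample 1 is covered by both primitives and keeps the nearer red one, so hidden surface removal is resolved below the pixel level before the colors are blended.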
Figure: A texture/image mapped onto a cylinder.
In texture/image mapping algorithms, software and hardware are designed to display complex, texture-mapped scenes at real-time frame rates (60 Hz or more). This requires a data channel of very high capacity. "Texture hardware" has a local memory for storage of texture data; the local memory size is typically 16 MByte or more. The rate at which textures can be loaded from the texture memory into the frame buffer can be as high as 240 MByte/second, and it is slightly lower between main memory and the graphics pipeline. A 3-dimensional texture hardware feature is used to render 3D voxel data sets in volume visualization. During the design phase of the SGI Reality pipeline hardware, the engineers recognized that there was some leftover area on some of the silicon chips. They then decided to implement in hardware a possibility for addressing texture memory as a 3D array. With 2D textures, stacks of 2D textures must be stored and sampled in three different directions. 3D texture addressing, in contrast, uses only a single texture data set and saves a factor of 3 in texture memory. In this sense 3D textures are three times as efficient as 2D textures when used in volume rendering.
Texture mapping hardware/software is designed to display complex scenes at real-time rates (60 Hz). Hardware is designed to download textures from main memory through the graphics pipeline "on the fly". Typical bandwidth can be as high as 240 MByte/s from host memory and slightly higher from frame buffer memory. Textures larger than the texture memory can be stored in main memory and swapped in and out of texture memory as they are needed. It is also possible to do off-screen rendering. Off-screen data might serve as a cache for frequently accessed images or textures.
Textures are partitioned into texture elements called texels with depths of 16, 32 or 48 bits. Each size can represent several formats. Texture memory is presently a very expensive resource due to the high memory bandwidth required, and it has to be used efficiently. In volume/voxel visualization, textures are often represented with a depth of 16 bits to save memory, packed as 8 bits of color and 8 bits of opacity. A typical texture memory of 16 MByte has the capacity to store 256x128x128 8-bit voxels. A 64 MByte texture memory can store 256^3 8-bit voxels. The computer industry is currently talking about building texture systems with a capacity of 512 MByte to one gigabyte.
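The memory figures can be checked with a short calculation. The helper below is hypothetical and simply multiplies the number of voxels by the texel depth. Under the assumption that each voxel occupies a 32-bit texel (for instance four 8-bit channels), the quoted capacities come out exactly; that reading is an inference from the numbers, not something stated above.

```python
MIB = 1024 * 1024

def texture_bytes(dims, bits_per_texel):
    """Storage needed for a texture with the given dimensions."""
    n = 1
    for d in dims:
        n *= d
    return n * bits_per_texel // 8

print(texture_bytes((256, 128, 128), 32) // MIB)   # 16
print(texture_bytes((256, 256, 256), 32) // MIB)   # 64
print(texture_bytes((256, 128, 128), 16) // MIB)   # 8
```

Packing color and opacity into a 16-bit texel, as described above, halves the footprint compared with 32 bits.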
Versions of a single image at different resolutions can be sampled and filtered, stored, and loaded as they are needed, freeing up texture memory when they are not used. This is called MIPmapping. It is useful for zooming operations, and for swapping textures during rotation, translation, zooming or clipping in volume rendering.
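A minimal sketch of how such a set of pre-filtered textures might be generated, assuming a square grayscale image whose side is a power of two and a simple 2x2 box filter as the low-pass filter. Both choices and the function names are assumptions; texture hardware may use other filters.

```python
def downsample(image):
    """Halve the resolution by averaging each 2x2 block of texels."""
    n = len(image) // 2
    return [[(image[2 * y][2 * x] + image[2 * y][2 * x + 1] +
              image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(n)]
            for y in range(n)]

def build_mipmaps(image):
    """Return the list of levels, from full resolution down to 1x1."""
    levels = [image]
    while len(levels[-1]) > 1:
        levels.append(downsample(levels[-1]))
    return levels

image = [[0, 4, 8, 12],
         [0, 4, 8, 12],
         [16, 16, 0, 0],
         [16, 16, 0, 0]]
levels = build_mipmaps(image)   # sizes 4x4, 2x2 and 1x1
```

When the textured object is zoomed out, a coarser level can be used in place of the full-resolution image.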
Texture memory is divided into two banks for performance reasons. Voxel-sets have to be "partitioned" into groups with a size of 2^n and then superposed.
Operations like linear interpolation, bilinear interpolation, trilinear and even quadri-linear interpolation can be performed by the texture hardware at real-time rates. There is a variety of operations that can be applied to texels.
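Trilinear interpolation, as performed by the texture hardware, can be written in software as three nested linear interpolations. The sketch assumes a volume stored as nested lists indexed `volume[z][y][x]` with at least two samples along each axis, and sample coordinates given in voxel units; these conventions and the function names are illustrative.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t

def trilinear(volume, x, y, z):
    """Interpolate volume[z][y][x] at a non-integer position."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    # Lower corner of the enclosing cell, clamped so that the upper
    # corner stays inside the grid.
    x0 = min(int(x), nx - 2)
    y0 = min(int(y), ny - 2)
    z0 = min(int(z), nz - 2)
    fx, fy, fz = x - x0, y - y0, z - z0
    # Interpolate along x on the four cell edges,
    c00 = lerp(volume[z0][y0][x0], volume[z0][y0][x0 + 1], fx)
    c10 = lerp(volume[z0][y0 + 1][x0], volume[z0][y0 + 1][x0 + 1], fx)
    c01 = lerp(volume[z0 + 1][y0][x0], volume[z0 + 1][y0][x0 + 1], fx)
    c11 = lerp(volume[z0 + 1][y0 + 1][x0], volume[z0 + 1][y0 + 1][x0 + 1], fx)
    # then along y, then along z.
    c0 = lerp(c00, c10, fy)
    c1 = lerp(c01, c11, fy)
    return lerp(c0, c1, fz)

# 2x2x2 test volume with value x + 2*y + 4*z at each corner
volume = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
print(trilinear(volume, 0.5, 0.5, 0.5))   # 3.5
```

Bilinear interpolation is the same procedure without the last step, and quadrilinear interpolation adds one more interpolation, for example between two MIPmap levels.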
Texture/image mapping is a hardware-dependent, hardware-accelerated feature, used to introduce detail without heavy use of CPU power. Some graphics hardware supports 2D textures, and the more powerful systems support 3D textures. For 2D textures, a 2D pixel array of data (an image) expressing a given pattern is defined at coordinates (s,t): 0 < s < 1, 0 < t < 1, and can be mapped to a surface. For 3D textures, 3D scalar data given in a Cartesian coordinate system can be mapped through a 3D mapping f(s,t,u) into a volume. There are several good reasons for doing texture mapping; the most important ones are perhaps: 1) To utilize dedicated texture hardware, where the local texture memory acts as a graphics data cache. This has the potential to increase rendering speed dramatically. Applied to volume visualization, we have seen increases in rendering speed by a factor of 10^5. 2) Compared to the treatment of non-textured polygons, where a large set of vertex data has to be considered, texture mapping offers a considerable data reduction by storing an image in dedicated texture memory and mapping it to simple polygons like a rectangle, a square or a polygon surface. This gives a great reduction in the number of vertices and gives the same realism when it comes to rendering. 3) Additional functionality, like tri-linear interpolation in texture space, is widely used in volume visualization. Re-sampling and filtering of textures, a technique called MIPmapping, is also feasible with the texture hardware. A set of low-pass filtered textures with varying resolution can be stored in texture memory for instant access, to be displayed when needed in the scene. Also, for some hardware, color and alpha tables are connected to the raster hardware, so textures need not be regenerated if color and/or opacity tables are changed.
The upper figure shows a 2D texture map. Here a picture is mapped to a rectangle. Expressing the same pattern as multiple triangles would require knowledge of all triangle vertices in addition to the (R,G,B,alpha) values at the vertex points. With texture mapping, the image is passed as a bitmap to texture memory and the multiple rectangles or triangles are reduced to a single rectangle defined through its 4 vertices. The bitmap image is mapped to that rectangle, giving the same image quality. In 3D texture mapping the data reduction is even greater: a series of bitmap pictures is stored in texture memory and only the 8 vertices of the entire volume are given.
The effect of tri-linear interpolation is shown in the figure below. This is an example of the volume rendering of a set of vortex cores in turbulence from a numerical simulation. The interpolation clearly increases the render quality dramatically. In the numerical model, the data shown are expressed through smooth basis functions, so the smooth picture resulting from the interpolation is actually better than the un-interpolated picture. It gives a better representation of the data. Using 3D texture hardware, the interpolated scene is rendered at the same speed as the non-interpolated case, so there is no "time penalty". Compare the un-interpolated rendering with the interpolated one shown to the right below.
There is a difference between 2D and 3D texture hardware as used in volume visualization. With 2D texture hardware, three texture data sets have to be stored, one for each coordinate direction. With 3D texture hardware only one set is stored, reducing the texture storage requirement by a factor of three. In the figures below, a zoomed view of the texture planes is shown.
The insertion of more planes gives a smoother rendering (in this case 8 quads are inserted per voxel).
Real-time graphics operations on complex scenes involve a huge number of simple arithmetic operations. In general, supercomputer capacity is needed to keep up with the high data flow rate and the huge number of computations. In many cases, even the fastest microprocessor is far from fast enough for real-time rendering. It has therefore been necessary to custom-build the graphics hardware to reach the required bandwidth and arithmetic capacity. This is possible because computer graphics operations consist of a limited set of relatively simple arithmetic operations at moderate numerical accuracy. This limited set of operations is repeated many times and in arbitrary order, which makes it well suited for an architecture consisting of multiple parallel pipelines.
In the most common systems, the graphics pipeline is divided into 3 parts: 1) the geometry subsystem, 2) the raster subsystem and 3) the display subsystem. Different vendors use different designs, but the three parts mentioned can be found in most of the more powerful systems. This is also due to the fact that the software library OpenGL (Open Graphics Library), used by most vendors, has functions that are closely connected to the hardware.
In the SGI InfiniteReality pipeline, the geometry subsystem is custom-designed with four geometry engine processors that can execute geometrical operations in parallel, employed in multiple instruction, multiple data (MIMD) fashion. Each geometry engine processor contains three separate floating-point cores that execute in single instruction, multiple data (SIMD) fashion.
Geometry engine processors do both geometry (per-vertex) processing and pixel processing. Geometry processing includes vertex transformations, lighting, clipping, and projection to screen space. Gouraud shading is applied. Pixel processing includes many common image processing operators, such as convolution, histograms, scale and bias, and lookup tables. Pixel operations can be applied to standard pixels, textures, or video.
The raster subsystem has custom VLSI processors and memory. Triangle, point, and line data received by the Raster Manager (RM) from the geometry subsystem must be scan-converted into pixel data and then processed into the frame buffer before a finished rendering is handed to the display subsystem for generation of video signals.
By using extensive parallelism in most stages, the raster subsystem performs anti-aliasing, texture mapping, and image processing in real time. Besides basic anti-aliasing for graphics primitives, the RM system also supports full anti-aliasing by multi-sampling, called multi-sampled anti-aliasing. With multi-sampling, the images are rendered into a frame buffer with a higher effective resolution than that of the displayable frame buffer. For a fully configured system (4 RM boards), each pixel has 8 subsamples, positioned at 8 of the 8x8 possible sub-pixel locations, so the effective resolution is 8 times the displayable resolution. The anti-aliasing technique works with the Z-buffer for each of the 8 subsamples, for superior hidden surface removal. When the image has been rendered into the 8 subsamples, all samples that belong to each pixel are blended together to determine the pixel colors.
Image memory is interleaved among parallel processors so that adjacent pixels are always handled by different processors. Thus, each polygon can have multiple processors operating on it in parallel.
For each pixel, scan conversion produces texture coordinate data, which are sent to a dedicated texture processing unit. This hardware performs perspective correction on texture coordinates and determines the level of MIPmapping, producing addresses for the relevant area of texture memory. In MIPmapping the textures are low-pass pre-filtered, and each pre-filtered texture is stored in texture memory for later access. The appropriate level of the MIPmapped texture is then applied to a polygon as required by the scene. Texels (basic texture elements) are stored in a format that depends on the number and precision of the color and transparency components. Texture formats of 16, 32 and 48 bits are available. Texture color lookup tables are supported, enabling hardware support for applications involving dynamic transfer functions. Useful features for volume visualization are the hardware implementation of cut planes, which can be used to slice the data in real time, and the tri-linear interpolation.
Operations performed by the raster subsystem:
The display subsystem takes rendered images from the digital frame buffer of the raster subsystem and processes the pixels through digital-to-analog converters to generate an analog pixel stream suitable for display on high-resolution RGB video monitors. The system supports resolutions from VGA (640x480) up to the 1920x1200 format. A special 72 Hz non-interlaced version of this format is provided for flicker-free viewing. NTSC and PAL video formats, as well as composite, S-video and professional digital video formats, are also supported.
Frame-buffer pixel memory contains up to 1024 bits per pixel, holding colors, alpha, z-buffer, overlays, textures and so on. The whole frame buffer is double buffered, which enables rendering into a background buffer and making that buffer visible once the rendering is finished. The Raster Manager boards have 80 MByte of frame buffer memory each. A single InfiniteReality pipe can have up to 4 RM boards. In addition, the RM boards typically contain 64 MByte of texture memory.
The native graphics programming interface for several graphics platforms is OpenGL, an environment for developing high-performance 2D graphics, 3D graphics, imaging, and video applications. It is designed to be independent of operating systems and window systems, and it is supported on virtually all workstations and personal computers available. If you are interested in OpenGL, see the OpenGL web page. There is a freely available open-source implementation of the OpenGL API called Mesa.
OpenGL provides access to a rich set of graphics capabilities, including transformations, lighting, atmospheric effects, texturing, anti-aliasing, hidden surface removal, blending and pixel processing, among others.
Below follow two simple OpenGL examples. The first is in "immediate mode"; the second uses a display list:
The architecture of modern graphics hardware is rather similar to that of the InfiniteReality hardware described above. There are exceptions, and we are going to comment on some of them. The InfiniteReality has four geometry processors, which in the modern graphics processor (GPU) have been replaced with 12-16 pipes capable of handling 32-bit floating-point numbers. The GPUs are designed and produced mainly by the companies NVIDIA and ATI. The evolution of modern GPUs has been amazing. Already in 2003 the "fourth" generation GPU was on the market, while the "first" generation appeared in late 1999. An important advantage of modern GPUs is that they can be programmed through languages like OpenGL shaders or C for Graphics, called Cg, which is presented in the book "The Cg Tutorial". The vertex processor and the fragment processor are user programmable. A fragment is to be viewed as a generalization of a pixel. A block diagram of the programmable graphics pipeline is shown below:
The GPU with its 32-bit floating-point arithmetic can also be used for scientific computations. SINTEF is currently developing PDE solvers utilizing GPUs, see PDEsolver using GPUs. If you are interested in details on GPUs and GPU programming, see GPGPU.
Open Inventor is a high-level, object-oriented 3D toolkit built on top of OpenGL. It is a library of objects and methods used to create interactive 3D graphics applications, and it is written in C++. For C programmers, C bindings are available. Open Inventor is a set of building blocks that enables you to write programs that take advantage of graphics hardware features with minimal programming effort. The toolkit provides a library of objects that you can use, modify, and extend to meet your needs. In addition to simplifying application development, Inventor facilitates moving data between applications with its 3D interchange file format. The interested reader should see the book by J. Wernecke, The Inventor Mentor. The figure below shows the Inventor architecture.
The Inventor tool-box is a library of 3D objects and methods and contains
Below is a simple Inventor code example that composes a red cone, renders it and writes the Inventor cone data to a file named "myCone.iv". The red cone is shown below, and the right panel shows the resulting Inventor exchange code (ASCII format) for the cone, that is, the content of the file myCone.iv.
Below is an example where the cone data are rendered by the Inventor program called SceneViewer, which is an interpreter/renderer for Inventor data. Different widgets for manipulating light, color and material properties are shown, as well as a "handle box".
The final Inventor code example is on texture mapping. This "complex" operation is handled by an Inventor program with fewer than 20 lines of code, which demonstrates the efficiency of Inventor.
J. D. Foley, A. van Dam, S. K. Feiner and J. F. Hughes. Computer Graphics, Principles and Practice. Second edition. Addison-Wesley, 1990.
J. Wernecke. The Inventor Mentor, Programming Object-Oriented 3D Graphics with Open Inventor, Release 2. Addison-Wesley, 1994. ISBN 0-201-62495-8.
R. Fernando and M. J. Kilgard. The Cg Tutorial, The Definitive Guide to Programmable Real-Time Graphics. Addison-Wesley, 2003. ISBN 0-321-19496-9.
SINTEF's work on GPU implementation of PDE solvers.
GPGPU, a tutorial on GPU programming given at Visualization 2005.