The Graphics Pipeline

Jim Blinn's book [1] provides a vivid account of the graphics pipeline; in particular, read chapters 13 to 18. Each of these chapters was originally published in IEEE Computer Graphics and Applications.

The pipeline can be interpreted as a series of coordinate spaces where the points defining objects exist. There is little agreement on the names of these spaces, and, more importantly, architectures and algorithms may eliminate or add spaces, or change when and how certain steps are performed. We will distinguish the following spaces:

Master Coordinates (MC)
is the space where individual objects are defined. It is usually three dimensional Euclidean space with a right-handed orientation.
World Coordinates (WC)
is a space where we seldom want to stop, but one that is conceptually useful as a collector and arranger of individual objects from their master coordinates.
View Coordinates (VC)
is where the recorder is at the origin looking down the positive z axis. Now we're in a left-handed world.
Perspective space (PS)
a weird place where 0 is at infinity, and points are punctured lines.
Clip space (CS)
is simply the unit cube $0\leq x, y, z \leq 1$.
Normalized Coordinates (NC)
is a (sort of) standard space that eliminates distortion when mapping to the device.
Device Coordinates (DC)
are the (x, y) integer coordinates (array indices) of hardware pixels in a viewport allocated to the graphics scene.

Other terms for these spaces are: model and object for master; universe for world (big minds, perhaps); eye or camera for view; normalized device or screen for normalized; and pixel or raster for device.

Now a word about notation. We'll use x, y, z, and w to represent coordinates in some space (if you do not understand what w is, please be patient, or look it up; if you do not understand what x, y, and z are, please speak with your instructor). When it seems necessary to explicitly mention the space that these points are in, we will use a subscript M, W, V, P, C, or D for master, world, view, perspective, clip, or device coordinates. Got that? Most often we are interested in a sequence of points, so they will also be subscripted by integers starting at 0. Points in a space are written as rows: (x, y, z, w).

Always, the map from one space (coordinate system) to the next involves multiplying a point by a matrix, the answer to nearly every question in graphics. All matrices will be denoted by capital math italic letters, usually M, T, S, R, or P, to denote a general matrix, or a translation, scale, rotation, or projection. Almost always a matrix (transformation or map) will have four rows and four columns. Point-matrix multiplication transforms a point in space A into a point in another space B; this is written

$(x_A, y_A, z_A, w_A)\,M = (x_B, y_B, z_B, w_B).$
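
As a concrete sketch (my own illustration, not from the text), here is point-matrix multiplication with row points; the translation matrix T is an assumed example:

```python
# Illustrative sketch: multiply a row point (x, y, z, w) by a 4x4
# matrix, as in the equation above. The matrix T is my own example.

def transform(point, matrix):
    """Multiply a row point (x, y, z, w) by a 4x4 matrix."""
    return tuple(sum(point[i] * matrix[i][j] for i in range(4))
                 for j in range(4))

# With row points, a translation by (2, 3, 4) puts the offsets in
# the bottom row of the matrix.
T = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [2, 3, 4, 1]]

print(transform((1, 1, 1, 1), T))  # -> (3, 4, 5, 1)
```

Note that column-point conventions put the offsets in the last column instead; this sketch follows the row convention used above.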

Now let's go into some of these spaces and see what they are all about. We'll start with where we want to get: device coordinates or pixels.

Device coordinates

In most graphics systems the image is stored in memory, often called a framebuffer, as a collection of integers that specify the color of each pixel on a display device. The framebuffer represents an image of the entire display surface and may also include off-screen regions. There may be two framebuffers to support double buffering, where one buffer is displayed while the other is written to. Other buffers, such as depth, shadow, and accumulation buffers, may exist in some graphics systems.

For concreteness, we will assume a display surface with integer grid $0 \leq x \leq 1279$ and $0\leq y \leq 1023$, dating this discussion to a particular instance in time.
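
A minimal sketch of such a framebuffer, assuming the 1280 by 1024 grid above and a packed 32-bit RGBA pixel (the packing order is an assumption, not from the text):

```python
# Sketch: the framebuffer as a flat array of packed 32-bit RGBA
# pixels, one per device pixel, stored in row-major order.
# The RGBA packing order is an assumption.

WIDTH, HEIGHT = 1280, 1024
framebuffer = [0] * (WIDTH * HEIGHT)   # one integer per pixel

def set_pixel(x, y, r, g, b, a=255):
    """Store an 8-bit-per-channel color at device pixel (x, y)."""
    framebuffer[y * WIDTH + x] = (r << 24) | (g << 16) | (b << 8) | a

set_pixel(10, 20, 255, 0, 0)   # an opaque red pixel
```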

Now let's back up a couple of spaces to clip space. This is the place where we crop off any portion of the world to just what we see through our window on the world.

Clip Space

Clipping to the unit cube $0\leq x, y, z \leq 1$ can be implemented efficiently. Although this is clearly not the only choice for clip space, it is how we will define it.
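
One standard way to see why this is efficient, sketched here against our unit cube, is a 6-bit outcode per point (one bit per face plane), the 3D extension of Cohen-Sutherland outcodes; this gives trivial accept and reject tests for segments. The function names are mine:

```python
# Sketch: 6-bit outcodes for the unit cube 0 <= x, y, z <= 1,
# one bit for each of the six face planes.

def outcode(x, y, z):
    code = 0
    if x < 0: code |= 1
    if x > 1: code |= 2
    if y < 0: code |= 4
    if y > 1: code |= 8
    if z < 0: code |= 16
    if z > 1: code |= 32
    return code

def classify_segment(p, q):
    a, b = outcode(*p), outcode(*q)
    if a | b == 0:
        return "trivially accepted"   # both endpoints inside the cube
    if a & b != 0:
        return "trivially rejected"   # both outside the same face plane
    return "needs clipping"           # must intersect segment with cube
```

Only segments in the third case require the (more expensive) computation of intersections with the cube's faces.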

Normal coordinates

Normal coordinates, more commonly called normalized device coordinates, are a graphics-system-dependent space used to eliminate geometric distortion between our specified scene and what we actually see. This space is reached by a translation and scale that remove any distortion that might occur in the map from our envisioned world to the particular display device. Think of the projection of the world we see onto a rectangular window, which is then mapped, without distortion, to an on-screen viewport.
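
The distortion-free map just described can be sketched as a uniform scale (the smaller of the two axis ratios) followed by centering in the viewport; the names and signature here are my assumptions:

```python
# Sketch: map a point (wx, wy) in a window of size (win_w, win_h)
# into a viewport of size (vp_w, vp_h) without distortion, by
# scaling both axes uniformly and centering the result.

def window_to_viewport(wx, wy, win_w, win_h, vp_w, vp_h):
    s = min(vp_w / win_w, vp_h / win_h)   # uniform scale: no distortion
    ox = (vp_w - s * win_w) / 2           # center horizontally
    oy = (vp_h - s * win_h) / 2           # center vertically
    return (ox + s * wx, oy + s * wy)

# A 2:1 window into our 1280x1024 display scales by 640 and is
# letterboxed vertically.
print(window_to_viewport(1.0, 0.5, 2.0, 1.0, 1280, 1024))  # -> (640.0, 512.0)
```

Because one scale factor is used for both axes, circles stay circles; the unused margin of the viewport is simply left empty.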

Now (again dating the discussion) typically, a window system manages the framebuffer and allocates a viewport where our graphics will display. Of course the window system will (usually) allow us to move and resize the viewport as we will. But the point is that we want to record colors in a portion of the framebuffer defined by an offset (xo, yo) and a width and height (w, h), where each of these quantities is an integer that can be translated into a framebuffer memory address.

Problem:

Suppose the framebuffer starts at memory location n and runs through n + 1024*1280 - 1. Find the address of pixel (x, y) within a viewport that starts at offset (xo, yo).

Problem:

Modern graphics monitors support true color: 24 bits of color, 8 for each of red, green and blue, and 8 bits for alpha blending. What amount of memory is needed to support our example framebuffer in a double buffered system? What memory bandwidth is required to display the framebuffer 60 times a second?

Bibliography

1
J. BLINN, Jim Blinn's Corner: a trip down the graphics pipeline, Morgan Kaufmann Publishers, Inc., 1996.
ISBN 1-55860-387-5.



1998-08-31