Math
In the 3D graphics industry, math is perhaps the most important tool that is used to create stunning, photo-realistic still images - or to create realistic, real-time animations. In particular, vector math and matrix math are the two areas which are the most useful. On this page we will discuss not only the math, but the problems which the mathematics is used to solve.
I start with a summary of the equations, then follow that with discussions
about the equations - what they do, how they tie together, and why they are important.
Equation Summary
Standard Math
Problems
Vector Math
Matrix Mat
Projections
Return to top of document
Equation Summary
I hate to wait and I assume you do to. So without further discussion here are all of the
equations that are discussed in the rest of this page.
Translate Scale Perspective
x' = x + dx x' = x * Ax x' = d * y / (d - z)
y' = y + dy y' = y * Sy y' = d * x / (d - z)
z' = z + dz z' = z * Sz z' = 0
Rotate
X-axis Y-axis Z-axis
x' = x x' = z sin(a) + x cos(a) x' = x cos(a) - y sin(a)
y' = y cos(a) - z sin(a) y' = y y' = x sin(a) + y cos(a)
z' = y sin(a) + z cos(a) z' = z cos(a) - x sin(a) z' = z
Dot Product
A·B = <x1, y1, z1> · <x2, y2, z2> = x1x2 + y1y2 + z1z2
A·B = |A| |B| cos(a)
A·B = <x1, y1, z1> |x2|
|y2|
|z2|
Cross Product
A x B = <x1, y1, z1> X <x2, y2, z2> = <y1z2 - z1y2, z1x2 - x1z2, x1y2 - y1x2>
| i j k |
A x B = | x1 y1 z1 | = | y1 z1 |i - | x1 z1 |j + | x1 y1 |k
| x2 y2 z2 | | y2 z2 | | x2 z2 | | x2 y2 |
|AxB| = |A| x |B| x sin(a)
Matrix Addition
A + B = | x1 y1 | + | c1 c2 | = | x1+c1 y1+c2 |
| x2 y2 | | d1 d2 | | x2+d1 y2+d2 |
Matrix Multiplication
| a b c | | r s | | ar + bt + cv as + bu + cw |
| d e f | X | t u | = | dr + et + fv ds + eu + fw |
| v w |
2x2 Determinant
D = | a b | = ad - bc
| c d |
3x3 Determinant
| c1 c2 c3 |
D = | x1 y1 z1 | = c1 | y1 z1 | - c2 | x1 z1 | + c3 | x1 y1 |
| x2 y2 z2 | | y2 z2 | | x2 z2 | | x2 y2 |
Translation Matrix Scaling Matrix Perspective
| 1 0 0 0 | | sz 0 0 0 | | 1 0 0 0 |
| 0 1 0 0 | | 0 sy 0 0 | | 0 1 0 0 |
| 0 0 1 0 | | 0 0 sx 0 | | 0 0 0 -1/d |
| dx dy dz 1 | | 0 0 0 1 | | 0 0 0 1 |
Rotation X-Axis Rotation Y-Axis Rotation Z-Axis
| 1 0 0 0 | | cos(a) 0 -sin(a) 0 | | cos(a) sin(a) 0 0 |
| 0 cos(a) sin(a) 0 | | 0 1 0 0 | | -sin(a) cos(a) 0 0 |
| 0 -sin(a) cos(a) 0 | | sin(a) 0 cos(a) 0 | | 0 0 1 0 |
| 0 0 0 1 | | 0 0 0 1 | | 0 0 0 1 |
Return to top of document
Standard Math
In the world of 3D animation there are three basic movements:
- Translation
- simple movement of an object in specified x, y, and z increments
- Rotation
- rotation of an object about a line, usually the x, y, or z axis
- Scaling
- changing the size of an object.
Translation and scaling are particularly simple to compute. For translation, simply add the
amount of translation (dx, dy, and dz) in the direction of each axis to the corresponding x, y, and z component.
The equation is:
x' = x + dx
y' = y + dy
z' = z + dz
For scaling, simply multiply each point's x, y, and z components by the scaling factor. The equation
is:
x' = x * Ax
y' = y * Sy
z' = z * Sz
Note that scaling an object not centered at the origin also results in a translation.
See the discussion below on how to handle scaling without translation.
The general problem of rotating a point around an arbitrary line is fairly complicated,
however the rotation of a point around the x, y, and z axes have fairly straight-forward
trigonometric solutions:
X-axis Y-axis Z-axis
x' = x x' = z sin(a) + x cos(a) x' = x cos(a) - y sin(a)
y' = y cos(a) - z sin(a) y' = y y' = x sin(a) + y cos(a)
z' = y sin(a) + z cos(a) z' = z cos(a) - x sin(a) z' = z
To generate complicated 3D scene animations multiple transformations may be required.
For example, scaling of an object not centered at the origin (0,0,0) also results
in translation of the object. So, to rotate the object around a fixed point requires
three separate transformations: 1)translate that point to the origin, 2) perform the rotation,
and 3) translate the point back to the original position.
The approach just described - the application of multiple, simpler transformations to achieve
a more complicated, single transformation is a very common approach in rendering 3D scenes.
As you will see in the next section, it is the handling of multiple transformations which
has helped drive the 3D industry to depend on the use of vectors and matrices. These mathematical tools offer
efficiencies (both in terms of coding and speed of performance) which are needed to provide
real-time rendering of complex 3D objects.
Return to top of document
Problems - 3D Computations
The following list identifies the Top 10 computations that must be made routinely
in the creation of 3D images and summarizes the mathematics involved in the computation.
The details of the math is covered in other sections on this page. Some of the
descriptions may not make much sense until you read the details, but the descriptions
should be sufficiently general to help you realize the way in which vector and
matrix mathematics prove useful and necessary to generating 3D scenes.
The list certainly does not cover everything, but it serves to show how important
mathematics is to the 3D graphics industry and to emphasize why the 3D graphics
artist must understand vector and matrix mathematical concepts in order to get the most out of
the software tools available to him.
I need to re-emphasize a an aspect of 3D calculations that
may not be obvious until it is pointed out. I've mentioned that vectors have only
two properties - magnitude and direction. Strictly speaking, the sides of a triangle
are not vectors. However, a triangle side does have a magnitude (the length of the
side) and a direction (it is directed between two points). By applying vector math
to line segments (sides of triangles) we can calculate information needed to render
photo-realistic 3D scenes. Throughout these pages I'll use line segments (sides of
polygons) interchangeably with vectors - but only because the fiction allows us to
use vector math in useful ways.
- Angle of light incidence on a polygon
The color that a polygon should be shaded is a function of
the angle at which the light source strikes the polygon. If the light source
strikes the polygon straight on - directly overhead - the surface of the polygon
closely approximates the color of the light. Likewise, when the light source
strikes the polygon at a low angle, the surface of the polygon darkens, tending
towards black when the light source is parallel to the surface of the polygon.
So, in order to correctly render (color) the polygons of a 3D scene, the first thing we need
is the direction of a vector straight out (perpendicular to the surface) of the polygon. Treating
two edges of a polygon as vectors and performing a vector cross product calculation
returns a vector that is perpendicular to the surface of the polygon. This vector
has a special name, a 'normal'. As you will see, normals (vectors perpendicular to a surface)
are used extensively in 3D graphics computations.
Then we use the direction of the light source in the scene (a vector) and the normal to the surface of the
triangle to calculate the angle between the light source and the surface of the polygon - a
calculation called a vector dot product, or simply dot product. This angle is then used
in rendering/shading/coloring the polygon.
As part of the rendering process, these calculations are repeated for every polygon in the
3D scene. It may sound like a complicated process, but as you will see it's really only a
few lines of code repeated over and over.
- Direction of the 'face' of a polygon
As I've mentioned, the polygon mesh that makes up the model for a 3D scene can consist of
thousands of polygons - enough to tax the ability of even today's PC's to render
complex scenes in real time.
It's much faster to perform mathematical calculations than it is to draw on the computer
screen, so we want to avoid drawing any polygons unless absolutely necessary. In particular,
we would like not to draw the polygons facing away from a scene's point of view, because they
cannot be seen.
This highlights an interesting property of polygons. A polygon is not a 2-sided object.
For the purposes of a 3D scene, a polygon only has 1 side, which may be visible or not
depending on the direction it faces - towards or away from the point of view. Polygons
facing away from the point of view are not visible and do not need to be drawn. Polygons
which face the point of view can be visible, provided they are not behind another polygon,
and should be drawn.
What you might think of as the "back side" of a polygon is not visible and does not get rendered.
If you have an object for which you want to have a visible front and back side, then you must
use at least two polygons, one for each visible surface.
Using this information about visibility of a polygon can significantly reduced the amount of
drawing that is required to render a 3D scene. During rendering, each polygon is analyzed
and drawn, one at a time. At that point, a decision can be made whether to draw the polygon
based on a vector calculation that will determine if the polygon is facing towards
(and should be drawn), or facing away from the point of view (and need not be drawn).
This is a case where normals come into play again. By performing a dot product
between the normal (vector perpendicular to the surface of the polygon) and the
point of view vector, we can calculate the cosine of the angle between the
two. The result of the dot product is a number which may be positive or negative, which
identifies the direction of the face of the polygon as follows:
- > 0 - the face is visible
- 0 - the face appears as an edge
- < 0 - the face is not visible
If the polygon is visible, or appears as an edge, then we draw (render) it. If it is not
visible then we do not draw (render) it.
This step can be very important to real-time rendering of a 3D scene. As much as 75% of all
polygons in a scene may not be visible in any particular point of view - significantly
reducing the number of polygons to be drawn and thus the total rendering time.
This technique of not drawing polygons which face away from the point of view is also called
backface culling.
- Intersection of two objects (collision)
In 3D graphics we often need to know if two objects have collided
Use of the equation for a plane will allow us to determine
if a point is in a plane.
The minimum information needed to define a plane is a
point on the plane (x0, y0, z0) and a normal vector (n1, n2, n3)
to the plane. Using these two pieces of information, the
equation for a plane can be derived:
Plane Equation: Ax + By + Cz + D = 0
where
A = n1
B = n2
C = n3
D = -(n1x0 + n2y0 + n3z0)
- Sorting of Polygons
When rendering a 3D scene the order in which the polygons or triangles are
drawn onto the computer screen can make a big difference in how accurate
the final rendering appears. An algorithm called the 'Painter's
Algorithm' is used to address this issue. This algorithm simply states
that items farther away from the viewer are rendered first. The algorithm
works very well for simple geometric objects, but does not handle
overlapping or intersecting polygons very well.
There are several variations of this algorithm in use. The simplest
one simply sorts the polygon list by the average depth of each polygon.
The average depth can be calculated by simply adding the z-coordinate of
all the point on the polygon and then dividing by the number of points.
Sorting is not one of the topics I cover on this page, but on the source
code pages I provide sorting algorithms. It's worth noting that the choice
of sorting algorithm can be very important. The various algorithms
can be as much as 10X different in how fast they sort. So, with a 3D scene of
thousands of polygons the ability to quickly render a 3D scene is heavily
impacted by the sorting algorithm.
- Reduction of Computational Steps
Well, actually there are dozens of techniques that might fit this description.
In this case the intent is to talk about a common animation event, completing
multiple transformations in a single step. For example, a 3D graphics designer
might want an object in a 3D scene to be moved to the left while at the same
time rotating half way around the y-axis. Combinations of translation,
rotation or scaling are often required.
One approach to completing the multiple translations is to perform each
one sequentially on every point in the 3D scene. However, as you will see
later in this page, each of these transformations may involve a dozen or
more multiplications or additions.
A technique frequently used is to write each transformation as a matrix
operation (matrices will be discussed below). The specific matrix
for each operation can be combined into a single transformation matrix which
is then applied to the points in the 3D scene. Use of a transformation
matrix can reduce the number of multiplications or additions by over 50%,
providing significant speed improvements in the rendering process.
- Undo Function
Sometimes an animation involves moving an object with a 3D scene
back and forth between two positions - basically an Undo function.
As was discussed in the topic above, a transformation matrix is often
used to perform one or more movements on points.
A matrix can be manipulated to create what is caused the transpose
of the original matrix. Applying an inverse matrix to the result of
a transformation matrix has the result of reversing the actions taken
by the inverse matrix.
While a generic solution to inverse matrices is not simple, for orthogonal
systems (such as our usual x-y-z coordinate system), creating the inverse
matrix is extremely simple and is substantially faster than creating
a totally new transformation matrix from each individual actions.
As you will see later, the inverse matrix is very easily created
Return to top of document
Vectors
Vectors and vector math are used extensively in generating 3D images. So what
are vectors?
A vector is not a real, physical object. It is a mathematical concept which
is written as three numbers (called components), where each number corresponds to the x, y, and z
directions of the coordinate system. An example of vector notation is:
P = <4, -5, 9>
A vector has only two properties - magnitude and direction. The direction of a vector
is that of a line drawn from the origin of the coordinate system to the point corresponding
to given by the three components of the vector. The magnitude of a vector is the square root
of the sum of the squares of the three numbers (the 3D version of the Pythagorean equation
for calculating the hypotenuse of a right triangle), written as:
|P| = sqrt ( x^2 + y^2 + z^2)
Graphically, a vector is shown as an arrow that starts at the origin of the coordinate system
and which ends at the x-y-z coordinates given by the three numbers. An alternate way of
graphically showing a vector is to draw an arrow starting at one point in space (not necessarily
the origin) and ending at a second point.
Both representations of a vector are just aides in displaying the concept of a vector. It's
important to note that a vector has no position - it's only properties are direction
and magnitude. The points used to graphically display a vector are not properties of the
vector.
Despite it's technical inaccuracy, it is convenient to represent a vector as a line segment in space -
starting at one point and ending at another. In such a case, the vector is considered to
have a magnitude equal to the length of the line segment. The direction of the vector
is the same as the direction of the line segment. For two such line segments at points (x1, y1, z1)
and (x2, y2, z2), the vector itself is given by <x2-x1, y2-y1, z2-z1> - the direction of
the vector. Again, the points are not properties of the vectors.
The benefit of representing a vector as a line segment in space is that since the polygons
(triangles) of a 3D scene are made up of line segments, we can use vector mathematics to
manipulate the positions of the line segments. The power of this statement will become
more obvious as we progress throughout this tutorial.
Two points connected by a line is sometimes called a "directed line segment", referring
to the ability of the construct to denote direction. A line segment between two points
A and B may be referred to as AB, which can represent a vector whose direction is from A to B.
The line segment BA would represent a vector of the same magnitude, but opposite in
direction.
As we will see later, the front surface of a triangle is also denoted by the order in
which the points are listed. If a triangle is defined by three points A, B, and C,
then the front surface of triangle ABC is given by the direction of the thumb on
the right hand when traversing these point in a counter-clockwise direction.
The back surface of a triangle is the direction the thumb points when traversing the
three points in a clockwise direction.
Throughout the discussions on vectors, the order of point listing will be important
in assigning direction.
You may have noted that the notation for writing vectors is to include the three
numbers in <> brackets, whereas coordinate points are written with () brackets.
This is fairly universal notation - (4,-5,9) is a point and <4,-5,9> is a vector,
but you may run across variations in the literature.
As noted earlier, there are two equally correct ways of interpreting the three
components of a vector. One approach is to say that the vector starts at the origin
of the coordinate system (0,0,0) and ends at the point (x, y, z). This interpretation
of a vector is called a position vector.
It is also common with the graphics and
mathematics world to represent a position vector as (Ai + Bj + Ck), where i, j, and
k are vectors of magnitude 1 and a direction parallel to the axes of the coordinate
system. A, B, and C are the components of the vector as before. The use of the
i, j, and k vector notation will be useful when we discuss matrix multiplication.
A second way of interpreting this vector is to treat the components as direction
information from whatever point the vector is considered to begin. This interpretation
is known as a displacement vector, where regardless of the starting point the end of
the vector will be moved x, y, and z units in the direction of the corresponding
coordinate system axes.
In both interpretations it is important to note that the numbers used to describe the
vectors are not coordinates, They are the "components" of the vector (displacement values).
For this reason, many texts use nomenclature such as >a1, b1, c1< or >a, b, c<
for the components - just to avoid the tendency of readers to interpret x, y, and z
as actual coordinates.
Throughout this site the x, y, and z notation is used. When you get around to
actually using the equations that are discussed here, use of x, y, and z will
be helpful by avoiding the mental conversion between a1, b1, c1, or whatever letters
might be used, to x, y, and z.
There are a number of mathematical operations which can be performed on vectors
to compute the information needed to correctly and efficiently move and render
the polygons which make up the objects within a 3D scene.
Vector Addition
Vectors can be added, subtracted, and multiplied. To add two vectors,
simply add the corresponding respective components of the two vectors together
as shown in the following example:
P3 = P1 + P2 = <x1, y1, z1> + <x2, y2, z2> = _
<x1 + x2, y1 + y2, z1 + z2>
Again, remember that the values are the displacement components of the vectors,
not the starting or ending coordinates.
Graphically, addition of two vectors can be done by placing the tail of the
second vector to the head of the first. The resulting vector goes from the
tail of the first vector to the head of the second vector.
Subtraction
Likewise, subtraction of two vectors is also very simple. Just subtract the corresponding
components of the two vectors as shown in the following example:
P3 = P1 + P2 = <x1, y1, z1> + <x2, y2, z2> = +
<x1 - x2, y1 -y2, z1 - z2>
Multiplication
Multiplication of two vectors is the far more useful operation in 3D graphics. There
are actually two kinds of vector multiplications - each of which have their own use,
as you saw in the list of Problems above. Strictly speaking, the "multiplications"
are a series of operations on the components of two vectors, but it is commonplace
to describe the operations simply as multiplication.
Dot Product
The first type of vector multiplication is called a dot product. A dot product
is generated by multiplying two vectors together and results in a scalar (number).
It is sometimes called scalar multiplication. The dot production can be calculated
one of two ways:
A·B = <x1, y1, z1> · <x2, y2, z2> = x1x2 + y1y2 + z1z2
OR
A·B = |A| |B| cos(a)
where a is the angle between the two vectors.
This is a good time to point out again that the x, y, and z variables listed
in the equation are the components of the vectors - the displacement
components. If the dot product is applied to two sides of a triangle the
components of the vectors must be calculated by subtracting the coordinates
of the two points. As was noted above, the vector components of a directed
line segment are calculated as:
<x2-x1, y2-y1, z2-z1>
where (x1, y1, z1) and (x2, y2, z2) are the starting and ending points of
the line segment.
One way to physically interpret the result of a dot product is that
it is the magnitude of the component of A that extends in the direction of B (which
is called the projection of A onto B) times the magnitude of B.
Two dot product calculations of interest are:
- A·B = 0 when the two vectors are perpendicular to one another
- A·B = |A| |B| when the vectors are parallel
As was noted earlier, one use of the dot product is to compute the
angle at which a light source strikes a polygon, which in turn is
used to determine how to render the polygon. There are several
other uses of the dot product in rendering 3D scenes.
Cross Product
The second type of vector multiplication is called a cross product,
A cross product results in a vector which points in a direction
perpendicular to the plane which contains the two vectors that
were multiplied together. The calculation of a cross product is
performed as shown in this example:
A X B = <x1, y1, z1> X <x2, y2, z2> = <y1z2 - z1y2, z1x2 - x1z2, x1y2 - y1x2>
The magnitude of the resultant vector is given by:
|AXB| = |A| x |B| x sin(a)
where a is the angle between the two vectors.
When a cross product is applied to two line segments (edges) of a triangle the
result is a vector perpendicular to the surface of the triangle - where the
surface of the triangle is determine by the order in which the points are listed,
as was discussed above. This vector perpendicular to the surface of the triangle
is called a normal and is very important in rendering a 3D scene.
As with algebraic equations there are a number of manipulations of vector
equations which can be applied:
- Dot Products
- the following rules apply to dot product calculations
- Cross Products
- the following rules apply to cross product calculations
- AXB does not equal BXA. The order of the vectors in the operation changes
the answer except under very special circumstances.
Return to top of document
Matrix Math
To move an object in a 3D scene, you must move every point in the model - corresponding
to every vertex of every polygon in the 3D scene. As you will see, there are three
common types of movements of points:
- Translation
Translation is simple movement in the x, y, or z direction (or any combination)
- Rotation
Rotation is movement of a point around another point, or line, by a given angle.
3D scene rotations are often accomplished by successive rotations, up to one
for each axis. In this approach, the order of rotation is important. Changing the order of
rotation can change the way an object appears after all rotations are done.
- Scaling
Scaling is a resizing of the object.
Movement of these points is called a transformation and can be performed in a number
of ways. One way involves individually manipulating the geometric and trigonometric
equations to calculate the new x, y, and z coordinates.
A second approach is to perform a calculation on a point by using matrix operations,
which is the approach used most often in 3D graphics engines. As we shall see, using
matrix operations can significantly speed up the rendering process by significantly
reducing the number of multiply operations involved in a transformation - by
up to 75%. This tutorial focuses on the use of matrices to perform point transformations.
Matrix math is not specific to vectors and 3D graphics. It was developed originally
for its generic ability to solve multiple equations with multiple unknowns. Many
algebra classes include limited exposure to matrix math.
A matrix is simply a rectangular table of numbers. The horizontal lines are called
rows and the vertical columns are called columns. A 4-by-3 matrix has 4 rows and
3 columns, such as this example which defines a matrix A:
| 1 3 4 |
A = | 5 2 7 |
| 1 2 9 |
| 1 0 4 |
Each number in the matrix can be identified, such as A[2,3], which is the element
in the 2nd row and 3rd column - the number 7 in this example.
Matrices can be added, subtracted, or multiplied. As you will see, multiplication
in particular will prove most valuable in rendering 3D scenes. Matrix multiplications
can be used to implement point transformations (movement) as well as to calculate
dot and cross products of vectors for use in rendering 3D scenes.
Even though the example matrix above had multiple rows and columns, a matrix
can just as easily have a single row or column. In fact, representation of
a vector as a matrix with single row or column is a frequent form of applying
matrix operations to manipulate vectors - which, as we have shown, is
equivalent to operating on the coordinates of points that make up the line
segments (edges) of polygons (triangles) in a 3D scene! What will be seen
is that one matrix (loaded with vector components) will be multiplied
by a second matrix (loaded with mathematical equation variables) in order
to achieve transformation of the original vectors - translation, scaling,
and rotation. We will also see that matrix multiplication can be used
to combine multiple transformations (translation, scaling, rotation) into
a single matrix operation, resulting in a significant reduction of the
number of multiply operations - very necessary for real-time rendering of
complex 3D scenes.
Matrix Addition
For the sake of completeness, definitions of
matrix addition and subtraction are provided
here. However, neither will be used much
in creating 3D graphics images.
To add two matrices, both must contain the
same number of rows and columns. Addition
is accomplished by adding corresponding elements
of each matrix, as in this example:
A + B = | x1 y1 | + | c1 c2 | = | x1+c1 y1+c2 |
| x2 y2 | | d1 d2 | | x2+d1 y2+d2 |
Subtraction is handled in a similar fashion, except
that the values are subtracted from one another.
Matrix Multiplication
As has been stated, matrix multiplication is the primary
matrix operation of interest in working with 3D images.
To multiply two matrices, the number of columns in the
first matrix must be the same as the number of rows in
the second matrix. Matrix multiplication results
in a matrix with the number of rows of the first matrix
and the number of columns of the second matrix.
Matrix multiplication is performed as follows:
| a b c | | r s | | ar + bt + cv as + bu + cw |
| d e f | X | t u | = | dr + et + fv ds + eu + fw |
| v w |
As you might be able to see, AB < > BA - that is, matrix multiplication
is not commutative. Changing the order of the matrices in the multiplication
changes the result (there are exceptions to this rule).
There are rules which can be applied to matrix multiplication.
The rules are NOT the same as the rules for simple algebraic variable
multiplication!
(AB)C = A(BC) for all k-by-m matrices A,
m-by-n matrices B and n-by-p matrices C
(A + B)C = AC + BC for all m-by-n matrices A and B
and n-by-k matrices C
C(A + B) = CA + CB for all m-by-n matrices A and B
and k-by-m matrices C
Some matrices are seen so often, or have such
special use, that they are given a special name
that is describes the content of the matrix. These
special matrices will also be used in various aspects
of rendering, so it's useful to know them by name.
- Square matrix
Has the same number of rows and columns
- Identity matrix
All elements are zero except those of the main diagonal - top
left to bottom right. All values in the main diagonal are 1.
Here's an example of an identify matrix:
| 1 0 0 |
| 0 1 0 |
| 0 0 1 |
- Diagonal matrix
All entries not on the main diagonal (the diagonal from the
upper left to the lower right corner) are zero. This is similar
to the identify matrix without the requirement for the main
diagonal entries to have vales of 1.
- Triangular matrix
All entries above the main diagonal are zero or all below
the main diagonal are zero.
| 1 0 0 | | 4 3 2 |
| 2 3 0 | or | 0 5 1 |
| 5 4 1 | | 0 0 1 |
- Transpose matrix
A transpose matrix is formed by swapping the rows with the columns
of a matrix.
| a b c | | a d g |
| d e f | ---> | b e h |
| g h i | | c f i |
Determinant
Before we get further into the details of how matrix math applies to
3D graphics scene rendering, the concept of a matrix determinant
needs discussion. A determinant is defined for a 2x2 matrix
as follows:
D = | a b | = ad - bc
| c d |
A determinant for a larger matrix is calculated by breaking
the matrix down into smaller matrices, until 2x2 matrices are
reached, for which the definition of a 2x2 determinant is applied.
For example, the determinant of a 3x3 matrix is defined as follows:
| c1 c2 c3 |
| x1 y1 z1 | = c1 | y1 z1 | - c2 | x1 z1 | + c3 | x1 y1 |
| x2 y2 z2 | | y2 z2 | | x2 z2 | | x2 y2 |
We're finally to the point where the application of matrix
multiplication can be applied to vector operations. You may
recall that the equation for a cross product between two
vectors was a bit lengthy - difficult to remember. If we load
the components of two vectors into a 3x3 matrix as follows,
the determine will exactly equal the cross product. Consider
the following example of two vectors A and B, from which we form
the following matrix:
A x B = <x1, y1, z1> X <x2, y2, z2>
| i j k |
A x B = | x1 y1 z1 | = | y1 z1 | i + | x1 z1 | j + | x1 y1 |
| x2 y2 z2 | | y2 z2 | | x2 z2 | | x2 y2 |
A X B = <y1z2 - z1y2, z1x2 - x1z2, x1y2 - y1x2>
In addition to the components of the two vectors, this matrix is
loaded with the unit vectors - >i, j, k< - where i, j, and k
are vectors of length one that are parallel to the x, y, and z
axes of the coordinate system. These were discussed earlier
in this tutorial.
The main point to recognize from these equations is that the final line matches
exactly the definition of a cross product! The format of a matrix
is easy to remember and its determinant calculated to
arrive at the cross product of two vectors without have to memorize
the final equation.
Matrix operations can also be used to perform a dot product between
two vectors. Simply put the first vector in a single row matrix
and the second vector in a single column matrix and then perform
matrix multiplication. The result is the dot product of the two
vectors, as seen in this example:
A·B = <x1, y1, z1> |x2|
|y2|
|z2|
A·B = x1x2 + y1y2 + z1z2
As you can see, the final equation is exactly the definition
of a dot product!
Now that the use of matrix operations for calculating dot products
and cross products has been demonstrated, we'll show how matrix
multiplications can also be used to perform transformation (translation,
scaling, and rotation) of the points in a 3D graphics scene.
Matrix Transformations
Another, more important value of matrices is their use in performing
all of the transformations (translations, scaling, and rotation) on points.
Specifically, a 4x4 matrix can be defined which, when multiplied by
a one row matrix loaded with the vector components, can achieve any
of the desired transformations. These 4x4 matrices are appropriately
called transformation matrices.
Here's an example of a 4x4 matrix which can be multiplied by the
vector matrix to achieve translation of each point by an amount
dx, dy, and dz, respectively:
<x', y', z', 1> = <x, y, z, 1> | 1 0 0 0 |
| 0 1 0 0 |
| 0 0 1 0 |
| dx dy dz 1 |
<x', y', z', 1> = <x+dx, y+dy, z+dz, 1>
The first 3 values in the result are the new coordinates of the
points, translated by dx, dy, and dz respectively. The last column
is an artifact of using a 4x4 matrix to perform the transformation
and is ignored. All of the 4x4 matrices that are used for transformations
utilize a fourth column containing 0,0,0,1 and when used for multiplication
result in a fourth column of value 1 which is ignored.
This particular example may seem too trivial to require the use of
matrix multiplication, but it does show the concept that a 4x4 matrix
can be used to effect a vector (or point) transformation.
A similar 4x4 matrix can be constructed to scale a vector, as shown
here:
<x', y', z', 1> = <x, y, z, 1> | Sz 0 0 0 |
| 0 Sy 0 0 |
| 0 0 Ax 0 |
| 0 0 0 1 |
<x', y', z', 1> = <Sx, Sy, Sz, 1>
As before, the first 3 elements of the matrix multiplication are the
new coordinates of the points, scaled by a factor of S.
Though simple, both of these can also show the concept of a combining
transformation matrices in order to reduce the total number of calculations
needed to perform 3D transformations. If we let |A| and |B| be the
translation and transformation matrices, then to perform both operations
on a vector can be written as:
X' = (|X| * |A| ) * |B|
X' = |X| * (|A| * |B|)
X' = |X| * |K|
where |K| = |A| * |B|. This points out the idea of combining matrices
before applying them to a vector. Applying the transformation matrix
to thousands of points in a 3D graphics scene can reduce the number of
calculations be as much as 75% - a significant boost for real-time
3D animation.
Rotation Matrices
Rotation of objects in a 3D scene is at the top of the list of
transformations used in 3D animation. In general, rotation
is not that difficult. Earlier in this discussion the trigonometric equations
for rotation were presented. Those equations were for the
rotation of a point about one of the axes of the coordinate system,
which is a very common transformation.
Rotations about an arbitrary point or line are much more complicated
and are seldom done as a single transformation. Generally, the
3D graphics industry performs complicated transformations as a combination
of simpler transformations. For example, to rotate an object simultaneously
about all three axes is typically achieved by rotating about the x
axis, followed by rotation about the y axis, and then followed by
rotation about the z axis. The result is the same, but the individual
calculations are simpler - at the expense of performing more operations.
The equations that were presented earlier in this tutorial can be
inserted into a matrix to form transformation matrices which will
rotate about the x, y, or z axes.
The following matrix multiplication will rotate a point
around the x-axis through an angle a.
<x', y', z', 1> = <x, y, z, 1> | 1 0 0 0 |
| 0 cos(a) sin(a) 0 |
| 0 -sin(a) cos(a) 0 |
| 0 0 0 1 |
The following matrix multiplication will rotate a point
around the y-axis through an angle a.
<x', y', z', 1> = <x, y, z, 1> | cos(a) 0 -sin(a) 0 |
| 0 1 0 0 |
| sin(a) 0 cos(a) 0 |
| 0 0 0 1 |
The following matrix multiplication will rotate a point
around the z-axis through an angle a.
<x', y', z', 1> = <x, y, z, 1> | cos(a) sin(a) 0 0 |
| -sin(a) cos(a) 0 0 |
| 0 0 1 0 |
| 0 0 0 1 |
By way of clarification, the angle used here is the incremental angle
of rotation, not the cumulative angle of rotation.
To see how these transformation matrices can be programmed, take a look
in the code section of this site. Both Visual Basic and Java applet
source code are provided which utilize matrix multiplications.
Homogeneous Coordinates
When the idea of 4x4 transformation matrices was introduced, the
derivation for the matrices was not discussed. Since the
combination of individual transformations into a single transformation
has been shown to be desirable, the first criteria was that translation,
scaling, and rotation should have common transformation matrix sizes.
Scaling can be performed using a 3x3 transformation matrix, but both
translation and rotation require at least a 4x4 transformation matrix.
So the scaling transformation matrix is has been increased by one
dimension in order to allow combination of all three transformation
types - translation, scaling, and rotation.
The 4 dimensions used in the transformation matrices are known
a homogeneous coordinates.
Return to top of document
Projections
Regardless of the approach taken to calculating points locations after a transformation
- straight trigonometric equations or matrix multiplications - the next step is to display the 3D scene
information on the 2D scene of the computer screen. There are two basic methods
of projecting the 3D scene onto the computer screen - parallel and perspective
projection. In either case, the goal is to figure draw each point of the 3D scene
onto the computer screen - converting the x, y, and z coordinates to a corresponding
x' and y' coordinate on the screen.
Parallel Projection
This is certainly the simplest approach top placing a 3D scene onto a 2D screen.
In this approach, there is no perspective at all - the x and y values of the objects
in the scene are the same as on the screen.
x' = x
y' = y
This has the serious drawback that all objects, no matter how "distant" in the 3D scene
all look the same size. So, while it's easy to draw, it's seldom used because the
images are not realistic.
However, it also has the advantage of being simpler to code, plus for very simple
3D scenes, particular CAD-type drawings, the result may be totally adequate.
Perspective Projection
This approach provides the visual cues of distance that you would expect in a
photo-realistic drawing - distant objects seem smaller. There are several variations
of perspective projection, but the general technique is the standard for all
3D graphics editing software.
From a geometrical point of view, a perspective projection results in an image
where parallel lines (from the 3D scene) appear to converge (on the projected image).
The point of convergence is called the vanishing point. A vanishing point where
lines parallel to a coordinate axis would meet is called a principal vanishing point.
Not all parallel lines run parallel to the coordinate axes, but the principle
of perspective requires that parallel lines converge, so 3D scene may include
other non-principal vanishing points.
In the standard derivation of a perspective transformation, it is assumed that
the center of projection (vanishing point) is at some positive distance d along the
z axis (0,0,d) and that a point P of coordinates (x,y,z) is projected onto
the x-y plane to a new point P'. Under those circumstances, the equations
for the new x' and y' coordinates become:
x' = dy /(d-z)
y' = dx /(d-z)
z' = 0
And the equivalent transformation matrix can be written as:
<x', y', z', 1> = <x, y, z, 1> | 1 0 0 0 |
| 0 1 0 0 |
| 0 0 0 -1/d |
| 0 0 0 1 |
Practical Matters of Displaying a Projection
Regardless of the projection approach that you use, there are other decisions to
be made when attempting to create a 3D scene. For example, consider the idea
of a stand alone 3D model of a simple cube.
When defining the 8 points that define the cube it is common to define the coordinates
in terms of the center of the cube. Then, when it is placed into a 3D scene, the
cube (its points or polygons) are translated to the location in the 3D scene.
This compares to giving a cube world coordinates which are with respect to a
particular coordinate system origin. The approach works fine as long as the model
is placed in the coordinate system which matches it's coordinates, but complicates
placement if the model is to be re-used in another 3D scene.
Another decision to be made is the location of the origin of the world coordinate
system with respect to the computer screen. It is common to place the origin
at the center of the screen, with the positive x direction pointing to the
right, the positive y direction pointing to the top of the screen, and positive
z pointing into the computer screen.
Unfortunately, the various computer languages don't use this convention,
nor do all the computer language use the same convention amongst themselves.
Visual Basic, for example, uses the 0,0 point as the upper left hand corner of
the screen.
So, when coding a 3D application you may have to make language-specific translations
when placing the object in the 3D scene for the first time. Once placed correctly,
future transformations will give the expected results.
|