Our initial introduction to the Kalman filter was easy to understand because both the motion and measurement models were assumed to be one-dimensional. That’s great if you’re a lustrous point in Lineland, but the three-dimensional world must be dealt with sooner or later. Specifically, within the initial introduction, the location (or state) x, the control input u, and the measurement z were all scalar (numeric) values along a one-dimensional line. In actual use of the Kalman filter, x, u, and z are much more frequently vectors than scalars. In order for the vectors to play nicely with one another (so that they can be added to and subtracted from each other), matrices must be used to transform the vectors into a common form. Accordingly, before delving further into the Kalman filter, this post provides a basic review of matrices and matrix operations to better prepare ourselves for the more gory Kalman filter details.

To better visualize why we need to be concerned with matrices, assume you are using the Kalman filter for localization of your humble robot on a 2D map. Furthermore, assume that the robot is holonomic on a 2D plane (can turn in a circle on a dime). We now need to create a model to adequately represent the pose (or “kinematic configuration” if you’re feeling fancy), the motion model (the control input), and the measurement model (which we’ll ignore for now to focus on matrices).

The state, x, or pose of our robot, is succinctly represented as a three-dimensional column vector made up of the x and y coordinates of the robot on the two-dimensional map along with the robot’s orientation relative to the x axis, represented as θ. (Note that the x positional component here is decidedly different than the x vector variable representing the overall pose.) This 3D column vector, representing the pose on a 2D plane, is shown at right.

The control input, u, or motion model of our robot, can be represented in various forms, examples of which are described in detail in Probabilistic Robotics [1]; but for the topic at hand, assume that the motion model is simply a constant velocity, v, between two ticks of time, represented as a 2D vector containing speed and direction.

With this information, if given the previous pose and the motion model over a given timeframe, we can then calculate the current pose. To do so, we’ll need a linear equation which adds the previous pose to the control input. But the velocity vector can’t simply be added to the vector representing the previous pose – we’re talking apples and oranges here. We’ll need a transformation matrix to transform the velocity into a 3D vector which can be added to the pose. This is starting to get into Part II of the Kalman filter introduction, but this starts to give you an idea of how matrices will be used in the Kalman filter.

So onward with our matrix primer!


As illustrated above, a column vector is an ordered set of values with n dimensions, where n is the number of values within the vector. The values within a vector need not be limited to being scalars; e.g., one or more values within the vector could also be a vector. By convention, a vector is assumed to be a column vector unless otherwise noted. A vector is symbolized as a bold-face, lower-case letter.

If all of the elements of a vector are 0, the vector is a null vector.

The transpose of a column vector is a row vector (and vice versa) and is denoted with a superscript T; e.g., the transpose of x is written x^T.


A matrix is a two-dimensional array of scalar values (or coefficients) having r rows and n columns, noted as having (r × n) dimensions. If both r and n are one, then the matrix is a scalar value. If just n is one, then the matrix is a column vector. If just r is one, then the matrix is a row vector. If r = n, then the matrix is a square matrix. A matrix is symbolized as a bold-face, upper-case letter.

If all of the elements of a matrix are 0, the matrix is a null matrix. If all of the off-diagonal elements of a square matrix are 0, so that only the diagonal elements (a_11, a_22, …, a_nn) may be non-zero, the matrix is a diagonal matrix. If all of the diagonal elements of a diagonal matrix are 1, then the matrix is an identity matrix. An example identity matrix is shown at right.
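As a minimal sketch of these definitions, the following Python builds a diagonal matrix and an identity matrix as plain nested lists (one inner list per row). The helper names `diagonal` and `identity` are illustrative assumptions, not part of any library.

```python
def diagonal(values):
    """Build a square matrix with `values` on the diagonal and 0 elsewhere."""
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]


def identity(n):
    """An identity matrix is a diagonal matrix whose diagonal is all 1s."""
    return diagonal([1] * n)


D = diagonal([2, 5, 7])
I3 = identity(3)

print(D)   # [[2, 0, 0], [0, 5, 0], [0, 0, 7]]
print(I3)  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```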

The transpose of a matrix is the matrix “flipped” on its diagonal; it is created by writing the rows of A as the columns of A^T. Accordingly, the number of rows (r) and columns (n) of A equal the number of columns and rows of A^T, respectively; e.g., if A has the dimensions (2 × 3), then A^T has the dimensions (3 × 2).
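To make the “flipping” concrete, here is a small sketch in plain Python, assuming a matrix is stored as a list of rows. The `transpose` helper is a hypothetical name introduced only for this illustration.

```python
def transpose(A):
    """Write the rows of A as the columns of the result."""
    return [list(column) for column in zip(*A)]


A = [[1, 2, 3],
     [4, 5, 6]]          # (2 x 3)
At = transpose(A)        # (3 x 2)
print(At)                # [[1, 4], [2, 5], [3, 6]]

x = [[1], [2], [3]]      # column vector stored as a (3 x 1) matrix
xt = transpose(x)        # row vector, (1 x 3)
print(xt)                # [[1, 2, 3]]
```

Transposing twice returns the original matrix, which is a quick sanity check for any transpose routine.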

Matrix Operations

Scalar/Matrix Multiplication

When looking at the available operations among scalars, vectors, and matrices, it’s easiest to start with the multiplication of a matrix by a scalar value. Each value within the matrix is simply multiplied by the scalar; quite elementary indeed.
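A short sketch of scalar/matrix multiplication in plain Python follows; `scale` is an illustrative helper name, and the values are arbitrary.

```python
def scale(k, A):
    """Multiply every element of matrix A (a list of rows) by the scalar k."""
    return [[k * a for a in row] for row in A]


A = [[1, 2],
     [3, 4]]
scaled = scale(3, A)
print(scaled)  # [[3, 6], [9, 12]]
```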

Matrix/Matrix Addition & Subtraction

The next trivial operation is that of matrix-to-matrix addition and subtraction. Each element of the second matrix is added to, or subtracted from, the corresponding element of the first matrix. In order to add or subtract two matrices, the matrices must have the same (r × n) dimensions.
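The element-by-element rule, including the dimension check, can be sketched as follows. The helper names (`shape`, `add`, `subtract`) are assumptions made for this example.

```python
def shape(A):
    """Return the (rows, columns) dimensions of a matrix stored as a list of rows."""
    return (len(A), len(A[0]))


def add(A, B):
    """Element-wise sum; both matrices must have the same dimensions."""
    if shape(A) != shape(B):
        raise ValueError("matrices must have the same dimensions")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def subtract(A, B):
    """Element-wise difference A - B; both matrices must have the same dimensions."""
    if shape(A) != shape(B):
        raise ValueError("matrices must have the same dimensions")
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
total = add(A, B)
difference = subtract(A, B)
print(total)       # [[6, 8], [10, 12]]
print(difference)  # [[-4, -4], [-4, -4]]
```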

Matrix/Vector Multiplication

As mentioned in the opening of this review, it is necessary within the Kalman filter to transform a control vector, for example, into a state vector, so that it may be added to the previous state to calculate the current state.  This transformation is achieved by multiplying the control vector by a matrix representing how the control vector relates to the state.

In more generic terms, a resulting variable may be the result of a linear function of another vector and a matrix representing how the vector being acted upon relates to the result. (You might want to read that again.) The linear function for the result is written as y = Ax. More simply put, A is a matrix which represents how the vector x relates to y, the result; accordingly, A transforms x into y. In matrix-speak, this is a linear transformation.

In order to transform a vector by a matrix, the number of columns (n) of A must equal the dimension (n) of x. Additionally, the number of rows (r) of A will equal the dimension (r) of y. If these constraints hold, then A is said to be conformable to x.

The following demonstrates how each value of y is calculated:

  • y_1 = a_{11} x_1 + a_{12} x_2 + … + a_{1n} x_n
  • y_2 = a_{21} x_1 + a_{22} x_2 + … + a_{2n} x_n
  • …
  • y_r = a_{r1} x_1 + a_{r2} x_2 + … + a_{rn} x_n
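The row-by-row sums above translate almost directly into code. The following is a minimal sketch in plain Python, with `matvec` as a hypothetical helper name and arbitrary example values; the vector is stored as a flat list for brevity.

```python
def matvec(A, x):
    """Compute y = Ax, where A is a list of rows and x is a flat list."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("columns of A must equal the dimension of x")
    # Each y_i is the sum of a_i1*x_1 + a_i2*x_2 + ... + a_in*x_n.
    return [sum(a * x_j for a, x_j in zip(row, x)) for row in A]


A = [[1, 2, 3],
     [4, 5, 6]]     # (2 x 3): r = 2 rows, n = 3 columns
x = [1, 0, 2]       # dimension n = 3
y = matvec(A, x)    # dimension r = 2
print(y)            # [7, 16]
```

Note that a (2 × 3) matrix turns a 3-dimensional vector into a 2-dimensional one, which is exactly the kind of change of dimension needed to bring two different vectors into a common form.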

Interestingly, if the matrix A is a diagonal matrix (square by implication), then each y value is the product of the corresponding x and diagonal value in the matrix. If the matrix A is an identity matrix (also square by implication), then each y value is equal to the corresponding x value.  Examples of each are shown at right.
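As a quick check of both special cases, the sketch below multiplies an arbitrary vector by a diagonal matrix and by an identity matrix. The values and the `matvec` helper are illustrative assumptions.

```python
def matvec(A, x):
    """Compute y = Ax, where A is a list of rows and x is a flat list."""
    return [sum(a * x_j for a, x_j in zip(row, x)) for row in A]


x = [5, 6, 7]

D = [[2, 0, 0],
     [0, 3, 0],
     [0, 0, 4]]      # diagonal matrix
I = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]      # identity matrix

y_diag = matvec(D, x)
y_ident = matvec(I, x)
print(y_diag)   # [10, 18, 28]  (each x value times its diagonal value)
print(y_ident)  # [5, 6, 7]     (x is unchanged)
```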

Matrix/Matrix Multiplication

The last topic worth mentioning in detail, in our rather elementary review of matrices and matrix operations, is that of multiplying two matrices together.

In order to get the result of the product of two matrices, e.g., C = AB, the number of columns (n) of A must equal the number of rows (r) of B. The result, C, will have the number of rows (r) of A and the number of columns (n) of B.

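The following demonstrates how each value of C is calculated when A is (3 × 2) and B is (2 × 2), giving a (3 × 2) result:

  • c_{11} = a_{11} b_{11} + a_{12} b_{21}
  • c_{12} = a_{11} b_{12} + a_{12} b_{22}
  • c_{21} = a_{21} b_{11} + a_{22} b_{21}
  • c_{22} = a_{21} b_{12} + a_{22} b_{22}
  • c_{31} = a_{31} b_{11} + a_{32} b_{21}
  • c_{32} = a_{31} b_{12} + a_{32} b_{22}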
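The same pattern, each element of C being a row of A multiplied by a column of B and summed, can be sketched in plain Python. The `matmul` helper name and the example values are assumptions for illustration only.

```python
def matmul(A, B):
    """Compute C = AB for matrices stored as lists of rows."""
    if any(len(row) != len(B) for row in A):
        raise ValueError("columns of A must equal rows of B")
    columns_of_b = list(zip(*B))
    # c_ij is the sum of products of row i of A with column j of B.
    return [[sum(a * b for a, b in zip(row, column)) for column in columns_of_b]
            for row in A]


A = [[1, 2],
     [3, 4],
     [5, 6]]         # (3 x 2)
B = [[7, 8],
     [9, 10]]        # (2 x 2)
C = matmul(A, B)     # (3 x 2)
print(C)             # [[25, 28], [57, 64], [89, 100]]
```

For instance, the first element is 1·7 + 2·9 = 25, matching the first bullet above.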

Even if both A and B are square, in general AB ≠ BA, due to the order in which rows and columns are multiplied and summed. But when multiplying a square matrix by an identity matrix of the same size, AI = IA = A.
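Both claims are easy to verify numerically. The sketch below uses two arbitrary (2 × 2) matrices and a hypothetical `matmul` helper.

```python
def matmul(A, B):
    """Compute the product AB for matrices stored as lists of rows."""
    columns_of_b = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, column)) for column in columns_of_b]
            for row in A]


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
I = [[1, 0],
     [0, 1]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)  # [[2, 1], [4, 3]]  (columns of A swapped)
print(BA)  # [[3, 4], [1, 2]]  (rows of A swapped)

print(matmul(A, I) == A and matmul(I, A) == A)  # True
```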

There is certainly much more to matrices and matrix operations, but the above gives enough to move on to Part II of our introduction to the Kalman filter and to understand the implication of matrices when used within signals control and robotics literature. Incidentally, this should also be enough information to understand just about every use of a vector and matrix within Sebastian Thrun’s Probabilistic Robotics (a highly recommended read if you’re interested in mobile robotics). For a more comprehensive review of matrices and their use within control systems, there are few texts better (albeit a bit daunting) than Robert Stengel’s Optimal Control and Estimation.

Billy McCafferty