Bundle Adjustment – Part 1: Jacobians

Bundle Adjustment (BA) is a well established technique in Computer Vision typically used as the last step in many feature based 3D reconstruction algorithms to tweak the intrinsic and extrinsic camera parameters as well as the position of the reconstruction points to minimize the reconstruction error between the projected and observed image points1

In this series of posts, we’ll examine BA in detail. In the first post, we’ll derive expressions for the elements of the Jacobian matrix used in BA, in the second post, we’ll plot some of these Jacobians and consider their properties and in the third and fourth post, we’ll apply BA to simulated data sets and analyze the results.

Assume each camera j is parameterized by a vector a_j and each 3D point i by a vector b_i. Then, Bundle Adjustment involves minimizing the following objective function:

\min_{a_j,b_i}\sum\limits_{i=1}^n\sum\limits_{j=1}^m d(Q(a_j, b_i), x_{ij})^2

Here, Q(a_j, b_i) is the predicted projection of point i on the image j and d(x,y) denotes the Euclidean distance between the image points represented by the vectors x and y.

The vector a_j consists of the rotation and translation parameters to transform the world point in the camera coordinate system and the intrinsic parameters of the camera that is the focal length (in the x and y dimensions), coordinates of the center of projection and the skew factor (if any).

Together, these parameters are combined as follows to project a 3D world point on the image:

\begin{bmatrix}wu\\wv\\w\end{bmatrix} = \begin{bmatrix}f_x & s_k &u_0 \\0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix}R & T \end{bmatrix}\begin{bmatrix}X \\ Y \\ Z \\ 1 \end{bmatrix}

Henceforth, we’ll assume that our camera is a rectilinear camera with square pixels. Thus, f_x = f_y and s_k = 1.

Bundle Adjustment methods typically employ the Levenberg Marquardt (LM) algorithm to find the minimum of the optimization function. The LM algorithm needs the jacobians i.e., the partial derivatives of the image coordinates wrt the intrinsic and extrinsic parameters of the camera, and the coordinates of the 3D points. During my research on Bundle Adjustment, I found plenty of references that describe the structure of the problem and the techniques used to solve it, but none that show how to compute these jacobians that are needed by any practical implementation. The primary purpose of this post is to show how to compute the jacobians and practical information about some of their properties.

Let’s first consider the number of independent parameters that we need to optimize over. For  N points each of which is visible in M images, the number of independent parameters is:

Here, X, Y, Z are the 3D coordinates of the point, f, u_0, v_0 are the focal length of the camera and the coordinates of the center of projection respectively (we assume square pixels and zero skew), w_x, w_y, w_z are the parameters of the rotation matrix in the axis-angle representation and t_x, t_y, t_z are the elements of the translation vector between the object and the camera coordinate system. Note that while the rotation matrix has 9 elements, due to the orthogonality and unit norm constraints, only 3 are independent. The axis-angle representation is a convenient format to represent rotations as it only has 3 elements and therefore keeps the number of elements used to represent a rotation to a minimum.

The axis-angle and rotation matrix formats can be inter-converted using the following equations.

R=e^{\lbrack w\rbrack_x}=I+\frac{sin(\|w\|)}{\|w\|} \lbrack w \rbrack _x + \frac{(1-cos(\|w\|))}{\|w\|^2} (\lbrack w \rbrack _x)^2

\|w\| = \arccos(\frac{trace(R)-1}{2}, \frac{w}{\|w\|}=\frac{1}{2sin(\|w\|)}\begin{bmatrix}r_{32}-r_{23}\\r_{13}-r_{31}\\r_{21}-r_{12}\end{bmatrix}

Here \lbrack w \rbrack _x has the following form:

\lbrack w \rbrack _x = \begin{bmatrix} 0 & -w_z & w_y \\ w_z & 0 & -w_x \\ -w_y & w_x & 0 \end{bmatrix}

If we represent the image of a 3D point P by a 2D vector \bf{x} = (u,v) then the jacobian matrix can expressed as two separate matrices A and B that encode the variation of \bf{x} by the camera parameters denoted by the vector \bf{a} and 3D point coordinates denoted by the vector \bf{b}.

A = \begin{bmatrix}\frac{\partial{u}}{\partial{\bf{a}}} \\ \frac{\partial{v}}{\partial{\bf{a}}} \end{bmatrix} = \begin{bmatrix}\frac{\partial{u}}{\partial{w_x}} & \frac{\partial{u}}{\partial{w_y}} & \frac{\partial{u}}{\partial{w_z}} & \frac{\partial{u}}{\partial{f}} & \frac{\partial{u}}{\partial{u_0}} & \frac{\partial{u}}{\partial{v_0}}\\ \frac{\partial{v}}{\partial{w_x}} & \frac{\partial{v}}{\partial{w_y}} & \frac{\partial{v}}{\partial{w_z}} & \frac{\partial{v}}{\partial{f}} & \frac{\partial{v}}{\partial{u_0}} & \frac{\partial{v}}{\partial{v_0}}\end{bmatrix}

B = \begin{bmatrix}\frac{\partial{u}}{\partial{\bf{b}}} \\ \frac{\partial{v}}{\partial{\bf{b}}} \end{bmatrix} = \begin{bmatrix}\frac{\partial{u}}{\partial{X}} & \frac{\partial{u}}{\partial{Y}} & \frac{\partial{u}}{\partial{Z}} \\ \frac{\partial{v}}{\partial{X}} & \frac{\partial{v}}{\partial{Y}} & \frac{\partial{v}}{\partial{Z}} \end{bmatrix}

In, general the expressions for these partial derivatives are quite complex. Most of the complexity arises from converting the rotation from axis-angle representation to rotation matrix representation. If we consider a 3D point that has already been transformed into the camera coordinate system, then the jacobian becomes significantly simpler and easier to analyze. Let’s denote the 3D point in the camera coordinate system by P^\prime. P^\prime is related to P as:

P^\prime = \begin{bmatrix}R & t\end{bmatrix}P

The image coordinates u and v can be obtained by applying the perspective projection equation:

u = \frac{fX^\prime}{Z^\prime}+u_0, v = \frac{fY^\prime}{Z^\prime}+v_0

The jacobians are now easily obtained as:

A = \begin{bmatrix}\frac{\partial{u}}{\partial{\bf{a}}} \\ \frac{\partial{v}}{\partial{\bf{a}}} \end{bmatrix} = \begin{bmatrix} \frac{\partial{u}}{\partial{f}} & \frac{\partial{u}}{\partial{u_0}} & \frac{\partial{u}}{\partial{v_0}}\\ \frac{\partial{v}}{\partial{f}} & \frac{\partial{v}}{\partial{u_0}} & \frac{\partial{v}}{\partial{v_0}}\end{bmatrix} = \begin{bmatrix} \frac{X^\prime}{Z^\prime} & 1 & 0 \\ \frac{Y^\prime}{Z^\prime} & 0 & 1\end{bmatrix}

B = \begin{bmatrix}\frac{\partial{u}}{\partial{\bf{b}}} \\ \frac{\partial{v}}{\partial{\bf{b}}} \end{bmatrix} = \begin{bmatrix} \frac{\partial{u}}{\partial{X^\prime}} & \frac{\partial{u}}{\partial{Y^\prime}} & \frac{\partial{u}}{\partial{Z^\prime}}\\ \frac{\partial{v}}{\partial{X^\prime}} & \frac{\partial{v}}{\partial{Y^\prime}} & \frac{\partial{v}}{\partial{Z^\prime}}\end{bmatrix} = \begin{bmatrix} \frac{f}{Z^\prime} & 0 & -\frac{fX}{Z^{\prime2}} \\ 0 & \frac{f}{Z^\prime} & -\frac{fY}{Z^{\prime2}}\end{bmatrix}

Let’s look at the value of these jacobians at some reasonable values of f, X^\prime, Y\prime, Z^\prime. Let’s consider a camera with a focal length of 5mm looking at a point located at X^\prime=20, Y^\prime=20, Z^\prime=80. Using these values, the jacobian matrix is:

A = \begin{bmatrix}\frac{\partial{u}}{\partial{\bf{a}}} \\ \frac{\partial{v}}{\partial{\bf{a}}} \end{bmatrix} = \begin{bmatrix} \frac{\partial{u}}{\partial{f}} & \frac{\partial{u}}{\partial{u_0}} & \frac{\partial{u}}{\partial{v_0}}\\ \frac{\partial{v}}{\partial{f}} & \frac{\partial{v}}{\partial{u_0}} & \frac{\partial{v}}{\partial{v_0}}\end{bmatrix} = \begin{bmatrix} 0.25 & 1 & 0 \\ 0.25 & 0 & 1\end{bmatrix}

B = \begin{bmatrix}\frac{\partial{u}}{\partial{\bf{b}}} \\ \frac{\partial{v}}{\partial{\bf{b}}} \end{bmatrix} = \begin{bmatrix} \frac{\partial{u}}{\partial{X^\prime}} & \frac{\partial{u}}{\partial{Y^\prime}} & \frac{\partial{u}}{\partial{Z^\prime}}\\ \frac{\partial{v}}{\partial{X^\prime}} & \frac{\partial{v}}{\partial{Y^\prime}} & \frac{\partial{v}}{\partial{Z^\prime}}\end{bmatrix} = \begin{bmatrix} 0.00625 & 0 & -0.0015625 \\ 0 & 0.00625 & -0.0015625\end{bmatrix}

 

From this jacobian matrix, we can make the following observations.

  • The image coordinates are much more sensitive to the intrinsic parameters than to the coordinates of the object point in the camera coordinate system. Among the intrinsic parameters, the coordinates of the center of projection is the most important (for \frac{X^\prime,Y^\prime}{Z^\prime}<1). Thus, it is very important to calibrate the camera properly in practical applications of computer vision.
  • For \frac{X^\prime,Y^\prime}{Z^\prime}<1, changes in X^\prime have a larger effect than changes in Z^\prime on the image coordinates. This agrees with our intuition that for small objects, movement along the viewing direction effects the image less than movement perpendicular to the viewing direction.

Recall that \bf{P^\prime} = [X^\prime,Y^\prime, Z^\prime] are the coordinates of the object point in the camera coordinate system. Thus, \bf{P^\prime} = [R | t]P. If the object and camera coordinates systems are aligned, R = I and hence [X^\prime,Y^\prime, Z^\prime] = [X, Y, Z] + [t_x, t_y, t_z]. For small objects, most of the contribution to \bf{P^\prime} is from the camera-object translation vector. Thus, the effect of changes in the object point coordinates in the object coordinate system on the image coordinates is even smaller.

Let’s now consider the complete expressions for the Jacobians. Due to the complexity introduced by converting the axis-angle representation into a rotation matrix used in the perspective projection equation, calculating the jacobians by hand is very tedious. I use the symbolic toolbox in Matlab to compute the jacobians. Here’s the code:

As mentioned earlier, the expressions for the jacobians are very complex, however most of the complexity arises from the axis-angle representation for rotation. For example, here’s the expression for Jy_z:

– (f*((wx*sin((wx^2 + wy^2 + wz^2)^(1/2)))/(wx^2 + wy^2 + wz^2)^(1/2) + (wy*wz*(cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1))/(wx^2 + wy^2 + wz^2)) – v0*(((cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1)*(wx^2 + wy^2))/(wx^2 + wy^2 + wz^2) + 1))/(tz – X*((wy*sin((wx^2 + wy^2 + wz^2)^(1/2)))/(wx^2 + wy^2 + wz^2)^(1/2) + (wx*wz*(cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1))/(wx^2 + wy^2 + wz^2)) + Y*((wx*sin((wx^2 + wy^2 + wz^2)^(1/2)))/(wx^2 + wy^2 + wz^2)^(1/2) – (wy*wz*(cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1))/(wx^2 + wy^2 + wz^2)) + Z*(((cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1)*(wx^2 + wy^2))/(wx^2 + wy^2 + wz^2) + 1)) – ((((cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1)*(wx^2 + wy^2))/(wx^2 + wy^2 + wz^2) + 1)*(f*ty + X*(f*((wz*sin((wx^2 + wy^2 + wz^2)^(1/2)))/(wx^2 + wy^2 + wz^2)^(1/2) – (wx*wy*(cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1))/(wx^2 + wy^2 + wz^2)) – v0*((wy*sin((wx^2 + wy^2 + wz^2)^(1/2)))/(wx^2 + wy^2 + wz^2)^(1/2) + (wx*wz*(cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1))/(wx^2 + wy^2 + wz^2))) + tz*v0 + Y*(v0*((wx*sin((wx^2 + wy^2 + wz^2)^(1/2)))/(wx^2 + wy^2 + wz^2)^(1/2) – (wy*wz*(cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1))/(wx^2 + wy^2 + wz^2)) + f*(((cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1)*(wx^2 + wz^2))/(wx^2 + wy^2 + wz^2) + 1)) – Z*(f*((wx*sin((wx^2 + wy^2 + wz^2)^(1/2)))/(wx^2 + wy^2 + wz^2)^(1/2) + (wy*wz*(cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1))/(wx^2 + wy^2 + wz^2)) – v0*(((cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1)*(wx^2 + wy^2))/(wx^2 + wy^2 + wz^2) + 1))))/(tz – X*((wy*sin((wx^2 + wy^2 + wz^2)^(1/2)))/(wx^2 + wy^2 + wz^2)^(1/2) + (wx*wz*(cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1))/(wx^2 + wy^2 + wz^2)) + Y*((wx*sin((wx^2 + wy^2 + wz^2)^(1/2)))/(wx^2 + wy^2 + wz^2)^(1/2) – (wy*wz*(cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1))/(wx^2 + wy^2 + wz^2)) + Z*(((cos((wx^2 + wy^2 + wz^2)^(1/2)) – 1)*(wx^2 + wy^2))/(wx^2 + wy^2 + wz^2) + 1))^2

🙂

1.
Lourakis MIA, Argyros AA. SBA. ACM Transactions on Mathematical Software. 2009;36(1):1-30. doi: 10.1145/1486525.1486527 [Source]
1.
Lourakis MIA, Argyros AA. SBA. ACM Transactions on Mathematical Software. 2009;36(1):1-30. doi:10.1145/1486525.1486527

3 Comments

  1. Hello Ankur!
    Your works has really impressed me. I’m a researcher from Ha Noi – Viet Nam. I have read and tryed to replicate your steps in Bundle Adjustment. I would like you to do a favor: Share your simulated data sets? Or You show how to create these data sets.
    Thank you very much!

Leave a Reply

Your email address will not be published.


*