Derivatives of Generalized Eigenvalues and Eigenvectors
By David K. Zhang
Consider the generalized eigenvalue problem
$$ A v = \lambda B v, \tag{1} $$
where $A$ and $B$ are $n \times n$ matrices, $\lambda$ is an eigenvalue, and $v$ is the corresponding eigenvector. For simplicity, we will restrict our attention to the special case of a symmetric-definite eigenvalue problem, in which $A$ and $B$ are real symmetric matrices and $B$ is positive-definite.
Suppose $A$ and $B$ depend smoothly upon a real parameter $x$. That is to say, the entries $A_{ij}(x)$ and $B_{ij}(x)$ are smooth (infinitely differentiable) functions of $x$. Suppose further that this dependence is explicitly known, in the sense that the derivatives $\partial A / \partial x$ and $\partial B / \partial x$ can be calculated by a known procedure. Then we can ask how the eigenvalues $\lambda$ and eigenvectors $v$ vary in response to a change in $x$.
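By differentiating equation (1) with respect to $x$ and applying the product rule, we have
$$ \frac{\partial A}{\partial x} v + A \frac{\partial v}{\partial x} = \frac{\partial \lambda}{\partial x} B v + \lambda \frac{\partial B}{\partial x} v + \lambda B \frac{\partial v}{\partial x}, \tag{2} $$
which, after collecting and regrouping terms on the LHS, can be rewritten as
$$ \left( \frac{\partial A}{\partial x} - \frac{\partial \lambda}{\partial x} B - \lambda \frac{\partial B}{\partial x} \right) v + (A - \lambda B) \frac{\partial v}{\partial x} = 0. \tag{3} $$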
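Now, we take the inner product of both sides of this equation with the original eigenvector $v$.
$$ v^T \left( \frac{\partial A}{\partial x} - \frac{\partial \lambda}{\partial x} B - \lambda \frac{\partial B}{\partial x} \right) v + v^T (A - \lambda B) \frac{\partial v}{\partial x} = 0 \tag{4} $$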
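Observe that, by the original eigenvalue equation (1),
$$ (A - \lambda B) v = 0. \tag{5} $$
Because $A$ and $B$ are symmetric, transposing this identity gives $v^T (A - \lambda B) = 0$. Thus, the second term of equation (4) vanishes, and only the first term remains.
$$ v^T \left( \frac{\partial A}{\partial x} - \frac{\partial \lambda}{\partial x} B - \lambda \frac{\partial B}{\partial x} \right) v = 0 \tag{6} $$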
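Now, by solving for $\partial \lambda / \partial x$, we obtain the result
$$ \frac{\partial \lambda}{\partial x} = \frac{v^T \left( \dfrac{\partial A}{\partial x} - \lambda \dfrac{\partial B}{\partial x} \right) v}{v^T B v}. \tag{7} $$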
Remark: Many symmetric-definite eigen-solvers (for example, xSYGVX in LAPACK) follow the particularly convenient convention of returning eigenvectors normalized so that $v^T B v = 1$. In this case, a few CPU cycles can be saved by omitting the division by $v^T B v$ from equation (7).
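
As a numerical sanity check, the following sketch evaluates equation (7) for every eigenpair of a small example and compares the result against a central finite difference. The matrix family `A_of`/`B_of`, the helper `gen_eigh` (a Cholesky-based reduction, chosen here only to keep the example NumPy-only), and the step size `h` are all assumptions made for illustration, not part of the derivation.

```python
import numpy as np

def gen_eigh(A, B):
    """Solve A v = lam B v for symmetric A and symmetric positive-definite B.

    Reduces to a standard symmetric problem through a Cholesky factor of B.
    The returned eigenvectors (columns of V) satisfy V.T @ B @ V = I.
    """
    L = np.linalg.cholesky(B)              # B = L @ L.T
    Linv = np.linalg.inv(L)
    lam, U = np.linalg.eigh(Linv @ A @ Linv.T)
    return lam, Linv.T @ U                 # back-transform: v = L^{-T} u

# Hypothetical one-parameter family with hand-coded derivatives.
def A_of(x):
    return np.array([[1.0, x, 0.0], [x, 3.0, x**2], [0.0, x**2, 6.0]])

def dA_of(x):
    return np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 2 * x], [0.0, 2 * x, 0.0]])

def B_of(x):
    return np.array([[2.0, 0.5 * x, 0.0], [0.5 * x, 2.0, 0.0], [0.0, 0.0, 1.0 + x**2]])

def dB_of(x):
    return np.array([[0.0, 0.5, 0.0], [0.5, 0.0, 0.0], [0.0, 0.0, 2 * x]])

def eigenvalue_derivatives(A, B, dA, dB):
    """Apply equation (7) to every eigenpair of the pencil (A, B)."""
    lam, V = gen_eigh(A, B)
    # The division by v^T B v is kept for generality; it equals 1 here.
    return np.array([v @ (dA - l * dB) @ v / (v @ B @ v) for l, v in zip(lam, V.T)])

x0, h = 0.3, 1e-6
dlam = eigenvalue_derivatives(A_of(x0), B_of(x0), dA_of(x0), dB_of(x0))

# Central finite difference of the sorted eigenvalues, for comparison.
lam_plus, _ = gen_eigh(A_of(x0 + h), B_of(x0 + h))
lam_minus, _ = gen_eigh(A_of(x0 - h), B_of(x0 - h))
dlam_fd = (lam_plus - lam_minus) / (2 * h)

print(dlam)
print(dlam_fd)
```

The comparison relies on the eigenvalues being well separated near `x0`, so that sorting them tracks each branch consistently across the three evaluations.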
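To obtain the corresponding formula for $\partial v / \partial x$, we return to equation (3), which can be rewritten as
$$ (A - \lambda B) \frac{\partial v}{\partial x} = -\left( \frac{\partial A}{\partial x} - \frac{\partial \lambda}{\partial x} B - \lambda \frac{\partial B}{\partial x} \right) v. \tag{8} $$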
It is tempting to multiply both sides of this equation by $(A - \lambda B)^{-1}$, but the eigenvalue equation (1) guarantees that $(A - \lambda B) v = 0$. Thus, the matrix $A - \lambda B$, having nontrivial kernel, is singular. This means that equation (8) fails to uniquely specify the vector $\partial v / \partial x$.
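However, we can still derive a meaningful result if we multiply not by the inverse $(A - \lambda B)^{-1}$, but by the Moore-Penrose pseudoinverse $(A - \lambda B)^+$. In this case, we obtain the equation
$$ P \frac{\partial v}{\partial x} = -(A - \lambda B)^+ \left( \frac{\partial A}{\partial x} - \frac{\partial \lambda}{\partial x} B - \lambda \frac{\partial B}{\partial x} \right) v, \tag{9} $$
where $P = (A - \lambda B)^+ (A - \lambda B)$ is not the identity matrix, but the projection matrix onto the orthogonal complement of the eigenspace $E_\lambda = \ker(A - \lambda B)$.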
This has a clear geometric interpretation if we recall that an eigenvector is never uniquely determined. In the non-degenerate case, an eigenvector is only determined up to a nonzero multiplicative constant, and in general, a set of degenerate eigenvectors is only determined up to a choice of basis in the eigenspace $E_\lambda$. Thus, any change in $v$ occurring inside $E_\lambda$ is geometrically inconsequential. The meaningful change is that which occurs orthogonal to $E_\lambda$, and this is precisely what is computed by the Moore-Penrose pseudoinverse.
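We will therefore adopt the convention that the derivative $\partial v / \partial x$ should always be assumed orthogonal to the eigenspace $E_\lambda$. With this convention in place, the projection on the LHS of equation (9) acts as the identity, and equation (9) reduces to the final result
$$ \frac{\partial v}{\partial x} = -(A - \lambda B)^+ \left( \frac{\partial A}{\partial x} - \frac{\partial \lambda}{\partial x} B - \lambda \frac{\partial B}{\partial x} \right) v. \tag{10} $$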
Remark: It is possible to evaluate the RHS of equation (10) without directly computing the pseudoinverse of $A - \lambda B$ thanks to the following fact: if $M$ is a rank-deficient matrix, then $M^+ b$ is the unique minimum (Euclidean) norm least-squares solution of the linear system $M z = b$, and it is an exact solution whenever the system is consistent. Minimum-norm solvers for underdetermined linear systems are widely available (for example, xGELSD in LAPACK) and may be preferable depending on the efficiency of the provided implementation.
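
The sketch below evaluates equation (10) in both ways described in the remark, once with an explicit pseudoinverse (`np.linalg.pinv`) and once with a minimum-norm least-squares solve (`np.linalg.lstsq`), and compares both with a finite difference. The example family, the helper names, and the cutoff `rcond=1e-10` are assumptions of this illustration. The finite difference uses eigenvectors scaled to Euclidean unit length with a consistent sign, since that is the normalization whose derivative is orthogonal to $v$ in the non-degenerate case.

```python
import numpy as np

def gen_eigh(A, B):
    """Symmetric-definite solver via Cholesky reduction (illustrative helper)."""
    L = np.linalg.cholesky(B)
    Linv = np.linalg.inv(L)
    lam, U = np.linalg.eigh(Linv @ A @ Linv.T)
    return lam, Linv.T @ U

# Hypothetical one-parameter family with hand-coded derivatives.
def A_of(x):
    return np.array([[1.0, x, 0.0], [x, 3.0, x**2], [0.0, x**2, 6.0]])

def dA_of(x):
    return np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 2 * x], [0.0, 2 * x, 0.0]])

def B_of(x):
    return np.array([[2.0, 0.5 * x, 0.0], [0.5 * x, 2.0, 0.0], [0.0, 0.0, 1.0 + x**2]])

def dB_of(x):
    return np.array([[0.0, 0.5, 0.0], [0.5, 0.0, 0.0], [0.0, 0.0, 2 * x]])

def eigvec_derivative(A, B, dA, dB, lam, v, rcond=1e-10):
    """Equation (10), evaluated two ways: (via pinv, via minimum-norm lstsq)."""
    dlam = v @ (dA - lam * dB) @ v / (v @ B @ v)         # equation (7)
    M = A - lam * B                                      # singular by construction
    rhs = -(dA - dlam * B - lam * dB) @ v                # RHS of equation (8)
    dv_pinv = np.linalg.pinv(M, rcond=rcond) @ rhs
    dv_lstsq = np.linalg.lstsq(M, rhs, rcond=rcond)[0]
    return dv_pinv, dv_lstsq

def unit_eigvec(x, k, ref=None):
    """k-th eigenvector at x, Euclidean unit length, sign-aligned with ref."""
    _, V = gen_eigh(A_of(x), B_of(x))
    v = V[:, k] / np.linalg.norm(V[:, k])
    return -v if ref is not None and v @ ref < 0 else v

x0, h, k = 0.3, 1e-6, 0
lam, _ = gen_eigh(A_of(x0), B_of(x0))
v = unit_eigvec(x0, k)
dv_pinv, dv_lstsq = eigvec_derivative(A_of(x0), B_of(x0), dA_of(x0), dB_of(x0), lam[k], v)

# Central finite difference of the unit-normalized eigenvector.
dv_fd = (unit_eigvec(x0 + h, k, v) - unit_eigvec(x0 - h, k, v)) / (2 * h)

print(dv_pinv)
print(dv_lstsq)
print(dv_fd)
```

An explicit `rcond` is passed to both routines so that the numerically tiny singular value of $A - \lambda B$ is reliably treated as zero; the default cutoff may or may not do so, depending on rounding.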
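Suppose now that the matrices $A$ and $B$ depend smoothly upon two real parameters $x$ and $y$. By differentiating equation (2) with respect to $y$, we see that
$$ \begin{aligned} &\frac{\partial^2 A}{\partial x \partial y} v + \frac{\partial A}{\partial x} \frac{\partial v}{\partial y} + \frac{\partial A}{\partial y} \frac{\partial v}{\partial x} + A \frac{\partial^2 v}{\partial x \partial y} \\ &\quad = \frac{\partial^2 \lambda}{\partial x \partial y} B v + \frac{\partial \lambda}{\partial x} \frac{\partial B}{\partial y} v + \frac{\partial \lambda}{\partial y} \frac{\partial B}{\partial x} v + \lambda \frac{\partial^2 B}{\partial x \partial y} v \\ &\qquad + \frac{\partial \lambda}{\partial x} B \frac{\partial v}{\partial y} + \frac{\partial \lambda}{\partial y} B \frac{\partial v}{\partial x} + \lambda \frac{\partial B}{\partial x} \frac{\partial v}{\partial y} + \lambda \frac{\partial B}{\partial y} \frac{\partial v}{\partial x} + \lambda B \frac{\partial^2 v}{\partial x \partial y}. \end{aligned} \tag{11} $$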
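As before, we take the inner product of both sides with $v$ to obtain
$$ \begin{aligned} &v^T \frac{\partial^2 A}{\partial x \partial y} v + v^T \frac{\partial A}{\partial x} \frac{\partial v}{\partial y} + v^T \frac{\partial A}{\partial y} \frac{\partial v}{\partial x} + v^T A \frac{\partial^2 v}{\partial x \partial y} \\ &\quad = \frac{\partial^2 \lambda}{\partial x \partial y} v^T B v + \frac{\partial \lambda}{\partial x} v^T \frac{\partial B}{\partial y} v + \frac{\partial \lambda}{\partial y} v^T \frac{\partial B}{\partial x} v + \lambda v^T \frac{\partial^2 B}{\partial x \partial y} v \\ &\qquad + \frac{\partial \lambda}{\partial x} v^T B \frac{\partial v}{\partial y} + \frac{\partial \lambda}{\partial y} v^T B \frac{\partial v}{\partial x} + \lambda v^T \frac{\partial B}{\partial x} \frac{\partial v}{\partial y} + \lambda v^T \frac{\partial B}{\partial y} \frac{\partial v}{\partial x} + \lambda v^T B \frac{\partial^2 v}{\partial x \partial y} \end{aligned} \tag{12} $$
and use the original eigenvalue equation, in the transposed form $v^T A = \lambda v^T B$, to eliminate the pair of terms containing $\partial^2 v / \partial x \partial y$:
$$ \begin{aligned} &v^T \frac{\partial^2 A}{\partial x \partial y} v + v^T \frac{\partial A}{\partial x} \frac{\partial v}{\partial y} + v^T \frac{\partial A}{\partial y} \frac{\partial v}{\partial x} \\ &\quad = \frac{\partial^2 \lambda}{\partial x \partial y} v^T B v + \frac{\partial \lambda}{\partial x} v^T \frac{\partial B}{\partial y} v + \frac{\partial \lambda}{\partial y} v^T \frac{\partial B}{\partial x} v + \lambda v^T \frac{\partial^2 B}{\partial x \partial y} v \\ &\qquad + \frac{\partial \lambda}{\partial x} v^T B \frac{\partial v}{\partial y} + \frac{\partial \lambda}{\partial y} v^T B \frac{\partial v}{\partial x} + \lambda v^T \frac{\partial B}{\partial x} \frac{\partial v}{\partial y} + \lambda v^T \frac{\partial B}{\partial y} \frac{\partial v}{\partial x}. \end{aligned} \tag{13} $$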
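By solving for $\partial^2 \lambda / \partial x \partial y$, we obtain the result
$$ \begin{aligned} \frac{\partial^2 \lambda}{\partial x \partial y} = \frac{1}{v^T B v} \Bigg[ &v^T \left( \frac{\partial^2 A}{\partial x \partial y} - \lambda \frac{\partial^2 B}{\partial x \partial y} - \frac{\partial \lambda}{\partial x} \frac{\partial B}{\partial y} - \frac{\partial \lambda}{\partial y} \frac{\partial B}{\partial x} \right) v \\ &+ v^T \left( \frac{\partial A}{\partial x} - \frac{\partial \lambda}{\partial x} B - \lambda \frac{\partial B}{\partial x} \right) \frac{\partial v}{\partial y} + v^T \left( \frac{\partial A}{\partial y} - \frac{\partial \lambda}{\partial y} B - \lambda \frac{\partial B}{\partial y} \right) \frac{\partial v}{\partial x} \Bigg]. \end{aligned} \tag{14} $$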
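
To illustrate how equations (7), (10), and (14) chain together, here is a minimal sketch that computes the mixed second derivative of one eigenvalue and compares it with a four-point finite difference. The bilinear family $A(x, y) = A_0 + x A_1 + y A_2 + x y A_3$ (and likewise for $B$) is a hypothetical example chosen so that the partial derivatives can be read off directly; the helper names and tolerances are likewise assumptions.

```python
import numpy as np

def gen_eigh(A, B):
    """Symmetric-definite solver via Cholesky reduction (illustrative helper)."""
    L = np.linalg.cholesky(B)
    Linv = np.linalg.inv(L)
    lam, U = np.linalg.eigh(Linv @ A @ Linv.T)
    return lam, Linv.T @ U

# Hypothetical bilinear two-parameter family.
A0 = np.diag([1.0, 3.0, 6.0])
A1 = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
A2 = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
A3 = np.diag([1.0, 0.0, -1.0])
B0 = 2.0 * np.eye(3)
B1 = 0.5 * A1
B2 = np.diag([0.0, 0.0, 0.5])
B3 = np.array([[0.0, 0.0, 0.2], [0.0, 0.0, 0.0], [0.2, 0.0, 0.0]])

def A_of(x, y):
    return A0 + x * A1 + y * A2 + x * y * A3

def B_of(x, y):
    return B0 + x * B1 + y * B2 + x * y * B3

def mixed_eigenvalue_derivative(x, y, k, rcond=1e-10):
    """Equation (14) for the k-th eigenvalue at the point (x, y)."""
    A, B = A_of(x, y), B_of(x, y)
    Ax, Ay, Axy = A1 + y * A3, A2 + x * A3, A3
    Bx, By, Bxy = B1 + y * B3, B2 + x * B3, B3
    lam_all, V = gen_eigh(A, B)
    lam, v = lam_all[k], V[:, k]
    vBv = v @ B @ v
    lx = v @ (Ax - lam * Bx) @ v / vBv           # equation (7) in x
    ly = v @ (Ay - lam * By) @ v / vBv           # equation (7) in y
    Mp = np.linalg.pinv(A - lam * B, rcond=rcond)
    Rx = Ax - lx * B - lam * Bx
    Ry = Ay - ly * B - lam * By
    vx = -Mp @ (Rx @ v)                          # equation (10) in x
    vy = -Mp @ (Ry @ v)                          # equation (10) in y
    num = v @ (Axy - lam * Bxy - lx * By - ly * Bx) @ v + v @ Rx @ vy + v @ Ry @ vx
    return num / vBv

def lam_k(x, y, k):
    return gen_eigh(A_of(x, y), B_of(x, y))[0][k]

x0, y0, k, h = 0.3, 0.2, 0, 1e-4
lam_xy = mixed_eigenvalue_derivative(x0, y0, k)

# Four-point central difference for the mixed partial derivative.
lam_xy_fd = (lam_k(x0 + h, y0 + h, k) - lam_k(x0 + h, y0 - h, k)
             - lam_k(x0 - h, y0 + h, k) + lam_k(x0 - h, y0 - h, k)) / (4 * h**2)

print(lam_xy, lam_xy_fd)
```

The value of equation (14) does not depend on how $v$ is scaled, because every term in the bracket is quadratic in $v$ and is divided by $v^T B v$.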
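With the second derivative of the eigenvalue in hand, we return to equation (11) and isolate terms containing $\partial^2 v / \partial x \partial y$ to obtain
$$ \begin{aligned} (A - \lambda B) \frac{\partial^2 v}{\partial x \partial y} = &-\left( \frac{\partial^2 A}{\partial x \partial y} - \frac{\partial^2 \lambda}{\partial x \partial y} B - \frac{\partial \lambda}{\partial x} \frac{\partial B}{\partial y} - \frac{\partial \lambda}{\partial y} \frac{\partial B}{\partial x} - \lambda \frac{\partial^2 B}{\partial x \partial y} \right) v \\ &- \left( \frac{\partial A}{\partial x} - \frac{\partial \lambda}{\partial x} B - \lambda \frac{\partial B}{\partial x} \right) \frac{\partial v}{\partial y} - \left( \frac{\partial A}{\partial y} - \frac{\partial \lambda}{\partial y} B - \lambda \frac{\partial B}{\partial y} \right) \frac{\partial v}{\partial x}. \end{aligned} \tag{15} $$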
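Applying the same technique as before, we multiply both sides by the pseudoinverse $(A - \lambda B)^+$ and retain the component of $\partial^2 v / \partial x \partial y$ orthogonal to the eigenspace, to obtain the result
$$ \begin{aligned} \frac{\partial^2 v}{\partial x \partial y} = -(A - \lambda B)^+ \Bigg[ &\left( \frac{\partial^2 A}{\partial x \partial y} - \frac{\partial^2 \lambda}{\partial x \partial y} B - \frac{\partial \lambda}{\partial x} \frac{\partial B}{\partial y} - \frac{\partial \lambda}{\partial y} \frac{\partial B}{\partial x} - \lambda \frac{\partial^2 B}{\partial x \partial y} \right) v \\ &+ \left( \frac{\partial A}{\partial x} - \frac{\partial \lambda}{\partial x} B - \lambda \frac{\partial B}{\partial x} \right) \frac{\partial v}{\partial y} + \left( \frac{\partial A}{\partial y} - \frac{\partial \lambda}{\partial y} B - \lambda \frac{\partial B}{\partial y} \right) \frac{\partial v}{\partial x} \Bigg]. \end{aligned} \tag{16} $$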
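In the special case $x = y$, these results reduce to the following:
$$ \frac{\partial^2 \lambda}{\partial x^2} = \frac{1}{v^T B v} \left[ v^T \left( \frac{\partial^2 A}{\partial x^2} - \lambda \frac{\partial^2 B}{\partial x^2} - 2 \frac{\partial \lambda}{\partial x} \frac{\partial B}{\partial x} \right) v + 2 v^T \left( \frac{\partial A}{\partial x} - \frac{\partial \lambda}{\partial x} B - \lambda \frac{\partial B}{\partial x} \right) \frac{\partial v}{\partial x} \right] \tag{17} $$
$$ \frac{\partial^2 v}{\partial x^2} = -(A - \lambda B)^+ \left[ \left( \frac{\partial^2 A}{\partial x^2} - \frac{\partial^2 \lambda}{\partial x^2} B - 2 \frac{\partial \lambda}{\partial x} \frac{\partial B}{\partial x} - \lambda \frac{\partial^2 B}{\partial x^2} \right) v + 2 \left( \frac{\partial A}{\partial x} - \frac{\partial \lambda}{\partial x} B - \lambda \frac{\partial B}{\partial x} \right) \frac{\partial v}{\partial x} \right] \tag{18} $$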
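
The single-parameter second derivatives can be checked numerically in the same spirit. The sketch below assumes a hypothetical family with hand-coded first and second derivatives and compares both formulas with second-order finite differences. One caveat about the comparison: for a Euclidean unit-normalized eigenvector, the true second derivative generally has a component along $v$, whereas the pseudoinverse returns only the part orthogonal to $v$ (in the non-degenerate case), so the finite difference is projected with $I - v v^T$ before comparing.

```python
import numpy as np

def gen_eigh(A, B):
    """Symmetric-definite solver via Cholesky reduction (illustrative helper)."""
    L = np.linalg.cholesky(B)
    Linv = np.linalg.inv(L)
    lam, U = np.linalg.eigh(Linv @ A @ Linv.T)
    return lam, Linv.T @ U

# Hypothetical family: each function returns (matrix, first, second derivative).
def A_derivs(x):
    A = np.array([[1.0, x, 0.0], [x, 3.0, x**2], [0.0, x**2, 6.0]])
    A1 = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 2 * x], [0.0, 2 * x, 0.0]])
    A2 = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 2.0], [0.0, 2.0, 0.0]])
    return A, A1, A2

def B_derivs(x):
    B = np.array([[2.0, 0.5 * x, 0.0], [0.5 * x, 2.0, 0.0], [0.0, 0.0, 1.0 + x**2]])
    B1 = np.array([[0.0, 0.5, 0.0], [0.5, 0.0, 0.0], [0.0, 0.0, 2 * x]])
    B2 = np.diag([0.0, 0.0, 2.0])
    return B, B1, B2

def eigpair(x, k, ref=None):
    """k-th eigenvalue and Euclidean unit eigenvector, sign-aligned with ref."""
    lam, V = gen_eigh(A_derivs(x)[0], B_derivs(x)[0])
    v = V[:, k] / np.linalg.norm(V[:, k])
    if ref is not None and v @ ref < 0:
        v = -v
    return lam[k], v

def second_derivatives(x, k, rcond=1e-10):
    """Second derivatives of the k-th eigenpair in the special case x = y."""
    A, A1, A2 = A_derivs(x)
    B, B1, B2 = B_derivs(x)
    lam, v = eigpair(x, k)
    vBv = v @ B @ v
    Mp = np.linalg.pinv(A - lam * B, rcond=rcond)
    l1 = v @ (A1 - lam * B1) @ v / vBv                     # equation (7)
    R1 = A1 - l1 * B - lam * B1
    v1 = -Mp @ (R1 @ v)                                    # equation (10)
    l2 = (v @ (A2 - lam * B2 - 2 * l1 * B1) @ v + 2 * (v @ R1 @ v1)) / vBv
    v2 = -Mp @ ((A2 - l2 * B - 2 * l1 * B1 - lam * B2) @ v + 2 * (R1 @ v1))
    return lam, v, l2, v2

x0, k, h = 0.3, 0, 1e-4
lam0, v0, lam_xx, v_xx = second_derivatives(x0, k)

# Second-order central differences, with the vector one projected off v0.
lam_p, v_p = eigpair(x0 + h, k, v0)
lam_m, v_m = eigpair(x0 - h, k, v0)
lam_xx_fd = (lam_p - 2 * lam0 + lam_m) / h**2
v_xx_fd = (np.eye(3) - np.outer(v0, v0)) @ ((v_p - 2 * v0 + v_m) / h**2)

print(lam_xx, lam_xx_fd)
print(v_xx, v_xx_fd)
```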