As taught by Prof. Bruce Hughes
Vanderbilt University - Fall Semester 2015
Transcribed by David K. Zhang
Warning!
The notes below were taken live in-class during Prof. Hughes' MATH 2500 lectures at Vanderbilt University. They are provided here exclusively as a historical record of what topics were covered during each day of class. For readers who aim to learn this material for the first time, they are incomplete in several crucial ways:
Gaps in the logical flow of ideas due to missed classes.
Omitted discussions, proofs, and examples which I considered obvious.
Missing sections which were written down on paper and never typeset.
Time permitting, I plan to release an expanded and revised version of these notes in the future to address these deficiencies.
Lecture 02 (2015-08-28)
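Definition: A vector space is a set $V$ with two operations:
(VA) Vector addition: $\forall \mathbf{x}, \mathbf{y} \in V$, there is a vector $\mathbf{x} + \mathbf{y} \in V$.
(SM) Scalar multiplication: $\forall c \in \mathbb{R}$, $\forall \mathbf{x} \in V$, there is a vector $c\mathbf{x} \in V$.
These operations must satisfy the following properties:
VS1. $\forall \mathbf{x}, \mathbf{y} \in V$, $\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$.
VS2. $\forall \mathbf{x}, \mathbf{y}, \mathbf{z} \in V$, $(\mathbf{x} + \mathbf{y}) + \mathbf{z} = \mathbf{x} + (\mathbf{y} + \mathbf{z})$.
VS3. $\exists \mathbf{0} \in V$ such that $\forall \mathbf{x} \in V$, $\mathbf{x} + \mathbf{0} = \mathbf{x}$. ($\mathbf{0}$ is called a zero vector.)
VS4. $\forall \mathbf{x} \in V$, $\exists \mathbf{y} \in V$ such that $\mathbf{x} + \mathbf{y} = \mathbf{0}$. ($\mathbf{y}$ is called an additive inverse of $\mathbf{x}$, and is denoted by $-\mathbf{x}$.)
VS5. $\forall a, b \in \mathbb{R}$, $\forall \mathbf{x} \in V$, $(ab)\mathbf{x} = a(b\mathbf{x})$.
VS6. $\forall a \in \mathbb{R}$, $\forall \mathbf{x}, \mathbf{y} \in V$, $a(\mathbf{x} + \mathbf{y}) = a\mathbf{x} + a\mathbf{y}$.
VS7. $\forall a, b \in \mathbb{R}$, $\forall \mathbf{x} \in V$, $(a + b)\mathbf{x} = a\mathbf{x} + b\mathbf{x}$.
VS8. $\forall \mathbf{x} \in V$, $1\mathbf{x} = \mathbf{x}$.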
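Theorem (Cancellation Property of Vector Addition): If $\mathbf{x}, \mathbf{y}, \mathbf{z} \in V$ and $\mathbf{x} + \mathbf{z} = \mathbf{y} + \mathbf{z}$, then $\mathbf{x} = \mathbf{y}$.
Proof: By hypothesis, $\mathbf{x} + \mathbf{z} = \mathbf{y} + \mathbf{z}$. Applying VS1, we have $\mathbf{z} + \mathbf{x} = \mathbf{z} + \mathbf{y}$. Letting $\mathbf{w}$ be the additive inverse of $\mathbf{z}$ (using VS4), we have $\mathbf{w} + (\mathbf{z} + \mathbf{x}) = \mathbf{w} + (\mathbf{z} + \mathbf{y})$. We use VS2 to regroup this as $(\mathbf{w} + \mathbf{z}) + \mathbf{x} = (\mathbf{w} + \mathbf{z}) + \mathbf{y}$, and VS4 (together with VS1) to conclude that $\mathbf{0} + \mathbf{x} = \mathbf{0} + \mathbf{y}$. Finally, applying VS3 and VS1, we have $\mathbf{x} = \mathbf{y}$, as desired. QED
Theorem: The additive inverse of a vector is unique. (In other words, if $\mathbf{x}, \mathbf{y}, \mathbf{z} \in V$ and $\mathbf{x} + \mathbf{y} = \mathbf{0}$ and $\mathbf{x} + \mathbf{z} = \mathbf{0}$, then $\mathbf{y} = \mathbf{z}$.)
Proof: It follows that $\mathbf{x} + \mathbf{y} = \mathbf{x} + \mathbf{z}$. The cancellation property (after applying VS1 to both sides) implies $\mathbf{y} = \mathbf{z}$. QED
Theorem: $-\mathbf{0} = \mathbf{0}$.
Proof: We need to show that $\mathbf{0} + \mathbf{0} = \mathbf{0}$. This follows from VS3. QED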
Lecture 03 (2015-08-31)
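We will temporarily leave the realm of abstract vector spaces and return to $\mathbb{R}^n$. Recall that any vector $\mathbf{x} \in \mathbb{R}^n$ is an $n$-tuple of real numbers. We adopt the convention that $\mathbf{x}$ be written $(x_1, \dots, x_n)$ when interpreted as a point, and $\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$ when interpreted as a vector. Points are simply visualized as points in $\mathbb{R}^n$, while vectors are visualized as directed line segments from the origin to the point $(x_1, \dots, x_n)$.
Definition: If $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, then the distance from $\mathbf{x}$ to $\mathbf{y}$ is
$$\sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2}.$$
This definition generalizes the Pythagorean theorem for $\mathbb{R}^2$.
Definition: The length (aka norm, magnitude) of a vector $\mathbf{x} \in \mathbb{R}^n$ is
$$\|\mathbf{x}\| = \sqrt{x_1^2 + \cdots + x_n^2},$$
i.e., the distance from $\mathbf{x}$ to the origin.
How do scalar multiplication and vector addition interact with this geometric interpretation? If $c \in \mathbb{R}$, $\mathbf{x} \in \mathbb{R}^n$, then
$$\|c\mathbf{x}\| = |c| \, \|\mathbf{x}\|.$$
Definition: If $c > 0$, then $\mathbf{x}$ and $c\mathbf{x}$ have the same direction. If $c < 0$, then they have opposite direction.
Definition: $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$ are parallel if one is a scalar multiple of the other, i.e., $\exists c \in \mathbb{R}$ such that $\mathbf{x} = c\mathbf{y}$ or $\mathbf{y} = c\mathbf{x}$.
Note: The zero vector is parallel to every vector.
If $\mathbf{x}, \mathbf{y} \in \mathbb{R}^2$, then $\mathbf{x} + \mathbf{y}$ can be interpreted as the fourth point of the parallelogram with legs $\mathbf{x}$ and $\mathbf{y}$ (situated at the origin). This can be confirmed by checking that the vector from $\mathbf{x}$ to $\mathbf{x} + \mathbf{y}$ has the same slope as $\mathbf{y}$, and vice versa. We call this interpretation the parallelogram law of vector addition.
Definition: If $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, then their dot product is
$$\mathbf{x} \cdot \mathbf{y} = x_1 y_1 + \cdots + x_n y_n.$$
This is an operation $\mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$.
Observe that $\mathbf{x} \cdot \mathbf{x} = \|\mathbf{x}\|^2$. This is the length formula.
Basic algebraic properties of the dot product (See Proposition 2.1 in Shifrin):
$\forall \mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, $\mathbf{x} \cdot \mathbf{y} = \mathbf{y} \cdot \mathbf{x}$.
$\forall \mathbf{x} \in \mathbb{R}^n$, $\mathbf{x} \cdot \mathbf{x} \ge 0$ and $\mathbf{x} \cdot \mathbf{x} = 0$ iff $\mathbf{x} = \mathbf{0}$.
$\forall c \in \mathbb{R}$ and $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, $(c\mathbf{x}) \cdot \mathbf{y} = c(\mathbf{x} \cdot \mathbf{y})$.
$\forall \mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathbb{R}^n$, $\mathbf{x} \cdot (\mathbf{y} + \mathbf{z}) = \mathbf{x} \cdot \mathbf{y} + \mathbf{x} \cdot \mathbf{z}$.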
Lecture 04 (2015-09-02)
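Proposition: $\forall \mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, $\|\mathbf{x} \pm \mathbf{y}\|^2 = \|\mathbf{x}\|^2 \pm 2\,\mathbf{x} \cdot \mathbf{y} + \|\mathbf{y}\|^2$.
Proof.
$$\|\mathbf{x} \pm \mathbf{y}\|^2 = (\mathbf{x} \pm \mathbf{y}) \cdot (\mathbf{x} \pm \mathbf{y}) = \mathbf{x} \cdot \mathbf{x} \pm 2\,\mathbf{x} \cdot \mathbf{y} + \mathbf{y} \cdot \mathbf{y} = \|\mathbf{x}\|^2 \pm 2\,\mathbf{x} \cdot \mathbf{y} + \|\mathbf{y}\|^2.$$
QED
The dot product can be used to define a notion of angle in $\mathbb{R}^n$. To motivate the definition to come, draw a triangle in $\mathbb{R}^2$ and interpret its sides as vectors $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{x} - \mathbf{y}$. The law of cosines then implies that
$$\|\mathbf{x} - \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2 - 2\|\mathbf{x}\| \|\mathbf{y}\| \cos\theta,$$
where $\theta$ is the angle between $\mathbf{x}$ and $\mathbf{y}$. We then use the above proposition to deduce that in $\mathbb{R}^2$,
$$\mathbf{x} \cdot \mathbf{y} = \|\mathbf{x}\| \|\mathbf{y}\| \cos\theta.$$
Definition. If $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, then $\mathbf{x}$ and $\mathbf{y}$ are orthogonal (or perpendicular) if $\mathbf{x} \cdot \mathbf{y} = 0$.
Definition. If $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$ and $\mathbf{x}, \mathbf{y} \ne \mathbf{0}$, then the cosine of the angle between $\mathbf{x}$ and $\mathbf{y}$ is $\dfrac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\| \|\mathbf{y}\|}$. The angle between $\mathbf{x}$ and $\mathbf{y}$ is the unique number $\theta \in [0, \pi]$ such that $\cos\theta = \dfrac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\| \|\mathbf{y}\|}$.
For this definition to make sense, we need to know that $|\mathbf{x} \cdot \mathbf{y}| \le \|\mathbf{x}\| \|\mathbf{y}\|$. This is established in the following theorem.
Theorem (Cauchy-Schwarz Inequality): If $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, then $|\mathbf{x} \cdot \mathbf{y}| \le \|\mathbf{x}\| \|\mathbf{y}\|$. Moreover, $|\mathbf{x} \cdot \mathbf{y}| = \|\mathbf{x}\| \|\mathbf{y}\|$ iff $\mathbf{x}$ and $\mathbf{y}$ are parallel.
Proof: Case 1. If $\mathbf{x}$ and $\mathbf{y}$ are unit vectors (i.e., $\|\mathbf{x}\| = \|\mathbf{y}\| = 1$), then the above proposition implies
$$0 \le \|\mathbf{x} + \mathbf{y}\|^2 = 2 + 2\,\mathbf{x} \cdot \mathbf{y}$$
and
$$0 \le \|\mathbf{x} - \mathbf{y}\|^2 = 2 - 2\,\mathbf{x} \cdot \mathbf{y}.$$
It follows that $-1 \le \mathbf{x} \cdot \mathbf{y} \le 1$, and thus $|\mathbf{x} \cdot \mathbf{y}| \le 1 = \|\mathbf{x}\| \|\mathbf{y}\|$, as desired.
Case 2. If $\mathbf{x}, \mathbf{y} \ne \mathbf{0}$, then $\mathbf{x} / \|\mathbf{x}\|$ and $\mathbf{y} / \|\mathbf{y}\|$ are unit vectors, and case 1 implies that
$$\left| \frac{\mathbf{x}}{\|\mathbf{x}\|} \cdot \frac{\mathbf{y}}{\|\mathbf{y}\|} \right| \le 1.$$
By the scalar-multiplication property of the dot product, $(c\mathbf{x}) \cdot \mathbf{y} = c(\mathbf{x} \cdot \mathbf{y})$ (together with symmetry), we have $|\mathbf{x} \cdot \mathbf{y}| \le \|\mathbf{x}\| \|\mathbf{y}\|$, as desired.
Case 3. If $\mathbf{x} = \mathbf{0}$ or $\mathbf{y} = \mathbf{0}$, then $|\mathbf{x} \cdot \mathbf{y}| = 0 = \|\mathbf{x}\| \|\mathbf{y}\|$, as desired. QED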
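The definitions of dot product, length, and angle can be checked numerically. The following is a minimal sketch in plain Python; the helper names `dot`, `norm`, and `angle` and the sample vectors are illustrative choices, not part of the lecture.

```python
import math

def dot(x, y):
    """Dot product x . y = x_1*y_1 + ... + x_n*y_n."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Length of x, computed from the length formula x . x = ||x||^2."""
    return math.sqrt(dot(x, x))

def angle(x, y):
    """Angle in [0, pi] between two nonzero vectors."""
    c = dot(x, y) / (norm(x) * norm(y))
    # Cauchy-Schwarz guarantees -1 <= c <= 1; clamp only to absorb rounding error.
    c = max(-1.0, min(1.0, c))
    return math.acos(c)

x = [1.0, 2.0, 2.0]   # sample vectors (hypothetical)
y = [2.0, -1.0, 0.0]

# Cauchy-Schwarz: |x . y| <= ||x|| ||y||
print(abs(dot(x, y)) <= norm(x) * norm(y))
# x . y = 0 here, so the vectors are orthogonal and the angle is pi/2.
print(angle(x, y))
```

The clamp inside `angle` is a floating-point precaution only; mathematically the quotient already lies in $[-1, 1]$ by the theorem above.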
Lecture 05 (2015-09-04)
Discussion of homework hints omitted.
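Theorem (Triangle Inequality): If $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, then $\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$.
Proof: Observe that
$$\|\mathbf{x} + \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + 2\,\mathbf{x} \cdot \mathbf{y} + \|\mathbf{y}\|^2 \le \|\mathbf{x}\|^2 + 2\|\mathbf{x}\| \|\mathbf{y}\| + \|\mathbf{y}\|^2 = (\|\mathbf{x}\| + \|\mathbf{y}\|)^2,$$
where the middle inequality follows from Cauchy-Schwarz. It follows that $\|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\|$, as desired. QED
(Proof of $\|\mathbf{x}_1 + \cdots + \mathbf{x}_k\| \le \|\mathbf{x}_1\| + \cdots + \|\mathbf{x}_k\|$ omitted. Trivial by induction.)
Definition: A subset $V$ of $\mathbb{R}^n$ is a linear subspace if:
LS1. The zero vector $\mathbf{0}$ is a member of $V$.
LS2. For all $\mathbf{x}, \mathbf{y} \in V$, it follows that $\mathbf{x} + \mathbf{y} \in V$. (We say that $V$ is “closed under vector addition.”)
LS3. For all $c \in \mathbb{R}$ and $\mathbf{x} \in V$, it follows that $c\mathbf{x} \in V$. (We say that $V$ is “closed under scalar multiplication.”)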
Example: Let be the line in . is a linear subspace of . (Straightforward proofs of LS1-LS3 omitted.)
Example: The line is not a linear subspace of , as LS1 fails. We call such spaces affine subspaces.
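Definition: If $\mathbf{x}_0, \mathbf{v} \in \mathbb{R}^n$ with $\mathbf{v} \ne \mathbf{0}$, then the affine line in $\mathbb{R}^n$ in the direction $\mathbf{v}$ passing through $\mathbf{x}_0$ is the set
$$\ell = \{ \mathbf{x}_0 + t\mathbf{v} : t \in \mathbb{R} \}.$$
In other words, $\mathbf{x} \in \ell$ iff $\exists t \in \mathbb{R}$ such that $\mathbf{x} = \mathbf{x}_0 + t\mathbf{v}$. In coordinate form, this is the system of equations
$$x_i = (x_0)_i + t v_i, \qquad i = 1, \dots, n.$$
This is the set of parametric equations defining $\ell$. (In this form, $\ell$ is said to be “given explicitly.”) In addition, we call
$$\{ t\mathbf{v} : t \in \mathbb{R} \}$$
the linear line in the direction $\mathbf{v}$.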
Lecture 06 (2015-09-07)
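Definition: If $\mathbf{x}_0, \mathbf{a} \in \mathbb{R}^n$ with $\mathbf{a} \ne \mathbf{0}$, then the hyperplane in $\mathbb{R}^n$ with normal vector $\mathbf{a}$ passing through $\mathbf{x}_0$ is
$$H = \{ \mathbf{x} \in \mathbb{R}^n : \mathbf{a} \cdot (\mathbf{x} - \mathbf{x}_0) = 0 \}.$$
In other words, $H$ is the solution set of
$$\mathbf{a} \cdot (\mathbf{x} - \mathbf{x}_0) = 0.$$
This is an example of an affine subspace of $\mathbb{R}^n$, and we will eventually see that $\dim H = n - 1$. We may observe that the equation can be rewritten as
$$\mathbf{a} \cdot \mathbf{x} = \mathbf{a} \cdot \mathbf{x}_0,$$
and since $\mathbf{a}$ and $\mathbf{x}_0$ are given, the RHS of the above equation is simply a constant. Hence, every hyperplane in $\mathbb{R}^n$ is the solution set of an equation of the form $\mathbf{a} \cdot \mathbf{x} = c$. The converse also holds; given an equation of the form $\mathbf{a} \cdot \mathbf{x} = c$ with $\mathbf{a} \ne \mathbf{0}$, we can describe its solution as a hyperplane in $\mathbb{R}^n$ by setting
$$\mathbf{x}_0 = (c / a_1, 0, \dots, 0)$$
(or using some other component of $\mathbf{a}$ if $a_1 = 0$).
Claim: $H$ is a linear subspace of $\mathbb{R}^n$ iff $c = 0$.
Proof: If $H$ is a linear subspace, then $\mathbf{0} \in H$, and $c = \mathbf{a} \cdot \mathbf{0} = 0$. Conversely, if $c = 0$, then it is straightforward to check that LS1-LS3 hold. QED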
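Definition: Let $\mathbf{v}_1, \dots, \mathbf{v}_k \in \mathbb{R}^n$.
If $c_1, \dots, c_k \in \mathbb{R}$, then
$$c_1 \mathbf{v}_1 + \cdots + c_k \mathbf{v}_k$$
is called a linear combination of $\mathbf{v}_1, \dots, \mathbf{v}_k$.
The span of $\mathbf{v}_1, \dots, \mathbf{v}_k$ is the set $\operatorname{Span}(\mathbf{v}_1, \dots, \mathbf{v}_k)$ of all linear combinations of $\mathbf{v}_1, \dots, \mathbf{v}_k$.
Examples:
If $\mathbf{v} \in \mathbb{R}^n$ with $\mathbf{v} \ne \mathbf{0}$, then $\operatorname{Span}(\mathbf{v})$ is the linear line in the direction $\mathbf{v}$.
If $\mathbf{u}, \mathbf{v} \in \mathbb{R}^n$ are nonparallel, then $\operatorname{Span}(\mathbf{u}, \mathbf{v})$ is called the linear plane in $\mathbb{R}^n$ spanned by $\mathbf{u}$ and $\mathbf{v}$.
Claim: $\operatorname{Span}(\mathbf{v}_1, \dots, \mathbf{v}_k)$ is a linear subspace of $\mathbb{R}^n$.
(Proof omitted. It is straightforward to check that LS1-LS3 hold.)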
Lecture 07 (2015-09-09)
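Definition: If $V, W$ are linear subspaces of $\mathbb{R}^n$, then $V$ and $W$ are orthogonal if $\forall \mathbf{v} \in V$, $\forall \mathbf{w} \in W$, $\mathbf{v} \cdot \mathbf{w} = 0$.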
Exercise: Are the following pairs of subspaces orthogonal?
The $x$- and $y$-axes in $\mathbb{R}^3$. (Yes.)
The $xy$- and $yz$-planes in $\mathbb{R}^3$. (No.)
The $z$-axis and $xy$-plane in $\mathbb{R}^3$. (Yes.)
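Definition: If $V$ is a linear subspace of $\mathbb{R}^n$, then the orthogonal complement of $V$ is
$$V^\perp = \{ \mathbf{x} \in \mathbb{R}^n : \mathbf{x} \cdot \mathbf{v} = 0 \ \ \forall \mathbf{v} \in V \}.$$
Problem 16 (HW3): $V^\perp$ is a linear subspace. (Clearly, $V$ and $V^\perp$ are orthogonal.)
We will eventually show that $(V^\perp)^\perp = V$ (i.e., that the map $V \mapsto V^\perp$ is an involution), but this fact actually requires some work.
Definition: A linear transformation is a function $T: \mathbb{R}^n \to \mathbb{R}^m$ satisfying
LT1. $\forall \mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, $T(\mathbf{x} + \mathbf{y}) = T(\mathbf{x}) + T(\mathbf{y})$,
LT2. $\forall c \in \mathbb{R}$, $\forall \mathbf{x} \in \mathbb{R}^n$, $T(c\mathbf{x}) = cT(\mathbf{x})$.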
Examples:
Define by . Then is a linear transformation. {(Straightforward verification of LT1-LT2 omitted.)}
Define by . Again, is a linear transformation. {(Straightforward verification of LT1-LT2 omitted.)}
Define by . Then is not a linear transformation. Indeed, LT2 fails, as . This motivates the following proposition:
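Proposition: If $T: \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation, then $T(\mathbf{0}) = \mathbf{0}$.
Proof: Observe that
$$T(\mathbf{0}) = T(\mathbf{0} + \mathbf{0}) = T(\mathbf{0}) + T(\mathbf{0}).$$
It then follows from the cancellation property that $T(\mathbf{0}) = \mathbf{0}$. QED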
Define by . Then is not a linear transformation. Indeed, LT2 can be shown to fail by taking and . This motivates the following proposition:
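Proposition: A function $T: \mathbb{R} \to \mathbb{R}$ is a linear transformation iff $\exists m \in \mathbb{R}$ such that $T(x) = mx$ for all $x \in \mathbb{R}$. (To be proven next lecture.)
Lecture 08 (2015-09-11)
Proof (of last lecture's proposition): The reverse direction is trivial (simply check LT1 and LT2). For the forward direction, let $m = T(1)$, and let $x \in \mathbb{R}$ be given. Then
$$T(x) = T(x \cdot 1) = xT(1) = mx.$$
This is the desired result. QED
Goal: A function $T: \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation iff there exists an $m \times n$ matrix $A$ such that $T(\mathbf{x}) = A\mathbf{x}$ for all $\mathbf{x} \in \mathbb{R}^n$.
Definition: Suppose $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix. (We say that $A \in \mathcal{M}_{m \times n}$ and $B \in \mathcal{M}_{n \times p}$.) The product of $A$ and $B$ is a matrix $AB \in \mathcal{M}_{m \times p}$ having $ij$-entry
$$(AB)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}.$$
If we let $\mathbf{b}_1, \dots, \mathbf{b}_p$ be the columns of $B$, and $\mathbf{A}_1, \dots, \mathbf{A}_m$ be the rows of $A$, then $(AB)_{ij} = \mathbf{A}_i \cdot \mathbf{b}_j$.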
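The row-times-column description of the matrix product translates directly into code. The sketch below is an illustration in plain Python with matrices stored as lists of rows; the function name `matmul` and the sample matrices are assumptions made for this example, not taken from the lecture.

```python
def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B (lists of rows).

    The ij-entry is the dot product of row i of A with column j of B.
    """
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    columns_of_B = list(zip(*B))  # transpose B to read off its columns
    return [[sum(a * b for a, b in zip(row, col)) for col in columns_of_B]
            for row in A]

A = [[1, 2], [3, 4]]   # sample matrices (hypothetical)
B = [[0, 1], [1, 0]]
print(matmul(A, B))    # multiplying by B on the right swaps the columns of A
print(matmul(B, A))    # multiplying by B on the left swaps the rows of A
```

The two products differ, which already shows that the order of the factors matters.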
Example: Take and . Then , and
Lecture 09 (2015-09-14)
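Recall our goal: A function $T: \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation iff there exists a matrix $A \in \mathcal{M}_{m \times n}$ such that $T(\mathbf{x}) = A\mathbf{x}$ for all $\mathbf{x} \in \mathbb{R}^n$. (We have already proved the reverse direction.)
Proposition: If $T: \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation, $\mathbf{v}_1, \dots, \mathbf{v}_k \in \mathbb{R}^n$, and $c_1, \dots, c_k \in \mathbb{R}$, then
$$T(c_1 \mathbf{v}_1 + \cdots + c_k \mathbf{v}_k) = c_1 T(\mathbf{v}_1) + \cdots + c_k T(\mathbf{v}_k).$$
Proof:
$$T(c_1 \mathbf{v}_1 + \cdots + c_k \mathbf{v}_k) = T(c_1 \mathbf{v}_1) + \cdots + T(c_k \mathbf{v}_k) = c_1 T(\mathbf{v}_1) + \cdots + c_k T(\mathbf{v}_k).$$
This is the desired result. QED
Matrix representation of a linear transformation. If $\mathbf{x} \in \mathbb{R}^n$, then
$$\mathbf{x} = x_1 \mathbf{e}_1 + \cdots + x_n \mathbf{e}_n.$$
It follows that
$$T(\mathbf{x}) = x_1 T(\mathbf{e}_1) + \cdots + x_n T(\mathbf{e}_n),$$
and we see that the action of $T$ on any vector in $\mathbb{R}^n$ is uniquely specified by its action on the standard basis vectors $\mathbf{e}_1, \dots, \mathbf{e}_n$. This motivates us to give these vectors a name; let
$$\mathbf{a}_j = T(\mathbf{e}_j)$$
for all $j = 1, \dots, n$.
Definition:
$$\begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix}$$
is the standard matrix representation of $T$, denoted by $[T]$.
Remark: Stated alternatively, $[T]$ is the $m \times n$ matrix whose $j$th column is given by $T(\mathbf{e}_j)$.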
Examples:
Let be defined by . Then
Let be defined by . Then
Let be defined by . Then
Let be defined by . Then
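Theorem: If $T: \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation, then $T(\mathbf{x}) = [T]\mathbf{x}$ for all $\mathbf{x} \in \mathbb{R}^n$.
Proof: Let $\mathbf{x} \in \mathbb{R}^n$ be given. Write $\mathbf{x} = x_1 \mathbf{e}_1 + \cdots + x_n \mathbf{e}_n$. Then
$$T(\mathbf{x}) = T(x_1 \mathbf{e}_1 + \cdots + x_n \mathbf{e}_n) = x_1 T(\mathbf{e}_1) + \cdots + x_n T(\mathbf{e}_n) = [T]\mathbf{x},$$
where in the third equality we have used the column POV. QED
Matrix algebra. Recall that $\mathcal{M}_{m \times n}$ denotes the set of all $m \times n$ matrices. This is a vector space! “Vector addition” and “scalar multiplication” are performed entrywise, and VS1-VS8 are straightforward to show. Furthermore, matrix multiplication “mixes” these spaces, in the sense that it gives a map $\mathcal{M}_{m \times n} \times \mathcal{M}_{n \times p} \to \mathcal{M}_{m \times p}$.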
Lecture 10 (2015-09-16)
Three basic properties of matrix multiplication:
$A(B + C) = AB + AC$.
$(A + B)C = AC + BC$.
$(cA)B = c(AB) = A(cB)$.
However, observe that
while
Thus, matrix multiplication does not satisfy the cancellation property. Furthermore,
so matrix multiplication is not necessarily commutative, nor does it satisfy the zero product property.
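Back to linear transformations. Recall that every matrix $A \in \mathcal{M}_{m \times n}$ induces a linear transformation $T_A: \mathbb{R}^n \to \mathbb{R}^m$ given by $T_A(\mathbf{x}) = A\mathbf{x}$. Conversely, every linear transformation $T: \mathbb{R}^n \to \mathbb{R}^m$ has a matrix representation $[T]$. It was proven in a previous lecture that $T(\mathbf{x}) = [T]\mathbf{x}$ for all $\mathbf{x} \in \mathbb{R}^n$.
It follows that $[T_A] = A$, since the $j$th column of $[T_A]$ is $T_A(\mathbf{e}_j) = A\mathbf{e}_j$, the $j$th column of $A$, and furthermore that $T_{[T]} = T$, since $T_{[T]}(\mathbf{x}) = [T]\mathbf{x} = T(\mathbf{x})$.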
Example: Suppose is a linear transformation satisfying
Given , what is ? (Solution omitted.)
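Rotations. Let $\theta \in \mathbb{R}$, and let $R_\theta: \mathbb{R}^2 \to \mathbb{R}^2$ be the counterclockwise rotation about the origin through the angle $\theta$. It is easy to see that, by geometric principles, $R_\theta$ is a linear transformation. By considering the action of $R_\theta$ on $\mathbf{e}_1$ and $\mathbf{e}_2$, we see that
$$[R_\theta] = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.$$
We often denote this matrix by $A_\theta$.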
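As a numerical illustration, the sketch below builds the standard matrix of a counterclockwise rotation, whose columns are the images $(\cos\theta, \sin\theta)$ and $(-\sin\theta, \cos\theta)$ of the standard basis vectors, and checks that composing two rotations adds their angles. The helper names (`rotation`, `matmul`, `apply`, `close`) and the sample angles are assumptions made for this example.

```python
import math

def rotation(theta):
    """Standard matrix of the counterclockwise rotation through theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices up to a rounding tolerance."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

alpha, beta = 0.7, 1.9  # sample angles (hypothetical)

# Rotating by beta and then by alpha is the same as rotating by alpha + beta.
print(close(matmul(rotation(alpha), rotation(beta)), rotation(alpha + beta)))

# A quarter turn sends e_1 to (approximately) e_2.
print(apply(rotation(math.pi / 2), [1.0, 0.0]))
```

Because the entries are floating-point numbers, equalities are checked only up to a small tolerance.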
(Some notes seem to have gone missing here. There should be some information on interpreting matrix multiplication as composition of linear maps. In particular, this point of view makes proving associativity of matrix multiplication trivial, since functional composition is evidently associative.)
Lecture 11 (2015-09-18)
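CLT Theorem: If $T: \mathbb{R}^n \to \mathbb{R}^m$ and $S: \mathbb{R}^m \to \mathbb{R}^p$ are linear transformations, then $[S \circ T] = [S][T]$.
Proof: Column $j$ of $[S \circ T]$ is $(S \circ T)(\mathbf{e}_j) = S(T(\mathbf{e}_j))$, while column $j$ of $[S][T]$ is $[S]$ times column $j$ of $[T]$, which is $[S]\,T(\mathbf{e}_j) = S(T(\mathbf{e}_j))$. QED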
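Example: Let $T, S: \mathbb{R}^2 \to \mathbb{R}^2$, where $T$ is the rotation through $\pi/2$, and $S$ is reflection across the line $y = x$. We have
$$[T] = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, \qquad [S] = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}.$$
Then
$$[S \circ T] = [S][T] = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \qquad [T \circ S] = [T][S] = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}.$$
Intuitively, $S \circ T$ is reflection across the $x$-axis, while $T \circ S$ is reflection across the $y$-axis.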
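Theorem: Matrix multiplication is associative. That is, if $A \in \mathcal{M}_{m \times n}$, $B \in \mathcal{M}_{n \times p}$, and $C \in \mathcal{M}_{p \times q}$, then
$$(AB)C = A(BC).$$
Proof: Since composition of functions is associative, $(T_A \circ T_B) \circ T_C = T_A \circ (T_B \circ T_C)$. Thus $[(T_A \circ T_B) \circ T_C] = [T_A \circ (T_B \circ T_C)]$. But by the CLT theorem, this means $([T_A][T_B])[T_C] = [T_A]([T_B][T_C])$, or equivalently, that $(AB)C = A(BC)$. QED
Matrix invertibility. For the moment we will restrict attention to square matrices, writing $\mathcal{M}_n = \mathcal{M}_{n \times n}$. A very special member of $\mathcal{M}_n$ is the identity matrix
$$I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix},$$
whose columns are the standard basis vectors. Note that $I_n$ is a multiplicative identity. If $A \in \mathcal{M}_{m \times n}$ and $B \in \mathcal{M}_{n \times p}$, then $A I_n = A$ and $I_n B = B$.
Definition: $A \in \mathcal{M}_n$ is invertible if there exists $B \in \mathcal{M}_n$ such that
$$AB = BA = I_n.$$
If this holds, we write $B = A^{-1}$ and call $B$ the “inverse of $A$.” (It turns out that this is unique if it exists.)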
Examples:
has no inverse, since
is its own inverse (), since .
The matrix has inverse .
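Theorem: If $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in \mathcal{M}_2$, then $A$ is invertible iff $ad - bc \ne 0$. If this holds, then
$$A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.$$
Definition: The determinant of $A$ is
$$\det A = ad - bc.$$
Proof: For the reverse direction, it suffices to verify that
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \cdot \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = I_2 = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \cdot \begin{bmatrix} a & b \\ c & d \end{bmatrix}.$$
For the forward direction, observe that
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = (ad - bc) I_2.$$
If $ad - bc$ were $0$, multiplying both sides on the left by $A^{-1}$ would force $\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = 0$, hence $A = 0$, which is not invertible. It follows that $ad - bc \ne 0$. QED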
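The $2 \times 2$ inverse formula (swap the diagonal entries, negate the off-diagonal entries, divide by $ad - bc$) is easy to test by machine. The following is a minimal sketch using exact rational arithmetic; the function names `det2` and `inverse2` and the sample matrix are hypothetical choices for this illustration.

```python
from fractions import Fraction

def det2(A):
    """Determinant ad - bc of a 2 x 2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = A
    return a * d - b * c

def inverse2(A):
    """Inverse of a 2 x 2 matrix via the ad - bc formula, in exact arithmetic."""
    (a, b), (c, d) = A
    det = Fraction(det2(A))
    if det == 0:
        raise ValueError("ad - bc = 0, so the matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [5, 3]]     # sample matrix with ad - bc = 1 (hypothetical)
print(inverse2(A))

try:
    inverse2([[1, 2], [2, 4]])   # ad - bc = 0
except ValueError as err:
    print(err)
```

Using `Fraction` avoids rounding error, so the product of a matrix with its computed inverse is exactly the identity.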
Lecture 12 (2015-09-21)
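Shoe-Sock Theorem: If $A, B \in \mathcal{M}_n$ are invertible, then $AB$ is invertible and $(AB)^{-1} = B^{-1}A^{-1}$.
Proof: Check $(AB)(B^{-1}A^{-1}) = I_n$ and $(B^{-1}A^{-1})(AB) = I_n$.
$$(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = A I_n A^{-1} = AA^{-1} = I_n.$$
The other identity follows similarly. QED
Definition: If $A \in \mathcal{M}_{m \times n}$, then the transpose of $A$ is $A^T \in \mathcal{M}_{n \times m}$, defined in the following way: if $A = [a_{ij}]$, then the $ij$-entry of $A^T$ is $a_{ji}$.
Dot products as matrix products. If $\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{y} \in \mathbb{R}^n$, then $\mathbf{x}^T \in \mathcal{M}_{1 \times n}$, and $\mathbf{x} \cdot \mathbf{y} = \mathbf{x}^T \mathbf{y}$.
Straightforward properties of transposition.
$(A^T)^T = A$.
$(A + B)^T = A^T + B^T$.
$(cA)^T = cA^T$.
These properties imply that the transposition operator is a linear transformation.
Shoe-Sock-y Theorem: If $A \in \mathcal{M}_{m \times n}$ and $B \in \mathcal{M}_{n \times p}$, then $(AB)^T = B^T A^T$.
Proof: The $ij$-entry of $(AB)^T$ is the $ji$-entry of $AB$, namely $\sum_k a_{jk} b_{ki}$, while the $ij$-entry of $B^T A^T$ is $\sum_k (B^T)_{ik} (A^T)_{kj} = \sum_k b_{ki} a_{jk}$. QED
Theorem: If $A \in \mathcal{M}_{m \times n}$, $\mathbf{x} \in \mathbb{R}^n$, and $\mathbf{y} \in \mathbb{R}^m$, then $A\mathbf{x} \cdot \mathbf{y} = \mathbf{x} \cdot A^T \mathbf{y}$.
Proof: $A\mathbf{x} \cdot \mathbf{y} = (A\mathbf{x})^T \mathbf{y} = \mathbf{x}^T A^T \mathbf{y} = \mathbf{x} \cdot A^T \mathbf{y}$. QED
Theorem: If $A \in \mathcal{M}_n$ is invertible, then $A^T$ is invertible, and
$$(A^T)^{-1} = (A^{-1})^T.$$
Proof: Check that $A^T (A^{-1})^T = I_n$ and $(A^{-1})^T A^T = I_n$.
$$A^T (A^{-1})^T = (A^{-1} A)^T = I_n^T = I_n.$$
The other identity follows similarly. QED
Point-set Topology of $\mathbb{R}^n$. Balls and open sets.
Definition: Let $\mathbf{a} \in \mathbb{R}^n$ and let $r > 0$. The ball of radius $r$ about $\mathbf{a}$ is the set
$$B(\mathbf{a}, r) = \{ \mathbf{x} \in \mathbb{R}^n : \|\mathbf{x} - \mathbf{a}\| < r \}.$$
Definition: A subset $U$ of $\mathbb{R}^n$ is open if
$$\forall \mathbf{a} \in U, \ \exists r > 0 \text{ such that } B(\mathbf{a}, r) \subset U.$$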
Lecture 13 (2015-09-23)
Exam preparation Q/A session.
Lecture 14 (2015-09-25)
Example: Show that is open.
Proof: Let be given. We can write with . Choose . We would like to show that . To see this, let be given. Then
QED
Example: Show that is not open.
Proof: Pick and show that every open ball around is not fully contained in . (Details omitted.)
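Definition: A sequence in a set $S$ is a collection of elements of $S$ indexed by the natural numbers $\mathbb{N}$. Alternatively, it is a function $\mathbb{N} \to S$.
Remark: Note that while a sequence has, by definition, an infinite number of terms, it may only contain a finite number of distinct entries.
Definition: If $\mathbf{a} \in \mathbb{R}^n$ and $(\mathbf{x}_k)$ is a sequence in $\mathbb{R}^n$, then we say $(\mathbf{x}_k)$ converges to $\mathbf{a}$ iff $\forall \varepsilon > 0$, $\exists K \in \mathbb{N}$ such that if $k \ge K$, then $\|\mathbf{x}_k - \mathbf{a}\| < \varepsilon$.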
Example: The sequence does not converge in . (Proof omitted. Pick .)
Lecture 15 (2015-09-28)
Example: Let for . Does converge, and if so, to what point?
Solution: Intuitively, we would guess that . To prove this, we let be given and figure out the appropriate . Since we need
a good choice would be to take .
Proof: Let be given. Choose . If , then , and it follows that
QED
Example: Let for . Does converge, and if so, to what point?
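Lemma: $(\mathbf{x}_k)$ does not converge to $\mathbf{a}$ if $\exists \varepsilon > 0$ such that $\forall K \in \mathbb{N}$, $\exists k \ge K$ such that $\|\mathbf{x}_k - \mathbf{a}\| \ge \varepsilon$. (In other words, there exist infinitely many terms of the sequence outside $B(\mathbf{a}, \varepsilon)$.)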
(Scratch work omitted.)
Proof: Let be given, and choose . We would like to show that infinitely many points of the sequence lie outside . For all choose . Then
QED
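Theorem: If $(\mathbf{x}_k)$ converges to $\mathbf{a}$ and $(\mathbf{y}_k)$ converges to $\mathbf{b}$, then $(\mathbf{x}_k + \mathbf{y}_k)$ converges to $\mathbf{a} + \mathbf{b}$.
Proof: Let $\varepsilon > 0$ be given. Since $(\mathbf{x}_k)$ converges to $\mathbf{a}$, there exists $K_1 \in \mathbb{N}$ such that $\|\mathbf{x}_k - \mathbf{a}\| < \varepsilon/2$ whenever $k \ge K_1$. Similarly, since $(\mathbf{y}_k)$ converges to $\mathbf{b}$, there exists $K_2 \in \mathbb{N}$ such that $\|\mathbf{y}_k - \mathbf{b}\| < \varepsilon/2$ whenever $k \ge K_2$. Choose $K = \max(K_1, K_2)$. If $k \ge K$, then
$$\|(\mathbf{x}_k + \mathbf{y}_k) - (\mathbf{a} + \mathbf{b})\| \le \|\mathbf{x}_k - \mathbf{a}\| + \|\mathbf{y}_k - \mathbf{b}\| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
QED
Definition: A subset $C$ of $\mathbb{R}^n$ is closed if $\mathbb{R}^n \setminus C$ is open.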
Examples:
Recall that is open. It follows that is closed.
Let and .
Claim: is closed.
Proof: Let and show that is open. Let be given. Choose . Since , . Note that . QED
Lecture 18 (2015-10-05)
Recall our discussion of the following three topics:
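Limits of sequences. Given a sequence $(\mathbf{x}_k)$ in $\mathbb{R}^n$ and a point $\mathbf{a} \in \mathbb{R}^n$, we say that
$$\lim_{k \to \infty} \mathbf{x}_k = \mathbf{a}$$
if $\forall \varepsilon > 0$, $\exists K \in \mathbb{N}$ such that if $k \ge K$, then $\|\mathbf{x}_k - \mathbf{a}\| < \varepsilon$.
Continuity of functions. We say that $f: X \to \mathbb{R}^m$ (with $X \subset \mathbb{R}^n$) is continuous at $\mathbf{a} \in X$ if $\forall \varepsilon > 0$, $\exists \delta > 0$ such that if $\mathbf{x} \in X$ and $\|\mathbf{x} - \mathbf{a}\| < \delta$, then $\|f(\mathbf{x}) - f(\mathbf{a})\| < \varepsilon$.
Theorem: A function $f: X \to \mathbb{R}^m$ is continuous at $\mathbf{a} \in X$ iff for all sequences $(\mathbf{x}_k)$ in $X$ converging to $\mathbf{a}$, the sequence $(f(\mathbf{x}_k))$ converges to $f(\mathbf{a})$.
Proof: See online notes.
Limits of functions. Given a subset $X$ of $\mathbb{R}^n$, a function $f: X \to \mathbb{R}^m$, and two points $\mathbf{a} \in \mathbb{R}^n$, $\mathbf{b} \in \mathbb{R}^m$, we say that
$$\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x}) = \mathbf{b}$$
if $\forall \varepsilon > 0$, $\exists \delta > 0$ such that if $\mathbf{x} \in X$ and $0 < \|\mathbf{x} - \mathbf{a}\| < \delta$, then $\|f(\mathbf{x}) - \mathbf{b}\| < \varepsilon$.
Remark: If $\exists \delta > 0$ such that $\nexists \mathbf{x} \in X$ such that $0 < \|\mathbf{x} - \mathbf{a}\| < \delta$, then the condition holds vacuously, so
$$\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x}) = \mathbf{b}$$
for all $\mathbf{b} \in \mathbb{R}^m$.
Definition: We call $\mathbf{a} \in \mathbb{R}^n$ a limit point of $X$ if $\forall \delta > 0$, $\exists \mathbf{x} \in X$ such that $0 < \|\mathbf{x} - \mathbf{a}\| < \delta$. Points in $X$ which are not limit points of $X$ are called isolated points of $X$.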
Example: Let . Then and are limit points of , but is not a limit point of .
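Theorem: If $\mathbf{a} \in X$ and $\mathbf{a}$ is a limit point of $X$, then a function $f: X \to \mathbb{R}^m$ is continuous at $\mathbf{a}$ iff $\lim_{\mathbf{x} \to \mathbf{a}} f(\mathbf{x}) = f(\mathbf{a})$.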
Example: Let . Then has no limit points, and all of its points are isolated.
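Definition: Let $U$ be an open subset of $\mathbb{R}^n$ and $\mathbf{a} \in U$. We say that a function $f: U \to \mathbb{R}^m$ is differentiable at $\mathbf{a}$ if there exists a linear transformation $Df(\mathbf{a}): \mathbb{R}^n \to \mathbb{R}^m$ such that
$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - Df(\mathbf{a})(\mathbf{h})}{\|\mathbf{h}\|} = \mathbf{0}.$$
In this case, $Df(\mathbf{a})$ is called the derivative of $f$ at $\mathbf{a}$.
Remark: This definition gives no clue how to find the derivative of a given function, or even if it exists.
Remark: Note that $\mathbf{0}$ needs to be a limit point of the domain of
$$\mathbf{h} \mapsto \frac{f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - Df(\mathbf{a})(\mathbf{h})}{\|\mathbf{h}\|}$$
in order for this definition to make sense. This is why we take $U$ to be an open set.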
Lecture 19 (2015-10-07)
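Remark: The idea behind this definition of the derivative is that $Df(\mathbf{a})$ should be a good local linear approximation to $f$. In fact, it should be such a good linear approximation that the error
$$f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - Df(\mathbf{a})(\mathbf{h})$$
should fall off faster than $\|\mathbf{h}\|$.
Theorem: If $T: \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation and $\mathbf{a} \in \mathbb{R}^n$, then $T$ is differentiable at $\mathbf{a}$, and $DT(\mathbf{a}) = T$.
Proof:
$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{T(\mathbf{a} + \mathbf{h}) - T(\mathbf{a}) - T(\mathbf{h})}{\|\mathbf{h}\|} = \lim_{\mathbf{h} \to \mathbf{0}} \frac{\mathbf{0}}{\|\mathbf{h}\|} = \mathbf{0}.$$
QED
Relationship to the 1D case. Recall that we say $f: \mathbb{R} \to \mathbb{R}$ is differentiable at $a \in \mathbb{R}$ when
$$\lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$$
exists. When it does, we call it $f'(a)$, the “derivative of $f$ at $a$.” Two ideas underlie this definition: first, the geometric, which motivates this definition as the slope of the tangent line to the graph of $f$ at $(a, f(a))$, and second, the physical, which sees it as the rate of change of $f$ at $a$. To see the correspondence to the multivariate definition, note that
$$f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$$
exists iff
$$\lim_{h \to 0} \frac{f(a + h) - f(a) - f'(a) h}{|h|} = 0.$$
Here, the linear transformation $Df(a)$ is simply multiplication by the scalar $f'(a)$.
Definition: Let $f: U \to \mathbb{R}^m$ and $\mathbf{a} \in U$. If $1 \le j \le n$, then the $j$th partial derivative of $f$ at $\mathbf{a}$ is
$$\frac{\partial f}{\partial x_j}(\mathbf{a}) = \lim_{t \to 0} \frac{f(\mathbf{a} + t\mathbf{e}_j) - f(\mathbf{a})}{t},$$
provided this limit exists. (Here, $\mathbf{e}_j$ is the $j$th standard basis vector.)
Remark: The $j$th partial derivative of $f$ at $\mathbf{a}$ measures the rate of change of $f$ with respect to change in $\mathbf{x}$ in the $\mathbf{e}_j$ direction.
Goal theorem: If $f: U \to \mathbb{R}^m$ and $f$ is differentiable at $\mathbf{a}$, then
$$[Df(\mathbf{a})] = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1}(\mathbf{a}) & \cdots & \dfrac{\partial f_1}{\partial x_n}(\mathbf{a}) \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial x_1}(\mathbf{a}) & \cdots & \dfrac{\partial f_m}{\partial x_n}(\mathbf{a}) \end{bmatrix}.$$
Remark: This matrix is called the Jacobian matrix of $f$ at $\mathbf{a}$, denoted by $Jf(\mathbf{a})$.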
Lecture 20 (2015-10-09)
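Definition: Let $f: U \to \mathbb{R}^m$ and $\mathbf{a} \in U$. Given $\mathbf{v} \in \mathbb{R}^n$, the directional derivative of $f$ at $\mathbf{a}$ in the direction $\mathbf{v}$ is
$$D_{\mathbf{v}} f(\mathbf{a}) = \lim_{t \to 0} \frac{f(\mathbf{a} + t\mathbf{v}) - f(\mathbf{a})}{t},$$
provided this limit exists.
Remark: We interpret $D_{\mathbf{v}} f(\mathbf{a})$ as the instantaneous rate of change of $f$ with respect to a change in $\mathbf{x}$ in the direction $\mathbf{v}$.
Special cases:
If $\mathbf{v} = \mathbf{e}_j$, then $D_{\mathbf{e}_j} f(\mathbf{a}) = \dfrac{\partial f}{\partial x_j}(\mathbf{a})$.
If $\mathbf{v} = \mathbf{0}$, then $D_{\mathbf{0}} f(\mathbf{a}) = \mathbf{0}$.
Theorem: If $f$ is differentiable at $\mathbf{a}$, then $\forall \mathbf{v} \in \mathbb{R}^n$, $D_{\mathbf{v}} f(\mathbf{a})$ exists, and
$$D_{\mathbf{v}} f(\mathbf{a}) = Df(\mathbf{a})(\mathbf{v}).$$
Corollary: $\forall j$ with $1 \le j \le n$, $\dfrac{\partial f}{\partial x_j}(\mathbf{a}) = Df(\mathbf{a})(\mathbf{e}_j)$.
(Example omitted.)
Proof: If $\mathbf{v} = \mathbf{0}$, the result holds trivially. For $\mathbf{v} \ne \mathbf{0}$, since $f$ is differentiable at $\mathbf{a}$, we have
$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - Df(\mathbf{a})(\mathbf{h})}{\|\mathbf{h}\|} = \mathbf{0}.$$
Let $\mathbf{h} = t\mathbf{v}$. Since $\mathbf{h} \to \mathbf{0}$ as $t \to 0$, we have
$$\lim_{t \to 0} \frac{f(\mathbf{a} + t\mathbf{v}) - f(\mathbf{a}) - t\,Df(\mathbf{a})(\mathbf{v})}{|t| \, \|\mathbf{v}\|} = \mathbf{0}.$$
Recall that $|t| = \pm t$. We consider separately the right-hand and left-hand limits. For $t \to 0^+$ we have $|t| = t$, and multiplying by the constant $\|\mathbf{v}\|$ gives
$$\lim_{t \to 0^+} \left( \frac{f(\mathbf{a} + t\mathbf{v}) - f(\mathbf{a})}{t} - Df(\mathbf{a})(\mathbf{v}) \right) = \mathbf{0}.$$
Hence,
$$\lim_{t \to 0^+} \frac{f(\mathbf{a} + t\mathbf{v}) - f(\mathbf{a})}{t} = Df(\mathbf{a})(\mathbf{v}),$$
as desired. The left-hand case follows similarly. QED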
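The relationship between partial derivatives, the Jacobian matrix, and directional derivatives can be explored numerically. The sketch below approximates each limit by a central difference quotient; the step size `h`, the helper names, and the sample function `f` are all assumptions made for illustration, and finite differences only approximate the true limits.

```python
import math

def jacobian(f, a, h=1e-6):
    """Approximate the matrix of partial derivatives of f: R^n -> R^m at a."""
    n = len(a)
    columns = []
    for j in range(n):
        fwd = list(a); fwd[j] += h      # a + h e_j
        bwd = list(a); bwd[j] -= h      # a - h e_j
        columns.append([(p - q) / (2 * h) for p, q in zip(f(fwd), f(bwd))])
    m = len(columns[0])
    # Column j holds the j-th partial derivative; transpose into rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def directional_derivative(f, a, v, h=1e-6):
    """Approximate the limit of (f(a + t v) - f(a)) / t as t -> 0."""
    fwd = [ai + h * vi for ai, vi in zip(a, v)]
    bwd = [ai - h * vi for ai, vi in zip(a, v)]
    return [(p - q) / (2 * h) for p, q in zip(f(fwd), f(bwd))]

def f(x):
    """Sample differentiable map R^2 -> R^2 (hypothetical)."""
    return [x[0] ** 2 * x[1], x[0] + math.sin(x[1])]

a, v = [1.0, 2.0], [3.0, -1.0]
J = jacobian(f, a)
Jv = [sum(Jij * vj for Jij, vj in zip(row, v)) for row in J]
print(J)                                # close to [[4, 1], [1, cos 2]]
print(Jv, directional_derivative(f, a, v))   # the two should nearly agree
```

For this sample function the exact partial derivatives are $2xy$, $x^2$, $1$, and $\cos y$, which is what the comment on the printed Jacobian refers to.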
Example: Let be defined by and
elsewhere. Note that , but
where the limit on the RHS does not exist. Hence, is not differentiable at .
Lecture 21 (2015-10-12)
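Let $U \subset \mathbb{R}^n$ be open, $f: U \to \mathbb{R}^m$, and $\mathbf{a} \in U$. Recall that $f$ is differentiable at $\mathbf{a}$ if there exists a linear transformation $Df(\mathbf{a}): \mathbb{R}^n \to \mathbb{R}^m$ such that
$$\lim_{\mathbf{h} \to \mathbf{0}} \frac{f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) - Df(\mathbf{a})(\mathbf{h})}{\|\mathbf{h}\|} = \mathbf{0}.$$
The idea of this definition is to require $Df(\mathbf{a})$ to be a good local linear approximation to $f$ at $\mathbf{a}$.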
(These notes have been lost! Recover them from a classmate.)
Lecture 22 (2015-10-14)
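Definition: Let $f: U \to \mathbb{R}$ (with $U \subset \mathbb{R}^n$ open) and $f$ be differentiable at $\mathbf{a} \in U$. The gradient of $f$ at $\mathbf{a}$ is
$$\nabla f(\mathbf{a}) = \begin{bmatrix} \dfrac{\partial f}{\partial x_1}(\mathbf{a}) \\ \vdots \\ \dfrac{\partial f}{\partial x_n}(\mathbf{a}) \end{bmatrix}.$$
Theorem: The tangent space to the graph of $f$ at $\mathbf{a}$ is the hyperplane in $\mathbb{R}^{n+1}$ with normal vector $\begin{bmatrix} \nabla f(\mathbf{a}) \\ -1 \end{bmatrix}$.
Proof: Recall that the tangent space to the graph of $f$ at $\mathbf{a}$ is the graph of $Df(\mathbf{a})$ (i.e., the set of points in $\mathbb{R}^{n+1}$ of the form $(\mathbf{x}, Df(\mathbf{a})(\mathbf{x}))$ for $\mathbf{x} \in \mathbb{R}^n$). Thus, the tangent space is uniquely specified by the equation $x_{n+1} = Df(\mathbf{a})(\mathbf{x})$. Now recall that
$$Df(\mathbf{a})(\mathbf{x}) = \nabla f(\mathbf{a}) \cdot \mathbf{x}.$$
The equation then becomes
$$\begin{bmatrix} \nabla f(\mathbf{a}) \\ -1 \end{bmatrix} \cdot \begin{bmatrix} \mathbf{x} \\ x_{n+1} \end{bmatrix} = 0,$$
as desired. QED
Remark: Note that we must distinguish between the linear tangent space, defined by $x_{n+1} = Df(\mathbf{a})(\mathbf{x})$, and the affine tangent space, defined by $x_{n+1} = f(\mathbf{a}) + Df(\mathbf{a})(\mathbf{x} - \mathbf{a})$.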
Example: Let be defined by . Find the equation of the tangent (hyper)-planes (both linear and affine) to the graph of at .
Solution: and , so the normal vector to both tangent planes is . Hence the linear tangent plane is given by , and the affine tangent plane by .
{Geometric interpretation of .} Define . By definition, . This means that the cross-section of the graph of parallel to the -plane containing has slope directly over . Furthermore, if we define , then .
Theorem: The tangent space to the graph of at is . (To be proven next lecture.)
Fall Break (2015-10-16)
Lecture 23 (2015-10-19)
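Let $f: U \to \mathbb{R}$ (with $U \subset \mathbb{R}^n$ open) and $f$ be differentiable at $\mathbf{a} \in U$. Recall the definition of the gradient
$$\nabla f(\mathbf{a}) = \begin{bmatrix} \dfrac{\partial f}{\partial x_1}(\mathbf{a}) \\ \vdots \\ \dfrac{\partial f}{\partial x_n}(\mathbf{a}) \end{bmatrix}.$$
Recall also that the tangent space to the graph of $f$ at $\mathbf{a}$ is the graph of its derivative $Df(\mathbf{a})$. It was proven last lecture that this is precisely the hyperplane in $\mathbb{R}^{n+1}$ with normal vector $\begin{bmatrix} \nabla f(\mathbf{a}) \\ -1 \end{bmatrix}$.
Let $1 \le j \le n$, and $\mathbf{e}_j$ be the $j$th standard basis vector. It follows from the previous result that
$$\mathbf{v}_j = \begin{bmatrix} \mathbf{e}_j \\ \dfrac{\partial f}{\partial x_j}(\mathbf{a}) \end{bmatrix}$$
is a member of this tangent space.
Theorem: The tangent space to the graph of $f$ at $\mathbf{a}$ is $\operatorname{Span}(\mathbf{v}_1, \dots, \mathbf{v}_n)$.
Proof: The reverse inclusion is clear from the previous observation. For the forward inclusion, let $(\mathbf{x}, x_{n+1})$ be a member of the tangent space. By definition we can write
$$x_{n+1} = Df(\mathbf{a})(\mathbf{x}) = \nabla f(\mathbf{a}) \cdot \mathbf{x} = \sum_{j=1}^{n} x_j \frac{\partial f}{\partial x_j}(\mathbf{a})$$
for some $\mathbf{x} \in \mathbb{R}^n$. It follows that
$$\begin{bmatrix} \mathbf{x} \\ x_{n+1} \end{bmatrix} = \sum_{j=1}^{n} x_j \begin{bmatrix} \mathbf{e}_j \\ \dfrac{\partial f}{\partial x_j}(\mathbf{a}) \end{bmatrix} = \sum_{j=1}^{n} x_j \mathbf{v}_j.$$
QED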
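What about $\nabla f(\mathbf{a})$? Does it have any nice interpretation of its own?
Physical interpretation of $\nabla f(\mathbf{a})$. Suppose you are standing at $\mathbf{a}$ and want to move so that $f$ changes as rapidly as possible. Which direction should you move in? In other words, find a unit vector $\mathbf{v}$ maximizing $D_{\mathbf{v}} f(\mathbf{a})$.
To solve this problem, recall that
$$D_{\mathbf{v}} f(\mathbf{a}) = Df(\mathbf{a})(\mathbf{v}) = \nabla f(\mathbf{a}) \cdot \mathbf{v}.$$
It follows from Cauchy-Schwarz that
$$|D_{\mathbf{v}} f(\mathbf{a})| \le \|\nabla f(\mathbf{a})\| \, \|\mathbf{v}\| = \|\nabla f(\mathbf{a})\|,$$
where we have equality iff $\nabla f(\mathbf{a})$ and $\mathbf{v}$ are parallel. Thus, the maximum rate of change occurs in the direction of $\nabla f(\mathbf{a})$, and moreover, the minimum rate of change occurs in the direction of $-\nabla f(\mathbf{a})$.
Geometric interpretation of $\nabla f(\mathbf{a})$.
Definition: If $f: \mathbb{R}^n \to \mathbb{R}$ and $c \in \mathbb{R}$, then
$$f^{-1}(c) = \{ \mathbf{x} \in \mathbb{R}^n : f(\mathbf{x}) = c \}$$
is called the level set of $f$ corresponding to $c$.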
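Let $f: \mathbb{R}^2 \to \mathbb{R}$ be defined by $f(x, y) = x^2 + y^2$. For $c > 0$, the level set $f^{-1}(c)$ is a circle of radius $\sqrt{c}$ centered at the origin in $\mathbb{R}^2$.
Let $T: \mathbb{R}^n \to \mathbb{R}$ be a linear transformation. Then the level set $T^{-1}(0)$ is a linear subspace of $\mathbb{R}^n$.
Definition: Let $f: U \to \mathbb{R}$ and $f$ be differentiable at $\mathbf{a} \in U$. If $c = f(\mathbf{a})$ and $\nabla f(\mathbf{a}) \ne \mathbf{0}$, then the tangent space to $f^{-1}(c)$ at $\mathbf{a}$ is $\{ \mathbf{x} \in \mathbb{R}^n : Df(\mathbf{a})(\mathbf{x}) = 0 \}$.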
Why is this a reasonable definition?
Example: Again take , and let . Also let . Then is a circle of radius , and
Thus, the tangent space is the line (hyperplane) in with equation
which, by drawing a graph, is easily seen to be a tangent line to .
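Theorem: The tangent space to the level set of $f$ at $\mathbf{a}$ is precisely the hyperplane in $\mathbb{R}^n$ with normal vector $\nabla f(\mathbf{a})$.
Proof: If $\mathbf{x} \in \mathbb{R}^n$, then $\mathbf{x}$ is in the tangent space iff $Df(\mathbf{a})(\mathbf{x}) = \nabla f(\mathbf{a}) \cdot \mathbf{x} = 0$. QED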
Lecture 24 (2015-10-21)
Review session for exam 2. Problems and solutions omitted.
Lecture 25 (2015-10-23)
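The chain rule. Let $f: \mathbb{R}^n \to \mathbb{R}^m$, $g: \mathbb{R}^m \to \mathbb{R}^p$, $\mathbf{a} \in \mathbb{R}^n$, and $\mathbf{b} = f(\mathbf{a})$.
Theorem: If $f$ is differentiable at $\mathbf{a}$ and $g$ is differentiable at $\mathbf{b}$, then $g \circ f$ is differentiable at $\mathbf{a}$, and $D(g \circ f)(\mathbf{a}) = Dg(\mathbf{b}) \circ Df(\mathbf{a})$.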
(Example computation omitted.)
Proof: Let and . Write , where and . Note that and are differentiable at , with and , and that .
Observe that is differentiable at with derivative , since
Similarly, is differentiable at with derivative . To show this, we introduce the Lipschitz condition.
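Lemma: If $f$ is differentiable at $\mathbf{a}$, then there exist a constant $M$ and $\delta > 0$ such that if $\|\mathbf{h}\| < \delta$, then $\|f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a})\| \le M \|\mathbf{h}\|$.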
Now define and pick a Lipschitz constant for . Then whenever , and
since . (Note that this proof is not quite correct. It needs some fixing up in the case that is locally constant and . Nonetheless, it contains the essence of the correct idea.)QED
Lecture 26 (2015-10-26)
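Definition: A linear system of $m$ equations in $n$ unknowns is a system of the form
$$\begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\ &\ \ \vdots \\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m \end{aligned}$$
with known coefficients $a_{ij}, b_i \in \mathbb{R}$ and unknowns $x_1, \dots, x_n$. A solution of this system is a vector $\mathbf{x} \in \mathbb{R}^n$ that satisfies each equation.
All such systems admit a matrix representation $A\mathbf{x} = \mathbf{b}$, where $A = [a_{ij}] \in \mathcal{M}_{m \times n}$ is the matrix of coefficients, $\mathbf{x} \in \mathbb{R}^n$ is the vector of unknowns, and $\mathbf{b} \in \mathbb{R}^m$ is the vector of constants. If $\mathbf{b} = \mathbf{0}$, then we say that the corresponding system is homogeneous. We call $[A \mid \mathbf{b}]$ the augmented matrix of the system. Note that the solution set is the intersection of $m$ affine hyperplanes in $\mathbb{R}^n$.
Given a linear transformation $T: \mathbb{R}^n \to \mathbb{R}^m$, where $A = [T]$, recall that
$$\ker T = \{ \mathbf{x} \in \mathbb{R}^n : T(\mathbf{x}) = \mathbf{0} \}$$
is the solution set of the linear system $A\mathbf{x} = \mathbf{0}$. Furthermore,
$$\operatorname{im} T = \{ T(\mathbf{x}) : \mathbf{x} \in \mathbb{R}^n \}$$
is precisely the linear span of the columns of $A$ (by the column POV).
Definition: The null space of a matrix $A \in \mathcal{M}_{m \times n}$ is the set of vectors
$$N(A) = \{ \mathbf{x} \in \mathbb{R}^n : A\mathbf{x} = \mathbf{0} \}.$$
Definition: The column space of a matrix $A \in \mathcal{M}_{m \times n}$ is
$$C(A) = \operatorname{Span}(\mathbf{a}_1, \dots, \mathbf{a}_n),$$
where $\mathbf{a}_j$ is the $j$th column of $A$.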
(Example of solving a linear system by Gaussian elimination omitted.)
Allowed row operations.
Type I: interchange two rows.
Type II: replace a row by a nonzero scalar multiple of itself.
Type III: replace a row by itself plus a scalar multiple of another row.
Lecture 28 (2015-10-30)
Linear systems solutions theorem. Let $A\mathbf{x} = \mathbf{b}$ be a linear system with $m$ equations and $n$ unknowns. To solve such a system, we find a sequence of elementary row operations by which $[A \mid \mathbf{b}] \to [A' \mid \mathbf{b}']$ in RREF. At this point there are two cases:
If $\mathbf{b}'$ has a pivot 1, there are no solutions. (The system is said to be inconsistent.)
If $\mathbf{b}'$ has no pivot 1, then there are solutions. (The system is said to be consistent.)
If there are no free variables, then the linear system has a unique solution, given by the first $n$ components of $\mathbf{b}'$.
If there is at least one free variable, then the system has infinitely many solutions. (The free variables can take on all values, and the pivot variables are determined by the free variables.)
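The case analysis above can be carried out mechanically. The following sketch row reduces an augmented matrix using the three row operation types and then classifies the system by where the pivots fall; the function names `rref` and `classify` and the sample systems are assumptions for illustration. Exact rational arithmetic is used so that no rounding issues arise.

```python
from fractions import Fraction

def rref(M):
    """Return (R, pivot_columns), where R is the reduced row echelon form of M."""
    R = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        # Look for a row at or below row r with a nonzero entry in column c.
        p = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if p is None:
            continue                                   # no pivot in this column
        R[r], R[p] = R[p], R[r]                        # Type I: interchange rows
        piv = R[r][c]
        R[r] = [v / piv for v in R[r]]                 # Type II: scale to a pivot 1
        for i in range(rows):
            if i != r and R[i][c] != 0:
                factor = R[i][c]
                # Type III: subtract a multiple of the pivot row.
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    return R, pivots

def classify(A, b):
    """Classify A x = b as 'inconsistent', 'unique', or 'infinitely many'."""
    n = len(A[0])
    augmented = [list(row) + [bi] for row, bi in zip(A, b)]
    _, pivots = rref(augmented)
    if n in pivots:                 # pivot 1 in the last (constants) column
        return "inconsistent"
    return "unique" if len(pivots) == n else "infinitely many"

print(classify([[1, 1], [1, -1]], [2, 0]))   # two lines crossing at one point
print(classify([[1, 1], [2, 2]], [1, 3]))    # two distinct parallel lines
print(classify([[1, 1], [2, 2]], [1, 2]))    # the same line written twice
```

The number of pivot columns among the first $n$ columns is the number of pivot variables, so a consistent system has a free variable exactly when that count is less than $n$.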
Example: Let . Determine its null space .
Solution: We solve the linear system . Omitting intermediate steps, we find
from which we see that
This is the standard form of the general solution, which in this case turns out to be a line in .
Example: With the same matrix as the previous example, are and members of the column space ?
Solution: We solve the linear systems and . Again omitting intermediate steps,
and since the rightmost column contains a pivot 1, has no solution. Hence . Similarly,
which has general solution
Note that this is an affine line in which is parallel to the line from the previous example. This suggests that “crushes” the family of lines parallel to into points, making its column space a plane. But how can we verify this?
Problem: With the same matrix as the previous example, what is its column space ?
Clearly, the column space $C(A)$ is the span of the columns of $A$. However, this is not a particularly useful description. Our approach will be to find an implicit description of $C(A)$, which will lead to a better explicit description. Let $\mathbf{b} = (b_1, b_2, b_3)$, and reduce $[A \mid \mathbf{b}]$.
Note that we cannot reach RREF without knowing , but we can still reach REF. The lowest row gives a constraint equation . This tells us that is a plane in with normal vector . In general, we will obtain more than one constraint equation, and these will form a linear system. In our case,
and we have
Lecture 29 (2015-11-02)
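Definition: The rank of a matrix $A$, denoted by $\operatorname{rank}(A)$, is the number of non-zero rows in its RREF.
Note: If $A$ is the coefficient matrix of a linear system, then $\operatorname{rank}(A)$ is the number of pivot variables.
Goal: We will eventually show that, for a suitable definition of the dimension operator, $\dim C(A) = \operatorname{rank}(A)$ and $\dim N(A) = n - \operatorname{rank}(A)$.
Let $A \in \mathcal{M}_{m \times n}$. Observe that $\operatorname{rank}(A) \le m$, since the number of non-zero rows cannot exceed the number of rows. But simultaneously $\operatorname{rank}(A) \le n$, since a linear system cannot have more pivot variables than variables. Hence $\operatorname{rank}(A) \le \min(m, n)$.
Recall that a function $f: X \to Y$ is said to be
surjective (or onto) if $f(X) = Y$, or in other words, if $\forall y \in Y$, $\exists x \in X$ such that $f(x) = y$.
injective (or one-to-one) if whenever $x_1, x_2 \in X$ satisfy $f(x_1) = f(x_2)$, we have $x_1 = x_2$.
bijective if it is surjective and injective, or in other words, if $\forall y \in Y$, $\exists! x \in X$ such that $f(x) = y$.
(Examples of surjective, injective, and bijective functions omitted.)
Observe that a linear transformation $T: \mathbb{R}^n \to \mathbb{R}^m$ represented by a matrix $A$ is
surjective if $A\mathbf{x} = \mathbf{b}$ has a solution for every $\mathbf{b} \in \mathbb{R}^m$,
injective if whenever $A\mathbf{x} = \mathbf{b}$ has a solution, it is unique, and
bijective if $A\mathbf{x} = \mathbf{b}$ has a unique solution for every $\mathbf{b} \in \mathbb{R}^m$.
Theorem 1: $\operatorname{rank}(A) = m$ iff $\forall \mathbf{b} \in \mathbb{R}^m$, $A\mathbf{x} = \mathbf{b}$ has a solution.
Proof: Suppose $\operatorname{rank}(A) = m$ and let $\mathbf{b} \in \mathbb{R}^m$ be given. Row reduce $[A \mid \mathbf{b}] \to [A' \mid \mathbf{b}']$ in RREF. Each row of $A'$ must have a pivot 1, so $\mathbf{b}'$ has no pivots, and it follows that $A\mathbf{x} = \mathbf{b}$ has a solution.
Conversely, assume for contradiction that $\operatorname{rank}(A) < m$. Row reduce $A \to A'$ in RREF. $A'$ must have at least one zero row, and consequently $[A' \mid \mathbf{e}_m]$ will have a pivot 1 in its bottom-right corner. Now apply the inverse of the row operations taking $A$ to $A'$, obtaining $[A \mid \mathbf{b}]$ from $[A' \mid \mathbf{e}_m]$. By hypothesis $A\mathbf{x} = \mathbf{b}$ has a solution, but $A'\mathbf{x} = \mathbf{e}_m$ has no solution. Since row operations preserve solution sets, this is the desired contradiction. QED
Theorem 2: Suppose $A\mathbf{x} = \mathbf{b}$ has a solution. Then $A\mathbf{x} = \mathbf{b}$ has a unique solution iff $A\mathbf{x} = \mathbf{0}$ has a unique solution.
Remark: Note that $A\mathbf{x} = \mathbf{0}$ always has the trivial solution $\mathbf{x} = \mathbf{0}$.
Proof: Let $\mathbf{x}_0$ be the unique solution of $A\mathbf{x} = \mathbf{b}$ and $\mathbf{y}$ be a solution of $A\mathbf{x} = \mathbf{0}$. Then $A(\mathbf{x}_0 + \mathbf{y}) = \mathbf{b}$, hence $\mathbf{x}_0 + \mathbf{y} = \mathbf{x}_0$, and $\mathbf{y} = \mathbf{0}$, as desired.
Conversely, suppose $\mathbf{x}_1, \mathbf{x}_2$ are solutions of $A\mathbf{x} = \mathbf{b}$. Then $A(\mathbf{x}_1 - \mathbf{x}_2) = \mathbf{0}$, and it follows that $\mathbf{x}_1 - \mathbf{x}_2 = \mathbf{0}$, or equivalently, that $\mathbf{x}_1 = \mathbf{x}_2$. QED
Lemma: $\operatorname{rank}(A) = n$ iff $A\mathbf{x} = \mathbf{0}$ has a unique solution.
Proof: $\operatorname{rank}(A) = n$ iff the system has no free variables, which occurs iff $A\mathbf{x} = \mathbf{0}$ has a unique solution. QED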
Lecture 30 (2015-11-04)
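Theorem: Let $T: \mathbb{R}^n \to \mathbb{R}^m$ be a linear transformation with standard matrix $A = [T]$. Then $\operatorname{rank}(A) = m$ iff $T$ is surjective, and $\operatorname{rank}(A) = n$ iff $T$ is injective.
Corollary: $\operatorname{rank}(A) = m = n$ iff $T$ is bijective.
Corollary: If $m = n$, then $T$ is surjective iff $T$ is injective.
Note that if $T$ is bijective, there exists an inverse $T^{-1}: \mathbb{R}^m \to \mathbb{R}^n$ defined by $T^{-1}(\mathbf{y}) = \mathbf{x}$ iff $T(\mathbf{x}) = \mathbf{y}$.
Homework: If $T$ is a bijective linear transformation, then $T^{-1}$ is also a linear transformation.
Theorem 4: $T: \mathbb{R}^n \to \mathbb{R}^n$ is bijective iff $A = [T]$ is invertible.
Proof: Suppose $T$ is bijective, and let $B = [T^{-1}]$. Then $AB = [T][T^{-1}] = [T \circ T^{-1}] = I_n$. Likewise, $BA = I_n$. Thus, $B = A^{-1}$.
Conversely, suppose $A$ is invertible, and let $S = T_{A^{-1}}$. Then $[T \circ S] = AA^{-1} = I_n$, and hence $T \circ S = \mathrm{id}$. Similarly $S \circ T = \mathrm{id}$, and bijectivity of $T$ follows via diagram chasing. QED
Corollary: Suppose $A \in \mathcal{M}_n$. Then $A$ is invertible iff $\operatorname{rank}(A) = n$, which occurs iff the RREF of $A$ is $I_n$.
Corollary: Suppose $A \in \mathcal{M}_n$. TFAE:
$\forall \mathbf{b} \in \mathbb{R}^n$, $A\mathbf{x} = \mathbf{b}$ has a solution.
$\forall \mathbf{b} \in \mathbb{R}^n$, $A\mathbf{x} = \mathbf{b}$ has a unique solution.
$A\mathbf{x} = \mathbf{0}$ has a unique solution.
$A$ is invertible.
The RREF of $A$ is $I_n$.
$\operatorname{rank}(A) = n$.
Theorem: If $A \in \mathcal{M}_n$ and $A$ has either a left or right inverse, then $A$ is invertible. Moreover, either one-sided inverse is $A^{-1}$.
Proof: Suppose $A$ has a left inverse $B$. Then $A\mathbf{x} = \mathbf{0}$ has a unique solution (since $\mathbf{x} = BA\mathbf{x} = B\mathbf{0} = \mathbf{0}$), and hence $A$ is invertible. Moreover, $BA = I_n$, so $B = B(AA^{-1}) = (BA)A^{-1} = A^{-1}$.
Now suppose $A$ has a right inverse $C$. Then $C$ has left inverse $A$, and by the previous case, inverse $A$. Hence, $A$ is invertible and $A^{-1} = C$. QED
How do we find the inverse of an invertible matrix $A \in \mathcal{M}_n$? We know that $A \to I_n$ by row operations. Apply the same sequence of operations to the augmented matrix $[A \mid I_n]$ to obtain a matrix $[I_n \mid B]$.
Claim: $B = A^{-1}$.
Proof: Let $1 \le j \le n$. Let $\mathbf{b}_j$, the $j$th column of $B$, be the solution of $A\mathbf{x} = \mathbf{e}_j$. Then $A\mathbf{b}_j = \mathbf{e}_j$, and $AB = I_n$. QED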
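The procedure of row reducing $[A \mid I_n]$ until the left block becomes the identity can be sketched as follows. This is one possible implementation under stated assumptions: the function name `inverse` and the sample matrices are hypothetical, and exact `Fraction` arithmetic is used so the result can be compared with equality.

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix by row reducing [A | I] to [I | B]; B is returned."""
    n = len(A)
    # Build the augmented matrix [A | I_n] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        # Find a row at or below row c with a nonzero entry in column c.
        p = next((i for i in range(c, n) if M[i][c] != 0), None)
        if p is None:
            raise ValueError("matrix is not invertible (rank < n)")
        M[c], M[p] = M[p], M[c]                       # interchange rows
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]                # scale to get a pivot 1
        for i in range(n):
            if i != c and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]                     # right block of [I | B]

print(inverse([[2, 1], [5, 3]]))
```

If some column has no available pivot, the RREF of the matrix is not the identity, so the sketch raises an error instead of returning a matrix.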
Lecture 31 (2015-11-06)
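Linear independence, basis, and dimension. Let $\mathbf{v}_1, \dots, \mathbf{v}_k \in \mathbb{R}^n$, and $A$ be the $n \times k$ matrix with columns $\mathbf{v}_1, \dots, \mathbf{v}_k$. Let $T: \mathbb{R}^k \to \mathbb{R}^n$ be the linear transformation represented by $A$. Recall that if $c_1, \dots, c_k \in \mathbb{R}$, then $c_1 \mathbf{v}_1 + \cdots + c_k \mathbf{v}_k$ is a linear combination of $\mathbf{v}_1, \dots, \mathbf{v}_k$, and we define $\operatorname{Span}(\mathbf{v}_1, \dots, \mathbf{v}_k)$ to be the set of all such linear combinations.
Observe that, using the column POV, $A\mathbf{c}$ is a linear combination of $\mathbf{v}_1, \dots, \mathbf{v}_k$ with coefficients $c_1, \dots, c_k$. Hence, $\operatorname{Span}(\mathbf{v}_1, \dots, \mathbf{v}_k) = C(A)$.
Definition: Vectors $\mathbf{v}_1, \dots, \mathbf{v}_k \in \mathbb{R}^n$ are linearly independent if
$$c_1 \mathbf{v}_1 + \cdots + c_k \mathbf{v}_k = \mathbf{0}$$
only when $c_1 = \cdots = c_k = 0$.
Note that $\mathbf{v}_1, \dots, \mathbf{v}_k$ are linearly independent iff $A\mathbf{c} = \mathbf{0}$ has only the trivial solution, which occurs iff $T$ is injective, which occurs iff $\operatorname{rank}(A) = k$, which occurs iff $\forall \mathbf{b} \in C(A)$, $A\mathbf{c} = \mathbf{b}$ has a unique solution.
Definition: Let $V$ be a linear subspace of $\mathbb{R}^n$. A basis for $V$ is a set of vectors $\{\mathbf{v}_1, \dots, \mathbf{v}_k\} \subset V$ such that
$\operatorname{Span}(\mathbf{v}_1, \dots, \mathbf{v}_k) = V$, and
$\mathbf{v}_1, \dots, \mathbf{v}_k$ are linearly independent.
In other words, if $V = C(A)$ where $A \in \mathcal{M}_{n \times k}$ has rank $k$.
Goals: We would like to show that every $V$ has a basis, and that any two bases for $V$ have the same number of elements. That number will be defined as the dimension of $V$.
Theorem: $\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$ is a basis for $V$ iff every $\mathbf{v} \in V$ can be written as a linear combination of $\mathbf{v}_1, \dots, \mathbf{v}_k$ in a unique way.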
(Examples omitted. These took up the rest of the lecture.)
Lecture 32 (2015-11-09)
Let , and . Let be the linear transformation represented by . Recall that TFAE: