Vector Spaces
This chapter provides an overview of Gaussian elimination. The notes assume familiarity with the complex and real numbers.
Linear Equations
linear equation. An equation of the form:

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b$$

is a linear equation in the variables $x_1, x_2, \ldots, x_n$, where $a_1, a_2, \ldots, a_n$ and $b$ are constants.
An expression of the form:

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$$

is called a linear combination. The $a_i$ are the combination's coefficients, and the $x_i$ are the combination's variables. Append an equals sign and a constant at the end:

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b,$$

and we get a linear equation, where $b$ is a constant. For example, these equations are all linear equations:

$$3x + 2y = 5, \qquad x_1 - 4x_2 + x_3 = 0, \qquad \tfrac{1}{2}a - b = 7.$$
Linear equations are the bedrock of modern technology. In the world of pure mathematics, we can simply write $\pi$ or $\sqrt{2}$ and move on. Computers, however, don't have the same luxury. We must model and approximate these abstractions to realize them physically. For example, the derivative of a differentiable function $f$ can be viewed as a linear approximation of $f$.
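To make that last remark concrete, recall the tangent-line form of the derivative (a standard fact, spelled out here because the notes mention it only in passing): if $f$ is differentiable at $a$, then for $x$ near $a$,

$$f(x) \approx f(a) + f'(a)(x - a),$$

and the right-hand side is a linear expression in $x$.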
The Solution Set
The set of all values that solve a linear equation is called the equation's solution set. For example, consider the linear equation:

$$2x + y = 4.$$

This equation turns out to have infinitely many solutions. Some of these solutions are:

$$(x, y) = (0, 4), \quad (1, 2), \quad (2, 0), \quad (-1, 6).$$

We can express the solution set for the equation above using parametric form:

$$\{(t,\; 4 - 2t) : t \in \mathbb{R}\}.$$
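As a quick sanity check, here is a small Python snippet (our own illustration; the notes don't assume any particular software) verifying that points drawn from the parametric form satisfy the original equation:

```python
# Sample several values of the parameter t and confirm that the point
# (t, 4 - 2t) from the parametric form satisfies 2x + y = 4.
for t in [-2.0, 0.0, 1.5, 10.0]:
    x, y = t, 4 - 2 * t
    assert 2 * x + y == 4
print("every sampled point satisfies 2x + y = 4")
```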
Linear Systems
If we gather two or more linear equations and treat them as a single collection, we get a linear system (or system of linear equations):

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned}$$

If we can find a set of numbers (specifically, an $n$-tuple $(s_1, s_2, \ldots, s_n)$) that we can plug into each of the $x_j$ such that all of the equations are solved simultaneously, then we call that tuple the system's solution. If the linear system has at least one solution, we can describe the system as consistent. If the system has no solutions, we describe it as inconsistent.
In the system above, the subscript $i$ in $a_{ij}$ is a positive integer, and denotes the row index. The subscript $j$ is also a positive integer, and denotes the column index. Much like how rectangles are described with $\text{width} \times \text{height}$, linear systems have the property $m \times n$ (read "$m$ by $n$"). We call this the linear system's order.

Notice that the row index corresponds to the number of equations we have ($m$ equations), and the column index corresponds to the number of variables ($n$ variables). This yields a few more properties about linear systems:
- If $m < n$, there are more unknowns than equations. In this case, we say the system is underspecified.
- If $m > n$, there are more equations than unknowns, and we say the system is overspecified.
Matrices
Consider the linear system:

$$\begin{aligned} x + 2y &= 4 \\ 3x - y &= 5 \end{aligned}$$

Early on, mathematicians recognized that having to write linear systems over and over again is a pain. Accordingly, they invented a new construct called a matrix, of which there are two types: the coefficient matrix and the augmented matrix. With a coefficient matrix, we can express the system above as:

$$\begin{bmatrix} 1 & 2 \\ 3 & -1 \end{bmatrix}$$

Each number in the matrix above corresponds to a coefficient in the linear system. There's also the augmented matrix:

$$\left[\begin{array}{cc|c} 1 & 2 & 4 \\ 3 & -1 & 5 \end{array}\right]$$

This is simply the coefficient matrix, but with a column for the right-hand side of the equality, separated by a vertical bar. By using a matrix, we can assign the linear system to a variable:

$$A = \begin{bmatrix} 1 & 2 \\ 3 & -1 \end{bmatrix}, \qquad \mathbf{b} = \begin{bmatrix} 4 \\ 5 \end{bmatrix}$$

Now comes the cool part: we can then express the linear system as a linear equation:

$$A\mathbf{x} = \mathbf{b}, \qquad \text{where } \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}.$$
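This is exactly the form numerical libraries consume. A minimal sketch using NumPy (an assumed dependency, not something the notes require):

```python
import numpy as np

# The system x + 2y = 4, 3x - y = 5, written as A @ x = b.
A = np.array([[1.0, 2.0],
              [3.0, -1.0]])  # coefficient matrix
b = np.array([4.0, 5.0])     # right-hand side

x = np.linalg.solve(A, b)    # solve the linear equation A @ x = b
print(x)                     # [2. 1.], i.e., x = 2 and y = 1
```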
Matrices also make solution sets easier to write. For example, the linear system:

$$\begin{aligned} x + y &= 3 \\ x - y &= 1 \end{aligned}$$

has the matrix:

$$\left[\begin{array}{cc|c} 1 & 1 & 3 \\ 1 & -1 & 1 \end{array}\right]$$

and the solution set:

$$\left\{ \begin{bmatrix} 2 \\ 1 \end{bmatrix} \right\}$$

Each "1-column matrix" in the solution set is called a column vector, or simply a vector. A matrix with a single row is called a row vector (it, too, may be called simply a vector). Each number in a vector is called a vector component. There's a particularly special vector we'll see often later called the zero vector, a vector whose components are all zeroes:

$$\mathbf{0} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
How did we get that solution? Through a method called Gaussian elimination. Before we get to that topic, let's say a few words about variables. Like other areas of mathematics, linear algebra has certain naming conventions. Matrices are traditionally named with uppercase, bold, Roman letters: $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$. Vectors are usually lowercase Roman or Greek letters, either written with an arrow overhead or in bold: $\vec{u}$, $\vec{v}$, $\mathbf{w}$.
Gaussian Elimination
Gaussian elimination (also called Gauss's Method) is a method of solving linear systems:

gauss's method. Let $L$ be a linear system. If $L$ is changed to a linear system $L'$ with any of the following operations:
- one equation is swapped with another,
- one equation has both of its sides multiplied by a nonzero constant,
- one equation is replaced by the sum of itself and a multiple of another,
then $L$ and $L'$ have the same solution.
Each of the three operations in Gauss's method is called an elementary row operation. Let's see what each operation does with the system:

$$\begin{aligned} 4x + y &= 9 \\ 2x - 3y &= 1 \end{aligned}$$
Swapping. The first operation in Gauss's method is called swapping. Gauss's method tells us that these two systems have the same solution:

$$\begin{aligned} 4x + y &= 9 \\ 2x - 3y &= 1 \end{aligned} \qquad\qquad \begin{aligned} 2x - 3y &= 1 \\ 4x + y &= 9 \end{aligned}$$
Scaling. The second operation is called scaling. Again, these two systems have the same solution:

$$\begin{aligned} 4x + y &= 9 \\ 2x - 3y &= 1 \end{aligned} \qquad\qquad \begin{aligned} 12x + 3y &= 27 \\ 2x - 3y &= 1 \end{aligned}$$

Above, we multiplied the first equation of the system on the right by the constant 3.
Pivoting. The third operation is called pivoting. This operation is a little more elaborate. Here, we first choose one of the system's equations (which one we choose is a question we'll discuss at length later). Using our example system, let's pick the second:

$$2x - 3y = 1$$

Then, we scale our equation of choice. Let's use 2 (why we choose 2 is another question we'll save for later):

$$4x - 6y = 2$$

Next, we choose a second equation (other than what we chose first) and perform the same. Since we only have two equations, we scale the first. Let's use the constant $-1$:

$$-4x - y = -9$$

Then, we add the scaled equations:

$$(4x - 6y) + (-4x - y) = 2 + (-9) \quad\implies\quad -7y = -7$$

Gauss's method tells us that, if we replace the equation we chose first with the new equation above, we get a linear system whose solution is the same as the original's:

$$\begin{aligned} 4x + y &= 9 \\ -7y &= -7 \end{aligned}$$
Gauss's Method with Matrices
We saw what Gaussian elimination looks like on an explicit linear system. But, as we saw previously, writing linear systems explicitly is tedious, so we use matrices. Let's see what Gauss's method looks like with matrices. Using the same linear system we saw previously:

$$\begin{aligned} 4x + y &= 9 \\ 2x - 3y &= 1 \end{aligned}$$

we have the following augmented matrix:

$$\left[\begin{array}{cc|c} 4 & 1 & 9 \\ 2 & -3 & 1 \end{array}\right]$$

The swap operation is expressed as follows:

$$\left[\begin{array}{cc|c} 4 & 1 & 9 \\ 2 & -3 & 1 \end{array}\right] \;\overset{R_1 \leftrightarrow R_2}{\sim}\; \left[\begin{array}{cc|c} 2 & -3 & 1 \\ 4 & 1 & 9 \end{array}\right]$$

Above, the symbol ($\sim$) means "has the same solution as," and the expression $R_1 \leftrightarrow R_2$ means "Rows 1 and 2 were swapped." The scaling operation is expressed as:

$$\left[\begin{array}{cc|c} 4 & 1 & 9 \\ 2 & -3 & 1 \end{array}\right] \;\overset{R_1 \to 3R_1}{\sim}\; \left[\begin{array}{cc|c} 12 & 3 & 27 \\ 2 & -3 & 1 \end{array}\right]$$

The notation $R_1 \to 3R_1$ means: "Row 1 was replaced with 3 times row 1." The pivoting operation is expressed as:

$$\left[\begin{array}{cc|c} 4 & 1 & 9 \\ 2 & -3 & 1 \end{array}\right] \;\overset{R_2 \to 2R_2 - R_1}{\sim}\; \left[\begin{array}{cc|c} 4 & 1 & 9 \\ 0 & -7 & -7 \end{array}\right]$$

The notation $R_2 \to 2R_2 - R_1$ means: "Row 2 was replaced with 2 times row 2 minus row 1." The remarkable aspect of Gaussian elimination is that, when used on matrices, a wide variety of patterns appear. And where there are patterns, there are theorems: the building blocks of a mathematical field.
The Point of Gaussian Elimination
Whenever we work with linear systems, there are two key questions:

- Does the system have at least one solution (i.e., is the system consistent)?
- If a solution does exist, is the solution unique?
The point of Gaussian elimination is to manipulate the linear system into a form that's easier to solve. This is done through the process of row reduction. In row reduction, the goal is this:
Use elementary row operations to eliminate variables from select rows of an augmented matrix.
What are the select rows? It depends on the system's order. Let's consider the simplest case, a $2 \times 2$ system. For the $2 \times 2$ system, the goal is to obtain a 1 in the top-left corner and a 0 in the bottom-left corner:

$$\left[\begin{array}{cc|c} 1 & * & * \\ 0 & * & * \end{array}\right]$$

If we execute Gauss's method correctly, we may get a matrix that takes one of the forms below. As it turns out, these forms imply how many solutions the system has:

$$\left[\begin{array}{cc|c} 1 & 0 & a \\ 0 & 1 & b \end{array}\right] \;\text{(one solution)} \qquad \left[\begin{array}{cc|c} 1 & * & a \\ 0 & 0 & 0 \end{array}\right] \;\text{(infinitely many)} \qquad \left[\begin{array}{cc|c} 1 & * & a \\ 0 & 0 & b \end{array}\right],\; b \neq 0 \;\text{(no solutions)}$$
To see why we get these patterns, let's introduce some new vocabulary.
Echelon Form
Suppose $L$ is the linear system to the left, with its corresponding matrix $A$ to the right:

$$\begin{aligned} x + 2y - z &= 4 \\ 3y + z &= 5 \end{aligned} \qquad\qquad \left[\begin{array}{ccc|c} 1 & 2 & -1 & 4 \\ 0 & 3 & 1 & 5 \end{array}\right]$$

The first nonzero entry in each row of $A$ is called the leading entry, and it corresponds to its equation's leading variable. All the other nonzero entries are called free entries, and they correspond to their equations' free variables. Using this terminology, we state the formal definition of echelon form:
echelon form. Let $A$ be the matrix

$$A = [a_{ij}]_{m \times n},$$

where $1 \le i \le m$ and $1 \le j \le n$. We say that $A$ is in echelon form if, and only if, the following propositions are true for $A$: First, if there exists a row $R_s$ whose entries are all zero and a row $R_t$ with at least one nonzero entry, then $t < s$. Second, given a row $R_i$ and a row $R_{i+1}$, if $R_i$ contains a leading entry at the position $(i, j)$ and $R_{i+1}$ contains a leading entry at $(i+1, k)$, then $j < k$. Third, if a row $R_i$ contains a leading entry at the position $(i, j)$, then for all rows $R_t$ where $t > i$, the entry at $(t, j)$ is zero.
Let's break this definition down into a checklist (a programmatic version follows the list). A matrix $A$ is in echelon form if, and only if, it satisfies the following properties.

- $A$ is a rectangular matrix.
- If $A$ contains rows with all zeroes (called zero rows) and rows that are not all zeroes (called nonzero rows), then no zero row appears before a nonzero row, from top to bottom.
- If a row contains a leading entry, then that entry lies in a column strictly to the right of the leading entry of the row above it.
- If a row contains a leading entry, then for all the rows after it, the entries in that leading entry's column are zeroes.
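Here is the checklist as a small Python predicate (our own sketch; `is_echelon` and `leading_index` are hypothetical helper names, not standard functions). Note that the last two properties collapse into one check: if each leading entry sits strictly to the right of the one above it, the entries below every leading entry are forced to be zero.

```python
def leading_index(row):
    """Column index of the first nonzero entry, or None for a zero row."""
    for j, entry in enumerate(row):
        if entry != 0:
            return j
    return None

def is_echelon(rows):
    """Check the echelon-form checklist on a rectangular list of rows."""
    last_lead = -1          # column of the previous row's leading entry
    seen_zero_row = False
    for row in rows:
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True      # zero rows may only sit at the bottom
            continue
        if seen_zero_row:
            return False              # a nonzero row below a zero row
        if lead <= last_lead:
            return False              # leading entry not strictly to the right
        last_lead = lead
    return True

print(is_echelon([[8, -2, 1], [0, 3, -1], [0, 0, 2]]))  # True
print(is_echelon([[0, 3, -1], [8, -2, 1], [0, 0, 2]]))  # False
```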
If a matrix is in echelon form, we call the matrix an echelon matrix. The word "echelon" is descriptive: the matrix is "steplike." For example, the matrices below are all in echelon form. The symbol $\blacksquare$ indicates that the entry is a nonzero value, and the symbol $*$ indicates that the entry is any value (zero or nonzero):

$$\begin{bmatrix} \blacksquare & * & * \\ 0 & \blacksquare & * \\ 0 & 0 & \blacksquare \end{bmatrix} \qquad \begin{bmatrix} \blacksquare & * & * & * \\ 0 & 0 & \blacksquare & * \\ 0 & 0 & 0 & 0 \end{bmatrix} \qquad \begin{bmatrix} 0 & \blacksquare & * & * & * \\ 0 & 0 & \blacksquare & * & * \\ 0 & 0 & 0 & 0 & \blacksquare \end{bmatrix}$$
Notice how the zeroes form a step-like (echelon) arrangement. Once a matrix is in echelon form, we can apply back substitution to solve the system. For example, consider the following linear system, already in echelon form:
$$\left[\begin{array}{ccc|c} 8 & -2 & 1 & 4 \\ 0 & 3 & -1 & 7 \\ 0 & 0 & 2 & 4 \end{array}\right]$$
The last row tells us that $z = 2$, since $2z = 4$. Plugging $z = 2$ into the second row, we get $3y - 2 = 7$, which yields $y = 3$. And plugging $y = 3$ and $z = 2$ into the first row, we get $8x - 6 + 2 = 4$, so $x = 1$.
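Back substitution is mechanical enough to automate. A sketch in NumPy (again an assumed dependency) that walks the rows from bottom to top exactly as we just did by hand:

```python
import numpy as np

# Coefficients and right-hand side of the echelon system above.
U = np.array([[8.0, -2.0, 1.0],
              [0.0, 3.0, -1.0],
              [0.0, 0.0, 2.0]])
b = np.array([4.0, 7.0, 4.0])

n = len(b)
x = np.zeros(n)
for i in range(n - 1, -1, -1):
    # Subtract the already-solved variables, then divide by the
    # leading coefficient of row i.
    x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]

print(x)  # [1. 3. 2.], i.e., x = 1, y = 3, z = 2
```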
Reduced Echelon Form
From the definition of echelon form, we can state the formal definition of reduced echelon form.
reduced echelon form. Let $A$ be the matrix

$$A = [a_{ij}]_{m \times n},$$

let $R_1, \ldots, R_m$ be the rows of $A$, and let $C_1, \ldots, C_n$ be the columns of $A$. $A$ is in reduced echelon form if, and only if, the following propositions are true for $A$: First, $A$ is in echelon form. Second, for each row $R_i$ with at least one nonzero entry (that is, a nonzero row), the leading entry of $R_i$ is 1. Third, if $R_i$ is a nonzero row whose leading 1 lies in column $C_j$, then that leading 1 is the only nonzero entry in $C_j$.
Once again, let's break this definition down into a checklist. A matrix $A$ is in reduced echelon form if, and only if, the following conditions are met:

- $A$ is in echelon form,
- if $R_i$ is a nonzero row, then its leading entry is 1, and
- each leading 1 in $A$ is the only nonzero entry in its column.
For example, the following are all reduced echelon matrices:

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad \begin{bmatrix} 1 & 0 & * \\ 0 & 1 & * \\ 0 & 0 & 0 \end{bmatrix} \qquad \begin{bmatrix} 0 & 1 & * & 0 & * \\ 0 & 0 & 0 & 1 & * \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
Now that we've seen the echelon definitions, we state another motivation for Gaussian elimination: any nonzero matrix can be row reduced, using Gaussian elimination, into an echelon matrix (generally more than one, though its reduced echelon matrix is unique). If we reduce a matrix to an echelon matrix, then finding the solution of a linear system becomes much easier because of several facts.
Given a reduced echelon matrix

$$\left[\begin{array}{ccccc|c} 1 & 0 & 0 & \cdots & 0 & b_1 \\ 0 & 1 & 0 & \cdots & 0 & b_2 \\ 0 & 0 & 1 & \cdots & 0 & b_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & b_k \\ 0 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{array}\right]$$
the first nonzero entry in each row is called the pivot, and each column that contains a pivot is called the pivot column. Once we have pivots, we can apply the row reduction algorithm:
row reduction algorithm. Let $A$ be a rectangular matrix. Then the row reduction algorithm proceeds as follows (a code sketch follows the list).

1. Apply Gaussian elimination to transform $A$ into its echelon form (steps 3 through 9, the forward phase).
2. Apply Gaussian elimination to transform the echelon form into its reduced echelon form (steps 10 through 12, the backward phase).
3. Go to the leftmost nonzero column (the pivot column).
4. Select a nonzero entry in the pivot column as the pivot.
5. If the pivot is not at the top-left position, interchange the rows such that the pivot is at the top-left position.
6. Create zeroes in all positions below the pivot using row replacement.
7. Create a new submatrix without the rows above and including the pivot.
8. Apply steps 3 to 6 to the submatrix.
9. With the resulting submatrix, repeat steps 7 and 8 until there are no more nonzero rows that can be modified.
10. With the final submatrix, go to the rightmost pivot.
11. From the bottom to the left, create zeroes above each pivot.
12. If a pivot is not 1, make it a 1 via scaling.
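Here is the whole algorithm as a Python sketch (a hypothetical `rref` helper of our own, using exact rational arithmetic so that scaling never loses precision). It interleaves step 11's zero-creation above pivots with the forward phase, a common Gauss-Jordan variant, rather than running steps 10 through 12 as a literal second pass:

```python
from fractions import Fraction

def rref(rows):
    """Reduce a rectangular list of rows to reduced echelon form."""
    A = [[Fraction(x) for x in row] for row in rows]
    m, n = len(A), len(A[0])
    pivot_row = 0
    for col in range(n):  # scan columns left to right (step 3)
        # Select a nonzero entry at or below pivot_row as the pivot (step 4).
        r = next((r for r in range(pivot_row, m) if A[r][col] != 0), None)
        if r is None:
            continue
        A[pivot_row], A[r] = A[r], A[pivot_row]       # interchange (step 5)
        p = A[pivot_row][col]
        A[pivot_row] = [x / p for x in A[pivot_row]]  # make the pivot 1 (step 12)
        # Row replacement creates zeroes below and above the pivot (steps 6, 11).
        for i in range(m):
            if i != pivot_row and A[i][col] != 0:
                f = A[i][col]
                A[i] = [a - f * b for a, b in zip(A[i], A[pivot_row])]
        pivot_row += 1
        if pivot_row == m:
            break
    return A

# The augmented matrix of 4x + y = 9, 2x - 3y = 1.
result = rref([[4, 1, 9], [2, -3, 1]])
print([[float(x) for x in row] for row in result])
# [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]], i.e., x = 2 and y = 1
```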
Vector Spaces
vector space. Let $\mathcal{V}$ be a set of vectors, $F$ a set of scalars, $+$ the operation of vector addition, and $\cdot$ the operation of scalar multiplication. Then the algebra $(\mathcal{V}, F, +, \cdot)$ is a vector space if, and only if, for all vectors $u, v, w \in \mathcal{V}$ and for all scalars $a, b \in F$, the following properties hold:

- $\mathcal{V}$ is closed under vector addition: $u + v \in \mathcal{V}$.
- Vector addition is commutative on $\mathcal{V}$: $u + v = v + u$.
- Vector addition is associative on $\mathcal{V}$: $(u + v) + w = u + (v + w)$.
- For all $v \in \mathcal{V}$, there exists an additive identity $\mathbf{0} \in \mathcal{V}$ such that $v + \mathbf{0} = v$.
- For all $v \in \mathcal{V}$, there exists an additive inverse $-v \in \mathcal{V}$ such that $v + (-v) = \mathbf{0}$.
- $\mathcal{V}$ is closed under scalar multiplication: $a \cdot v \in \mathcal{V}$.
- Scalar multiplication is distributive over vector addition on $\mathcal{V}$: $a \cdot (u + v) = a \cdot u + a \cdot v$ and $(a + b) \cdot v = a \cdot v + b \cdot v$.
- Scalar multiplication is associative on $\mathcal{V}$: $a \cdot (b \cdot v) = (ab) \cdot v$.
- For all $v \in \mathcal{V}$, there exists a multiplicative identity $1 \in F$ such that $1 \cdot v = v$.
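Definitions like this one invite spot-checking. The sketch below (our own illustration) samples random vectors in $\mathbb{Q}^2$, viewed as a vector space over $\mathbb{Q}$, and asserts a few of the axioms; exact `Fraction` arithmetic is used so equality checks aren't disturbed by floating-point rounding:

```python
import random
from fractions import Fraction as F

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def smul(a, v):
    return (a * v[0], a * v[1])

random.seed(0)
rand = lambda: F(random.randint(-9, 9), random.randint(1, 9))
zero = (F(0), F(0))

for _ in range(100):
    u, v = (rand(), rand()), (rand(), rand())
    a, b = rand(), rand()
    assert add(u, v) == add(v, u)                             # commutativity
    assert add(u, zero) == u                                  # additive identity
    assert smul(F(1), u) == u                                 # multiplicative identity
    assert smul(a, add(u, v)) == add(smul(a, u), smul(a, v))  # distributivity
    assert smul(a, smul(b, u)) == smul(a * b, u)              # associativity

print("all sampled axiom instances hold in Q^2")
```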
A vector space, in short, is just a very special set. Its members may be lists, vectors, functions, or some other strange object. Its key requirement is that we can meaningfully perform vector addition and scalar multiplication. The properties laid out in the definition above establish what "meaningfully" means. Moreover, notice that we use different fonts to denote a vector space. We do so to emphasize the difference between a field and a vector space. All fields are vector spaces; a field $F$ is simply a vector space over itself. Not all vector spaces, however, are fields. To list just one reason, fields require commutativity of multiplication for all of the carrier set's members. As we'll see later, that's a very special property, and it's not one that many vector spaces satisfy.
Conventions
In these notes, we adopt the following conventions:
convention. $\mathcal{V}$ denotes a vector space over a field $F$.
convention. $\mathbf{0}$ denotes a zero vector.
Vector Properties
property. For every $v \in \mathcal{V}$, it is true that $0 \cdot v = \mathbf{0}$.

property. For every $a \in F$, it is true that $a \cdot \mathbf{0} = \mathbf{0}$.

property. For every $v \in \mathcal{V}$, it is true that $(-1) \cdot v = -v$.
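As a taste of why the first property holds (a standard two-step argument, included because the notes state the property without proof): by distributivity,

$$0 \cdot v = (0 + 0) \cdot v = 0 \cdot v + 0 \cdot v,$$

and adding the additive inverse of $0 \cdot v$ to both sides leaves $\mathbf{0} = 0 \cdot v$.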
Subspaces
subspace. Given a vector space $\mathcal{V}$, we call a subset $U \subseteq \mathcal{V}$ a subspace of $\mathcal{V}$ if, and only if, $U$ is also a vector space (under the same addition and scalar multiplication). That is, if, and only if, the following conditions are satisfied: (1) $\mathbf{0} \in U$; (2) if $u, w \in U$, then $u + w \in U$; (3) if $a \in F$ and $u \in U$, then $a \cdot u \in U$.
sum of subsets. Let $U_1, \ldots, U_m \subseteq \mathcal{V}$. The sum $U_1 + \cdots + U_m$ is defined as

$$U_1 + \cdots + U_m = \{u_1 + \cdots + u_m : u_1 \in U_1, \ldots, u_m \in U_m\}.$$
Linear Combinations
linear combination. A vector of the form

$$a_1 v_1 + a_2 v_2 + \cdots + a_n v_n,$$

where $v_1, \ldots, v_n \in \mathcal{V}$ and $a_1, \ldots, a_n \in F$, is called a linear combination of the list $(v_1, \ldots, v_n)$ of vectors in $\mathcal{V}$.
span. Let $(v_1, \ldots, v_n)$ be a list of vectors, where each vector is a vector in $\mathcal{V}$. The set of all linear combinations of $(v_1, \ldots, v_n)$ is called the span of $(v_1, \ldots, v_n)$, denoted $\operatorname{span}(v_1, \ldots, v_n)$. The span of an empty list is defined to be $\{\mathbf{0}\}$.
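Span questions reduce to linear systems: $w \in \operatorname{span}(v_1, v_2)$ exactly when the system with columns $v_1, v_2$ and right-hand side $w$ is consistent. A NumPy sketch (an assumed dependency, with illustrative vectors of our own choosing):

```python
import numpy as np

# Is w in span(v1, v2)? Stack v1, v2 as columns and ask whether
# A @ c = w has a solution in the coefficients c.
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
w = np.array([2.0, 3.0, 5.0])

A = np.column_stack([v1, v2])
c, *_ = np.linalg.lstsq(A, w, rcond=None)  # least-squares coefficients
in_span = np.allclose(A @ c, w)            # exact fit <=> w is in the span

print(in_span, c)  # True [2. 3.], i.e., w = 2*v1 + 3*v2
```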