Vector Spaces

This chapter provides an overview of linear systems, Gaussian elimination, and vector spaces. The notes assume familiarity with the real and complex numbers.

Linear Equations

linear equation. An equation of the form:

a1x1+a2x2++anxn=b a_1 x_1 + a_2 x_2 + \ldots + a_n x_n = b

is a linear equation in the variables x1,{x_1,} x2,{x_2,} ,{\ldots,} xn{x_n} where a1,{a_1,} a2,{a_2,} ,{\ldots,} an,{a_n,} b{b} are constants.

An expression of the form:

a1x1+a2x2++anxn a_1x_1 + a_2x_2 + \ldots + a_n x_n

is called a linear combination. The aiR{a_i \in \reals} are the combination's coefficients, and the xiR{x_i \in \reals} are the combination's variables. Append an equals sign and a right-hand side at the end:

a1x1+a2x2++anxn=b a_1x_1 + a_2x_2 + \ldots + a_n x_n = b

and we get a linear equation, where b{b} is a constant. For example, these equations are all linear equations:

2x15x2x3=8 2x_1 - 5x_2 - x_3 = 8
4y+7=8 4y + 7 = 8
2x+1=4 2x + 1 = 4

Linear equations are the bedrock of modern technology. In the world of pure mathematics, we can simply write {\nil} or .{\infty.} Computers, however, don't have the same luxury. We must model and approximate these abstractions to realize them physically. For example, the derivative dfdx(a){\frac{df}{dx}(a)} of a differentiable function f:RR{f: \reals \to \reals} gives a linear approximation of f{f} near a point a:{a:}

f(x)=f(a)+dfdx(a)(xa)+. f(x) = f(a) + \frac{df}{dx}(a)(x-a) + \ldots.

The Solution Set

The set of all values that solve a linear equation is called the equation's solution set. For example, consider the linear equation:

x1+2x23x3=8 x_1 + 2x_2 - 3x_3 = 8

This equation turns out to have infinitely many solutions. Some of these solutions are:

x1=3,x2=4,x3=1x1=1,x2=3,x3=1 \sys{ &x_1=\no{-}3, &x_2=4, &x_3=\no{-}1 \\ &x_1=-1, &x_2=3, &x_3=-1 }

We can express the solution set for the equation above using parametric form:

{x1=s,x2=t,x3=s+2t83:s,tR} \lset{ x_1 = s, x_2 = t, x_3 = \dfrac{s + 2t - 8}{3} : s, t \in \reals }
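To make the parametric form concrete, here's a quick numerical check in Python (our own illustration, not part of the notes): the parameter pairs (s, t) = (3, 4) and (s, t) = (-1, 3) reproduce the two sample solutions above, and any other pair works just as well.

    # Check that the parametric form solves x1 + 2*x2 - 3*x3 = 8
    # for several choices of the parameters s and t.
    for s, t in [(3.0, 4.0), (-1.0, 3.0), (0.0, 0.0), (2.5, -7.0)]:
        x1, x2 = s, t
        x3 = (s + 2*t - 8) / 3
        assert abs(x1 + 2*x2 - 3*x3 - 8) < 1e-12
    print("every sampled (s, t) pair solves the equation")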

Linear Systems

If we gather two or more linear equations and treat them as a single collection, we get a linear system (or system of linear equations):

a1,1x1+a1,2x2++a1,nxn=b1a2,1x1+a2,2x2++a2,nxn=b2am,1x1+am,2x2++am,nxn=bm \sys{ & a_{1,1}x_1 + a_{1,2}x_2 + \ldots + a_{1,n}x_n & = & b_1 \\ & a_{2,1}x_1 + a_{2,2}x_2 + \ldots + a_{2,n}x_n & = & b_2 \\ & \vdots & & \\ & a_{m,1}x_1 + a_{m,2}x_2 + \ldots + a_{m,n}x_n & = & b_m }

If we can find a list of n{n} numbers (specifically, an n{n}-tuple) that we can plug into each of the xi,{x_i,} such that all of the equations are solved simultaneously, then we call that n{n}-tuple a solution of the system. If the linear system has at least one solution, we describe the system as consistent. If the system has no solutions, we describe it as inconsistent.

In the system above, the subscript m{m} is a positive integer, and denotes the row index. The subscript n{n} is also a positive integer, and denotes the column index. Much like how rectangles are described with length×width,{\text{length} \times \text{width},} linear systems are described as m×n.{m \times n.} We call this the linear system's order.

Notice that the row index m{m} corresponds to the number of equations we have, and the column index n{n} corresponds to the number of variables. This yields a few more properties about linear systems:

  1. If m<n,{m \lt n,} there are more unknowns than equations. In this case, we say the m×n{m \times n} system is underspecified.

  2. If m>n,{m \gt n,} there are more equations than unknowns, and we say the m×n{m \times n} system is overspecified.

Matrices

Consider the linear system:

2x+y=02xy+3z=32x2y3z=3 \sys{ & \no{2}x & + & y & & & = & 0 \\ & 2x & - & y & + & 3z & = & 3 \\ & \no{2}x & - & 2y & - & \no{3}z & = & 3 }

Early on, mathematicians recognized that having to write linear systems over and over again is a pain. Accordingly, they invented a new construct called a matrix. Two forms of it appear below: the coefficient matrix and the augmented matrix. With a coefficient matrix, we can express the system above as:

[110213121] \mx{ 1 & \no{-}1 & \no{-}0 \\ 2 & -1 & \no{-}3 \\ 1 & -2 & -1 }

Each number in the matrix above corresponds to a coefficient in the linear system. There's also the augmented matrix:

[110021331213] \mx{ 1 & \no{-}1 & \no{-}0 & \aug & 0 \\ 2 & -1 & \no{-}3 & \aug & 3 \\ 1 & -2 & -1 & \aug & 3 }

This is simply the coefficient matrix, but with a column for the right-hand side of the equality, separated by a vertical bar. By using a matrix, we can assign the linear system to a variable:

A=[110213121] {\bf \A} = \mx{ 1 & \no{-}1 & \no{-}0 \\ 2 & -1 & \no{-}3 \\ 1 & -2 & -1 }

Now comes the cool part — we can then express the linear system as a linear equation:

Ax=b {\bf \A} \x = {\bf \b}

where x{\x} is the column vector of the system's variables and b{{\bf \b}} is the column vector of right-hand constants.
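As a quick illustration (a sketch assuming NumPy is available; these notes don't otherwise rely on code), we can encode the 3×3 system from the start of this section and solve the matrix equation directly:

    import numpy as np

    # Coefficient matrix and right-hand side of the system
    #   x +  y      = 0
    #  2x -  y + 3z = 3
    #   x - 2y -  z = 3
    A = np.array([[1.0,  1.0,  0.0],
                  [2.0, -1.0,  3.0],
                  [1.0, -2.0, -1.0]])
    b = np.array([0.0, 3.0, 3.0])

    x = np.linalg.solve(A, b)   # solves A x = b via LU factorization
    print(x)                    # -> [ 1. -1.  0.]
    assert np.allclose(A @ x, b)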

Matrices also make solution sets easier to write. For example, the linear system:

x+y+6z6w=1x+y6z+6w=13x+y+6z6w=6xy+6z6w=1 \sys{ & x & + & \no{-}y & + & \no{6}z & - & \no{6}w & = & \no{-}1 \\ & \no{x} & \no{+} & \no{-}y & - & \no{6}z & + & \no{6}w & = & -1 \\ & 3x & + & \no{-}\no{y} & \no{+} & 6z & - & 6w & = & \no{-}6 \\ & \no{x} & - & y & + & \no{6}z & - & \no{6}w & = & \no{-}1 }

has the matrix:

[11111011113066601111] \mx{ 1 & \no{-}1 & \no{-}1 & -1 & \aug & \no{-}1 \\ 0 & \no{-}1 & -1 & \no{-}1 & \aug & -1 \\ 3 & \no{-}0 & \no{-}6 & -6 & \aug & \no{-}6 \\ 0 & -1 & \no{-}1 & -1 & \aug & \no{-}1 }

and the solution set:

{[2100]+[2110]z+[2101]w  z,wR} \left. \lset{ \mx{ 2 \\ -1 \\ 0 \\ 0} + \mx{ -2 \\ 1 \\ 1 \\ 0} z + \mx{2 \\ -1 \\ 0 \\ 1} w ~\right \vert~ z,w \in \reals }
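Here's a numerical spot check of that solution set (again a sketch of ours, assuming NumPy): for any sampled scalars z and w, the particular vector plus the scaled direction vectors should satisfy the system.

    import numpy as np

    # The system's coefficient matrix and constants.
    A = np.array([[1.0,  1.0,  1.0, -1.0],
                  [0.0,  1.0, -1.0,  1.0],
                  [3.0,  0.0,  6.0, -6.0],
                  [0.0, -1.0,  1.0, -1.0]])
    b = np.array([1.0, -1.0, 6.0, 1.0])

    p  = np.array([ 2.0, -1.0, 0.0, 0.0])   # particular solution
    v1 = np.array([-2.0,  1.0, 1.0, 0.0])   # direction scaled by z
    v2 = np.array([ 2.0, -1.0, 0.0, 1.0])   # direction scaled by w

    rng = np.random.default_rng(0)
    for z, w in rng.uniform(-10, 10, size=(5, 2)):
        assert np.allclose(A @ (p + z*v1 + w*v2), b)
    print("every sampled (z, w) gives a solution")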

Each "1-column matrix" in the solution set is called a column vector or simply vector. A matrix with a single row is called a row vector (and may also be called a vector). Each number in a vector is called a vector component. There's a particularly special vector we'll see often later, called the zero vector — a vector whose components are all zeroes:

[000] \mx{0 \\ 0 \\ \vdots \\ 0}

How did we get that solution? Through a method called Gaussian elimination. Before we get to that topic, let's say a few words about variables. Like other areas of mathematics, linear algebra has certain naming conventions. Matrices are traditionally named with uppercase, bold, Roman letters: A{\bf \A}, B{\bf \B}, C{\bf \C}, Z.{\bf \Z.} Vectors are usually named with lowercase Roman or Greek letters, either written with an arrow overhead or in bold: a{\bf \a}, b{\bf \b}, α{\vector{\alpha}}, β.{\vector{\beta}.}

Gaussian Elimination

Gaussian elimination (also called Gauss's Method) is a method of solving linear systems:

gauss's method. Let L1{L_1} be a linear system. If L1{L_1} is changed to a linear system L2{L_2} with any of the following operations:

  1. one equation is swapped with another,
  2. one equation has both of its sides multiplied by a nonzero constant,
  3. one equation is replaced by the sum of itself and a multiple of another,

then L1{L_1} and L2{L_2} have the same solution set.

Each of the three operations in Gauss's method is called an elementary row operation. Let's see what each operation does with the system:

2x+y=1x3y=17 \sys{ 2x & + & y & = & -1 \\ x & - & 3y & = & 17 }

Swapping. The first operation in Gauss's method is called swapping. Gauss's method tells us that these two systems have the same solution:

2x+y=1x3y=17 \sys{ 2x & + & y & = & -1 \\ x & - & 3y & = & 17 }
x3y=172x+y=1 \sys{ x & - & 3y & = & 17 \\ 2x & + & y & = & -1 }

Scaling. The second operation is called scaling. Again, these two systems have the same solution:

2x+y=1x3y=17 \sys{ 2x & + & y & = & -1 \\ x & - & 3y & = & 17 }
6x3y=3x3y=17 \sys{ -6x & - & 3y & = & 3 \\ x & - & 3y & = & 17 }

Above, we obtained the system on the right by multiplying the first equation by the constant 3.{-3.}

Pivoting. The third operation is called pivoting. This operation is a little more elaborate. Here, we first choose one of the system's equations (which one we choose is a question we'll discuss at length later). Using our example system, let's pick the second:

x3y=17 x - 3y = 17

Then, we scale our equation of choice. Let's use 2 (why we choose 2 is another question we'll save for later).

2(x3y=17)2x6y=34 2(x - 3y = 17) \nc 2x - 6y = 34

Next, we choose a second equation (other than our first choice) and scale it as well. Since we only have two equations, we scale the first. Let's use the constant 1:{-1:}

1(2x+y=1)2xy=1 -1(2x + y = -1) \nc -2x - y = 1

Then, we add the scaled equations:

+2x6y=34+2x6y=1+7y=35 \eqs{ \no{+} \no{-} 2x - 6y &= 34 \\ + -2x - \no{6} y &= 1 \\ \hline \no{+} -7y &= 35 }

Gauss's method tells us that, if we replace the equation we chose first with the new equation above, we get a linear system whose solution is the same as the original's:

2x+y=1x3y=17 \sys{ 2x & + & y & = & -1 \\ x & - & 3y & = & 17 }
2x+y=12x7y=35 \sys{ 2x & + & y & = & -1 \\ \no{2x} & - & 7y & = & 35 \\ }

Gauss's Method with Matrices

We saw what Gaussian elimination looks like on an explicit linear system. But, as we saw previously, writing linear systems explicitly is tedious, so we use matrices. Let's see what Gauss's method looks like with matrices. Using the same linear system we saw previously:

2x+y=1x3y=17 \sys{ 2x & + & y & = & -1 \\ x & - & 3y & = & 17 }

we have the following augmented matrix:

[2111317]. \mx{ 2 & \no{-}1 & \aug & -1 \\ 1 & -3 & \aug & 17 }.

The swap operation is expressed as follows:

[2111317][1317211]  R1R2 \mx{ 2 & \no{-}1 & \aug & -1 \\ 1 & -3 & \aug & 17 } \sim \mx{ 1 & -3 & \aug & 17 \\ 2 & \no{-}1 & \aug & -1 } ~~ R_1 \swap R_2

Above, the symbol ({\sim}) means "has the same solution as," and the expression R1R2{R_1 \swap R_2} means "Rows 1 and 2 were swapped." The scaling operation is expressed as:

[2111317][63131317]  R13R1 \mx{ 2 & \no{-}1 & \aug & -1 \\ 1 & -3 & \aug & 17 } \sim \mx{ -6 & -3 & \aug & \no{1}3 \\ \no{-}1 & -3 & \aug & 17 } ~~ R_1 \alt -3R_1

The notation R13R1{R_1 \alt -3R_1} means: "Row 1 was replaced with 3×row 1.{-3 \times \text{row 1}.}" The pivoting operation is expressed as:

[2111317][2110735]  R22R2R1 \mx{ 2 & \no{-}1 & \aug & -1 \\ 1 & -3 & \aug & 17 } \sim \mx{ 2 & \no{-}1 & \aug & -1 \\ 0 & -7 & \aug & 35 } ~~ R_2 \alt 2R_2 - R_1

The notation R22R2R1{R_2 \alt 2R_2 - R_1} means: "Row 2 was replaced with 2 times row 2 minus row 1." The remarkable aspect of Gaussian elimination is that, when used on matrices, a wide variety of patterns appear. And where there are patterns, there are theorems — the building blocks of a mathematical field.
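The three row operations are mechanical enough to express in a few lines of NumPy (a sketch of ours; note that NumPy indexes rows from 0, while the R{R}-notation above counts from 1):

    import numpy as np

    # The augmented matrix [2 1 | -1; 1 -3 | 17] from above.
    M = np.array([[2.0,  1.0, -1.0],
                  [1.0, -3.0, 17.0]])

    # Swapping: R1 <-> R2.
    swapped = M[[1, 0], :]

    # Scaling: R1 <- -3 * R1.
    scaled = M.copy()
    scaled[0] = -3 * scaled[0]

    # Pivoting (row replacement): R2 <- 2*R2 - R1.
    pivoted = M.copy()
    pivoted[1] = 2*pivoted[1] - pivoted[0]

    print(pivoted)   # -> [[ 2.  1. -1.]
                     #     [ 0. -7. 35.]]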

The Point of Gaussian Elimination

Whenever we work with linear systems, there are two key questions:

  1. Does the system have at least one solution? (That is, is it consistent?)
  2. If a solution does exist, is the solution unique?

The point of Gaussian elimination is to manipulate the linear system into a form that's easier to solve. This is done through the process of row reduction. In row reduction, the goal is this:

Use elementary row operations to eliminate variables from select rows of an augmented matrix.

What are the select rows? It depends on the system's order. Let's consider the simplest case, a 2×2{2 \times 2} system. For the 2×2{2 \times 2} system, the goal is to obtain a 1 in the top left corner and 0 in the bottom left corner:

[10] \mx{ 1 & \ldots & \aug & \ldots \\ 0 & \ldots & \aug & \ldots \\ }

If we execute Gauss's method correctly, we may get a matrix that takes one of the forms below. As it turns out, these forms imply how many solutions the system has:

A=[101] {\bf A} = \mx{ 1 & \square & \aug & \square \\ 0 & 1 & \aug & \square \\ }
B=[100] {\bf B} = \mx{ 1 & \square & \aug & \square \\ 0 & 0 & \aug & \square \\ }
C=[1000] {\bf C} = \mx{ 1 & \square & \aug & \square \\ 0 & 0 & \aug & 0 \\ }
Form A{\bf A} has exactly one solution. Form B{\bf B} has no solutions: its last row asserts that 0{0} equals a nonzero number, so the system is inconsistent. Form C{\bf C} has infinitely many solutions: its last row reduces to 0=0,{0 = 0,} leaving a free variable.

To see why we get these patterns, let's introduce some new vocabulary.

Echelon Form

Suppose L{\L} is the linear system to the left, with its corresponding matrix to the right:

{2x+y=02xy+3z=32x2y3z=3}    [110213121] \lset{ \sys{ \no{2}x & + & y & & & = & 0 \\ 2x & - & y & + & 3z & = & 3 \\ \no{2}x & - & 2y & - & \no{3}z & = & 3 } } ~~~~ \mx{ 1 & \no{-}1 & \no{-}0 \\ 2 & -1 & \no{-}3 \\ 1 & -2 & -1 }

The first nonzero entry in each row of L{\L} is called the leading entry, and it corresponds to its L{\L} equation's leading variable. All the other nonzero entries are called free entries, and correspond to their equations' free variables. Using this terminology, we state the formal definition of echelon form:

echelon form. Let M{\bf M} be the matrix

[x1,1x1,2x1,jx2,1x2,2x2,jxi,1xi,2xi,j], \mx{ x_{1,1} & x_{1,2} & \ldots & x_{1,j} \\ x_{2,1} & x_{2,2} & \ldots & x_{2,j} \\ \vdots & \vdots & \vdots & \vdots \\ x_{i,1} & x_{i,2} & \ldots & x_{i,j} \\ },

where i,jZ+,{i,j \in \pint,} each entry xR,{x \in \reals,} r,s[1,2,,i],{r,s \in \ix{1,2,\ldots,i},} and c[1,2,,j].{c \in \ix{1,2,\ldots,j}.} We say that M{\bf M} is in echelon form if, and only if, the following propositions are true for M{\bf M}: First, if there exists a row Rr{R_r} with an entry x0{x \neq 0} and a row Rs{R_s} with entries [0,0,,0],{\ix{0,0,\ldots,0},} then r<s.{r \lt s.} Second, given a row Rr{R_r} and r1,{r \neq 1,} if Rr{R_r} contains a leading entry at the position Rr,c{R_{r,c}} then Rr1{R_{r-1}} contains a leading entry at some position Rr1,c{R_{r-1,c'}} with c<c.{c' \lt c.} Third, if a row Rr{R_r} contains a leading entry at the position Rr,c{R_{r,c}} then for all rows Rr+k{R_{r + k}} where 1kir,{1 \leq k \leq i - r,} the entry at Rr+k,c=0.{R_{r+k,c} = 0.}

Let's break this definition down into a checklist. A matrix M{\bf M} is in echelon form if, and only if, it satisfies the following properties.

  1. M{\bf M} is a rectangular matrix.
  2. If M{\bf M} contains rows with all zeroes (called zero rows) and rows that are not all zeroes (called nonzero rows), then no zero row appears before a nonzero row, from top to bottom.
  3. If a row R{R} contains a leading entry x,{x,} then x{x} is in a column strictly to the right of the column of the leading entry of the row above R.{R.}
  4. If a row R{R} contains a leading entry x,{x,} then for all the rows after R,{R,} entries in x{x}'s column are zeros.

If a matrix is in echelon form, we call the matrix an echelon matrix. The word "echelon" is descriptive — the matrix is "steplike." For example, the matrices below are all in echelon form. The symbol {\square} indicates that the entry is a nonzero value, and the symbol {\star} indicates that the entry is any value (zero or nonzero):

[000]  [00000000000]  [00000000000000000000] \mx{ \square & \star & \star & \star \\ 0 & \square & \star & \star \\ 0 & 0 & \square & \star \\ } ~~ \mx{ \square & \star & \star & \star \\ 0 & \square & \star & \star \\ 0 & 0 & \star & \star \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ } ~~ \mx{ 0 & \square & \star & \star & \star & \star & \star & \star & \star & \star \\ 0 & 0 & 0 & \square & \star & \star & \star & \star & \star & \star \\ 0 & 0 & 0 & 0 & \square & \star & \star & \star & \star & \star \\ 0 & 0 & 0 & 0 & 0 & \square & \star & \star & \star & \star \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \square & \star & \star \\ }
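The checklist translates directly into a predicate. Below is a minimal sketch of ours in plain Python; leading_index is a hypothetical helper we define ourselves, not a library function.

    def leading_index(row):
        """Column index of the first nonzero entry; None for a zero row."""
        for j, entry in enumerate(row):
            if entry != 0:
                return j
        return None

    def is_echelon(matrix):
        seen_zero_row = False
        prev_lead = -1
        for row in matrix:
            lead = leading_index(row)
            if lead is None:
                seen_zero_row = True      # zero rows must sit at the bottom
            elif seen_zero_row or lead <= prev_lead:
                return False              # nonzero row after a zero row, or a
                                          # leading entry not strictly rightward
            else:
                prev_lead = lead
        return True

    assert is_echelon([[1, 2, 3], [0, 4, 5], [0, 0, 0]])
    assert not is_echelon([[0, 0, 0], [1, 2, 3]])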

Notice how the zeroes form a step-like (echelon) arrangement. Once a matrix is in echelon form, we can apply back substitution to solve the system. For example, consider the following linear system, already in echelon form:

8x12x2+x3=43x2x3=72x3=4 \sys{ 8x_1 & - & 2x_2 & + & x_3 & = & 4 \\ & & 3x_2 & - & x_3 & = & 7 \\ & & & & 2x_3 & = & 4 }

with the corresponding augmented matrix:

[821403170024] \mx{ 8 & -2 & \no{-}1 & \aug & 4 \\ 0 & \no{-}3 & -1 & \aug & 7 \\ 0 & \no{-}0 & \no{-}2 & \aug & 4 }

The last row tells us that x3=2,{x_3 = 2,} since 2x3=4.{2x_3 = 4.} Plugging x3=2{x_3 = 2} into the second row, we get 3x22=7,{3x_2 - 2 = 7,} which yields x2=3.{x_2 = 3.} And plugging x2=3{x_2 = 3} and x3=2{x_3 = 2} into the first row, we get x1=1.{x_1 = 1.}
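Back substitution is easy to mechanize as well. The sketch below (ours, assuming NumPy) handles any square augmented matrix in echelon form with nonzero diagonal entries, and reproduces the solution (x1, x2, x3) = (1, 3, 2) just computed.

    import numpy as np

    def back_substitute(aug):
        """Solve an upper-triangular augmented matrix [A | b]."""
        aug = np.asarray(aug, dtype=float)
        n = aug.shape[0]
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):          # work from the last row up
            known = aug[i, i+1:n] @ x[i+1:]     # terms already solved for
            x[i] = (aug[i, n] - known) / aug[i, i]
        return x

    aug = [[8, -2,  1, 4],
           [0,  3, -1, 7],
           [0,  0,  2, 4]]
    print(back_substitute(aug))   # -> [1. 3. 2.]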

Reduced Echelon Form

From the definition of echelon form, we can state the formal definition of reduced echelon form.

reduced echelon form. Let M{\bf M} be the matrix

[x1,1x1,2x1,jx2,1x2,2x2,jxi,1xi,2xi,j], \mx{ x_{1,1} & x_{1,2} & \ldots & x_{1,j} \\ x_{2,1} & x_{2,2} & \ldots & x_{2,j} \\ \vdots & \vdots & \vdots & \vdots \\ x_{i,1} & x_{i,2} & \ldots & x_{i,j} \\ },

Rr, r=1,2,,i{R_r,~r=1,2,\ldots,i} be the rows of M,{\M,} and Cc, c=1,2,,j{C_c,~c=1,2,\ldots,j} be the columns of M.{\M.} M{\bf M} is in reduced echelon form if, and only if, the following propositions are true for M.{\bf M.} First, M{\bf M} is in echelon form. Second, for each row Rr,{R_r,} if Rr[0,0,,0]{R_r \neq [0,0,\ldots,0]} (that is, Rr{R_r} is a nonzero row), then the leading entry of Rr{R_r} is 1. Third, if Rr{R_r} is a nonzero row with its leading 1 at the position Rr,c,{R_{r,c},} then every other entry of the column Cc{C_c} is 0.

Once again, let's break this definition down into a checklist. A matrix M{\bf M} is in reduced echelon form if, and only if, the following conditions are met:

  1. M{\bf M} is in echelon form,
  2. If a row of M{\bf M} is a nonzero row, then its leading entry is 1.
  3. Each leading 1 in M{\bf M} is the only nonzero entry in its column.

For example, the following are all reduced echelon matrices:

[100100000000]  [010000000100000001000000010000000001] \mx{ 1 & 0 & \star & \star & \\ 0 & 1 & \star & \star & \\ 0 & 0 & 0 & 0 & \\ 0 & 0 & 0 & 0 & \\ } ~~ \mx{ 0 & 1 & \star & 0 & 0 & 0 & \star & \star & 0 & \star \\ 0 & 0 & 0 & 1 & 0 & 0 & \star & \star & 0 & \star \\ 0 & 0 & 0 & 0 & 1 & 0 & \star & \star & 0 & \star \\ 0 & 0 & 0 & 0 & 0 & 1 & \star & \star & 0 & \star \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & \star }

Now that we've seen the echelon definitions, we state another motivation for Gaussian elimination: any nonzero matrix can be row reduced, via Gaussian elimination, to an echelon matrix, and generally to more than one, since echelon form is not unique. If we reduce a matrix to an echelon matrix, then finding the solution of a linear system becomes much easier because of several facts.

Given a reduced echelon matrix

[100b1010b2001b31bk000000] \mx{ 1 & 0 & 0 & \ldots & 0 & \aug & b_1 \\ 0 & 1 & 0 & \ldots & 0 & \aug & b_2 \\ 0 & 0 & 1 & \ldots & 0 & \aug & b_3 \\ \vdots & & & \ddots & \vdots & & \vdots \\ 0 & 0 & 0 & \ldots & 1 & \aug & b_k \\ 0 & 0 & 0 & \ldots & 0 & \aug & 0 \\ \vdots & & & & \vdots & & \vdots \\ 0 & 0 & 0 & \ldots & 0 & \aug & 0 }

the first nonzero entry in each row is called a pivot, and each column that contains a pivot is called a pivot column. With this vocabulary in hand, we can state the row reduction algorithm:

row reduction algorithm. Let M{\bf M} be a rectangular matrix. The algorithm proceeds in two phases: a forward phase (steps 1 through 5) that transforms M{\bf M} into an echelon matrix Mef,{\bf M_{ef},} and a backward phase (steps 6 through 8) that transforms Mef{\bf M_{ef}} into the reduced echelon matrix Mref.{\bf M_{ref}.} A code sketch follows the list.

  1. Go to the leftmost nonzero column (the pivot column Pc{P_c}).
  2. Select a nonzero entry xPc{x \in P_c} as the pivot p.{p.}
  3. If p{p} is not at the top-left position, interchange rows so that p{p} is at the top-left position.
  4. Create zeroes in all positions below the pivot using row replacement.
  5. Create a new submatrix without the pivot's row and the rows above it, then repeat steps 1 to 4 on it. Continue until there are no more nonzero rows to modify. The result is Mef.{\bf M_{ef}.}
  6. Go to the rightmost pivot.
  7. Working from that pivot to the left, create zeros above each pivot using row replacement.
  8. If a pivot is not 1,{1,} make it a 1{1} via scaling. The result is Mref.{\bf M_{ref}.}
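Here's a compact sketch of the algorithm in NumPy (our own illustration). Where step 2 leaves the choice of pivot open, this version picks the entry of largest absolute value in the pivot column, a common choice known as partial pivoting.

    import numpy as np

    def rref(M):
        """Row reduce M to reduced echelon form."""
        A = np.array(M, dtype=float)
        rows, cols = A.shape
        pivot_row, pivot_cols = 0, []
        # Forward phase: produce an echelon matrix.
        for c in range(cols):
            if pivot_row == rows:
                break
            r = pivot_row + np.argmax(np.abs(A[pivot_row:, c]))
            if np.isclose(A[r, c], 0.0):
                continue                                 # no pivot here
            A[[pivot_row, r]] = A[[r, pivot_row]]        # swap
            for below in range(pivot_row + 1, rows):     # zero out below
                A[below] -= (A[below, c] / A[pivot_row, c]) * A[pivot_row]
            pivot_cols.append(c)
            pivot_row += 1
        # Backward phase: zero out above each pivot, then scale pivots to 1.
        for r in range(len(pivot_cols) - 1, -1, -1):
            c = pivot_cols[r]
            for above in range(r):
                A[above] -= (A[above, c] / A[r, c]) * A[r]
            A[r] /= A[r, c]
        return A

    aug = [[2,  1, -1],
           [1, -3, 17]]
    print(rref(aug))   # -> [[ 1.  0.  2.]
                       #     [ 0.  1. -5.]]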

Vector Spaces

vector space. Let V{V} be a set of vectors, S{S} a set of scalars, +{+} the operation of addition, and   {\op} the operation of scalar multiplication. Then the algebra V={VS,{+,  }}{{\sf V} = \set{V \cup S, \set{+,\op}}} is a vector space if, and only if, for all u,v,wV{\uu, \vv, \ww \in {V}} and for all scalars c,dS,{c,d \in {S},} the following properties hold:

  1. V{{\sf V}} is closed under vector addition: u+vV.{\uu + \vv \in {\sf V}.}
  2. Vector addition is commutative on V{{\sf V}}: u+v=v+u.{\uu + \vv = \vv + \uu.}
  3. Vector addition is associative on V{{\sf V}}: (u+v)+w=u+(v+w).{(\uu + \vv) + \ww = \uu + (\vv + \ww).}
  4. For all uV,{\uu \in {V},} there exists an additive identity in V{{\sf V}}: u+0=u.{\uu + \mathbf{0} = \uu.}
  5. For all uV,{\uu \in {V},} there exists an additive inverse in V{{\sf V}}: u+(u)=0.{\uu + (-\uu) = \mathbf{0}.}
  6. V{{\sf V}} is closed under scalar multiplication: cuV.{c \op \uu \in {\sf V}.}
  7. Scalar multiplication is distributive over vector addition on V{{\sf V}}: c(u+v)=(cu)+(cv).{c \op (\uu + \vv) = (c \op \uu) + (c \op \vv).}
  8. Scalar multiplication is distributive over scalar addition: (c+d)u=(cu)+(du).{(c + d) \op \uu = (c \op \uu) + (d \op \uu).}
  9. Scalar multiplication is associative on V{{\sf V}}: c(du)=(cd)u.{c \op (d \op \uu) = (c \op d) \op \uu.}
  10. For all uV,{\uu \in {\sf V},} there exists a multiplicative identity in V{{\sf V}}: 1(u)=u.{1(\uu) = \uu.}

A vector space, in short, is just a very special set. Its members may be lists, vectors, functions, or some other strange object. Its key requirement is that we can meaningfully perform vector addition and scalar multiplication. The properties laid out in the definition above establish what "meaningfully" means. Moreover, notice that we use different fonts to denote a vector space. We do so to emphasize the difference between a field and a vector space. All fields are vector spaces — R{\reals} is simply the vector space R1,{{\sf R}^1,} a vector space over itself. Not all vector spaces, however, are fields. To list just one reason, fields require commutativity of multiplication for all of their carrier set's members. As we'll see later, that's a very special property, and it's not one that many vector spaces satisfy.
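We can't verify the axioms exhaustively by computation, but we can spot-check them numerically. The sketch below (ours, assuming NumPy) treats R3{{\sf R}^3} as a vector space over R{\reals} and tests each axiom on randomly sampled vectors and scalars:

    import numpy as np

    rng = np.random.default_rng(1)
    u, v, w = rng.normal(size=(3, 3))   # three random vectors in R^3
    c, d = rng.normal(size=2)           # two random scalars

    assert np.allclose(u + v, v + u)                 # commutativity
    assert np.allclose((u + v) + w, u + (v + w))     # associativity
    assert np.allclose(u + np.zeros(3), u)           # additive identity
    assert np.allclose(u + (-u), np.zeros(3))        # additive inverse
    assert np.allclose(c * (u + v), c*u + c*v)       # distributes over vectors
    assert np.allclose((c + d) * u, c*u + d*u)       # distributes over scalars
    assert np.allclose(c * (d * u), (c * d) * u)     # scalar associativity
    assert np.allclose(1 * u, u)                     # multiplicative identity
    print("R^3 passes the sampled axioms")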

Conventions

In these notes, we adopt the following conventions:

convention. V{\sf V} denotes a vector space over F.{\sf F.}

convention. 0v{0_{\vv}} denotes a zero vector.

Vector Properties

property. For every vV,{\vv \in {\sf V},} it is true that 0v=0v.{0\vv = 0_{\vv}.}

property. For every aF,{a \in {\sf F},} it is true that a0v=0v.{a0_{\vv} = 0_{\vv}.}

property. For every vV,{\vv \in {\sf V},} it is true that (1)v=v.{(-1)\vv = -\vv.}

Subspaces

subspace. Given a vector space V,{\sf V,} we call a subset UV{{\sf U} \subseteq {\sf V}} a subspace of V{\sf V} if, and only if, U{\sf U} is also a vector space. That is, if, and only if, the following conditions are satisfied: (1) 0vU.{0_{\vv} \in {\sf U}.} (2) If u,wU,{\uu,\ww \in {\sf U},} then u+wU.{\uu + \ww \in {\sf U}.} (3) If aF{a \in {\sf F}} and uU,{\uu \in {\sf U},} then auU.{a\uu \in {\sf U}.}

sum of subsets. Let U1,,UmV.{{\sf U}_1, \ldots, {\sf U}_m \subseteq {\sf V}.} The sum U1++Um{{\sf U}_1 + \ldots + {\sf U}_m} is defined as

U1++Um={u1++um:u1U1,,umUm}. {\sf U}_1 + \ldots + {\sf U}_m = \set{u_1 + \ldots + u_m : u_1 \in {\sf U}_1, \ldots, u_m \in {\sf U}_m}.

Linear Combinations

linear combination. A vector of the form a1v1++amvm,{a_1 v_1 + \ldots + a_m v_m,} where a1,,amF{a_1, \ldots, a_m \in {\sf F}} and v1,,vmV,{v_1, \ldots, v_m \in {\sf V},} is called a linear combination of the list of vectors v1,,vm{v_1, \ldots, v_m} in V.{{\sf V}.}

span. Let L{L} be a list of vectors v1,,vm,{v_1,\ldots,v_m,} where each vector is a vector in V.{{\sf V}.} The set of all linear combinations of L{L} is called the span of L,{L,} denoted span(v1,,vm).{\df{span}(v_1,\ldots,v_m).} The span of an empty list is defined to be {0}.{\set{0}.}
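Span membership is a computation we can already perform: a vector b lies in span(v1, …, vm) exactly when the linear system V a = b is consistent, where the columns of V are the vi. Below is a small sketch of ours, assuming NumPy, that uses the least-squares residual as the consistency test.

    import numpy as np

    v1 = np.array([1.0, 0.0, 1.0])
    v2 = np.array([0.0, 1.0, 1.0])
    V = np.column_stack([v1, v2])   # columns are the list's vectors

    def in_span(V, b, tol=1e-10):
        a, *_ = np.linalg.lstsq(V, b, rcond=None)
        return np.linalg.norm(V @ a - b) < tol

    print(in_span(V, np.array([2.0, 3.0, 5.0])))   # True: equals 2*v1 + 3*v2
    print(in_span(V, np.array([0.0, 0.0, 1.0])))   # False: no combination works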