Constructing the Reals

The following are notes on constructing the reals.

The rationals "contain" the integers and the naturals, in the sense that given a rational number, we can get back the integer or natural by "unwrapping" the relevant equivalence class. The rationals are an ordered field — the equivalence classes have an inherent order, alongside operations that, when applied to the rationals, do not disturb that ordering. That is, given x,y,zQ,{x,y,z \in \rat,} if y<z,{y \lt z,} then x+y<x+z.{x + y \lt x + z.} The same goes for multiplication by a positive rational: if y<z{y \lt z} and x>0,{x \gt 0,} then xy<xz.{xy \lt xz.} This all stems from the fact that the rationals are equivalence classes of pairs of integers, integers are equivalence classes of pairs of naturals, the naturals are inherently ordered because they're defined in terms of a successor set, and no successor set is a subset of its elements. (I.e., given a,bN,{a,b \in \nat,} there are only three cases: ab,a=b,ba,{a \subset b, a = b, b \subset a,} which, by convention, we denote as a<b,a=b,b<a{a \lt b, a = b, b \lt a}).

The set of all rationals, however, is insufficient. We've seen this theme repeatedly. The naturals were insufficient, in the sense that there is no natural that satisfies x+1=0.{x + 1 = 0.} So, we constructed the integers. Then we found that the integers were insufficient, as there is no integer that satisfies 2x=1.{2x = 1.} We patched that up with the rationals. Now we find that the rationals are insufficient, because they have a hole. Suppose a group of analysts set out to lay all of the rationals on a number line in order, Sisyphean as it may be. One day, the geometers, working on some right triangles, come in and ask for a rational of length x,{x,} where x2=2.{x^2 = 2.} The analysts look at their number line, and find that no such length exists! But it must exist, because we can draw a right triangle whose legs each have length 1 — its hypotenuse has exactly that length. This is why we need R,{\reals,} the set of all real numbers.

The set R,{\reals,} however, is unlike any of the other sets we've seen. We have no way of pinpointing where the hole is. So how do we fill in this hole if we have no way of knowing where the hole is? As it turns out, there are two possible approaches: Dedekind cuts (or, as this author likes to call them, King Dedede cuts) and Cauchy sequences.

Cauchy Sequences

So, the notion of a square root doesn't exist, but we want to find out what {xQ:x2=2}{\set{x \in \rat:x^2=2}} is. We know that 12=1,{1^2 = 1,} and 1<2,{1 \lt 2,} so how about we try 1.12.{1.1^2.} This gives us (11/10)2=121/100=1.21.{(11/10)^2=121/100=1.21.} Still pretty far from 2, so let's try (1.2)2.{(1.2)^2.} Here, we have (12/10)2=1.44.{(12/10)^2=1.44.} Still pretty far. How about we jump to 1.4.{1.4.} This gives us (14/10)2=1.96.{(14/10)^2=1.96.} That's close. Let's look at the squares for x1=1.4,x2=1.41,x3=1.414,{x_1=1.4, x_2=1.41, x_3=1.414,} and so on:

x1=(1.4)2=(14/10)2=1.96x2=(1.41)2=(141/100)2=1.9881x3=(1.414)2=(1414/1000)2=1.999396x4=(1.4141)2=(14141/10000)2=1.99967881x5=(1.41414)2=(141414/100000)2=1.9997919396x6=(1.414141)2=(1414141/1000000)2=1.99979476788 \eqs{ x_1 &= (1.4)^2 = (14/10)^2 = 1.96 \\ x_2 &= (1.41)^2 = (141/100)^2 = 1.9881 \\ x_3 &= (1.414)^2 = (1414/1000)^2 = 1.999396 \\ x_4 &= (1.4141)^2 = (14141/10000)^2 = 1.99967881 \\ x_5 &= (1.41414)^2 = (141414/100000)^2 = 1.9997919396 \\ x_6 &= (1.414141)^2 = (1414141/1000000)^2 = 1.99979476788 \\ }
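We can reproduce these squares with exact rational arithmetic. The following is a small sketch in Python, using the standard `fractions` module so no floating-point error creeps in; the variable names are ours, not part of the construction:

```python
from fractions import Fraction

# Squares of the successive decimal approximations above, computed as
# exact rationals (Fraction parses decimal strings exactly).
digits = ["1.4", "1.41", "1.414", "1.4141", "1.41414", "1.414141"]
squares = [Fraction(d) ** 2 for d in digits]

for d, sq in zip(digits, squares):
    print(f"{d}^2 = {float(sq)}")
```

Each square stays below 2 while creeping upward toward it, exactly as the table shows.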

This keeps getting closer and closer to two. Moreover, the distance between each term, xnxn1{\abs{x_n-{x_{n-1}}}} where nZ+,{n \in \pint,} keeps getting smaller and smaller. Visually, we might think of it like:

 \ldots \ldots \ldots \circ~~~~~~~~ \circ~~~~~~ \circ~~~~ \circ~~~ \circ~~ \circ~ \circ \circ \circ~{\large 2}~\ldots \ldots \ldots

So, there's clearly some sort of sequence here. But it's a very peculiar sequence, because the distance between successive terms gets smaller and smaller. Augustin-Louis Cauchy (1789-1857), the French mathematician, is credited with noticing this pattern and capturing it in a definition. Before we get to that definition, we need a few preliminaries.

absolute value. Let xQ.{x \in \rat.} If x>0,{x \gt 0,} then x=x.{\abs{x} = x.} If x=0,{x=0,} then x=0.{\abs{x} = 0.} If x<0,{x \lt 0,} then x=x.{\abs{x} = -x.}
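The three-case definition translates directly into code. A minimal sketch in Python, with exact rationals; the name `abs_q` is ours:

```python
from fractions import Fraction

def abs_q(x: Fraction) -> Fraction:
    """The three-case definition: x if x > 0, 0 if x = 0, -x if x < 0."""
    if x > 0:
        return x
    if x == 0:
        return Fraction(0)
    return -x
```

Python's built-in `abs` agrees with this on rationals; we spell it out only to mirror the definition.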

From this definition, we now derive some lemmas.

lemma. For all xQ,{x \in \rat,} it is true that x=0{\abs{x} = 0} iff x=0.{x = 0.}

proof. Let xQ.{x \in \rat.} We have two directions to prove: x=0    x=0{x=0 \implies \abs{x}=0} and x=0    x=0.{\abs{x}=0 \implies x = 0.} The first direction is true by definition, so only the second must be addressed. Suppose, for contradiction, that x=0{\abs{x} = 0} and x0.{x \neq 0.} If x0,{x \neq 0,} then by the trichotomy law, either x<0{x \lt 0} or x>0.{x \gt 0.} If x<0,{x \lt 0,} then by definition, x=x>0.{\abs{x} = -x \gt 0.} But that cannot be true, since we assumed that x=0.{\abs{x}=0.} If x>0,{x \gt 0,} then by definition, x=x>0.{\abs{x}=x \gt 0.} But that cannot be true either, since we assumed that x=0.{\abs{x}=0.} It follows that x0{x \neq 0} and x=0{\abs{x}=0} can never both be true. Therefore, x=0{\abs{x}=0} if and only if x=0.{x = 0.} {\blacksquare}

lemma. For all x,yQ,{x,y \in \rat,} it is true that xy=0{\abs{x-y} = 0} iff x=y.{x = y.}

proof. Let x,yQ.{x,y \in \rat.} Then by the closure of subtraction on rationals, xyQ.{x - y \in \rat.} By the additive inverse, xy=0{x-y = 0} if and only if x=y.{x = y.} It follows that xy=0{\abs{x-y}=0} if and only if x=y.{x = y.} {\bs}

nonnegativity of absolute value. For all xQ,{x \in \rat,} it is true that x0.{\abs{x} \ge 0.}

proof. Let xQ.{x \in \rat.} There are only three cases, x<0,{x \lt 0,} x=0,{x = 0,} and x>0{x \gt 0} by the trichotomy law. If x<0,{x \lt 0,} then x=x,{\abs{x} = -x,} and since x<0,{x \lt 0,} we have x>0.{-x \gt 0.} If x=0,{x = 0,} then x=0.{\abs{x}=0.} If x>0,{x \gt 0,} then x=x>0.{\abs{x} = x \gt 0.} This exhausts all three possible cases of x.{x.} Therefore, for all xQ,{x \in \rat,} it is true that x0.{\abs{x} \ge 0.} {\bs}

lemma. For all x,yQ,{x,y \in \rat,} it is true that yx=xy.{\abs{y-x}=\abs{x-y}.}

proof. Let x,yQ.{x,y \in \rat.} If xy=0,{x - y = 0,} then x=y,{x=y,} and yx=0,{y - x = 0,} so xy=0=yx.{\abs{x-y}=0=\abs{y-x}.} If xy>0,{x - y \gt 0,} then yx<0.{y - x \lt 0.} It follows that xy=xy{\abs{x - y} = x - y} and yx=(yx)=xy.{\abs{y-x}=-(y-x)=x - y.} If yx>0,{y-x \gt 0,} then xy<0,{x - y \lt 0,} so xy=(xy)=yx{\abs{x-y}=-(x-y)=y-x} and yx=yx.{\abs{y-x}=y-x.} Therefore, for all x,yQ,{x,y \in \rat,} we have yx=xy.{\abs{y-x}=\abs{x-y}.} {\bs}

lemma. If yxy,{-y \le x \le y,} then xy.{\abs{x} \le y.}

proof. The inequality yxy{-y \le x \le y} gives both xy{x \le y} and xy.{-x \le y.} Since x{\abs{x}} equals either x{x} or x,{-x,} in both cases xy.{\abs{x} \le y.} {\bs}

triangle inequality. For all x,yQ,{x,y \in \rat,} it is true that x+yx+y.{\abs{x+y} \le \abs{x} + \abs{y}.}

proof. Let x,yQ.{x,y \in \rat.} From the definition of absolute value, we know that xxx,{-\abs{x} \le x \le \abs{x},} and yyy.{-\abs{y} \le y \le \abs{y}.} Adding these inequalities, we have (x+y)x+yx+y.{-(\abs{x} + \abs{y}) \le x + y \le \abs{x} + \abs{y}.} By the previous lemma, it follows that x+yx+y.{\abs{x + y} \le \abs{x} + \abs{y}.} {\bs}
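A finite check is evidence, not a proof, but it is reassuring to see the triangle inequality hold on a grid of rationals. A Python sketch; the sample grid is our choice:

```python
from fractions import Fraction
from itertools import product

# Check |x + y| <= |x| + |y| over a small grid of rationals.
samples = [Fraction(p, q) for p in range(-4, 5) for q in (1, 2, 3)]
violations = [(x, y) for x, y in product(samples, repeat=2)
              if abs(x + y) > abs(x) + abs(y)]
print(len(violations))  # 0
```

Equality holds exactly when x{x} and y{y} don't have opposite signs, which the grid also exhibits.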

multiplicativity of absolute value. For all x,yQ,{x,y \in \rat,} it is true that xy=xy.{\abs{xy} = \abs{x} \by \abs{y}.}

proof. We address five cases: (a) x=0{x = 0} or y=0;{y = 0;} (b) x>0{x \gt 0} and y>0;{y \gt 0;} (c) x<0{x \lt 0} and y<0;{y \lt 0;} (d) x<0{x \lt 0} and y>0;{y \gt 0;} and (e) x>0{x \gt 0} and y<0.{y \lt 0.} The theorem holds for case (a): if either factor is zero, then xy=0,{xy = 0,} so xy=0=xy.{\abs{x}\abs{y}=0=\abs{xy}.} For case (b), it follows that xy>0,{xy \gt 0,} and xy=xy{\abs{xy}=xy} by definition. Given x=x{x = \abs{x}} and y=y,{y = \abs{y},} we have xy=xy=xy.{\abs{x} \by \abs{y}=xy=\abs{xy}.} For case (c): It follows that xy>0,{xy \gt 0,} and xy=xy.{\abs{xy}=xy.} Given x=x{-x=\abs{x}} and y=y{-y=\abs{y}} by definition, we have xy=(x)(y)=xy=xy.{\abs{x}\abs{y}=(-x)(-y)=xy=\abs{xy}.} For case (d): Let x<0{x \lt 0} and y>0.{y \gt 0.} Then xy<0{xy \lt 0} and xy=(xy).{\abs{xy}=-(xy).} By definition, we have x=x{-x = \abs{x}} and y=y.{y = \abs{y}.} Therefore, xy=(x)y=(xy)=xy.{\abs{x}\abs{y}=(-x)y=-(xy)=\abs{xy}.} By commutativity of multiplication, case (e) is equivalent to case (d). Therefore, we have xy=xy{\abs{xy}=\abs{x}\by\abs{y}} for all x,yQ.{x,y \in \rat.} {\bs}

distance. Given x,yQ,{x,y \in \rat,} we define the notation d(x,y):=xy.{\d(x,y):=\abs{x-y}.}

closeness. Given x,y,εQ{x,y,\varepsilon \in \rat} and ε>0,{\varepsilon \gt 0,} we say that x{x} and y{y} are close if and only if d(x,y)ε.{\d(x,y) \le \varepsilon.} If x{x} and y{y} are close, we write xεy.{x \ec y.} Otherwise, we write x̸εy.{x \not\ec y.}
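Both notions are one-liners in code. A Python sketch with exact rationals; the names `d` and `close` mirror the notation above:

```python
from fractions import Fraction

def d(x: Fraction, y: Fraction) -> Fraction:
    """d(x, y) := |x - y|."""
    return abs(x - y)

def close(x: Fraction, y: Fraction, eps: Fraction) -> bool:
    """x and y are eps-close iff d(x, y) <= eps."""
    return d(x, y) <= eps
```

For instance, 1/2 and 2/3 are (1/4)-close, since their distance is 1/6, but they are not (1/10)-close.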

The notion of closeness leads to a few helpful lemmas.

reflexivity of closeness. For all x,y,εQ{x,y,\varepsilon \in \rat} with ε>0,{\varepsilon \gt 0,} if x=y,{x = y,} then xεy.{x \ec y.}

proof. Let x,y,εQ{x,y,\varepsilon \in \rat} with ε>0.{\varepsilon \gt 0.} If x=y,{x = y,} then xy=0.{\abs{x-y}=0.} Since ε>0,{\varepsilon \gt 0,} it follows that d(x,y)=0ε,{\d(x,y) = 0 \le \varepsilon,} which implies that xεy.{x \ec y.} {\bs}

symmetry of closeness. For all x,y,εQ{x,y,\varepsilon \in \rat} with ε>0,{\varepsilon \gt 0,} if xεy,{x \ec y,} then yεx.{y \ec x.}

proof. Let x,y,εQ{x,y,\varepsilon \in \rat} with ε>0,{\varepsilon \gt 0,} and suppose xεy.{x \ec y.} By definition, xyε.{\abs{x-y} \le \varepsilon.} From the earlier lemma, yx=xy,{\abs{y-x}=\abs{x-y},} so yxε.{\abs{y-x} \le \varepsilon.} That is precisely yεx.{y \ec x.} {\bs}

sequence. A sequence is a function a:{nZ:nm}Q,{a:\set{n \in \uint:n \ge m} \to \rat,} denoted ( an )n=m.{\seq{a_n}{n=m}{\infty}.} The members of a{a} are an ordered collection of rationals, which we may denote as (a(m),a(m+1),a(m+2),).{(a(m), a(m+1), a(m+2), \ldots).} By convention, we denote each application of a{a} as an{a_n} — i.e., (a1,a2,a3,){(a_1, a_2, a_3, \ldots)} — in lieu of the standard notation a(n).{a(n).}

example. ( n )n=1=(1,2,3,4,).{\seq{n}{n=1}{\infty} = (1,2,3,4,\ldots).}

example. ( 2n )n=1=(2,4,6,8,).{\seq{2n}{n=1}{\infty} = (2,4,6,8,\ldots).}

example. (1n+(1)n2)n=1=(0,1,0,1,0,1,).{\ar{\frac{1^n+(-1)^n}{2}}_{n=1}^{\infty} = (0,1,0,1,0,1,\ldots).}

example. (2nn2+1)n=1=(22,45,610,817,).{\ar{\frac{2n}{n^2+1}}_{n=1}^{\infty} = \ar{\frac{2}{2}, \frac{4}{5}, \frac{6}{10}, \frac{8}{17},\ldots}.}
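Since a sequence is just a function on indices, the examples above can all be generated by one helper. A Python sketch; the helper name `seq` is ours, and note that `Fraction` reduces terms like 2/2 and 6/10:

```python
from fractions import Fraction

def seq(f, m, count):
    """First `count` terms of the sequence a_n = f(n), starting at n = m."""
    return [f(n) for n in range(m, m + count)]

print(seq(lambda n: n, 1, 4))                           # the sequence (n)
print(seq(lambda n: 2 * n, 1, 4))                       # the sequence (2n)
print(seq(lambda n: (1 ** n + (-1) ** n) // 2, 1, 6))   # alternating 0, 1
print(seq(lambda n: Fraction(2 * n, n * n + 1), 1, 4))  # 2n/(n^2 + 1)
```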

From that definition, we have the Cauchy sequence:

cauchy sequence. A sequence (xn)nZ+{(x_n)_{n \in \pint}} is a Cauchy sequence if, and only if: For any rational ε>0,{\varepsilon \gt 0,} there exists an integer NZ+,{N \in \pint,} such that, if n,mN,{n,m \ge N,} then xnxm<ε.{\abs{x_n-x_m} \lt \varepsilon.}
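We can check this condition empirically on the truncation sequence from earlier. A Python sketch under our own assumptions: the helper `x(n)` builds the n-digit truncation as an exact rational, and we only sample a finite range of indices, which illustrates the definition rather than proving it:

```python
from fractions import Fraction

def x(n: int) -> Fraction:
    """n-th truncation from earlier: 1.4, 1.41, 1.414, ... as a rational."""
    return Fraction(int("1" + "4141414141"[:n]), 10 ** n)

# For n, m >= N, x_n and x_m agree in their first N decimal digits,
# so |x_n - x_m| < 10^(-N): the Cauchy condition with eps = 10^(-N).
N = 4
eps = Fraction(1, 10 ** N)
assert all(abs(x(n) - x(m)) < eps
           for n in range(N, 11) for m in range(N, 11))
```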

Note what this preliminary definition is trying to capture — the behavior we saw earlier. Cauchy chose the symbol ε{\varepsilon} to denote "error" — that teeny-tiny bit separating xn{x_n} and xm.{x_m.} Now, remember that sequences are functions. Cauchy's definition says: There's some positive integer N{N} (some index), where, for function arguments n{n} and m{m} greater than or equal to N{N} (any two indices on or after N{N}), we're going to see this behavior where x(n)x(m){\abs{x(n) - x(m)}} (the distance between the rationals xn{x_n} and xm{x_m}) is less than ε.{\varepsilon.} That is, for any two terms on or after the index N,{N,} no matter what ε{\varepsilon} someone picks — "they're separated by ε{\varepsilon}" — we can always go back and tell them, "No, they're actually closer than that." That's all ε{\varepsilon} is — how separated the terms are.

Now, we might wonder why Cauchy decided to characterize this distance in terms of an "error." Here's some intuition: Suppose we bought a 16-inch pizza. We get the order back, and it's 12-inches. Would we complain? Of course! The pizza's off by 4 inches (which could very well translate to a price difference of over 10 dollars). How about 14-inches? Certainly. 15 inches? Maybe. What about 15.999999999999 inches? Unlikely — it's unclear whether we'd even notice the difference in the first place. That's what's going on with the term "error." If the error ε{\varepsilon} is small enough, the number xn,{x_n,} for all intents and purposes, is tantamount to xm.{x_m.}

With that intuition, we now state the following definition:

convergent sequence. Let (xn){(x_n)} be a sequence, with nZ+.{n \in \pint.} We say that (xn){(x_n)} converges to a{a} if, and only if, for all ε>0,{\varepsilon \gt 0,} there exists an NN,{N \in \nat,} such that, for all nN,{n \ge N,} it is true that xna<ε.{\abs{x_n - a} \lt \varepsilon.} If (xn){(x_n)} converges to a,{a,} we say that a{a} is the limit of (xn).{(x_n).}

This seems like a complicated definition. Let's compare its logical form to the Cauchy sequence definition's logical form:

cauchy sequence:  ε>0.{\forall \varepsilon \gt 0.}  NN.{\exists N \in \nat.}  n,mN.{\forall n,m \ge N.}  xnxm<ε.{\abs{x_n-x_m} \lt \varepsilon.}
convergent sequence:  aQ.{\exists a \in \rat.}  ε>0.{\forall \varepsilon \gt 0.}  NN.{\exists N \in \nat.}  nN.{\forall n \ge N.}  xna<ε.{\abs{x_n - a} \lt \varepsilon.}

Do we see how this definition is different? It homes in on that xm{x_m} term from the Cauchy sequence — the term that "comes after" xn{x_n} — and replaces it with a fixed number a.{a.} Why the quotes? We're putting quotes around that phrase because we're now going to add a bit more nuance. Suppose we're not the pizza buyers, but the pizza sellers. Would we be upset if an employee accidentally delivered a 32-inch pizza instead of the 16-inch? Sure. How about an 18-inch pizza? Perhaps. How about 16.000000001? Hopefully not. The same idea applies here. The xn{x_n} isn't necessarily 15.99999. It could also be 16.00000001. When we say that a sequence converges, we're saying that all of the sequence elements, eventually, are tantamount to a,{a,} whether they sit before or after a.{a.}

All that said, we have to be careful with our language. Nothing requires that xn{x_n} ever equals a{a} — in the sequences above, it never does. It's why we use the phrase "tantamount to" rather than "same" or "indistinguishable." Clearly 15.999999999 and 16 are distinguishable. One has that symbol "5" and the other "6." The word tantamount means "virtually the same as," which is exactly the phrasing we want.

Putting it all together, we get something that looks like the following:

aε                               a                             a+ε \ldots \stackrel{\stackrel{\normalsize a-\varepsilon}{\mid}}{\ws} \circ~~~~~~~~~~ \circ~~~~~~~~ \circ~~~~~~ \circ~~~~ \circ~~ \circ~ \circ \stackrel{\stackrel{\normalsize a}{\mid}}{\ws} \circ ~\circ ~~~~\circ ~~~~~~\circ ~~~~~~~~\circ ~~~~~~~~~~\circ \stackrel{\stackrel{\normalsize a+\varepsilon}{\mid}}{\ws} \ldots

The dots between aε{a-\varepsilon} and a+ε{a+\varepsilon} comprise a region called the epsilon neighborhood of a.{a.} What the convergence definition tells us is, if a sequence (xn){(x_n)} converges, there are only so many points outside this neighborhood. That is, there are finitely many points outside the neighborhood, and no matter how small we make ε,{\varepsilon,} we're always going to find a term closer to a.{a.} An often cited example of a convergent sequence is (1/n).{(1/n).} Examine its values:

n{n}: 1, 10, 100, 1000, 10 000,{10~000,} 100 000,{100~000,} 1 000 000{1~000~000}
(1n){\ar{\frac{1}{n}}}: 1, 0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001

Eventually we get something that looks like

0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001 \footnotesize \ax{ & 0.00000000000000000000000000000000000000000 \\ & 0000000000000000000000000000000000000000000 \\ & 0000000000000000000000000000000000000000000 \\ & 0000000000000000000000000000000000000000000 \\ & 0000000000000000000000000000000000000000000 \\ & 0000000000000000000000000000000000000000000 \\ & 0000000000000000000000000000000000000000000 \\ & 0000000000000000000000000000000000000000000 \\ & 0000000000000000000000000000000000000000000 \\ & 0000000000000000000000000000000000000000000 \\ & 0000000000000000000000000000000000000000000 \\ & 0000000000000000000000000000000000000000000 \\ & 0000000000000000000000000000000000000000001 \\ }

and yet there will always be something smaller. We should now be able to see the connection between Cauchy sequences and convergent sequences: All convergent sequences are Cauchy sequences, but not all Cauchy sequences of rationals are convergent. Why? Because a Cauchy sequence living among the rationals may lack that number a{a} — the a{a} it closes in on doesn't exist among the rationals. If it did, we wouldn't have this problem of trying to figure out what x{x} is in x2=2.{x^2 = 2.}
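To make the convergence definition concrete for (1/n){(1/n)}: given any ε,{\varepsilon,} we can exhibit the index N{N} the definition demands. A Python sketch; the helper name `index_for` is ours:

```python
from fractions import Fraction

def index_for(eps: Fraction) -> int:
    """An index N such that |1/n - 0| < eps for every n >= N."""
    # 1/n < eps exactly when n > 1/eps, so any N past 1/eps works.
    return int(1 / eps) + 1

N = index_for(Fraction(1, 1000))
print(N)  # 1001
assert all(Fraction(1, n) < Fraction(1, 1000) for n in range(N, N + 500))
```

This is the existence claim in the definition made computable: for each ε{\varepsilon} the function hands back a witness N.{N.}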

Convergent Sequences are Cauchy

Suppose (xn){(x_n)} is a convergent sequence, with some limit a.{a.} We'll let ε>0,{\varepsilon \gt 0,} and define ε:=ε/2>0.{\varepsilon' := {\varepsilon}/{2} \gt 0.} Because we assumed that (xn){(x_n)} is convergent, there exists an NN{N \in \nat} such that, for any index n{n} greater than or equal to N{N}:

xna<ε. \abs{x_n - a} \lt \varepsilon'.

We need to tie this to the Cauchy sequence's definition of distance (error between xn{x_n} and xm{x_m}). We can just write:

xnxm=xna+axm. \abs{x_n - x_m} = \abs{x_n - a + a - x_m}.

And by the triangle inequality:

xnxmxna+axm \abs{x_n - x_m} \le \abs{x_n - a} + \abs{a - x_m}

We thus have:

xnxmxna<ε+axm<ε \abs{x_n - x_m} \le \tnote{\abs{x_n - a}}{$\lt \varepsilon'$} + \tnote{\abs{a - x_m}}{$\lt \varepsilon'$}

And since we defined ε:=ε/2>0,{\varepsilon' := \varepsilon/2 \gt 0,} we have:

(xnxm)(xna+axm)<(2ε=ε). (\abs{x_n - x_m}) \le (\abs{x_n - a} + \abs{a - x_m}) \lt (2 \varepsilon'=\varepsilon).

Every convergent sequence, then, is in fact a Cauchy sequence.
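The derivation can be illustrated numerically with the convergent sequence xn=1/n{x_n = 1/n} (limit a=0{a = 0}). The particular ε{\varepsilon} and N{N} below are our choices, and we only sample a finite block of indices:

```python
from fractions import Fraction

# With eps' = eps/2 and N chosen so |x_n - a| < eps' for all n >= N,
# the triangle inequality forces |x_n - x_m| < eps for all n, m >= N.
eps = Fraction(1, 100)
eps_prime = eps / 2
a = Fraction(0)
N = 201  # 1/n < 1/200 = eps' for every n >= 201

for n in range(N, N + 50):
    for m in range(N, N + 50):
        x_n, x_m = Fraction(1, n), Fraction(1, m)
        assert abs(x_n - a) < eps_prime and abs(a - x_m) < eps_prime
        assert abs(x_n - x_m) <= abs(x_n - a) + abs(a - x_m) < eps
```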

Dedekind Cuts

remark: For this section, assume Cauchy sequences do not exist. In fact, Dedekind cuts came before Cauchy sequences historically. We addressed Cauchy sequences first because they're generally easier to think about. Dedekind cuts are a bit trickier because the definitions are hefty, but they're quite manageable, as long as we follow the definitions very carefully. Otherwise, we can quickly come unstuck!

Let's think about where x{x} might be, given x2=2.{x^2 = 2.} Let's consider x22.{x^2 \neq 2.} In that case, we have x2<2,{x^2 \lt 2,} or x2>2.{x^2 \gt 2.} Let's look at x2<2{x^2 \lt 2} first. What is the set of all numbers whose square is less than 2? I.e., what is the set X={xQ:x2<2}?{X = \set{x \in \rat : x^2 \lt 2}?} If we try some values, we'd find that this is a pretty big subset:

{,2,,{,1110,1,,18,14,13,12,0,12,13,14,18,,1,1110,}A,,2,}\footnotesize \lset{\ldots, -2, \ldots, \tnote{\lset{ \ldots,-1\frac{1}{10},-1,\ldots, -\frac{1}{8}, -\frac{1}{4}, -\frac{1}{3}, -\frac{1}{2},0,\frac{1}{2},\frac{1}{3},\frac{1}{4}, \frac{1}{8}, \ldots, 1,1 \frac{1}{10},\ldots }}{$A$}, \ldots, 2, \ldots}

The problem with A{A} is those pesky dots at the ends. We'd like to know where this set starts and ends. More formally, we'd like to know if A{A} has a least upper bound. Let's define some terminology.

upper bound. Let P{P} be a poset (P,){(P, \le)} with a set SP.{S \subseteq P.} If there exists an element aP{\mathcal{a} \in P} such that for all sS,{s \in S,} the relation sa{s \le \mathcal{a}} is true, then a{\mathcal{a}} is called an upper bound of S.{S.} We write upper-bounds[S]{\df{upper-bounds}[S]} to denote the set of all upper bounds of S.{S.} If upper-bounds[S],{\df{upper-bounds}[S] \neq \nil,} we say that S{S} is bounded above.

example. Let A={1,2,3,4,5}{A = \set{1,2,3,4,5}} with AZ.{A \subset \uint.} 5 is an upper bound of A.{A.} So are 6, 7, 8, 9, 10, and so on, because AZ.{A \subset \uint.}

lower bound. Let P{P} be a poset (P,){(P,\le)} with a set SP.{S \subseteq P.} If there exists an element aP{\mathcal{a} \in P} such that, for all sS,{s \in S,} the relation as{\mathcal{a} \le s} is true, then a{\mathcal{a}} is called a lower bound of S.{S.} We write lower-bounds[S]{\df{lower-bounds}[S]} to denote the set of all lower bounds of S.{S.} If lower-bounds[S],{\df{lower-bounds}[S] \neq \nil,} then we say that S{S} is bounded below.

example. Let A={7,8,9,10}{A = \set{7,8,9,10}} with AZ.{A \subset \uint.} 7 is a lower bound of A.{A.} So is 6, 5, 4, and so on, because AZ.{A \subset \uint.}

bounded set. Given a poset (P,){(P,\le)} with a set SP,{S \subseteq P,} we say that S{S} is bounded if, and only if, lower-bounds[S]{\df{lower-bounds}[S] \neq \nil} and upper-bounds[S].{\df{upper-bounds}[S] \neq \nil.} That is, there exist elements n{n} and N{N} such that nxN{n \le x \le N} for all xS.{x \in S.}

least upper bound. Given a poset (P,){(P,\le)} with a set SP{S \subseteq P} and upper-bounds[S].{\df{upper-bounds}[S] \neq \nil.} If there exists an element nupper-bounds[S],{n \in \df{upper-bounds}[S],} such that, for all uupper-bounds[S],{u \in \df{upper-bounds}[S],} the relation nu{n \le u} holds, then we say that n{n} is the least upper bound or supremum of S{S} and write supS=n.{\sup S = n.}

The least upper bound (supremum), in short, is the least upper bound among all upper bounds. Notice that this implies that the least upper bound need not be in the subset S.{S.} Why? Because any element in the superset of S{S} is a candidate for being an upper bound.

greatest lower bound. Given a poset (P,){(P,\le)} with a set SP{S \subseteq P} and lower-bounds[S].{\df{lower-bounds}[S] \neq \nil.} If there exists an element lower-bounds[S],{\ell \in \df{lower-bounds}[S],} such that, for all Llower-bounds[S],{L \in \df{lower-bounds}[S],} the relation L{\ell \ge L} holds, then we say that {\ell} is the greatest lower bound or infimum of S{S} and write infS=.{\inf S = \ell.}
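For finite sets, these definitions can be computed directly. A Python sketch with a toy poset of our choosing (the integers 0 through 20 under the usual order) and the subset from the earlier examples:

```python
# Bounds computed straight from the definitions, for P = {0, ..., 20}
# under the usual order and S = {7, 8, 9, 10}.
P = range(0, 21)
S = {7, 8, 9, 10}

upper_bounds = [a for a in P if all(s <= a for s in S)]
lower_bounds = [a for a in P if all(a <= s for s in S)]

sup_S = min(upper_bounds)  # least among the upper bounds
inf_S = max(lower_bounds)  # greatest among the lower bounds
print(sup_S, inf_S)  # 10 7
```

Here supS{\sup S} and infS{\inf S} happen to lie inside S,{S,} so they coincide with maxS{\max S} and minS{\min S} — which, as discussed next, is not always the case.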

The diagram below is helpful for parsing the definitions.

(Diagram: a bounded set A{A} on the number line, with infA,{\inf A,} minA,{\min A,} maxA,{\max A,} and supA{\sup A} marked.)

We've marked the maximum and minimum of A{A} separately to communicate the fact that they can be distinct from the supremum and infimum — but they don't have to be. If supA{\sup A} turns out to be in A,{A,} then it is also maxA.{\max A.} Likewise, if infA{\inf A} turns out to be in A,{A,} then it is also minA.{\min A.} But it's not always the case that supAA,{\sup A \in A,} nor is it always the case that infAA.{\inf A \in A.}1 Put differently, if minA{\min A} exists, then minA=infA,{\min A = \inf A,} and if maxA{\max A} exists, then maxA=supA.{\max A = \sup A.} For both propositions, the antecedent (that the minimum or maximum exists) is not guaranteed, nor does the converse hold in general.

The greatest lower bound (infimum) is the greatest lower bound among all lower bounds. Like the supremum, the greatest lower bound need not be in the subset S.{S.} It depends entirely on how we've defined the superset of S.{S.} The supremum is alternatively denoted lub S,{\text{lub}~S,} and the infimum is alternatively denoted glb S.{\text{glb}~S.} We use supS{\sup S} and infS{\inf S} because LaTeX{\LaTeX} provides them natively.

So, for our set Φ={x:x<2},{\Phi = \set{x: x \lt 2},} there are many upper bounds. We know that supΦ=2.{\sup \Phi = 2.} Why? Because the subset Φ{\Phi} comprises all the rationals less than 2, and for all rationals x2{x \ge 2} (the upper bounds of {x:x<2}{\set{x:x \lt 2}}), it is true that 2x.{2 \le x.} Is there a supremum for the set A={x:x2<2}?{A=\set{x:x^2 \lt 2}?} Suppose yQ{y \in \rat} is a positive upper bound of A,{A,} and consider y^=4+3y3+2y.{\hat{y}=\frac{4+3y}{3+2y}.} A little algebra shows that y^22=y22(3+2y)2{\hat{y}^2-2=\frac{y^2-2}{(3+2y)^2}} and y^y=2(2y2)3+2y.{\hat{y}-y=\frac{2(2-y^2)}{3+2y}.} If y2>2,{y^2 \gt 2,} then y^2>2{\hat{y}^2 \gt 2} and y^<y,{\hat{y} \lt y,} so y^{\hat{y}} is a smaller upper bound of A,{A,} and y{y} cannot be the least one. If y2<2,{y^2 \lt 2,} then y^2<2{\hat{y}^2 \lt 2} and y^>y,{\hat{y} \gt y,} so y^A{\hat{y} \in A} and y{y} wasn't an upper bound at all. And no rational satisfies y2=2.{y^2 = 2.} The set A{A} has no supremum in Q.{\rat.}
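The map yy^{y \mapsto \hat{y}} can be iterated, and watching it run makes the argument tangible. A Python sketch with exact rationals; the starting point y=2{y = 2} and the number of steps are our choices:

```python
from fractions import Fraction

def refine(y: Fraction) -> Fraction:
    """One step of y -> (4 + 3y)/(3 + 2y): keeps y on the same side of
    the hole at x^2 = 2 while moving it strictly closer."""
    return (4 + 3 * y) / (3 + 2 * y)

prev = Fraction(2)  # 2^2 > 2, so 2 is an upper bound of A
for _ in range(5):
    y = refine(prev)
    assert y * y > 2  # still squares to more than 2...
    assert y < prev   # ...yet is a strictly smaller upper bound
    prev = y
print(float(prev), float(prev * prev))
```

No iterate ever lands on a rational whose square is exactly 2; the squares merely crowd down toward it, which is the "hole" in miniature.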

These two examples, however, give us an idea. Because of the trichotomy law, we know that for any rational number, we're always going to have one of the following relations apply: <,=,>.{\lt, =, \gt.} So, for any rQ,{r \in \rat,} we'll have many numbers to the left of r{r} (<{\small \lt}), r{r} itself (={\small =}), and many numbers to the right of r{r} (>{\small \gt}). Maybe we could define the real numbers in this way: They're cuts in the sequence of rationals. That is, a real number r{r} is either the supremum of a subset containing r,{r,} or the gap between the two sets.

{}r{}{}{r} \ldots \ldots \lset{\ldots\ldots\ldots\ldots\ldots} {\large r} \lset{\ldots\ldots\ldots\ldots\ldots} \ldots \ldots \\ \ldots \ldots \lset{\ldots\ldots\ldots\ldots\ldots} \lset{{\large r}\ldots\ldots\ldots\ldots\ldots} \ldots \ldots

This is where the name Dedekind cut comes from. We're defining a new set of numbers by making cuts (or cutoffs, splits, etc.) along the sequence of rationals. We may feel that this isn't quite as "natural" or "intuitive" as what we did for the integers and rationals. For those constructions, the new numbers resulted from mappings defined in terms of established operations. We'll revisit this feeling once we've brought a bit more rigor to this notion of cutting.

cut. A cut α{\alpha} is a subset of Q{\rat} such that: (1) α{\alpha} contains a rational number, but not all rational numbers. (2) Every rational number in α{\alpha} is smaller than every rational number not in α.{\alpha.} (3) α{\alpha} contains no largest number; every member of α{\alpha} is smaller than some other member of α.{\alpha.}

Right off the bat, this definition is packed with a lot of conditions:

definition components: cut
2.0.1αQ{{\alpha\subset\rat}}
2.0.2α{{\alpha\neq\nil}}
2.0.3αQ{{\alpha\neq\rat}}
2.0.4(pα)(qQ)(q<p)qα{{(p\in\alpha)\land(q\in\rat)\land(q \lt p)\nc q \in\alpha}}
2.0.5(pα)(r.[(rα)(p<r)]){{(p\in\alpha)\nc(\exists r.[(r\in\alpha)\land(p\lt r)])}}
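Since a cut is an infinite subset of Q,{\rat,} we can model one in code only by its membership predicate. A Python sketch checking conditions 2.0.4 and 2.0.5 for the cut {xQ:x<2}{\set{x \in \rat : x \lt 2}} on a finite sample; the sample grid and the midpoint witness are our choices:

```python
from fractions import Fraction

def alpha(x: Fraction) -> bool:
    """Membership predicate for the cut {x in Q : x < 2}."""
    return x < 2

samples = [Fraction(p, q) for p in range(-6, 7) for q in (1, 2, 3)]

# 2.0.4: alpha is closed downward — if p is in alpha and q < p,
# then q is in alpha.
assert all(alpha(q) for p in samples if alpha(p) for q in samples if q < p)

# 2.0.5: no largest member — for p in alpha, the midpoint (p + 2)/2
# is a witness: it lies strictly between p and 2, so it is in alpha.
assert all(alpha((p + 2) / 2) and p < (p + 2) / 2
           for p in samples if alpha(p))
```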

We have to be a little careful with the notion of a cut.

Footnotes

  1. As we'll see later, the key distinction between the four concepts is that minA{\min A} and maxA{\max A} aren't guaranteed to exist for a set AR,{A \subset \reals,} but supA{\sup A} and infA{\inf A} are guaranteed to exist for any nonempty AR{A \subset \reals} that is bounded above (respectively, below), where R{\reals} is the set of all real numbers.