\documentclass[12pt]{article}
\usepackage[margin=0.7in]{geometry}
\usepackage{fancyhdr,lastpage}
\usepackage{amsmath,amsfonts,amsthm,amssymb}
\newtheorem{proposition}{Proposition}
\newtheorem{axiom}{Axiom}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\newtheorem{conjecture}{Conjecture}
\newtheorem{corollary}{Corollary}
\theoremstyle{definition}
\newtheorem{definition}{Definition}
\pagestyle{fancy}
\lhead{MATH 311-02}
\rhead{Full Proof of the Cantor-Schr\"oder-Bernstein Theorem}
%\rhead{Introduction to Higher Math}
\lfoot{}
\cfoot{Page \thepage\ of \pageref{LastPage}}
\rfoot{}
\begin{document}
This is the full proof of the Cantor-Schr\"oder-Bernstein Theorem, a
powerful result for developing relationships between different sets.
\section{Statement of the Theorem, and a simple case}
\begin{theorem}[Dedekind 1887, Cantor 1895, Schr\"oder 1896, Bernstein
1897, K\"onig 1906]
For sets $A$ and $B$, if $f:A\rightarrow B$ is an injective function
and $g:B\rightarrow A$ is an injective function, then there is a
bijective function $h:A\rightarrow B$.
\end{theorem}
This turns out to be a very-nearly trivial result when $A$ or $B$ is
finite; we'll prove that much simpler statement just to get a handle
on the question being asked, although none of our methods will
translate over.
\begin{proposition}
For sets $A$ and $B$, if either $A$ or $B$ is finite, and
$f:A\rightarrow B$ is an injective function, and $g:B\rightarrow A$
is an injective function, then there is a bijective function
$h:A\rightarrow B$.
\end{proposition}
\begin{proof}
Recall a result from previously, which we will use a great deal in
this proof: if $B$ is finite and $f:A\rightarrow B$ is injective,
then $A$ is a finite set and $|A|\leq |B|$; in other words, an
injective function with finite codomain has a domain which is finite
and no larger than its codomain.
Note that if $A$ is finite, then by the above result applied to
$g:B\rightarrow A$, $B$ is finite; likewise if $B$ is finite, then
by the above result applied to $f:A\rightarrow B$, $A$ is finite, so
from the premise that at least one of these two sets is finite, we
may conclude that both are.
Now, by injectivity of $f$, it follows that $|A|\leq |B|$, and by
injectivity of $g$, it follows that $|B|\leq |A|$. Thus $|A|=|B|$;
let us denote the size of $A$ by $n$, so that $|A|=|B|=n$. Then we
may assign names to the $n$ distinct elements of $A$ and $B$ as
such: let $A=\{a_1,a_2,\ldots,a_n\}$ and $B=\{b_1,b_2,\ldots,b_n\}$,
and now we can build a bijection by simply letting $h(a_i)=b_i$ for
each $i$ in $1,2,\ldots, n$. This is clearly injective, since if
$i\neq j$, then $a_i\neq a_j$ and $b_i\neq b_j$; likewise it is
surjective, since for each $b_i\in B$, we have that $f(a_i)=b_i$.
\end{proof}
Of course, for infinite sets, we have no such convenient notion of
``size'', and even the idea that we could label the elements of $A$
and $B$ in order turns out to be subtly flawed. So we must be much
more careful in our actual proof. The following proof is based on
K\"onig's argument; the previous proofs were more technical and used
some unusual set-theoretical axioms.
\section{The Zig-Zag Sequence and Partition of $A$}
One key mechanism we will use is what might be called a ``zig-zag
sequence'' built from the given injections $f:A\rightarrow B$ and
$g:A\rightarrow A$. We will construct sequences which partition $A$
and $B$, and on each part of this partition we'll build a bijection.
Let us consider some $a\in A$. In one direction, it is clear that we
can trace the process of repeated applications of $f$ and $g$ to this
element of $A$, and build a sequence of alternating elements from $A$
and $B$ in such a manner:
$$a,f(a),g(f(a)),f(g(f(a))),g(f(g(f(a)))),\ldots,(g\circ f)^n(a),(f\circ(g\circ f)^n)(a),(g\circ f)^{n+1}(a),\ldots$$
and we could continue this process indefinitely (although not every
element of the sequence is guaranteed to be distinct).
We'd like to also step \emph{backwards} from $a$; we shall see why this
is useful as we analyze each individual sequence. Here we need to be
careful: there is no good reason to believe there is any $b$ such that
$g(b)=a$! But we will build inverse functions and be extremely careful
about their application. $f:A\rightarrow B$ and $g:B\rightarrow A$,
which may not be surjective, are not guaranteed to have inverses;
however, by restricting their codomain, we can frame the same set of
ordered pairs in a surjective \emph{context}: let's define $A'$ and
$B'$ to be the images of $g$ and $f$ respectively; then if we consider
the reframing of the same functions as $f:A\rightarrow B'$ and
$g:B\rightarrow A'$, their codomains are definitionally equal to their
images so they are surejctive, and we can assert the existence of
inverses $f^{-1}:B'\rightarrow A$ and $g^{-1}:A'\rightarrow B$.
Now, with these inverses, we can attempt to build a backwards-stepping
sequence from $a$:
$$a,g^{-1}(a),f^{-1}(g^{-1}(a)),g^{-1}(f^{-1}(g^{-1}(a))),\ldots,(f^{-1}\circ g^{-1})^n(a),(g^{-1}\circ(f^{-1}\circ g^{-1})^n)(a),(f^{-1}\circ g^{-1})^{n+1}(a),\ldots$$
But our blithe assertion that this process can be performed is
predicated on some shaky ground; remember that $f^{-1}$ and $g^{-1}$
have domains $A'$ and $B'$, not $A$ and $B$. This process could fail
to be meaningful on the first step, if $a\notin A'$, or on the second
step, if $g^{-1}(a)\notin B'$, or the third, if
$f^{-1}(g^{-1}(a))\notin A'$, and so forth.
We shall thus define a structure called the \emph{zig-zag sequence} of a
particular $a\in A$ to be one of the three following constructs.
\begin{itemize}
\item A sequence which has a ``stopping point'' in $A$:
\begin{multline*}
(f^{-1}\circ g^{-1})^n(a), \left(g^{-1}\circ(f^{-1}\circ g^{-1})^{n-1}\right)(a),(f^{-1}\circ g^{-1})^{n-1}(a),\ldots,\\
(f^{-1}\circ g^{-1})(a),g^{-1}(a),a,f(a),(g\circ f)(a),(f\circ
g\circ f)(a),(g\circ f\circ g\circ f)(a),\ldots
\end{multline*}
where $(f^{-1}\circ g^{-1})^n(a)\notin A'$,
\item a sequence which has a ``stopping point'' in $B$:
\begin{multline*}
\left(g^{-1}\circ(f^{-1}\circ g^{-1})^n\right)(a), (f^{-1}\circ g^{-1})^n(a), \left(g^{-1}\circ(f^{-1}\circ g^{-1})^{n-1}\right)(a),(f^{-1}\circ g^{-1})^{n-1}(a),\ldots,\\
(f^{-1}\circ g^{-1})(a),g^{-1}(a),a,f(a),(g\circ f)(a),(f\circ
g\circ f)(a),(g\circ f\circ g\circ f)(a),\ldots
\end{multline*}
where $\left(g^{-1}\circ(f^{-1}\circ g^{-1})^n\right)(a)\notin B'$,
\item or a sequence which does not stop in its backwards propogation:
\begin{multline*}
\ldots,(g^{-1}\circ f^{-1}\circ g^{-1})(a),(f^{-1}\circ
g^{-1})(a),g^{-1}(a),a,f(a),(g\circ f)(a),(f\circ
g\circ f)(a),(g\circ f\circ g\circ f)(a),\ldots
\end{multline*}
\end{itemize}
These three types of sequences have traditionally been handled
separately; in addition, the third case is often considered with
regard to two distinct possibilities, where all terms of the sequence
are distinct or where they are not all distncit. As it turns out, we
actually only need to consider \emph{two} distinct cases when we
finally get to analyzing individual zig-zag sequences.
First, however, we will have to carve $A$ and $B$ up into these
sequences. We will show that it is possible to do so by defining a
relation on shared-sequence-membership, and that this relation is
actually an equivalence relation.
Let us define a relation $R$ on $A$ as such: for $x,y\in A$,
$x\mathrel R y$ if and only if $y$ appears in the zig-zag sequence of
$x$: that is to say, if either $y=x$ or there is some positive integer
$n$ such that either $(g\circ f)^n(x)=y$ or $(f^{-1}\circ
g^{-1})^n(x)=y$.
\begin{lemma}
$R$ is an equivalence relation.
\end{lemma}
\begin{proof}
First we shall demonstrate reflexivity: for every $x\in A$, the
defined construction of $x$'s zig-zag sequence always includes $x$
between the forward- and backward-propagations, so clearly $x$ lies
in the zig-zag sequence of $x$ and $x\mathrel R x$.
Now we demonstrate symmetry: let us assume the symmetric premise of
specific $x,y\in A$ such that $x\mathrel R y$, so either $y=x$ (in
which case $y\mathrel R x$ trivially), or $y=(g\circ f)^n(x)$, or
$y=(f^{-1}\circ g^{-1})^n(x)$. If $y=(g\circ f)^n(x)$, we can
perform inverses step by step:
\begin{align*}
y&=(g\circ f)^n(x)\\
g^{-1}(y)&=\left(g^{-1}\circ g\circ f\circ(g\circ f)^{n-1}\right)(x)\\
g^{-1}(y)&=\left(f\circ(g\circ f)^{n-1}\right)(x)\\
(f^{-1}\circ g^{-1})(y)&=\left(f^{-1}\circ f\circ(g\circ f)^{n-1}\right)(x)\\
(f^{-1}\circ g^{-1})(y)&=\circ(g\circ f)^{n-1}(x)\\
\end{align*}
and continue inductively until we determine that
$$(f^{-1}\circ g^{-1})^n(y)=x$$
which is a criterion for showing that $y\mathrel x$.
Likewise, when $y=(f^{-1}\circ g^{-1})^n(x)$, we can disentangle $x$
by repeated applications of $f$ and $g$:
\begin{align*}
y&=(f^{-1}\circ g^{-1})^n(x)\\
f(y)&=\left(f\circ f^{-1}\circ g^{-1}\circ(f^{-1}\circ g^{-1})^{n-1}\right)(x)\\
f(y)&=\left(g^{-1}\circ(f^{-1}\circ g^{-1})^{n-1}\right)(x)\\
(g\circ f)(y)&=\left(g\circ g^{-1}\circ(f^{-1}\circ g^{-1})^{n-1}\right)(x)\\
(g\circ f)(y)&=\circ(f^{-1}\circ g^{-1})^{n-1}(x)\\
\end{align*}
so again using induction we could show that
$$(g\circ f)^n(y)=x$$
so that $y\mathrel R x$.
Finally, we must show transitivity, which ends up being surprisingly
simple: there are four cases but they are largely identical: we
compose two functions of the forms $(g\circ f)^n$ or $(f^{-1}\circ
g^{-1})^n$, and it is easy to show we get back a function of the
same form.
\end{proof}
Because being-in-the-same-sequence is an equivalence relation, the
equivalence classes (i.e., the sets of elements of $A$ which lie in a
single sequence) partition $A$. Thus, each element of $A$ lies in
exactly one zig-zag sequence. It is easy to show that every element of
$B$ also lies in exactly one zig-zag sequence: if $b\in B$, then $b$
lies in the sequence determined by $g(b)$; if $b$ lay in multiple
sequences those multiple sequences would both contain $g(b)$.
This decomposition gives us a framework for our proof: if we can build
a bijection between the elements of $A$ and $B$ within a single
sequence, and then repeat that procedure for every sequence, then
we'll have constructed a bijection between $A$ and $B$. Henceforth, we
will look not at $A$ and $B$ as a whole, but at the sets $S_A$ and
$S_B$ consisting of the elements from $A$ and $B$ of a specific
zig-zag sequence.
On each sequence, we will find that a \emph{restriction} of the
function $f$ or $g$ suffices; let us define a restriction explicitly:
\begin{definition}
For a function $f:A\rightarrow B$ and $S\subseteq A$, the
\emph{restriction of $f$ to $S$}, written $f_S$, is a function from
$S$ to $B$ given by $f_S(a)=f(a)$.
\end{definition}
There are four essential ``shapes'' the zig-zag sequences can have,
which have traditionally been handled separately: there are sequences
where the inverse functions remain applicable at every backwards step
and which never repeat themselves, there are sequences where the
inverse functions remain applicable at every backwards step and which
do repeat themselves, there are sequences which eventually reach some
element of $A$ on which $g^{-1}$ is not defined, and there are
sequences which eventually reach some element of $B$ on which $f^{-1}$
is not defined. For simplicity, we can actually fold the first three
prospects into a single case.
\section{Zig-Zag sequences without $B$-blockage}
We shall consider in this section a specific type of zig-zag sequence:
we shall look at the case where $S_B\subseteq B'$, which is to say,
$f^{-1}$ is defined on each element of $S_B$. We shall show the
following significant proposition for such zig-zag sequences:
\begin{lemma}
If the sets $S_A\subseteq A$ and $S_B\subseteq B$ are the terms of a
zig-zag sequence, and if furthermore $S_B\subseteq B'$, then the
restriction $f_{S_A}$ is a bijection from $S_A$ to $S_B$.
\end{lemma}
\begin{proof}
Note that as traditionally defined, $f_{S_A}$ has domain $S_A$ and
codomain $B$, and is furthermore injective, since it is a
restriction of a function which was injective. However, we want to
show that, considered with codomain $S_B$, it is also surjective;
thus, we want to show that the image of $f_{S_A}$ is actually $S_B$;
to do so, we must first show that for all $a\in S_A$, it is the case
that $f(a)\in S_B$; in addition, we must show that for all $b\in
S_B$, there is some $a\in S_A$ such that $f(a)=b$.
In order to deal with the first assertion, let us consider an
arbitrary $a\in S_A$. Since we showed that every element of a
particular zig-zag sequence's $S_A$ generates the exact same
sequence (since they lie in the same equivalence class under the
members-of-the-same-sequence relation explored above), we may assert
that the zig-zag sequence of which $S_A$ and $S_B$ are the terms was
in fact generated by $a$, so one forward step from this generator
yields $f(a)$ as a term of the sequence, so $f(a)\in S_B$.
On the other hand, if we take an arbitrary $b\in S_B$, since
$S_B\subseteq B'$, $b\in B'$, so since $b$ is in the domain of
$f^{-1}$, we may assert that $f^{-1}(b)\in A$. The zig-zag sequence
generated by $f^{-1}(b)$ has one forward-step from its generator the
term $f(f^{-1}(b))$, which is simply $b$. Since $b$ lies in exactly
one zig-zag sequence, it follows that $f^{-1}(b)$ generates the
exact same zig-zag sequence as the one whose terms were distributed
into $S_A$ and $S_B$. Thus $f^{-1}(b)\in S_A$, and $f(f^{-1}(b))=b$,
so we have shown the existence of $a\in S_A$, specifically
$a=f^{-1}(b)$, such that $f(a)=b$..
Thus $f_{S_A}$ is surjective, when contextualized with codomain
$S_B$, and as previously seen it is injective, so
$f_{S_A}:S_A\rightarrow S_B$ is a bijection.
\end{proof}
Thus, among the parts of $A$ and $B$ described by zig-zag sequences
which do not have terms in $B-B'$, we may restrict $f$ to achieve
bijections on these parts.
\section{Zig-Zag sequences with $B$-blockage}
Here we consider the case not addressed in the previous section, where
$S_B\not\subseteq B'$; that is to say, sequences in which not every
element of $S_B$ lies in the domain of $f^{-1}$. Note that in this
case the argument from the previous section will not work: elements of
$B-B'$ aren't in the image of $f$, so a restriction of $f$ won't be a
surjection into $S_B$.
Now, as it turns out, the entirety of $S_A$ must lie within $A'$,
which will allow us to craft a bijection not with $f$, but with $g$.
\begin{proposition}
If the sets $S_A\subseteq A$ and $S_B\subseteq B$ are the terms of a
zig-zag sequence and $S_B\not\subseteq B'$, then it is the case that
$S_A\subseteq A'$, and furthermore $|S_B-B'|=1$.
\end{proposition}
\begin{proof}
Since $S_B\not\subseteq B'$, there is at least one element of $S_B$
which is not an element of $B'$. Let $b\in(S_B-B')$. Since $b\in
S_B$, it follows by the process defining the zig-zag sequence that
$g(b)\in S_A$, so the zig-zag sequence whose terms provide the
elements of $S_A$ and $S_B$ is equivalent to the sequence generated
by $g(b)$. Identifying our sequence by this generator allows us to
explicitly list both the forward-propogated and
backwards-propagated elements of the sequence:
$$b,g(b),f(g(b)),g(f(g(b))),f(g(f(g(b)))),\ldots$$
We can be assured that the back-propogation is exactly one step,
since $g(b)$ is definitionally in the image of $g$, so $g(b)\in A'$,
but by our premise $b\not\in B'$, so no back propagation further
than $b$ is possible.
It is now easy to show that every element of $S_A$ is in $A'$. The
terms this sequence contributes to $S_A$ are $g(b)$, $g(f(g(b)))$,
$g(f(g(f(g(b)))))$, and so forth; thus they are terms of the form
$g\bigl[(f\circ g)^n(b)\bigr]$; since this term is of the form
$g(b')$ for some $b'\in B$ (specifically given by $b'=(f\circ g)^n(b)$),
it is clear that each $g\bigl[(f\circ g)^n(b)\bigr]$ is in the image
of $g$, and thus lies in $A'$; since every element of $S_A$ is in
$A'$, it follows that $S_A\subseteq A'$.
Similarly, the terms this sequence contributes to $S_B$ are $b$,
$f(g(b))$, $f(g(f(g(b))))$, and so forth. Every term of this list
except for the first one is of the form $f(a)$ for some $a\in A$
particularly, of the form $a=g\circ(f\circ g)^n(b)$), so every
element of $S_B$ except for $b$ lies in the image of $f$, so there
is only one element of $S_B$ which is not an element of $B'$.
\end{proof}
Since we know $S_A\subseteq A'$, we can craft a bijection using
$g^{-1}$ following almost exactly the logic in the previous section.
\begin{lemma}
If the sets $S_A\subseteq A$ and $S_B\subseteq B$ are the terms of a
zig-zag sequence and $S_B\not\subseteq B'$, then the
restriction $g^{-1}_{S_A}$ is a bijection from $S_A$ to $S_B$.
\end{lemma}
\begin{proof}
By the previous proposition, given $S_B\not\subseteq B'$, we know
$S_A\subseteq A'$; since $g^{-1}$ has domain $A'$, we are thus
justified in defining a restriction onto its subset $S_A$.
As traditionally defined, $g^{-1}_{S_A}$ has domain $S_A$ and
codomain $B$. It is furthermore injective, since $g^{-1}$ was
injective and it is a restriction of a function which was
injective. However, we want to show that, considered with codomain
$S_B$, it is also surjective; thus, we want to show that the image
of $g^{-1}_{S_A}$ is actually $S_B$; to do so, we must first show that
for all $a\in A$, it is the case that $g^{-1}(a)\in S_B$; in addition, we
must show that for all $b\in S_B$, there is some $a\in S_A$ such
that $g^{-1}(a)=b$.
In order to deal with the first assertion, let us consider an
arbitrary $a\in S_A$. Since we showed that every element of a
particular zig-zag sequence's $S_A$ generates the exact same
sequence (since they lie in the same equivalence class under the
members-of-the-same-sequence relation explored above), we may assert
that the zig-zag sequence of which $S_A$ and $S_B$ are the terms was
in fact generated by $a$. Since $S_A\subseteq A'$, it follows that
$a\in A'$, so $g^{-1}(a)$ is defined, and one backwards step from
this generator yields $g^{-1}(a)$ as a term of the sequence, so
$g^{-1}(a)\in S_B$.
On the other hand, if we take an arbitrary $b\in S_B$, we may assert
that $g(b)\in S_A$ by forward-propagation through the zig-zag
sequence. Then note that since $g(b)\in A'$, $g(b)$ is in the domain
of $g^{-1}$, so $g^{-1}(g(b))=b$; thus $a=g(b)$ is an element of
$S_A$ such that $g^{-1}(a)=b$.
Thus $g^{-1}_{S_A}$ is surjective, when contextualized with codomain
$S_B$, and as previously seen it is injective, so
$f_{S_A}:S_A\rightarrow S_B$ is a bijection.
\end{proof}
\section{The Explicit Construction, Recapitulated}
In the end, the actual construction here is tedious (and difficult to
actually perform, since certain steps involve an infinite number of
steps, since both the number and lengths of zig-zag sequences may be
infinite. However, the procedure used to construct our bijection
$h:A\rightarrow B$ is really very simple.
Provided with sets $A$, $B$, and injections $f:A\rightarrow B$ and
$g:B\rightarrow A$, we produce an assignment rule for each $a_0\in A$
as such: produce by back-propagation a sequence $b_1=g^{-1}(a_0)$,
$a_1=f^{-1}(b_1)$, $b_2=g^{-1}(a_1)$, $a_2=f^{-1}(b_2)$, and so
forth. If at any point in this procedure, some $b_i\notin B'$, then
let $h(a_0)=g^{-1}(a_0)$. Otherwise, let $h(a_0)=f(a_0)$.
This (possibly infinite-time to determine) criterion will determine
$h$ on every point of the codomain, and the justifications in the
previous two sections serve to show that $H$ is a bijection.
\end{document}