Infinite Constructions And Cantor Diagonalization

Georg Cantor used his diagonalization construction to famously “prove” that, while the set of integers and the set of real numbers were both infinite, the latter was a larger infinity than the former.

Or did he? This essay will delve into some aspects of his proof, to see if I can expose holes in his logic. After all, his proof relies on an infinite construction, and while that is not in itself a fatal flaw, perhaps there is more than one way of interpreting it.

Finite Constructions: Termination

Let us first consider the concept of an algorithm. This is a procedure consisting of a finite set of precisely-specified steps — so precisely specified, in fact, that they can be carried out by a machine with no capacity for intelligent thought. We know of such machines as computers, and the embodiment of algorithms that they obey as programs.

An essential characteristic of an algorithm is that, even though particular steps may be executed multiple times, the total number of executions of all the steps must always be finite — the algorithm must terminate. That is, sooner or later it must come to a stop and produce an answer.

As an example, consider the well-known procedure for finding the greatest common divisor of two non-negative integers a and b:

Does a = b? If so, the GCD is a. Stop.
Is a > b? If so, replace a with a - b, and b with (the previous value of) a. Repeat from step 1.
Otherwise, a < b: replace b with b - a, and a with (the previous value of) b. Repeat from step 1.

It is easy to see why this algorithm will always terminate: given that a and b are finite (which is true of all the integers), the subtractions in steps 2 and 3 will always replace the larger of them with a smaller integer, and you can’t do this indefinitely without reaching 1, if not some integer greater than 1. (Step 1 prevents the result from being 0.) If one of them is zero to begin with, then it gets replaced with the other in step 2 or 3, and then the procedure stops in step 1 with that as the answer.

Consider a second example: what we might consider a “naïve” (and also less than efficient) algorithm for determining the quotient q and remainder r of the division of two non-negative integers a and b:

Set q initially to 0.
Is a < b? If so, set r = a. Stop.
Otherwise, a ≥ b; replace a with a - b, and also replace q with q + 1. Repeat from step 2.

This will work fine, except for the case where b = 0: in this situation, it will loop endlessly, because the subtraction in step 3 will leave the value of a unchanged, so the check for loop termination in step 2 will never succeed.

It is easy to fix this, of course, by an extra check that b is not zero at the start.

However, in general it is not possible to prove that a purported algorithm will always terminate, or even to precisely list the conditions under which it will do so. This is a consequence of Alan Turing’s famous “Halting Problem” proof.

Termination Versus Convergence

Computers are also frequently used to compute approximate answers. After all, when quantities are measured in the physical world (e.g. distances, times, masses, energies, temperatures), we usually have to use numbers with fractional parts, limited by the accuracy of our measuring instruments. Furthermore, the mathematical formulas we use to compute with these numbers are often incapable of exact representation within the limitations of computer arithmetic.

Consider something as simple as computing the circumference s of a circle from its radius r: the formula is a straightforward linear proportion with a constant factor, just

s = 2πr

But what is the value of that constant factor π? Everybody knows the first few digits: 3.14159265 ... but the sequence never ends, and never repeats. Formulas for computing those digits are well-known, and every now and then somebody sets a new world record for the number of digits computed.

So we can say that, for any given positive integer n, we have algorithms for computing π to n digits. But there is no actual algorithm for computing π as such, because to determine π exactly would require an infinity of digits, so such an “algorithm” could never terminate!

So in numerical mathematics, we amend our concept of an “algorithm” a little bit, by replacing the requirement for “termination” with one for “convergence”, where we can get as close as we want (the computation “converges”) to the correct answer (“in the limit”), though we can never get it exactly. So for any quantity x we want to compute, we have to specify an error bound ε > 0, such that what we really end up computing is some approximation x' for which we know

|x - x'| ≤ ε

but more than that we cannot be specific. So we have an algorithm which (hopefully) will terminate for any given positive ε, no matter how small, though it may take longer and longer (and more and more storage) for smaller and smaller ε, just so long as we do not ask for ε = 0.

Such a number x is known as a computable number. The infinity of the computable numbers is the same as the infinity of the integers, because every computer program (algorithm) must be representable as a finite-length string of symbols. You can encode each symbol of the program as an integer (for example, by stringing together the Unicode codes for each character), and end up with a unique representation of the entire program as a very large, but still finite, integer, in a procedure known as Gödel numbering.

Thus, the infinity of the number of possible computer programs is the same as the infinity of the integers; and since every computable number has (by definition) a program for computing it, that determines the cardinality of the computable numbers as well.

Infinite Constructions: Not An Algorithm?

There is an infinity of integers. There is also an infinity of rational numbers (ratios of two integers). Are the two infinities the same?

The well-known answer is yes, which comes as a surprise to some. The proof involves writing out an infinite list of the rationals in some well-defined order, and demonstrating that any given rational will occur sooner or later within this list. Since each item in the list can be given an integer index starting from one, this shows that the cardinality of the rationals is the same as that as that of the integers.

To be more specific, first let us just consider the positive rationals, and show that these can be mapped one-to-one onto the positive integers. Since for every positive rational there is a corresponding negative rational that can be produced just by changing the sign of the numerator (or the denominator), and the same is true for every positive integer, proving the case just for the positive ones also proves it for the negative ones.

How do we produce an ordering of the rationals? There is in fact no algebraic formula for doing this; coming up with an appropriate construction requires a bit of lateral thinking.

One way to do it is to interleave the decimal digits of the numerator and denominator, encoding them into a single integer that way. That is, if the digits of the numerator are

...a₃a₂a₁a₀

and those of the denominator are

...b₃b₂b₁b₀

Then we re-encode these as the single integer

...a₃b₃a₂b₂a₁b₁a₀b₀

In all cases, the progression off to the left eventually runs out of nonzero digits and just becomes an endless series of repeating zeroes. Since this is always true of the original a_n and b_n sequences, it is always true of their combination as well.

And then, we write out the rationals in order of the integer value of their encoded form.

So, the sequence of positive rationals, encoded in this way, looks in part like this:

1/1 ⇔ 11
1/2 ⇔ 12
...
1/9 ⇔ 19
2/1 ⇔ 21
2/3 ⇔ 23
...
1/10 ⇔ 110
1/11 ⇔ 111
...
10/1 ⇔ 1001
...

Note that there are gaps in the encoding, where there are no such rationals (division by zero), or because the ratio is not in its lowest terms (2/2 is the same as 1/1, so the encoding 22 is not used). But this doesn’t matter, since the encoding is only used for sorting, and the actual correspondence to positive integers is taken from the index of the item in the list.

Also note that, besides the positive and negative rationals, this encoding represents zero as 0/1 ⇔ 1.

So, given any rational number, to find its place in the list, first flip its sign if it is negative, and ensure it is reduced to its lowest terms. Interleave the decimal digits of numerator and denominator; since these are always finite integers, the result will always be a finite integer. If the resulting integer is n, then (because of the gaps in the encoding) the result will always be found before the nth position in the list.

Is the construction of the list an algorithm? No, because the construction never terminates. Or perhaps it can be viewed as an algorithm in this way: given any integer or any rational, it will find their place in the list after a finite number of steps. There is no algorithm for constructing the whole list, but there is one for constructing any given element on it!

When Are Two Computable Numbers Equal?

How do you compare two numbers, to see if they’re the same?

Intuitively, this seems simple: compare corresponding pairs of digits; if a pair of digits is the same, keep going, but if they’re not, you’ve found a difference, therefore they’re not equal. If you’ve reached the end without finding a difference, then they are equal.

So how does this work for computable numbers in general? Answer: it doesn’t.

Of course, there are specific cases where this is possible. But in the general case, the comparison procedure is not guaranteed to terminate. Therefore, it cannot be considered to be an algorithm.

There is a term for questions for which no algorithm exists to answer them: they are called undecidable.

Which brings us to a conclusion that some may find surprising:

Comparing two computable numbers for equality is, in general, undecidable.

The Cantor Diagonal Construction

Now, let us consider Georg Cantor’s proof of the size of the set of the integers (commonly denoted ℵ₀) versus that of the reals (commonly denoted ℵ₁). Both are infinities, but is one infinity more infinite than the other?

First of all, as we saw with π above, a real number in general requires an infinity of (not necessarily repeating) decimal places in order to represent it exactly. As was known to the ancient Greeks, reals include both rational numbers (representable as the ratio of two integers), and irrational ones (not so representable). So trying to represent a real as a ratio of two numbers does not get us out of the need for infinite decimal places.

The proof of the inequality of the two infinities is based on the principle of reductio ad absurdum: start by assuming the opposite of what we are trying to prove, and show that, by combining that assumption with well-established statements, we get a contradiction. Since those well-established statements are (presumably) not the source of the contradiction, it must come from our initial assumption. Therefore the assumption must be wrong.

Cantor’s proof is supposed to show that ℵ₁ > ℵ₀. So to prove this, he started by assuming that ℵ₁ = ℵ₀. If this was true, then it should be possible to put all the reals in a list in a one-to-one-correspondence with the integers.

Let us consider just trying to make a list of all the reals greater than or equal to zero, but less than one: it turns out that proving (or disproving) the theorem for this case also proves (or disproves) it for all reals. It makes it easier to visualize the list, because the sequence of infinite decimal digits only extends off to the right. Suppose the list looks like this:

0.Xxxxxxxxxxxxx...
0.xXxxxxxxxxxxx...
0.xxXxxxxxxxxxx...
0.xxxXxxxxxxxxx...
...

where each “x” or “X” represents a decimal digit. Remember, the assumption (which we are to disprove) is that every real x such that 0 ≤ x < 1 occurs somewhere in this ordered list.

Notice that the capital “X” characters lie on a diagonal line. This is key to Cantor’s proof. What we do is construct a real number as follows:

The first digit after the decimal point is something different from the capital “X” in the first number in the above list.
The second digit after the decimal point is something different from the capital “X” in the second number in the above list.
And so on for each subsequent digit.

To come up with the different digit each time, we could use a simple formula like adding 1 to the digit X modulo 10, so that adding 1 to 9 gives 0.

What we end up with is a new real number that is different in at least one digit from every number in the list. Therefore it is not in the list. But the existence of this number contradicts our assumption, that the list contains every real number!

And that is Cantor’s proof, in a nutshell.

So what happens if we add our newly-constructed number to the list? Well, we can repeat the proof, and come up with a new number that is not already in the list.

And so on and so on.

Does that sound pretty foolproof to you? It did to me, until I realized ...

Finding the number that is not in the list takes an infinite number of steps, while adding it to the list only takes one step.

That is to say, the proof is conventionally summarized as “no matter how many numbers you add to the list, I can always find a new one that isn’t there”. But that could be flipped on its head: “no matter how many times you find a number that isn’t in the list, I can always add it to the list”.

In a game of infinite construction versus counter-construction, who gets the last word? What about infinite construction versus finite counter-construction?

Further consider: can Cantor’s diagonal construction be described as an “algorithm” for finding a new real number that is not already on the list? Well, you can certainly say that it can determine any given digit of that number after a finite number of steps. So it certainly converges to the new number.

But an exact equality (or inequality) comparison can only complete when all the digits have been compared. So the proof that the number is not on the list can only happen after all its digits have been computed! (As we saw previously, this is in general an undecidable question.) Therefore you can only say that the construction converges towards proving that the number is not on the list — it can never actually get there.

To put it another way, after stage n of the construction, which involves looking at the first n numbers on the list, we have computed the corresponding first n digits of the new number. But there is always the chance that these first n digits will exactly match the first n digits of some number later in the list.

In fact, given that there is always an infinitude of numbers remaining in the list, the chance that our so-far-constructed number will not match some later item in the list is always effectively zero.

Maybe you don’t like an argument based on probabilities, even probabilities of exactly 1 (will always occur) and 0 (will never occur). Maybe you would prefer an argument where the list is constructed so that we can always say that those first n digits will provably match some item within an initial finite part of the list whose size is given by some integer function of n.

Let us consider how to construct such a list. Write out the whole numbers from 0 in increasing order, then flip the digits of each one round, so that the digit from the 10⁰ place goes to the 10^-1 place, 10¹ to 10^-2, 10² to 10^-3, and in general 10ⁿ to 10^-n-1:

0.0000000000000...
0.1000000000000...
0.2000000000000...
0.3000000000000...
...
0.9000000000000...
0.0100000000000...
0.1100000000000...
0.2100000000000...
0.3100000000000...
...
0.9100000000000...
0.0200000000000...
...

(Actually, this ordering is flawed, because it leaves out whole swathes of the reals. Yet this does not actually make the Cantor construction any more effective.)

Now when we perform the Cantor diagonal construction, when we choose a digit in the 10^-1 place different from the first number in the list, the new number looks like it could match some item in the list within 10 items of that first number, since all possible values of that digit are to be found there, and we have yet to draw a distinction on any remaining digits. Then when we choose the digit in the 10^-2 place different from the second number in the list, the new number looks like it could match some item in the list within 100 items of that second number, since all possible values of those two digits are to be found there, and again, we haven’t drawn a distinction on any of the remaining digits yet.

So at every stage of the construction, the digits chosen so far do not distinguish the new number from those already at determinable positions in the list (albeit further and further down the list). The new number only becomes provably distinct from any one already in the list when the construction completes. But the construction never completes!

So, Cantor’s construction cannot be seen as a proof that ℵ₁ > ℵ₀ at all.

Of course, the construction of the list in the first place is an infinite construction. You could express it as an algorithm which, given the number i of an item in the list, and the position j of the decimal digit for that item, computes that decimal digit in a finite time.

This algorithm would give us a convergence to the value of any item in the list: fix the value of i to determine the list item, and compute its successive digits for ever-increasing j.

We can also try to use it the other way, where we fix j and compute the digits for ever-increasing i, except this does not “converge” to anything, since the values of the digits we are computing never get smaller and smaller.

Either way, it’s clear Cantor’s construction cannot complete until the construction of the list itself is complete. Which takes an infinite number of steps.

Or perhaps you want to look at Cantor’s construction as proceeding in tandem with the construction of the list. That is, as item n is being added to the list, once you get as far as computing digit n of that item, you can correspondingly perform the computation of digit n of the candidate Cantor number, the one that is (supposedly) not in the list. And every single part of this process only requires a finite number of steps.

Does that procedure look familiar? It is in fact the same form as the procedure for calculating a computable number, with error bound ε = 10^1-n!

Therefore, the Cantor number is always a computable number! But the cardinality of the set of computable numbers is only ℵ₀, same as the integers. Once you look at the Cantor construction in this way, you realize that it in no way proves that the cardinality of the reals is greater than ℵ₀.

Let us recap the main points of the argument:

Any kind of one-to-one correspondence between two sets (including infinite sets) has to be expressed as an algorithm.
The only kinds of numbers that can be computed by algorithms are the computable numbers.
The cardinality of the computable numbers is ℵ₀.
By definition, it is impossible to write down any number that is not computable.
The Cantor construction can never produce any number that is not computable: that is, not a member of a known existing set with cardinality ℵ₀.
Therefore, the Cantor construction fails to prove the existence of any member of a set with cardinality ℵ₁ > ℵ₀ that is not also in a subset of that set with cardinality ℵ₀.

Implications Of Cantor’s Non-Proof

Do my arguments above prove conclusively that ℵ₁ = ℵ₀? I’m not sure. If it doesn’t, then I suspect the question of their equality is undecidable. Elsewhere it has been shown that the question of whether there are other transfinite numbers between ℵ₀ and ℵ₁ (assuming they were distinct) is also undecidable. This means you can have additional axioms in your arithmetic that say it is so (or not so), and your mathematical theory remains just as consistent as it was before.

Some scientists have been contemplating the notion that the laws of physics are somehow founded on algorithmic computations, that the very Universe itself operates like a computer. One stumbling block has been the notion of real numbers as being (thanks to Cantor) a much bigger set than the numbers that computers can compute. But if this notion goes away, then we can conclude that all the numbers we already deal with in physics (both theoretically and practically) are in fact computable numbers.

In other words, we don’t need to seek replacements for our existing physical laws as formulated in terms of the real numbers, and try to rework them in terms of some set of “computable” numbers, since the “real” and “computable” sets are one and the same — and we can take those existing formulas straight across to a computational model of physics!

Created 2021 November 26, last modified 2022 October 28 by Lawrence D’Oliveiro.