Early versions of the proof below were rejected from Godel's incompleteness theorems. I hope they will find a home either there or on more technical pages.
The theorems talk about a formal system S, which is sufficiently strong. The precise assumptions on S are as follows:
1. S is capable of stating any theorem about the memory state of a computer program. Each clock cycle of a computer takes a certain memory state M to M'=f(M), where the memory contents M can be viewed as a big integer and f is a simple computable function which does one processor step. The theorems that S should be able to state are of the form "for all N, f iterated N times on M does not have the k-th bit zero".
2. S will prove that the value of f(f(f(...f(M))))=M' where M is any initial memory state and M' is the state after a finite number N of steps. For the conclusions of the corollary, it is also necessary to assume that S can prove property 2 is true of itself, S needs to prove that S proves these theorems for all N and some special M. In arithmetic this holds when S has at least enough mathematical induction to prove theorems with one (forall X) in front.
3. S is consistent, meaning it never proves a statement and its negation.
Theorem 1: There are true theorems about the asymptotic behavior of computer programs that S cannot prove.
Corollary: S cannot prove its own consistency.
Theorem 2: S is incomplete, meaning there is a statement that it cannot prove or disprove.
Godel proved that there is an explicit algorithm to find all the consequences of a set of axioms. This algorithm can, in modern language, be represented as a computer program running on an infinite memory computer. Using the modern notions of computability, it is then easy to prove Godel's incompleteness theorem from basic ideas in computer science.
1. Quining-- any computer program can include a subroutine that prints out the entire program's code. This allows any program to write itself into a string variable, or a large enough bigint.
2. The
Halting problem: There does not exist a computer program PREDICT(P) which takes the code to program P and predicts whether P eventually halts.
The incompleteness theorem: Suppose that an axiom system describes integers (or any other discrete structure), and that it has enough operations (addition and multiplication are sufficient) to describe the working of a computer. This means that the axioms will eventually prove all theorems of the form "Starting with memory contents X, after N steps of running, the memory of the computer will be Y", for any fixed integers X and N, and can state propositions of the form "For all N, the contents of the memory will have property P", where P is some fixed definite property, like "The first bit in memory is zero" or "the tenth bit is different then the third bit".
Such an axiom system is either inconsistent or incomplete.
To prove the second incompleteness theorem, note that if the axioms are consistent, it is easy to prove that DEDUCE does not halt. This means that if a consistent axiom system proves its own consistency then it will also prove that DEDUCE does not halt, which is impossible. [1] [2].
Below is an incomplete short discussion of some essential results of recursion theory. The point of this is to try to contribute simple proofs of key results to the following pages which are either non-existent or with imprecise mathematical content.
The language is that of computer scientists.
Recursion theory is the study of computer programs on an idealized computer with potentially infinite memory. The computer programs should be thought of as written in C on a Random Access Machine, with pointers to memory locations, as in a real computer. To describe a computer with potentially infinite memory, you can imagine that there is a primitive processor instruction MOORE, which doubles the size of pointers and maximizes the adressable memory on demand.
A Random Access Machine can be constructed from a Turing machine, at the cost of increasing the running time of a program. The crucial difference is that a Turing machine must scan its memory tape sequentially, while a Random Access Machine can jump from point to point.
One of the more important operations in recursion theory is printing your own code. A program that does this is called a quine. In C, a simple quine subroutine is
print_me(){s="print_me(){s=%c%s%c;printf(s,34,s,34);}";printf(s,34,s,34);}
This immediately implies that any C program can append a subroutine to print itself out. The proof is by construction:
For simplicity, the old code was assumed quote-free. If it has quotes, replace the quotes with %c in the block, and replace the puts(A) with a printf(A,34...), with as many 34's as there are quotes.
A decision problem is a question about computer programs. A computer program can also be viewed as a big integer, its code in binary, so a decision problem is equivalently a question about integers. The problem is to determine whether the computer program has some property.
For many properties, the question of whether the program has the property or does not have the property is undecidable, meaning that it is impossible to write a computer program to find the answer.
The most famous example is when the property is The computer program halts. If x is a bigint, the program
doesn't halt, it goes on forever in an infinite loop, while the program
halts after one line.
It is impossible to write a computer program which tells you which programs halt and which ones do not.
Similiarly, it is impossible to write a program to decide any of the following questions:
The halting set K is the set of all integers which are the ASCII for the computer programs that halt.
To eventually determine which programs are in K, write a program HALT to list all programs and, as each one comes out, run it on a separate thread. Keep track of which threads halt. This will produce a list of the programs in K, but it cannot produce a list of programs not in K. To correctly list all the programs that do not halt, the program must run for an infinite amount of time.
A computable function is a subroutine which takes an integer and returns an integer, where an integer can be arbitrarily large, a bigint, not a finite width int. Every subroutine defines a map, or mathematical function, from the integers to the integers. The C declaration of a computable function F is
A computable function is total if it returns a value for every argument. It is partial if it only returns a value for certain arguments and for others it goes into an infinite loop. The collection of all total functions is countable, since there are only countably many computer programs. Cantor's diagonal argument then explicitly produces a predicative description of a non-computable function.
The non-computable function described by Cantor is the function F(R) which is defined over all R which are the ASCII description of a total computable function. The value of F(R) is defined as:
It is easy to see that F(R) is not computable, because if there were a computable function S which described F, whose ASCII code will also be called S, the value of F at every n would be S(n). The value of F at S would therefore have to be S(S), but it is S(S)+1 by construction.
This type of construction might seem naively to be translatable into a computer code for calculating F. It isn't, precisely because determining when R is the ASCII code for a total function requires knowing whether R halts for every value of its argument, which is undecidable. The undecidability theorem was first discovered by A. Turing by a close examination of the effective content of Cantor's diagonal argument applied to all computable real numbers.
The values of the function F(R), which is defined by a predicative description, cannot be calculated by a computer program.
It is possible to express any computer program as a computable function, since the memory of the entire computer can be thought of as an integer, and the program itself a suitable modification of that integer. In principle then, none of these results require explicit code in C. All computable functions can be defined by iterating and composing more primitive functions. This is the traditional way to express recursion theory arguments.
The proof of undecidability provides a running time bound on programs which halt. There are programs of length M bytes which only halt after running for a time which is longer than any computable function of M.
This proof can be turned into an explicit construction of a program which explicitly violates any assumed running time bound.
This bound was first formulated by Gödel as a theorem about lengths of proofs in an axiomatic system.
Gödel's Incompleteness theorem is about axiomatic systems in first-order logic. Since axiomatic systems can be identified with the programs that list all deductions of these systems, the theorem can be alternately stated as theorems about deduction programs.
The precise assumptions on the axiom system S are the following:
Assuming that the axioms are consistent, notP is the true option--- ROSSER is not going to halt, so it isn't going to print anything out. Likewise, if S is consistent DEDUCE does not halt.
This immediately suggests a corollary:
The corollary requires an additional assumption, which is that the argument can be formalized within S. This requires that:
Property 4 is satisfied in arithmetic with induction limited to statements with one quantifier.
The running time bound proved for programs translates into a statement about the lengths of proofs. Gödel established two theorems about proof-lengths in 1934.
For any theorem P, define L(P) as the length of the shortest proof of P. Redundantly define L(m) to also be the largest value of L(P) for P whose ASCII is m bytes or less. L(m) is the length of the proof of the statement of length <m which is easiest to state and hardest to prove.
A theorem whose statement is short but whose shortest proof is long will be called hard. The hardness of a theorem is the value of L(m) ( or L(m)-m, or L(m)/m, or any other minor computable modification).
This proof can be turned into an explicit construction of a theorem of S with a very long proof.
S will prove that NDEDUCE returns zero, but if S is consistent, S cannot prove NDEDUCE returns 0 with a proof which is shorter than F(N) bytes, where N is the length of the statement.
This theorem establishes rigorously that there are theorems which are easy to state but arbitrarily hard to prove in any mathematical system.
This corollary is a special case of the second Gödel theorem on the length of proofs.
Assume that S has an extension to S+A, where A is an additional axiom. Suppose that S+A is stronger than S, meaning that there is at least one program LOOP for which S+A proves LOOP does not halt but S does not.
Note that:
The interpretation of this theorem is that adding stronger axioms can simplify proofs of already proven theorems by an arbitrarily great amount. With better axioms, many hard theorems become easy.
A Turing degree is defined by adding an idealized infinite CD-ROM to the computer. Such a CD-ROM is called an oracle. No matter what CD-ROM is given, the halting problem with the CD-ROM is still undecidable,as long as SPITE has the same access to the CD-ROM as PREDICT.
The blank oracle, which contains no data, is called 0. The oracle which contains a list of all the halting programs is called the halting oracle, and it is called 0'. When given access to 0', the halting problem for programs with oracle 0 is solved, but the halting problem for programs with oracle 0' is not.
Each oracle defines a turing degree. Two oracles are called Turing equivalent if the entire information on one CD-ROM can be printed out using the second one and vice-versa. If turing degree A can be printed out from turing degree B (but not conversely), then , or A is weaker than B, or A is Turing reducible to B.
The set of all oracles under Turing equivalence forms the collection of Turing degrees, and these are a partially ordered set of cardinality equal to that of the continuum. Given two Turing degrees A and B, one can interleave the CD-ROM's of the two to form .
Despite the fact that the Turing degrees are essentially just the set of real numbers, it is traditional to refer to the Turing degrees as a class. This distinction is entirely philosophical. If the set of all undecidable problems is included in the set of all real numbers, no finite axiom system could ever properly describe all the properties of an arbitrarily real number, let alone the set of all real numbers. Even though it is traditional in other branches of mathematics to treat the real numbers as a set, the real numbers in recursion theory are respectfully elevated to a proper class.
Sometimes the undecidable problem B can be solved given the answer to another undecidable problem A. When this happens, B is reducible to A. The strongest notion of reducibility is Turing reducibility, which means that the problem B can be solved once an unbounded number of answers to a question of type A are given. But there are weaker notions of reducibility.
Early versions of the proof below were rejected from Godel's incompleteness theorems. I hope they will find a home either there or on more technical pages.
The theorems talk about a formal system S, which is sufficiently strong. The precise assumptions on S are as follows:
1. S is capable of stating any theorem about the memory state of a computer program. Each clock cycle of a computer takes a certain memory state M to M'=f(M), where the memory contents M can be viewed as a big integer and f is a simple computable function which does one processor step. The theorems that S should be able to state are of the form "for all N, f iterated N times on M does not have the k-th bit zero".
2. S will prove that the value of f(f(f(...f(M))))=M' where M is any initial memory state and M' is the state after a finite number N of steps. For the conclusions of the corollary, it is also necessary to assume that S can prove property 2 is true of itself, S needs to prove that S proves these theorems for all N and some special M. In arithmetic this holds when S has at least enough mathematical induction to prove theorems with one (forall X) in front.
3. S is consistent, meaning it never proves a statement and its negation.
Theorem 1: There are true theorems about the asymptotic behavior of computer programs that S cannot prove.
Corollary: S cannot prove its own consistency.
Theorem 2: S is incomplete, meaning there is a statement that it cannot prove or disprove.
Godel proved that there is an explicit algorithm to find all the consequences of a set of axioms. This algorithm can, in modern language, be represented as a computer program running on an infinite memory computer. Using the modern notions of computability, it is then easy to prove Godel's incompleteness theorem from basic ideas in computer science.
1. Quining-- any computer program can include a subroutine that prints out the entire program's code. This allows any program to write itself into a string variable, or a large enough bigint.
2. The
Halting problem: There does not exist a computer program PREDICT(P) which takes the code to program P and predicts whether P eventually halts.
The incompleteness theorem: Suppose that an axiom system describes integers (or any other discrete structure), and that it has enough operations (addition and multiplication are sufficient) to describe the working of a computer. This means that the axioms will eventually prove all theorems of the form "Starting with memory contents X, after N steps of running, the memory of the computer will be Y", for any fixed integers X and N, and can state propositions of the form "For all N, the contents of the memory will have property P", where P is some fixed definite property, like "The first bit in memory is zero" or "the tenth bit is different then the third bit".
Such an axiom system is either inconsistent or incomplete.
To prove the second incompleteness theorem, note that if the axioms are consistent, it is easy to prove that DEDUCE does not halt. This means that if a consistent axiom system proves its own consistency then it will also prove that DEDUCE does not halt, which is impossible. [1] [2].
Below is an incomplete short discussion of some essential results of recursion theory. The point of this is to try to contribute simple proofs of key results to the following pages which are either non-existent or with imprecise mathematical content.
The language is that of computer scientists.
Recursion theory is the study of computer programs on an idealized computer with potentially infinite memory. The computer programs should be thought of as written in C on a Random Access Machine, with pointers to memory locations, as in a real computer. To describe a computer with potentially infinite memory, you can imagine that there is a primitive processor instruction MOORE, which doubles the size of pointers and maximizes the adressable memory on demand.
A Random Access Machine can be constructed from a Turing machine, at the cost of increasing the running time of a program. The crucial difference is that a Turing machine must scan its memory tape sequentially, while a Random Access Machine can jump from point to point.
One of the more important operations in recursion theory is printing your own code. A program that does this is called a quine. In C, a simple quine subroutine is
print_me(){s="print_me(){s=%c%s%c;printf(s,34,s,34);}";printf(s,34,s,34);}
This immediately implies that any C program can append a subroutine to print itself out. The proof is by construction:
For simplicity, the old code was assumed quote-free. If it has quotes, replace the quotes with %c in the block, and replace the puts(A) with a printf(A,34...), with as many 34's as there are quotes.
A decision problem is a question about computer programs. A computer program can also be viewed as a big integer, its code in binary, so a decision problem is equivalently a question about integers. The problem is to determine whether the computer program has some property.
For many properties, the question of whether the program has the property or does not have the property is undecidable, meaning that it is impossible to write a computer program to find the answer.
The most famous example is when the property is The computer program halts. If x is a bigint, the program
doesn't halt, it goes on forever in an infinite loop, while the program
halts after one line.
It is impossible to write a computer program which tells you which programs halt and which ones do not.
Similiarly, it is impossible to write a program to decide any of the following questions:
The halting set K is the set of all integers which are the ASCII for the computer programs that halt.
To eventually determine which programs are in K, write a program HALT to list all programs and, as each one comes out, run it on a separate thread. Keep track of which threads halt. This will produce a list of the programs in K, but it cannot produce a list of programs not in K. To correctly list all the programs that do not halt, the program must run for an infinite amount of time.
A computable function is a subroutine which takes an integer and returns an integer, where an integer can be arbitrarily large, a bigint, not a finite width int. Every subroutine defines a map, or mathematical function, from the integers to the integers. The C declaration of a computable function F is
A computable function is total if it returns a value for every argument. It is partial if it only returns a value for certain arguments and for others it goes into an infinite loop. The collection of all total functions is countable, since there are only countably many computer programs. Cantor's diagonal argument then explicitly produces a predicative description of a non-computable function.
The non-computable function described by Cantor is the function F(R) which is defined over all R which are the ASCII description of a total computable function. The value of F(R) is defined as:
It is easy to see that F(R) is not computable, because if there were a computable function S which described F, whose ASCII code will also be called S, the value of F at every n would be S(n). The value of F at S would therefore have to be S(S), but it is S(S)+1 by construction.
This type of construction might seem naively to be translatable into a computer code for calculating F. It isn't, precisely because determining when R is the ASCII code for a total function requires knowing whether R halts for every value of its argument, which is undecidable. The undecidability theorem was first discovered by A. Turing by a close examination of the effective content of Cantor's diagonal argument applied to all computable real numbers.
The values of the function F(R), which is defined by a predicative description, cannot be calculated by a computer program.
It is possible to express any computer program as a computable function, since the memory of the entire computer can be thought of as an integer, and the program itself a suitable modification of that integer. In principle then, none of these results require explicit code in C. All computable functions can be defined by iterating and composing more primitive functions. This is the traditional way to express recursion theory arguments.
The proof of undecidability provides a running time bound on programs which halt. There are programs of length M bytes which only halt after running for a time which is longer than any computable function of M.
This proof can be turned into an explicit construction of a program which explicitly violates any assumed running time bound.
This bound was first formulated by Gödel as a theorem about lengths of proofs in an axiomatic system.
Gödel's Incompleteness theorem is about axiomatic systems in first-order logic. Since axiomatic systems can be identified with the programs that list all deductions of these systems, the theorem can be alternately stated as theorems about deduction programs.
The precise assumptions on the axiom system S are the following:
Assuming that the axioms are consistent, notP is the true option--- ROSSER is not going to halt, so it isn't going to print anything out. Likewise, if S is consistent DEDUCE does not halt.
This immediately suggests a corollary:
The corollary requires an additional assumption, which is that the argument can be formalized within S. This requires that:
Property 4 is satisfied in arithmetic with induction limited to statements with one quantifier.
The running time bound proved for programs translates into a statement about the lengths of proofs. Gödel established two theorems about proof-lengths in 1934.
For any theorem P, define L(P) as the length of the shortest proof of P. Redundantly define L(m) to also be the largest value of L(P) for P whose ASCII is m bytes or less. L(m) is the length of the proof of the statement of length <m which is easiest to state and hardest to prove.
A theorem whose statement is short but whose shortest proof is long will be called hard. The hardness of a theorem is the value of L(m) ( or L(m)-m, or L(m)/m, or any other minor computable modification).
This proof can be turned into an explicit construction of a theorem of S with a very long proof.
S will prove that NDEDUCE returns zero, but if S is consistent, S cannot prove NDEDUCE returns 0 with a proof which is shorter than F(N) bytes, where N is the length of the statement.
This theorem establishes rigorously that there are theorems which are easy to state but arbitrarily hard to prove in any mathematical system.
This corollary is a special case of the second Gödel theorem on the length of proofs.
Assume that S has an extension to S+A, where A is an additional axiom. Suppose that S+A is stronger than S, meaning that there is at least one program LOOP for which S+A proves LOOP does not halt but S does not.
Note that:
The interpretation of this theorem is that adding stronger axioms can simplify proofs of already proven theorems by an arbitrarily great amount. With better axioms, many hard theorems become easy.
A Turing degree is defined by adding an idealized infinite CD-ROM to the computer. Such a CD-ROM is called an oracle. No matter what CD-ROM is given, the halting problem with the CD-ROM is still undecidable,as long as SPITE has the same access to the CD-ROM as PREDICT.
The blank oracle, which contains no data, is called 0. The oracle which contains a list of all the halting programs is called the halting oracle, and it is called 0'. When given access to 0', the halting problem for programs with oracle 0 is solved, but the halting problem for programs with oracle 0' is not.
Each oracle defines a turing degree. Two oracles are called Turing equivalent if the entire information on one CD-ROM can be printed out using the second one and vice-versa. If turing degree A can be printed out from turing degree B (but not conversely), then , or A is weaker than B, or A is Turing reducible to B.
The set of all oracles under Turing equivalence forms the collection of Turing degrees, and these are a partially ordered set of cardinality equal to that of the continuum. Given two Turing degrees A and B, one can interleave the CD-ROM's of the two to form .
Despite the fact that the Turing degrees are essentially just the set of real numbers, it is traditional to refer to the Turing degrees as a class. This distinction is entirely philosophical. If the set of all undecidable problems is included in the set of all real numbers, no finite axiom system could ever properly describe all the properties of an arbitrarily real number, let alone the set of all real numbers. Even though it is traditional in other branches of mathematics to treat the real numbers as a set, the real numbers in recursion theory are respectfully elevated to a proper class.
Sometimes the undecidable problem B can be solved given the answer to another undecidable problem A. When this happens, B is reducible to A. The strongest notion of reducibility is Turing reducibility, which means that the problem B can be solved once an unbounded number of answers to a question of type A are given. But there are weaker notions of reducibility.