philosophy : Alan Turing's Computability Thesis, in his own words

Date: 20-Jun-2005/17:25

Characters: (none)

Alan Turing claimed that it was fundamentally impossible to create a machine which is more computationally powerful than the simplest mathematical model of a "computer" he could come up with. This means that if you've got any old PC on your desktop, no one can sell you something that's more powerful in terms of what problems it can theoretically solve (though a new computer might perform some tasks faster). Today we accept this radical idea of computational equivalence as a fundamental truth of computer science, and I refer to it as a good reason to be wary of those advertising "powerful" programming tools.

One of the best explanations I've ever seen of why this is true was written by Turing himself, and published in 1936. You don't find it many places on the web...and sites that publish it tend to go down (usually when the CS student who posted it graduates). So I'm republishing an excerpt as a mirror-site, in hopes that it will increase the odds that people get a chance to appreciate his framing of the idea.

While reading, bear in mind that in Turing's time, the word "computer" meant a person who made their living doing computation! When he wants to speak about what we call a "computer" today, he uses the word machine.

Computing is normally done by writing certain symbols on paper. We may suppose this paper is divided into squares like a child's arithmetic book. In elementary arithmetic the two-dimensional character of the paper is sometimes used. But such a use is always avoidable, and I think that it will be agreed that the two-dimensional character of paper is no essential of computation. I assume then that the computation is carried out on one-dimensional paper, i.e., on a tape divided into squares. I shall also suppose that the number of symbols which may be printed is finite. If we were to allow an infinity of symbols, then there would be symbols differing to an arbitrarily small extent.¹ The effect of this restriction of the number of symbols is not very serious. It is always possible to use sequences of symbols in the place of single symbols. Thus an Arabic numeral such as 17 or 999999999999999 is normally treated as a single symbol. Similarly in any European language words are treated as single symbols (Chinese, however, attempts to have an enumerable infinity of symbols). The differences from our point of view between the single and compound symbols is that the compound symbols, if they are too lengthy, cannot be observed at one glance. This is in accordance with experience. We cannot tell at a glance whether 9999999999999999 and 999999999999999 are the same.

The behavior of the computer at any moment is determined by the symbols which he is observing, and his state of mind at that moment. We may suppose that there is a bound B to the number of symbols or squares which the computer can observe at one moment. If he wishes to observe more, he must use successive observations. We will also suppose that the number of states of mind which need be taken into account is finite. The reasons for this are of the same character as those which restrict the number of symbols. If we admitted an infinity of states of mind, some of them will be arbitrarily close and will be confused. Again this restriction is not one which seriously affects computation, since the use of more complicated states of mind can be avoided by writing more symbols on the tape.

Let us imagine the operations performed by the computer to be split up into simple operations which are so elementary that it is not easy to imagine them further divided. Every such operation consists of some change of the physical system consisting of the computer and his tape. We know the state of the system if we know the sequence of symbols on the tape, which of these are observed by the computer (possible with a special order), and the state of mind of the computer. We may suppose that in a simple operation not more than one symbol is altered. Any other changes can be split up into simple changes of this kind. The situation in regard to the squares whose symbols may be altered in this way is the same as in regard to the observed squares. We may, therefore, without loss of generality, assume that the squares whose symbols are changed are always observed squares.

Besides these changes of symbols, the simple operations must include changes of distribution of observed squares. The new observed squares must be immediately recognisable by the computer. I think it is reasonable to suppose that they can only be squares whose distance from the closest of the immediately previously observed squares does not exceed a certain fixed amount. Let us say that each of the new observed squares is within L squares of an immediately previously observed square.

In connection with immediate recognisability, it may be thought that there are other kinds of squares which are immediately recognisable. In particular, squares marked by special symbols might be taken as immediately recognisable. Now if these squares are marked only by single symbols there can be only a finite number of them, and we should not upset our theory by adjoining these marked squares to the observed squares. If, on the other hand, they are marked by a sequence of symbols, we cannot regard the process of recognition as a simple process. This is a fundamental point and should be illustrated. In most mathematical papers the equations and theorems are numbered. Normally the numbers do not go beyond (say) 1000. It is, therefore, possible to recognise a theorem at a glance by its number. But if the paper was very long, we might reach Theorem 1577677334377; then further on in the paper, we might find ...hence (applying Theorem 1577677334377) we have .... In order to make sure which was the relevant theorem we should have to compare the two numbers figure by figure, possible ticking the figures off in pencil to make sure of their not being counted twice. If in spite of this it is still thought that there are other immediately recognisable squares, it does not upset my contention as long as these squares can be found by some process of which my tape machine is capable.

The simple operations must therefore include:

Changes of the symbol on one of the observed squares.

Changes of one of the squares observed to another square within L squares of one of the previously observed squares.

It may be that some of these changes necessarily involve a change of state of mind. The most general single operation must therefore be taken to be one of the following:

A possible change (a) of symbol together with a possible change of state of mind.

A possible change (b) of observed squares, together with a possible change of state of mind.

The operation actually performed is determined, as has been suggested (above) by the state of mind of the computer and the observed symbols. In particular, they determine the state of mind of the computer after the operation.

We may now construct a machine to do the work of this computer. To each state of mind of the computer corresponds an internal state of the machine. The machine scans B squares corresponding to the B squares observed by the computer. In any move the machine can change a symbol on a scanned square or can change any one of the scanned squares to another square distant not more than L squares from one of the other scanned squares. The move which is done, and the succeeding configuration, are determined by the scanned symbol and the internal state. The machines just described do not differ very essentially from computing machines as defined (previously) and corresponding to any machine of this type a computing machine can be constructed to compute the same sequence, that is to say the sequence computed by the computer.

Our definition is the now-customary one with B = 1 and L = 1. That is, there is one scanned square and the head changes position by at most one in a single move.

¹ If we regard a symbol as literally printed on a square we may suppose that the square is 0 < x < 1, 0 < y < 1. The symbol is defined as a set of points in this square, viz. the set occupied by printers ink. If these sets are restricted to be measurable, we can define the distance between two symbols as the cost of transforming one symbol into the other if the cost of moving a unit area of printers ink unit distance is unity, and there is an infinite supply of ink at x = 2, y = 0. With this topology the symbols form a conditionally compact space.

Turing, Alan M. (1936), On computable numbers, with an application to the Entscheidungsproblem, Proceedings of the London Mathematical Society, Ser. 2-42, 230-265.

If you would like to see one of these hypothetical machines in action, there is at least one online Turing machine simulator. It boggles the mind to think that this convoluted means of programming has been proven to be able to solve Rubik's Cubes, play chess, or do the 3-D rendering in Halo. But it can!

Assertions Parameterized by Location

"Freedom To" and "Freedom From" in Software Architecture