Alan Turing was a Computer Science pioneer whose brilliant and tragic life has been the subject of books, plays and films – most recently The Imitation Game with Benedict Cumberbatch. Turing and others were excited by the possibility of Artificial Intelligence (AI) from the outset, in Turing’s case at least from 1941. In his 1950 paper Computing Machinery and Intelligence, Turing proposed his famous test where, roughly put, a machine would be deemed “intelligent” if it could pass for a human in an interactive session he called “The Imitation Game” – whence the movie title. (Futurologist Ray Kurzweil has predicted that a machine will pass the Turing Test by the year 2029.)
Historically, the practice of war has hewn closely to developments in technology. And warfare, in turn, has made demands on technology. Indeed, even men of genius like Archimedes and Leonardo da Vinci developed weapons systems. However, the relationship between matters military and matters technological became almost symbiotic with WWII. Technological feats such as radar, nuclear power, rockets, missiles, jet planes and the digital computer are all associated with the war efforts of the different powers of that conflict. Certainly, the fundamental research behind these achievements was well underway by the 1930s but the war determined which areas of technology should be prioritized, thereby creating special concentrations of brilliant scientific talent. The Manhattan Project itself is studied as a model of large scale R&D; furthermore, the industrial organization of the war period and military operations such as countering submarine warfare gave rise to a new mathematical discipline, aptly called Operations Research, which is now taught in Business Schools under the name Management Science.
In his masterful treatise War in the Age of Intelligent Machines (1991), Manuel DeLanda summarizes it thusly: “The war … forged new bonds between the military and scientific communities. Never before had science been applied at so grand a scale to such a variety of warfare problems.”
However, the reliance on military funding may now be skewing technological progress, leading it in less fruitful directions than capitalism or science-as-usual would on their own. Perhaps this is why, instead of addressing the environmental crisis, post-WWII technological progress has perfected drones and fueled the growth of organizations such as the NSA and its surveillance prowess. In the process, it has created surveillance capitalism: our personal behavioral data are amassed by AI-enhanced software – Fitbit, Alexa, Siri, Google, Facebook, … ; the data are analyzed and sold for targeted advertising and other feeds to guide us in our lives; this is only going to get worse as the internet of things puts sensors and listening devices throughout the home and machines start to shepherd us through our day – a GPS for everything.
All that said, since WWII the US Military has been a very strong supporter of research into AI; in particular funding has come from the Defense Advanced Research Projects Agency (DARPA). It is worth noting that one of their other projects was the ARPANET which was once the sole domain of the military and research universities; this became the internet of today when liberated for general use and eCommerce by the High Performance Computing and Communications Act (“Gore Act”) of 1991.
The field of AI was only founded formally in 1956, at a conference at Dartmouth College, in Hanover, New Hampshire, where the term Artificial Intelligence itself was coined.
Following a scheme put forth by John Launchbury of DARPA, the timeline of AI can be broken into three parts. The 1st Wave (1950-2000) saw the development of three fundamental approaches to AI – one based on powerful Search Algorithms, one on Mathematical Logic and one on Connectionism, imitating the structure of neurons in the human brain. Connectionism developed slowly in the 1st Wave but exploded in the 2nd Wave (2000-2020). We are now entering the 3rd Wave.
Claude Shannon, a scientist at the legendary Bell Labs, was a participant at the Dartmouth conference. His earlier work on implementing Boolean Logic with electromagnetic switches is the basis of computer circuit design – this was done in his Master’s Thesis at MIT, making it probably the most important Master’s Thesis ever written. In 1950, Shannon published a beautiful paper, Programming a Computer for Playing Chess, which laid the groundwork for game-playing algorithms based on searching ahead and evaluating the quality of possible moves.
Fast Forward: Shannon’s approach led to the triumph in 1997 of IBM’s Deep Blue computer, which defeated reigning world chess champion Garry Kasparov in a match. And things have accelerated since – one can now run even more powerful programs on a laptop.
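The idea of searching ahead and scoring the resulting positions is usually presented today as minimax search. What follows is only a minimal sketch on a toy game tree, not a chess program: the nested lists, the leaf scores and the function name minimax are all made up for illustration. Leaves stand for evaluated positions; the player to move picks the best score for itself while assuming the opponent will then pick the worst.

```python
def minimax(node, maximizing=True):
    """Return the best score reachable from `node` with perfect play.

    A node is either a number (an already evaluated position) or a
    list of child nodes (the positions reachable in one move).
    """
    if isinstance(node, (int, float)):
        return node  # leaf: the evaluation function has already scored it
    # The turn alternates, so children are scored from the opponent's side.
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)


# Hypothetical two-move look-ahead: three candidate moves for us,
# each followed by two possible replies from the opponent.
toy_tree = [[3, 5], [2, 9], [0, 7]]
best_score = minimax(toy_tree)
print(best_score)
```

In this toy tree the opponent would answer the three candidate moves with 3, 2 and 0 respectively, so the first move is the one to play. A real chess engine adds a depth limit and pruning, because the full tree is far too large to explore.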
Known as the “first AI program”, Logic Theorist was developed in 1956 by Allen Newell, Herbert A. Simon and Cliff Shaw – Simon and Newell were also at the Dartmouth Conference (Shaw wasn’t). The system was able to prove 38 of the first 52 theorems from Russell and Whitehead’s Principia Mathematica and in some cases to find more elegant proofs! Logic Theorist established that digital computers could do more than crunch numbers, that programs could deal with symbols and reasoning.
With characteristic boldness, Simon (who was also a Nobel prize winner in Economics) wrote:
[We] invented a computer program capable of thinking non-numerically, and thereby solved the venerable mind-body problem, explaining how a system composed of matter can have the properties of mind.
Again with his characteristic boldness, Simon predicted in 1957 that computer chess programs would outperform humans within “ten years” but that was wrong by some thirty years! In fact, “over-promising” has plagued AI over the years – but presumably all that is behind us now.
AI has also proved too attractive to researchers and companies. For example, at Xerox PARC in the 1970s, the computer mouse, the Ethernet and WYSIWYG editors (What you see is what you get) were invented. However, rather than commercializing these advances for a large market as Apple would do with the Macintosh, Xerox produced the Dandelion, a $50,000 workstation designed for work on AI by elite programmers.
The Liar’s Paradox (“This statement is false”) was magically transformed into the Incompleteness Theorem by Kurt Gödel in 1931 by exploiting self-reference in systems of mathematical axioms. With Turing Machines, an algorithm can be the input to an algorithm (even to itself). And indeed, the power of self-reference gives rise to variants of the Liar’s Paradox that become theorems about Turing machines and algorithms. Thus, the only algorithm for telling how long an algorithm or program will run will come down to running the program; and, be warned, it might run forever and there is no sure way you can tell that in advance.
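The self-reference trick can be acted out in a few lines. This is only an illustrative sketch, not a proof: the names make_contrary and pessimist are hypothetical, and pessimist stands in for any claimed "will it halt?" checker. Given such a checker, we build a program that asks the checker about itself and then does the opposite.

```python
def make_contrary(halts):
    """Build a program that does the opposite of what `halts` predicts for it."""
    def contrary():
        if halts(contrary):
            while True:  # predicted to halt, so loop forever instead
                pass
        return "halted"  # predicted to run forever, so stop at once
    return contrary


def pessimist(program):
    """A (hypothetical) checker that claims no program ever halts."""
    return False


contrary = make_contrary(pessimist)
prediction = pessimist(contrary)  # the checker says: "runs forever"
outcome = contrary()              # ...yet the program stops immediately
print(prediction, outcome)
```

A checker that answered True instead would be wrong in the other direction, since contrary would then loop forever (which is why that case is not run here). Whatever the checker says about contrary, the program is built to contradict it.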
In a similar vein, it turns out that the approach through Logic soon ran into the formidable barrier called Combinatorial Explosion where all possible algorithms will necessarily take too long to reach a conclusion on a large family of mathematical problems – for example, there is the Traveling Salesman Problem:
Given a set of cities and distance between every pair of cities, the problem is to find the shortest possible route that visits every city exactly once and returns to the starting point.
This math problem is not only important to salesmen but also to the design of circuit boards, to DNA sequencing, etc. Again, the impasse created by Combinatorial Explosion is not unrelated to the limitations in Mathematics and Computer Science uncovered by Gödel and Turing.
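To make the explosion concrete, here is a minimal brute-force sketch of the Traveling Salesman Problem. It simply tries every ordering of the cities, which is the naive approach and not what practical solvers do. The four-city distance table and the function names are invented for illustration.

```python
from itertools import permutations
from math import factorial


def shortest_tour(dist):
    """Try every round trip that starts and ends at city 0."""
    n = len(dist)
    best_len, best_tour = float("inf"), None
    for rest in permutations(range(1, n)):
        tour = (0,) + rest + (0,)
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if length < best_len:  # strict "<" keeps the first best tour found
            best_len, best_tour = length, tour
    return best_len, best_tour


def tours_to_check(n_cities):
    """Number of orderings the brute-force search must examine."""
    return factorial(n_cities - 1)


# Hypothetical symmetric distances between four cities.
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
best_len, best_tour = shortest_tour(dist)
print(best_len, best_tour)
print(tours_to_check(4), tours_to_check(20))
```

With four cities there are only 6 orderings to try, but the count grows as a factorial: 20 cities already give 19! orderings, which is more than 10^17.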
Expert Systems are an important technology of the 1st Wave; they are based on the simplified logic of if-then-rules:
If it’s Tuesday, this must be Belgium.
As the rules are “fired” (applied), a database of information called a “knowledge base” is updated, making it possible to fire more rules. Major steps in this area include the DENDRAL and the MYCIN expert systems developed at Stanford University in the 1960s and 1970s.
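The firing loop just described is commonly called forward chaining. Below is a minimal sketch of it, assuming each rule is simply a list of conditions plus one conclusion. The rule contents and the name forward_chain are invented for illustration and are not taken from DENDRAL or MYCIN.

```python
def forward_chain(facts, rules):
    """Fire if-then rules until no rule can add anything new."""
    known = set(facts)  # the "knowledge base"
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all its conditions are already known.
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                changed = True  # new fact: other rules may now fire
    return known


# Hypothetical rules; the second is listed first on purpose, to show
# that it can only fire after the Tuesday rule has updated the base.
rules = [
    (("in_belgium", "hungry"), "order_waffles"),
    (("tuesday",), "in_belgium"),
]
result = forward_chain({"tuesday", "hungry"}, rules)
print(sorted(result))
```

The first pass can only fire the Tuesday rule; the waffle rule fires on the second pass, once "in_belgium" has been added to the knowledge base.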
A problem for MYCIN, which assisted doctors in the identification of bacteria causing infections, was that it had to deal with uncertainty and work with chains of propositions such as:
“Presence of A implies Condition B with 50% certainty”
“Condition B implies Condition C with 50% certainty”
One is tempted to say that presence of A implies C with 25% certainty, but (1) that is not mathematically correct in general and (2) if applied to a few more rules in the chain that 25% will soon be down to an unworkable 1.5%.
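The arithmetic behind point (2) is easy to check. The sketch below shows only the naive multiplication that the text warns against, not how MYCIN actually combined its certainties, and the function name is made up.

```python
def chained_certainty(certainties):
    """Naively multiply the certainty of each link in a chain of rules."""
    product = 1.0
    for c in certainties:
        product *= c
    return product


two_links = chained_certainty([0.5, 0.5])  # A -> B -> C
six_links = chained_certainty([0.5] * 6)   # four more rules in the chain
print(two_links, six_links)
```

Two links give 0.25 and six links give 0.015625, which is the roughly 1.5% figure mentioned above. As for point (1), the product is only a valid probability under extra assumptions, for instance that C can arise only through B and that B's effect on C does not depend on A.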
Still, MYCIN was right about 65% of the time, meaning it performed as well as the expert MDs of the time. Another problem came up, though, when a system derived from MYCIN was being deployed in the 1970s: back then MDs did not type! Still, this area of research led to the development of Knowledge Engineering Environments which built rules derived from the knowledge of experts in different fields – here one problem was that the experts (stock brokers, for example) often did not have enough expertise to encode to make the enterprise worthwhile, although they could type!
For all that, Rule Based Systems are widespread today. For example, IBM has a software product marketed as a “Business Rules Management System.” A sample application of this software is that it enables an eCommerce firm to update features of the customer interaction with its web page – such as changing the way to compute the discount on a product – on the fly without degrading performance and without calling IBM or having to recompile the system.
To better deal with reasoning and uncertainty, Bayesian Networks were introduced by UCLA Professor Judea Pearl in 1985 to address the problem of updating probabilities when new information becomes available. The term Bayesian comes from a theorem of the 18th century Presbyterian minister Thomas Bayes on what is called “conditional probability” – here is an example of how Bayes’ Theorem works:
In a footrace, Jim has beaten Bob only 25% of the time but of the 4 days they’ve done this, it was raining twice and Jim was victorious on one of those days. They are racing again tomorrow. What is the likelihood that Jim will win? Oh, one more thing, the forecast is that it will certainly be raining tomorrow.
At first, one would say 25% but given the new information that rain is forecast, a Bayesian Network would update the probability to 50%.
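The update from 25% to 50% can be worked through with Bayes' Theorem, P(win | rain) = P(rain | win) × P(win) / P(rain). The sketch below encodes the four races from the example as a small table; the table layout and variable names are assumptions made for illustration, and a real Bayesian Network would handle many interlinked variables rather than just two.

```python
from fractions import Fraction

# The four past races from the example: rain on two days, one Jim win (in rain).
races = [
    {"rain": True, "jim_wins": True},
    {"rain": True, "jim_wins": False},
    {"rain": False, "jim_wins": False},
    {"rain": False, "jim_wins": False},
]

wins = [r for r in races if r["jim_wins"]]
rainy = [r for r in races if r["rain"]]

prior = Fraction(len(wins), len(races))    # P(win)
p_rain = Fraction(len(rainy), len(races))  # P(rain)
p_rain_given_win = Fraction(sum(r["rain"] for r in wins), len(wins))  # P(rain | win)

# Bayes' Theorem: P(win | rain) = P(rain | win) * P(win) / P(rain)
posterior = p_rain_given_win * prior / p_rain
print(prior, posterior)
```

Here P(win) is 1/4, P(rain) is 1/2 and P(rain | win) is 1, since Jim's only victory came on a rainy day, so the posterior is 1/2.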
Reasoning under uncertainty is a real challenge. A Nobel Prize in economics was awarded to Daniel Kahneman in 2002 based on his work with the late Amos Tversky on just how ill-equipped humans are to deal with it. (For more on their work, there is Michael Lewis’s best-selling book The Undoing Project.) As with MYCIN, where the human experts themselves were only right 65% of the time, the work of Kahneman and Tversky illustrates that medical people can have a lot of trouble sorting through the likely and unlikely causes of a patient’s condition – these mental gymnastics are just very challenging for humans and we have to hope that AI can come to the rescue.
Bayesian Networks are impressive constructions and play an important role in multiple AI techniques including Machine Learning. Indeed Machine Learning has become an ever more impressive technology and underlies many of the success stories of Connectionism and the 2nd Wave of AI. More to come.