Biological Evolution & Machine Learning Are Similar, Says Turing Award Winner Leslie Valiant

Can machine learning algorithms capture the complexity of the life that has evolved on Earth? Professor Leslie Valiant shares his views at the Global Young Scientists Summit 2016.

AsianScientist (Jan. 29, 2016) – If you are lucky enough to come across a first edition of Charles Darwin’s The Origin of Species, flip to the part where he talks about the Weald of Kent, a chalk deposit in southern England. There, you will find a piece of information that was removed from subsequent editions: Darwin’s estimate that it took 300 million years to erode the chalk to its present state.

That estimate turned out to be a lightning rod for controversy, attracting criticism from no less a figure than the physicist William Thomson, better known as Lord Kelvin. Arguing from thermodynamics, Lord Kelvin held that Darwin’s estimate was impossible and proposed that the Earth was at most about 100 million years old.

“This was a very serious contradiction. Darwin worried about this so much that he retracted any claim of numbers in later editions of the book,” said 2010 Turing Award winner Professor Leslie Valiant, speaking at the Global Young Scientists Summit 2016 (GYSS@one-north 2016), which was organized by the National Research Foundation of Singapore.

“What’s happened since is that the physicists have been good; they’ve expanded the age of the universe to about four billion years. But is this enough? I would suggest no, because we still don’t have a concrete, quantitative theory of how much time it takes for life as we know it to evolve.”

A theory you can count on

Take, for example, our DNA, which codes for about a billion amino acids, Valiant continued. Because there are about 20 different amino acids to choose from at each position, going through all the possible combinations to reach the particular sequence that makes each unique individual would take 20 to the power of 1,000,000,000 steps, an impossibly large number.
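With about 20 choices at each of roughly a billion positions, the number of possible sequences is 20 raised to the power of one billion. As a rough illustration of that scale, the short Python sketch below counts the decimal digits in such a number without ever constructing it. The function name digits_in_power and the constants are illustrative assumptions, not anything taken from Valiant's talk.

```python
import math


def digits_in_power(base: int, exponent: int) -> int:
    """Return the number of decimal digits in base**exponent.

    Uses logarithms, so the (astronomically large) power itself
    is never computed.
    """
    return math.floor(exponent * math.log10(base)) + 1


AMINO_ACIDS = 20           # approximate alphabet size quoted in the article
POSITIONS = 1_000_000_000  # approximate sequence length quoted in the article

# The count of possible sequences has roughly 1.3 billion digits.
print(digits_in_power(AMINO_ACIDS, POSITIONS))
```

For comparison, four billion years amounts to roughly 10^17 seconds, a number with only 18 digits, which is why an exhaustive search cannot be the mechanism.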

“You may say: ‘Well, this isn’t my theory of evolution.’ Fine, well what then is your theory of evolution? You need to specify as precisely as I did what you believe is the mechanism which results in evolution as we see it. The theory should also explain why it works as fast as it has,” Valiant said.

The problem with the common-sense understanding of evolution, he argued, is that it remains an idea with no quantitative foundation, even now, more than 150 years after it was first proposed.

“This gets more and more embarrassing as computer simulations successfully reproduce the effects that theories in other areas of science have predicted. Of course it is possible that there’s some good excuse why biology is different, but I don’t know of one,” Valiant said.

Learning from mistakes

For Valiant, who is well known in the field of artificial intelligence for developing the probably approximately correct (PAC) model of machine learning, the complexities of biology have striking parallels to his own discipline. In fact, he believes that viewing evolution through the lens of machine learning may finally place it on a quantitative footing.

In machine learning, computer programs are ‘trained’ to discover useful generalizations from large amounts of raw data, applying what they have learnt from examples to perform tasks ranging from translation to image recognition. They do this by generating hypotheses based on the initial training data and testing these hypotheses against real-life examples.

“What I want to do now is to persuade you that evolution is more similar to machine learning than one would have thought. The analogy I’ll make is very simple: the genome is the hypothesis, and the examples are experiences,” Valiant said. “As the algorithm evolves, it generates new hypotheses in the next generation, which can be thought of as offspring with random mutations in their DNA.”

“Although evolution seems very unsupervised, it also has the notion of correctness just like machine learning. The supervision is survival, providing feedback on whether each organism survives or not.”

Suppose that the ideal amount of coffee to drink is three cups a day, Valiant continued. Over many generations, the genes of people who drink 30 or 100 cups a day will disappear from the population, even though no one may know the ideal number or understand exactly why too much coffee is bad.

“But the fact that it is bad will be fed back into the evolutionary process and your evolutionary algorithm will be learnt. Put simply, our genomes will learn from our experiences,” he said.
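The coffee example can be sketched as a toy mutation-and-selection loop. This is only an illustration under stated assumptions, not Valiant's formal model: the survival curve, the mutation step of one cup, the population size and every function name below are hypothetical choices made for the sketch. The 'genome' here is simply a number of cups per day, and the only feedback each organism receives is whether it survives.

```python
import math
import random

IDEAL_CUPS = 3  # the ideal from the coffee example; no organism "knows" it


def survives(cups: int, rng: random.Random) -> bool:
    """Survival is the only feedback: odds fall as intake strays from the ideal."""
    # Hypothetical survival curve, chosen purely for illustration.
    return rng.random() < math.exp(-abs(cups - IDEAL_CUPS) / 10)


def next_generation(population: list[int], rng: random.Random) -> list[int]:
    """Keep the survivors, then refill the population with mutated offspring."""
    survivors = [cups for cups in population if survives(cups, rng)]
    if not survivors:
        survivors = population  # toy safeguard so the population never dies out
    return [
        # Each child copies a random survivor, with a small random mutation.
        max(0, rng.choice(survivors) + rng.choice([-1, 0, 1]))
        for _ in population
    ]


def evolve(population: list[int], generations: int, seed: int = 0) -> list[int]:
    """Run the mutation-and-selection loop for a fixed number of generations."""
    rng = random.Random(seed)
    for _ in range(generations):
        population = next_generation(population, rng)
    return population


def average(values: list[int]) -> float:
    return sum(values) / len(values)


if __name__ == "__main__":
    start = [30] * 100 + [100] * 100  # heavy drinkers only
    end = evolve(start, generations=300)
    print(f"average cups before: {average(start):.1f}")
    print(f"average cups after:  {average(end):.1f}")
```

In this sketch no organism is ever told that three cups is ideal; the population average tends to drift toward that value simply because lineages far from it leave fewer offspring.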

A framework at last?

Acknowledging that it may seem impossible to capture a phenomenon as complex as evolution with equations, particularly when variables such as phenotype, environment and fitness are not well understood, Valiant reminded the audience that the same criticism holds true for machine learning algorithms—and yet they work.

“Machine learning works in a great generality of applications even though there’s no knowledge of what makes things work; translating English to French, for example, without any understanding of either,” he said.

“The summary is that biological evolution is just a type of machine learning and the only problem is that the training data has been lost,” he concluded to laughter from the audience.

Asian Scientist Magazine is a media partner of the GYSS@one-north 2016.


Copyright: Asian Scientist Magazine.

Rebecca did her PhD at the National University of Singapore where she studied how macrophages integrate multiple signals from the toll-like receptor system. She was formerly the editor-in-chief of Asian Scientist Magazine.
