When studying biological evolution, we have to overcome a large obstacle: Evolution is extremely slow. Traditionally, evolutionary biology has therefore been a field dominated by observation and theory, even though some regard the domestication of plants and animals as early, unwitting evolution experiments. Realistically, we can carry out controlled evolution experiments only with organisms that have very short generation times, so that populations can undergo hundreds of generations within a time frame of months or years. With the advances in microbiology, such experiments in evolution have become feasible with bacteria and viruses [ElenaLenski2003, TravisanoRainey2000]. However, even with microorganisms, evolution experiments still take a lot of time to complete and are often cumbersome. In particular, some data can be difficult or impossible to obtain, and it is often impractical to carry out enough replicates for high statistical accuracy.
According to Daniel Dennett, “…evolution will occur whenever and wherever three conditions are met: replication, variation (mutation), and differential fitness (competition)” [Dennett2002]. It seems to be an obvious idea to set up these conditions in a computer, and to study evolution “in silico” rather than “in vitro”. In a computer, it is easy to measure any quantity of interest with arbitrary precision, and the time it takes to propagate organisms for several hundred generations is only limited by the processing power available. In fact, population geneticists have long been carrying out computer simulations of evolving loci, in order to test or augment their mathematical theories (see [HartlClark2006, KimStephan2003, McVeanCharlesworth2000, Nowak2006, Orr2000] for some examples). However, the assumptions put into these simulations typically mirror exactly the assumptions of the analytical calculations. Therefore, the simulations can be used only to test whether the analytic calculations are error-free, or whether stochastic effects cause a system to deviate from its deterministic description, but they cannot test the model assumptions on a more basic level.
An approach to studying evolution that lies somewhere in between evolution experiments with biochemical organisms and standard Monte-Carlo simulations is the study of self-replicating and evolving computer programs (digital organisms). These digital organisms can be quite complex and interact in a multitude of different ways with their environment or each other, so that their study is not a simulation of a particular evolutionary theory but becomes an experimental study in its own right. In recent years, research with digital organisms has grown substantially ([AdamiOfriaCollier2000, ComasMoyaGonzalez-Candelas2005, Egri-NagyNehaniv2003, GerleeLundh2005, LenskiOfriaPennock2003, MisevicOfriaLenski2006, WilkeWangOfria2001, YedidBell2001, YedidBell2002, ZhangTravisano2007], see [Adami2006, WilkeAdami2002] for reviews), and is being increasingly accepted by evolutionary biologists [ONeill2003]. (However, as Barton and Zuidema [BartonZuidema2003] note, general acceptance will ultimately hinge on whether artificial life researchers embrace or ignore the large body of population-genetics literature.) Avida is arguably the most advanced software platform to study digital organisms to date, and is certainly the one that has had the biggest impact in the biological literature so far. Having reached version 2.12, it now supports detailed control over experimental settings, a sophisticated system to design and execute experimental protocols, a multitude of possibilities for organisms to interact with their environment (including depletable resources and conversion from one resource into another) and a module to post-process data from evolution experiments (including tools to find the line of descent from the original ancestor to any final organism, to carry out knock-out studies with organisms, to calculate the fitness landscape around a genotype, and to align and compare organisms’ genomes).
History of Digital Life
The most well-known intersection of evolutionary biology with computer science is the genetic algorithm or its many variants (genetic programming, evolutionary strategies, and so on). All these variants boil down to the same basic recipe: (1) create random potential solutions, (2) evaluate each solution, assigning it a fitness value to represent its quality, (3) select a subset of solutions using fitness as a key criterion, (4) vary these solutions by making random changes or recombining portions of them, (5) repeat from step 2 until you find a solution that is sufficiently good.
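The five-step recipe above can be sketched in a few lines of code. The bit-counting fitness function and all parameter values below are illustrative choices of ours, not part of any particular system:

```python
import random

def one_max(genome):
    """Illustrative fitness function (our choice): count of 1-bits."""
    return sum(genome)

def evolve(pop_size=50, genome_len=32, mut_rate=0.02, generations=200, seed=1):
    rng = random.Random(seed)
    # (1) create random potential solutions
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # (2) evaluate each solution, assigning it a fitness value
        ranked = sorted(pop, key=one_max, reverse=True)
        # (5) stop once a solution is sufficiently good
        if one_max(ranked[0]) == genome_len:
            break
        # (3) select a subset of solutions using fitness as the key criterion
        parents = ranked[:pop_size // 2]
        # (4) vary: recombine portions of two parents, then make random changes
        pop = []
        while len(pop) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mut_rate else bit
                     for bit in child]
            pop.append(child)
    return max(pop, key=one_max)

best = evolve()
print(one_max(best))
```

Note that nothing in this loop replicates itself: the variation and selection steps are applied by an external procedure, which is precisely the contrast with natural systems drawn below.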
This technique turns out to be an excellent method for solving problems, but it ignores many aspects of natural living systems. Most notably, natural organisms must replicate themselves, as there is no external force to do so; therefore, their ability to pass their genetic information on to the next generation is the final arbiter of their fitness. Furthermore, organisms in a natural system have the ability to interact with their environment and with each other in ways that are excluded from most algorithmic applications of evolution.
Work on more naturally evolving computational systems began in 1990, when Steen Rasmussen was inspired by the computer game “Core War” [Dewdney1984]. In this game, programs are written in a simplified assembly language and made to compete in the simulated core memory of a computer. The winning program is the one that manages to shut down all processes associated with its competitors. Rasmussen observed that the most successful of these programs were the ones that replicated themselves, so that if one copy were destroyed, others would still persist. In the original Core War game, the diversity of organisms could not increase, and hence no evolution was possible. Rasmussen designed a system similar to Core War in which the command that copied instructions was flawed and would sometimes write a random instruction instead of the one intended [RasmussenKnudsenFeldberg1990]. This flawed copy command introduced “mutations” into the system, and thus the potential for evolution. Rasmussen dubbed his new program “Core World”, created a simple self-replicating ancestor, and let it run.
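The flawed copy command can be illustrated with a short sketch: with a small probability, a random instruction is written instead of the intended one. The 26-symbol alphabet, the mutation rate, and all names are our own illustrative choices, not Rasmussen's actual implementation:

```python
import random

# A toy instruction alphabet (an assumption for illustration only).
INSTRUCTION_SET = list(range(26))

def flawed_copy(genome, mut_rate=0.01, rng=random.Random(0)):
    """Copy a genome, occasionally substituting a random instruction."""
    return [rng.choice(INSTRUCTION_SET) if rng.random() < mut_rate else inst
            for inst in genome]

parent = [1, 2, 3, 4, 5] * 16          # an 80-instruction "program"
child = flawed_copy(parent)
print(sum(p != c for p, c in zip(parent, child)), "mutated positions")
```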
Unfortunately, this first experiment was only of limited success. While the programs seemed to evolve initially, they soon started to copy code into each other, to the point where no proper self-replicators survived—the system collapsed into a non-living state. Nevertheless, the dynamics of this system turned out to be intriguing, displaying the partial replication of fragments of code, and repeated occurrences of simple patterns.
The first successful experiment with evolving populations of self-replicating computer programs was performed the following year. Thomas Ray designed a program of his own with significant, biologically-inspired modifications. The result was the Tierra system [Ray1992]. In Tierra, digital organisms must allocate memory before they have permission to write to it, which prevents stray copy commands from killing other organisms. Death only occurs when memory fills up, at which point the oldest programs are removed to make room for new ones to be born.
The first Tierra experiment was initialized with an ancestral program that was 80 lines long. It filled up the available memory with copies of itself, many of which had mutations that caused a loss of functionality. Yet other mutations were neutral and did not affect the organism’s ability to replicate — and a few were even beneficial. In this initial experiment, the only selective pressure on the population was for the organisms to increase their rate of replication. Indeed, Ray witnessed that the organisms were slowly shrinking the length of their genomes, since a shorter genome meant that there was less genetic material to copy, and thus it could be copied more rapidly.
This result was interesting enough on its own. However, other forms of adaptation, some quite surprising, occurred as well. For example, some organisms were able to shrink further by removing critical portions of their genome and then using those same portions from more complete competitors, a technique that Ray noted was a form of parasitism. Arms races transpired in which hosts evolved methods of eluding the parasites, and the parasites, in turn, evolved to circumvent these new defenses. Some would-be hosts, known as hyper-parasites, even evolved mechanisms for tricking the parasites into aiding the copying of the hyper-parasites’ own genomes. Evolution continued in all sorts of interesting ways, making Tierra seem like a choice system for experimental evolution work.
In 1992, Chris Adami began research on evolutionary adaptation with Ray’s Tierra system. His intent was to have these digital organisms evolve solutions to specific mathematical problems, without forcing them to use a pre-defined approach. His core idea was the following: If he wanted a population of organisms to evolve, for example, the ability to add two numbers together, he would monitor organisms’ input and output numbers. If an output was ever the sum of two inputs, the successful organism would receive extra CPU cycles as a bonus. As long as the number of extra cycles was greater than the time it took the organism to perform the computation, the leftover cycles could be applied toward the replication process, providing a competitive advantage to the organism. Sure enough, Adami was able to get the organisms to evolve some simple tasks, but he faced many limitations in trying to use Tierra to study the evolutionary process.
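The reward scheme just described can be sketched as follows. The function name and bonus size are hypothetical stand-ins for illustration, not the actual merit system of Tierra or Avida:

```python
from itertools import combinations

def task_bonus(inputs, outputs, bonus_cycles=512):
    """Return extra CPU cycles if any output equals the sum of two inputs."""
    sums = {a + b for a, b in combinations(inputs, 2)}
    return bonus_cycles if any(out in sums for out in outputs) else 0

# An organism that read 3 and 7 and wrote 10 performed the addition task:
print(task_bonus([3, 7], [10]))   # rewarded with bonus cycles
print(task_bonus([3, 7], [4]))    # no matching output, no bonus
```

The organism only profits if the bonus exceeds the cycles spent performing the computation, which is what makes the task rewarding rather than merely possible.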
In the summer of 1993, Charles Ofria and C. Titus Brown joined Adami to develop a new digital life software platform, the Avida system. Avida was designed to have detailed and versatile configuration capabilities, along with high precision measurements to record all aspects of a population. Furthermore, whereas organisms are executed sequentially in Tierra, the Avida system simulates a parallel computer, allowing all organisms to be executed effectively simultaneously. Since its inception, Avida has had many new features added to it, including a sophisticated environment with localized resources, an events system to schedule actions to occur over the course of an experiment, multiple types of CPUs to form the bodies of the digital organisms, and a sophisticated analysis mode to post-process data from an Avida experiment. Avida is under active development at Michigan State University, led by Charles Ofria and David Bryson.
The Scientific Motivation for Avida
Intuitively, it seems that natural systems should be used to best understand how evolution produces the variation observed in nature, but for many questions this is prohibitively difficult and does not provide enough detail. Using digital organisms in a system such as Avida can be justified on five grounds:
- “Artificial life forms provide an opportunity to seek generalizations about self-replicating systems” beyond the organic forms that biologists have studied to date, all of which share a common ancestor and essentially the same chemistry of DNA, RNA and proteins. As John Maynard Smith [Maynard-Smith1992] made the case: “So far, we have been able to study only one evolving system and we cannot wait for interstellar flight to provide us with a second. If we want to discover generalizations about evolving systems, we will have to look at artificial ones.” Of course, digital systems should always be studied in parallel with natural ones, but any differences we find between their evolutionary dynamics open up what is perhaps an even more interesting set of questions.
- “Digital organisms enable us to address questions that are impossible to study with organic life forms”. For example, in one of our current experiments we are investigating the importance of deleterious mutations in adaptive evolution by explicitly reverting all detrimental mutations. Such invasive micromanaging of a population is not possible in a natural system, especially without disturbing other aspects of the evolution. In a digital evolving system, every bit of memory can be viewed without disrupting the system, and changes can be made at the precise points desired.
- “Other questions can be addressed on a scale that is unattainable with natural organisms”. In an earlier experiment with digital organisms [LenskiOfriaCollier1999] we examined billions of genotypes to quantify the effects of mutations as well as the form and extent of their interactions. By contrast, a comparable experiment with E. coli was necessarily confined to one level of genomic complexity. Digital organisms also have a speed advantage: a population of 10,000 organisms can undergo 20,000 generations per day on a modern desktop computer. A similar experiment with bacteria took over a decade [Lenski2004].
- “Digital organisms possess the ability to truly evolve, unlike mere numerical simulations”. Evolution is open-ended, and the design of the evolved solutions is unpredictable. These properties arise because selection in digital organisms (as in real ones) occurs at the level of the whole organism’s phenotype; it depends on the rates at which organisms perform tasks that enable them to metabolize resources into energy, and on the efficiency with which they use that energy for reproduction. Genome sizes are sufficiently large that evolving populations cannot test every possible genotype, so replicate populations always find different local optima. A genome typically consists of 50 to 1000 sequential instructions, and with 26 possible instructions at each position, there are many more potential genome states than there are atoms in the universe.
- “Digital organisms can be used to design solutions to computational problems” where it is difficult to write explicit programs that produce the desired behavior [Goldberg2002, Koza2003]. Current evolutionary algorithm approaches are based on a simplistic view of evolution, leaving out many of the factors that are believed to make it such a powerful force. Thus there are new opportunities for biological concepts to have a large impact outside of biology, just as principles of physics and mathematics are often used throughout other fields, including biology.
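The genome-count claim in the fourth point above is easy to check on the back of an envelope. In the sketch below, the ~10^80 atoms-in-the-universe figure is the usual rough estimate and an assumption of ours, not from the text; the count of possible genomes of length L over a 26-instruction alphabet is 26^L, which already exceeds 10^80 for genomes of around 57 instructions or more:

```python
import math

# Rough atoms-in-the-universe estimate, expressed as a power of ten
# (a common order-of-magnitude figure, assumed here for comparison).
ATOMS_LOG10 = 80

# Number of possible genomes of length L over a 26-instruction alphabet.
for length in (50, 100, 1000):
    log10_states = length * math.log10(26)
    print(f"L={length:4d}: 26**L ~ 10**{log10_states:.0f}")

# The crossover point: 26**L first exceeds 10**80 when L > 80 / log10(26).
print(math.ceil(ATOMS_LOG10 / math.log10(26)))
```

So for typical genome lengths in the quoted 50-to-1000 range, the space of possible genomes indeed dwarfs any population that could ever be instantiated.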
Last updated: April 1, 2011