Wednesday, August 27, 2008

The purpose of sex

What is the purpose (evolutionarily) of sex? This is a pretty esoteric post: a detailed technical argument. It's an essay I wrote a while back, but not being a biologist, I couldn't get anyone to read it. Now that I'm a blogger, I'm going to put it out there on the web, so that if nothing else, I can claim I thought of it first.

This article demonstrates why sexual reproduction is a prerequisite to the evolution of complex biological structures. Using information theory and a statistical viewpoint, I also show that while sexual reproduction may be "costly" in the short term, it is beneficial to the survival of a gene in the long run.

I've read a lot about evolution, and as far as I know, nobody has ever explained sex adequately.

Sex and the Single Gene
by Craig A. James, February 2007

The Paradox

One of the apparent paradoxes in evolution is: Why is there sex? Or more properly, why is there sexual reproduction, meaning the process by which two individuals share genetic information in the process of reproducing? Sex cuts the chances for a gene's replication at least in half.

As with everything else in evolution science, a seeming paradox like this means we haven't yet discovered the benefits of sexual reproduction. The fact that it exists in spite of this extremely harsh penalty means there must be an even more potent benefit. A 50% reduction in a gene's chances to reproduce is monstrous. This is the paradox of sex: It seems to defy explanation.

Cross-discipline approaches can often bring new insight into old problems. The problem of sex in evolution, as described in Dawkins' The Ancestor's Tale, in the chapter The Rotifer's Tale, jumped out at me. To me, as a computer scientist who has studied information theory, sexual reproduction appears inevitable; I would go so far as to say information theory predicts sexual reproduction.

An Information Science View

The first lesson of information science is that perfect information transcription is theoretically impossible. At the molecular level, Planck's constant is a significant factor, and Heisenberg tells us there will be transcription errors. In addition there are external forces, such as radiation and reactive radicals, that can damage the information contained in DNA.

The second lesson of information science is that you can correct for just about any error rate via redundancy. Even very error-prone copying methods can be made "error free" (where "error free" means reliable to any arbitrarily-small error rate you'd care to pick) with enough redundancy.
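To make the redundancy claim concrete, here is a minimal sketch in Python of the simplest scheme: keep several copies and take a majority vote. It assumes a single binary symbol, copies that are corrupted independently with the same probability, and an odd number of copies; the function name and the 10% error rate are illustrative choices, not anything from information theory proper.

```python
from math import comb

def majority_error_rate(p: float, copies: int) -> float:
    """Chance that a majority vote over `copies` independent copies of one
    binary symbol is wrong, when each copy is corrupted with probability p."""
    if copies % 2 == 0:
        raise ValueError("use an odd number of copies so the vote cannot tie")
    needed = copies // 2 + 1  # corrupted copies needed to outvote the good ones
    return sum(
        comb(copies, k) * p**k * (1 - p) ** (copies - k)
        for k in range(needed, copies + 1)
    )

# Assumed 10% error rate per copy: the residual error shrinks as copies are added.
for n in (1, 3, 9, 27):
    print(n, majority_error_rate(0.1, n))
```

With three copies the residual error is already 0.028 instead of 0.1, and it keeps falling as more copies are added, which is the sense in which redundancy can push the error rate as low as you like.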

Redundancy comes in many forms. Mathematicians and computer scientists have some very efficient error-correcting methods, but these sophisticated mathematical algorithms are beyond the reach of evolution. A much simpler form of redundancy is replication of information, such that if one copy goes bad, other copies are available.

From an information-theory point of view, there are two reasons sex is inevitable (where "transcription errors" are what a biologist would call mutations):

1. Transcription errors that are bad (detrimental to survival)
2. Transcription errors that are good (enhance survival and reproduction)

With sexual reproduction, the first is mitigated, and the second is amplified. In a nutshell, it boils down to the fact that with asexual reproduction, each gene is on its own, whereas with sexual reproduction, a gene can benefit from good mutations in other genes, and can survive mutations in other genes with which it shares a body. Let me amplify.

To begin, I must clarify the critical concepts on which my arguments stand. I will coin new words to clearly distinguish four very different concepts:

  • A "gene-individual" is a particular molecular fragment that happens to reside on a strand of DNA in one individual.

  • A "gene-class" is a collection of identical single genes, spread across a number of individuals, and usually across a number of species. The gene-class also has an abstract (i.e. human) conceptualization as the "perfect" instance of this gene: The base pairs that, when present on a strand of DNA, cause the scientist to say, "this gene is present in this individual".

  • A "gene-pool" is used in the customary sense: A set of genes spread across a number of individuals in a breeding population.

  • A "gene-contingent" is like a gene-pool, but for one specific gene. It is a set of gene-individuals that are in the same gene-class, and additionally are in an interbreeding population such that they may "cross paths" in the future.

The last one, the gene-contingent, is the key to the information-theory argument that sex is beneficial to a gene, in spite of the two-fold penalty of sexual reproduction.

The gene-contingents of sexual and asexual species are critically different: For a sexual species, the gene-contingent to which a gene-individual belongs is spread across the breeding population, whereas for asexual individuals, the gene-contingent and the gene-individual are identical: one individual.

Beneficial and Synergistic Mutations

In The Rotifer's Tale, Dawkins captures the second half of the information-theory argument regarding sex when he says: "... genes are continually being tried out against different genetic backgrounds ... [those that cooperate] tend to be in winning teams." In other words, when a beneficial mutation occurs in any gene-individual of a sexual species, every gene-individual in the "gene river" has the possibility to eventually pair up with the new, better gene.

What is the probability that a beneficial mutation will occur in the gene pool of a sexual species versus an asexual species?

Since an asexual species is always a "species of one", the chances are vastly less. In an asexual species, each gene-individual only benefits from good mutations in the specific individual in which it resides. The chances of a good mutation happening to one of the other genes in a specific individual are many orders of magnitude less than the chances of it happening somewhere in the whole species.

By contrast, in a sexual species, a gene-contingent can benefit from any good mutation anywhere in the species.
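The gap between "in this one individual" and "somewhere in the species" can be put into numbers. The sketch below assumes every individual has the same small, independent chance of receiving a given good mutation in a generation; the one-in-a-billion chance and the population of a billion are made-up figures for illustration only.

```python
from math import expm1, log1p

def chance_somewhere(p_individual: float, population: int) -> float:
    """Chance that at least one of `population` individuals gets the mutation,
    assuming each gets it independently with probability p_individual."""
    # 1 - (1 - p)^N, computed via log1p/expm1 to stay accurate for tiny p.
    return -expm1(population * log1p(-p_individual))

p = 1e-9            # hypothetical chance for one specific individual
n = 1_000_000_000   # hypothetical population size
print("one specific individual:", p)
print("somewhere in the species:", chance_somewhere(p, n))
```

Under these assumptions the chance for one specific individual stays at one in a billion, while the chance that it happens somewhere in the population is roughly 63 percent.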

Now consider synergistic mutations. Suppose there are two beneficial mutations that could occur, that together are also synergistic, or alternatively, where the second mutation's beneficial properties depend on the first mutation being present. In a sexual species, the first beneficial mutation will propagate through the gene pool, so that when the second mutation occurs, the synergy will be realized.

By contrast, in an asexual species, the two mutations will almost certainly happen in different lines of descent, and the synergy will never be realized. Because of this, we can predict that asexual species will not be nearly as adaptable, nor will they evolve as quickly, as sexual species.


Complexity simply cannot arise in asexual species. Consider the odds: Imagine a very simplified ecosystem that can support one billion individual single-cell creatures that divide once per day. Each day, half of the individuals die, and half go on to the next generation. And imagine that, on average, one mutation occurs somewhere in the population per day.

Complexity in lifeforms requires a long sequence of mutation and selection. In our hypothetical population, suppose two mutations occur that together would result in a more complex creature. The chances are one in a billion that they will occur in the same individual's line of descent. In other words, the two mutations would never “encounter” one another.

By contrast, if our same population of a billion creatures uses some form of DNA exchange, and if both of these mutations are individually not detrimental (or only slightly detrimental) then the chances approach 100 percent that sooner or later, both mutations will be “inherited” by an individual, increasing the complexity of that individual. The added benefit conferred by the pairing of the two mutations will quickly cause the pair of genes to spread throughout the population. And once this happens, the third and subsequent mutations that further increase the complexity of the creature are again a billion times more likely to encounter the first two in a sexual species than in an asexual species.
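A toy simulation can illustrate the difference between the two modes of reproduction. This is only a sketch under loud assumptions: a tiny haploid population of 200 instead of a billion, two mutations A and B that start out in different individuals and are neutral on their own, no new mutations during the run, and "DNA exchange" modeled as each locus being copied from one of two randomly chosen parents. The function name and all parameters are invented for this example, and the starting frequencies are set high so the demonstration finishes quickly.

```python
import random

def generations_until_pair(sexual, pop_size=200, carriers_a=100, carriers_b=100,
                           max_generations=50, seed=0):
    """Return the first generation in which some individual carries both
    mutations A and B, or None if that never happens within max_generations."""
    rng = random.Random(seed)
    # Each individual is a haploid genotype (has_a, has_b).
    # A and B start out in different individuals.
    population = ([(True, False)] * carriers_a
                  + [(False, True)] * carriers_b
                  + [(False, False)] * (pop_size - carriers_a - carriers_b))
    for generation in range(1, max_generations + 1):
        offspring = []
        for _ in range(pop_size):
            parent = rng.choice(population)
            if sexual:
                # DNA exchange: each locus is copied from one of two random parents.
                mate = rng.choice(population)
                child = (rng.choice((parent, mate))[0],
                         rng.choice((parent, mate))[1])
            else:
                # Cloning: an exact copy, so A and B stay in separate lines.
                child = parent
            offspring.append(child)
        population = offspring
        if (True, True) in population:
            return generation
    return None

print("asexual:", generations_until_pair(sexual=False))
print("sexual: ", generations_until_pair(sexual=True))
```

In the cloning case the function always returns None, because without a fresh mutation no line of descent can ever acquire the other mutation. In the exchange case a carrier of both appears almost immediately at these frequencies; with rarer starting mutations it can take longer, or one mutation can be lost by chance first.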

Using this logic, we can make a prediction: Any species that changes from sexual to asexual will not evolve significantly once asexual reproduction begins. Or, if it does, it will be at a rate that is billions of times slower than sexual species in similar circumstances. Such a species can survive indefinitely if it was already well adapted to its environment prior to becoming asexual, but it cannot evolve further. We can predict that such species will all become extinct sooner or later, due to a change in the environment, or to an encroaching species that competes in the same ecological niche or preys on the species.

Harmful Mutations

Information redundancy is only available to sexual species, where the gene-contingent spans many individuals. The survival of the gene-contingent is not dependent on any one individual; transcription errors don't terminate the gene-contingent. By contrast, in an asexual creature, mutation of a gene-individual ends the gene-contingent forever.

One might argue that asexual species have information redundancy because the gene-class spans many individuals. Indeed, the loss of one gene-individual does not make the gene-class extinct. But this argument is flawed: The gene-class is not the entity on which evolution operates. Only the gene-contingent matters from an evolutionary perspective.

This goes to the very heart of Darwinian evolution, and what is meant by natural selection. Once speciation occurs, each species' gene-pool only "cares" about its own survival and reproduction. The fact that two recently-split species share a gene-class is irrelevant; the two gene-contingents are in competition rather than cooperation, and the demise of one can often improve the survival of the other, even though the gene-individuals in each species' gene-contingent are identical. This is reflected in Dawkins' statement, "... the entity that is carved into shape ... is the gene pool."

Once a rotifer reproduces, each gene is "on its own" and no longer "cares" whether its "brother and cousin" genes survive or not. In fact, the opposite is true: Speciation occurs at every reproduction for the asexual rotifer, so all gene-individuals of a gene-class are in direct competition with one another.

Because of this, there is no information redundancy in an asexual species, no opportunity to correct errors. Each gene is completely on its own, and the chances approach 100% that it won't survive in the long term.
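The "approaches 100%" claim follows from simple compounding. As a rough sketch, assume a gene in a single line of descent has a fixed, independent chance of being hit by a mutation in each generation; the rate of one in a thousand used below is an invented figure, not a measured one.

```python
def chance_line_unmutated(mutation_rate: float, generations: int) -> float:
    """Chance that a gene in one line of descent is still an exact copy after
    `generations`, given an independent per-generation mutation chance."""
    return (1.0 - mutation_rate) ** generations

# Hypothetical rate: a one-in-a-thousand chance of a hit per generation.
for g in (100, 1_000, 10_000):
    print(g, chance_line_unmutated(0.001, g))
```

However small the per-generation rate, the chance of the line staying intact keeps shrinking toward zero as the generations pile up; at this assumed rate it is below one in ten thousand after 10,000 generations.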

Evolving the Ability to Evolve

Sexual species have an enormous advantage over asexual species because they can support variations.

Although each gene-contingent "wants" to replicate perfectly at each generation, it can (ironically) benefit from the imperfect replication of other gene-individuals in the gene pool. To understand this, we must view evolution from two perspectives: Long term and short term, which roughly translate to "stable environment" and "changing environment".

In the short term, evolution favors uniformity. Suppose we could create a completely stable environment, and we could prevent mutations. For sexual species, sex shuffles the gene combinations, working towards an optimum; after a while, a single "perfect" individual would emerge, and all variability would be lost. For an asexual "species" in the same environment, we would expect after a while for one line of descent to dominate and all other lines to die, resulting in what appeared to be a single species (all identical individuals). The net result is the same for both sexual and asexual species: A uniform gene pool.

(Strictly speaking, this latter case isn't completely true; only the phenotypes would be uniform. Variations in the genome that have no effect on the phenotype are irrelevant, so variations in the genome would remain. But our core argument remains sound: stability results in uniformity.)

However, over the long term, a changing environment favors a certain degree of variability, rather than uniformity, in the gene pool. If the environment is suddenly hotter, more acidic, a competitor arrives, etc., variability increases the odds that at least some of the individuals will survive.

Seen over a very long time scale, each species must be able to adapt to changing conditions, so one would predict that variability itself is an important survival strategy.

This leads to a paradox: Species without variability are less likely to survive over the long haul, yet evolution favors uniformity as long as the environment is stable. In the short term, most variations are bad, but in the long term they ensure survival.

Thus, we predict that evolution should have created a mechanism where stability and faithful reproduction of the genome is ensured, yet variability is not only tolerated, but is actually necessary.

Or, to put it another way: A mutation that produces a competing gene-contingent is bad for the original gene-contingent, yet the gene pool benefits from its ability to support competing gene-contingents. “You mutate.” “No, YOU mutate. It will be good for us!” “If it's so good, then YOU do it.”

How can we resolve this paradox? Again, information redundancy gives the answer.

For the asexual individual, there is no solution to the paradox. Its genes have no redundancy, so it must favor extremely accurate reproduction of its DNA. In fact, I would predict that asexual species' DNA is more "robust" and resistant to mutation than the DNA of sexual species. Mutation to the DNA of an asexual individual is almost always the end of the line for those genes, so any gene (or any gene's phenotype that is part of the DNA copying and repairing mechanism) that was slightly more susceptible to mutation than a competitor would quickly be eliminated from the population.

By contrast, the redundant information stored across the gene pool of a sexual species allows it to withstand many orders of magnitude more errors than an asexual species. Sex allows for imperfect gene replication, which ensures the long-term survival (adaptability) of the species, yet the information redundancy provided by the extended gene pool means that errors are not fatal.

Imperfect replication (mutation) is necessary to ensure long-term survival, but perfect replication is required for short-term survival. The answer is information redundancy: With sex, information is replicated, errors can be tolerated, and variability within the species' gene pool is possible. The paradox is resolved.


Seen from an information-theory point of view, sex has evolved to provide redundancy of information across a gene-contingent, so that transcription errors can be tolerated, and so that beneficial mutations can be shared. This has two important consequences.

First, it provides a mechanism to spread favorable mutations across a gene pool such that sequential mutations "encounter" each other rather than occurring on separate lines of descent.

Second, it makes variability possible, which helps ensure long-term survival in a changing environment, because the redundancy provided by an extended gene-contingent makes detrimental mutations more tolerable to the genome.
