Origin of viruses and how they “invented” DNA
2nd November 2020
Translated from the original article in Catalan
For weeks, indeed months, we have all been worried about a virus: SARS-CoV-2, obviously. So I thought it appropriate to carry out this small literature review on the intriguing topic of the origin of these organisms, so unique and so different from other living beings.
A reminder of what viruses are
Viruses are non-cellular organisms, that is, they are not cells: neither prokaryotes like bacteria and archaea, nor eukaryotes like protists, fungi, plants and animals. Therefore, viruses do not have a complex internal structure with many components as cells have, and above all viruses do not have all the metabolic activity involved in the maintenance and reproduction of cellular organisms.
From a functional point of view, viruses are submicroscopic infectious agents that can reproduce only inside the cells of host organisms. They are therefore intracellular parasites, and they are present in all kinds of cellular organisms, from archaea and bacteria to all types of eukaryotes. Viruses are found in every ecosystem and are the most abundant biological entities on Earth (Edwards & Rohwer 2005).
Cellular organisms have the characteristics of the definition of living things, such as having a biological cycle, metabolism, growing, adapting to the environment, responding to stimuli, reproducing and evolving. The concept of living thing or living being has also been defined as any autonomous system with evolutionary capabilities (Peretó 2005). In principle, viruses do not have all these traits, which is why it is sometimes questioned whether they can be called “living things.” However, they can reproduce, at the expense of others, and evolve, and being closely related in their biological cycle to cellular organisms, I do not see how they could be considered “non-living things”. It would be like saying that they are non-biological organisms, which is obviously not true.
In their extracellular phase viruses are inert particles, called virions, almost all measuring between 20 and 300 nm, smaller than most bacteria. The structure of a virion is limited to a protective layer of protein, the capsid, and the genetic material inside, RNA or DNA. The capsid can be helical, polyhedral or spherical, and gives the morphology observed under the electron microscope. Additionally, the virions of some viruses (especially animal viruses) have an external structure, a membrane-type envelope, with proteins and phospholipids. Other viruses have more complex structures, such as some bacteriophages (Figure 1).
Virions adhere to and/or enter the host cell by various methods, so that their genetic material gets inside. There the information in this genetic material is expressed by the biosynthetic machinery of the cell, which makes more copies of the virus. Once outside the cell, these copies are new virions that can infect other cells.
Figure 1. Some of the different morphological types of viruses (left to right): helical (e.g. tobacco mosaic virus), polyhedral (e.g. adenovirus), spherical (e.g. influenza virus), and complex, such as bacteriophages.
The classification of viruses is based mainly on their kind of genetic material, i.e. the genome: whether it is DNA or RNA, whether it is single- or double-stranded, and the strategy used to replicate this genome and to synthesize the mRNA (Figure 2). Some examples of these six classes of viruses are (Madigan et al., 2019):
- Class I: bacteriophages lambda and T4, animal herpesviruses
- Class II: bacteriophage φX174
- Class III: gastrointestinal rotavirus
- Class IV: poliovirus, coronavirus
- Class V: influenza virus, rabies virus
- Class VI: retroviruses such as HIV
- Class VII: sometimes added to this scheme (the Baltimore classification, named after David Baltimore, co-discoverer of reverse transcriptase); partially double-stranded DNA viruses that replicate through an RNA intermediate, e.g. hepatitis B virus
Figure 2. The six types of viruses according to their genome (DNA or RNA, double or single strand) and the system of replication and mRNA generation. By convention, mRNA is of (+) orientation (from Madigan et al., 2019).
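The logic of this classification can be sketched in a few lines of Python. This is only an illustration: the names Genome and baltimore_class are hypothetical, and the genome type assigned to each class follows the standard Baltimore scheme rather than anything spelled out in the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genome:
    """Hypothetical minimal description of a viral genome."""
    nucleic_acid: str                   # "DNA" or "RNA"
    double_stranded: bool
    sense: str = ""                     # "+" or "-", only used for single-stranded RNA
    reverse_transcribing: bool = False  # replicates through reverse transcription


def baltimore_class(genome: Genome) -> str:
    """Return the Baltimore class (I to VII) for a genome description."""
    if genome.nucleic_acid == "DNA":
        if genome.reverse_transcribing:
            return "VII"                # DNA genome with an RNA intermediate
        return "I" if genome.double_stranded else "II"
    if genome.nucleic_acid == "RNA":
        if genome.double_stranded:
            return "III"
        if genome.reverse_transcribing:
            return "VI"                 # retroviruses
        if genome.sense == "+":
            return "IV"                 # genome can act directly as mRNA
        if genome.sense == "-":
            return "V"                  # genome is complementary to the mRNA
    raise ValueError(f"cannot classify {genome!r}")


# Examples taken from the list above.
examples = {
    "bacteriophage T4": Genome("DNA", double_stranded=True),
    "poliovirus": Genome("RNA", double_stranded=False, sense="+"),
    "rabies virus": Genome("RNA", double_stranded=False, sense="-"),
    "HIV": Genome("RNA", double_stranded=False, sense="+", reverse_transcribing=True),
    "hepatitis B virus": Genome("DNA", double_stranded=True, reverse_transcribing=True),
}

for name, genome in examples.items():
    print(f"{name}: class {baltimore_class(genome)}")
```

The order of the checks matters: reverse transcription is tested before the sense of the strand, because a retrovirus has a (+) RNA genome but belongs to class VI, not class IV.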
Possible theories of the origin of viruses
Their origin has always been somewhat enigmatic, given the characteristics of these non-cellular organisms. Although viruses are very diverse, so that several independent points of origin are conceivable, the similarity of their structures (a protein capsid that envelops a nucleic acid) suggests at least one common mechanism to explain their origin.
The 3 most referenced hypotheses to explain how the viruses originated are:
a) They would be forms derived from parasitic unicellular organisms, that evolutionarily would have been reduced to the minimum.
b) They would be fragments of genetic material that would have escaped cellular control becoming parasites.
c) They would be relics of precellular forms, that is to say of the protobionts.
Hypothesis a) is supported by the existence of intracellular parasites such as Mycoplasma (Tenericutes bacteria) or Microsporidia (eukaryotic fungi), but these microorganisms retain some cellular characteristics, such as protein synthesis. In addition, no intermediate stage between cells and viruses is known.
Hypothesis b) is supported by the existence of plasmids and transposons, which can be considered viral precursors, and by the fact that viruses can often integrate cellular genes. But it is difficult to explain how these released nucleic acids would have acquired a protein envelope. In addition, evolutionary affinities between viruses and their hosts in the same domain should then be expected. For example, bacteriophages (phages) should show some evolutionary similarity to bacteria, yet the proteins of phages such as T4 are more similar to eukaryotic proteins than to their bacterial counterparts (Gadelle et al., 2003). Moreover, most viral proteins have no cellular counterparts in any of the 3 domains (Forterre 2006).
Hypothesis c) faces the objection that all current viruses are obligate parasites and require an intracellular stage to reproduce. However, as we will see below, this is the hypothesis that is gaining more and more recognition.
Two clear arguments in favour of hypothesis c) and against the others are:
- There are viruses from all groups of cellular organisms, which makes hypothesis b) difficult to explain.
- There are DNA and RNA viruses, which makes hypothesis a) unlikely.
However, it should be noted that criticisms of any of these hypotheses are made in the context of the current biosphere, where “modern” viruses need “modern” cells to replicate, where current cells cannot revert to viral forms, or where free DNA cannot capture proteins from current cells to form capsids, and so on. But things could have been very different before the formation of “modern” cells of archaea, bacteria, and eukaryotes. In this sense, we are less constrained by the current reality when proposing new evolutionary scenarios for the origin of viruses (Forterre 2006).
Quick review of the origin of life
To see the possibilities of hypothesis c), origin of viruses from protocellular forms, we will review what is today the most likely hypothesis of the origin of living things on Earth. You can see some good reviews of all this in the books by Zubay (2000), Schopf (2002) and Ribas de Pouplana (2004), and in the article by Peretó (2005), among others.
Prebiotic chemistry was the set of reactions by which biological components originated through abiotic synthesis. As is well known, a first phase was postulated in the 1920s and 1930s by Oparin (Miller et al., 1997) and Haldane (Tirard 2017), and received experimental support from the experiments of Urey and Miller in 1953 (Bada & Lazcano, 2003): it is assumed that organic monomers such as amino acids, monosaccharides and organic acids were synthesized from the basic molecules of the primitive atmosphere (water, methane, ammonia, nitrogen and others), using the energy sources of the primitive Earth. The importance of the contribution of organic matter from comets and meteorites is also becoming increasingly evident (Oró 2001). In a second phase, biogenic macromolecules could have been formed by polymerization of the monomers, probably on an inorganic support.
Once there were enough prebiotic organic compounds (not to be confused with “prebiotics”, a nutrition term for substrates used by the microbiota that give health benefits), the pre-cell protobiont phases had to be related to the 3 basic properties of living things:
- The establishment of wrapping structures, i.e. membranes
- The transformation of nutrients and energy, i.e. a minimum metabolism
- An inherited mechanism, i.e. the ability to replicate and transfer characteristics to offspring.
As we see in the diagram (Figure 3), the current hypothesis is that these phases occurred in this order: in structures with envelopes (amphiphilic vesicles), mechanisms of protobioenergetic reactions arose and became autonomous systems, which began to acquire hereditary characteristics based on RNA (the RNA world); later, the more stable DNA would eventually have replaced RNA as the genome molecule. The role of RNA is supported by the variety of RNAs existing in cells and by the catalytic properties of some of them, the ribozymes, in addition to their being genetic molecules (Peretó 2005).
Figure 3. Scheme of the hypothetical transition from prebiotic chemistry to cells, without a time scale but which could have taken place around 4,000 million years ago. These protobiont phases include (from left to right) the origin of autonomous systems with protometabolism but without genetic material, the first protocells with pre-RNA and then RNA plus proteins (the RNA world, with “ribocytes”), and then the incorporation of DNA, until reaching LUCA (the last universal common ancestor) with known biological characteristics. B: bacteria; A: archaea; E: eukaryotes (from Peretó 2005).
The origin of viruses: relics of protobionts
As discussed above, this is the most likely hypothesis today.
Returning to the origin of life discussed above, two phases can in fact be distinguished in this RNA world (Figure 4). The first would have begun when RNA as such became the genetic material carrying information. Before that, as we have seen (the “pre-RNA” world), there could have been protobionts (or protocells) with other “genetic” molecules, whether nucleic acids or other molecules. The first protobiont with RNA would have meant a bottleneck or breaking point: its descendants would have begun to predominate over the earlier forms, which would have become extinct. As we know, this phenomenon is very common in evolution. In a second phase of the RNA world, ribosomes would have appeared as protein synthesis machines, allowing rapid evolution towards more efficient cell forms, as opposed to the use of peptides or other less efficient ways of synthesizing proteins (Forterre 2005). Finally, in some of the protobiont lines, DNA would have ended up replacing RNA as the genetic material, which was also a breaking point in the evolutionary line.
Well, as we see (Figure 4), some of the lineages could survive by parasitizing successful individuals in the next phase: they would be viruses. At each stage there would be a critical point of origin (breaking points or bottlenecks, black lines) from a novel organism that would give rise to many lineages. Some of these, instead of becoming extinct, could survive as viral lineages (white lines) by parasitizing the successful protocellular lineages of the next phase (Forterre 2005).
Figure 4. Hypothesis of the phases of the protobionts evolution in the RNA world: description in the text (from Forterre 2005).
In both phases of the RNA world very diverse organisms (the lineages depicted in Figure 4) would have coexisted: prey, predators, free-living, and parasites. It is therefore likely that protocells and virus-like entities coexisted, and thus RNA viruses would have originated (Figure 5). We see how in the first phase different lineages of RNA protobionts would coexist, with various protein production mechanisms (the small crossed inner circles, Fig. 5A), including the ancestor of the current ribosome system (the 2 black subunits). This lineage (in blue, Fig. 5B) would have eliminated its competitors. Some lineages of this first phase (green and red) would have survived as intracellular parasites with an extracellular phase in their biological cycle. Eventually these parasites would have lost their own protein synthesis machinery and become RNA viruses (Fig. 5C).
In addition, there are currently single-stranded and double-stranded RNA viruses with different ways of replicating (Figure 2), as would be expected from an RNA world with very diverse lineages. Moreover, the double-stranded RNA viruses of bacteria and eukaryotes have similar structures, and their RNA-dependent RNA polymerases are homologous. This model implies a polyphyletic origin for the different superfamilies of RNA viruses: when the protobionts switched to DNA genomes, the parasitism of these viruses would have been maintained, giving rise to all the various RNA viruses we currently observe in cellular organisms. The model can also be adapted to explain DNA viruses as originating from lineages of DNA protobionts (Forterre 2005).
Figure 5. Hypothesis of the origin of RNA viruses: description in the text (from Forterre 2005).
DNA, invented by some viruses?
It is generally accepted that DNA ended up replacing RNA as the genetic material during these early stages of evolution for two reasons: a) DNA is more stable than RNA, because the 2’-OH group of ribose is very reactive and can break the phosphodiester bond; and b) the deamination of cytosine to uracil is a frequent spontaneous chemical reaction that can be detected and repaired in DNA but not in RNA, for an obvious reason: uracil is a native base of RNA.
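Reason b) can be illustrated with a toy Python sketch. It is not a model of any real repair enzyme: the function names and the short sequences are invented for the example, and the DNA is assumed to be ordinary thymine-containing DNA.

```python
from typing import List


def deaminate(seq: str, position: int) -> str:
    """Return seq with the cytosine at `position` converted to uracil."""
    if seq[position] != "C":
        raise ValueError("only a cytosine can be deaminated to uracil")
    return seq[:position] + "U" + seq[position + 1:]


def suspicious_uracils(seq: str, molecule: str) -> List[int]:
    """Positions that a repair system could recognise as damage.

    In thymine-containing DNA any uracil is foreign, so every U is flagged.
    In RNA uracil is a normal base, so a deaminated cytosine is
    indistinguishable from a legitimate U and nothing can be flagged.
    """
    if molecule == "DNA":
        return [i for i, base in enumerate(seq) if base == "U"]
    if molecule == "RNA":
        return []
    raise ValueError(f"unknown molecule type: {molecule}")


# The same spontaneous mutation in a DNA strand and in the equivalent RNA strand.
damaged_dna = deaminate("ATGCCGTA", 3)
damaged_rna = deaminate("AUGCCGUA", 3)

print(damaged_dna, suspicious_uracils(damaged_dna, "DNA"))
print(damaged_rna, suspicious_uracils(damaged_rna, "RNA"))
```

In the DNA strand the new uracil stands out and can be replaced by cytosine; in the RNA strand it is hidden among the uracils that were already there, so the mutation stays.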
In today’s living things, the DNA precursors, deoxyribonucleotides (dNTPs), are formed mainly by ribonucleotide reductases, which reduce the ribose of ribonucleotides (rNTPs) to deoxyribose. These are complex enzymes, which must have appeared in the second phase of the RNA world, since DNA could not have been formed by RNAs alone, even if these were ribozymes.
But what about DNA thymine instead of RNA uracil?
The fact that dTMP is currently produced from dUMP and not by reduction of TTP suggests that U-DNA could have been an intermediate in the transition from RNA to DNA, and therefore that there would have been a “U-DNA world”. Indeed, some bacterial viruses have DNA with uracil, that is, U-DNA instead of the usual T-DNA. In today’s viruses we find quite a diversity of DNAs: some have U-DNA, many others T-DNA, and some others hydroxymethylcytosine-DNA (Figure 6). In addition, DNA viruses have a wide variety of replication mechanisms, and enzymes to pass from one type of DNA to another. This is what can be called a “virosphere”, whose diversity suggests an origin in these protobiont stages. Therefore DNA as we know it, with thymine, could have been an invention of some viruses (Forterre 2006).
Figure 6. Scheme of the evolution of genomes from RNA to modified DNA genomes. All types are present in the “virosphere” but only T-DNA is present in cellular organisms. RNR: ribonucleotide reductase; TdS: thymidylate synthase; HmcT: hydroxymethylcytosine transferase (from Forterre 2006).
Hypothesis of DNA transfer from DNA viruses to cellular organisms
A DNA virus (Figure 7A, red genome) could have infected an RNA protocell (blue genome) and the two could have co-evolved (Fig. 7B), such that RNA genes would have been progressively transferred to the viral DNA by retrotranscription (Fig. 7C, white arrow) and the viral genome would have evolved into a DNA plasmid within an RNA protocell. Eventually (Fig. 7D) the plasmid DNA would have predominated over the RNA genome, owing to its greater genetic efficiency, and would have ended up as a DNA prechromosome.
A similar mechanism (Figure 7E-G) could explain the formation of DNA plasmids in DNA protocells. In any case, these resulting protocells would be prokaryotes. Given all this, the formation of eukaryotic cells, in addition to the well-known theory of the symbiosis of archaea and bacteria, could also be hypothesized as the formation of the nucleus in RNA protocells from the uptake of DNA viruses, which would have been enveloped by newly formed intracellular membranes, giving rise to the nuclear membrane, much as the envelopes of animal viruses are formed (Forterre 2005, idem 2006).
Figure 7. Hypothetical models of DNA transfer from viruses to RNA protocells (left A-D) to form DNA cells, and of the formation of plasmids (right E-G) from DNA viruses: description in the text (from Forterre 2005).
There are other possible explanations for the origin of viruses, with arguments in favour of the other hypotheses discussed above, but I think that this one, in which viruses are forms that come from protobionts, is gaining weight. Another curious hypothesis is that viruses originated from self-replicating proteins, such as prions, that coupled with RNA. See Lupi et al. (2007) for more information.
As we have seen, viruses, with their great diversity and their presence in all cellular organisms, would be the descendants, remnants or relics of these forms of protobionts from the earliest times of life on Earth. As in other living things, evolution driven by a set of factors of genetic variability, and especially by natural selection, would have resulted in what we now see. Viruses have been evolving much longer than we have, are present in all ecosystems on the planet, and are the most abundant biological entities. So we could say that in principle they must be smarter than us. We must continue to learn to live with them, as we are seeing in 2020 with SARS-CoV-2, and all this even though doubts remain about whether they are living things.
References
Bada JL, Lazcano A (2003) Prebiotic soup – Revisiting the Miller experiment. Science 300, 745-746.
Edwards RA, Rohwer F (2005) Viral metagenomics. Nature Rev Microbiol 3, 504-510.
Forterre P (2005) The two ages of the RNA world, and the transition to the DNA world: a story of viruses and cells. Biochimie 87, 9–10, 793-803.
Forterre P (2006) The origin of viruses and their possible roles in major evolutionary transitions. Virus Research 117, 1, 5-16.
Gadelle D, Filée J, Buhler C, Forterre P (2003) Phylogenomics of type II DNA topoisomerases. Bioessays 25, 232-242.
Lupi O, Dadalti P, Cruz E, Goodheart C (2007) Did the first virus self-assemble from self-replicating prion proteins and RNA? Medical Hypotheses 69, 4, 724-730.
Madigan MT, Bender KS, Buckley DH, Sattley WM, Stahl DA (2019) Brock biology of microorganisms. Pearson Ed Ltd.
Miller SL, Schopf JW, Lazcano A (1997) Oparin’s “Origin of Life”: sixty years later. J Mol Evol 44, 351-353.
Oró J (2001) Cometary molecules and Life’s origin. In: Chela-Flores J, Owen T, Raulin F (eds) First steps in the origin of life in the universe. Springer, Dordrecht.
Peretó J (2005) Controversies on the origin of life. Internat Microbiol 8, 23-31.
Ribas de Pouplana L (2004) The Genetic Code and the Origin of Life. Ed. Kluwer Academic – Landes Bioscience, ISBN 0-306-47843-9.
Schopf JW (ed.) (2002) Life’s Origin. The beginnings of biological evolution. Univ. California Press, ISBN 0-520-23390-5.
Tirard S (2017) J.B.S. Haldane and the origin of life. J Genetics 96, 735-739.
Zubay G (2000) Origins of life on the Earth and in the Cosmos. Academic Press, ISBN 978-0-12-781910-5.