Thermodynamics of Living Systems

It is widely held that in the physical sciences the laws of thermodynamics have had a unifying effect similar to that of the theory of evolution in the biological sciences. What is intriguing is that the predictions of one seem to contradict the predictions of the other. The second law of thermodynamics suggests a progression from order to disorder, from complexity to simplicity, in the physical universe. Yet biological evolution involves a hierarchical progression to increasingly complex forms of living systems, seemingly in contradiction to the second law. Whether this discrepancy between the two theories is real or only apparent is the question to be considered in the next three chapters. The controversy evident in an article published in American Scientist¹ and in the replies it provoked demonstrates that the question is still a timely one.

The First Law of Thermodynamics

Thermodynamics is an exact science which deals with energy. Our world seethes with transformations of matter and energy. Be these mechanical or chemical, the first law of thermodynamics---the principle of the Conservation of Energy---tells us that the total energy of the universe or any isolated part of it will be the same after any such transformation as it was before. A major part of the science of thermodynamics is accounting---giving an account of the energy of a system that has undergone some sort of transformation. Thus, we derive from the first law of thermodynamics that the change in the energy of a system (ΔE) is equal to the work done on (or by) the system (W) plus the heat flow into (or out of) the system (Q). Here W is counted as positive for work done on the system and Q as positive for heat flowing into it. Mechanical work and energy are interchangeable; i.e., energy may be converted into mechanical work as in a steam engine, or mechanical work can be converted into energy as in the heating of a cannon which occurs as its barrel is bored. In mathematical terms (where the terms are as previously defined):

ΔE = Q + W (7-1)
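The bookkeeping in equation 7-1 can be sketched in a few lines of Python. This is only an illustration: the function name and the numerical values are assumptions chosen for the example, with W taken as positive for work done on the system and Q as positive for heat flowing into it.

```python
def energy_change(heat_in, work_on_system):
    """Equation 7-1: change in system energy = heat in + work done on system."""
    return heat_in + work_on_system

# Steam-engine-like step (illustrative numbers): 500 J of heat flows in and
# the system does 200 J of work on its surroundings, so W = -200 J.
engine = energy_change(500.0, -200.0)   # 300.0 J retained by the system

# Cannon-boring-like step: 150 J of work done on the system, no heat flow.
cannon = energy_change(0.0, 150.0)      # 150.0 J gained, appearing as heating

print(engine, cannon)
```

The point of the sketch is only the sign convention: work done by the system enters as a negative W, so the energy account always balances.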

The Second Law of Thermodynamics

The second law of thermodynamics describes the flow of energy in nature in processes which are irreversible. The physical significance of the second law of thermodynamics is that the energy flow in such processes is always toward a more uniform distribution of the energy of the universe. Anyone who has had to pay utility bills for long has become aware that too much of the warm air in his or her home during winter escapes to the outside. This flow of energy from the house to the cold outside in winter, or the flow of energy from the hot outdoors into the air-conditioned home in the summer, is a process described by the second law of thermodynamics. The burning of gasoline, converting energy "rich" compounds (hydrocarbons) into energy "lean" compounds, carbon dioxide (CO₂) and water (H₂O), is a second illustration of this principle.

The concept of entropy (S) gives us a more quantitative way to describe the tendency for energy to flow in a particular direction. The entropy change for a system is defined mathematically as the flow of energy divided by the temperature, or,

ΔS ≥ Q / T (7-2)

where ΔS is the change in entropy, Q is the heat flow into or out of a system, and T is the absolute temperature in kelvins (K).

[Note: For a reversible flow of energy such as occurs under equilibrium conditions, the equality sign applies. For irreversible energy flow, the inequality applies.]

A Driving Force

If we consider heat flow from a warm house to the outdoors on a cold winter night, we may apply equation 7-2 as follows:

ΔS_T = ΔS_house + ΔS_outdoors ≥ -Q / T₁ + Q / T₂ (7-3)

where ΔS_T is the total entropy change associated with this irreversible heat flow, T₁ is the temperature inside the house, and T₂ is the temperature outdoors. The negative sign of the first term notes loss of heat from the house, while the positive sign on the second term recognizes heat gained by the outdoors. Since it is warmer in the house than outdoors (T₁ > T₂), the total entropy will increase (ΔS_T > 0) as a result of this heat flow. If we turn off the heater in the house, it will gradually cool until the temperature approaches that of the outdoors, i.e., T₁ = T₂. When this occurs, the entropy change (ΔS) associated with heat flow (Q) goes to zero. Since there is no further driving force for heat flow to the outdoors, it ceases; equilibrium conditions have been established.
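The house example can be checked numerically. The sketch below evaluates the right-hand side of equation 7-3; the function name and the particular values (1000 J of heat, 293 K indoors, 273 K outdoors) are assumptions made for illustration and do not come from the text.

```python
def heat_flow_entropy(q, t_house, t_outdoors):
    """Right-hand side of equation 7-3: -Q/T1 + Q/T2.

    q is the heat leaving the house (joules); temperatures are absolute (K).
    """
    if t_house <= 0 or t_outdoors <= 0:
        raise ValueError("temperatures must be absolute and positive")
    return -q / t_house + q / t_outdoors

# Warm house, cold night: the total entropy change is positive.
warm = heat_flow_entropy(1000.0, 293.0, 273.0)

# House cooled to the outdoor temperature: the driving force vanishes.
equal = heat_flow_entropy(1000.0, 273.0, 273.0)

print(f"{warm:.3f} J/K, {equal:.3f} J/K")  # 0.250 J/K, 0.000 J/K
```

The smaller the temperature difference, the smaller the entropy gain per unit of heat transferred, which is the sense in which the entropy increase measures the driving force.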

As this simple example shows, energy flow occurs in a direction that causes the total energy to be more uniformly distributed. If we think about it, we can also see that the entropy increase associated with such energy flow is proportional to the driving force for such energy flow to occur. The second law of thermodynamics says that the entropy of the universe (or any isolated system therein) is increasing; i.e., the energy of the universe is becoming more uniformly distributed.

It is often noted that the second law indicates that nature tends to go from order to disorder, from complexity to simplicity. If the most random arrangement of energy is a uniform distribution, then the present arrangement of the energy in the universe is nonrandom, since some matter is very rich in chemical energy, some in thermal energy, etc., and other matter is very poor in these kinds of energy. In a similar way, the arrangements of mass in the universe tend to go from order to disorder due to the random motion on an atomic scale produced by thermal energy. The diffusional processes in the solid, liquid, or gaseous states are examples of increasing entropy due to random atomic movements. Thus, increasing entropy in a system corresponds to increasingly random arrangements of mass and/or energy.

Entropy and Probability

There is another way to view entropy. The entropy of a system is a measure of the probability of a given arrangement of mass and energy within it. A statistical thermodynamic approach can be used to further quantify the system entropy. High entropy corresponds to high probability. As a random arrangement is highly probable, it would also be characterized by a large entropy. On the other hand, a highly ordered arrangement, being less probable, would represent a lower entropy configuration. The second law would tell us then that events which increase the entropy of the system require a change from more order to less order, or from less-random states to more-random states. We will find this concept helpful in Chapter 9 when we analyze condensation reactions for DNA and protein.
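One common way to make the link between entropy and probability concrete is the Boltzmann relation S = k ln W, where W is the number of microscopic arrangements consistent with a given overall state. That relation is not written out in this chapter, so the sketch below should be read as an assumed illustration: it counts the ways of placing 100 hypothetical particles in the two halves of a box.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def multiplicity(n_total, n_left):
    """Number of ways to choose which n_left of n_total particles sit on the left."""
    return math.comb(n_total, n_left)

def boltzmann_entropy(w):
    """S = k ln W for a state with w equally likely arrangements."""
    return K_B * math.log(w)

N = 100
ordered = multiplicity(N, 0)        # all particles on one side: one arrangement
mixed = multiplicity(N, N // 2)     # evenly spread: the most numerous case

s_ordered = boltzmann_entropy(ordered)
s_mixed = boltzmann_entropy(mixed)

# Probability of finding the fully ordered state if every arrangement is equally likely.
p_ordered = ordered / 2 ** N
print(s_ordered, s_mixed, p_ordered)
```

The fully ordered state has a single arrangement and zero entropy, while the evenly spread state has the most arrangements and therefore the highest entropy and the highest probability.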

Clausius², who formulated the second law of thermodynamics, summarizes the laws of thermodynamics in his famous concise statement: "The energy of the universe is constant; the entropy of the universe tends toward a maximum." The universe moves from its less probable current arrangement (low entropy) toward its most probable arrangement in which the energy of the universe will be more uniformly distributed.

Life and the Second Law of Thermodynamics

How does all of this relate to chemical evolution? Since the important macromolecules of living systems (DNA, protein, etc.) are more energy rich than their precursors (amino acids, heterocyclic bases, phosphates, and sugars), classical thermodynamics would predict that such macromolecules will not spontaneously form.

Roger Caillois has recently drawn this conclusion in saying, "Clausius and Darwin cannot both be right."³ This prediction of classical thermodynamics has, however, merely set the stage for refined efforts to understand life's origin. Harold Morowitz⁴ and others have suggested that the earth is not an isolated system, since it is open to energy flow from the sun. Nevertheless, one cannot simply dismiss the problem of the origin of organization and complexity in biological systems by a vague appeal to open-system non-equilibrium thermodynamics. The mechanisms responsible for the emergence and maintenance of coherent (organized) states must be defined. To clarify the role of mass and energy flow through a system as a possible solution to this problem, we will look in turn at the thermodynamics of (1) an isolated system, (2) a closed system, and (3) an open system. We will then discuss the application of open-system thermodynamics to living systems. In Chapter 8 we will apply the thermodynamic concepts presented in this chapter to the prebiotic synthesis of DNA and protein. In Chapter 9 this theoretical analysis will be used to interpret the various prebiotic synthesis experiments for DNA and protein, suggesting a physical basis for the uniform lack of success in synthesizing these crucial components for living cells.

Isolated Systems

An isolated system is one in which neither mass nor energy flows in or out. To illustrate such a system, think of a perfectly insulated thermos bottle (no heat loss) filled initially with hot tea and ice cubes. The total energy in this isolated system remains constant but the distribution of the energy changes with time. The ice melts and the energy becomes more uniformly distributed in the system. The initial distribution of energy into hot regions (the tea) and cold regions (the ice) is an ordered, nonrandom arrangement of energy, one not likely to be maintained for very long. By our previous definition then, we may say that the entropy of the system is initially low but gradually increases with time. Furthermore, the second law of thermodynamics says the entropy of the system will continue to increase until it attains some maximum value, which corresponds to the most probable state for the system, usually called equilibrium.

In summary, isolated systems always maintain constant total energy while tending toward maximum entropy, or disorder. In mathematical terms,

ΔE / Δt = 0

(isolated system)

ΔS / Δt ≥ 0 (7-4)

where ΔE and ΔS are the changes in the system energy and system entropy respectively, for a time interval Δt. Clearly the emergence of order of any kind in an isolated system is not possible. The second law of thermodynamics says that an isolated system always moves in the direction of maximum entropy and, therefore, disorder.

It should be noted that the process just described is irreversible in the sense that once the ice is melted, it will not reform in the thermos. As a matter of fact, natural decay and the general tendency toward greater disorder are so universal that the second law of thermodynamics has been appropriately dubbed "time's arrow."⁵
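The behavior of an isolated system can be mimicked with a small simulation. The sketch below is a simplification, not the thermos example itself: it replaces the tea and ice with two bodies of equal heat capacity and no phase change, and the transfer coefficient, step count, and temperatures are all assumed values. Each step moves a little heat from the hot body to the cold one and adds the corresponding entropy change from equation 7-2.

```python
def equilibrate(t_hot, t_cold, heat_capacity=1.0, k=0.05, steps=200):
    """Simulate heat exchange between two bodies inside an isolated system.

    Returns a list of (t_hot, t_cold, cumulative_entropy_change) tuples.
    """
    entropy = 0.0
    history = [(t_hot, t_cold, entropy)]
    for _ in range(steps):
        q = k * (t_hot - t_cold)             # heat transferred this step
        entropy += -q / t_hot + q / t_cold   # loss by hot body, gain by cold body
        t_hot -= q / heat_capacity
        t_cold += q / heat_capacity
        history.append((t_hot, t_cold, entropy))
    return history

history = equilibrate(353.0, 273.0)
final_hot, final_cold, final_entropy = history[-1]
print(f"{final_hot:.2f} K, {final_cold:.2f} K, entropy gain {final_entropy:.4f}")
```

In this toy model the sum of the two temperatures (a stand-in for total energy) stays fixed, the entropy never decreases, and the temperatures converge, which is the pattern equation 7-4 describes.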

Closed Systems near Equilibrium

A closed system is one in which the exchange of energy with the outside world is permitted but the exchange of mass is not. Along the boundary between the closed system and the surroundings, the temperature may be different from the system temperature, allowing energy flow into or out of the system as it moves toward equilibrium. If the temperature along the boundary is variable (in position but not time), then energy will flow through the system, maintaining it some distance from equilibrium. We will discuss closed systems near equilibrium first, and then closed systems removed from equilibrium.

If we combine the first and second laws as expressed in equations 7-1 and 7-2 and replace the mechanical work term W by -P ΔV, where P is pressure and ΔV is the volume change, we obtain,

ΔS ≥ [ΔE + P ΔV] / T (7-5)

[NOTE: Volume expansion (ΔV > 0) corresponds to the system doing work, and therefore losing energy. Volume contraction (ΔV < 0) corresponds to work being done on the system.]

Algebraic manipulation gives

ΔE + P ΔV - T ΔS ≤ 0 or ΔG ≤ 0 (7-6)

where

ΔG = ΔE + P ΔV - T ΔS

The term on the left side of the inequality in equation 7-6 is called the change in the Gibbs free energy (ΔG). It may be thought of as a thermodynamic potential which describes the tendency of a system to change---e.g., the tendency for phase changes, heat conduction, etc. to occur. If a reaction occurs spontaneously, it is because it brings a decrease in the Gibbs free energy (ΔG ≤ 0). This requirement is equivalent to the requirement that the entropy of the universe increase. Thus, like an increase in entropy, a decrease in Gibbs free energy simply means that a system and its surroundings are changing in such a way that the energy of the universe is becoming more uniformly distributed.

We may summarize then by noting that the second law of thermodynamics requires,

ΔG / Δt ≤ 0, (closed system) (7-7)

where Δt indicates the time period during which the Gibbs free energy changed.

The approach to equilibrium is characterized by,

ΔG / Δt → 0, (closed system) (7-8)

The physical significance of equation 7-7 can be understood by rewriting equations 7-6 and 7-7 in the following form:

ΔS / Δt - (1 / T)(ΔE / Δt + P ΔV / Δt) ≥ 0 (7-9)

or

ΔS / Δt - (1 / T)(ΔH / Δt) ≥ 0

and noting that the first term represents the entropy change due to processes going on within the system and the second term represents the entropy change due to exchange of mechanical and/or thermal energy with the surroundings (here ΔH = ΔE + P ΔV is the change in enthalpy). This simply guarantees that the sum of the entropy change in the system and the entropy change in the surroundings will be greater than zero; i.e., the entropy of the universe must increase. For the isolated system, ΔE + P ΔV = 0 and equation 7-9 reduces to equation 7-4.

A simple illustration of this principle is seen in phase changes such as water transforming into ice. As ice forms, energy (80 calories/gm) is liberated to the surroundings. The change in the entropy of the system as the amorphous water becomes crystalline ice is -0.293 calories/gm per kelvin (K), i.e., 80 calories/gm divided by 273 K. The entropy change is negative because the thermal and configurational entropy (or disorder) of water is greater than that of ice, which is a highly ordered crystal.

[NOTE: Configurational entropy measures randomness in the distribution of matter in much the same way that thermal entropy measures randomness in the distribution of energy.]

Thus, the thermodynamic conditions under which water will transform to ice are seen from equation 7-9 to be:

-0.293 - (-80 / T) > 0 (7-10a)

or

T < 273 K (7-10b)

For T < 273 K, energy is removed from water to produce ice, and the aggregate disordering of the surroundings is greater than the ordering of the water into ice crystals. This gives a net increase in the entropy of the universe, as predicted by the second law of thermodynamics.

It has often been argued by analogy to water crystallizing to ice that simple monomers may polymerize into complex molecules such as protein and DNA. The analogy is clearly inappropriate, however. The ΔE + P ΔV term (equation 7-9) in the polymerization of important organic molecules is generally positive (5 to 8 kcal/mole), indicating the reaction can never spontaneously occur at or near equilibrium.

[NOTE: If ΔE + P ΔV is positive, the second entropy term in equation 7-9 must be negative due to the negative sign which precedes it. The inequality can only be satisfied by ΔS being sufficiently positive, which implies disordering.]

By contrast the ΔE + P ΔV term in water changing to ice is negative, -1.44 kcal/mole, indicating the phase change is spontaneous as long as T < 273 K, as previously noted. The atomic bonding forces draw water molecules into an orderly crystalline array when the thermal agitation (or entropy driving force, T ΔS) is made sufficiently small by lowering the temperature. Organic monomers such as amino acids resist combining at all at any temperature, however, much less in some orderly arrangement.
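The contrast between freezing and polymerization can be sketched by evaluating the left side of equation 7-9 for each case. The function name is hypothetical. The ice figures are the ones quoted above, and the polymerization case uses the low end of the 5 to 8 kcal/mole range. Setting the system entropy change for polymerization to zero is an assumption made here, and a generous one, since ordering would make it negative.

```python
def universe_entropy_change(ds_system, dh_system, temperature):
    """Left side of equation 7-9 per unit of process: ΔS - ΔH / T.

    dh_system stands for ΔE + P ΔV. A positive result means the process
    can proceed spontaneously at the given absolute temperature.
    """
    if temperature <= 0:
        raise ValueError("temperature must be absolute and positive")
    return ds_system - dh_system / temperature

# Water -> ice, per gram, using the values quoted in the text.
ICE_DS = -0.293   # cal/(g K)
ICE_DH = -80.0    # cal/g

below = universe_entropy_change(ICE_DS, ICE_DH, 263.0)   # 10 K below freezing
above = universe_entropy_change(ICE_DS, ICE_DH, 283.0)   # 10 K above freezing

# Polymerization, per mole: ΔH = +5000 cal/mol, ΔS assumed to be 0 (generous).
polymer = [universe_entropy_change(0.0, 5000.0, t) for t in (273.0, 298.0, 373.0)]

print(below > 0, above > 0, [x > 0 for x in polymer])
```

Under these assumptions the ice case changes sign near 273 K, while the polymerization case stays negative at every temperature tried, because lowering T cannot rescue a process whose ΔE + P ΔV is positive.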

Morowitz⁶ has estimated the increase in the chemical bonding energy as one forms the bacterium Escherichia coli from simple precursors to be 0.0095 erg, or an average of 0.27 eV/atom for the 2 x 10¹⁰ atoms in a single bacterial cell. This would be thermodynamically equivalent to having water in your bathtub spontaneously heat up to 360°C, happily a most unlikely event. He goes on to estimate the probability of the spontaneous formation of one such bacterium in the entire universe in five billion years under equilibrium conditions to be 10^(-10^11). Morowitz summarizes the significance of this result by saying that "if equilibrium processes alone were at work, the largest possible fluctuation in the history of the universe is likely to have been no longer than a small peptide."⁷ Nobel Laureate I. Prigogine et al. have noted with reference to the same problem that:

The probability that at ordinary temperatures a macroscopic number of molecules is assembled to give rise to the highly ordered structures and to the coordinated functions characterizing living organisms is vanishingly small. The idea of spontaneous genesis of life in its present form is therefore highly improbable, even on the scale of billions of years during which prebiotic evolution occurred.⁸

It seems safe to conclude that systems near equilibrium (whether isolated or closed) can never produce the degree of complexity intrinsic in living systems. Instead, they will move spontaneously toward maximizing entropy, or randomness. Even the postulate of long time periods does not solve the problem, as "time's arrow" (the second law of thermodynamics) points in the wrong direction; i.e., toward equilibrium. In this regard, H.F. Blum has observed:

The second law of thermodynamics would have been a dominant directing factor in this case [of chemical evolution]; the reactions involved tending always toward equilibrium, that is, toward less free energy, and, in an inclusive sense, greater entropy. From this point of view the lavish amount of time available should only have provided opportunity for movement in the direction of equilibrium.⁹ (Emphasis added.)

Thus, reversing "time's arrow" is what chemical evolution is all about, and this will not occur in isolated or closed systems near equilibrium.

The possibilities are potentially more promising, however, if one considers a system subjected to energy flow which may maintain it far from equilibrium, and its associated disorder. Such a system is said to be a constrained system, in contrast to a system at or near equilibrium which is unconstrained. The possibilities for ordering in such a system will be considered next.

Closed Systems Far from Equilibrium

Energy flow through a system is equivalent to doing work continuously on the system to maintain it some distance from equilibrium. Nicolis and Prigogine¹⁰ have suggested that the entropy change (ΔS) in a system for a time interval (Δt) may be divided into two components,

ΔS = ΔS_e + ΔS_i (7-11)

where ΔS_e is the entropy flux due to energy flow through the system, and ΔS_i is the entropy production inside the system due to irreversible processes such as diffusion, heat conduction, heat production, and chemical reactions. We will note when we discuss open systems in the next section that ΔS_e includes the entropy flux due to mass flow through the system as well. The second law of thermodynamics requires,

ΔS_i ≥ 0 (7-12)

In an isolated system, ΔS_e = 0 and equations 7-11 and 7-12 give,

ΔS = ΔS_i ≥ 0 (7-13)

Unlike ΔS_i, ΔS_e in a closed system does not have a definite sign, but depends entirely on the boundary constraints imposed on the system. The total entropy change in the system can be negative (i.e., ordering within system) when,

ΔS_e ≤ 0 and |ΔS_e| > ΔS_i (7-14)

Under such conditions a state that would normally be highly improbable under equilibrium conditions can be maintained indefinitely. It would be highly unlikely (i.e., statistically just short of impossible) for a disconnected water heater to produce hot water. Yet when the gas is connected and the burner lit, the system is constrained by energy flow and hot water is produced and maintained indefinitely as long as energy flows through the system.
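The entropy balance for a system constrained by energy flow can be written as a short check. The function names and the sample numbers below are assumptions for illustration. The logic follows the decomposition of the system entropy change into an exchange term and an internal production term, with the second law applied only to the internal term.

```python
def system_entropy_change(ds_exchange, ds_internal):
    """Total entropy change of the system: exchange term plus internal production."""
    if ds_internal < 0:
        raise ValueError("internal entropy production cannot be negative")
    return ds_exchange + ds_internal

def can_order(ds_exchange, ds_internal):
    """True when the exchange term is negative and outweighs internal production."""
    return ds_exchange <= 0 and abs(ds_exchange) > ds_internal

# Isolated system: no exchange, so the total equals the internal production.
isolated = system_entropy_change(0.0, 2.0)

# Constrained system: exchange term large enough to give net ordering.
constrained = system_entropy_change(-3.0, 2.0)

print(isolated, constrained, can_order(-3.0, 2.0), can_order(-1.0, 2.0))
# 2.0 -1.0 True False
```

The check says nothing about how the exchange term is produced. It only records when the arithmetic permits the system entropy to fall while internal production remains non-negative.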

An open system offers an additional possibility for ordering---that of maintaining a system far from equilibrium via mass flow through the system, as will be discussed in the next section.

Open Systems

An open system is one which exchanges both energy and mass with the surroundings. It is well illustrated by the familiar internal combustion engine. Gasoline and oxygen are passed through the system, combusted, and then released as carbon dioxide and water. The energy released by this mass flow through the system is converted into useful work; namely, torque supplied to the wheels of the automobile. A coupling mechanism is necessary, however, to allow the released energy to be converted into a particular kind of work. In an analogous way the dissipative (or disordering) processes within an open system can be offset by a steady supply of energy to provide for ΔS_e-type work. Equation 7-11, applied earlier to closed systems far from equilibrium, may also be applied to open systems. In this case, the ΔS_e term represents the negative entropy, or organizing work done on the system as a result of both energy and mass flow through the system. This work done on the system can move it far from equilibrium, maintaining it there as long as the mass and/or energy flow are not interrupted. This is an essential characteristic of living systems as will be seen in what follows.

Thermodynamics of Living Systems

Living systems are composed of complex molecular configurations whose total bonding energy is less negative than that of their chemical precursors (e.g., Morowitz's estimate of ΔE = 0.27 eV/atom) and whose thermal and configurational entropies are also less than that of their chemical precursors. Thus, the Gibbs free energy of living systems (see equation 7-6) is quite high relative to the simple compounds from which they are formed. The formation and maintenance of living systems at energy levels well removed from equilibrium requires continuous work to be done on the system, even as maintenance of hot water in a water heater requires that continuous work be done on the system. Securing this continuous work requires energy and/or mass flow through the system, apart from which the system will return to an equilibrium condition (lowest Gibbs free energy, see equations 7-7 and 7-8) with the decomposition of complex molecules into simple ones, just as the hot water in our water heater returns to room temperature once the gas is shut off.

In living plants, the energy flow through the system is supplied principally by solar radiation. In fact, leaves provide relatively large surface areas per unit volume for most plants, allowing them to "capture" the necessary solar energy to maintain themselves far from equilibrium. This solar energy is converted into the necessary useful work (negative ΔS_e in equation 7-11) to maintain the plant in its complex, high-energy configuration by a complicated process called photosynthesis. Mass, such as water and carbon dioxide, also flows through plants, providing necessary raw materials, but not energy. In collecting and storing useful energy, plants serve the entire biological world.

For animals, energy flow through the system is provided by eating high energy biomass, either plant or animal. The breaking down of this energy-rich biomass, and the subsequent oxidation of part of it (e.g., carbohydrates), provides a continuous source of energy as well as raw materials. If plants are deprived of sunlight or animals of food, dissipation within the system will surely bring death. Maintenance of the complex, high-energy condition associated with life is not possible apart from a continuous source of energy. A source of energy alone is not sufficient, however, to explain the origin or maintenance of living systems. The additional crucial factor is a means of converting this energy into the necessary useful work to build and maintain complex living systems from the simple biomonomers that constitute their molecular building blocks.

An automobile with an internal combustion engine, transmission, and drive chain provides the necessary mechanism for converting the energy in gasoline into comfortable transportation. Without such an "energy converter," however, obtaining transportation from gasoline would be impossible. In a similar way, food would do little for a man whose stomach, intestines, liver, or pancreas were removed. Without these, he would surely die even though he continued to eat. Apart from a mechanism to couple the available energy to the necessary work, high-energy biomass is insufficient to sustain a living system far from equilibrium. In the case of living systems such a coupling mechanism channels the energy along specific chemical pathways to accomplish a very specific type of work. We therefore conclude that, given the availability of energy and an appropriate coupling mechanism, the maintenance of a living system far from equilibrium presents no thermodynamic problems.

In mathematical formalism, these concepts may be summarized as follows:

(1) The second law of thermodynamics requires only that the entropy production due to irreversible processes within the system be greater than zero; i.e.,

ΔS_i > 0 (7-15)

(2) The maintenance of living systems requires that the energy flow through the system be of sufficient magnitude that the negative entropy production rate (i.e., useful work rate) that results be greater than the rate of dissipation that results from irreversible processes going on within the systems; i.e.,

|ΔS_e| > ΔS_i (7-16)

(3) The negative entropy generation must be coupled into the system in such a way that the resultant work done is directed toward restoration of the system from the disintegration that occurs naturally and is described by the second law of thermodynamics; i.e.,

-ΔS_e = ΔS_i (7-17)

where ΔS_e and ΔS_i refer not only to the magnitude of entropy change but also to the specific changes that occur in the system associated with this change in entropy. The coupling must produce not just any kind of ordering but the specific kind required by the system.

While the maintenance of living systems is easily rationalized in terms of thermodynamics, the origin of such living systems is quite another matter. Though the earth is open to energy flow from the sun, the means of converting this energy into the necessary work to build up living systems from simple precursors remains at present unspecified (see equation 7-17). The "evolution" from biomonomers to fully functioning cells is the issue. Can one make the incredible jump in energy and organization from raw material and raw energy, apart from some means of directing the energy flow through the system? In Chapters 8 and 9 we will consider this question, limiting our discussion to two small but crucial steps in the proposed evolutionary scheme, namely, the formation of protein and DNA from their precursors.

It is widely agreed that both protein and DNA are essential for living systems and indispensable components of every living cell today.¹¹ Yet they are only produced by living cells. Both types of molecules are much more energy and information rich than the biomonomers from which they form. Can one reasonably predict their occurrence given the necessary biomonomers and an energy source? Has this been verified experimentally? These questions will be considered in Chapters 8 and 9.

References

1. Victor F. Weisskopf, 1977. Amer. Sci. 65, 405-11.

2. R. Clausius, 1855. Ann. Phys. 125, 358.

3. R. Caillois, 1976. Coherences Aventureuses. Paris: Gallimard.

4. H.J. Morowitz, 1968. Energy Flow in Biology. New York: Academic Press, p.2-3.

5. H.F. Blum, 1951. Time's Arrow and Evolution. Princeton: Princeton University Press.

6. H.J. Morowitz, Energy Flow, p.66.

7. H.J. Morowitz, Energy Flow, p.68.

8. I. Prigogine, G. Nicolis, and A. Babloyantz, November, 1972. Physics Today, p.23.

9. H.F. Blum, 1955. American Scientist 43, 595.

10. G. Nicolis and I. Prigogine, 1977. Self-Organization in Nonequilibrium Systems. New York: John Wiley, p.24.

11. S.L. Miller and L.E. Orgel, 1974. The Origins of Life on the Earth. Englewood Cliffs, New Jersey: Prentice-Hall, p.162-3.

Thermodynamics and the Origin of Life

Peter Molton has defined life as "regions of order which use energy to maintain their organization against the disruptive force of entropy."¹ In Chapter 7 it has been shown that energy and/or mass flow through a system can constrain it far from equilibrium, resulting in an increase in order. Thus, it is thermodynamically possible to develop complex living forms, assuming the energy flow through the system can somehow be effective in organizing the simple chemicals into the complex arrangements associated with life.

In existing living systems, the coupling of the energy flow to the organizing "work" occurs through the metabolic motor of DNA, enzymes, etc. This is analogous to an automobile converting the chemical energy in gasoline into mechanical torque on the wheels. We can give a thermodynamic account of how life's metabolic motor works. The origin of the metabolic motor (DNA, enzymes, etc.) itself, however, is more difficult to explain thermodynamically, since a mechanism of coupling the energy flow to the organizing work is unknown for prebiological systems. Nicolis and Prigogine summarize the problem in this way:

Needless to say, these simple remarks cannot suffice to solve the problem of biological order. One would like not only to establish that the second law (dS_i ≥ 0) is compatible with a decrease in overall entropy (dS < 0), but also to indicate the mechanisms responsible for the emergence and maintenance of coherent states.²

Without a doubt, the atoms and molecules which comprise living cells individually obey the laws of chemistry and physics, including the laws of thermodynamics. The enigma is the origin of so unlikely an organization of these atoms and molecules. The electronic computer provides a striking analogy to the living cell. Each component in a computer obeys the laws of electronics and mechanics. The key to the computer's marvel lies, however, in the highly unlikely organization of the parts which harness the laws of electronics and mechanics. In the computer, this organization was specially arranged by the designers and builders and continues to operate (with occasional frustrating lapses) through the periodic maintenance of service engineers.

Living systems have even greater organization. The problem then, that molecular biologists and theoretical physicists are addressing, is how the organization of living systems could have arisen spontaneously. Prigogine et al., have noted:

All these features bring the scientist a wealth of new problems. In the first place, one has systems that have evolved spontaneously to extremely organized and complex forms. Coherent behavior is really the characteristic feature of biological systems.³

In this chapter we will consider only the problem of the origin of living systems. Specifically, we will discuss the arduous task of using simple biomonomers to construct complex polymers such as DNA and protein by means of thermal, electrical, chemical, or solar energy. We will first specify the nature and magnitude of the "work" to be done in building DNA and enzymes.

[NOTE: Work in physics normally refers to force times displacement. In this chapter it refers in a more general way to the change in Gibbs free energy of the system that accompanies the polymerization of monomers into polymers].

In Chapter 9 we will describe the various theoretical models which attempt to explain how the undirected flow of energy through simple chemicals can accomplish the work necessary to produce complex polymers. Then we will review the experimental studies that have been conducted to test these models. Finally we will summarize the current understanding of this subject.

How can we specify in a more precise way the work to be done by energy flow through the system to synthesize DNA and protein from simple biomonomers? While the origin of living systems involves more than the genesis of enzymes and DNA, these components are essential to any system if replication is to occur. It is generally agreed that natural selection can act only on systems capable of replication. This being the case, the formation of a DNA/enzyme system by processes other than natural selection is a necessary (though not sufficient) part of a naturalistic explanation for the origin of life.

[NOTE: A sufficient explanation for the origin of life would also require a model for the formation of other critical cellular components, including membranes, and their assembly].

Order vs. Complexity in the Question of Information

Only recently has it been appreciated that the distinguishing feature of living systems is complexity rather than order.⁴ This distinction has come from the observation that the essential ingredients for a replicating system---enzymes and nucleic acids---are all information-bearing molecules. In contrast, consider crystals. They are very orderly, spatially periodic arrangements of atoms (or molecules) but they carry very little information. Nylon is another example of an orderly, periodic polymer (a polyamide) which carries little information. Nucleic acids and protein are aperiodic polymers, and this aperiodicity is what makes them able to carry much more information. By definition then, a periodic structure has order. An aperiodic structure has complexity. In terms of information, periodic polymers (like nylon) and crystals are analogous to a book in which the same sentence is repeated throughout. The arrangement of "letters" in the book is highly ordered, but the book contains little information since the information presented---the single word or sentence---is highly redundant.

It should be noted that aperiodic polypeptides or polynucleotides do not necessarily represent meaningful information or biologically useful functions. A random arrangement of letters in a book is aperiodic but contains little if any useful information since it is devoid of meaning.

[NOTE: H.P. Yockey, personal communication, 9/29/82. Meaning is extraneous to the sequence, arbitrary, and depends on some symbol convention. For example, the word "gift," which in English means a present and in German poison, in French is meaningless].

Only certain sequences of letters correspond to sentences, and only certain sequences of sentences correspond to paragraphs, etc. In the same way only certain sequences of amino acids in polypeptides and bases along polynucleotide chains correspond to useful biological functions. Thus, informational macromolecules may be described as being aperiodic and in a specified sequence.⁵ Orgel notes:

Living organisms are distinguished by their specified complexity. Crystals fail to qualify as living because they lack complexity; mixtures of random polymers fail to qualify because they lack specificity.⁶

Three sets of letter arrangements show nicely the difference between order and complexity in relation to information:

1. An ordered (periodic) and therefore specified arrangement:

THE END THE END THE END THE END

Example: Nylon, or a crystal.

[NOTE: Here we use "THE END" even though there is no reason to suspect that nylon or a crystal would carry even this much information. Our point, of course, is that even if they did, the bit of information would be drowned in a sea of redundancy].

2. A complex (aperiodic) unspecified arrangement:

AGDCBFE GBCAFED ACEDFBG

Example: Random polymers (polypeptides).

3. A complex (aperiodic) specified arrangement:

THIS SEQUENCE OF LETTERS CONTAINS A MESSAGE!

Example: DNA, protein.

Yockey⁷ and Wickens⁸ develop the same distinction, that "order" is a statistical concept referring to regularity such as might characterize a series of digits in a number, or the ions of an inorganic crystal. On the other hand, "organization" refers to physical systems and the specific set of spatio-temporal and functional relationships among their parts. Yockey and Wickens note that informational macromolecules have a low degree of order but a high degree of specified complexity. In short, the redundant order of crystals cannot give rise to specified complexity of the kind or magnitude found in biological organization; attempts to relate the two have little future.

Information and Entropy

There is a general relationship between information and entropy. This is fortunate because it allows an analysis to be developed in the formalism of classical thermodynamics, giving us a powerful tool for calculating the work to be done by energy flow through the system to synthesize protein and DNA (if indeed energy flow is capable of producing information). The information content in a given sequence of units, be they digits in a number, letters in a sentence, or amino acids in a polypeptide or protein, depends on the minimum number of instructions needed to specify or describe the structure. Many instructions are needed to specify a complex, information-bearing structure such as DNA. Only a few instructions are needed to specify an ordered structure such as a crystal. In this case we have a description of the initial sequence or unit arrangement which is then repeated ad infinitum according to the packing instructions.

Orgel⁹ illustrates the concept in the following way. To describe a crystal, one would need only to specify the substance to be used and the way in which the molecules were to be packed together. A couple of sentences would suffice, followed by the instructions "and keep on doing the same," since the packing sequence in a crystal is regular. The description would be about as brief as specifying a DNA-like polynucleotide with a random sequence. Here one would need only to specify the proportions of the four nucleotides in the final product, along with instructions to assemble them randomly. The chemist could then make the polymer with the proper composition but with a random sequence.

It would be quite impossible to produce a correspondingly simple set of instructions that would enable a chemist to synthesize the DNA of an E. coli bacterium. In this case the sequence matters. Only by specifying the sequence letter-by-letter (about 4,000,000 instructions) could we tell a chemist what to make. Our instructions would occupy not a few short sentences, but a large book instead!
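The contrast between a short repeating description and a letter-by-letter description can be illustrated with a rough sketch. It assumes, purely for illustration, that the size of a zlib-compressed string is a usable stand-in for the "minimum number of instructions"; the chain length and the names used are hypothetical. The sketch shows only the difference between a periodic and an aperiodic sequence. It says nothing about whether an aperiodic sequence is specified, since a meaningless random sequence is just as costly to spell out letter by letter.

```python
import random
import zlib


def compressed_size(text: str) -> int:
    """Bytes zlib needs for the text: a rough stand-in for description length."""
    return len(zlib.compress(text.encode("ascii"), 9))


random.seed(0)  # fixed seed so repeated runs give the same sequence
LENGTH = 4000   # hypothetical chain length, chosen only for illustration

# Periodic "polymer": the same unit repeated, as in a crystal or nylon.
periodic = ("THE END " * (LENGTH // 8))[:LENGTH]
# Aperiodic "polymer": one particular sequence over a four-letter alphabet.
aperiodic = "".join(random.choice("ACGT") for _ in range(LENGTH))

periodic_size = compressed_size(periodic)
aperiodic_size = compressed_size(aperiodic)
print("periodic:", periodic_size, "bytes; aperiodic:", aperiodic_size, "bytes")
```

The periodic string collapses to a few tens of bytes ("and keep on doing the same"), while the aperiodic string cannot be shortened much below two bits per letter.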

Brillouin,¹⁰ Schrödinger,¹¹ and others¹² have developed both qualitative and quantitative relationships between information and entropy. Brillouin¹³ states that the entropy of a system is given by

S = k ln Ω (8-1)

where S is the entropy of the system, k is Boltzmann's constant, and Ω corresponds to the number of ways the energy and mass in a system may be arranged.

We will use S_th and S_c to refer to the thermal and configurational entropies, respectively. Thermal entropy, S_th, is associated with the distribution of energy in the system. Configurational entropy S_c is concerned only with the arrangement of mass in the system, and, for our purposes, we shall be especially interested in the sequencing of amino acids in polypeptides (or proteins) or of nucleotides in polynucleotides (e.g., DNA). The symbols Ω_th and Ω_c refer to the number of ways energy and mass, respectively, may be arranged in a system.

Thus we may be more precise by writing

S = k ln (Ω_th Ω_c) = k ln Ω_th + k ln Ω_c = S_th + S_c (8-2a)

where

S_th = k ln Ω_th (8-2b)

and

S_c = k ln Ω_c (8-2c)

Determining Information: From a Random Polymer to an Informed Polymer

If we want to convert a random polymer into an informational molecule, we can determine the increase in information (as defined by Brillouin) by finding the difference between the negatives of the entropy states for the initial random polymer and the informational molecule:

I = - (S_cm - S_cr) (8-3a)

I = S_cr - S_cm (8-3b)

= k ln Ω_cr - k ln Ω_cm (8-3c)

In this equation, I is a measure of the information content of an aperiodic (complex) polymer with a specified sequence, S_cm represents the configurational "coding" entropy of this polymer informed with a given message, and S_cr represents the configurational entropy of the same polymer for an unspecified or random sequence.

[NOTE: Yockey and Wickens define information slightly differently than Brillouin, whose definition we use in our analysis. The difference is unimportant insofar as our analysis here is concerned].

Note that the information in a sequence-specified polymer is maximized when the mass in the molecule could be arranged in many different ways, only one of which communicates the intended message. (There is a large S_cr from eq. 8-2c since Ω_cr is large, yet S_cm = 0 from eq. 8-2c since Ω_cm = 1.) The information carried in a crystal is small because S_c is small (eq. 8-2c) for a crystal. There simply is very little potential for information in a crystal because its matter can be distributed in so few ways. The random polymer provides an even starker contrast. It bears no information because S_cr, although large, is equal to S_cm (see eq. 8-3b).
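The three cases just described (specified polymer, crystal, random polymer) can be compared numerically with a minimal sketch of eq. 8-3c. The arrangement counts below are hypothetical values chosen only to make the contrast visible; they are not taken from any measurement.

```python
import math

K_BOLTZMANN = 1.38e-16  # erg/deg, the value used in this chapter


def information(omega_random: float, omega_message: float) -> float:
    """I = k ln(omega_cr) - k ln(omega_cm), following eq. 8-3c."""
    return K_BOLTZMANN * (math.log(omega_random) - math.log(omega_message))


# Hypothetical arrangement counts, for illustration only.
specified = information(omega_random=1e100, omega_message=1.0)      # one sequence carries the message
random_polymer = information(omega_random=1e100, omega_message=1e100)  # any sequence will do
crystal = information(omega_random=2.0, omega_message=1.0)          # very few arrangements exist

print(specified, crystal, random_polymer)
```

With these assumed counts the specified polymer carries by far the most information, the crystal very little, and the random polymer none.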

In summary, equations 8-2c and 8-3c quantify the notion that only specified, aperiodic macromolecules are capable of carrying the large amounts of information characteristic of living systems. Later we will calculate "Ω_c" for both random and specified polymers so that the configurational entropy change required to go from a random to a specified polymer can be determined. In the next section we will consider the various components of the total work required in the formation of macromolecules such as DNA and protein.

DNA and Protein Formation: Defining the Work

There are three distinct components of work to be done in assembling simple biomonomers into a complex (or aperiodic) linear polymer with a specified sequence as we find in DNA or protein. The change in the Gibbs free energy, ΔG, of the system during polymerization defines the total work that must be accomplished by energy flow through the system. The change in Gibbs free energy has previously been shown to be

ΔG = ΔE + P ΔV - T ΔS (8-4a)

ΔG = ΔH - T ΔS (8-4b)

where a decrease in Gibbs free energy for a given chemical reaction near equilibrium guarantees an increase in the entropy of the universe as demanded by the second law of thermodynamics.

Now consider the components of the Gibbs free energy (eq. 8-4b) where the change in enthalpy (ΔH) is principally the result of changes in the total bonding energy (ΔE), with the (P ΔV) term assumed to be negligible. We will refer to this enthalpy component (ΔH) as the chemical work. A further distinction will be helpful. The change in the entropy (ΔS) that accompanies the polymerization reaction may be divided into two distinct components which correspond to the changes in the thermal energy distribution (ΔS_th) and the mass distribution (ΔS_c), eq. 8-2. So we can rewrite eq. 8-4b as

ΔG = ΔH - T ΔS_th - T ΔS_c (8-5)

that is,

(Gibbs free energy) = (Chemical work) - (Thermal entropy work) - (Configurational entropy work)

It will be shown that polymerization of macromolecules results in a decrease in the thermal and configurational entropies (ΔS_th < 0, ΔS_c < 0). These terms effectively increase ΔG, and thus represent additional components of work to be done beyond the chemical work.

Consider the case of the formation of protein or DNA from biomonomers in a chemical soup. For computational purposes it may be thought of as requiring two steps: (1) polymerization to form a chain molecule with an aperiodic but near-random sequence, and (2) rearrangement to an aperiodic, specified information-bearing sequence.

[NOTE: Some intersymbol influence arising from differential atomic bonding properties makes the distribution of matter not quite random. (H.P. Yockey, 1981. J. Theoret. Biol. 91,13)].

The entropy change (ΔS) associated with the first step is essentially all thermal entropy change (ΔS_th), as discussed above. The entropy change of the second step is essentially all configurational entropy-reducing change (ΔS_c). In fact, as previously noted, the change in configurational entropy (ΔS_c) = ΔS_c "coding" as one goes from a random arrangement (S_cr) to a specified sequence (S_cm) in a macromolecule is numerically equal to the negative of the information content of the molecule as defined by Brillouin (see eq. 8-3a).

In summary, the formation of complex biological polymers such as DNA and protein involves changes in the chemical energy, ΔH, the thermal entropy, ΔS_th, and the configurational entropy, ΔS_c, of the system. Determining the magnitudes of these individual changes using experimental data and a few calculations will allow us to quantify the magnitude of the required work potentially to be done by energy flow through the system in synthesizing macromolecules such as DNA and protein.

Quantifying the Various Components of Work

1. Chemical Work

The polymerization of amino acids to polypeptides (protein) or of nucleotides to polynucleotides (DNA) occurs through condensation reactions. One may calculate the enthalpy change in the formation of a dipeptide from amino acids to be 5-8 kcal/mole for a variety of amino acids, using data compiled by Hutchens.¹⁴ Thus, chemical work must be done on the system to get polymerization to occur. Morowitz¹⁵ has estimated more generally that the chemical work, or average increase in enthalpy, for macromolecule formation in living systems is 16.4 cal/gm. Elsewhere in the same book he says that the average increase in bonding energy in going from simple compounds to an E. coli bacterium is 0.27 eV/atom. One can easily see that chemical work must be done on the biomonomers to bring about the formation of macromolecules like those that are essential to living systems. By contrast, amino acid formation from simple reducing atmosphere gases (methane, ammonia, water) has an associated enthalpy change (ΔH) of -50 kcal/mole to -250 kcal/mole,¹⁶ which means energy is released rather than consumed. This explains why amino acids form with relative ease in prebiotic simulation experiments. On the other hand, forming amino acids from less-reducing conditions (i.e., carbon dioxide, nitrogen, and water) is known to be far more difficult experimentally. This is because the enthalpy change (ΔH) is positive, meaning energy is required to drive the energetically unfavorable chemical reaction forward.

2. Thermal Entropy Work

Wickens¹⁷ has noted that polymerization reactions will reduce the number of ways the translational energy may be distributed, while generally increasing the possibilities for vibrational and rotational energy. A net decrease results in the number of ways the thermal energy may be distributed, giving a decrease in the thermal entropy according to eq. 8-2b (i.e., ΔS_th < 0). Quantifying the magnitude of this decrease in thermal entropy (ΔS_th) associated with the formation of a polypeptide or a polynucleotide is best accomplished using experimental results.

Morowitz¹⁸ has estimated that the average decrease in thermal entropy that occurs during the formation of macromolecules of living systems is 0.218 cal/deg-gm or 65 cal/gm at 298 K. Recent work by Armstrong et al.¹⁹ for nucleotide oligomerization of up to a pentamer indicates ΔH and -T ΔS_th values of 11.8 kcal/mole and 15.6 kcal/mole, respectively, at 294 K. Thus the decrease in thermal entropy during the polymerization of the macromolecules of life increases the Gibbs free energy and the work required to make these molecules, i.e., -T ΔS_th > 0.

3. Configurational Entropy Work

Finally, we need to quantify the configurational entropy change (ΔS_c) that accompanies the formation of DNA and protein. Here we will not get much help from standard experiments in which the equilibrium constants are determined for a polymerization reaction at various temperatures. Such experiments do not consider whether a specific sequence is achieved in the resultant polymers, but only the concentrations of randomly sequenced polymers (i.e., polypeptides) formed. Consequently, they do not measure the configurational entropy (ΔS_c) contribution to the total entropy change (ΔS). However, the magnitude of the configurational entropy change associated with sequencing the polymers can be calculated.

Using the definition for configurational "coding" entropy given in eq. 8-2c, it is quite straightforward to calculate the configurational entropy change for a given polymer. The number of ways the mass of the linear system may be arranged (Ω_c) can be calculated using statistics. Brillouin²⁰ has shown that the number of distinct sequences one can make using N different symbols and Fermi-Dirac statistics is given by

Ω = N! (8-6)

If some of these symbols are redundant (or identical), then the number of unique or distinguishable sequences that can be made is reduced to

Ω_c = N! / (n₁! n₂! n₃! ... n_i!) (8-7)

where n₁ + n₂ + ... + n_i = N and i defines the number of distinct symbols. For a protein, i = 20, since a subset of twenty distinctive types of amino acids is found in living things, while in DNA it is i = 4 for the subset of four distinctive nucleotides. A typical protein would have 100 to 300 amino acids in a specific sequence, or N = 100 to 300. For DNA of the bacterium E. coli, N = 4,000,000. In Appendix 1, alternative approaches to calculating Ω_c are considered and eq. 8-7 is shown to be a lower bound to the actual value.
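Equations 8-6 and 8-7 can be checked on cases small enough to count by hand. The sketch below is an illustration only; the function name is hypothetical, and the use of the log-gamma function (so that the same routine also works for very large N without overflow) is our choice rather than anything prescribed by the text.

```python
import math


def ln_omega_c(counts):
    """ln of N! / (n1! n2! ... ni!), eq. 8-7, using log-gamma to avoid overflow."""
    n_total = sum(counts)
    return math.lgamma(n_total + 1) - sum(math.lgamma(n + 1) for n in counts)


# Redundant symbols: the letters A, A, B can be ordered in 3!/(2! 1!) = 3 ways.
three_letters = round(math.exp(ln_omega_c([2, 1])))

# All symbols distinct (eq. 8-6): A, B, C, D give 4! = 24 orderings.
four_distinct = round(math.exp(ln_omega_c([1, 1, 1, 1])))

# Two pairs: A, A, B, B give 4!/(2! 2!) = 6 orderings.
two_pairs = round(math.exp(ln_omega_c([2, 2])))

print(three_letters, four_distinct, two_pairs)  # prints: 3 24 6
```

Working with ln Ω_c directly is convenient because it is ln Ω_c, not Ω_c itself, that enters eq. 8-2c.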

For a random polypeptide of 100 amino acids, the configurational entropy, S_cr, may be calculated using eq. 8-2c and eq. 8-7 as follows:

S_cr = k ln Ω_cr

since Ω_cr = N! / (n₁! n₂! ... n₂₀!) = 100! / (5! 5! ... 5!) = 100! / (5!)²⁰

= 1.28 x 10¹¹⁵ (8-8)

The calculation of equation 8-8 assumes that an equal number of each type of amino acid, namely 5, are contained in the polypeptide. Since k, or Boltzmann's constant, equals 1.38 x 10^-16 erg/deg, and ln [1.28 x 10¹¹⁵] = 265,

S_cr = 1.38 x 10^-16 x 265 = 3.66 x 10^-14 erg/deg-polypeptide

If only one specific sequence of amino acids could give the proper function, then the configurational entropy for the protein or specified, aperiodic polypeptide would be given by

S_cm = k ln Ω_cm
= k ln 1
= 0 (8-9)

Determining ΔS_c in Going from a Random Polymer to an Informed Polymer

The change in configurational entropy, ΔS_c, as one goes from a random polypeptide of 100 amino acids with an equal number of each amino acid type to a polypeptide with a specific message or sequence is:

ΔS_c = S_cm - S_cr

= 0 - 3.66 x 10^-14 erg/deg-polypeptide
= -3.66 x 10^-14 erg/deg-polypeptide (8-10)

The configurational entropy work (-T ΔS_c) at ambient temperatures is given by

-T ΔS_c = - (298 K) x (-3.66 x 10^-14) erg/deg-polypeptide
= 1.1 x 10^-11 erg/polypeptide
= 1.1 x 10^-11 erg/polypeptide x [6.023 x 10²³ molecules/mole] / [10,000 gms/mole] x [1 cal] / [4.184 x 10⁷ ergs]

= 15.8 cal/gm (8-11)

where the protein mass of 10,000 amu was estimated by assuming an average amino acid weight of 100 amu after the removal of the water molecule. Determination of the configurational entropy work for a protein containing 300 amino acids equally divided among the twenty types gives a similar result of 16.8 cal/gm.
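The chain of steps from eq. 8-7 through eq. 8-11 can be retraced with a short sketch. It uses the constants quoted in the text and the same assumptions (equal numbers of each of twenty amino acid types, 100 amu per residue, S_cm = 0); the function and variable names are hypothetical. A direct evaluation of the factorials gives ln Ω_cr of about 268 for the 100-residue case, slightly above the rounded 265 used above, so the results agree with the quoted figures only to within a few percent.

```python
import math

K_BOLTZMANN = 1.38e-16   # erg/deg
AVOGADRO = 6.023e23      # molecules/mole
ERG_PER_CAL = 4.184e7    # ergs per calorie
TEMPERATURE = 298.0      # K


def coding_work_cal_per_gm(n_types, n_each, monomer_mass_amu):
    """-T * dS_c in cal/gm for a chain with n_each copies of each of n_types monomers."""
    n_total = n_types * n_each
    # ln(omega_cr) from eq. 8-7; the specified sequence has omega_cm = 1, so S_cm = 0.
    ln_omega_cr = math.lgamma(n_total + 1) - n_types * math.lgamma(n_each + 1)
    work_erg_per_molecule = K_BOLTZMANN * TEMPERATURE * ln_omega_cr
    grams_per_mole = n_total * monomer_mass_amu
    return work_erg_per_molecule * AVOGADRO / grams_per_mole / ERG_PER_CAL


protein_100 = coding_work_cal_per_gm(n_types=20, n_each=5, monomer_mass_amu=100.0)
protein_300 = coding_work_cal_per_gm(n_types=20, n_each=15, monomer_mass_amu=100.0)
print(round(protein_100, 1), round(protein_300, 1))
```

The work per gram changes only slightly between 100 and 300 residues because ln Ω_cr grows nearly in proportion to chain length, as does the mass.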

In like manner the configurational entropy work for a DNA molecule such as for E. coli bacterium may be calculated assuming 4 x 10⁶ nucleotides in the chain with 1 x 10⁶ each of the four distinctive nucleotides, each distinguished by the type of base attached, and each nucleotide assumed to have an average mass of 339 amu. At 298 K:

-T ΔS_c = -T (S_cm - S_cr)

= T (S_cr - S_cm)

= kT (ln Ω_cr - ln Ω_cm)

= kT ln [(4 x 10⁶)! / ((10⁶)! (10⁶)! (10⁶)! (10⁶)!)] - kT ln 1

= 2.26 x 10^-7 erg/polynucleotide

= 2.39 cal/gm (8-12)

It is interesting to note that, while the work to code the DNA molecule with 4 million nucleotides is much greater than the work required to code a protein of 100 amino acids (2.26 x 10^-7 erg/DNA vs. 1.10 x 10^-11 erg/protein), the work per gram to code such molecules is actually less in DNA. There are two reasons for this perhaps unexpected result: first, the nucleotide is more massive than the amino acid (339 amu vs. 100 amu); and second, the alphabet is more limited, with only four useful nucleotide "letters" as compared to twenty useful amino acid letters. Nevertheless, it is the total work that is important, which means that synthesizing DNA is much more difficult than synthesizing protein.
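The DNA figures of eq. 8-12 can be retraced in the same way. The sketch below is an illustration under the assumptions stated in the text (4 x 10⁶ nucleotides, 10⁶ of each type, 339 amu per nucleotide, 298 K); the variable names are hypothetical. It also shows that ln Ω_cr lies just below N ln 4, the value for N free choices among four letters.

```python
import math

K_BOLTZMANN = 1.38e-16   # erg/deg
AVOGADRO = 6.023e23      # molecules/mole
ERG_PER_CAL = 4.184e7    # ergs per calorie
TEMPERATURE = 298.0      # K

N_TYPES = 4
N_EACH = 1_000_000
N_TOTAL = N_TYPES * N_EACH
NUCLEOTIDE_MASS_AMU = 339.0

# ln of (4 x 10^6)! / ((10^6)!)^4 via log-gamma; the factorials themselves overflow.
ln_omega_cr = math.lgamma(N_TOTAL + 1) - N_TYPES * math.lgamma(N_EACH + 1)

# -T * dS_c with omega_cm = 1, first per molecule and then per gram.
work_erg_per_molecule = K_BOLTZMANN * TEMPERATURE * ln_omega_cr
grams_per_mole = N_TOTAL * NUCLEOTIDE_MASS_AMU
work_cal_per_gm = work_erg_per_molecule * AVOGADRO / grams_per_mole / ERG_PER_CAL

print(ln_omega_cr, N_TOTAL * math.log(4))
print(work_erg_per_molecule, work_cal_per_gm)
```

The per-molecule and per-gram values come out within a few percent of those quoted in eq. 8-12.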

It should be emphasized that these estimates of the magnitude of the configurational entropy work required are conservatively small. As a practical matter, our calculations have ignored the configurational entropy work involved in the selection of monomers. Thus, we have assumed that only the proper subset of 20 biologically significant amino acids was available in a prebiotic oceanic soup to form a biofunctional protein. The same is true of DNA. We have assumed that in the soup only the proper subset of 4 nucleotides was present and that these nucleotides do not interact with amino acids or other soup ingredients. As we discussed in Chapter 4, many varieties of amino acids and nucleotides would have been present in a real ocean---varieties which have been ignored in our calculations of configurational entropy work. In addition, the soup would have contained many other kinds of molecules which could have reacted with amino acids and nucleotides. The problem of using only the appropriate optical isomer has also been ignored. A random chemical soup would have contained a 50-50 mixture of D- and L-amino acids, from which a true protein could incorporate only the L-enantiomer. Similarly, DNA uses exclusively the optically active sugar D-deoxyribose. Finally, we have ignored the problem of forming unnatural links, assuming for the calculations that only α-links occurred between amino acids in making polypeptides, and that only correct linking at the 3', 5'-position of sugar occurred in forming polynucleotides. A quantification of these problems of specificity has recently been made by Yockey.²¹

The dual problem of selecting the proper composition of matter and then coding or rearranging it into the proper sequence is analogous to writing a story using letters drawn from a pot containing many duplicates of each of the 22 Hebrew consonants and 24 Greek and 26 English letters all mixed together. To write in English the message,

HOW DID I GET HERE?

we must first draw from the pot 2 Hs, 2 Is, 3 Es, 2 Ds, and one each of the letters W, O, G, T, and R. Drawing or selecting this specific set of letters would be a most unlikely event itself. The work of selecting just these 14 letters would certainly be far greater than arranging them in the correct sequence. Our calculations only considered the easier step of coding while ignoring the greater problem of selecting the correct set of letters to be coded. We thereby greatly underestimate the actual configurational entropy work to be done.

In Chapter 6 we developed a scale showing degrees of investigator interference in prebiotic simulation experiments. In discussing this scale it was noted that very often in reported experiments the experimenter has actually played a crucial but illegitimate role in the success of the experiment. It becomes clear at this point that one illegitimate role of the investigator is that of providing a portion of the configurational entropy work, i.e., the "selecting" work portion of the total -T ΔS_c work.

It is sometimes argued that the type of amino acid that is present in a protein is critical only at certain positions---active sites---along the chain, but not at every position. If this is so, it means the same message (i.e., function) can be produced with more than one sequence of amino acids.

This would reduce the coding work by making the number of permissible arrangements Ω_cm in eqs. 8-9 and 8-10 for S_cm greater than 1. The effect of overlooking this in our calculations, however, would be negligible compared to the effect of overlooking the "selecting" work and only considering the "coding" work, as previously discussed. So we are led to the conclusion that our estimate for ΔS_c is very conservatively low.

Calculating the Total Work: Polymerization of Biomacromolecules

It is now possible to estimate the total work required to combine biomonomers into the appropriate polymers essential to living systems. This calculation using eq. 8-5 might be thought of as occurring in two steps. First, amino acids polymerize into a polypeptide, with the chemical and thermal entropy work being accomplished (ΔH - T ΔS_th). Next, the random polymer is rearranged into a specific sequence which constitutes doing configurational entropy work (-T ΔS_c). For example, the total work as expressed by the change in Gibbs free energy to make a specified sequence is

ΔG = ΔH - T ΔS_th - T ΔS_c (8-13)

where ΔH - T ΔS_th may be assumed to be 300 kcal/mole to form a random polypeptide of 101 amino acids (100 links). The work to code this random polypeptide into a useful sequence so that it may function as a protein involves the additional component of -T ΔS_c "coding" work, which has been estimated previously to be 15.9 cal/gm, or approximately 159 kcal/mole for our protein of 100 links with an estimated mass of 10,000 gm/mole. Thus, the total work (neglecting the "sorting and selecting" work) is approximately

ΔG = (300 + 159) kcal/mole = 459 kcal/mole (8-14)

with the coding work representing 159/459 or 35% of the total work.

In a similar way, the polymerization of 4 x 10⁶ nucleotides into a random polynucleotide would require approximately 27 x 10⁶ kcal/mole. The coding of this random polynucleotide into the specified, aperiodic sequence of a DNA molecule would require an additional 3.2 x 10⁶ kcal/mole of work. Thus, the fraction of the total work that is required to code the polymerized DNA is seen to be 8.5%, again neglecting the "sorting and selecting" work.

The Impossibility of Protein Formation under Equilibrium Conditions

It was noted in Chapter 7 that because macromolecule formation (such as amino acids polymerizing to form protein) goes uphill energetically, work must be done on the system via energy flow through the system. We can readily see the difficulty in getting polymerization reactions to occur under equilibrium conditions, i.e., in the absence of such an energy flow.

Under equilibrium conditions the concentration of protein one would obtain from a solution of 1 M concentration in each amino acid is given by:

K = [protein] x [H₂O] / ([glycine] [alanine] ...) (8-15)

where K is the equilibrium constant and is calculated by

K = exp [ -ΔG / RT ] (8-16)

An equivalent form is

ΔG = -RT ln K (8-17)

We noted earlier that ΔG = 459 kcal/mole for our protein of 101 amino acids. The gas constant R = 1.9872 cal/deg-mole and T is assumed to be 298 K. Substituting these values into eqs. 8-15 and 8-16 gives

protein concentration = 10^-338 M (8-18)

This trivial yield emphasizes the futility of protein formation under equilibrium conditions. In the next chapter we will consider various theoretical models attempting to show how energy flow through the system can be useful in doing the work quantified in this chapter for the polymerization of DNA and protein. Finally, we will examine experimental efforts to accomplish biomacromolecule synthesis.
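The size of the equilibrium constant in eq. 8-16 can be checked with a minimal sketch. It works with log₁₀ K rather than K itself, because a number this small underflows to zero in ordinary double-precision arithmetic. The sketch evaluates only K from eq. 8-16 and makes the simplifying assumption that every concentration term in eq. 8-15 is set to 1; the function name is hypothetical.

```python
import math

R_GAS = 1.9872       # cal/deg-mole
TEMPERATURE = 298.0  # K


def log10_equilibrium_constant(delta_g_cal_per_mole):
    """log10 of K = exp(-dG / RT), eq. 8-16, kept in log form to avoid underflow."""
    return -delta_g_cal_per_mole / (R_GAS * TEMPERATURE) / math.log(10)


# dG = 459 kcal/mole for the 101-residue protein, expressed in cal/mole.
log10_k = log10_equilibrium_constant(459_000.0)
print(log10_k)

# Evaluating K directly gives 0.0 because the value is below the smallest double.
print(math.exp(-459_000.0 / (R_GAS * TEMPERATURE)))
```

The exponent comes out near -337, the same order of magnitude as the concentration quoted in eq. 8-18.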

References

1. Peter M. Molton, 1978. J. Brit. Interplanet. Soc. 31, 147.

2. G. Nicolis and I. Prigogine, 1977. Self Organization in Nonequilibrium Systems. New York: John Wiley, p.25.

3. I. Prigogine, G. Nicolis, and A. Babloyantz, 1972. Physics Today, p.23.

4. L.E. Orgel, 1973. The Origins of Life. New York: John Wiley, p. 189ff; M. Polanyi, 1968. Science 160, 1308; Hubert P. Yockey, 1977. J. Theoret. Biol. 67, 377; Jeffrey Wickens, 1978. J. Theoret. Biol. 72, 191.

5. Yockey, J. Theoret. Biol., p.383.

6. Orgel, The Origins of Life, p.189.

7. Yockey, J. Theoret. Biol., p.579.

8. Wickens, J. Theoret. Biol., p.191.

9. Orgel, The Origins of Life, p.190.

10. L. Brillouin, 1951. J. Appl. Phys. 22, 334; 1951. J. Appl. Phys. 22, 338; 1950. Amer. Sci. 38, 594; 1949. Amer. Sci. 37, 554.

11. E. Schrödinger, 1945. What is Life? London: Cambridge University Press, and New York: Macmillan.

12. W. Ehrenberg, 1967. Sci. Amer. 217, 108; Myron Tribus and Edward C. McIrvine, 1971. Sci. Amer. 225, 197.

13. Brillouin, J. Appl. Phys. 22, 885.

14. John O. Hutchens, 1976. Handbook of Biochemistry and Molecular Biology, 3rd ed., Physical and Chemical Data, Gerald D. Fasman. Cleveland: CRC Press.

15. H. Morowitz, 1968. Energy Flow in Biology. New York: Academic Press, p.79.

16. H. Borsook and H.M. Huffman, 1944. Chemistry of Amino Acids and Proteins, ed. C.L.A. Schmidt. Springfield, Mass.: Charles C. Thomas Co., p.822.

17. Wickens, J. Theoret. Biol., p.191.

18. Morowitz, Energy Flow in Biology, p.79.

19. D.W. Armstrong, F. Nome, J.H. Fendler, and J. Nagyvary, 1977. J. Mol. Evol. 9, 218.

20. Brillouin, J. Appl. Phys. 22, 338.

21. H.P. Yockey, 1981. J. Theoret. Biol. 91, 13.

Specifying How Work Is To Be Done

In Chapter 7 we saw that the work necessary to polymerize DNA and protein molecules from simple biomonomers could potentially be accomplished by energy flow through the system. Still, we know that such energy flow is a necessary but not sufficient condition for polymerization of the macromolecules of life. Arranging a pile of bricks into the configuration of a house requires work. One would hardly expect to accomplish this work with dynamite, however. Not only must energy flow through the system, it must be coupled in some specific way to the work to be done. This being so, we devoted Chapter 8 to identifying various components of work in typical polymerization reactions. In reviewing those individual work components, one thing became clear. The coupling of energy flow to the specific work requirements in the formation of DNA and protein is particularly important since the required configurational entropy work of coding is substantial.

Theoretical Models for the Origin of DNA and Protein

A mere appeal to open system thermodynamics does little good. What must be done is to advance a workable theoretical model of how the available energy can be coupled to do the required work. In this chapter various theoretical models for the origin of DNA and protein will be evaluated. Specifically, we will discuss how each proposes to couple the available energy to the required work, particularly the configurational entropy work of coding.

Chance

Before the specified complexity of living systems began to be appreciated, it was thought that, given enough time, "chance" would explain the origin of living systems. In fact, most textbooks state that chance is the basic explanation for the origin of life. For example, Lehninger in his classic textbook Biochemistry states,

We now come to the critical moment in evolution in which the first semblance of "life" appeared, through the chance association of a number of abiotically formed macromolecular components, to yield a unique system of greatly enhanced survival value.¹

More recently the viability of "chance" as a mechanism for the origin of life has been severely challenged.²

We are now ready to analyze the "chance" origin of life using the approach developed in the last chapter. This view usually assumes that energy flow through the system is capable of doing the chemical and the thermal entropy work, while the configurational entropy work of both selecting and coding is the fortuitous product of chance.

To illustrate, assume that we are trying to synthesize a protein containing 101 amino acids. In eq. 8-14 we estimated that the total free energy increase (ΔG) or work required to make a random polypeptide from previously selected amino acids was 300 kcal/mole. An additional 159 kcal/mole is needed to code the polypeptide into a protein. Since the "chance" model assumes no coupling between energy flow and sequencing, the fraction of the polypeptide that has the correct sequence may be calculated (eq. 8-16) using equilibrium thermodynamics, i.e.,

[protein concentration] / [polypeptide concentration] = exp (-ΔG / RT) (9-1)

= exp (-159,000 / (1.9872 x 298))

or approximately 1 x 10^-117

This ratio gives the fraction of polypeptides that have the right sequence to be a protein.

[NOTE: This is essentially the inverse of the estimate for the number of ways one can arrange 101 amino acids in a sequence (i.e., 1/Ω_c in eq. 8-7)].

Eigen³ has estimated the number of polypeptides of molecular weight 10⁴ (the same weight used in our earlier calculations) that would be found in a layer 1 meter thick covering the surface of the entire earth. He found it to be 10⁴¹. If these polypeptides reformed with new sequences at the maximum rate at which chemical reactions may occur, namely 10¹⁴/s, for 5 x 10⁹ years [1.6 x 10¹⁷ s], the total number of polypeptides that would be formed during the assumed history of the earth would be

10⁴¹ x 10¹⁴/s x 1.6 x 10¹⁷s = 10⁷² (9-2)

Combining the results of eq. 9-1 and 9-2, we find the probability of producing one protein of 101 amino acids in five billion years is only 1/10⁴⁵. Using somewhat different illustrations, Steinman⁴ and Cairns-Smith⁵ also come to the conclusion that chance is insufficient.
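The combination of eq. 9-1 and 9-2 is easiest to follow in powers of ten. The sketch below simply adds the exponents of the figures quoted in the text; the constant and function names are ours, and the product is strictly an expected number of correctly sequenced chains, which the text reads as a probability.

```python
import math

# Figures quoted in the text (Eigen's estimate, eq. 9-1 and eq. 9-2)
POLYPEPTIDES_IN_LAYER = 1e41     # chains in a 1 m layer covering the earth
RESEQUENCING_RATE_PER_S = 1e14   # assumed maximum rate of re-formation
SECONDS_AVAILABLE = 1.6e17       # about 5 x 10^9 years
FRACTION_CORRECT = 1e-117        # eq. 9-1, rounded as in the text


def log10_total_trials():
    """Exponent of the total number of polypeptides formed (eq. 9-2)."""
    return (math.log10(POLYPEPTIDES_IN_LAYER)
            + math.log10(RESEQUENCING_RATE_PER_S)
            + math.log10(SECONDS_AVAILABLE))


def log10_expected_successes():
    """Exponent of trials multiplied by the fraction with the right sequence."""
    return log10_total_trials() + math.log10(FRACTION_CORRECT)


print(f"total trials ~ 10^{log10_total_trials():.1f}")
print(f"expected correct chains ~ 10^{log10_expected_successes():.1f}")
```

Working with logarithms avoids any concern about very small floating-point numbers and makes the rounding in the text (1.6 x 10⁷² quoted as 10⁷²) explicit.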

It is apparent that "chance" should be abandoned as an acceptable model for coding of the macromolecules essential in living systems. In fact, it has been, except in introductory texts and popularizations.

Neo-Darwinian Natural Selection

The widespread recognition of the severe improbability that self-replicating organisms could have formed from purely random interactions has led to a great deal of speculation---speculation that some organizing principle must have been involved. In the company of many others, Crick⁶ has considered that the neo-Darwinian mechanism of natural selection might provide the answer. An entity capable of self-replication is necessary, however, before natural selection can operate. Only then could changes result via mutations and environmental pressures which might in turn bring about the dominance of entities with the greatest probabilities of survival and reproduction.

The weakest point in this explanation of life's origin is the great complexity of the initial entity which must form, apparently by random fluctuations, before natural selection can take over. In essence this theory postulates the chance formation of the "metabolic motor" which will subsequently be capable of channeling energy flow through the system. Thus harnessed by coupling through the metabolic motor, the energy flow is imagined to supply not only chemical and thermal entropy work, but also the configurational entropy work of selecting the appropriate chemicals and then coding the resultant polymer into an aperiodic, specified, biofunctioning polymer. As a minimum, this system must carry in its structure the information for its own synthesis, and control the machinery which will fabricate any desired copy. It is widely agreed that such a system requires both protein and nucleic acid.⁷ This view is not unanimous, however. A few have suggested that a short peptide would be sufficient.⁸

One way out of the problem would be to extend the concept of natural selection to the pre-living world of molecules. A number of authors have entertained this possibility, although no reasonable explanation has made the suggestion plausible. Natural selection is a recognized principle of differential reproduction which presupposes the existence of at least two distinct types of self-replicating molecules. Dobzhansky appealed to those doing origin-of-life research not to tamper with the definition of natural selection when he said:

I would like to plead with you, simply, please realize you cannot use the words "natural selection" loosely. Prebiological natural selection is a contradiction in terms.⁹

Bertalanffy made the point even more cogently:

Selection, i.e., favored survival of "better" precursors of life, already presupposes self-maintaining, complex, open systems which may compete; therefore selection cannot account for the origin of such systems.¹⁰

Inherent Self-Ordering Tendencies in Matter

How could energy flow through the system be sufficiently coupled to do the chemical and thermal entropy work to form a nontrivial yield of polypeptides (as previously assumed in the "chance" model)? One answer has been the suggestion that configurational entropy work, especially the coding work, could occur as a consequence of the self-ordering tendencies in matter. The experimental work of Steinman and Cole¹¹ in the late Sixties is still widely cited in support of this model.¹² The polymerization of protein is hypothesized to be a nonrandom process, the coding of the protein resulting from differences in the chemical bonding forces. For example, if amino acids A and B react chemically with one another more readily than with amino acids C, D, and E, we should expect to see a greater frequency of AB peptide bonds in protein than AC, AD, AE, or BC, BD, BE bonds.

Together with our colleague Randall Kok, we have recently analyzed the ten proteins originally analyzed by Steinman and Cole,¹³ as well as fifteen additional proteins whose structures (except for hemoglobin) have been determined since their work was first published in 1967. Our expectation in this study was that one would only get agreement between the dipeptide bond frequencies from Steinman and Cole's work and those observed in actual proteins if one considered a large number of proteins averaged together. The distinctive structures of individual proteins would cause them to vary greatly from Steinman and Cole's data, so only when these distinctives are averaged out could one expect to approach Steinman and Cole's dipeptide bond frequency results. The reduced data presented in table 9-1 shows that Steinman and Cole's dipeptide bond frequencies do not correlate well with the observed peptide bond frequencies for one, ten, or twenty-five proteins. It is a simple matter to make such calculations on an electronic digital computer. We surmise that additional assumptions not stated in their paper were used to achieve the better agreements.

Furthermore, the peptide bond frequencies for the twenty-five proteins approach a distribution predicted by random statistics rather than the dipeptide bond frequency measured by Steinman and Cole. This observation means that bonding preferences between various amino acids play no significant role in coding protein. Finally, if chemical bonding forces were influential in amino acid sequencing, one would expect to get a single sequence (as in ice crystals) or no more than a few sequences, instead of the large variety we observe in living systems. Yockey, with a different analysis, comes to essentially the same conclusion.¹⁴

A similar conclusion may be drawn for DNA synthesis. No one to date has published data indicating that bonding preferences could have had any role in coding the DNA molecules. Chemical bonding forces apparently have minimal effect on the sequence of nucleotides in a polynucleotide.
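The kind of counting involved in such a comparison can be sketched as follows. This is an illustration only, not the program used for Table 9-1: the function names are ours, the sequence is a made-up toy string rather than a real protein, and the substitution rule follows the assumptions described in the table notes (seryl and threonyl residues counted as glycine, aspartyl and glutamyl residues counted as alanine).

```python
from collections import Counter

# Substitutions described in the table notes as the S/C assumptions:
# Ser and Thr are counted as Gly; Asp and Glu are counted as Ala.
SC_SUBSTITUTIONS = {"S": "G", "T": "G", "D": "A", "E": "A"}


def dipeptide_counts(sequence, use_sc_assumptions=False):
    """Count adjacent residue pairs (peptide bonds) in a one-letter sequence."""
    if use_sc_assumptions:
        sequence = "".join(SC_SUBSTITUTIONS.get(r, r) for r in sequence)
    return Counter(a + b for a, b in zip(sequence, sequence[1:]))


def relative_to_gly_gly(counts):
    """Normalize pair counts so that Gly-Gly equals 1.0."""
    reference = counts.get("GG", 0)
    if reference == 0:
        raise ValueError("no Gly-Gly bond to normalize against")
    return {pair: n / reference for pair, n in counts.items()}


def random_expectation(sequence):
    """Expected pair counts if residues were placed independently of neighbors."""
    n = len(sequence)
    freq = Counter(sequence)
    return {a + b: (freq[a] / n) * (freq[b] / n) * (n - 1)
            for a in freq for b in freq}


# Made-up toy sequence, NOT a real protein
toy = "GGAGSGAVGGLTGAGG"
observed = dipeptide_counts(toy, use_sc_assumptions=True)
print(relative_to_gly_gly(observed))
```

Comparing `observed` with `random_expectation` for many real sequences averaged together is the sort of test that distinguishes bonding preferences from random statistics.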

Table 9-1.

Comparison of Steinman and Cole's experimentally determined dipeptide bond frequencies, and frequencies calculated by Steinman and Cole, and by Kok and Bradley from known protein sequences.

Values are relative to Gly-Gly.

Dipeptide*	S/C+ exp&	S/C+ cal	K/B# cal-wa	K/B# cal-woa
Gly-Gly	1.0	1.0	1.0 (1.0) [1.0]	1.0 (1.0) [1.0]
Gly-Ala	0.8	0.7	1.1 (1.1) [2.0]	2.0 (1.2) [1.0]
Ala-Gly	0.8	0.6	1.0 (1.1) [2.2]	1.5 (1.2) [0.0]
Ala-Ala	0.7	0.6	1.3 (1.5) [4.4]	2.8 (1.5) [0.0]
Gly-Val	0.5	0.2	0.2 (0.3) [0.4]	1.5 (1.2) [1.0]
Val-Gly	0.5	0.3	0.3 (0.3) [0.6]	0.8 (0.6) [0.0]
Gly-Leu	0.5	0.3	0.3 (0.3) [0.2]	1.3 (0.7) [1.0]
Leu-Gly	0.5	0.2	0.3 (0.3) [0.8]	1.3 (1.0) [1.0]
Gly-Ile	0.3	0.1	0.1 (0.2) [0.6]	1.0 (0.8) [0.0]
Ile-Gly	0.3	0.1	0.1 (0.2) [0.2]	0.0 (0.4) [0.0]
Gly-Phe	0.1	0.1	0.1 (0.2) [0.4]	0.5 (0.5) [0.0]
Phe-Gly	0.1	0.1	0.1 (0.1) [0.6]	1.0 (0.5) [1.0]

(Adapted after G. Steinman and M.V. Cole, 1967. Proc. Nat. Acad. Sci. U.S. 58, 735).

* The dipeptides are listed in terms of increasing volume of the side chains of the constituent residues. Gly = glycine, Ala = alanine, Val = valine, Leu = leucine, Ile = isoleucine and Phe = phenylalanine. Example: Gly-Ala = glycylalanine.

+ Steinman and Cole's (S/C) experimentally determined dipeptide bond frequencies were normalized and compared to the calculated frequencies obtained by counting actual peptide bond frequencies in ten proteins, assuming all seryl and threonyl residues are counted as glycine and all aspartyl and glutamyl residues are counted as alanine. The ten proteins used were: egg lysozyme, ribonuclease, sheep insulin, whale myoglobin, yeast cytochrome c, tobacco mosaic virus, beta-corticotropin, glucagon, melanocyte-stimulating hormone, and chymotrypsinogen. Because of ambiguity regarding sequences used by S/C, all sequences are those shown in Atlas of Protein Sequence and Structure, 1972. Vol. V (ed. by M.O. Dayhoff). National Biomedical Research Foundation, Georgetown University Medical Center, Washington, D.C.

& The experimentally determined dipeptide frequencies were obtained with aqueous solutions containing 0.01 M each amino acid, 0.125 N HCl, 0.1 M sodium dicyanamide.

# Kok and Bradley's (K/B) calculated dipeptide frequencies were obtained by counting actual peptide bond frequencies for the same ten proteins with (wa) and without (woa) S/C assumptions. The numbers in brackets are for one protein, enterotoxin B, with (wa) and without (woa) S/C assumptions. The numbers in parentheses are for twenty-five proteins with (wa) and without (woa) S/C assumptions. The twenty-five proteins are the ten used by S/C and alpha S1 casein (bovine); azurin (Bordetella bronchiseptica); carboxypeptidase A (bovine); cytochrome b5 (bovine); enterotoxin B; elastase (pig); glyceraldehyde 3-phosphate dehydrogenase (lobster); human growth hormone; human hemoglobin beta chain; histone IIB2 (bovine); immunoglobulin gamma-chain 1, V-I (human EU); penicillinase (Bacillus licheniformis 749/c); sheep prolactin; subtilisin (Bacillus amyloliquefaciens); and tryptophan synthetase alpha chain (E. coli K-12). Sequences are those shown in Atlas of Protein Sequence and Structure, 1972. Vol. V (ed. by M.O. Dayhoff). Note the disagreement between S/C and K/B calculated results. Also S/C calculated results are at variance with S/C experimental values for one, ten or twenty-five proteins, with (wa) or without (woa) S/C assumptions.

Mineral Catalysis

Mineral catalysis is often suggested as being significant in prebiotic evolution. In the experimental investigations reported in the early 1970's¹⁵ mineral catalysis in polymerization reactions was found to operate by adsorption of biomonomers on the surface or between layers of clay. Monomers were effectively concentrated and protected from rehydration so that condensation polymerization could occur. There does not appear to be any additional effect. In considering this catalytic effect of clay, Hulett has advised, "It must be remembered that the surface cannot change the free energy relationships between reactants and products, but only the speed with which equilibrium is reached."¹⁶
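Hulett's point can be illustrated with a toy kinetic model. The sketch below integrates a reversible first-order reaction A <-> B twice, once with hypothetical rate constants and once with both constants multiplied by the same hypothetical "catalyst" factor. All names and numbers are ours and are chosen only for illustration; they do not describe any real clay surface.

```python
def simulate_reversible(k_forward, k_reverse, a0=1.0, b0=0.0,
                        dt=0.001, steps=20000):
    """Integrate A <-> B with first-order kinetics by explicit Euler steps."""
    a, b = a0, b0
    history = []
    for _ in range(steps):
        rate = k_forward * a - k_reverse * b  # net conversion of A into B
        a -= rate * dt
        b += rate * dt
        history.append(b)
    return a, b, history


def steps_to_reach(history, target):
    """Index of the first step at which [B] reaches the target, or None."""
    for i, value in enumerate(history):
        if value >= target:
            return i
    return None


# Hypothetical rate constants; equilibrium [B]/[A] = K_F / K_R = 0.1
K_F, K_R = 0.1, 1.0
CATALYST_FACTOR = 50.0  # a catalyst speeds both directions equally

a_plain, b_plain, hist_plain = simulate_reversible(K_F, K_R)
a_cat, b_cat, hist_cat = simulate_reversible(K_F * CATALYST_FACTOR,
                                             K_R * CATALYST_FACTOR)

target = 0.9 * (K_F / (K_F + K_R))  # 90% of the equilibrium amount of B
print("final [B]/[A] without catalyst:", b_plain / a_plain)
print("final [B]/[A] with catalyst:   ", b_cat / a_cat)
print("steps to 90% of equilibrium:",
      steps_to_reach(hist_plain, target), "vs", steps_to_reach(hist_cat, target))
```

Both runs end at the same ratio of product to reactant; only the time needed to get there differs.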

Is mineral catalysis capable of doing the chemical work and/or thermal entropy work? The answer is a qualified no. While it should assist in doing the thermal entropy work, it is incapable of doing the chemical work since clays do not supply energy. This is why successful mineral catalysis experiments invariably use energy-rich precursors such as aminoacyl adenylates rather than amino acids.¹⁷

Is there a real prospect that mineral catalysis may somehow accomplish the configurational entropy work, particularly the coding of polypeptides or polynucleotides? Here the answer is clearly no. In all experimental work to date, only random polymers have been condensed from solutions of selected ingredients. Furthermore, there is no theoretical basis for the notion that mineral catalysis could impart any significant degree of information content to polypeptides or polynucleotides. As has been noted by Wilder-Smith,¹⁸ there is really no reason to expect the low-grade order resident on minerals to impart any high degree of coding to polymers that condense while adsorbed on the mineral's surface. To put it another way, one cannot get a complex, aperiodic-sequenced polymer using a very periodic (or crystalline) template.

In summary, mineral catalysis must be rejected as a mechanism for doing either the chemical or configurational entropy work required to polymerize the macromolecules of life. It can only assist in polymerizing short, random chains of polymers from selected high-energy biomonomers by assisting in doing the thermal entropy work.

Nonlinear, Nonequilibrium Processes

1. Ilya Prigogine

Prigogine has developed a more general formulation of the laws of thermodynamics which includes nonlinear, irreversible processes such as autocatalytic activity. In his book Self Organization in Nonequilibrium Systems (1977)¹⁹ co-authored with Nicolis, he summarized this work and its application to the organization and maintenance of highly complex structures in living things. The basic thesis in the book is that there are some systems which obey non-linear laws---laws that produce two distinct kinds of behavior. In the neighborhood of thermodynamic equilibrium, destruction of order prevails (entropy achieves a maximum value consistent with the system constraints). If these same systems are driven sufficiently far from equilibrium, however, ordering may appear spontaneously.

Heat flow by convection is an example of this type of behavior. Heat conduction in gases normally occurs by the random collision of gas molecules. Under certain conditions, however, heat conduction may occur by a heat-convection current---the coordinated movement of many gas molecules. In a similar way, water flow out of a bathtub may occur by random movement of the water molecules under the influence of gravity. Under certain conditions, however, this random movement of water down the drain is replaced by the familiar soapy swirl---the highly coordinated flow of the vortex. In each case random movements of molecules in a fluid are spontaneously replaced by a highly ordered behavior. Prigogine et al.,²⁰ Eigen,²¹ and others have suggested that a similar sort of self-organization may be intrinsic in organic chemistry and can potentially account for the highly complex macromolecules essential for living systems.

But such analogies have scant relevance to the origin-of-life question. A major reason is that they fail to distinguish between order and complexity. The highly ordered movement of energy through a system as in convection or vortices suffers from the same shortcoming as the analogies to the static, periodic order of crystals. Regularity or order cannot serve to store the large amount of information required by living systems. A highly irregular, but specified, structure is required rather than an ordered structure. This is a serious flaw in the analogy offered. There is no apparent connection between the kind of spontaneous ordering that occurs from energy flow through such systems and the work required to build aperiodic information-intensive macromolecules like DNA and protein. Prigogine, et al.²² suggest that the energy flow through the system decreases the system entropy, leading potentially to the highly organized structure of DNA and protein. Yet they offer no suggestion as to how the decrease in thermal entropy from energy flow through the system could be coupled to do the configurational entropy work required.

A second reason for skepticism about the relevance of the models developed by Prigogine, et al.²³ and others is that ordering produced within the system arises through constraints imposed in an implicit way at the system boundary. Thus, the system order, and more importantly the system complexity, cannot exceed that of the environment.

Walton²⁴ illustrates this concept in the following way. A container of gas placed in contact with a heat source on one side and a heat sink on the opposite side is an open system. The flow of energy through the system from the heat source to the heat sink forms a concentration gradient in the gas, which is less dense in the hotter region than in the cooler region. The order in this system is established by the structure: source-intermediate systems-sink. If this structure is removed, allowing the heat source to come into contact with the heat sink, the system decays back to equilibrium. We should note that the information induced in an open system doesn't exceed the amount of information built into the structural environment, which is its source.

Condensation of nucleotides to give polynucleotides or nucleic acids can be brought about with the appropriate apparatus (i.e., structure) and supplies of energy and matter. Just as in Walton's illustration, however, Mora²⁵ has shown that the amount of order (not to mention specified complexity) in the final product is no greater than the amount of information introduced in the physical structure of the experiment or chemical structure of the reactants. Non-equilibrium thermodynamics does not account for this structure, but assumes it and then shows the kind of organization which it produces. The origin and maintenance of the structure are not explained, and as Harrison²⁶ correctly notes, this question leads back to the origin of structure in the universe. Science offers us no satisfactory answer to this problem at present.

Nicolis and Prigogine²⁷ offer their trimolecular model as an example of a chemical system with the required nonlinearity to produce self ordering. They are able to demonstrate mathematically that within a system that was initially homogeneous, one may subsequently have a periodic, spatial variation of concentration. To achieve this low degree of ordering, however, they must require boundary conditions that could only be met at cell walls (i.e., at membranes), relative reaction rates that are atypical of those observed in condensation reactions, a rapid removal of reaction flow products, and a trimolecular reaction (the highly unlikely simultaneous collision of three atoms). Furthermore the trimolecular model requires chemical reactions that are essentially irreversible. But condensation reactions for polypeptides or polynucleotides are highly reversible unless all water is removed from the system.
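For readers who want to see the nonlinearity at work, the rate equations of the trimolecular model (often called the Brusselator) can be integrated numerically. The sketch below is a deliberate simplification: it treats a well-mixed system with no diffusion, so it shows only the onset of sustained oscillation in time, not the spatial pattern discussed in the text. The parameter values and function names are our own choices for illustration.

```python
def brusselator_rhs(x, y, a, b):
    """Rate equations of the well-mixed trimolecular model (no diffusion)."""
    dx = a - (b + 1.0) * x + x * x * y   # the x*x*y term is the trimolecular step
    dy = b * x - x * x * y
    return dx, dy


def integrate(a, b, x0, y0, dt=0.01, steps=10000):
    """Classical fourth-order Runge-Kutta integration; returns samples of x(t)."""
    x, y = x0, y0
    xs = []
    for _ in range(steps):
        k1x, k1y = brusselator_rhs(x, y, a, b)
        k2x, k2y = brusselator_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, a, b)
        k3x, k3y = brusselator_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, a, b)
        k4x, k4y = brusselator_rhs(x + dt * k3x, y + dt * k3y, a, b)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        xs.append(x)
    return xs


def late_amplitude(xs):
    """Peak-to-peak variation of x over the second half of the run."""
    tail = xs[len(xs) // 2:]
    return max(tail) - min(tail)


# Hypothetical parameters. The steady state (x, y) = (a, b/a) is stable
# when b < 1 + a*a and gives way to a limit cycle when b > 1 + a*a.
near_equilibrium = late_amplitude(integrate(1.0, 1.5, 1.2, 1.5))
far_from_equilibrium = late_amplitude(integrate(1.0, 3.0, 1.2, 3.0))
print("late amplitude, b = 1.5:", near_equilibrium)
print("late amplitude, b = 3.0:", far_from_equilibrium)
```

The ordering that appears is a regular, periodic variation in concentration, obtained only for particular parameter ranges fixed in advance.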

They speculate that the low degree of spatial ordering achieved in the simple trimolecular model could potentially be orders of magnitude greater for the more complex reactions one might observe leading up to a fully replicating cell. The list of boundary constraints, relative reaction rates, etc. would, however, also be orders of magnitude larger. As a matter of fact, one is left with so constraining the system at the boundaries that ordering is inevitable from the structuring of the environment by the chemist. The fortuitous satisfaction of all of these boundary constraints simultaneously would be a miracle in its own right.

It is possible at present to synthesize a few proteins such as insulin in the laboratory. The chemist supplies not only energy to do the chemical and thermal entropy work, however, but also the necessary chemical manipulations to accomplish the configurational entropy work. Without this, the selection of the proper composition and the coding for the right sequence of amino acids would not occur. The success of the experiment is fundamentally dependent on the chemist.

Finally, Nicolis and Prigogine have postulated that a system of chemical reactions which explicitly shows autocatalytic activity may ultimately be able to circumvent the problems now associated with synthesis of prebiotic DNA and protein. It remains to be demonstrated experimentally, however, that these models have any real correspondence to prebiotic condensation reactions. At best, these models predict higher yields without any mechanism to control sequencing. Accordingly, no experimental evidence has been reported to show how such models could have produced any significant degree of coding. No, the models of Prigogine et al., based on non-equilibrium thermodynamics, do not at present offer an explanation as to how the configurational entropy work is accomplished under prebiotic conditions. The problem of how to couple energy flow through the system to do the required configurational entropy work remains.

2. Manfred Eigen

In his comprehensive application of nonequilibrium thermodynamics to the evolution of biological systems, Eigen²⁸ has shown that selection could produce no evolutionary development in an open system unless the system were maintained far from equilibrium. The reaction must be autocatalytic but capable of self-replication. He develops an argument to show that in order to produce a truly self-replicating system the complementary base-pairing instruction potential of nucleic acids must be combined with the catalytic coupling function of proteins. Kaplan²⁹ has suggested a minimum of 20-40 functional proteins of 70-100 amino acids each, and a similar number of nucleic acids would be required by such a system. Yet as has previously been noted, the chance origin of even one protein of 100 amino acids is essentially zero.

The shortcoming of this model is the same as for those previously discussed; namely, no way is presented to couple the energy flow through the system to achieve the configurational entropy work required to create a system capable of replicating itself.

Periodically we see reversions (perhaps inadvertent ones) to chance in the theoretical models advanced to solve the problem. Eigen's model illustrates this well. The model he sets forth must necessarily arise from chance events and is nearly as incredible as the chance origin of life itself. The fact that generally chance has to be invoked many times in the abiotic sequence has been called by Brooks and Shaw "a major weakness in the whole chemical evolutionary theory."³⁰

Experimental Results in Synthesis of Protein and DNA

Thus far we have reviewed the various theoretical models proposed to explain how energy flow through a system might accomplish the work of synthesizing protein and DNA macromolecules, but found them wanting. Nevertheless, it is conceivable that experimental support for a spontaneous origin of life can be found in advance of the theoretical explanation for how this occurs. What then can be said of the experimental efforts to synthesize protein and DNA macromolecules? Experimental efforts to this end have been enthusiastically pursued for the past thirty years. In this section, we will review efforts toward the prebiotic syntheses of both protein and DNA, considering the three forms of energy flow most commonly thought to have been available on the early earth. These are thermal energy (volcanoes), radiant energy (sun), and chemical energy in the form of either condensing agents or energy-rich precursors. (Electrical energy is excluded at this stage of evolution as being too "violent," destroying rather than joining the biomonomers.)

Thermal Synthesis

Sidney Fox³¹ has pioneered the thermal synthesis of polypeptides, naming the products of his synthesis proteinoids. Beginning with either an aqueous solution of amino acids or dry ones, he heats his material at 200°C for 6-7 hours.

[NOTE: Fox has modified this picture in recent years by developing "low temperature" syntheses, i.e., 90-120°C. See S. Fox, 1976. J Mol Evol 8, 301; and D. Rohlfing, 1976. Science 193, 68].

All initial solvent water, plus water produced during polymerization, is effectively eliminated through vaporization. This elimination of the water makes possible a small but significant yield of polypeptides, some with as many as 200 amino acid units. Heat is introduced into the system by conduction and convection and leaves in the form of steam. The reason for the success of the polypeptide formation is readily seen by examining again equations 8-15 and 8-16. Note that increasing the temperature would increase the product yield through increasing the value of exp (-ΔG / RT). But more importantly, eliminating the water makes the reaction irreversible, giving an enormous increase in yield over that observed under equilibrium conditions by the application of the law of mass action.
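Both effects can be made concrete with a small calculation for a single peptide bond. The sketch below uses the +3000 cal/mole dipeptide figure quoted elsewhere in this chapter and assumes, for simplicity, that this value does not change with temperature and that concentrations can stand in for activities. The amino acid and water concentrations are hypothetical, and the function names are ours.

```python
import math

R_CAL = 1.9872  # gas constant in cal/(mol K)


def equilibrium_constant(delta_g_cal_per_mol, temperature_k):
    """K = exp(-dG/RT) for the condensation reaction."""
    return math.exp(-delta_g_cal_per_mol / (R_CAL * temperature_k))


def dipeptide_concentration(k_eq, amino_acid_a, amino_acid_b, water):
    """Mass action for A + B <-> AB + H2O: [AB] = K [A][B] / [H2O]."""
    return k_eq * amino_acid_a * amino_acid_b / water


DELTA_G_DIPEPTIDE = 3000.0  # cal/mole, assumed independent of temperature

k_room = equilibrium_constant(DELTA_G_DIPEPTIDE, 298.0)    # 25 C
k_hot = equilibrium_constant(DELTA_G_DIPEPTIDE, 473.15)    # 200 C

# Hypothetical concentrations: 0.01 M of each amino acid, bulk water
# (about 55.5 M) versus a nearly dry melt with a thousandth as much water.
wet = dipeptide_concentration(k_hot, 0.01, 0.01, 55.5)
nearly_dry = dipeptide_concentration(k_hot, 0.01, 0.01, 0.0555)

print(f"K at 298 K: {k_room:.4f}, K at 473 K: {k_hot:.4f}")
print(f"yield gain from heating alone: {k_hot / k_room:.1f}x")
print(f"yield gain from removing water: {nearly_dry / wet:.0f}x")
```

Under these assumptions heating raises the equilibrium constant by less than an order of magnitude, while removing water raises the yield in direct proportion to the amount of water removed.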

Thermal syntheses of polypeptides fail, however, for at least four reasons. First, studies using nuclear magnetic resonance (NMR) have shown that thermal proteinoids "have scarce resemblance to natural peptidic material because beta, gamma, and epsilon peptide bonds largely predominate over alpha-peptide bonds."³²

[NOTE: This quotation refers to peptide links involving the beta-carboxyl group of aspartic acid, the gamma-carboxyl group of glutamic acid, and the epsilon-amino group of lysine which are never found in natural proteins. Natural proteins use alpha-peptide bonds exclusively].

Second, thermal proteinoids are composed of approximately equal numbers of L- and D-amino acids in contrast to viable proteins with all L-amino acids. Third, there is no evidence that proteinoids differ significantly from a random sequence of amino acids, with little or no catalytic activity. [It is noted, however, that Fox has long disputed this.] Miller and Orgel have made the following observation with regard to Fox's claim that proteinoids resemble proteins:

The degree of nonrandomness in thermal polypeptides so far demonstrated is minute compared to nonrandomness of proteins. It is deceptive, then, to suggest that thermal polypeptides are similar to proteins in their nonrandomness.³³

Fourth, the geological conditions indicated are too unreasonable to be taken seriously. As Folsome has commented, "The central question [concerning Fox's proteinoids] is where did all those pure, dry, concentrated, and optically active amino acids come from in the real, abiological world?"³⁴

There is no question that thermal energy flow through the system including the removal of water is accomplishing the thermal entropy and chemical work required to form a polypeptide (300 kcal/mole in our earlier example). The fact that polypeptides are formed is evidence of the work done. It is equally clear that the additional configurational entropy work required to convert an aperiodic unspecified polypeptide into a specified, aperiodic polypeptide which is a functional protein has not been done (159 kcal/mole in our earlier example).

It should be remembered that this 159 kcal/mole of configurational entropy work was calculated assuming the sequencing of the amino acids was the only additional work to be done. Yet the experimental results of Temussi et al.³⁵ indicate that obtaining all L-amino acids from a racemic mixture and getting alpha-linking between the amino acids are quite difficult. This requirement further increases the configurational entropy work needed over that estimated to do the coding work (159 kcal/mole). We may estimate the magnitude of this increase in the configurational entropy work term by returning to our original calculations (eq. 8-7 and 8-8).

In our original calculation for a hypothetical protein of 100 amino acid units, we assumed the amino acids were equally divided among the twenty types. We calculated the number of possible amino acid sequences as follows:

Ω_cr = 100! / (5! 5! 5! ... 5!) = 100! / (5!)²⁰ = 1.28 x 10¹¹⁵ (9-3)

If we note that at each site the probability of having an L-amino acid is 50%, and make the generous assumption that there is a 50% probability that a given link will be of the alpha-type observed in true proteins, then the number of ways the system can be arranged in a random chemical reaction is given by

Ω_cr = 1.28 x 10¹¹⁵ x 2¹⁰⁰ x 2⁹⁹ = 10¹⁷⁵ (9-4)

where 2¹⁰⁰ refers to the number of additional arrangements possible, given that each site could contain an L- or D-amino acid, and 2⁹⁹ assumes the 99 links between the 100 amino acids in general are equally divided between the natural alpha-links and the unnatural beta-, gamma-, or epsilon-links.

[NOTE: Some studies indicate less than 50% alpha-links in peptides formed by reacting random mixtures of amino acids. (P.A. Temussi, L. Paolillo, F.E. Benedetti, and S. Andini, 1976. J. Mol. Evol. 7, 105.)].
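The multiplication in eq. 9-4 is easiest to verify by adding exponents. The sketch below takes the sequence count of eq. 9-3 as given in the text and adds the two factors of two described above; the variable names are ours.

```python
import math

LOG10_SEQUENCES = math.log10(1.28e115)  # value of eq. 9-3, taken as given
N_RESIDUES = 100
N_LINKS = N_RESIDUES - 1                # 99 links join 100 residues

log10_chirality = N_RESIDUES * math.log10(2)  # L or D at each site
log10_linkage = N_LINKS * math.log10(2)       # alpha or non-alpha at each link
log10_total = LOG10_SEQUENCES + log10_chirality + log10_linkage

print(f"chirality factor  ~ 10^{log10_chirality:.1f}")
print(f"linkage factor    ~ 10^{log10_linkage:.1f}")
print(f"total arrangements ~ 10^{log10_total:.1f}")
```

The two 50% assumptions together contribute a factor of about 10⁶⁰, which is what raises the exponent from 115 to 175.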

The requirements for a biologically functional protein molecule are: (1) all L-amino acids, (2) all alpha-links, and (3) a specified sequence. This being so, the calculation of the configurational entropy of the protein molecule using equation 8-8 is unchanged except that the number of ways the system can be arranged, (Ω_cr), is increased from 1.28 x 10¹¹⁵ to 1.0 x 10¹⁷⁵ as shown in equations 9-3 and 9-4. We may use the relationships of equations 8-7 and 8-8 but with the number of permutations modified as shown here to find a total configurational entropy work. When we do, we get a total configurational entropy work of 195 kcal/mole, of which 159 kcal/mole is for sequencing and 36 kcal/mole to attain all L-amino acids and all alpha-links. Finally, it should be recognized that Fox and others who use his approach avoid a much larger configurational entropy work term by beginning with only amino acids, i.e., excluding other organic chemicals and thereby eliminating the "selecting work" which is not accounted for in the 195 kcal/mole calculated above.

In summary, undirected thermal energy is only able to do the chemical and thermal entropy work in polypeptide synthesis, but not the coding (or sequencing) portion of the configurational entropy work. Proteinoids are just globs of random polymers. That a polymer composed exclusively of amino acids (but without exclusively peptide bonds) was formed is a result of the fact that only amino acids were used in the experiment. Thus, the portion of the configurational entropy work that was done---the selecting work---was accomplished not by natural forces but by illegitimate investigator interference. It is difficult to imagine how one could ever couple random thermal energy flow through the system to do the required configurational entropy work of selecting and sequencing. Finally, this approach is of very questionable geological significance, given the many fortuitous events that are required, as others have noted.

Solar Energy

Direct photochemical (UV) polymerization reactions to form polypeptides and polynucleotides have occasionally been discussed in the literature. The idea is to drive forward the otherwise thermodynamically unfavorable polymerization reaction by allowing solar energy to flow through the aqueous system to do the necessary work. It is worth noting that minor yields of small peptides can be expected to form spontaneously, even though the reaction is unfavorable (see eq. 8-16), but that greater yields of larger peptides can be expected only if energy is somehow coupled to the reaction. Fox and Dose have examined the peptide results of Bahadur and Ranganayaki³⁶ and concluded that UV irradiation did not couple with the reaction. They comment, "The authors do not show that they have done more than accelerate an approach to an unfavorable equilibrium. They may merely have reaffirmed the second law of thermodynamics."³⁷ Other attempts to form polymers directly under the influence of UV light have not been encouraging because of this lack of coupling. Neither the chemical nor the thermal entropy work, and definitely not any configurational entropy work, has been accomplished using solar energy.

Chemical Energy (Energy-Rich Condensing Agents)

Through the use of condensing agents, the energetically unfavorable dipeptide reaction (ΔG₁ = +3000 cal/mole) is made energetically favorable (ΔG₃ < 0) by coupling it with a second reaction which is sufficiently favorable energetically (ΔG₂ < 0) to offset the energy requirement of the dipeptide reaction:

dipeptide reaction

A-OH + H-B → A-B + H₂O    ΔG₁ > 0    (9-5)

condensing agent reaction

C + H₂O → D    ΔG₂ < 0    (9-6)

coupled reaction

A-OH + H-B + C → A-B + D    ΔG₃ < 0    (9-7)

As in thermal proteinoid formation, the free water is removed. However, in this case, it is removed by chemical reaction with a suitable condensing agent---one which has a sufficient decrease in Gibbs free energy to drive the reaction forward (i.e., ΔG₂ < 0 and |ΔG₂| > |ΔG₁|, so that ΔG₁ + ΔG₂ = ΔG₃ < 0).
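The bookkeeping of the coupled reaction amounts to a single addition, sketched below. The dipeptide value is the +3000 cal/mole quoted above; the free energy assigned to the condensing agent is a hypothetical number chosen only to illustrate the sign test, and the function names are ours.

```python
def coupled_free_energy(delta_g1, delta_g2):
    """Free energy change of the coupled reaction: dG3 = dG1 + dG2."""
    return delta_g1 + delta_g2


def coupling_drives_reaction(delta_g1, delta_g2):
    """True when the condensing agent reaction more than offsets dG1."""
    return delta_g2 < 0 and abs(delta_g2) > abs(delta_g1)


DELTA_G1 = 3000.0    # cal/mole, dipeptide reaction (from the text)
DELTA_G2 = -7000.0   # cal/mole, HYPOTHETICAL condensing agent hydrolysis

delta_g3 = coupled_free_energy(DELTA_G1, DELTA_G2)
print("dG3 =", delta_g3, "cal/mole")
print("reaction driven forward:", coupling_drives_reaction(DELTA_G1, DELTA_G2))
```

A condensing agent whose hydrolysis releases less than 3000 cal/mole would fail the same test, which is why the choice of agent matters.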

Unfortunately, it has proved difficult to find condensing agents for these macromolecule syntheses that could have originated on the primitive earth and functioned properly under mild conditions in an aqueous environment.³⁸ Meanwhile, other condensing agents which are not prebiotically significant (e.g., polymetaphosphates) are used in experiments. The plausible cyanide derivative candidates for condensing agents on the early earth hydrolyze readily in aqueous solutions (see Chapter 4). In the process, they do not couple preferentially with the H₂O from the condensation-dehydration reaction. Condensing agents observed in living systems today are produced only by living systems, and thus are not prebiotically significant. Moreover, enzyme activity in living systems first activates amino acids and then brings about condensation of these activated species, thus avoiding the problem of indiscriminate reaction with water.

Notice that if we could solve the very significant problems associated with the prebiotic synthesis of polypeptides by using condensing agents, we would still succeed only in polymerizing random polypeptides. Only the chemical and thermal entropy work would be accomplished by an appropriate coupling of the condensing agent to the condensation reaction. There is no reason to believe that condensing agents could have any effect on the selecting or sequencing of the amino acids. Thus, condensing agents are eliminated as a possible means of doing the configurational entropy work of coding a protein or DNA.

Chemical Energy (Energy-Rich Precursors)

Because the formation of even random polypeptides from amino acids is so energetically unfavorable (ΔG = 300 kcal/mole for 100 amino acids), some investigators have attempted to begin with energy-rich precursors such as HCN and form polypeptides directly, a scheme which is "downhill" energetically, i.e., ΔG < 0. There are advantages to such an approach; namely, there is no chemical work to be done since the bonding energy actually decreases as the energy-rich precursors react to form more complex molecules. This decrease in bonding energy will drive the reaction forward, effectively doing the thermal entropy work as well. The fly in the ointment, however, is that the configurational entropy work is enormous in going from simple molecules (e.g., HCN) directly to complex polymers in a single step (without forming intermediate biomonomers).
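As a rough check on the figure of 300 kcal/mole quoted at the start of this discussion, one can assume that each peptide bond costs about the same free energy as the dipeptide reaction (3000 cal/mole, i.e., 3 kcal/mole) and that a chain of 100 amino acids contains 99 peptide bonds. Treating the per-bond value as constant along the chain is an assumption made here for illustration:

```latex
\Delta G \approx 99 \times 3\ \text{kcal/mole} = 297\ \text{kcal/mole} \approx 300\ \text{kcal/mole}
```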

The stepwise scheme of experiments is to react gases such as methane, ammonia, and carbon dioxide to form amino acids and other compounds and then to react these to form polymers in a subsequent experiment. In these experiments the very considerable selecting-work component of the configurational entropy work is essentially done by the investigator who separates, purifies, and concentrates the amino acids before attempting to polymerize them. Matthews³⁹ and co-workers, however, have undertaken experiments where this intermediate step is missing and the investigator has no opportunity to contribute even obliquely to the success of the experiment by assisting in doing the selecting part of the configurational entropy work. Such experiments are undoubtedly more plausible as true prebiotic simulations, but their probability of success is further reduced from the already small probabilities previously mentioned. Using HCN as an energy-rich precursor, and ammonia as a catalyst, Matthews and Moser⁴⁰ have claimed direct synthesis of a large variety of chemicals under anhydrous conditions. After treating the polymer with water, even peptides are said to be among the products obtained. But as Ferris et al.⁴¹ have shown, the HCN polymer does not release amino acids upon treatment with proteolytic (protein-splitting) enzymes; nor does it give a positive biuret reaction (color test for peptides). In short, it is very hard to reconcile these results with a peptidic structure.

Ferris⁴² and Matthews⁴³ have agreed that direct synthesis of polypeptides has not yet been demonstrated. While some peptide bonds may form directly, it would be quite surprising to find them in significant numbers. Since HCN gives rise to other organic compounds, and various kinds of links are possible, the formation of polypeptides with exclusively alpha-links is most unlikely. Furthermore, no sequencing would be expected from this reaction, which is driven forward and "guided" only by chemical energy.

While we do not believe Matthews or others will be successful in demonstrating a single-step synthesis of polypeptides from HCN, this approach does involve the least investigator interference, and thus represents a very plausible prebiotic simulation experiment. The approach of Fox and others, which involves reacting gases to form many organic compounds, separating out amino acids, purifying, and finally polymerizing them, is more successful because it involves a greater measure of investigator interference. The selecting portion of the configurational entropy work is being supplied by the scientist. Matthews's lack of demonstrable success in producing polypeptides is a predictable indication of the enormity of the problem of prebiotic synthesis when it is not overcome by illegitimate investigator interference.

Mineral Catalysis

A novel synthesis of polypeptides has been reported⁴⁴ which employs mineral catalysis. An aqueous solution of energy-rich aminoacyl adenylates (rather than amino acids) is used in the presence of certain layered clays such as those known as montmorillonites. Large amounts of the energy-rich reactants are adsorbed both on the surface and between the layers of clay. The catalytic effect of the clay may result primarily from the removal of reactants from the solution by adsorption between the layers of clay. This technique has resulted in polypeptides of up to 50 units or more. Although polymerization definitely occurs in these reactions, the energy-rich aminoacyl adenylate (fig. 9-1) is of very doubtful prebiotic significance, as the discussion of competing reactions in Chapter 4 shows. Furthermore, the use of clay with free amino acids will not give a successful synthesis of polypeptides. The energy-rich aminoacyl adenylates lower their chemical or bonding energy as they polymerize, driving the reaction forward, and effectively doing the thermal entropy work as well. The role of the clay is to concentrate the reactants and possibly to catalyze the reactions. Once again, we are left with no apparent means to couple the energy flow, in this case in the form of prebiotically questionable energy-rich precursors, to the configurational entropy work of selecting and sequencing required in the formation of specified aperiodic polypeptides, or proteins.

Figure 9-1.
Aminoacyl adenylate.

Summary of Experimental Results on Prebiotic Synthesis of Protein

In summary, we have seen that it is possible to do the thermal entropy work and chemical work necessary to form random polypeptides, e.g., Fox's proteinoids. In no case, though, has anyone been successful in doing the additional configurational entropy work of coding necessary to convert random polypeptides into proteins. Virtually no mechanism with any promise for coupling the random flow of energy through the system to do this very specific work has come to light. The prebiotic plausibility of the successful synthesis of polypeptides must be questioned because of the considerable configurational entropy work of selecting done by the investigator prior to the polymer synthesis. Surely no suggestion is forthcoming that the right composition of just the subset of amino acids found in living things was "selected" by natural means, or that this subset consists only of L-α-amino acids. This is precisely why a large measure of the credit in forming proteinoids must go to Fox and others rather than nature.

Summary of Experimental Results on Prebiotic Synthesis of DNA

The prebiotic synthesis of DNA has proved to be even more difficult than that of protein. The problems that beset protein synthesis apply with greater force to DNA synthesis. Energy flow through the system may cause the nucleotides to chemically react and form a polymer chain, but it is very difficult to get them to attach themselves together in a specified way. For example, 3'-5' links on the sugar are necessary for the DNA to form a helical structure (see fig. 9-2). Yet 2'-5' links predominate in most prebiotic simulation experiments.⁴⁵ The sequencing of the bases in DNA is also crucial, as is the amino acid sequence in proteins. Both of these requirements are problems in doing the configurational entropy work. It is one thing to get molecules to chemically react; it is quite another to get them to link up in the right arrangement. To date, researchers have only succeeded in making oligonucleotides, or relatively short chains of nucleotides, with neither consistent 3'-5' links nor specific base sequencing.

Figure 9-2.
A section from a DNA chain showing the sequence AGCT.

Miller and Orgel summarized their chapter on prebiotic condensation reactions by saying:

This chapter has probably been confusing to the reader. We believe that is because of the limited progress that has been made in the study of prebiotic condensation. Many interesting scraps of information are available, but no correct pathways have yet been discovered.⁴⁶

The situation is much the same today.

Summary Discussion of Experimental Results

There is an impressive contrast between the considerable success in synthesizing amino acids and the consistent failure to synthesize protein and DNA. We believe the reason is the large difference in the magnitude of the configurational entropy work required. Amino acids are quite simple compared to protein, and one might reasonably expect to get some yield of amino acids, even where the chemical reactions that occur do so in a rather random fashion. The same approach will obviously be far less successful in reproducing complex protein and DNA molecules where the configurational entropy work term is a nontrivial portion of the whole. Coupling the energy flow through the system to do the chemical and thermal entropy work is much easier than doing the configurational entropy work. The uniform failure in literally thousands of experimental attempts to synthesize protein or DNA under even questionable prebiotic conditions is a monument to the difficulty of achieving a high degree of information content, or specified complexity, from the undirected flow of energy through a system.

We must not forget that the total work to create a living system goes far beyond the work to create DNA and protein discussed in this chapter. As we stated before, a minimum of 20-40 proteins as well as DNA and RNA are required to make even a simple replicating system. The lack of known energy-coupling means to do the configurational entropy work required to make DNA and protein is many times more crucial in making a living system. As a result, appeals to chance for this most difficult problem still appear in the literature in spite of the fact that calculations give staggeringly low probabilities, even on the scale of 5 billion years. Either the work---especially the organizational work---was coupled to the flow of energy in some way not yet understood, or else it truly was a miracle.

Summary of Thermodynamics Discussion

Throughout Chapters 7-9 we have analyzed the problems of complexity and the origin of life from a thermodynamic point of view. Our reason for doing this is the common notion in the scientific literature today on the origin of life that an open system with energy and mass flow is a priori a sufficient explanation for the complexity of life. We have examined the validity of such an open and constrained system. We found it to be a reasonable explanation for doing the chemical and thermal entropy work, but clearly inadequate to account for the configurational entropy work of coding (not to mention the sorting and selecting work). We have noted the need for some sort of coupling mechanism. Without it, there is no way to convert the negative entropy associated with energy flow into negative entropy associated with configurational entropy and the corresponding information. Is it reasonable to believe such a "hidden" coupling mechanism will be found in the future that can play this crucial role of a template, metabolic motor, etc., directing the flow of energy in such a way as to create new information?

References

1. Albert L. Lehninger, 1970. Biochemistry. New York: Worth Publishers, p.782.

2. H.P. Yockey, 1977. J. Theoret. Biol. 67, 377; R.W. Kaplan, 1974. Rad. Environ. Biophys. 10, 31.

3. M. Eigen, 1971. Die Naturwiss. 58, 465.

4. G. Steinman, 1967. Arch. Biochem. Biophys. 121, 533.

5. A.G. Cairns-Smith, 1971. The Life Puzzle. Edinburgh: Oliver and Boyd.

6. F. Crick, 1966. Of Molecules and Men. Seattle: University of Washington Press, p. 6-7.

7. Eigen, Die Naturwiss., p. 465; S.L. Miller and L.E. Orgel, 1974. The Origins of Life on the Earth. Englewood Cliffs, New Jersey: Prentice Hall.

8. J.B.S. Haldane, 1965. In The Origins of Prebiological Systems and of Their Molecular Matrices, ed. S.W. Fox. New York: Academic Press, p.11.

9. T. Dobzhansky, 1965. In The Origins of Prebiological Systems and of Their Molecular Matrices, p.310.

10. Ludwig von Bertalanffy, 1967. Robots, Men and Minds. New York: George Braziller, p.82.

11. G. Steinman and M. Cole, 1967. Proc. Nat. Acad. Sci. U.S. 58, 735; Steinman, Arch. Biochem. Biophys., p.533.

12. A. Katchalsky, 1973. Die Naturwiss. 60, 215; M. Calvin, 1975. Amer. Sci. 63, 169; C.E. Folsome, 1979. The Origin of Life. San Francisco: W.H. Freeman, p.104; K. Dose, 1983. Naturwiss. 70, 378.

13. Steinman, Arch. Biochem. Biophys. 121, 533; Steinman and Cole, Proc. Nat. Acad. Sci. U.S. 58, 735.

14. H.P. Yockey, 1981. J. Theoret. Biol. 91, 13.

15. Katchalsky, Die Naturwiss., p.215.

16. H.R. Hulett, 1969. J. Theoret. Biol. 24, 56.

17. Katchalsky, Die Naturwiss., p.216.

18. A.E. Wilder-Smith, 1970. The Creation of Life. Wheaton, Ill.: Harold Shaw, p.67.

19. G. Nicolis and I. Prigogine, 1977. Self Organization in Nonequilibrium Systems. New York: Wiley.

20. I. Prigogine, G. Nicolis, and A. Babloyantz, 1972. Physics Today, p.23-31.

21. Eigen, Die Naturwiss., p.465.

22. Prigogine, Nicolis, and Babloyantz, Physics Today, p.23-31.

23. Ibid.; Nicolis and Prigogine, Self Organization in Nonequilibrium Systems.

24. J.C. Walton, 1977. Origins, 4, 16.

25. P.T. Mora, 1965. In The Origins of Prebiological Systems and of Their Molecular Matrices, p.39.

26. E.R. Harrison, 1969. In Hierarchical Structures. ed. L.L. Whyte, A.G. Wilson, and D. Wilson, New York: Elsevier, p.87.

27. Nicolis and Prigogine, Self Organization in Nonequilibrium Systems.

28. Eigen, Die Naturwiss., p.465; 1971. Quart. Rev. Biophys. 4, 149.

29. Kaplan, Rad. Environ. Biophys., p.31.

30. J. Brooks and G. Shaw, 1973. Origin and Development of Living Systems. New York: Academic Press, p.209.

31. S.W. Fox and K. Dose, 1977. Molecular Evolution and the Origin of Life. New York: Marcel Dekker.

32. P.A. Temussi, L. Paolillo, L. Ferrera, L. Benedetti, and S. Andini, 1976. J. Mol. Evol. 7, 105.

33. S.L. Miller and L.E. Orgel, 1974. The Origins of Life on the Earth. Englewood Cliffs, New Jersey: Prentice Hall, fn. p.144.

34. C.E. Folsome, 1979. The Origin of Life. San Francisco: W.H. Freeman, p.87.

35. Temussi, Paolillo, Ferrera, Benedetti, and Andini, J. Mol. Evol., p.105.

36. K. Bahadur and S. Ranganayaki, 1958. Proc. Nat. Acad. Sci. (India) 27A, 292.

37. S.W. Fox and K. Dose, 1972. Molecular Evolution and the Origin of Life. San Francisco: W.H. Freeman, p.142.

38. J. Hulshof and C. Ponnamperuma, 1976. Origins of Life 7, 197.

39. C.N. Matthews and R.E. Moser, 1966. Proc. Nat. Acad. Sci. U.S. 56, 1087; C.N. Matthews, 1975. Origins of Life 6, 155; C. Matthews, J. Nelson, P. Varma, and R. Minard, 1977. Science 198, 622; C.N. Matthews, 1982. Origins of Life 12, 281.

40. C.N. Matthews and R.E. Moser, 1967. Nature 215, 1230.

41. J.P. Ferris, D.B. Donner, and A.P. Lobo, 1973. J. Mol. Biol. 74, 499.

42. J.P. Ferris, 1979. Science 203, 1135.

43. C.N. Matthews, 1979. Science 203, 1136.

44. Katchalsky, Die Naturwiss., p.215.

45. R.E. Dickerson, September 1978. Sci. Amer., p.70.

46. Miller and Orgel, The Origins of Life on the Earth, p.148.