PMB 20
Microbes Make the World Go 'Round
Norm Pace (nrpace@nature)
643-2571
371 Koshland Hall
1/26, 2/2/1999

A MOLECULAR VIEW OF BIOLOGICAL DIVERSITY

 

 

Diversity (Slide Show of Microbial Organisms)

References: Madigan et al., "Brock Biology of Microorganisms," 8th ed., Ch. 15, pp. 606-634.

Pace, "A molecular view of microbial diversity and the biosphere," Science 276:734-740, 1997.

Foreword

We are in the midst of a "paradigm shift" in evolutionary biology; microbial organisms are key to understanding our new perspective on biological diversity. This is an exciting period for microbial biology in terms of enhanced access to the microbial world, and a way to describe it. We can, only now, begin to learn the makeup of microbial ecosystems and tap the resources of the natural microbial world with unprecedented abilities.

1. How to define 'biological diversity'? Best done in terms of evolutionary relatedness -- phylogeny.

• Closely related organisms -- Similar features.

• Evolutionarily distant -- may have novelties.

• "Homologous" features in different organisms -- Also a trait of their common ancestor.

Definition: Homologous = of common ancestry

 

2. Common textbook pic of evolutionary relationships among lifeforms:

The "Five Kingdoms" scheme, a la Haeckel (1860s), Whittaker (1960s), Margulis/Copeland (1960s-70s).

 

A. Much truth, particularly among multicellular forms ("higher" orgs.):

1. Have developmental traits, complex morphologies to relate by.

2. Note subtle aspects: cyanobacteria --> chloroplasts; bacteria --> mitochondria

B. But many problems with traditional pic:

1. Relationships among microbes, "prokaryotes" and "eukaryotes", are speculative (at best).
a. And yet the broadest diversity of lifeforms (evolutionarily) lies in the microbial world (below).

2. No criteria to relate organisms between "kingdoms" -- a "universal" phylogeny was impossible.

3. Note implicit timeline -- prokaryotes, protists "primitive" -- but no such thing as a "primitive" (evolutionary predecessor) organism alive today. Simple, yes, but not primitive (unless carefully defined). We are all the products of 4 billion years of evolution.

4. The suggestion that eukaryotes derived from fusion of two prokaryotes (a "nuclear" component and a "cytoplasmic" component) is fundamentally incorrect.

a. Mitos and chloroplasts indeed bacterial in origin, but not the nuclear line of descent.

b. Often-cited time of 1.5 billion years ago for origin of eukaryotes is B.S., based primarily on speculation and beliefs of decades ago.

 

3. Studies in molecular phylogeny over the past two decades have changed the paradigm - mostly hasn't hit the texts yet. (But see microbiological texts later than 1990.)

A. By comparing macromolecular sequences, can extract quantitative evolutionary relationships - evolutionary distances - between organisms.

 

4. The goal of Molecular Phylogeny -- to relate molecules (hence, in principle, organisms) quantitatively, so as to define their evolutionary histories, e.g. as a "phylogenetic tree".

A. Many ways to "relate" molecules:
e.g., immunological methods, DNA-DNA hybridization, others.

B. Sequence comparisons provide "precise" numbers for defining relationships between molecules (organisms).

 

5. For homologous nucleic acid (or protein) sequences:

A. Consider:

Organism A is 70% identical to organism B,
Fractional identity is 0.70,
30% different.

1. Note that term "homology" is commonly, and incorrectly, used, when identity/similarity is meant.

2. You don't compare sequences unless they are homologous - of common ancestry. Homologous sequences are not necessarily identical and identical sequences are not necessarily homologous.
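The identity/difference arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a method from the lecture: the function name `fractional_identity`, the choice to skip gapped positions, and the two short sequences are all assumptions invented for the example. The sequences are taken as already aligned and already known to be homologous.

```python
def fractional_identity(seq_a, seq_b, gap="-"):
    """Fraction of comparable aligned positions at which two sequences match.

    Assumes the sequences are already aligned (same length) and homologous.
    Positions with a gap in either sequence are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue  # position cannot be compared
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return matches / compared


# Hypothetical aligned fragments, invented for illustration only.
org_a = "GGCUACGUAG"
org_b = "GGCUUCGAAC"

identity = fractional_identity(org_a, org_b)   # 7 of 10 positions match
difference = 1.0 - identity                    # fractional difference
print(f"identity = {identity:.2f}, difference = {difference:.2f}")
```

Note that the number says nothing about homology by itself: two unrelated sequences run through the same function would also return a value, which is why the comparison is only meaningful for sequences of common ancestry.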

 

B. To build relationships, can construct a "distance matrix" for organisms A-E:

 

     A     B     C     D     E
A    -    0.1   0.2   0.2   0.4
B   0.9    -    0.2   0.2   0.4
C   0.8   0.8    -    0.1   0.4
D   0.8   0.8   0.9    -    0.4
E   0.6   0.6   0.6   0.6    -

(Upper right: fractional difference. Lower left: fractional identity.)

 

Can relate in a "tree"-like fig., a "dendrogram":

•Note that organism-to-node distance is 1/2 of organism-to-organism distance.

•Note that tree is a single dimension.

2. It is common to see sequence divergence presented in terms of time, but this is not legitimate unless you have a fossil record with which to calibrate. (To use time as a phylogenetic coordinate is not smart - you always know the direction, never the rate.)
• "Clockspeeds" of organisms vary -- the evolutionary clock is not constant.

3. The "root" of the tree may or may not be the "deepest" branchpoint.
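One way to turn the A-E distance matrix into a dendrogram can be sketched as follows. The notes do not say which clustering procedure was used, so the average-linkage (UPGMA) method here is an assumption, as are the function name `upgma` and the nested-parenthesis cluster labels. UPGMA places each node at half the organism-to-organism distance, which matches the first note above, but it also assumes a constant clock, which these notes warn against, so treat it as a teaching sketch only.

```python
def upgma(names, pair_dist):
    """Average-linkage (UPGMA) clustering.

    names: list of organism labels.
    pair_dist: dict mapping (x, y) tuples to fractional difference.
    Returns a list of (cluster_x, cluster_y, node_height) merges.
    """
    sizes = {name: 1 for name in names}  # cluster label -> organisms in it
    d = {frozenset(p): v for p, v in pair_dist.items()}
    merges = []
    while len(sizes) > 1:
        # Closest pair of clusters; ties broken alphabetically for repeatability.
        pair = min(d, key=lambda p: (d[p], sorted(p)))
        x, y = sorted(pair)
        # Organism-to-node distance is half the organism-to-organism distance.
        height = d.pop(pair) / 2.0
        new = "(" + x + "," + y + ")"
        for other in [c for c in sizes if c not in pair]:
            dx = d.pop(frozenset((x, other)))
            dy = d.pop(frozenset((y, other)))
            # Size-weighted mean keeps every organism equally weighted.
            d[frozenset((new, other))] = (
                (sizes[x] * dx + sizes[y] * dy) / (sizes[x] + sizes[y])
            )
        sizes[new] = sizes.pop(x) + sizes.pop(y)
        merges.append((x, y, height))
    return merges


# Fractional differences from the A-E matrix in these notes.
differences = {
    ("A", "B"): 0.1, ("A", "C"): 0.2, ("A", "D"): 0.2, ("A", "E"): 0.4,
    ("B", "C"): 0.2, ("B", "D"): 0.2, ("B", "E"): 0.4,
    ("C", "D"): 0.1, ("C", "E"): 0.4,
    ("D", "E"): 0.4,
}

merges = upgma(["A", "B", "C", "D", "E"], differences)
for left, right, height in merges:
    print(f"join {left} + {right} at node height {height:.2f}")
```

The merge order pairs A with B and C with D first, then joins those two pairs, and attaches E last at the deepest node.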

 

6. What molecule(s) to use for molecular phylogeny?

A. Doesn't really matter so long as:
1. Homologous molecule occurs in all organisms considered.

2. Sufficient number of nucleotides or amino acids to be statistically significant.

3. Changes span evolutionary distance inspected -- i.e. sequences not randomized.

4. No lateral transfer - evolution of gene should reflect evolution of organisms. (For instance, penicillinase would be a lousy choice for molecular phylogenetic analysis - Why? - it is subject to ready lateral transfer.)

B. Considerable work in past with protein seqs.

1. E.g. cytochrome C, hemoglobins, etc.

2. But proteins hard to get and sequence - now (post 1980) much easier to isolate/sequence genes.

3. Turns out also that many/most proteins are "shallow" clocks - e.g. E. coli doesn't have hemoglobin.

C. Choice of molecules for comprehensive phylogeny - ribosomal RNAs (rRNA).

1. Ribosome - effects protein synthesis:

2. rRNAs present (and identifiable/alignable) in all organisms and the major organelles (mitos and chloroplasts).

3. Most effort on 16S rRNA (SSU = small subunit rRNA):

• Is particularly conservative - ca. 50% identity between E. coli and us (over alignable nts)

• Large enough for reasonable statistics.

• First used for all-life phylogenetic trees by Carl Woese (University of Illinois)

 

7. Given 16S sequences of multiple organisms:

A. Align sequences, count number of changes: the count is some function of evolutionary distance between the pairs of organisms.

B. Correct for multiple, back mutations at any given site:

• Number of fixed mutations per nt = evolutionary distance

C. Computer-fit overall tree topology to best-fit pairwise evolutionary distances.
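Step B can be illustrated with one common correction. The notes do not name the correction used, so the Jukes-Cantor one-parameter model below is an assumption, chosen because it is the simplest: it treats all four nucleotides and all substitutions as equally likely. Under that model an observed fractional difference p corresponds to d = -(3/4) ln(1 - (4/3) p) fixed mutations per nucleotide. The function name is illustrative.

```python
import math


def jukes_cantor_distance(p):
    """Estimated fixed mutations per nucleotide from observed difference p.

    Assumes the Jukes-Cantor model (equal base frequencies, equal rates
    for every substitution). Undefined once p reaches 0.75, where the
    sequences are effectively randomized with respect to each other.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75 under this model")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# Observed fractional differences and their corrected distances.
for p in (0.1, 0.3, 0.5):
    print(f"observed {p:.2f} -> corrected {jukes_cantor_distance(p):.3f}")
```

The corrected distance is always at least as large as the observed difference, and the gap widens as sequences diverge, because more of the later changes land on sites that have already changed.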

 

8. The Big Tree that emerges (with representative organisms - ca. 3000 sequences are now available for analysis).

Note: This "network" is different from the "dendrogram" above. Here, organism - to - organism evolutionary distance is along line segments (see scale). This is a quantitative "map" of evolutionary relatedness, of "biological diversity". It is not a model, not a hypothesis -- it is a map based on discrete data.

 

9. Some lessons from the Big Tree:

A. Note that this tree is "unrooted" - you cannot know from this analysis which lineage is "ancestral".
1. The root of the tree cannot be inferred from rRNA - need "outgroup" to establish root.

2. New (and somewhat controversial) analyses suggest root is on the bacterial line - this is consistent with cellular properties.

B. There was a single origin of life (on Earth) -- all lifeforms related.

C. Three "primary lines of evolutionary descent" - primary "Domains" (maybe "primary kingdoms", but usage ambiguous - there is no "official" usage at this level of biological scale).

1. Eukaryote nuclear line of descent as old as the "prokaryote" lines.

2. Two lines of prokaryotes - as different from one another as either is from eukaryotes. This means that the term "prokaryote" has no real meaning: it refers to two enormously different "kinds" of organisms that are similar only in the cellular strategy of not wrapping a bag around their DNA. This absence of a property is no way to group organisms!

3. A 4th (or 5th, or more) Domain? Maybe -- Note that we know remarkably little about the natural microbial world, including the "kinds" of organisms that make it up.

D. Note that lines connecting organisms to nodes are not all the same length - the evolutionary clock is not constant for all organisms (and differs with different molecules).

1. E.g., long line segments (many nt changes) indicate rapid evolution: the time since a node is the same for sister lineages.

2. Extracting time from sequence change is chancy - even fatuous (tirade above).

3. Note domain-level tendencies:

• Eucarya - fast clocks

• Archaea - slow clocks

• Bacteria - intermediate rates of evolution

Manifestations of "punctuated equilibrium"?
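The point about unequal line segments can be made concrete with three organisms joined at a single node. If distances along the tree are additive, each organism-to-node segment follows from the three pairwise distances, e.g. a = (d_AB + d_AC - d_BC) / 2. The sketch below is an illustration under that additivity assumption; the function name and the distance values are hypothetical, not data from the Big Tree.

```python
def three_point_branches(d_ab, d_ac, d_bc):
    """Organism-to-node segment lengths for three organisms on one node.

    Assumes pairwise distances are additive along the tree, so that
    d_ab = a + b, d_ac = a + c, and d_bc = b + c.
    """
    a = (d_ab + d_ac - d_bc) / 2.0
    b = (d_ab + d_bc - d_ac) / 2.0
    c = (d_ac + d_bc - d_ab) / 2.0
    return a, b, c


# Hypothetical evolutionary distances, invented for illustration only.
a, b, c = three_point_branches(d_ab=0.30, d_ac=0.50, d_bc=0.40)
print(f"A-to-node = {a:.2f}, B-to-node = {b:.2f}, C-to-node = {c:.2f}")
```

With these made-up numbers, A and B have had the same amount of time since their shared node, yet A's segment is twice as long as B's: A's sequence has changed about twice as fast over that interval.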

E. Note that the phylogenetic region occupied by multicellular eukaryotes is shallow and limited, but exhibits enormous diversity in phenotype.

• What gimmick was invented that permitted such diversity? Maybe a bookkeeping invention that allowed accumulation of large genomes?

F. The rRNA data prove that mitochondria, chloroplasts were of bacterial origin. The "endosymbiosis hypothesis" for the origin of the organelles (stemming from the last century) is now well proven fact. (I don't know why the general biology books continue to present this as controversial.)

G. Note how deep is the divergence of Giardia, Tritrichomonas, and Vairimorpha in the eucaryal line. These types of organisms lack mitochondria. They presumably diverged from the main eucaryal line of descent before the mitos came in.

H. There is enormous uncharacterized diversity in the microbial euks (and other lines of descent).

 

10. Note that the Big Tree shown is a limited set of specific organisms: ca. 10,000 could now be put in the Big Tree.

A. Database of rRNA sequences = Ribosomal Database Project (U. Illinois), http://rdp.life.uiuc.edu

You can download trees.

 

11. A few more trees:

A. Bacteria ("eubacteria"):

1. Depth of wedge is depth of deepest known branch.

2. ca. 38 "main groups"/"phyla"/"kingdoms" of cultivated types so far detected; ca. 13-14 have no cultured representative - discovered in environmental gene studies.

-Note imprecision of group-definitions based on classical terms.

3. "Gram positive" and "Gram negative", e.g., do not describe coherent relatedness groups. Mostly microbiologists mean the "proteobacteria" ("purple bacteria and relatives") phylogenetic group when they say "typical Gram negative bacterium".

B. Archaea ("archaebacteria" - bad term to use at this stage; they are not "bacteria"):

1. Three main lineages/"phyla"/"kingdoms" known; only two have cultivated representatives.
Crenarchaeota (cultivated types are high-temperature, but uncultivated low-temp forms are abundant in the environment.)

Euryarchaeota (halophiles and methanogens best known).

Korarchaeota (Only detected by gene analysis in environmental samples.)

C. Eucarya (eucaryotes):

1. Note that microbial eucs are vastly more diverse than the popular three -- fungi, plants and animals. What is "diversity?"

2. Ca. 12 "main lineages"/"phyla"/"kingdoms" of Eucarya known (e.g. plants, animals, fungi, alveolates, trypanosomatids, microsporidians, etc.)

3. Note that most eucaryal diversity is microbial and we don't know much about it. Grad student Scott Dawson recently identified seven new kingdom-level eucaryal lineages in anaerobic muds from Indiana lake sediment and Berkeley's (Emeryville??) Aquatic Park.

 

12. Little is known about naturally occurring microbial populations. As these are analyzed (using rRNA gene cloning methods - ask if interested, see Science article), many new (and deeply diverging) lineages are being discovered -- e.g. the clone numbers in some of the trees. Who's out there in the real world, anyway??