This week, I found out that an Antarctic scientist I greatly admired, Professor Craig Cary of the University of Waikato, had died suddenly. He wasn’t someone I knew well – I’d only met him a few times – but his research fascinated me and it ended up validating some of the work that I did on biosecurity for Antarctica. As a result, he occasionally included me in discussions on research projects. I’ve got a list of people I want to interview about their work so I can write about it for you, and he was on that list. I’m sure he would have obliged had I asked but I’ve got something of a backlog of interviews I still need to write up, so I never did.
Craig’s research was on microbes which live in extreme environments, such as Antarctic soils, hot springs and hydrothermal vents. Since microbes are, by definition, tiny, they are hard to study. The techniques we use, and the science behind those techniques, are important for a lot of reasons, not just studying the microbes of extreme environments. They’re important for understanding COVID-19 and vaccines. They’re important for understanding antibiotic resistance, another topic I plan to write about at some stage. They’re also important for understanding genetic modification, another topic I want to learn more about. So, in honour of Craig’s memory, today I’m writing about microbes and how we can understand them.
When we look around us, we see plants and animals, and occasionally fungi. The diversity is dizzying – if I sat down and tried to list all the plants, animals and fungi I could think of, I’d be occupied for hours. And yet they represent only a small proportion of the diversity of life on earth. If we use the highest level of classification, which splits living things into three groups (scientists formally call them domains, but I’ll call them Divisions here), then plants, animals and fungi all belong to the same group, known as the Eukarya. This group also includes many microbes, such as one which has been killing kauri trees, the parasites which cause malaria, the yeasts used to make bread, beer, wine, and some cheeses, and many of the plankton on which ocean species depend.
The other two Divisions of life consist entirely of microbes. One Division is the Bacteria, which has a remarkable range of species adapted to many different environments. This Division includes microbes which cause diseases such as whooping cough, bubonic plague and strep throat. It includes the cyanobacteria which can poison New Zealand lakes and rivers during the summer. It includes the microbes which live in the roots of plants such as gorse, clover and beans, helping them grow on infertile soils. It includes the microbes which make yoghurt, sauerkraut and salami (although salami also contains some microbes which belong to the same Division as we do). There are many, many more types of bacteria too, so many it’s overwhelming. I could write pages listing all the things bacteria do, but that wouldn’t be particularly interesting.
The third Division is the most obscure. Until the 1970s, members of this group, known as the Archaea, were classified along with bacteria. Superficially, they look like bacteria – if you have a very powerful microscope. Of course, without one you can’t see them at all. But in terms of evolution, scientists are still debating whether they are more closely related to Bacteria or the Division which we belong to, the Eukarya. Initially, scientists thought that Archaea were confined to extreme environments, such as the mud of swamps where there was no oxygen, hot springs such as those at Yellowstone National Park and the extreme cold of Antarctica. Now we know they occur almost everywhere, including the rumens of cows and sheep, where they produce the methane which contributes to climate change. However, they don’t appear to be important as a cause of disease, at least not as far as we know.
Before I go any further, I should mention two other groups which fit under the category of ‘microbes’. These are the viruses and the prions. They are different again, because they aren’t usually classified as living things. If that sounds weird, that they are microbes but aren’t actually alive, I won’t disagree. Viruses and prions are profoundly weird (that’s my scientific opinion) and I would love to explain their weirdness to you at some stage, but Craig didn’t work on them so I’m leaving them out of this discussion.
The challenge with microbes, something I grappled with in my master’s degree, is that because they are so small, we need technology to even see them, let alone understand them. The first technology to do this was the microscope. In the 1670s, Antonie van Leeuwenhoek, using simple but powerful microscopes he built himself, became the first to see what he described as ‘very little animalcules’ from a diverse range of sources including pondwater, the human mouth and semen. (Van Leeuwenhoek was not, however, the man who confused the field of reproductive biology for many years by claiming to have seen a tiny human curled up in the head of a sperm. That was a younger contemporary, Nicolaas Hartsoeker.)
But microscopes only took us so far when it came to the classification of microbes. They were great for the larger microbes, particularly the ones belonging to the same Division as we do, but they were limited in how much they could tell us about bacteria and archaea. We could see the basic shapes, but not much more. Even with the invention of more powerful microscopes, known as electron microscopes, in the 1930s, bacteria and archaea were difficult to identify by appearance.
Scientists had to resort to more indirect ways of classifying microbes. One such classification was using what is known as the Gram stain, named after Hans Gram, who invented it in the late 1800s. He discovered that some bacteria stained purple when treated with a particular dye, and others did not. When I was at university in the early 1990s, we were still taught to do this test.
While individual microbes can’t be seen without a microscope, if there are enough of them together we can see them. Given the right conditions and food, some bacteria, as well as fungi and other microbes, can be grown in a lab, in shallow dishes on a layer of agar. Agar is a jelly made from seaweed. The microbes grow to form colonies, and the appearance of the colonies was another method by which scientists identified them. Most Archaea, though, proved resistant to our attempts to grow them in the lab.
Then came a revolution. It had been building for a century, since 1869 when Swiss chemist Friedrich Miescher discovered a molecule which he called nuclein. This molecule was found in every individual cell, but he didn’t know what it did – nobody would until the 1940s. By this time the molecule was known as deoxyribonucleic acid, which was such a mouthful that scientists abbreviated the name to DNA. In the 1940s, Oswald Avery and his colleagues worked out that DNA was the molecule which carried hereditary information or genes. In the 1950s, researchers at King’s College London and the University of Cambridge worked out the structure – you may have heard of James Watson, Francis Crick and Rosalind Franklin, but there were others involved too. (It’s an interesting story, and it’s well worth checking out this recent article which documents Franklin’s role in far more depth than most media accounts.)
I’ll get to the revolution in a moment, but, before I do, I need to explain what the work of Watson, Crick, Franklin and others showed.
DNA is a highly complex molecule. Its structure is described as a double helix – imagine a ladder which has been twisted, only the ladder is incredibly long, with hundreds of thousands to hundreds of millions of rungs. The side rails of the DNA ladder are made up of two types of smaller molecules, a sugar and a phosphate, which alternate. The rungs are made up of a pair of molecules known as nucleobases, but they are usually just called bases. They are always attached to the sugars, never the phosphates. There are four types of these bases, and when talking about DNA, scientists generally just abbreviate the names to A, C, G and T. Crucially, if one side of the rung has an A, then the other side must have a T. If one side of the rung has a C, then the other must have a G. But going up and down the ladder, the bases can be in any order at all. One side could have the bases in the order ATTCGC, and then the other would have TAAGCG. Or one side could have ACACTT, then the other would have TGTGAA. (If you prefer a video explanation, I’ve linked to one here.)
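For anyone who likes to see a rule written out as code, here is a minimal Python sketch of the pairing rule described above. The names `PAIRS` and `complement` are my own, chosen for illustration, and the sketch simply reads across the ladder rung by rung; it ignores the fact that the two sides of real DNA run in opposite directions.

```python
# Pairing rule from the text: A always sits opposite T, and C opposite G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the bases on the other side of the ladder, rung by rung."""
    return "".join(PAIRS[base] for base in strand.upper())


print(complement("ATTCGC"))  # TAAGCG
print(complement("ACACTT"))  # TGTGAA
```

Both results match the examples given in the paragraph above.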
These four letters, A, C, G and T, form the basis for the code of life. Three of these letters together make up one unit of code, for example AAG. There are sixty-four possible combinations of the three letters. Sixty-one of these combinations code for the molecules which are the building blocks of proteins. But there are only twenty of these building blocks, which means more than one combination of letters may code for the same building block. For example, a building block called proline is coded by four different combinations of letters, including CCA and CCC. This still leaves three combinations of letters. These are stop signals. They signal the end of the segment of DNA which codes for a particular protein. (I’ve included a link to a video which explains this in more detail.)
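The counting in that paragraph can be checked with a short Python sketch. The names here (`CODONS`, `STOP_CODONS`, `PROLINE_CODONS`, `split_until_stop`) are hypothetical, made up for this illustration. The three stop signals and the full set of four proline combinations are taken from the standard genetic code written in DNA letters, not from anything stated above, and the sketch only splits a sequence into three-letter units rather than translating it into a protein.

```python
from itertools import product

BASES = "ACGT"

# Every possible three-letter unit of code.
CODONS = ["".join(letters) for letters in product(BASES, repeat=3)]

# Stop signals in the standard genetic code, written in DNA letters.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# The four combinations which code for proline in the standard genetic code.
PROLINE_CODONS = {"CCA", "CCC", "CCG", "CCT"}


def split_until_stop(dna: str) -> list[str]:
    """Split a sequence into three-letter units, ending at the first stop signal."""
    units = []
    for i in range(0, len(dna) - 2, 3):
        unit = dna[i:i + 3]
        if unit in STOP_CODONS:
            break
        units.append(unit)
    return units


print(len(CODONS))                          # 64
print(len(CODONS) - len(STOP_CODONS))       # 61
print(split_until_stop("CCAAAGCCCTGAAAG"))  # ['CCA', 'AAG', 'CCC']
```

In the last line, the reading stops at TGA, so the final AAG is never read.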
Every living thing is built from these proteins. They are the main working molecules of cells, as well as making up the structure of cartilage and hair. Other molecules may strengthen bone and wood, but it’s still protein which does most of the work.
Segments of the enormously long ladders of DNA which code for different traits, such as eye colour, are known as genes. This means that if you read the order of the letters on the DNA, you are directly reading an organism's genes, its code of life. That is what scientists first managed to do in the early 1970s. Instead of identifying living things by appearance, now they could look directly at their genes, a technique known as sequencing. That was the revolution. It changed the way scientists looked at every living thing.
Before sequencing, scientists could only make educated guesses on what was related to what based on their apparent similarities and differences. But now they could read the genes, they could see what was related to what, based on how similar the genes were. The more similar the genes, the more closely related two organisms were. If one organism had a piece of code TAAGCG and one had a piece of code which was TAAGCT, then they were closely related, because they only had one letter different in their sequence. But a species with the sequence AGCCTT would not be closely related.
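The comparison above amounts to counting the positions at which two sequences differ. Here is a minimal Python sketch of that idea; `count_differences` is a name I have made up, and it assumes the two sequences are the same length and already lined up, which real analyses cannot take for granted.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


print(count_differences("TAAGCG", "TAAGCT"))  # 1
print(count_differences("TAAGCG", "AGCCTT"))  # 6
```

The first pair differs at a single letter, while the third sequence differs from the first at every one of its six positions.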
Scientists need longer sequences than this to draw meaningful conclusions, but perhaps not as long as you might think. Human DNA fingerprinting, for example, uses sequences 100-300 bases long. One of the most important genes used for classifying bacteria is around 1550 bases long.
Reading the sequences sometimes overturned centuries of scientific knowledge. For example, when I was at university, magnolias and waterlilies were classified as belonging to a plant group called dicots, along with species like roses, buttercups and daisies. This grouping had first been identified in the 1600s. But by reading the sequences, and comparing how similar the sequences were, scientists realised that neither magnolias nor waterlilies were nearly as closely related to the other dicots as had been assumed. As someone who learned much of my botany before many of the major discoveries which resulted from sequencing plants, I’m constantly having to revise what I thought I knew.
The ability to read sequences was particularly useful for organisms like bacteria, which were hard to identify in other ways. Archaea were discovered precisely because scientists looked at the genes of certain methane-generating microbes and found that they bore no resemblance to bacteria at all.
Just as the amount of data we can store on computers has increased at a staggering rate, so has our ability to analyse sequences. Initially, it was a slow, complicated process. The first sequence, from a type of yeast, was produced in 1965, but the first whole gene wasn’t sequenced until 1972. That gene was from a virus, and the whole virus wasn’t sequenced until 1976. The whole length of all the genes, collectively known as the genome, only added up to 3569 bases. In comparison, humans have a genome which is 3.2 billion bases long.
Between the sequencing of that virus and the sequencing of the human genome came a crucial development – the automation of the sequencing process in 1987. This made the whole process much more practical. With the process automated, scientists decided to try sequencing the human genome. They began in 1990, and it took them until 2003 to sequence the whole thing.
Today, this all seems hard to imagine. In China, the first cases of COVID-19 were seen at the start of December 2019, perhaps a little earlier. The disease was reported to the World Health Organisation on the 31st of December 2019. China reported that they had determined the cause to be a new coronavirus on the 7th of January. On the 11th of January, Chinese scientists reported the whole sequence of the new virus, depositing it in a publicly available global repository of sequences known as GenBank. More sequences would rapidly follow as the virus spread to other countries – French scientists reported the sequence by the 29th of January, with the first case only reported on the 24th. Admittedly, the whole genome is only around 30,000 bases long, but it shows how far we have come.
Craig Cary and his colleagues applied these techniques to microbes living in extreme environments, such as the soil of Antarctica’s Dry Valleys. At the time, the accepted wisdom on the Dry Valley microbes was that they didn’t differ much from place to place. A paper that he published in 2010 overturned that idea. There was a remarkable diversity of life in those harsh conditions, but until that point we simply didn’t have the ability to see it. He demonstrated dramatic impacts from seemingly tiny changes in the environment, such as the seals which occasionally got lost and died in the valleys. The community of microbes under the dead seals was completely different from that in the surrounding soil.
There’s much more that Craig did, but documenting all of this is not my intention here. People who knew him better will do that. I simply wanted to put his work into context, to explain what he studied and why it’s such a tricky business working with the microbes that he helped us understand.
Thank you for this lengthy explanation. It makes me appreciate so much more the folks who spend their days in labs.
Wonderful wonderful piece on Craig. And so well explained. Thanks for this knowledge shared.