This chapter has been quite a journey. We began with a discussion of the fundamental problem of determining a protein’s structure. Because experimental methods for identifying protein structure are costly and time consuming, we transitioned to cover algorithmic approaches that do a good job of predicting a protein’s structure from its sequence of amino acids.
We then discussed how to compare protein structures, with a lengthy case study comparing the SARS-CoV and SARS-CoV-2 spike proteins. The problem of quantifying the difference between two structures is challenging, and we established both global and local structure comparison metrics. We applied these approaches to isolate three candidate regions of interest of the SARS-CoV-2 spike protein that differ from the SARS-CoV spike protein when complexed with the ACE2 enzyme, and we quantified the binding of this complex using a localized energy function.
We concluded with a study of molecular dynamics. If we hope to fully understand a protein’s function, then we need to know how it flexes and bends within its environment, sometimes in order to interact with other molecules.
Despite covering a great deal of ground, we have left just as much unstudied. For one example, the surface of viruses and host cells are “fuzzy” because they are covered by structures called glycans, or numerous monosaccharides linked together. SARS-CoV and SARS-CoV-2 have a “glycan shield”, in which glycosylation of surface antigens allows the virus to hide from antibody detection. Researchers have found that the SARS-CoV-2 spike protein is heavily glycosylated, shielding around 40% of the protein from antibody recognition1.
Finally, although the study of proteins is in some sense the first major area of biological research to become influenced by computational approaches, we hope that this chapter has made it clear that the analysis of protein structure and dynamics is booming. With the COVID-19 pandemic reiterating the importance of biomedical research, the dawn of Alphafold showing the power of supercomputing to solve age-old problems, and the promise of computational modeling to discover the medications of the future, new companies and organizations are rising up to study proteins like bamboo shoots after the rain.
Thus concludes our discussion of protein analysis. In the course’s final module, we will turn our attention to a very different type of problem. To fight a virus like SARS-CoV-2, your body employs a cavalry of white blood cells. Maintaining healthy levels of these cells is vital to a strong immune system, and blood reports run counts of these cells to ensure they are within normal ranges. Can we teach a computer to run this analysis automatically?
Note: Although we have covered a great deal in this chapter, there is still much more to say about SARS-CoV-2. What happens after the spike protein binds to ACE2? How does the virus enter the cell and replicate? How does it fight our immune systems, and how should we design a vaccine to fight back? If you are interested in an online course covering some of these questions, then check out the free online course SARS Wars: A New Hope by Christopher Langmead.
Grant, O. C., Montgomery, D., Ito, K., & Woods, R. J. Analysis of the SARS-CoV-2 spike protein glycan shield: implications for immune recognition. bioRxiv : the preprint server for biology, 2020.04.07.030445. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7217288/ ↩