Rosetta COVID-19 Seminar
David Baker Webinar - Recent Advances in Protein Folding
Video address of the online seminar
The screenshots in the article are from the presentation of the seminar.
I am not a biology major, but I am interested in viral protein research, so I learned some basics of protein folding before watching the seminar. I could roughly follow the first 40 minutes but was confused in the second half, perhaps because I lack molecular biology knowledge. Below I share what I took away from the video (with multiple images).
An unfolded protein has high energy; once folding is complete, the energy of the whole protein tends toward the global minimum. Protein folding can therefore be framed, in theory, as a computation: given the amino acid sequence of a peptide chain, determine the three-dimensional structure that minimizes the protein's energy.
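To make the idea of "folding as energy minimization" concrete, here is a minimal sketch using the 2D HP lattice model, a well-known toy model and not the energy function Rosetta uses. The assumptions are mine: each residue is either hydrophobic (`H`) or polar (`P`), the chain lives on a square grid, and every pair of `H` residues that touch on the grid without being chain neighbours lowers the energy by 1. The function names (`fold_coords`, `hp_energy`, `best_fold`) are hypothetical.

```python
from itertools import product

# Possible steps on the square lattice (toy assumption: 2D grid).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold_coords(moves):
    """Return the lattice coordinates of a chain, or None if it overlaps itself."""
    pos = (0, 0)
    coords = [pos]
    seen = {pos}
    for m in moves:
        dx, dy = MOVES[m]
        pos = (pos[0] + dx, pos[1] + dy)
        if pos in seen:  # two residues cannot occupy the same site
            return None
        seen.add(pos)
        coords.append(pos)
    return coords


def hp_energy(sequence, coords):
    """Energy = -1 per H-H contact between residues that are not chain neighbours."""
    index = {c: i for i, c in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Looking only right and up counts each lattice contact exactly once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                energy -= 1
    return energy


def best_fold(sequence):
    """Brute-force search over all folds for the lowest-energy one."""
    best_energy, best_moves = None, None
    for moves in product(MOVES, repeat=len(sequence) - 1):
        coords = fold_coords(moves)
        if coords is None:
            continue
        energy = hp_energy(sequence, coords)
        if best_energy is None or energy < best_energy:
            best_energy, best_moves = energy, "".join(moves)
    return best_energy, best_moves


if __name__ == "__main__":
    energy, moves = best_fold("HPPHPH")
    print("lowest energy:", energy, "fold:", moves)
```

The brute-force loop tries 4^(n-1) move strings for a chain of n residues, so it only works for very short chains. Real proteins need far smarter search strategies and far more detailed energy terms.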
The amino acid sequences of natural proteins are encoded in our genes. Through transcription and translation, amino acids are joined in a specific order to form what we call a polypeptide chain.
Protein design is the reverse operation: we first study what structure would let a protein fulfill a specific biological function, then work backwards to derive an amino acid sequence for it, since the primary structure of the polypeptide chain determines the three-dimensional structure of the protein. We can then encode that amino acid sequence in a gene and synthesize the protein. If we simply consider a peptide chain of 100 amino acids, with 20 common amino acids possible at each position, then the number of candidate chains is 20^100 (roughly 10^130), far too many to try one by one.
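The size of this sequence space is easy to check with exact integer arithmetic. The short sketch below only evaluates 20^100; the function name `sequence_space_size` is my own, not something from the seminar.

```python
def sequence_space_size(chain_length, alphabet_size=20):
    """Number of distinct chains: one of `alphabet_size` residues per position."""
    return alphabet_size ** chain_length


if __name__ == "__main__":
    total = sequence_space_size(100)
    print(len(str(total)))   # 131 (decimal digits)
    print(f"{total:.3e}")    # 1.268e+130
```

For comparison, a chain of only 10 amino acids already has 20^10, about 10^13, possible sequences.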
I have not yet understood the equation used to calculate the energy function of a protein, but roughly speaking, the total energy is thermal energy + structural energy.
During the spread of the virus, the following diagram can be understood as the researchers' entire workflow:
1. Send notifications and warnings ⚠️ to the outside world.
2. Analyze the sequence of the virus's genetic material 🧬.
3. Model the viral proteins for research.
4. Diagnosis -> treatment -> vaccine development.
The first case of COVID-19 was diagnosed in early December 2019, and the gene sequence of the virus was published in early January 2020.
The figure on the right shows a comparison of the 3D structures obtained by Robetta and X-ray, respectively, with an RMSD of less than 1 Angstrom.
The RMSD (root-mean-square deviation) describes the difference between the 3D structures of two proteins. With v and w denoting the sets of atoms of the two 3D structures and n the number of atom pairs being compared, it is written as:

$$\mathrm{RMSD}(v, w) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \lVert v_i - w_i \rVert^2}$$

The computationally predicted 3D structure is already very close to the actual structure measured in the laboratory.
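As a minimal sketch of how the RMSD between two structures can be computed, the snippet below takes two equally long lists of (x, y, z) atom coordinates. It assumes the two structures are already superimposed and that atom i in one list corresponds to atom i in the other; real structure comparison tools first find the optimal superposition, which is not done here. The function name `rmsd` and the sample coordinates are my own.

```python
import math


def rmsd(v, w):
    """Root-mean-square deviation between two matched lists of 3D points."""
    if len(v) != len(w) or not v:
        raise ValueError("both structures need the same, non-zero number of atoms")
    # Sum of squared distances between corresponding atoms.
    total = sum(
        (a - b) ** 2
        for p, q in zip(v, w)
        for a, b in zip(p, q)
    )
    return math.sqrt(total / len(v))


if __name__ == "__main__":
    predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    measured = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(predicted, measured))  # 1.0, every atom is shifted by exactly 1 unit
```

With coordinates given in Angstroms, an RMSD below 1 means the atoms of the two structures are on average less than 1 Angstrom apart.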
Design of a new detection system for COVID-19 diagnosis.
The picture below is hard to interpret. My guess is that it shows how to accurately identify and detect SARS-CoV-2.
The second part describes the development of a drug that binds to proteins on the surface of SARS-CoV-2, thereby depriving the virus of its ability to invade human cells.
The computer-generated drug binds to the Spike protein to inhibit the spread of the virus.
The steps for finding this inhibitory drug:
1. Find the target on the viral protein.
2. Select the docking structure.
3. Simulate the docking of the assembled structure to the target.
4. Use Rosetta to predict the amino acid sequence of the docking structure.
This can be understood as designing a mini protein binder to prevent the entry of SARS-CoV-2 into human cells via the ACE2 receptor.
The next part introduces self-assembling systems. From the first picture, we can understand that how a system evolves and assembles itself is determined by the properties encoded in each of its parts. Complex and irregular protein structures make it a challenge to design new intermolecular interactions. The figure below shows the structure of a self-assembling protein, whose simple building blocks allow it to form complex morphologies.
With the Rosetta tool, we can study the folding and docking of biomolecular polymers as well as the design of proteases, the folding of RNA, and other topics.
The figure below shows a comparison between the designed model and the crystal structure measured in the laboratory using the self-assembly system approach.
The paper was published in Science in 2016: [Accurate design of megadalton-scale two-component icosahedral protein complexes](https://science.sciencemag.org/content/353/6297/389)
This method was used to develop nanoparticles that bind to viral proteins, and it was applied to treating SARS-CoV-2.
In conclusion, in the next global pandemic researchers will be able to determine protein structures more quickly and accurately, diagnose infected populations quickly, and treat patients quickly thanks to a deeper understanding of molecular interactions. A deeper understanding of the immune system will help in the development of vaccines, and deep learning in the Rosetta program will make protein modeling and design more accurate.
Personal opinion
From the macrocosm to the microcosm, we may understand only roughly how the world works, not the precise calculations behind it. The collision of black holes in the universe, the bending of space-time, and time travel may seem irrelevant to our lives, but the microscopic world has everything to do with human flourishing. If the immune system is suppressed, disease becomes catastrophic. Perhaps the microscopic world is another magnificent “universe”, viewed from another hidden dimension of space-time. The study of the quantum world can help humanity move forward.
If we build a product starting only from commercial interests, my guess is that the startup will not survive long before closing down. The starting point has to be good.