A new computational model combines artificial intelligence with biology to design and fold proteins in a method that could yield specialized proteins for different applications, such as an edible coating for crops that reduces food waste.
The model starts with raw amino acids, often called the building blocks of life, to create new proteins. The machine learning system can accurately predict dihedral angles, which describe how consecutive bonds along a protein chain are rotated relative to one another. Knowing these angles can help researchers quickly find new proteins that are structurally stable and that could serve a wide variety of applications.
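For readers who want to see what a dihedral angle is in practice, the sketch below computes one from the 3D positions of four consecutive atoms. It is a generic geometry illustration, not the patented model: the function name `dihedral_degrees` and the sample coordinates are assumptions made for this example.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Signed rotation (in degrees) about the p1-p2 bond.

    Hypothetical helper for illustration: the angle between the plane
    through p0, p1, p2 and the plane through p1, p2, p3.
    """
    b0 = _sub(p1, p0)          # first bond
    b1 = _sub(p2, p1)          # central bond (rotation axis)
    b2 = _sub(p3, p2)          # third bond
    n1 = _cross(b0, b1)        # normal of the first plane
    n2 = _cross(b1, b2)        # normal of the second plane
    b1_len = math.sqrt(_dot(b1, b1))
    x = _dot(n1, n2)
    y = b1_len * _dot(b0, n2)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates: the last atom is rotated a quarter turn
# about the central bond relative to the first atom.
angle = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(angle, 1))
```

A model that predicts these angles for every position in a chain is effectively describing the chain's 3D shape, because bond lengths and bond angles vary far less than the rotations do.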
"Proteins may represent the key construction materials of the living world," the model's inventors note in their patent application, "and offer a powerful platform for potential use in bioengineering and materials science." This invention, published by the U.S. Patent and Trademark Office on April 1, combines bioengineering and materials with protein-based edible coatings for food. It is a collaboration between researchers from IBM, the Massachusetts Institute of Technology and other institutions.
Between 30% and 40% of the total food supply in the United States is ultimately wasted. Microbes can infect fresh produce in the field; temperature changes and poor handling during transport can make the product unsafe for human consumption; and incorrect storage at home can spoil fresh food. In the U.S. alone, $162 billion worth of edible food goes uneaten. The researchers hope to design new proteins that can cover vegetables or fruits in a thin layer, preventing produce from spoiling and eventually saving millions of dollars now lost to food waste.
The novel method, computer system and program can design complex proteins from a sequence of amino acids and then fold them. Tengfei Ma, co-inventor and a research staff member at IBM, likens these sequences to language. Just as humans learn how to combine words to form more complex grammatical patterns, such as sentences and paragraphs, machines can learn how to compose sequences of amino acids into proteins. The process of folding these structural proteins makes a three-dimensional protein chain functional.
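The language analogy can be made concrete with a small sketch: just as text is split into words before a model reads it, an amino acid sequence can be turned into a list of numeric tokens. The article does not describe the encoding the inventors actually use, so the one-letter alphabet ordering and the helper names `tokenize` and `one_hot` below are assumptions for illustration only.

```python
# The 20 standard amino acids in one-letter code. The ordering is an
# arbitrary choice for this example, not taken from the patent.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TOKEN_ID = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def tokenize(sequence):
    """Map each residue letter to an integer id, like words to vocabulary ids."""
    ids = []
    for position, residue in enumerate(sequence.upper()):
        if residue not in TOKEN_ID:
            raise ValueError(f"unknown residue {residue!r} at position {position}")
        ids.append(TOKEN_ID[residue])
    return ids

def one_hot(ids):
    """Expand each id into a 20-element vector with a single 1."""
    return [[1 if i == token else 0 for i in range(len(AMINO_ACIDS))]
            for token in ids]

# A made-up short sequence, used only to show the shape of the output.
ids = tokenize("ACDKLE")
vectors = one_hot(ids)
print(ids)
print(len(vectors), len(vectors[0]))
```

Once a sequence is in this numeric form, a neural network can process it position by position, much as a language model processes a sentence word by word.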
"Once we have a way to use AI to make a fast prediction of the 3D structure of proteins, hopefully we can accelerate the whole scientific discovery process," Pin-Yu Chen told The Academic Times. Chen is a co-inventor and an IBM research staff member who works in the company's Trusted AI Group. Chen, Ma and their collaborators are already on the way to discovering new proteins through artificial intelligence.
The researchers at IBM developed a machine learning technique to speed up protein design. That technique incorporated extensive knowledge from biology experts at MIT. Adding biology to the AI model allowed the researchers "to make the prediction more feasible, as the team at MIT has the ability to synthesize all the proteins we generate," Chen explained. "Accurate angle prediction may accelerate the process with less than six orders of magnitude time — from days or weeks to second or minutes," the authors stated in the patent application. AI has been known to shorten turnaround times in the long process of protein engineering.
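To put the "orders of magnitude" figure from the patent application in perspective, the short calculation below converts the quoted time ranges into powers of ten. The function name `orders_of_magnitude` is made up for this sketch, and the pairings of "week to second" and "day to minute" are this example's reading of the quote, not figures from the inventors.

```python
import math

# Durations in seconds for the units named in the quote.
SECONDS = {"second": 1, "minute": 60, "day": 86_400, "week": 604_800}

def orders_of_magnitude(slow_seconds, fast_seconds):
    """How many powers of ten separate two durations."""
    return math.log10(slow_seconds / fast_seconds)

# Largest gap in the quoted range: a week of computation cut to one second.
largest = orders_of_magnitude(SECONDS["week"], SECONDS["second"])
# Smallest gap in the quoted range: a day of computation cut to one minute.
smallest = orders_of_magnitude(SECONDS["day"], SECONDS["minute"])
print(round(largest, 2), round(smallest, 2))
```

Under these assumptions, even the most extreme case, a week reduced to a second, stays just under six orders of magnitude.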
In a paper published in 2019, the researchers successfully designed a simpler type of protein with a computer model. The team created an algorithm called a multi-scale neighborhood-based neural network, which could predict a certain type of protein many times faster than current methods. "If you run these simulations to figure out the 3D structures, it's actually very expensive and takes a lot of time to run, but if you use our tool, it can give you roughly similar predictions in milliseconds," said Chen. With this model, the researchers were able to fold a structural protein called an alpha-helical protein from a basic amino acid sequence. An alpha-helical protein is a single coiled chain of amino acids and is comparatively easy to design.
"It's safe to say that our algorithm can solve the alpha helix prediction problem," Chen noted, "but our current goal is to develop an algorithm to make complex proteins." The team's ultimate ambition is to design new proteins with specific functions, not only to curb food waste but to advance other areas of biology as well.
Ma hopes to collaborate with experts in the medical field to improve the team's predictions. He thinks that unique 3D structures of proteins can help detect disease in the future, though he emphasizes that there's still a lot of research to be done. Chen doesn't think that AI can replace scientific discovery entirely but said that it is a great low-cost tool for predicting proteins and running simulations.
Future research from the team is also, surprisingly, musical. Co-inventor Markus Jochen Buehler is looking into the audible qualities of COVID-19 proteins and discovering a way to "play the protein," Chen mentioned. Buehler recently created otherworldly music from digital scans of spiderwebs.
"Part of the IBM idea is to contribute to our society," Chen said "We carry that gene of doing something good for science." The company funds research with its Science for Social Good project that looks at how machine learning can solve some of humanity's most pressing issues, including the opioid epidemic and Zika virus. Ma adds that, "Artificial intelligence research can make a real social impact on the world" by sorting through available data to find solutions.
The application for this patent, "Designing and folding structural proteins from the primary amino acid sequence," was filed Sept. 27, 2019, with the U.S. Patent and Trademark Office. It was published April 1 with the application number 16/585679. The inventors of the pending patent are Lingfei Wu, Siyu Huo, Tengfei Ma and Pin-Yu Chen, IBM Research; Zhao Qin, Syracuse University; Eugene Jungsup Lim, Hui Sun, Benedetto Marelli and Markus Jochen Buehler, Massachusetts Institute of Technology; and Francisco Javier Martin-Martinez, Universitat Autònoma de Barcelona. The applicants are the International Business Machines Corporation and the Massachusetts Institute of Technology.
Parola Analytics provided technical research for this story.