Salesforce project is creating proteins with generative AI
Artificial proteins are as good as the natural ones
By Alfonso Maruccia
Why it matters: Generative AI technology can be much more than a mere threat to the livelihood of countless artists and writers. According to Salesforce, a perfectly trained ML algorithm can help with the creation of artificial proteins useful for healthcare or environmental sustainability.
Salesforce Research has been working on an innovative bio-application for artificial intelligence algorithms. ProGen is an AI model designed to create synthetic proteins. It was trained on hundreds of millions of protein sequences in textual form, and the resulting artificial proteins are as efficient as the natural ones at removing waste.
ProGen has been featured in this month's edition of Nature Biotechnology, where the researchers also described how the first known 3D structure of an artificial protein was fully designed by an AI system. ProGen is a language model that can "generate protein sequences with a predictable function across large protein families," just like ChatGPT can put different chunks of text together to get "grammatically and semantically correct natural language sentences on diverse topics."
The new AI model was trained on 280 million protein sequences from more than 19,000 families, the researchers said, and it has been "augmented" with control tags containing protein properties. With enough homologous samples from different protein families, ProGen can be further fine-tuned to "improve controllable generation performance of proteins."
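To make the idea of control tags more concrete, here is a minimal Python sketch of how a protein sequence could be turned into tokens for a conditional language model, with a tag placed in front of the amino-acid letters. This is an illustration only, not ProGen's actual code or data format: the tag names (`family_a`, `family_b`), the `<eos>` marker, and the helper functions are all hypothetical.

```python
# Illustrative sketch only: the tag names, the <eos> token, and the vocabulary
# layout are assumptions, not ProGen's real format.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def build_vocab(control_tags):
    """Map every token (end marker, control tags, amino acids) to an integer id."""
    tokens = ["<eos>"] + [f"<{t}>" for t in sorted(control_tags)] + list(AMINO_ACIDS)
    return {tok: i for i, tok in enumerate(tokens)}


def encode(sequence, tags, vocab):
    """Prepend control tags to the sequence, append the end marker, return ids."""
    tokens = [f"<{t}>" for t in tags] + list(sequence) + ["<eos>"]
    return [vocab[tok] for tok in tokens]


def next_token_pairs(ids):
    """Build (current token, next token) pairs, the usual language-model target."""
    return list(zip(ids[:-1], ids[1:]))


vocab = build_vocab({"family_a", "family_b"})   # hypothetical protein families
ids = encode("MKT", ["family_b"], vocab)        # a made-up three-residue sequence
pairs = next_token_pairs(ids)
print(ids)
print(pairs)
```

Because the tag comes first, every later prediction is conditioned on it. In a setup like this, swapping the tag at generation time is what would steer the model toward a different protein family.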
In other words, ProGen's generative AI gives researchers the ability to design highly tailored proteins "with desired properties" by using a controllable tool. ProGen "learned" protein synthesis rules by looking at the protein sequence database, and it was capable of generating a bunch of proteins that researchers later tested for their actual antibacterial properties in a lab.
According to Salesforce, test results showed that 73 percent of artificial proteins generated by ProGen were "functional" compared to 59 percent of natural proteins.
ProGen shows how scientists can develop a "proactive approach" to protein design, Salesforce said. In the future, the company hopes this new approach could help speed up the development of "treatments for diseases" and enzymes with industrial or environmental applications. Plastic-eating proteins, another potential application, could be a game changer.
Salesforce said it is already leveraging the ProGen generative model to identify "potential treatments" for neurological and autoimmune disorders such as rheumatoid arthritis and multiple sclerosis.