“DALL-E 2 of biology” designs proteins for new drugs

"Now that we have this ability, the possibilities of what we can produce are endless."

The recent release of powerful text-to-image AIs like DALL-E 2 has given anyone the ability to generate photorealistic images based on nothing but short text prompts. 

Now, the same AI technique is being used to generate complex, never-before-seen proteins on demand, and these “programmable proteins” could one day be used to treat countless medical conditions.

Proteins 101: Proteins are hugely important to life — these molecules give our cells shape, power the immune system, help build and repair our tissues, transport oxygen throughout the body, and so much more.

Each protein is made up of a long string of chemical compounds called “amino acids.” Twenty types of amino acids can be found in proteins, and a single protein chain can be thousands of amino acids long.

The amino acids in a protein cause it to fold into a complex three-dimensional structure. That structure determines the protein’s function, so knowing it gives us a better idea of what the protein does and how it works, which can have huge implications for medicine.

Our understanding of the structure of the coronavirus’ spike protein, for example, was key to the development of COVID-19 vaccines. Monoclonal antibodies, meanwhile, are lab-made proteins, each product consisting of identical copies of a single antibody; they’re used to treat infections, cancers, Alzheimer’s, and more.

AI advances: There are a mind-boggling number of possible protein structures — one estimate puts the number at a googol cubed, or 1 followed by 300 zeroes — and traditionally, the process of identifying a single protein’s structure has been expensive and time-consuming. 
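To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It checks that a googol cubed really is 1 followed by 300 zeroes, and then counts possible amino acid sequences, which is a related but different quantity from the structure estimate quoted above. The function names are hypothetical, and the only assumption is that each position in a chain can hold any of the 20 amino acid types.

```python
AMINO_ACID_TYPES = 20  # number of amino acid types found in proteins


def sequence_count(chain_length):
    """Number of distinct amino acid sequences of a given length."""
    return AMINO_ACID_TYPES ** chain_length


def shortest_chain_exceeding(limit):
    """Smallest chain length whose sequence count is at least `limit`."""
    length = 1
    while sequence_count(length) < limit:
        length += 1
    return length


googol = 10 ** 100
googol_cubed = googol ** 3  # 1 followed by 300 zeroes

# Python integers are arbitrary precision, so these comparisons are exact.
zero_count = len(str(googol_cubed)) - 1
crossover_length = shortest_chain_exceeding(googol_cubed)

print(f"A googol cubed has {zero_count} zeroes")
print(f"Sequences outnumber it at chain length {crossover_length}")
```

Under that assumption, a chain only a few hundred amino acids long already has more possible sequences than a googol cubed, and real protein chains can run into the thousands.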

That changed with the development of AlphaFold, an AI that can accurately predict how a protein will fold based on its sequence of amino acids. AlphaFold was a huge boon for research, giving scientists access to predicted structures for nearly all of the roughly 200 million known proteins.

But now, Boston-based startup Generate Biomedicines is further advancing our understanding and use of proteins by training an AI called “Chroma” to create proteins with structures no one has ever seen before.

“We believe our model will have revolutionary implications,” said Gevorg Grigoryan, Generate’s co-founder and CTO. “It is akin to learning how to ‘write’ in the mysterious language of proteins. Now that we have this ability, the possibilities of what we can produce are endless.”

Examples of proteins generated by Chroma. Credit: Generate Biomedicines

How it works: Generate described Chroma to MIT Technology Review as the “DALL-E 2 of biology,” and as is the case with the text-to-image AI, the generation process starts with a user submitting a request — they might ask for a protein with a certain size, shape, or function, for example. 

The AI then uses diffusion modeling, the same technique behind DALL-E 2, to generate a protein that contains the right amino acids folded in the right way to meet the constraints of the prompt.
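For readers curious about the mechanics, diffusion modeling can be illustrated with a toy sketch in Python. This is not Chroma's code, and every name and number in it is hypothetical. A real system trains a neural network to estimate the noise in a sample, guided by the user's prompt; here an "oracle" that already knows the target stands in for that network, so the example shows only the two halves of the process: adding noise step by step, then removing it step by step.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal left after each noising step."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Forward process: blend clean values with Gaussian noise."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
          for x, e in zip(x0, eps)]
    return xt, eps


def oracle_noise(xt, target, alpha_bar):
    """Stand-in for a trained network (assumption: it knows the target)."""
    return [(x - math.sqrt(alpha_bar) * c) / math.sqrt(1.0 - alpha_bar)
            for x, c in zip(xt, target)]


def denoise(xt, target, alpha_bars):
    """Reverse process: deterministic, DDIM-style steps from noisy to clean."""
    x = list(xt)
    for t in range(len(alpha_bars) - 1, -1, -1):
        ab_t = alpha_bars[t]
        ab_prev = alpha_bars[t - 1] if t > 0 else 1.0
        eps_hat = oracle_noise(x, target, ab_t)
        # Estimate the clean sample, then re-noise it to the previous level.
        x0_hat = [(xi - math.sqrt(1.0 - ab_t) * e) / math.sqrt(ab_t)
                  for xi, e in zip(x, eps_hat)]
        x = [math.sqrt(ab_prev) * c + math.sqrt(1.0 - ab_prev) * e
             for c, e in zip(x0_hat, eps_hat)]
    return x


rng = random.Random(0)
target = [0.0, 1.5, 3.0, 4.5]  # hypothetical 1D stand-in for a protein shape
alpha_bars = make_alpha_bars(200)
noisy, _ = add_noise(target, alpha_bars[-1], rng)
recovered = denoise(noisy, target, alpha_bars)
```

Because the oracle is handed the answer, the recovered values match the target almost exactly. The hard part in practice is training a network that can make good noise estimates without knowing the answer, using only the constraints in the prompt.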

In a paper now available as a preprint, the Generate team showed how Chroma could be used to design proteins in the shapes of all 26 letters of the Latin alphabet and the numerals 0 through 9.

Proteins designed to match the shapes of letters and numbers. Credit: Generate Biomedicines

They also demonstrated how the system can be used to generate giant proteins with thousands of amino acids and “complexes” containing multiple proteins of different shapes.

A protein containing 2,000 amino acids (left) and a complex containing multiple proteins (right). Credit: Generate Biomedicines

The big picture: Just as DALL-E 2 wasn’t the first text-to-image AI, Generate’s Chroma isn’t the first AI designed to generate new proteins, but it is trained on more data than past efforts and gives researchers more control over the type of protein produced.

“It may be fair to say that this is more like DALL-E because of how they’ve scaled things up,” Namrata Anand, who shared a paper in May 2022 detailing a protein-generating AI she’d co-developed, told MIT Technology Review.

Designing new proteins is just the first step to revolutionizing healthcare, though.

The Generate team is now focusing on recreating some of their AI’s designs in the lab. After that will come the lengthy process of developing therapies using the novel proteins and then testing them in animals and humans.

“We’re a drug company,” Grigoryan told MIT Technology Review. “At the end of the day what matters is whether we can make medicines that work or not.”
