
New chemical language model can predict drug candidates with dual targets

Researchers at the University of Bonn have trained an AI method to predict potential active ingredients with special properties. To do so, they developed a chemical language model, a kind of ChatGPT for molecules. After a training phase, the AI was able to accurately reproduce the chemical structures of compounds with known dual-target activity, which may be particularly effective drugs. The study has now been published in Cell Reports Physical Science.

If you want to delight your grandma with a poem for her 90th birthday, you no longer have to be a poet: a short prompt in ChatGPT is enough, and within a few seconds the AI will spit out a long list of words that rhyme with the birthday girl's name. If you wish, it can even produce a sonnet to go with it.

Researchers at the University of Bonn used a similar model in their study, a so-called chemical language model. It does not produce rhymes, however. Instead, the AI outputs the structural formulas of chemical compounds that may have a particularly desirable property: they can bind to two different target proteins. In the organism, for example, they can inhibit two enzymes at the same time.

Wanted: Active ingredients with dual effects

“Such active ingredients are highly sought after in pharmaceutical research because of their polypharmacology,” explains Prof. Dr. Jürgen Bajorath.

The expert in computational chemistry heads the AI in Life Sciences area at the Lamarr Institute for Machine Learning and Artificial Intelligence and the Life Science Informatics program at b-it (Bonn-Aachen International Center for Information Technology) at the University of Bonn. “Because compounds with desirable multi-target activity influence several intracellular processes and signaling pathways at the same time, they are often particularly effective, for example in the fight against cancer.” In principle, this effect can also be achieved by administering different drugs at the same time. However, this carries a risk of adverse drug interactions, and different compounds are often broken down at different rates in the body, which makes co-administration difficult.

Finding a molecule that specifically influences the action of a single target protein is no easy task. Designing compounds with a predefined dual effect is even more complicated. Chemical language models could help here in the future. ChatGPT is trained on billions of pages of written text and learns to formulate sentences on its own. Chemical language models work similarly, but have only comparatively small amounts of data to learn from. In essence, they are also fed with text, for example so-called SMILES strings, which represent organic molecules and their structure as a sequence of letters and symbols. “We have now trained our chemical language model with pairs of strings,” says Sanjana Srinivasan from Bajorath’s research group. “One of the strings described a molecule that we know acts against only one target protein. The other represented a compound that, in addition to this protein, also affects a second target protein.”
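To make the idea of "molecules as text" concrete, here is a minimal sketch of how a SMILES string can be split into tokens and how a source/target pair might be arranged for training. The regular expression is a commonly used tokenization pattern and an assumption here; the study's actual tokenizer and data format are not described in this article. The two molecules (aspirin and paracetamol) are well-known placeholders chosen only for illustration, not a real single-target/dual-target pair from the study.

```python
import re

# Assumed tokenization pattern: bracket atoms, two-letter halogens,
# single-letter atoms, ring-closure digits, bonds and branch symbols.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|%\d{2}|\d|[=#$/\\().+\-:~@?>*]"
)


def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into tokens; fail if any character is not covered."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


# Hypothetical training pair (placeholder molecules, NOT data from the study):
# the first string would play the role of the single-target compound,
# the second the role of the dual-target compound.
source_smiles = "CC(=O)Oc1ccccc1C(=O)O"  # aspirin, placeholder only
target_smiles = "CC(=O)Nc1ccc(O)cc1"     # paracetamol, placeholder only

training_pair = (tokenize(source_smiles), tokenize(target_smiles))

if __name__ == "__main__":
    print(training_pair[0])
    print(training_pair[1])
```

A sequence-to-sequence model would then learn to map the first token sequence to the second, much as a translation model maps a sentence in one language to a sentence in another.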

AI learns chemical relationships

The model was trained on more than 70,000 of these pairs. In doing so, it acquired implicit knowledge of how standard single-target active ingredients differ from those with dual activity. “When we then fed it a compound against a target protein, it suggested molecules that would act not only against that protein, but also against another one,” explains Bajorath.

The dual-target compounds used for training often act on proteins that are similar and therefore perform a similar function in the body. However, pharmaceutical research also looks for active ingredients that influence completely different classes of enzymes or receptors. To prepare the AI for this task, the general learning phase was followed by fine-tuning. The researchers used several dozen special training pairs to teach the algorithm which different classes of proteins the proposed compounds should target. This is a bit like telling ChatGPT to write a limerick this time instead of a sonnet.

After fine-tuning, the model did in fact produce molecules that have already been shown to act against the desired combinations of target proteins. “This shows that the process works,” says Bajorath. In his opinion, however, the strength of the approach is not that it can immediately find new compounds that outperform existing drugs. “What’s more interesting from my point of view is that the AI often suggests chemical structures that most chemists wouldn’t immediately think of,” he explains. “In a sense, it triggers out-of-the-box ideas and produces original solutions that can lead to new design hypotheses and approaches.”

Involved institutions and funding:

The study was carried out at the University of Bonn at the Lamarr Institute and b-it.

Journal reference:

Srinivasan, S. & Bajorath, J. (2024). Generation of dual-target compounds using a transformer chemical language model. Cell Reports Physical Science. doi.org/10.1016/j.xcrp.2024.102255.