Large language models for scientific discovery in molecular property prediction – Aqsa Awan

Start: 2025-06-20T11:00:00+09:00
End: 2025-06-20T12:30:00+09:00
Location: B232 Seminar Room, IBS

June 20, 2025 @ 11:00 am - 12:30 pm KST

55 Expo-ro, Yuseong-gu
Daejeon 34126, Republic of Korea
https://www.ibs.re.kr

https://www.ibs.re.kr/bimag/event/large-language-models-for-scientific-discovery-in-molecular-property-prediction-aqsa-awan/

Speaker

Aqsa Awan
KAIST

In this talk, we discuss the paper “Large language models for scientific discovery in molecular property prediction” by Yizhen Zheng et al., Nature Machine Intelligence, 2025.

Abstract

Large language models (LLMs) are a form of artificial intelligence system encapsulating vast knowledge in the form of natural language. These systems are adept at numerous complex tasks including creative writing, storytelling, translation, question-answering, summarization and computer code generation. Although LLMs have seen initial applications in natural sciences, their potential for driving scientific discovery remains largely unexplored. In this work, we introduce LLM4SD, a framework designed to harness LLMs for driving scientific discovery in molecular property prediction by synthesizing knowledge from literature and inferring knowledge from scientific data. LLMs synthesize knowledge by extracting established information from scientific literature, such as molecular weight being key to predicting solubility. For inference, LLMs identify patterns in molecular data, particularly in Simplified Molecular Input Line Entry System-encoded structures, such as halogen-containing molecules being more likely to cross the blood–brain barrier. This information is presented as interpretable knowledge, enabling the transformation of molecules into feature vectors. By using these features with interpretable models such as random forest, LLM4SD can outperform the current state of the art across a range of benchmark tasks for predicting molecular properties. We foresee it providing interpretable and potentially new insights, aiding scientific discovery in molecular property prediction.
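The step the abstract describes as "the transformation of molecules into feature vectors" can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the three rules, their names, and the string-based SMILES checks are all assumptions made for the example, and a real pipeline would likely use a cheminformatics toolkit instead of regular expressions. Each rule maps a SMILES string to 0 or 1, and the resulting vectors could then be passed to an interpretable model such as a random forest (for example, scikit-learn's `RandomForestClassifier`), which is left out here to keep the sketch dependency-free.

```python
import re

# Bracket atoms such as [Fe] or [Cl-]; captures the element symbol
# that follows an optional isotope number.
BRACKET_ATOM = re.compile(r"\[\d*([A-Z][a-z]?)[^\]]*\]")
ANY_BRACKET = re.compile(r"\[[^\]]*\]")
HALOGENS = {"F", "Cl", "Br", "I"}


def has_halogen(smiles: str) -> int:
    """Hypothetical rule: 1 if the molecule contains F, Cl, Br or I."""
    if any(symbol in HALOGENS for symbol in BRACKET_ATOM.findall(smiles)):
        return 1
    # Outside brackets only the organic subset is allowed,
    # so a bare F, Cl, Br or I is always a halogen atom.
    bare = ANY_BRACKET.sub("", smiles)
    return int(re.search(r"Cl|Br|F|I", bare) is not None)


def has_ring(smiles: str) -> int:
    """Hypothetical rule: 1 if a ring-closure digit appears outside brackets."""
    bare = ANY_BRACKET.sub("", smiles)
    return int(re.search(r"\d", bare) is not None)


def has_carboxyl_like_group(smiles: str) -> int:
    """Hypothetical rule: crude substring check; it also matches esters."""
    return int("C(=O)O" in smiles)


# Illustrative stand-ins for rules an LLM might synthesize or infer.
RULES = [
    ("has_halogen", has_halogen),
    ("has_ring", has_ring),
    ("has_carboxyl_like_group", has_carboxyl_like_group),
]


def featurize(smiles: str) -> list:
    """Apply every rule in order to build an interpretable feature vector."""
    return [rule(smiles) for _, rule in RULES]


if __name__ == "__main__":
    for s in ["Clc1ccccc1", "CC(=O)O", "[Fe]"]:
        print(s, dict(zip([name for name, _ in RULES], featurize(s))))
```

Because every position in the vector corresponds to a named rule, feature importances from a downstream model can be read directly as statements about the rules themselves.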

Details

Date: June 20, 2025
Time: 11:00 am - 12:30 pm KST
Event Category: Journal Club

Organizer

Jae Kyoung Kim
Email: jaekkim@kaist.ac.kr

Venue

B232 Seminar Room, IBS
55 Expo-ro, Yuseong-gu
Daejeon 34126, Republic of Korea