Sufeng Niu Papers

ICML Conference 2020 Conference Paper

Learning to Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning

Sai Krishna Gottipati
Boris Sattarov
Sufeng Niu
Yashaswi Pathak
Haoran Wei
Shengchao Liu
Simon Blackburn
Karam M. J. Thomas

Over the last decade, there has been significant progress in the field of machine learning for de novo drug design, particularly in generative modeling of novel chemical structures. However, current generative approaches exhibit a significant challenge: they do not ensure that the proposed molecular structures can be feasibly synthesized nor do they provide the synthesis routes of the proposed small molecules, thereby seriously limiting their practical applicability. In this work, we propose a novel reinforcement learning (RL) setup for de novo drug design: Policy Gradient for Forward Synthesis (PGFS), that addresses this challenge by embedding the concept of synthetic accessibility directly into the de novo drug design system. In this setup, the agent learns to navigate through the immense synthetically accessible chemical space by subjecting initial commercially available molecules to valid chemical reactions at every time step of the iterative virtual synthesis process. The proposed environment for drug discovery provides a highly challenging test-bed for RL algorithms owing to the large state space and high-dimensional continuous action space with hierarchical actions. PGFS achieves state-of-the-art performance in generating structures with high QED and clogP. Moreover, we validate PGFS in an in-silico proof-of-concept associated with three HIV targets. Finally, we describe how the end-to-end training conceptualized in this study represents an important paradigm in radically expanding the synthesizable chemical space and automating the drug discovery process.

Details

AAAI Conference 2018 Conference Paper

Generalized Value Iteration Networks:Life Beyond Lattices

Sufeng Niu
Siheng Chen
Hanyu Guo
Colin Targonski
Melissa Smith
Jelena Kovačević

In this paper, we introduce a generalized value iteration network (GVIN), which is an end-to-end neural network planning module. GVIN emulates the value iteration algorithm by using a novel graph convolution operator, which enables GVIN to learn and plan on irregular spatial graphs. We propose three novel differentiable kernels as graph convolution operators and show that the embedding-based kernel achieves the best performance. Furthermore, we present episodic Q-learning, an improvement upon traditional n-step Q-learning that stabilizes training for VIN and GVIN. Lastly, we evaluate GVIN on planning problems in 2D mazes, irregular graphs, and realworld street networks, showing that GVIN generalizes well for both arbitrary graphs and unseen graphs of larger scale and outperforms a naive generalization of VIN (discretizing a spatial graph into a 2D image).

PDF Details

Possible papers

Learning to Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning

Generalized Value Iteration Networks:Life Beyond Lattices