Arrow Research

Author name cluster

Jason Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
2 author rows

Possible papers (6)

ICML 2025 Conference Paper

DOLPHIN: A Programmable Framework for Scalable Neurosymbolic Learning

  • Aaditya Naik
  • Jason Liu
  • Claire Wang
  • Amish Sethi
  • Saikat Dutta 0001
  • Mayur Naik
  • Eric Wong 0001

Neurosymbolic learning enables the integration of symbolic reasoning with deep learning but faces significant challenges in scaling to complex symbolic programs, large datasets, or both. We introduce DOLPHIN, a framework that tackles these challenges by supporting neurosymbolic programs in Python, executing complex symbolic reasoning on the CPU while vectorizing probabilistic computations and gradient propagation on the GPU. Across 13 benchmarks spanning tasks over text, image, and video data, with symbolic reasoning features like recursion and blackbox functions, DOLPHIN converges to state-of-the-art accuracies on the more complex benchmarks while existing frameworks such as Scallop, ISED, and IndeCateR+ fail to converge within the time limit. On simpler benchmarks, DOLPHIN matches their performance, while achieving these results 1.71x to 62x faster than the baselines. Overall, DOLPHIN advances the scalability of neurosymbolic frameworks, achieving state-of-the-art efficiency and convergence on difficult benchmarks where existing frameworks struggle. The code is published at https://github.com/Dolphin-NeSy/Dolphin.
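
As a rough illustration of the execution model the abstract describes, here is a minimal sketch in PyTorch on the canonical MNIST-digit-sum benchmark: the symbolic mapping lives in plain Python, while the probabilistic computation and its gradients run as vectorized tensor operations. This is not DOLPHIN's actual API; the function below is an assumption for illustration only.

```python
# A minimal sketch (not DOLPHIN's actual API) of the execution model the
# abstract describes: the symbolic part stays in ordinary Python, while the
# probabilistic computation and its gradients are vectorized on the GPU.
import torch

def prob_digit_sum(p_a: torch.Tensor, p_b: torch.Tensor) -> torch.Tensor:
    """Distribution over a + b for two batched digit distributions.

    p_a, p_b: (batch, 10) class probabilities from a neural classifier.
    Returns:  (batch, 19) probabilities over the sums 0..18.
    """
    # Vectorized joint distribution over all (a, b) pairs: (batch, 10, 10).
    joint = p_a.unsqueeze(2) * p_b.unsqueeze(1)
    # Symbolic side, in plain Python: which (a, b) pairs yield each sum.
    digits = torch.arange(10, device=p_a.device)
    sums = digits.unsqueeze(1) + digits.unsqueeze(0)  # (10, 10) sum table
    # Aggregate probability mass per possible sum; autograd carries
    # gradients back through `joint` into the classifier outputs.
    return torch.stack([joint[:, sums == s].sum(dim=1) for s in range(19)],
                       dim=1)
```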

AAAI 2024 Conference Paper

Relational Programming with Foundation Models

  • Ziyang Li
  • Jiani Huang
  • Jason Liu
  • Felix Zhu
  • Eric Zhao
  • William Dodds
  • Neelay Velingker
  • Rajeev Alur

Foundation models have vast potential to enable diverse AI applications. The powerful yet incomplete nature of these models has spurred a wide range of mechanisms to augment them with capabilities such as in-context learning, information retrieval, and code interpreting. We propose Vieira, a declarative framework that unifies these mechanisms in a general solution for programming with foundation models. Vieira follows a probabilistic relational paradigm and treats foundation models as stateless functions with relational inputs and outputs. It supports neuro-symbolic applications by enabling the seamless combination of such models with logic programs, as well as complex, multi-modal applications by streamlining the composition of diverse sub-models. We implement Vieira by extending the Scallop compiler with a foreign interface that supports foundation models as plugins. We implement plugins for 12 foundation models including GPT, CLIP, and SAM. We evaluate Vieira on 9 challenging tasks that span language, vision, and structured and vector databases. Our evaluation shows that programs in Vieira are concise, can incorporate modern foundation models, and have comparable or better accuracy than competitive baselines.
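
As a rough sketch of the paradigm the abstract describes, the snippet below treats a model as a stateless function from an input relation to a weighted output relation. This is illustrative only, not Vieira's or Scallop's actual foreign interface, and `classify` is a hypothetical model call (e.g., zero-shot labels from a CLIP-style classifier).

```python
# A minimal sketch of "foundation models as stateless functions with
# relational inputs and outputs". This is illustrative only; it is not
# Vieira's foreign interface, and `classify` is a hypothetical model call.
from typing import Callable

Fact = tuple[str, ...]             # a relational tuple
WeightedFact = tuple[Fact, float]  # a tuple paired with a probability

def apply_model(classify: Callable[[str], list[tuple[str, float]]],
                images: set[Fact]) -> set[WeightedFact]:
    """Lift a per-item model call into a relation-to-relation mapping."""
    out: set[WeightedFact] = set()
    for (img_id,) in images:                  # input relation: image(img_id)
        for label, prob in classify(img_id):  # e.g. zero-shot labels
            out.add(((img_id, label), prob))  # output: label(img_id, label)
    return out
```

A logic program can then join such weighted output relations with ordinary rules, which is the composition of models and logic programs the abstract describes.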

ICML 2022 Conference Paper

Learning inverse folding from millions of predicted structures

  • Chloe Hsu
  • Robert Verkuil
  • Jason Liu
  • Zeming Lin
  • Brian Hie
  • Tom Sercu
  • Adam Lerer
  • Alexander Rives

We consider the problem of predicting a protein sequence from its backbone atom coordinates. Machine learning approaches to this problem to date have been limited by the number of available experimentally determined protein structures. We augment training data by nearly three orders of magnitude by predicting structures for 12M protein sequences using AlphaFold2. Trained with this additional data, a sequence-to-sequence transformer with invariant geometric input processing layers achieves 51% native sequence recovery on structurally held-out backbones with 72% recovery for buried residues, an overall improvement of almost 10 percentage points over existing methods. The model generalizes to a variety of more complex tasks including design of protein complexes, partially masked structures, binding interfaces, and multiple states.
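
The paper's input layers are GVP-based; as a much-simplified picture of what invariant geometric input processing means, the sketch below derives features from backbone coordinates that are unchanged under rotation and translation. The k-nearest-neighbor distance featurization is an assumption for illustration, not the paper's architecture.

```python
# A simplified illustration of rotation- and translation-invariant input
# features from backbone coordinates. The paper itself uses GVP-based
# layers; this k-nearest-neighbor distance featurization is an assumption.
import torch

def ca_distance_features(ca: torch.Tensor, k: int = 16) -> torch.Tensor:
    """ca: (L, 3) alpha-carbon coordinates for a length-L backbone.
    Returns (L, k): each residue's distances to its k nearest residues,
    which are invariant to any rigid-body motion of the structure."""
    d = torch.cdist(ca, ca)                # (L, L) pairwise distances
    knn, _ = d.topk(k + 1, largest=False)  # k+1 smallest, incl. self (0.0)
    return knn[:, 1:]                      # drop the self-distance column
```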

NeurIPS 2021 Conference Paper

Language models enable zero-shot prediction of the effects of mutations on protein function

  • Joshua Meier
  • Roshan Rao
  • Robert Verkuil
  • Jason Liu
  • Tom Sercu
  • Alex Rives

Modeling the effect of sequence variation on function is a fundamental problem for understanding and designing proteins. Since evolution encodes information about function into patterns in protein sequences, unsupervised models of variant effects can be learned from sequence data. The approach to date has been to fit a model to a family of related sequences. The conventional setting is limited, since a new model must be trained for each prediction task. We show that using only zero-shot inference, without any supervision from experimental data or additional training, protein language models capture the functional effects of sequence variation, performing at a state-of-the-art level.
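
A common zero-shot scoring rule in this line of work is the masked-marginal log-odds: mask the mutated position and compare the model's log-probabilities for the mutant and wild-type residues. The sketch below assumes hypothetical `model` and `aa_index` stand-ins rather than any specific library API.

```python
# A sketch of masked-marginal zero-shot scoring: mask the mutated position
# and score the variant by the log-odds of mutant vs. wild-type residue.
# `model` and `aa_index` are hypothetical stand-ins, not a specific API.
import torch

def masked_marginal_score(model, aa_index: dict[str, int],
                          tokens: torch.Tensor, pos: int,
                          wt: str, mut: str, mask_id: int) -> float:
    masked = tokens.clone()
    masked[pos] = mask_id              # hide the mutated site from the model
    with torch.no_grad():
        logits = model(masked)         # (seq_len, vocab) token logits
    log_probs = torch.log_softmax(logits[pos], dim=-1)
    # Positive score: the model prefers the mutant residue at this site.
    return (log_probs[aa_index[mut]] - log_probs[aa_index[wt]]).item()
```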

ICML 2021 Conference Paper

MSA Transformer

  • Roshan Rao
  • Jason Liu
  • Robert Verkuil
  • Joshua Meier
  • John F. Canny
  • Pieter Abbeel
  • Tom Sercu
  • Alexander Rives

Unsupervised protein language models trained across millions of diverse sequences learn structure and function of proteins. Protein language models studied to date have been trained to perform inference from individual sequences. The longstanding approach in computational biology has been to make inferences from a family of evolutionarily related sequences by fitting a model to each family independently. In this work we combine the two paradigms. We introduce a protein language model which takes as input a set of sequences in the form of a multiple sequence alignment. The model interleaves row and column attention across the input sequences and is trained with a variant of the masked language modeling objective across many protein families. The performance of the model surpasses current state-of-the-art unsupervised structure learning methods by a wide margin, with far greater parameter efficiency than prior state-of-the-art protein language models.
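
As a rough sketch of the interleaved row and column attention the abstract describes, the block below applies stock PyTorch attention along each axis of the MSA. Details such as the model's tied row attention are omitted; this is an illustrative assumption, not the published architecture.

```python
# A minimal sketch of interleaved row/column ("axial") attention over an
# MSA, using stock PyTorch attention. The real model's tied row attention
# and other details are omitted; this is an assumption for illustration.
import torch
import torch.nn as nn

class AxialBlock(nn.Module):
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.row_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.col_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, msa: torch.Tensor) -> torch.Tensor:
        # msa: (rows, cols, dim) -- one aligned sequence per row.
        x = msa + self.row_attn(msa, msa, msa)[0]  # attend across positions
        xt = x.transpose(0, 1)                     # (cols, rows, dim)
        xt = xt + self.col_attn(xt, xt, xt)[0]     # attend across sequences
        return xt.transpose(0, 1)                  # back to (rows, cols, dim)
```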

ICRA 2002 Conference Paper

300mm Full Automation Integration Test Methodology and Experience

  • Ming Wang
  • Eric Chang
  • Jason Liu

Taiwan Semiconductor Manufacturing Company (TSMC) launched the first fully automated 300 mm foundry in 1999. The newly introduced technology automates the 300 mm fab for better manufacturing control and less human labor. Based on TSMC data accumulated from 1999 to 2000, maintaining the system proved more difficult than building it. Closely integrating so many hardware and software components is a great challenge, and in fact few 300 mm foundries are successful from a maintenance point of view. TSMC continues to improve full fab automation through a strict integration test methodology and a daily problem-analysis system, and has enlarged the acceptance test scope to cover process equipment, transportation, material control, and the manufacturing system. To make sure the integration is smooth and on the right track, TSMC has developed a methodology for executing hardware and software integration. Here TSMC exposes some of the critical issues and shares its experience of operating an unmanned factory, in the belief that this methodology and experience can help other 300 mm fabs speed up system construction and uncover their own problems.