Author name cluster

Longqi Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers

1 author row

NeurIPS Conference 2025 Conference Paper

Struct-Bench: A Benchmark for Differentially Private Structured Text Generation

Shuaiqi Wang
Vikas Raunak
Arturs Backurs
Victor Reis
Pei Zhou
Sihao Chen
Longqi Yang
Zinan Lin

Differentially private (DP) synthetic data generation is a promising technique for utilizing private datasets that otherwise cannot be exposed for model training or other analytics. While much research literature has focused on generating private unstructured text and image data, in enterprise settings, structured data (e.g., tabular) is more common, often including natural language fields or components. Existing synthetic data evaluation techniques (e.g., FID) struggle to capture the structural properties and correlations of such datasets. In this work, we propose Struct-Bench, a framework and benchmark for evaluating synthetic datasets derived from structured datasets that contain natural language data. The Struct-Bench framework requires users to provide a representation of their dataset structure as a Context-Free Grammar (CFG). Our benchmark comprises 5 real-world and 2 synthetically generated datasets. We show that these datasets demonstrably present a great challenge even for state-of-the-art DP synthetic data generation methods. Struct-Bench provides reference implementations of different metrics and a leaderboard, offering a standardized platform to benchmark and investigate privacy-preserving synthetic data methods. We also present a case study showing how Struct-Bench improves the synthetic data quality of Private Evolution (PE) on structured data. The benchmark and the leaderboard have been publicly made available at https://struct-bench.github.io.
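To make the CFG requirement concrete, here is a minimal sketch of checking whether a synthetic sample conforms to a grammar. It is not the Struct-Bench API: the toy grammar, the `Q:`/`A:` field markers, and the names `GRAMMAR`, `tokenize`, `conforms`, and `validity_rate` are all hypothetical, chosen only to illustrate the idea of grammar-based structural validity.

```python
from functools import lru_cache

# Hypothetical toy grammar (an assumption, not taken from Struct-Bench):
# a record is a "Q:" field followed by an "A:" field, each holding free text.
GRAMMAR = {
    "RECORD": [["QUERY", "ANSWER"]],
    "QUERY": [["Q:", "TEXT"]],
    "ANSWER": [["A:", "TEXT"]],
    "TEXT": [["word"], ["word", "TEXT"]],
}


def tokenize(sample):
    """Map field markers to themselves and every other token to 'word'."""
    return tuple(t if t in ("Q:", "A:") else "word" for t in sample.split())


def conforms(sample, start="RECORD"):
    """Return True if the whole sample can be derived from the start symbol."""
    tokens = tokenize(sample)

    @lru_cache(maxsize=None)
    def ends(symbol, i):
        # All positions j such that `symbol` derives tokens[i:j].
        if symbol not in GRAMMAR:  # terminal symbol
            matched = i < len(tokens) and tokens[i] == symbol
            return frozenset({i + 1}) if matched else frozenset()
        result = set()
        for production in GRAMMAR[symbol]:
            positions = {i}
            for part in production:
                positions = {j for p in positions for j in ends(part, p)}
            result |= positions
        return frozenset(result)

    return len(tokens) in ends(start, 0)


def validity_rate(samples):
    """Fraction of samples that parse under the grammar."""
    return sum(conforms(s) for s in samples) / len(samples)
```

A sample such as `"Q: what is dp A: a privacy definition"` parses under this toy grammar, while `"A: only an answer"` does not. The grammar is right-recursive on purpose, since this simple memoized matcher would not terminate on left-recursive rules.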


IJCAI Conference 2023 Conference Paper

Learning Causal Effects on Hypergraphs (Extended Abstract)

Jing Ma
Mengting Wan
Longqi Yang
Jundong Li
Brent Hecht
Jaime Teevan

Hypergraphs provide an effective abstraction for modeling multi-way group interactions among nodes, where each hyperedge can connect any number of nodes. Different from most existing studies which leverage statistical dependencies, we study hypergraphs from the perspective of causality. Specifically, we focus on the problem of individual treatment effect (ITE) estimation on hypergraphs, aiming to estimate how much an intervention (e.g., wearing face covering) would causally affect an outcome (e.g., COVID-19 infection) of each individual node. Existing works on ITE estimation either assume that the outcome of one individual should not be influenced by the treatment of other individuals (i.e., no interference), or assume the interference only exists between connected individuals in an ordinary graph. We argue that these assumptions can be unrealistic on real-world hypergraphs, where higher-order interference can affect the ITE estimations due to group interactions. We investigate high-order interference modeling, and propose a new causality learning framework powered by hypergraph neural networks. Extensive experiments on real-world hypergraphs verify the superiority of our framework over existing baselines.
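As a rough illustration of why group membership matters for interference, the sketch below computes a simple per-node exposure summary on a toy hypergraph. This is not the paper's framework, which relies on hypergraph neural networks; the data, the function name `group_exposure`, and the choice of averaging treated fractions are assumptions made for illustration only.

```python
# Hypothetical toy data: each hyperedge is a set of node ids (e.g. one group
# gathering), and treatment[i] is 1 if node i wears a face covering.
hyperedges = [{0, 1, 2}, {2, 3}, {0, 3, 4}]
treatment = {0: 1, 1: 0, 2: 1, 3: 0, 4: 0}


def group_exposure(node, hyperedges, treatment):
    """Average, over the hyperedges containing `node`, of the fraction of
    the other members of that hyperedge who are treated."""
    fractions = []
    for edge in hyperedges:
        if node in edge and len(edge) > 1:
            others = edge - {node}
            fractions.append(sum(treatment[o] for o in others) / len(others))
    return sum(fractions) / len(fractions) if fractions else 0.0


for node in sorted(treatment):
    print(node, group_exposure(node, hyperedges, treatment))
```

Under a no-interference assumption this quantity would be irrelevant to a node's outcome. A pairwise graph would also lose the information about which nodes met together in the same group, which is what the hyperedges preserve.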


NeurIPS Conference 2021 Conference Paper

Graph Adversarial Self-Supervised Learning

Longqi Yang
Liangliang Zhang
Wenjing Yang

This paper studies a long-standing problem of learning the representations of a whole graph without human supervision. The recent self-supervised learning methods train models to be invariant to the transformations (views) of the inputs. However, designing these views requires the experience of human experts. Inspired by adversarial training, we propose an adversarial self-supervised learning (GASSL) framework for learning unsupervised representations of graph data without any handcrafted views. GASSL automatically generates challenging views by adding perturbations to the input, which are adversarially trained with respect to the encoder. Our method optimizes the min-max problem and utilizes a gradient accumulation strategy to accelerate the training process. Experiments on ten graph classification datasets show that the proposed approach is superior to state-of-the-art self-supervised learning baselines, which are competitive with supervised models.
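The abstract does not spell out the min-max problem. One generic way to write an objective of this kind is given below, as a sketch only: the symbols $f_\theta$ (encoder), $G$ (input graph), $\delta$ (perturbation), $\epsilon$ (perturbation budget), and $\mathcal{L}_{\mathrm{ssl}}$ (a self-supervised loss comparing the two views) are assumptions and are not taken from the paper.

```latex
\min_{\theta} \; \mathbb{E}_{G}\!\left[ \max_{\|\delta\| \le \epsilon}
  \mathcal{L}_{\mathrm{ssl}}\big( f_\theta(G),\; f_\theta(G + \delta) \big) \right]
```

In this reading, the inner maximization searches for the perturbed view that the encoder finds hardest to match to the original, and the outer minimization trains the encoder to stay invariant to it.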


TIST Journal 2018 Journal Article

Characterizing User Skills from Application Usage Traces with Hierarchical Attention Recurrent Networks

Longqi Yang
Chen Fang
Hailin Jin
Matthew D. Hoffman
Deborah Estrin

Predicting users’ proficiencies is a critical component of AI-powered personal assistants. This article introduces a novel approach for this prediction based on users’ diverse, noisy, and passively generated application usage histories. We propose a novel bi-directional recurrent neural network with a hierarchical attention mechanism to extract sequential patterns and distinguish informative traces from noise. Our model is able to attend to the most discriminative actions and sessions to make more accurate and directly interpretable predictions while requiring 50× less training data than the state-of-the-art sequential learning approach. We evaluate our model with two large-scale datasets collected from 68K Photoshop users: a digital design skill dataset where the user skill is determined by the quality of the end products and a software skill dataset where users self-disclose their software usage skill levels. The empirical results demonstrate our model’s superior performance compared to existing user representation learning techniques that leverage action frequencies and sequential patterns. In addition, we qualitatively illustrate the model’s significant interpretative power. The proposed approach is broadly relevant to applications that generate user time-series analytics.
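The two-level attention idea (actions within a session, then sessions within a user history) can be sketched in plain Python as follows. This is a simplified illustration, not the article's model: it omits the bi-directional recurrent encoder entirely, and the dot-product scoring, the fixed query vectors, and all function names are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(vectors, query):
    """Attention-weighted average of `vectors`, scored by dot product with `query`."""
    scores = [sum(v_k * q_k for v_k, q_k in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]
    return pooled, weights


def hierarchical_attention(sessions, action_query, session_query):
    """Pool actions into session vectors, then sessions into one user vector."""
    session_vecs, action_weights = [], []
    for actions in sessions:
        pooled, weights = attend(actions, action_query)
        session_vecs.append(pooled)
        action_weights.append(weights)
    user_vec, session_weights = attend(session_vecs, session_query)
    return user_vec, session_weights, action_weights


# Hypothetical input: two sessions of 2-dimensional action embeddings.
sessions = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]]
user_vec, session_weights, action_weights = hierarchical_attention(
    sessions, action_query=[1.0, 0.0], session_query=[0.0, 1.0]
)
```

The returned weights are what make such a model inspectable: `action_weights` indicates which actions dominated each session summary, and `session_weights` indicates which sessions dominated the user representation.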
