Author name cluster

Longqi Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
1 author row

Possible papers

NeurIPS Conference 2025 Conference Paper

Struct-Bench: A Benchmark for Differentially Private Structured Text Generation

  • Shuaiqi Wang
  • Vikas Raunak
  • Arturs Backurs
  • Victor Reis
  • Pei Zhou
  • Sihao Chen
  • Longqi Yang
  • Zinan Lin

Differentially private (DP) synthetic data generation is a promising technique for utilizing private datasets that otherwise cannot be exposed for model training or other analytics. While much research literature has focused on generating private unstructured text and image data, in enterprise settings, structured data (e.g., tabular) is more common, often including natural language fields or components. Existing synthetic data evaluation techniques (e.g., FID) struggle to capture the structural properties and correlations of such datasets. In this work, we propose Struct-Bench, a framework and benchmark for evaluating synthetic datasets derived from structured datasets that contain natural language data. The Struct-Bench framework requires users to provide a representation of their dataset structure as a Context-Free Grammar (CFG). Our benchmark comprises 5 real-world and 2 synthetically generated datasets. We show that these datasets demonstrably present a great challenge even for state-of-the-art DP synthetic data generation methods. Struct-Bench provides reference implementations of different metrics and a leaderboard, offering a standardized platform to benchmark and investigate privacy-preserving synthetic data methods. We also present a case study showing how Struct-Bench improves the synthetic data quality of Private Evolution (PE) on structured data. The benchmark and the leaderboard have been publicly made available at https://struct-bench.github.io.

IJCAI Conference 2023 Conference Paper

Learning Causal Effects on Hypergraphs (Extended Abstract)

  • Jing Ma
  • Mengting Wan
  • Longqi Yang
  • Jundong Li
  • Brent Hecht
  • Jaime Teevan

Hypergraphs provide an effective abstraction for modeling multi-way group interactions among nodes, where each hyperedge can connect any number of nodes. Different from most existing studies which leverage statistical dependencies, we study hypergraphs from the perspective of causality. Specifically, we focus on the problem of individual treatment effect (ITE) estimation on hypergraphs, aiming to estimate how much an intervention (e.g., wearing face covering) would causally affect an outcome (e.g., COVID-19 infection) of each individual node. Existing works on ITE estimation either assume that the outcome of one individual should not be influenced by the treatment of other individuals (i.e., no interference), or assume the interference only exists between connected individuals in an ordinary graph. We argue that these assumptions can be unrealistic on real-world hypergraphs, where higher-order interference can affect the ITE estimations due to group interactions. We investigate high-order interference modeling, and propose a new causality learning framework powered by hypergraph neural networks. Extensive experiments on real-world hypergraphs verify the superiority of our framework over existing baselines.

NeurIPS Conference 2021 Conference Paper

Graph Adversarial Self-Supervised Learning

  • Longqi Yang
  • Liangliang Zhang
  • Wenjing Yang

This paper studies a long-standing problem of learning the representations of a whole graph without human supervision. The recent self-supervised learning methods train models to be invariant to the transformations (views) of the inputs. However, designing these views requires the experience of human experts. Inspired by adversarial training, we propose an adversarial self-supervised learning (GASSL) framework for learning unsupervised representations of graph data without any handcrafted views. GASSL automatically generates challenging views by adding perturbations to the input that are adversarially trained with respect to the encoder. Our method optimizes the min-max problem and utilizes a gradient accumulation strategy to accelerate the training process. Experiments on ten graph classification datasets show that the proposed approach is superior to state-of-the-art self-supervised learning baselines, which are competitive with supervised models.

TIST Journal 2018 Journal Article

Characterizing User Skills from Application Usage Traces with Hierarchical Attention Recurrent Networks

  • Longqi Yang
  • Chen Fang
  • Hailin Jin
  • Matthew D. Hoffman
  • Deborah Estrin

Predicting users’ proficiencies is a critical component of AI-powered personal assistants. This article introduces a novel approach for the prediction based on users’ diverse, noisy, and passively generated application usage histories. We propose a novel bi-directional recurrent neural network with a hierarchical attention mechanism to extract sequential patterns and distinguish informative traces from noise. Our model is able to attend to the most discriminative actions and sessions to make more accurate and directly interpretable predictions while requiring 50× less training data than the state-of-the-art sequential learning approach. We evaluate our model with two large-scale datasets collected from 68K Photoshop users: a digital design skill dataset where the user skill is determined by the quality of the end products and a software skill dataset where users self-disclose their software usage skill levels. The empirical results demonstrate our model’s superior performance compared to existing user representation learning techniques that leverage action frequencies and sequential patterns. In addition, we qualitatively illustrate the model’s significant interpretative power. The proposed approach is broadly relevant to applications that generate user time-series analytics.