Author name cluster

Sihao Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers

2 author rows

NeurIPS Conference 2025 Conference Paper

Struct-Bench: A Benchmark for Differentially Private Structured Text Generation

Shuaiqi Wang
Vikas Raunak
Arturs Backurs
Victor Reis
Pei Zhou
Sihao Chen
Longqi Yang
Zinan Lin

Differentially private (DP) synthetic data generation is a promising technique for utilizing private datasets that otherwise cannot be exposed for model training or other analytics. While much research literature has focused on generating private unstructured text and image data, in enterprise settings, structured data (e.g., tabular) is more common, often including natural language fields or components. Existing synthetic data evaluation techniques (e.g., FID) struggle to capture the structural properties and correlations of such datasets. In this work, we propose Struct-Bench, a framework and benchmark for evaluating synthetic datasets derived from structured datasets that contain natural language data. The Struct-Bench framework requires users to provide a representation of their dataset structure as a Context-Free Grammar (CFG). Our benchmark comprises 5 real-world and 2 synthetically generated datasets. We show that these datasets demonstrably present a great challenge even for state-of-the-art DP synthetic data generation methods. Struct-Bench provides reference implementations of different metrics and a leaderboard, offering a standardized platform to benchmark and investigate privacy-preserving synthetic data methods. We also present a case study showing how Struct-Bench improves the synthetic data quality of Private Evolution (PE) on structured data. The benchmark and the leaderboard have been made publicly available at https://struct-bench.github.io.


ICLR Conference 2024 Conference Paper

The Trickle-down Impact of Reward Inconsistency on RLHF

Lingfeng Shen
Sihao Chen
Linfeng Song
Lifeng Jin
Baolin Peng
Haitao Mi
Daniel Khashabi
Dong Yu

Standard practice within Reinforcement Learning from Human Feedback (RLHF) involves optimizing against a Reward Model (RM), which itself is trained to reflect human preferences for desirable generations. A notable subject that is understudied is the (in-)consistency of RMs --- whether they can recognize the semantic changes to different prompts and appropriately adapt their reward assignments --- and their impact on the downstream RLHF model. In this paper, we visit a series of research questions relevant to RM inconsistency: (1) How can we measure the consistency of reward models? (2) How consistent are the existing RMs and how can we improve them? (3) In what ways does reward inconsistency influence the chatbots resulting from the RLHF model training? We propose **Contrast Instruction** -- a benchmarking strategy for the consistency of RMs. Each example in **Contrast Instruction** features a pair of lexically similar instructions with different ground truth responses. A consistent RM is expected to rank the corresponding instruction and response higher than other combinations. We observe that current RMs trained with the standard ranking objective fail miserably on **Contrast Instruction** compared to average humans. To show that RM consistency can be improved efficiently without using extra training budget, we propose two techniques **ConvexDA** and **RewardFusion**, which enhance reward consistency through extrapolation during the RM training and inference stage, respectively. We show that RLHF models trained with a more consistent RM yield more useful responses, suggesting that reward inconsistency exhibits a trickle-down effect on the downstream RLHF process.
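
The consistency check described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the tuple layout of a contrast pair, the `consistency_rate` scoring rule (both matched combinations must outscore both mismatched ones), and the `toy_overlap_reward` function, which merely stands in for a trained reward model.

```python
from typing import Callable, List, Tuple

# Hypothetical layout: (instruction_a, response_a, instruction_b, response_b),
# where the two instructions are lexically similar but need different answers.
ContrastPair = Tuple[str, str, str, str]


def consistency_rate(
    pairs: List[ContrastPair],
    reward: Callable[[str, str], float],
) -> float:
    """Fraction of pairs where each instruction scores its own response
    strictly higher than the response written for the other instruction."""
    if not pairs:
        raise ValueError("need at least one contrast pair")
    hits = 0
    for inst_a, resp_a, inst_b, resp_b in pairs:
        a_ok = reward(inst_a, resp_a) > reward(inst_a, resp_b)
        b_ok = reward(inst_b, resp_b) > reward(inst_b, resp_a)
        if a_ok and b_ok:
            hits += 1
    return hits / len(pairs)


def toy_overlap_reward(instruction: str, response: str) -> float:
    """Stand-in reward (Jaccard word overlap); a real RM would be a model."""
    a = set(instruction.lower().split())
    b = set(response.lower().split())
    return len(a & b) / max(len(a | b), 1)


pairs = [
    (
        "name the capital of france",
        "the capital of france is paris",
        "name the capital of spain",
        "the capital of spain is madrid",
    )
]
rate = consistency_rate(pairs, toy_overlap_reward)
```

A reward function that ignores the instruction, such as one returning a constant, scores 0 under this rule because the strict comparison never holds.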


EAAI Journal 2023 Journal Article

A novel grey prediction model based on tensor higher-order singular value decomposition and its application in short-term traffic flow

Derong Xie
Sihao Chen
Haotong Duan
Xinwei Li
Caotong Luo
Yuxuan Ji
Huiming Duan

Short-term traffic flow prediction is the key to traffic guidance and control and can directly affect the performance of intelligent transportation systems. Traffic flow data has the characteristics of volatility, chaos, and randomness, which affect the accuracy of the prediction model. Based on the multi-dimensional spatio-temporal data characteristics of traffic flow data, this paper combines the tensor higher-order singular value decomposition theory with the modeling mechanism of the classical grey model GM(1, 1) and establishes a grey GM(1, 1) model with tensor higher-order singular values. The tensor higher-order singular value decomposition reflects the periodic, multi-modal, and holistic nature of traffic flow data, which can mitigate the volatility and randomness of traffic flow data and improve the accuracy of the model. Then, the new model is applied to highway short-time traffic flow prediction, analyzing the spatio-temporal nature of traffic flow data, giving the detailed modeling steps, and analyzing the correlation between the original traffic flow data and tensor approximation data using grey correlation degrees. Three cases illustrate the effectiveness of the model. Case 1 shows that the MAPE results from nine modeling objects are stable at about 5%, which indicates that the new model has some stability; Case 2 shows that the new model is more adaptable to short-time traffic flow prediction based on the results of three different modeling and prediction objects; Case 3 compares the new model with two traditional grey forecasting models and two optimization models, and the results indicate that the new model has a total MAPE value of 5.172%, which is better than the other four grey forecasting models. Finally, the new model is applied to short-time traffic flow prediction, and its prediction results are consistent with the trend of the original traffic flow data, indicating that it can reveal the real-time characteristics of the traffic system to a certain extent and provide a reliable theoretical basis for traffic planning, control, and optimization.
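
For readers unfamiliar with the classical GM(1, 1) model and the MAPE metric mentioned in the abstract, here is a minimal sketch. It covers only the textbook GM(1, 1) baseline, not the paper's tensor higher-order singular value decomposition step. The function names and the sample series are hypothetical, and the code assumes a positive series whose fitted development coefficient `a` is nonzero.

```python
import math
from typing import List, Tuple


def fit_gm11(x0: List[float]) -> Tuple[float, float]:
    """Estimate (a, b) of the grey equation x0(k) + a * z1(k) = b."""
    n = len(x0)
    if n < 4:
        raise ValueError("GM(1,1) is usually fitted on at least 4 points")
    # Accumulated generating operation: x1(k) = x0(1) + ... + x0(k).
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values: mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = x0[1:]
    # Ordinary least squares for y = slope * z + intercept.
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    intercept = (sy - slope * sz) / m
    return -slope, intercept  # a = -slope, b = intercept


def predict_gm11(x0: List[float], a: float, b: float, steps: int) -> List[float]:
    """Fitted values for the observed points followed by `steps` forecasts."""
    c = b / a  # assumes a != 0

    def x1_hat(k: int) -> float:
        # Time response of the whitened equation, k counted from 0.
        return (x0[0] - c) * math.exp(-a * k) + c

    fitted = [x0[0]]
    for k in range(1, len(x0) + steps):
        # Inverse accumulation recovers the original-scale series.
        fitted.append(x1_hat(k) - x1_hat(k - 1))
    return fitted


def mape(actual: List[float], predicted: List[float]) -> float:
    """Mean absolute percentage error, in percent."""
    errors = [abs((t - p) / t) for t, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


series = [100.0, 110.0, 121.0, 133.1, 146.41]  # made-up, roughly exponential
a, b = fit_gm11(series)
values = predict_gm11(series, a, b, steps=1)
fit_error = mape(series, values[: len(series)])
```

Because GM(1, 1) assumes near-exponential behavior in the accumulated series, it fits this made-up series closely; volatile data such as real traffic flow is where preprocessing of the kind the abstract describes would matter.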
