Author name cluster

Xiaobin Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers

2 author rows

ICLR Conference 2025 Conference Paper

Benchmarking Agentic Workflow Generation

Shuofei Qiao
Runnan Fang
Zhisong Qiu
Xiaobin Wang
Ningyu Zhang 0001
Yong Jiang 0005
Pengjun Xie
Fei Huang 0002

Large Language Models (LLMs), with their exceptional ability to handle a wide range of tasks, have driven significant advancements in tackling reasoning and planning tasks, wherein decomposing complex problems into executable workflows is a crucial step in this process. Existing workflow evaluation frameworks either focus solely on holistic performance or suffer from limitations such as restricted scenario coverage, simplistic workflow structures, and lax evaluation standards. To this end, we introduce WorfBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures. Additionally, we present WorfEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms to accurately quantify the LLM agent's workflow generation capabilities. Through comprehensive evaluations across different types of LLMs, we discover distinct gaps between the sequence planning capabilities and graph planning capabilities of LLM agents, with even GPT-4 exhibiting a gap of around 15%. We also train two open-source models and evaluate their generalization abilities on held-out tasks. Furthermore, we observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference. Code and dataset are available at https://github.com/zjunlp/WorfBench.
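As a rough illustration of what subsequence matching can measure, the sketch below scores a predicted linear workflow against a gold one using the longest common subsequence. It is not WorfEval's actual implementation: the function names are hypothetical, and it assumes the steps of both workflows have already been mapped to comparable labels.

```python
def lcs_length(predicted, gold):
    """Length of the longest common subsequence of two step sequences."""
    # dp[i][j] holds the LCS length of predicted[:i] and gold[:j].
    dp = [[0] * (len(gold) + 1) for _ in range(len(predicted) + 1)]
    for i, p in enumerate(predicted, start=1):
        for j, g in enumerate(gold, start=1):
            if p == g:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(predicted)][len(gold)]


def subsequence_f1(predicted, gold):
    """F1 over the steps that appear in both workflows in the same order."""
    if not predicted or not gold:
        return 0.0
    overlap = lcs_length(predicted, gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)  # share of predicted steps that match
    recall = overlap / len(gold)          # share of gold steps that are covered
    return 2 * precision * recall / (precision + recall)
```

An order-sensitive score like this penalizes a workflow that contains the right steps in the wrong sequence, which a plain set-overlap metric would not.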

AAAI Conference 2024 Conference Paper

EcomGPT: Instruction-Tuning Large Language Models with Chain-of-Task Tasks for E-commerce

Yangning Li
Shirong Ma
Xiaobin Wang
Shen Huang
Chengyue Jiang
Hai-Tao Zheng
Pengjun Xie
Fei Huang

Recently, instruction-following Large Language Models (LLMs), represented by ChatGPT, have exhibited exceptional performance in general Natural Language Processing (NLP) tasks. However, the unique characteristics of E-commerce data pose significant challenges to general LLMs. An LLM tailored specifically for E-commerce scenarios, possessing robust cross-dataset/task generalization capabilities, is a pressing necessity. To solve this issue, in this work, we propose the first E-commerce instruction dataset, EcomInstruct, with a total of 2.5 million instruction data. EcomInstruct scales up the data size and task diversity by constructing atomic tasks with basic E-commerce data types, such as product information and user reviews. Atomic tasks are defined as intermediate tasks implicitly involved in solving a final task, which we also call Chain-of-Task tasks. We developed EcomGPT with different parameter scales by training the backbone model BLOOMZ on EcomInstruct. Benefiting from the fundamental semantic understanding capabilities acquired from the Chain-of-Task tasks, EcomGPT exhibits excellent zero-shot generalization capabilities. Extensive experiments and human evaluations demonstrate that EcomGPT outperforms ChatGPT in terms of cross-dataset/task generalization on E-commerce tasks. EcomGPT will be made public at https://github.com/Alibaba-NLP/EcomGPT.

AAAI Conference 2024 Conference Paper

SeqGPT: An Out-of-the-Box Large Language Model for Open Domain Sequence Understanding

Tianyu Yu
Chengyue Jiang
Chao Lou
Shen Huang
Xiaobin Wang
Wei Liu
Jiong Cai
Yangning Li

Large language models (LLMs) have shown impressive abilities for open-domain NLP tasks. However, LLMs are sometimes too footloose for natural language understanding (NLU) tasks, which always have restricted input and output formats. Their performance on NLU tasks depends heavily on prompts or demonstrations, and they have been shown to perform poorly on several representative NLU tasks, such as event extraction and entity typing. To this end, we present SeqGPT, a bilingual (i.e., English and Chinese) open-source autoregressive model specially enhanced for open-domain natural language understanding. We express all NLU tasks with two atomic tasks, which define fixed instructions to restrict the input and output format but remain "open" to arbitrarily varied label sets. The model is first instruction-tuned with extremely fine-grained labeled data synthesized by ChatGPT and then further fine-tuned on 233 different atomic tasks from 152 datasets across various domains. The experimental results show that SeqGPT has decent classification and extraction ability, and is capable of performing language understanding tasks on unseen domains. We also conduct empirical studies on the scaling of data and model size as well as on the transfer across tasks. Our models are accessible at https://github.com/Alibaba-NLP/SeqGPT.

AAAI Conference 2024 Conference Paper

Three Heads Are Better than One: Improving Cross-Domain NER with Progressive Decomposed Network

Xuming Hu
Zhaochen Hong
Yong Jiang
Zhichao Lin
Xiaobin Wang
Pengjun Xie
Philip S. Yu

Cross-domain named entity recognition (NER) tasks encourage NER models to transfer knowledge from data-rich source domains to sparsely labeled target domains. Previous works adopt the paradigms of pre-training on the source domain followed by fine-tuning on the target domain. However, these works ignore that general labeled NER source domain data can be easily retrieved in the real world, and soliciting more source domains could bring more benefits. Unfortunately, previous paradigms cannot efficiently transfer knowledge from multiple source domains. In this work, to transfer multiple source domains' knowledge, we decouple the NER task into the pipeline tasks of mention detection and entity typing, where the mention detection unifies the training object across domains, thus providing the entity typing with higher-quality entity mentions. Additionally, we request multiple general source domain models to suggest the potential named entities for sentences in the target domain explicitly, and transfer their knowledge to the target domain models through the knowledge progressive networks implicitly. Furthermore, we propose two methods to analyze in which source domain knowledge transfer occurs, thus helping us judge which source domain brings the greatest benefit. In our experiment, we develop a Chinese cross-domain NER dataset. Our model improved the F1 score by an average of 12.50% across 8 Chinese and English datasets compared to models without source domain data.

ICLR Conference 2021 Conference Paper

Prototypical Representation Learning for Relation Extraction

Ning Ding 0002
Xiaobin Wang
Yao Fu
Guangwei Xu
Rui Wang 0005
Pengjun Xie
Ying Shen 0001
Fei Huang 0002

Recognizing relations between entities is a pivotal task of relational learning. Learning relation representations from distantly-labeled datasets is difficult because of the abundant label noise and complicated expressions in human language. This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data that are effective in different settings, including supervised, distantly supervised, and few-shot learning. Instead of solely relying on the supervision from noisy labels, we propose to learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations. Prototypes are representations in the feature space abstracting the essential semantics of relations between entities in sentences. We learn prototypes based on objectives with clear geometric interpretation, where the prototypes are unit vectors uniformly dispersed in a unit ball, and statement embeddings are centered at the end of their corresponding prototype vectors on the surface of the ball. This approach allows us to learn meaningful, interpretable prototypes for the final classification. Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art models. We further demonstrate the robustness of the encoder and the interpretability of prototypes with extensive experiments.
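To make the geometric picture concrete, here is a minimal sketch of classifying a statement embedding by its nearest unit-norm prototype. It is not the paper's training objective or code: the function names, the relation labels, and the use of cosine similarity as the decision rule are assumptions made for illustration.

```python
import math


def normalize(vector):
    """Scale a vector to unit length so it lies on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def nearest_prototype(embedding, prototypes):
    """Return the relation label whose prototype is most similar.

    `prototypes` maps a (hypothetical) relation label to its vector.
    Similarity is the dot product of unit vectors, i.e. cosine similarity.
    """
    unit_embedding = normalize(embedding)
    best_label, best_similarity = None, -math.inf
    for label, prototype in prototypes.items():
        unit_prototype = normalize(prototype)
        similarity = sum(a * b for a, b in zip(unit_embedding, unit_prototype))
        if similarity > best_similarity:
            best_label, best_similarity = label, similarity
    return best_label
```

For example, with prototypes `{"born_in": [1.0, 0.0], "works_for": [0.0, 1.0]}`, an embedding of `[0.9, 0.2]` is assigned to `born_in` because it points mostly along the first axis.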

ICML Conference 2019 Conference Paper

Learning Models from Data with Measurement Error: Tackling Underreporting

Roy Adams
Yuelong Ji
Xiaobin Wang
Suchi Saria

Measurement error in observational datasets can lead to systematic bias in inferences based on these datasets. As studies based on observational data are increasingly used to inform decisions with real-world impact, it is critical that we develop a robust set of techniques for analyzing and adjusting for these biases. In this paper we present a method for estimating the distribution of an outcome given a binary exposure that is subject to underreporting. Our method is based on a missing data view of the measurement error problem, where the true exposure is treated as a latent variable that is marginalized out of a joint model. We prove three different conditions under which the outcome distribution can still be identified from data containing only error-prone observations of the exposure. We demonstrate this method on synthetic data and analyze its sensitivity to near violations of the identifiability conditions. Finally, we use this method to estimate the effects of maternal smoking and heroin use during pregnancy on childhood obesity, two important problems from public health. Using the proposed method, we estimate these effects using only subject-reported drug use data and refine the range of estimates generated by a sensitivity analysis-based approach. Further, the estimates produced by our method are consistent with existing literature on both the effects of maternal smoking and the rate at which subjects underreport smoking.
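The sketch below illustrates why underreporting biases a naive analysis; it shows the forward direction only and is not the estimator proposed in the paper. It assumes a binary outcome, no false positives (a reported exposure is always a true exposure), and that the outcome is independent of reporting given the true exposure. The function and parameter names are hypothetical.

```python
def outcome_prob_given_reported_unexposed(
    p_y_exposed, p_y_unexposed, p_exposed, report_rate
):
    """P(Y=1 | reported exposure = 0), marginalizing the latent true exposure.

    p_y_exposed   : P(Y=1 | truly exposed)
    p_y_unexposed : P(Y=1 | truly unexposed)
    p_exposed     : P(truly exposed)
    report_rate   : P(reports exposure | truly exposed)
    """
    # Probability of being truly exposed but reporting no exposure.
    hidden = (1.0 - report_rate) * p_exposed
    reported_unexposed = hidden + (1.0 - p_exposed)
    if reported_unexposed == 0:
        raise ValueError("nobody reports being unexposed under these inputs")
    # Bayes' rule: share of the reported-unexposed group that is truly exposed.
    p_truly_exposed = hidden / reported_unexposed
    # The observed group is a mixture of truly exposed and truly unexposed.
    return p_truly_exposed * p_y_exposed + (1.0 - p_truly_exposed) * p_y_unexposed
```

When `report_rate` is below 1, the reported-unexposed group is contaminated by truly exposed subjects, so its outcome rate drifts toward `p_y_exposed` and the naive contrast between reported groups understates the effect.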

AAAI Conference 2019 Conference Paper

Unsupervised Learning Helps Supervised Neural Word Segmentation

Xiaobin Wang
Deng Cai
Linlin Li
Guangwei Xu
Hai Zhao
Luo Si

To exploit unlabeled data for further performance improvement in Chinese word segmentation, this work makes the first attempt at adding unsupervised segmentation information to a neural supervised segmenter. We survey various effective strategies, including extending the character embedding, augmenting the word score, and applying multi-task learning, for leveraging unsupervised information derived from abundant unlabeled data. Experiments on standard data sets show that the explored strategies indeed improve the recall rate of out-of-vocabulary words and thus boost the segmentation accuracy. Moreover, the model enhanced by the proposed methods outperforms state-of-the-art models in the closed test and shows a promising improvement trend when adopting three different strategies with the help of a large unlabeled data set. Our thorough empirical study verifies that the proposed approach outperforms the widely used pre-training approach in terms of effectively making use of freely abundant unlabeled data.
