Arrow Research search

Author name cluster

Haibin Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers
2 author rows

Possible papers

9

NeurIPS 2025 Conference Paper

Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models

  • Daoyuan Chen
  • Yilun Huang
  • Xuchen Pan
  • Jiang Nana
  • Haibin Wang
  • Yilei Zhang
  • Ce Ge
  • Yushuo Chen

Foundation models demand advanced data processing for their vast, multimodal datasets. However, traditional frameworks struggle with the unique complexities of multimodal data. In response, we present Data-Juicer 2.0, a data processing system backed by 100+ data processing operators spanning text, image, video, and audio modalities, supporting critical tasks including data analysis, synthesis, annotation, and foundation model post-training. With seamless compatibility and dedicated optimization for popular dataset hubs like Hugging Face and computing engines like Ray, it improves upon its predecessor in terms of usability, efficiency, and programmability. It features an easily accessible user interface layer that supports decoupled Python interactions, RESTful APIs, and conversational commands. Its new runtime layer offers adaptive execution across diverse scales and environments, abstracting away system complexities. Extensive empirical evaluations demonstrate Data-Juicer 2.0's remarkable performance and scalability, highlighting its capability to efficiently process TB-level data with 10k+ CPU cores. The system is publicly available and has been widely adopted in diverse research fields and real-world products such as Alibaba Cloud PAI. We actively maintain the system and share practical insights to foster research and applications of next-generation foundation models.

ICML 2025 Conference Paper

Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development

  • Daoyuan Chen
  • Haibin Wang
  • Yilun Huang 0004
  • Ce Ge
  • Yaliang Li
  • Bolin Ding
  • Jingren Zhou 0001

The emergence of multimodal large models has advanced artificial intelligence, introducing unprecedented levels of performance and functionality. However, optimizing these models remains challenging due to historically isolated paths of model-centric and data-centric development, leading to suboptimal outcomes and inefficient resource utilization. In response, we present a new sandbox suite tailored for integrated data-model co-development. This sandbox provides a feedback-driven experimental platform, enabling cost-effective iteration and guided refinement of both data and models. Our proposed “Probe-Analyze-Refine” workflow, validated through practical use cases on multimodal tasks such as image-text pre-training with CLIP, image-to-text generation with LLaVA-like models, and text-to-video generation with DiT-based models, yields transferable and notable performance boosts, such as topping the VBench leaderboard. A comprehensive set of over 100 experiments demonstrated the suite’s usability and extensibility, while also uncovering insights into the interplay between data quality, diversity, model behavior, and computational costs. All code, datasets, and models are open-sourced to foster future research and applications that would otherwise be infeasible due to the lack of a dedicated co-development infrastructure.

AIJ 2023 Journal Article

Multi-type resource allocation with partial preferences

  • Haibin Wang
  • Sujoy Sikdar
  • Xiaoxi Guo
  • Lirong Xia
  • Yongzhi Cao
  • Hanpin Wang

We provide efficient, fair, and non-manipulable mechanisms for the multi-type resource allocation problems (MTRAs) and multiple assignment problems where agents have partial preferences over bundles consisting of multiple divisible items. We uncover a natural reduction from multiple assignment problems to MTRAs, which preserves the properties of MTRA mechanisms. We extend the well-known random priority (RP) and probabilistic serial (PS) mechanisms to MTRAs with partial preferences as multi-type PS (MPS) and multi-type RP (MRP), and propose a new mechanism, multi-type general dictatorship (MGD), which combines the ideas of MPS and MRP. We show that for the unrestricted domain of partial order preferences, unfortunately, no mechanism satisfies both sd-efficiency and sd-envy-freeness, although our mechanisms each satisfy different weaker notions of the desirable properties of efficiency, fairness, and non-manipulability we consider. Notwithstanding this impossibility result, our main message is positive: When agents' preferences are represented by acyclic CP-nets, MRP satisfies ex-post-efficiency, sd-strategyproofness, and upper invariance, while MPS satisfies sd-efficiency, sd-envy-freeness, ordinal fairness, and upper invariance, recovering the properties of RP and PS; MGD satisfies sd-efficiency, equal treatment of equals, and decomposability under the unrestricted domain of partial preferences. We introduce a natural domain of bundle net preferences, which generalizes previously studied domain restrictions of partial preferences for multiple assignment problems and is incomparable to the domain of acyclic CP-nets. We show that MRP and MPS satisfy all the properties of RP and PS under bundle net preferences as well.
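For readers unfamiliar with the mechanisms this line of work extends: the classic single-type probabilistic serial (PS) mechanism of Bogomolnaia and Moulin runs a "simultaneous eating" procedure in which every agent eats its favorite remaining item at unit speed, and the fraction of an item an agent has eaten becomes the probability of receiving it. The sketch below is an illustrative implementation of that base single-type mechanism only, not of the paper's multi-type MPS/MRP extensions or of partial preferences; the function name and interface are my own.

```python
from fractions import Fraction

def probabilistic_serial(prefs):
    """Single-type probabilistic serial (simultaneous eating).

    prefs: one strict preference list per agent, each a ranking of
    item indices (most preferred first).
    Returns alloc, where alloc[a][i] is the fraction of item i
    assigned to agent a (exact arithmetic via Fraction).
    """
    n = len(prefs)
    items = {i for p in prefs for i in p}
    supply = {i: Fraction(1) for i in items}  # one unit of each item
    alloc = [{i: Fraction(0) for i in items} for _ in range(n)]
    clock = Fraction(0)
    while clock < 1 and any(supply[i] > 0 for i in items):
        # Each agent targets its most-preferred item with supply left.
        eaters = {}
        for a in range(n):
            for i in prefs[a]:
                if supply[i] > 0:
                    eaters.setdefault(i, []).append(a)
                    break
        if not eaters:
            break
        # Advance the clock until the first targeted item runs out.
        dt = min(supply[i] / len(ag) for i, ag in eaters.items())
        dt = min(dt, 1 - clock)
        for i, ag in eaters.items():
            for a in ag:
                alloc[a][i] += dt
            supply[i] -= dt * len(ag)
        clock += dt
    return alloc
```

For example, with two agents who both rank item 0 first, each agent receives half of each item; with opposed rankings, each agent receives its favorite item outright. The sd-efficiency and sd-envy-freeness properties discussed in the abstract are exactly the guarantees this eating procedure provides in the single-type, strict-preference setting.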

ICML 2023 Conference Paper

PreNAS: Preferred One-Shot Learning Towards Efficient Neural Architecture Search

  • Haibin Wang
  • Ce Ge
  • Hesen Chen
  • Xiuyu Sun

The wide application of pre-trained models is driving the trend of once-for-all training in one-shot neural architecture search (NAS). However, training within a huge sample space damages the performance of individual subnets and requires much computation to search for an optimal model. In this paper, we present PreNAS, a search-free NAS approach that accentuates target models in one-shot training. Specifically, the sample space is dramatically reduced in advance by a zero-cost selector, and weight-sharing one-shot training is performed on the preferred architectures to alleviate update conflicts. Extensive experiments have demonstrated that PreNAS consistently outperforms state-of-the-art one-shot NAS competitors for both Vision Transformer and convolutional architectures, and importantly, enables instant specialization with zero search cost. Our code is available at https://github.com/tinyvision/PreNAS.

JBHI 2022 Journal Article

Classification of Wideband Tympanometry by Deep Transfer Learning With Data Augmentation for Automatic Diagnosis of Otosclerosis

  • Leixin Nie
  • Chao Li
  • Franck Marzani
  • Haibin Wang
  • Francois Thibouw
  • Alexis Bozorg Grayeli

Otosclerosis is a common disease of the middle ear leading to stapedial fixation. Its rapid and non-invasive diagnosis could be achieved through wideband tympanometry (WBT), but the interpretation of the raw data provided by this tool is complex and time-consuming. Convolutional neural networks (CNN) could potentially be applied to this situation to help clinicians categorize WBT data. A dataset containing 135 samples from 80 patients with otosclerosis and 55 controls was obtained. We designed a lightweight CNN to categorize samples into otosclerosis and control groups. Receiver operating characteristic (ROC) analysis showed an area under the curve (AUC) of $0.95 \pm 0.011$, and the F1-score was $0.89 \pm 0.031$ ($r = 10$). The performance was further improved by data augmentation schemes and transfer learning strategies (AUC: $0.97 \pm 0.010$, F1-score: $0.94 \pm 0.016$, $p < 0.05$, ANOVA). Finally, the most relevant diagnostic features employed by the CNN were assessed via the activation pattern heatmaps. These results are crucial for the visual interpretation of the WBT graphic outputs that clinicians use routinely, and for a better understanding of the WBT signal in relation to the ossicular mechanics.

AAMAS 2022 Conference Paper

Designing Efficient and Fair Mechanisms for Multi-Type Resource Allocation

  • Xiaoxi Guo
  • Sujoy Sikdar
  • Haibin Wang
  • Lirong Xia
  • Yongzhi Cao
  • Hanpin Wang

In the multi-type resource allocation problem (MTRA), there are 𝑑 ≥ 2 types of items, and 𝑛 agents who each demand one unit of items of each type and have strict linear preferences over bundles consisting of one item of each type. For MTRAs with indivisible items, we first present an impossibility result that no mechanism can satisfy both sd-efficiency and sd-envy-freeness. We show that this impossibility result is circumvented under the natural assumption of lexicographic preferences by providing lexicographic probabilistic serial (LexiPS) as an extension of the probabilistic serial (PS) mechanism. We prove that LexiPS satisfies sd-efficiency and sd-envy-freeness. Moreover, LexiPS satisfies sd-weak-strategyproofness when agents are not allowed to misreport their importance orders. The multi-type probabilistic serial mechanism cannot handle indivisible items, but it provides a stronger efficiency guarantee under the unrestricted domain of strict linear preferences for divisible items, while also retaining desirable fairness guarantees.

AAMAS 2021 Conference Paper

Sequential Mechanisms for Multi-type Resource Allocation

  • Sujoy Sikdar
  • Xiaoxi Guo
  • Haibin Wang
  • Lirong Xia
  • Yongzhi Cao

Several resource allocation problems involve multiple types of resources, with a different agency being responsible for “locally” allocating the resources of each type, while a central planner wishes to provide a guarantee on the properties of the final allocation given agents’ preferences. We study the relationship between properties of the local mechanisms, each responsible for assigning all of the resources of a designated type, and the properties of a sequential mechanism composed of these local mechanisms, one for each type, applied sequentially, under lexicographic preferences, a well-studied model of preferences over multiple types of resources in artificial intelligence and economics. We show that when preferences are 𝑂-legal, meaning that agents share a common importance order on the types, sequential mechanisms satisfy the desirable properties of anonymity, neutrality, non-bossiness, or Pareto-optimality if and only if every local mechanism also satisfies the same property, and they are applied sequentially according to the order 𝑂. Our main result is that under 𝑂-legal lexicographic preferences, every mechanism satisfying strategyproofness and a combination of these properties must be a sequential composition of local mechanisms that are also strategyproof and satisfy the same combination of properties.

AAAI 2020 Conference Paper

Multi-Type Resource Allocation with Partial Preferences

  • Haibin Wang
  • Sujoy Sikdar
  • Xiaoxi Guo
  • Lirong Xia
  • Yongzhi Cao
  • Hanpin Wang

We propose multi-type probabilistic serial (MPS) and multi-type random priority (MRP) as extensions of the well-known PS and RP mechanisms to the multi-type resource allocation problems (MTRAs) with partial preferences. In our setting, there are multiple types of divisible items, and a group of agents who have partial order preferences over bundles consisting of one item of each type. We show that for the unrestricted domain of partial order preferences, no mechanism satisfies both sd-efficiency and sd-envy-freeness. Notwithstanding this impossibility result, our main message is positive: When agents’ preferences are represented by acyclic CP-nets, MPS satisfies sd-efficiency, sd-envy-freeness, ordinal fairness, and upper invariance, while MRP satisfies ex-post-efficiency, sd-strategyproofness, and upper invariance, recovering the properties of PS and RP. Besides, we propose a hybrid mechanism, multi-type general dictatorship (MGD), combining the ideas of MPS and MRP, which satisfies sd-efficiency, equal treatment of equals, and decomposability under the unrestricted domain of partial order preferences.

IJCAI 2018 Conference Paper

Cross-Modality Person Re-Identification with Generative Adversarial Training

  • Pingyang Dai
  • Rongrong Ji
  • Haibin Wang
  • Qiong Wu
  • Yuyu Huang

Person re-identification (Re-ID) is an important task in video surveillance which automatically searches and identifies people across different cameras. Despite the extensive Re-ID progress in RGB cameras, few works have studied the Re-ID between infrared and RGB images, which is essentially a cross-modality problem and widely encountered in real-world scenarios. The key challenge is twofold: the lack of discriminative information to re-identify the same person between RGB and infrared modalities, and the difficulty of learning a robust metric for such large-scale cross-modality retrieval. In this paper, we tackle the above two challenges by proposing a novel cross-modality generative adversarial network (termed cmGAN). To handle the issue of insufficient discriminative information, we leverage the cutting-edge generative adversarial training to design our own discriminator to learn discriminative feature representation from different modalities. To handle the issue of large-scale cross-modality metric learning, we integrate both identification loss and cross-modality triplet loss, which minimize inter-class ambiguity while maximizing cross-modality similarity among instances. The entire cmGAN can be trained in an end-to-end manner using a standard deep neural network framework. We have quantified the performance of our work on the newly released SYSU RGB-IR Re-ID benchmark, and report superior performance in terms of Cumulative Match Characteristic (CMC) curve and mean average precision (mAP) over the state-of-the-art works [Wu et al., 2017].