Author name cluster

Bin Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

24 papers

2 author rows

EAAI Journal 2026 Journal Article

Application of Dunhuang caisson patterns in cultural and creative product design based on artificial intelligence generated content technology

Yu Fang
Yuyao Zhang
Hongjia Tu
Wuyang Yang
Bin Wu
Wenkai Zhu
Song Li

The rapid advancement of artificial intelligence generated content (AIGC) provides new technological pathways for digitally preserving and reinventing traditional patterns. This study introduces an integrated framework to explore the creative reproduction of Dunhuang caisson patterns and their application in product design. By coupling the analytic hierarchy process (AHP) with low-rank adaptation (LoRA) fine-tuning, the proposed methodology translates cultural hierarchies into quantitative generative constraints. First, the AHP was employed to establish an evaluation system for the Dunhuang caisson patterns, ranking key patterns such as lotus and flowers according to their relative importance. Based on these results, representative patterns were collected from published books, academic literature, and open-source datasets and preprocessed to construct a dataset. Subsequently, the weights obtained from the AHP were converted into attention-guided and gradient-guided mechanisms, and the LoRA technique was used to fine-tune three mainstream models: Stable Diffusion 1.5 (SD 1.5), Stable Diffusion XL (SDXL), and Flux. The performance of the fine-tuned models in terms of pattern generation quality and cultural symbol representation was evaluated. Finally, the generated patterns were applied to four representative cultural and creative products (mugs, carpets, handbags, and cushions). The design of the mugs was further examined through importance-performance analysis (IPA) to verify the practical feasibility and performance of the generated output. This study not only enhances the automated generation quality of Dunhuang caisson patterns but also provides a transferable technical framework for the digital preservation and innovative utilization of traditional patterns in product design.
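
The AHP ranking step mentioned in the abstract can be illustrated with a minimal sketch. It uses the common row geometric-mean approximation of AHP priority weights and Saaty's consistency ratio; the three motif labels and the pairwise judgments are hypothetical and are not taken from the paper.

```python
import math

def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def consistency_ratio(matrix, weights):
    """Saaty's consistency ratio; values below 0.1 are usually considered acceptable."""
    n = len(matrix)
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    random_index = {3: 0.58, 4: 0.90, 5: 1.12}  # Saaty's tabulated random indices
    return consistency_index / random_index[n]

# Hypothetical pairwise judgments: entry [i][j] says how much more important
# motif i is than motif j on Saaty's 1-9 scale.
motifs = ["lotus", "flower", "other"]
judgments = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]

weights = ahp_weights(judgments)
for name, weight in zip(motifs, weights):
    print(f"{name}: {weight:.3f}")
print(f"consistency ratio: {consistency_ratio(judgments, weights):.3f}")
```

How such weights are turned into attention-guided or gradient-guided constraints is specific to the paper and is not sketched here.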

AAAI Conference 2026 Conference Paper

CTX-Coder: Cross-Attention Architectures Empower LLMs for Long-Context Vulnerability Detection

Jujie Wang
Kangfeng Zheng
Bin Wu
Chunhua Wu
Yulin Yao
Jiaqi Gao
Minjiao Yang

Software vulnerabilities have increased sharply, underscoring the growing urgency for effective detection methods. Although large language model (LLM) based methods have shown promise in this task, current state-of-the-art LLM approaches struggle with functions that have long contexts. In this paper, we propose CTX-Coder, a context-enhanced vulnerability detection framework that enables LLMs to selectively focus on relevant contextual functions. To achieve this, we represent the contextual functions as embeddings and integrate them with the target code via cross-attention, thereby enhancing the model's ability to capture contextual information. Furthermore, to equip the model with the ability to recognize these embedding features, we propose a two-stage pretraining pipeline. We also introduce a new dataset, CTX-VUL, which addresses the limitations of existing datasets that either lack contextual information for vulnerable functions or are not publicly available. Extensive experiments demonstrate that CTX-Coder (10B) significantly outperforms baseline models with even larger parameters, such as Qwen2.5-14B and SecGPT. As the input code length increases, CTX-Coder’s F1 score drops by only 5.01%, while other models degrade by 25% to 41.5%, showing strong robustness to long-context scenarios and the effectiveness of our design.

AAAI Conference 2026 Conference Paper

SAFE: Semantic- and Frequency-Enhanced Curriculum for Cross-Domain Deepfake Detection

Yulin Yao
Kangfeng Zheng
Bin Wu
Chunhua Wu
Jujie Wang
Jiaqi Gao
Minjiao Yang
Dan Luo

Driven by advances in GANs and diffusion models, deepfake content has reached an unprecedented level of photorealism, causing detectors to deteriorate once they leave their training domain. Most prior studies adopt CLIP as the backbone of an image-level binary classifier, yet overlook CLIP’s core strength: text-to-image semantic alignment. Moreover, captions generated by CLIP-CAP lack sufficient high-level semantics to distinguish between authentic and manipulated faces. Deepfake generators often fail to maintain semantic coherence, resulting in contradictions that traditional visual models cannot capture. Existing approaches also intermingle all samples during training and thus lack a systematic, difficulty-aware curriculum. To bridge these gaps, we introduce Semantic- and Frequency-Enhanced (SAFE) deepfake detection, a two-component framework: 1) Semantic-enhanced multimodal alignment. Authenticity cues are injected into CLIP-CAP captions, and low-rank LoRA fine-tuning is applied to CLIP’s visual branch, yielding dual supervision for text–image alignment and forgery discrimination. 2) Dual-score curriculum learning. Fourier Correlation Variance (FCV) measures local spectral consistency and, combined with the loss value, is transformed into a difficulty score that ranks training samples from easy to hard, reducing training time by 23.3% and enhancing generalization. SAFE attains state-of-the-art performance on several cross-dataset and cross-manipulation benchmarks. Ablation studies confirm that semantic enhancement, LoRA fine-tuning, and dual-score curriculum are complementary, jointly delivering substantial gains in open-set generalization.

AAAI Conference 2026 Conference Paper

TGCA-LLM: Time-Aware Graph-Text Contrastive Alignment for Enhancing LLMs in Temporal Knowledge Graph Completion

Zexuan Wan
Bo Wang
Kuofei Fang
Bin Wu

Temporal Knowledge Graph Completion (TKGC) aims to infer missing facts by modeling historical events and latent temporal dependencies in Temporal Knowledge Graphs (TKGs). Recently, TKGC methods that integrate graph embeddings into Large Language Models (LLMs) have shown great promise by leveraging the structural information of TKGs together with the powerful reasoning capabilities of LLMs. However, these embedding-based methods are limited by suboptimal graph representations due to noise and long-tail issues in real-world scenarios, and insufficient cross-modal alignment between graph and language, hindering LLMs' ability to fully capture the temporal and structural information of TKGs. To address these issues, we propose TGCA-LLM, a novel embedding-based framework for TKGC. Specifically, TGCA-LLM first employs time-aware contrastive learning to align fact texts with graph structures in the temporal dimension, generating robust graph embeddings and establishing initial cross-modal alignment. Then, through a two-stage tuning process, it enables LLMs to gradually acquire structural and temporal knowledge from graph embeddings while enhancing their cross-modal reasoning capabilities in TKGC. Extensive experiments on three widely used real-world benchmarks demonstrate that TGCA-LLM outperforms state-of-the-art (SOTA) baselines by at least 8.7% MRR, highlighting its effectiveness.

JBHI Journal 2026 Journal Article

Towards Unconstrained Fall Detection Using Vision Language Model: Dataset, Theory and Practices

Shiman Wu
Tianyi Chen
Zhihao Zha
Bin Wu
Yixin Li
Ran Wang
Yanan Li
Chong Tian

Unconstrained fall detection is essential for real-world applications. However, it remains underexplored due to the scarcity of real-world fall data and the limited generalization ability of existing methods. To address these challenges, we first introduce HUST-FALL, a fine-grained text-video dataset for unconstrained fall detection, featuring diverse fall scenarios and rich semantic annotations. Building on this dataset, we propose Action-R1, a lightweight vision-language model that leverages structured textual guidance and reasoning to improve the understanding of fall events. In challenging cross-dataset tests, Action-R1 achieves an average F1 score of 0.827 on three benchmarks, significantly outperforming conventional CNN/RNN-based methods. Despite having only 1/16 the parameters, Action-R1 achieves competitive performance against MiniCPM-V 2.6, even surpassing it on UPFall by 116.22%. These results demonstrate that Action-R1 is a lightweight yet powerful solution for unconstrained fall detection in real-world scenarios.

EAAI Journal 2025 Journal Article

A novel adaptive spatial–temporal cross-graph convolutional fusion learning network for skeleton-based abnormal gait recognition

Liang Wang
Xiaoyan Wu
Bin Wu
Jianning Wu

Developing graph-based abnormal gait classification models with high generalization has been a challenging problem in gait analysis. In this study, a novel adaptive spatial–temporal cross-graph convolutional fusion learning network is proposed to accurately recognize skeleton-based abnormal gait patterns. In the proposed model, with an adaptive fusion adjacency matrix including self-adaptive adjacency matrices and cross-adaptive adjacency matrices, a joint–bone gait graph convolutional fusion learning algorithm is constructed to capture spatial gait abnormality features hidden in skeleton data. A temporal convolution network is then adopted to explore temporal dependencies of gait abnormality embedded in the spatial feature space. This could discover the most discriminative spatial–temporal gait abnormality representations containing richer information about interaction coupling across joints and bones for high generalization. Skeleton data of mimicked abnormal gait from 57 participants were collected to evaluate the feasibility of our model. The experimental results based on the leave-one-subject-out (LOSO) cross-validation scheme show that our proposed model reaches the optimal performance with the highest accuracy of 99.43%, and significantly outperforms several recent state-of-the-art models. Our model can feasibly take advantage of the adaptive fusion adjacency matrix to greatly enhance the aggregation degree of joints and bones. This helps to learn excellent gait abnormality representations containing richer interaction information for high generalization while keeping a low learning complexity. Our findings hopefully provide a powerful technical solution for abnormal gait recognition in practical clinical application.

ICRA Conference 2025 Conference Paper

Fuel-Optimal Operational Speed Planning for Autonomous Trucking on Highways

Wei Li 0111
Bin Wu
Jiahao Xiang
Jiaping Ren
Yi Wu
Ruigang Yang

The rapid advancement of autonomous driving technology, particularly in autonomous trucking on highways, shows great value for enhancing efficiency and reducing costs in the logistics industry. In this work, we define the full-trip speed planning problem for autonomous trucks under delivery time and fuel consumption constraints, referred to as the Operational Speed Planning (OSP) problem. To support and accelerate research on the OSP problem, we have developed a comprehensive dataset using a fleet of over 400 trucks. The dataset contains rich, diverse information covering more than 22 million kilometers of real-world highway driving data. In addition to this static dataset, we have developed a closed-loop simulator that allows for the interactive evaluation of OSP solutions, enabling researchers to test speed planning strategies in a realistic environment. Furthermore, we provide an OSP baseline method based on dynamic programming to optimize speed planning, balancing the delivery time requirements and fuel consumption. Our extensive experiments demonstrate both the accuracy of the simulation and the effectiveness of the OSP baseline in planning optimal speeds, proving its capability to meet time constraints while improving fuel efficiency. The dataset, simulator, and baseline will be made publicly available to foster further research and innovation in this area.
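
As a rough illustration of how dynamic programming can trade delivery time against fuel, the sketch below minimizes total fuel over a route split into segments, subject to a deadline. It is a generic toy formulation, not the paper's OSP baseline: the segment lengths, candidate speeds, and the quadratic fuel model `toy_fuel_l_per_km` are all assumptions made for the example.

```python
import math

def plan_speeds(segment_km, speeds_kmh, fuel_l_per_km, deadline_min):
    """Pick one speed per segment to minimize fuel with total time <= deadline.

    DP state: minutes used so far -> (least fuel, speed plan) after the
    segments processed so far. Travel times are rounded up to whole minutes.
    """
    best = {0: (0.0, [])}
    for length in segment_km:
        nxt = {}
        for used, (fuel, plan) in best.items():
            for v in speeds_kmh:
                minutes = used + math.ceil(60 * length / v)
                if minutes > deadline_min:
                    continue  # this choice would miss the deadline
                cand = fuel + length * fuel_l_per_km(v)
                if minutes not in nxt or cand < nxt[minutes][0]:
                    nxt[minutes] = (cand, plan + [v])
        best = nxt
    if not best:
        return None  # no feasible plan
    return min(best.values(), key=lambda entry: entry[0])

def toy_fuel_l_per_km(speed_kmh):
    """Hypothetical fuel model: consumption per km grows with speed squared."""
    return 0.2 + 0.00004 * speed_kmh ** 2

result = plan_speeds(
    segment_km=[60, 60],
    speeds_kmh=[60, 80, 90],
    fuel_l_per_km=toy_fuel_l_per_km,
    deadline_min=100,
)
total_fuel, speed_plan = result
print(f"fuel: {total_fuel:.2f} L, speeds: {speed_plan}")
```

A realistic planner would also need road grade, load, traffic, and speed-change costs, none of which are modeled in this sketch.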

ECAI Conference 2025 Conference Paper

HAT-Match: Graph Transformer with Hybrid Attention for Two-View Correspondence Pruning

Gang Wang 0008
Bin Wu
Yufei Chen 0002

Feature correspondence, particularly distinguishing inliers (true matches) from outliers (false matches), remains a core challenge in geometric computer vision. We present HAT-Match, a novel Graph Transformer framework with Hybrid Attention for two-view correspondence pruning. HAT-Match integrates three types of attention mechanisms: self-attention for modeling pairwise dependencies, SE-based channel attention for emphasizing salient feature channels, and global structure-aware attention for capturing structure-aware consistency. To model local geometric relationships, we first generate coarse clusters using a permutation-equivariant graph pooling and unpooling mechanism, which serves as an initial grouping of potentially consistent correspondences. Based on the resulting embeddings, we further construct local graphs using a DGCNN-style k-nearest neighbor (KNN) strategy, enabling the modeling of fine-grained local dependencies. These local graphs are then passed through a hybrid attention module that jointly encodes local and global contextual features. To refine correspondence confidence, we apply graph Laplacian-based attention over the global graph, enhancing discriminative feature propagation. The entire architecture is integrated into a progressive pruning framework that iteratively removes outliers and updates correspondence weights. Extensive experiments demonstrate that HAT-Match achieves state-of-the-art results across various challenging tasks, including relative pose estimation and visual localization, on both indoor and outdoor datasets.

IJCAI Conference 2025 Conference Paper

Learning Causally Disentangled Representations for Fair Personality Detection

Yangfu Zhu
Meiling Li
Yuting Wei
Di Liu
Yuqing Li
Bin Wu

Personality detection aims to identify the personality traits implied in social posts. Existing methods mainly focus on learning the mapping between user-generated posts and personality trait labels but inevitably suffer from potential harm caused by individual bias, as these posts are written by authors from different backgrounds. Learning such spurious associations between posts and traits may lead to the formation of stereotypes, ultimately restricting the detection of personality in different kinds of individuals. To tackle this issue, we first investigate individual bias in personality detection from the causality perspective. We propose an Interventional Personality Detection Network (IPDN) to learn implicit confounders in user-generated posts and exploit the true causal effect to train the detection model. Specifically, our IPDN disentangles the causal and biased features behind user-generated posts, and the biased features are then accumulatively clustered as confounder prototypes as the training iterations increase. In parallel, the reconstruction network is reused to approximate backdoor adjustment on raw posts, ensuring that traits see each confounder equally before detection. Extensive experiments conducted on three real-world datasets demonstrate that our IPDN outperforms state-of-the-art methods in personality detection.

EAAI Journal 2025 Journal Article

The three-party evolutionary game of preannouncement strategies for platform's AI updates considering users' loyalties

Ping Li
Bin Wu
Na Wu

When a platform adopts a preannouncement strategy to announce information about AI updates in advance, the preannounced information may reshape market expectations, thus affecting the supply and demand balance for two-sided users. To analyze the dynamics of different preannouncement strategies and explore the equilibrium outcomes, this paper considers three preannouncement strategies for AI updates (full, partial, and no preannouncement), taking users' loyalty or disloyalty into account, and builds a Hotelling game model to explore the optimal outcomes for the three players. We construct a three-party evolutionary game model to describe the multi-period gaming behaviors and obtain the possible evolutionarily stable strategies (ESS) and the related equilibrium conditions. The results show that full preannouncement is not an optimal stable strategy for the platform, and that an ESS exists only when two-sided users are not loyal and are able to switch platforms. When the preannouncement investment is lower than two-sided users' expectations, the platform should partially preannounce AI updates, but when the preannouncement investment is higher than two-sided users' expectations, the platform should avoid preannouncing AI updates.

NeurIPS Conference 2025 Conference Paper

Unifying and Enhancing Graph Transformers via a Hierarchical Mask Framework

Yujie Xing
Xiao Wang
Bin Wu
Hai Huang
Chuan Shi

Graph Transformers (GTs) have emerged as a powerful paradigm for graph representation learning due to their ability to model diverse node interactions. However, existing GTs often rely on intricate architectural designs tailored to specific interactions, limiting their flexibility. To address this, we propose a unified hierarchical mask framework that reveals an underlying equivalence between model architecture and attention mask construction. This framework enables a consistent modeling paradigm by capturing diverse interactions through carefully designed attention masks. Theoretical analysis under this framework demonstrates that the probability of correct classification positively correlates with the receptive field size and label consistency, leading to a fundamental design principle: an effective attention mask should ensure both a sufficiently large receptive field and a high level of label consistency. While no single existing mask satisfies this principle across all scenarios, our analysis reveals that hierarchical masks offer complementary strengths, motivating their effective integration. We then introduce M$^3$Dphormer, a Mixture-of-Experts based Graph Transformer with Multi-Level Masking and Dual Attention Computation. M$^3$Dphormer incorporates three theoretically grounded hierarchical masks and employs a bi-level expert routing mechanism to adaptively integrate multi-level interaction information. To ensure scalability, we further introduce a dual attention computation scheme that dynamically switches between dense and sparse modes based on local mask sparsity. Extensive experiments across multiple benchmarks demonstrate that M$^3$Dphormer achieves state-of-the-art performance, validating the effectiveness of our unified framework and model design.

AAAI Conference 2024 Conference Paper

Data Augmented Graph Neural Networks for Personality Detection

Yangfu Zhu
Yue Xia
Meiling Li
Tingting Zhang
Bin Wu

Personality detection is a fundamental task for user psychology research. One of the biggest challenges in personality detection lies in the quantitative limitation of labeled data collected by completing the personality questionnaire, which is very time-consuming and labor-intensive. Most of the existing works are mainly devoted to learning the rich representations of posts based on labeled data. However, they still suffer from the inherent weakness of the amount limitation of labels, which potentially restricts the capability of the model to deal with unseen data. In this paper, we construct a heterogeneous personality graph for each labeled and unlabeled user and develop a novel psycholinguistic augmented graph neural network to detect personality in a semi-supervised manner, namely Semi-PerGCN. Specifically, our model first explores a supervised Personality Graph Neural Network (PGNN) to refine labeled user representation on the heterogeneous graph. For the remaining massive unlabeled users, we utilize the empirical psychological knowledge of the Linguistic Inquiry and Word Count (LIWC) lexicon for multi-view graph augmentation and perform unsupervised graph consistent constraints on the parameters shared PGNN. During the learning process of finite labeled users, noise-invariant learning on a large scale of unlabeled users is combined to enhance the generalization ability. Extensive experiments on three real-world datasets, Youtube, PAN2015, and MyPersonality demonstrate the effectiveness of our Semi-PerGCN in personality detection, especially in scenarios with limited labeled users.

NeurIPS Conference 2024 Conference Paper

Instruction Tuning With Loss Over Instructions

Zhengyan Shi
Adam X. Yang
Bin Wu
Laurence Aitchison
Emine Yilmaz
Aldo Lipani

Instruction tuning plays a crucial role in shaping the outputs of language models (LMs) to desired styles. In this work, we propose a simple yet effective method, Instruction Modelling (IM), which trains LMs by applying a loss function to the instruction and prompt part rather than solely to the output part. Through experiments across 21 diverse benchmarks, we show that, in many scenarios, IM can effectively improve the LM performance on both NLP tasks (e.g., MMLU, TruthfulQA, and HumanEval) and open-ended generation benchmarks (e.g., MT-Bench and AlpacaEval). Remarkably, in the most advantageous case, IM boosts model performance on AlpacaEval 1.0 by over 100%. We identify two key factors influencing the effectiveness of IM: (1) the ratio between instruction length and output length in the training data; and (2) the number of training examples. We observe that IM is especially beneficial when trained on datasets with lengthy instructions paired with brief outputs, or under the Superficial Alignment Hypothesis (SAH), where a small number of training examples is used for instruction tuning. Further analysis substantiates our hypothesis that our improvement can be attributed to reduced overfitting to instruction tuning datasets. It is worth noting that we are not proposing IM as a replacement for the current instruction tuning process. Instead, our work aims to provide practical guidance for instruction tuning LMs, especially in low-resource scenarios. Our code is available at https://github.com/ZhengxiangShi/InstructionModelling.
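
The difference between output-only loss and a loss that also covers instruction tokens can be shown with a small sketch. It assumes that per-token log-probabilities are already available and that IM averages the negative log-likelihood over instruction and output tokens together; the function names and the toy numbers are illustrative, not taken from the released code.

```python
def build_loss_mask(is_instruction, include_instruction):
    """Return True for every token that should contribute to the loss."""
    return [include_instruction or not flag for flag in is_instruction]

def mean_nll(token_log_probs, loss_mask):
    """Average negative log-likelihood over the tokens selected by the mask."""
    kept = [-lp for lp, keep in zip(token_log_probs, loss_mask) if keep]
    return sum(kept) / len(kept)

# Toy sequence: two instruction tokens followed by two output tokens.
token_log_probs = [-2.0, -1.0, -0.5, -0.5]
is_instruction = [True, True, False, False]

# Conventional instruction tuning: loss on output tokens only.
output_only = mean_nll(token_log_probs, build_loss_mask(is_instruction, False))

# Loss over instructions as well (the idea described in the abstract).
with_instruction = mean_nll(token_log_probs, build_loss_mask(is_instruction, True))

print(output_only, with_instruction)
```

In a real training loop the same effect is usually obtained by choosing which label positions are excluded from the cross-entropy loss.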

IJCAI Conference 2023 Conference Paper

Graph Sampling-based Meta-Learning for Molecular Property Prediction

Xiang Zhuang
Qiang Zhang
Bin Wu
Keyan Ding
Yin Fang
Huajun Chen

Molecular properties are usually observed with a limited number of samples, and researchers have considered property prediction as a few-shot problem. One important fact that has been ignored by prior works is that each molecule can be recorded with several different properties simultaneously. To effectively utilize many-to-many correlations of molecules and properties, we propose a Graph Sampling-based Meta-learning (GS-Meta) framework for few-shot molecular property prediction. First, we construct a Molecule-Property relation Graph (MPG): molecules and properties are nodes, while property labels decide edges. Second, to utilize the topological information of the MPG, we reformulate an episode in meta-learning as a subgraph of the MPG, containing a target property node, molecule nodes, and auxiliary property nodes. Third, as episodes in the form of subgraphs are no longer independent of each other, we propose to schedule the subgraph sampling process with a contrastive loss function, which considers the consistency and discrimination of subgraphs. Extensive experiments on 5 commonly used benchmarks show that GS-Meta consistently outperforms state-of-the-art methods by 5.71%-6.93% in ROC-AUC and verify the effectiveness of each proposed module. Our code is available at https://github.com/HICAI-ZJU/GS-Meta.

IJCAI Conference 2022 Conference Paper

Contrastive Graph Transformer Network for Personality Detection

Yangfu Zhu
Linmei Hu
Xinkai Ge
Wanrong Peng
Bin Wu

Personality detection is to identify the personality traits underlying social media posts. Most of the existing work is mainly devoted to learning the representations of posts based on labeled data. Yet the ground-truth personality traits are collected through time-consuming questionnaires. Thus, one of the biggest limitations lies in the lack of training data for this data-hungry task. In addition, the correlations among traits should be considered since they are important psychological cues that could help collectively identify the traits. In this paper, we construct a fully-connected post graph for each user and develop a novel Contrastive Graph Transformer Network model (CGTN) which distills potential labels of the graphs based on both labeled and unlabeled data. Specifically, our model first explores a self-supervised Graph Neural Network (GNN) to learn the post embeddings. We design two types of post graph augmentations to incorporate different priors based on psycholinguistic knowledge of Linguistic Inquiry and Word Count (LIWC) and post semantics. Then, upon the post embeddings of the graph, a Transformer-based decoder equipped with post-to-trait attention is exploited to generate traits sequentially. Experiments on two standard datasets demonstrate that our CGTN outperforms the state-of-the-art methods for personality detection.

KR Conference 2022 Conference Paper

Open Relation Extraction with Non-existent and Multi-span Relationships

Huifan Yang
Da-Wei Li
Zekun Li
Donglin Yang
Bin Wu

Open relation extraction (ORE) aims to assign semantic relationships among arguments, essential to the automatic construction of knowledge graphs (KG). The previous ORE methods and some benchmark datasets consider a relation between two arguments as definitely existing and in a simple single-span form, neglecting possible non-existent relationships and flexible, expressive multi-span relations. However, detecting non-existent relations is necessary for a pipelined information extraction system (first performing named entity recognition then relation extraction), and multi-span relationships contribute to the diversity of connections in KGs. To fulfill the practical demands of ORE, we design a novel Query-based Multi-head Open Relation Extractor (QuORE) to extract single/multi-span relations and detect non-existent relationships effectively. Moreover, we re-construct some public datasets covering English and Chinese to derive augmented and multi-span relation tuples. Extensive experiment results show that our method outperforms the state-of-the-art ORE model LOREM in the extraction of existing single/multi-span relations and the overall performances on four datasets with non-existent relationships.

TIST Journal 2022 Journal Article

Privacy Preservation for Trajectory Publication Based on Differential Privacy

Lin Yao
Zhenyu Chen
Haibo Hu
Guowei Wu
Bin Wu

With the proliferation of location-aware devices, trajectory data have been used widely in real-life applications. However, trajectory data are often associated with sensitive labels, such as users’ purchase transactions and planned activities. As such, inappropriate sharing or publishing of these data could threaten users’ privacy, especially when an adversary has sufficient background knowledge about a trajectory through other data sources, such as social media (check-in tags). Though differential privacy has been used to address the privacy of trajectory data, no existing method can protect the privacy of both trajectory data and sensitive labels. In this article, we propose a comprehensive trajectory publishing algorithm with three effective procedures. First, we apply density-based clustering to determine hotspots and outliers and then blur their locations by generalization. Second, we propose a graph-based model to efficiently capture the relationship among sensitive labels and trajectory points in all records and leverage Laplace noise to achieve differential privacy. Finally, we generate and publish trajectories by traversing and updating this graph until we travel all vertexes. Our experiments on synthetic and real-life datasets demonstrate that our algorithm effectively protects the privacy of both sensitive labels and location data in trajectory publication. Compared with existing works on trajectory publishing, our algorithm can also achieve higher data utility.
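
The Laplace-noise step named in the abstract can be sketched as follows. This is the textbook Laplace mechanism applied to a single count, not the paper's graph-based algorithm; the count, sensitivity, and privacy budget are assumed values. Laplace noise is drawn as the difference of two independent exponential samples, which has the required distribution.

```python
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Return true_value plus Laplace noise with scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_value + noise

rng = random.Random(0)  # seeded only to make the demo repeatable

# Hypothetical example: an edge count of 100 in a label/location graph,
# sensitivity 1 (one record changes the count by at most 1), epsilon 0.5.
noisy_counts = [laplace_mechanism(100, 1.0, 0.5, rng) for _ in range(20000)]
mean_count = sum(noisy_counts) / len(noisy_counts)
mean_abs_noise = sum(abs(c - 100) for c in noisy_counts) / len(noisy_counts)
print(f"mean: {mean_count:.2f}, mean |noise|: {mean_abs_noise:.2f}")
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of data utility.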

EAAI Journal 2021 Journal Article

BiS4EV: A fast routing algorithm considering charging stations and preferences for electric vehicles

Ying Zhang
Bin Wu
Yao-Yi Chiang
Xin Zhang
Yuanchang Chen
Muyang Li
Fanyu Li

Electric Vehicles (EVs) have grown in recent years as they have become a promising alternative to traditional fossil fuel-driven vehicles. As a result, new routing algorithms that consider both the locations of charging stations and the charging preferences of users are necessary to maintain urban traffic efficiency. This paper proposes a Constrained Route Planning (CRP) approach, called BiS4EV, to provide efficient route planning for EVs in large urban road networks. CRP is an NP-hard problem. BiS4EV introduces a two-layered structure to solve the CRP problem for EVs. In the first layer, BiS4EV integrates the locations of charging stations and the battery consumption on each road arc into a road network graph and proposes a fast routing algorithm that considers the constraints of battery capacity and charging preferences to find the shortest feasible path for EVs. The second layer incorporates charging policies to the obtained path, which forms a complete route plan for EVs. Such a two-layered method breaks down the complexity of the problem and avoids directly solving the entire NP-hard problem. We have proved the correctness of the proposed approach in theory, and the presented experimental results verified the efficiency and the effectiveness of BiS4EV from the practical aspect.

IJCAI Conference 2019 Conference Paper

DeepRec: An Open-source Toolkit for Deep Learning based Recommendation

Shuai Zhang
Yi Tay
Lina Yao
Bin Wu
Aixin Sun

Deep learning based recommender systems have been extensively explored in recent years. However, the large number of models proposed each year poses a big challenge for both researchers and practitioners in reproducing the results for further comparisons. Although a portion of papers provide source code, they adopt different programming languages or different deep learning packages, which also raises the bar in grasping the ideas. To alleviate this problem, we released the open source project DeepRec. In this toolkit, we have implemented a number of deep learning based recommendation algorithms using Python and the widely used deep learning package TensorFlow. Three major recommendation scenarios were considered: rating prediction, top-N recommendation (item ranking), and sequential recommendation. Meanwhile, DeepRec maintains good modularity and extensibility to easily incorporate new models into the framework. It is distributed under the terms of the GNU General Public License. The source code is available on GitHub: https://github.com/cheungdaven/DeepRec

AAAI Conference 2019 Conference Paper

Hierarchical Macro Strategy Model for MOBA Game AI

Bin Wu

The next challenge of game AI lies in Real Time Strategy (RTS) games. RTS games provide partially observable gaming environments, where agents interact with one another in an action space much larger than that of Go. Mastering RTS games requires both strong macro strategies and delicate micro level execution. Recently, great progress has been made in micro level execution, while complete solutions for macro strategies are still lacking. In this paper, we propose a novel learning-based Hierarchical Macro Strategy model for mastering MOBA games, a sub-genre of RTS games. Trained by the Hierarchical Macro Strategy model, agents explicitly make macro strategy decisions and further guide their micro level execution. Moreover, each of the agents makes independent strategy decisions, while simultaneously communicating with its allies by leveraging a novel imitated cross-agent communication mechanism. We perform comprehensive evaluations on a popular 5v5 Multiplayer Online Battle Arena (MOBA) game. Our 5-AI team achieves a 48% winning rate against human player teams which are ranked top 1% in the player ranking system.

PDF Details

TCS Journal 2015 Journal Article

Pfaffian orientations and perfect matchings of scale-free networks

Zhongzhi Zhang
Bin Wu

Counting perfect matchings is of great interest in theoretical computer science and mathematics, among other disciplines. However, explicitly determining the number of perfect matchings of general graphs is a challenge. In this paper, we present a first analytical study of the enumeration problem of perfect matchings on scale-free networks, which are ubiquitous in real-life systems. We focus on two iteratively growing scale-free networks: one is fractal, while the other is non-fractal. For each network, we construct an orientation and prove that it is Pfaffian. Then, based on the connection between the number of perfect matchings and the determinant of the skew adjacency matrix corresponding to a Pfaffian orientation, we derive recursive expressions for the number of perfect matchings of the networks at two successive iterations. Furthermore, using the recursive relations, we show that for both scale-free networks the entropy is zero. This is in sharp contrast to previously obtained results for networks without scale-free behavior but with the same average degree as the studied networks, such as the square lattice and the honeycomb lattice, whose entropies are larger than zero. Our results indicate from another angle that scale-free networks have a distinct architecture in comparison with those lattices lacking the power-law property.
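The determinant connection mentioned in the abstract can be illustrated on a toy graph. The sketch below is not taken from the paper: it assumes a hand-picked Pfaffian orientation of a 4-cycle (the function names and the example graph are hypothetical) and uses the standard fact that, for a Pfaffian orientation, the number of perfect matchings equals the square root of the determinant of the skew adjacency matrix.

```python
from fractions import Fraction
from math import isqrt


def determinant(matrix):
    """Exact determinant via Gaussian elimination over rationals."""
    n = len(matrix)
    a = [[Fraction(x) for x in row] for row in matrix]
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def skew_adjacency(n, oriented_edges):
    """Entry (u, v) is +1 for an arc u -> v and -1 for the reverse."""
    s = [[0] * n for _ in range(n)]
    for u, v in oriented_edges:
        s[u][v] = 1
        s[v][u] = -1
    return s


def count_perfect_matchings(n, oriented_edges):
    """Valid only if oriented_edges is a Pfaffian orientation (assumed)."""
    det = determinant(skew_adjacency(n, oriented_edges))
    return isqrt(int(det))


# Assumed Pfaffian orientation of the 4-cycle 0-1-2-3-0:
# three arcs follow the cycle direction and one opposes it.
c4 = [(0, 1), (1, 2), (2, 3), (0, 3)]
print(count_perfect_matchings(4, c4))
```

Checking that a given orientation is actually Pfaffian is the hard part in general; here it is simply assumed for the toy example.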

Details DOI

TIST Journal 2014 Journal Article

Multi-Label Classification Based on Multi-Objective Optimization

Chuan Shi
Xiangnan Kong
Di Fu
Philip S. Yu
Bin Wu

Multi-label classification refers to the task of predicting potentially multiple labels for a given instance. Conventional multi-label classification approaches focus on a single-objective setting, where the learning algorithm optimizes over a single performance criterion (e.g., Ranking Loss) or a heuristic function. The basic assumption is that optimizing a single objective can improve the overall performance of multi-label classification and meet the requirements of various applications. However, in many real applications, an optimal multi-label classifier may need to consider the trade-offs among multiple inconsistent objectives, such as minimizing Hamming Loss while maximizing Micro F1. In this article, we study the problem of multi-objective multi-label classification and propose a novel solution (called Moml) to optimize over multiple objectives simultaneously. Note that optimization objectives may be inconsistent, even conflicting, so one cannot identify a single solution that is optimal on all objectives. Our Moml algorithm finds a set of non-dominated solutions which are optimal according to different trade-offs among multiple objectives. Users can therefore flexibly construct various predictive models from the solution set, which provides more meaningful classification results in different application scenarios. Empirical studies on real-world tasks demonstrate that Moml can effectively boost the overall performance of multi-label classification by optimizing over multiple objectives simultaneously.
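The notion of a non-dominated solution set can be made concrete with a small sketch. This is not the algorithm from the article; it only illustrates Pareto dominance for the two objectives named in the abstract, assuming each candidate classifier is summarized by a hypothetical pair (Hamming Loss, Micro F1) with made-up scores.

```python
def dominates(a, b):
    """True if candidate a is at least as good as b on both objectives
    and strictly better on at least one.

    Each candidate is (hamming_loss, micro_f1): lower loss is better,
    higher F1 is better.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def non_dominated(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical scores for four candidate classifiers.
scores = [(0.10, 0.70), (0.12, 0.75), (0.15, 0.72), (0.10, 0.65)]
print(non_dominated(scores))
```

The two surviving candidates trade lower Hamming Loss against higher Micro F1, which is why no single one of them can be called optimal on both objectives.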

Details DOI

IJCAI Conference 2011 Conference Paper

Multi-Kernel Multi-Label Learning with Max-Margin Concept Network

Wei Zhang
Xiangyang Xue
Jianping Fan
Xiaojing Huang
Bin Wu
Mingjie Liu

In this paper, a novel method is developed for enabling Multi-Kernel Multi-Label Learning. Inter-label dependency and similarity diversity are simultaneously leveraged in the proposed method. A concept network is constructed to capture the inter-label correlations for classifier training. A maximal-margin approach is used to effectively formulate the feature-label associations and the label-label correlations. Specific kernels are learned not only for each label but also for each pair of inter-related labels. By learning the eigenfunctions of the kernels, the similarity between a new data point and the training samples can be computed in online mode. Our experimental results on real datasets (web pages, images, music, and bioinformatics) demonstrate the effectiveness of our method.

PDF Details DOI

NeurIPS Conference 2002 Conference Paper

Mean Field Approach to a Probabilistic Model in Information Retrieval

Bin Wu
K. Wong
David Bodoff

We study an explicit parametric model of documents, queries, and relevancy assessment for Information Retrieval (IR). Mean-field methods are applied to analyze the model and derive efficient practical algorithms to estimate the parameters in the problem. The hyperparameters are estimated by a fast approximate leave-one-out cross-validation procedure based on the cavity method. The algorithm is further evaluated on several benchmark databases by comparing with standard algorithms in IR.

PDF Details