EAAI Journal 2026 Journal Article
A novel feature reconstruction method for bone marrow cell classification
- Huixiang Zhi
- Muwei Jian
- Hongyu Chen
- Wenjing Xu
- Changqun Nie
- Hanjiang Luo
- Xiaoguang Li
Author name cluster
Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.
EAAI Journal 2026 Journal Article
JBHI Journal 2025 Journal Article
Automated landmark localization can help radiologists quickly locate key structures or lesion areas in medical images. However, for large-volume 3D medical images, existing methods have very high computational complexity because they must encode the global image, which makes it difficult to achieve accurate landmark localization at high speed. In this paper, an accurate multi-landmark localization method for ear 3D Ultra-High Resolution CT (U-HRCT) images is proposed. The method adopts a novel localization pipeline that combines Deep Reinforcement Learning (DRL) and a Transformer. First, the DRL algorithm is used to quickly collect landmark-related local features. Second, the Transformer extracts the spatial relationships between anatomical structures from these discrete local features to infer the landmark coordinates. Because the complex process of encoding the global image is avoided, the proposed method achieves fast multi-landmark localization in ear 3D U-HRCT images. Finally, we propose a refinement module based on a dual-branch hybrid Multi-Layer Perceptron, which uses the fast multi-landmark localization results to learn the spatial relationships between landmarks, further improving the accuracy and stability of landmark localization. Experimental results on the self-built ear 3D U-HRCT dataset and a publicly available 2D cephalometric dataset demonstrate that the proposed method achieves Successful Detection Rates of 96.71% and 89.97%, respectively, within a precision range of 2.0 mm, surpassing state-of-the-art multi-landmark localization methods while also localizing faster.
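The Successful Detection Rate quoted in this abstract is commonly computed as the fraction of landmarks whose Euclidean error falls within a distance threshold. The snippet below is a minimal sketch of that common definition, not the authors' evaluation code; the function name and the toy coordinates are hypothetical.

```python
import math


def successful_detection_rate(predicted, ground_truth, threshold_mm=2.0):
    """Fraction of landmarks whose Euclidean error is within threshold_mm."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty, equally sized landmark lists")
    hits = sum(
        1
        for p, g in zip(predicted, ground_truth)
        if math.dist(p, g) <= threshold_mm
    )
    return hits / len(predicted)


# Toy 3D coordinates in millimetres (made up for illustration).
pred = [(10.0, 20.0, 30.0), (5.0, 5.0, 5.0), (0.0, 0.0, 0.0)]
truth = [(10.5, 20.0, 30.0), (5.0, 8.0, 5.0), (1.0, 1.0, 1.0)]
sdr = successful_detection_rate(pred, truth)
print(f"SDR within 2.0 mm: {sdr:.2%}")
```

Here the errors are 0.5 mm, 3.0 mm, and about 1.73 mm, so two of the three landmarks count as successful detections.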
NeurIPS Conference 2025 Conference Paper
Information seeking demands iterative evidence gathering and reflective reasoning, yet large language models (LLMs) still struggle with it in open-web question answering. Existing prompting and supervised fine-tuning (SFT) methods remain constrained by prompt rules or training corpora, and are usually benchmarked only on well-structured wiki sources, limiting real-world adaptability. We introduce WebPuzzle, a 24k-sample training and 275-sample test benchmark that evaluates information seeking on the live internet, across both wiki and open-domain queries. Leveraging 7k WebPuzzle instances, we develop DeepDiver, a reinforcement-learning (RL) framework that cultivates Search Intensity Scaling (SIS), an emergent ability to escalate search frequency and depth instead of settling on overconfident, under-evidenced answers. With SIS, Qwen2.5-7B-Instruct and Pangu-7B-Reasoner attain performance on real-web tasks comparable to the 671B-parameter DeepSeek-R1. We detail DeepDiver’s curriculum from cold-start SFT to a well-designed RL procedure, and show that its seeking policy generalizes from closed-ended queries to open-ended generation such as long-form writing. Our results advance adaptive information seeking in LLMs and provide a rigorous benchmark for future work.
EAAI Journal 2024 Journal Article
FLAP Journal 2024 Journal Article
The main purpose of this paper is to investigate monadic bounded L-algebras as L-algebras equipped with two monadic operators, named the universal quantifier “∀” and the existential quantifier “∃”, respectively. First, we investigate the properties of pre-ideals on L-algebras, and the pre-ideal generated by a nonempty subset of an L-algebra is defined. Second, we investigate monadic bounded L-algebras and monadic pre-ideals in monadic bounded L-algebras. Moreover, the relation between monadic bounded L-algebras and monadic quantum B-algebras is discussed. Finally, the relations among monadic self-similar L-algebras and other monadic structures are discussed, such as monadic (left) hoops, monadic Wajsberg hoops and monadic MV-algebras. Moreover, we obtain a characterization of monadic bounded L-algebras and monadic bounded self-similar L-algebras by relatively complete subalgebras and m-relatively complete subalgebras, respectively. These results are important to the further study of logical systems with monadic operators.
AAAI Conference 2024 Conference Paper
Point cloud shape completion, which aims to reconstruct the missing regions of incomplete point clouds with plausible shapes, is an ill-posed and challenging task that benefits many downstream 3D applications. Prior approaches achieve this goal by employing a two-stage completion framework, generating a coarse yet complete seed point cloud through an encoder-decoder network, followed by refinement and upsampling. However, the encoded features suffer from information loss of the missing portion, leaving the decoder unable to reconstruct seed points with detailed geometric clues. To tackle this issue, we propose a novel Orthogonal Dictionary Guided Shape Completion Network (ODGNet). The proposed ODGNet consists of a Seed Generation U-Net, which leverages multi-level feature extraction and concatenation to significantly enhance the representation capability of seed points, and Orthogonal Dictionaries that can learn shape priors from training samples and thus compensate for the information loss of the missing portions during inference. Our design is simple but to the point. Extensive experimental results indicate that the proposed method can reconstruct point clouds with more details and outperform previous state-of-the-art counterparts. The implementation code is available at https://github.com/corecai163/ODGNet.
ICLR Conference 2024 Conference Paper
Disentangled representation learning remains challenging as the underlying factors of variation in the data do not naturally exist. The inherent complexity of real-world data makes it infeasible to exhaustively enumerate and encapsulate all its variations within a finite set of factors. However, it is worth noting that most real-world data have linguistic equivalents, typically in the form of textual descriptions. These linguistic counterparts can represent the data and be effortlessly decomposed into distinct tokens. In light of this, we present Vocabulary Disentangled Retrieval (VDR), a retrieval-based framework that harnesses natural language as proxies of the underlying data variation to drive disentangled representation learning. Our approach employs a bi-encoder model to represent both data and natural language in a vocabulary space, enabling the model to distinguish dimensions that capture intrinsic characteristics within data through its natural language counterpart, thus facilitating disentanglement. We extensively assess the performance of VDR across 15 retrieval benchmark datasets, covering text-to-text and cross-modal retrieval scenarios, as well as human evaluation. Our experimental results demonstrate the superiority of VDR over previous bi-encoder retrievers with comparable model size and training costs, achieving an 8.7% improvement in NDCG@10 on the BEIR benchmark, a 5.3% increase on MS COCO, and a 6.0% increase on Flickr30k in terms of mean recall in the zero-shot setting. Moreover, the results from human evaluation indicate that the interpretability of our method is on par with SOTA captioning models.
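For readers unfamiliar with NDCG@10, the ranking metric cited for BEIR, the sketch below shows one standard way to compute it. It assumes the linear-gain variant and assumes every relevant document already appears in the ranked list, so the ideal ordering is simply the same list sorted by relevance. The function names are hypothetical and this is not the benchmark's official scorer.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(
        rel / math.log2(rank + 2)  # rank is 0-based, so the top hit uses log2(2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(ranked_relevances, k=10):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(ranked_relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Graded relevance labels of the retrieved documents, in ranked order (toy data).
perfect = ndcg_at_k([3, 2, 1, 0])
swapped = ndcg_at_k([0, 1])
print(perfect, swapped)
```

A ranking that already lists documents in decreasing relevance scores 1.0, and pushing the only relevant document down one position lowers the score.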
FLAP Journal 2024 Journal Article
In this paper, some types of weak hyper filters in hyper BE-algebras are introduced and studied, including positive implicative weak hyper filters, implicative weak hyper filters and obstinate weak hyper filters. The relationships between (positive) implicative weak hyper filters and weak hyper filters, and also between obstinate weak hyper filters and maximal weak hyper filters and positive implicative hyper filters, are discussed respectively. Moreover, the equivalent characterizations of these weak hyper filters are given, and the corresponding conditions are found.
JBHI Journal 2024 Journal Article
Colon polyps in colonoscopy images differ significantly in color, size, shape, appearance, and location, posing major challenges to accurate polyp segmentation. In this paper, a Weighted Dual-branch Feature Fusion Network for polyp segmentation, named WDFF-Net, is proposed, which adopts HarDNet68 as the backbone network. First, a dual-branch feature fusion architecture is constructed, which includes a shared feature extractor and two feature fusion branches, i.e., a Progressive Feature Fusion (PFF) branch and a Scale-aware Feature Fusion (SFF) branch. The branches fuse the deep features of multiple layers for different purposes and in different ways. The PFF branch addresses the under-segmentation or over-segmentation of flat polyps with low edge contrast by iteratively fusing features from low, medium, and high layers. The SFF branch tackles the drastic variations in polyp size and shape, especially the missed segmentation of small polyps. These two branches are complementary and play different roles in improving segmentation accuracy. Second, an Object-aware Attention Mechanism (OAM) is proposed to enhance the features of the target regions and suppress those of the background regions, which interfere with segmentation performance. Third, a weighted dual-branch segmentation loss function is specifically designed, which dynamically assigns the weight factors of the loss functions of the two branches to optimize their collaborative training. Experimental results on five public colon polyp datasets demonstrate that the proposed WDFF-Net achieves superior segmentation performance with lower model complexity and faster inference speed, while maintaining good generalization ability.
NeurIPS Conference 2023 Conference Paper
Model poisoning attacks greatly jeopardize the application of federated learning (FL). Existing defenses are vulnerable to the latest model poisoning attacks, leading to a decrease in prediction accuracy. Moreover, these defenses struggle to distinguish benign outliers from malicious gradients, which further compromises model generalization. In this work, we propose a novel defense including detection and aggregation, named RECESS, to serve as a “vaccine” for FL against model poisoning attacks. Unlike the passive analysis in previous defenses, RECESS proactively queries each participating client with a delicately constructed aggregation gradient and detects malicious clients from their responses with higher accuracy. Further, RECESS adopts a newly proposed trust-scoring mechanism to robustly aggregate gradients. Rather than scoring in each iteration as previous methods do, RECESS takes into account the correlation of clients’ performance over multiple iterations to estimate the trust score, bringing a significant increase in detection fault tolerance. Finally, we extensively evaluate RECESS on typical model architectures and four datasets under various settings including white/black-box, cross-silo/device FL, etc. Experimental results show the superiority of RECESS in terms of reducing accuracy loss caused by the latest model poisoning attacks over five classic and two state-of-the-art defenses.
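The idea of a trust score that accumulates over several iterations and then weights the aggregation can be illustrated with a toy sketch. The abstract does not give the scoring rule, so the update rule, the `penalty` and `reward` values, and the function names below are all assumptions made for illustration; this is not the RECESS algorithm itself.

```python
def update_trust(trust, flagged, penalty=0.5, reward=0.1):
    """Hypothetical rule: lower trust when a round looks abnormal, else recover."""
    if flagged:
        return max(0.0, trust - penalty)
    return min(1.0, trust + reward)


def aggregate(gradients, trust_scores):
    """Trust-weighted average of client gradients (plain lists of floats)."""
    total = sum(trust_scores)
    if total == 0:
        raise ValueError("no trusted clients left to aggregate")
    dim = len(gradients[0])
    return [
        sum(t * g[i] for g, t in zip(gradients, trust_scores)) / total
        for i in range(dim)
    ]


# Three clients; the third is flagged in two consecutive rounds (toy data).
trust = [1.0, 1.0, 1.0]
for flags in ([False, False, True], [False, False, True]):
    trust = [update_trust(t, f) for t, f in zip(trust, flags)]

grads = [[1.0, 1.0], [1.0, 1.0], [-10.0, -10.0]]
aggregated = aggregate(grads, trust)
print(trust, aggregated)
```

Because the score carries over between rounds, a single abnormal round only halves a client's weight in this toy rule, while repeated abnormal rounds remove its influence entirely.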
JBHI Journal 2023 Journal Article
Refined and automatic retinal vessel segmentation is crucial for computer-aided early diagnosis of retinopathy. However, existing methods often suffer from mis-segmentation when dealing with thin and low-contrast vessels. In this paper, a two-path retinal vessel segmentation network, namely TP-Net, is proposed, which consists of three core parts, i.e., a main-path, a sub-path, and a multi-scale feature aggregation module (MFAM). The main-path detects the trunk area of the retinal vessels, and the sub-path captures edge information of the retinal vessels. The prediction results of the two paths are combined by MFAM to obtain a refined segmentation of retinal vessels. In the main-path, a three-layer lightweight backbone network is designed according to the characteristics of retinal vessels, and then a global feature selection mechanism (GFSM) is proposed, which can autonomously select the features that are more important for the segmentation task from different layers of the network, thereby enhancing the segmentation capability for low-contrast vessels. In the sub-path, an edge feature extraction method and an edge loss function are proposed, which enhance the ability of the network to capture edge information and reduce the mis-segmentation of thin vessels. Finally, MFAM fuses the prediction results of the main-path and sub-path, removing background noise while preserving edge details and thus yielding a refined segmentation of retinal vessels. The proposed TP-Net has been evaluated on three public retinal vessel datasets, namely DRIVE, STARE, and CHASE DB1. The experimental results show that TP-Net achieves superior performance and generalization ability with fewer model parameters compared with state-of-the-art methods.
IJCAI Conference 2022 Conference Paper
Recently, users’ right to be forgotten has been stipulated by many laws and regulations. However, only removing the data from the dataset is not enough, as machine learning models memorize the training data once the data is involved in model training, increasing the risk of exposing users’ privacy. To solve this problem, the straightforward method, naive retraining, is to discard these data and retrain the model from scratch, which is reliable but incurs substantial computational and time overhead. In this paper, we propose an exact unlearning architecture called ARCANE. Based on ensemble learning, we transform the naive retraining into multiple one-class classification tasks to reduce retraining cost while ensuring model performance, especially in the case of a large number of unlearning requests, which previous works do not consider. We further introduce data preprocessing methods to reduce the retraining overhead and speed up unlearning, which include representative data selection for redundancy removal, training state saving to reuse previous calculation results, and sorting to cope with unlearning requests of different distributions. We extensively evaluate ARCANE on three typical datasets with three common model architectures. Experimental results show the effectiveness and superiority of ARCANE over both naive retraining and the state-of-the-art method in terms of model performance and unlearning speed.
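The cost saving from splitting retraining into per-class tasks can be shown with a toy sketch: each class gets its own sub-model, and an unlearning request retrains only the sub-model of the affected class. The class name `PerClassEnsemble`, the one-dimensional features, and the centroid "model" are hypothetical simplifications; ARCANE's actual one-class classifiers and preprocessing steps are not reproduced here.

```python
from collections import defaultdict


class PerClassEnsemble:
    """Toy ensemble: one centroid sub-model per class, retrained independently."""

    def __init__(self):
        self.data = defaultdict(list)           # label -> feature values
        self.centroids = {}                     # label -> fitted sub-model
        self.retrain_count = defaultdict(int)   # label -> number of fits

    def _fit_class(self, label):
        samples = self.data[label]
        if samples:
            self.centroids[label] = sum(samples) / len(samples)
        else:
            self.centroids.pop(label, None)
        self.retrain_count[label] += 1

    def fit(self, samples):
        for x, label in samples:
            self.data[label].append(x)
        for label in list(self.data):
            self._fit_class(label)

    def unlearn(self, x, label):
        """Remove one sample and retrain only the sub-model of its class."""
        self.data[label].remove(x)
        self._fit_class(label)

    def predict(self, x):
        return min(self.centroids, key=lambda lab: abs(x - self.centroids[lab]))


model = PerClassEnsemble()
model.fit([(1.0, "a"), (3.0, "a"), (10.0, "b"), (12.0, "b")])
model.unlearn(3.0, "a")
print(model.centroids, dict(model.retrain_count))
```

After the request, the sub-model for class "a" has been refitted without the removed sample, while the sub-model for class "b" has not been touched.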
AAAI Conference 2021 Conference Paper
Collecting supporting evidence from large corpora of text (e.g., Wikipedia) is a great challenge for open-domain Question Answering (QA). In particular, for multi-hop open-domain QA, scattered pieces of evidence must be gathered together to support answer extraction. In this paper, we propose a new retrieval target, hop, to collect the hidden reasoning evidence from Wikipedia for complex question answering. Specifically, the hop in this paper is defined as the combination of a hyperlink and the corresponding outbound link document. The hyperlink is encoded as the mention embedding, which models the structured knowledge of how the outbound link entity is mentioned in the textual context, and the corresponding outbound link document is encoded as the document embedding, representing the unstructured knowledge within it. Accordingly, we build HopRetriever, which retrieves hops over Wikipedia to answer complex questions. Experiments on the HotpotQA dataset demonstrate that HopRetriever outperforms previously published evidence retrieval methods by large margins. Moreover, our approach also yields quantifiable interpretations of the evidence collection process.
AAAI Conference 2021 Conference Paper
Sequential recommender systems aim to model users’ evolving interests from their historical behaviors, and hence make customized time-relevant recommendations. Compared with traditional models, deep learning approaches such as CNN and RNN have achieved remarkable advancements in recommendation tasks. Recently, the BERT framework has also emerged as a promising method, benefiting from its self-attention mechanism in processing sequential data. However, one limitation of the original BERT framework is that it only considers one input source, the natural language tokens. How to leverage various types of information under the BERT framework is still an open question. Nonetheless, it is intuitively appealing to utilize other side information, such as item category or tag, for more comprehensive depictions and better recommendations. In our pilot experiments, we found that naive approaches, which directly fuse types of side information into the item embeddings, usually bring very little or even negative effects. Therefore, in this paper, we propose the NOninVasive self-Attention mechanism (NOVA) to leverage side information effectively under the BERT framework. NOVA makes use of side information to generate a better attention distribution, rather than directly altering the item embeddings, which may overwhelm the item information. We validate the NOVA-BERT model on both public and commercial datasets, and our method stably outperforms the state-of-the-art models with negligible computational overhead.
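The non-invasive idea, where side information shapes the attention distribution but never enters the values, can be sketched in a few lines. This is a simplified illustration under stated assumptions: fusion is a plain element-wise sum, there are no learned projection matrices, and it uses a single head with no masking. The function names are hypothetical and this is not the NOVA-BERT implementation.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def noninvasive_attention(item_emb, side_emb):
    """Queries and keys see item + side information; values are items only."""
    fused = [
        [i + s for i, s in zip(item, side)]
        for item, side in zip(item_emb, side_emb)
    ]
    dim = len(item_emb[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in fused:
        weights = softmax([dot(query, key) / scale for key in fused])
        # Values come from the pure item embeddings, so the output stays
        # a convex combination of item vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, item_emb)) for d in range(dim)]
        )
    return outputs


items = [[1.0, 0.0], [0.0, 1.0]]   # toy item embeddings
side = [[0.5, 0.5], [2.0, -1.0]]   # toy side-information embeddings
attended = noninvasive_attention(items, side)
print(attended)
```

Because the values are untouched item embeddings, changing `side` only changes how much each item contributes, not the space the outputs live in.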