EAAI Journal 2026 Journal Article
Cross-stain knowledge distillation for low-cost lung cancer programmed death ligand-1 assessment with multi-granularity multiple instance learning
- Yi Shi
- Chong Ge
- Fang Zhao
- Anli Zhang
- Ao Li
- Haibo Wu
- Minghui Wang
Author name cluster
Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.
EAAI Journal 2026 Journal Article
JBHI Journal 2025 Journal Article
Cancer is a pressing public health problem and one of the leading causes of mortality worldwide. Developing advanced computational methods for predicting cancer survival is pivotal in helping clinicians formulate effective treatment strategies and improve patients' quality of life. Recent advances in survival prediction show that integrating diverse information from various cancer-related data, such as pathological images and genomics, is crucial for improving prediction accuracy. Despite the promising results of existing approaches, the modality gap and semantic redundancy present in multimodal cancer data remain great challenges that hinder comprehensive integration and pose substantial obstacles to further enhancing cancer survival prediction. In this study, we propose a novel agnostic-specific modality learning (ASML) framework for accurate cancer survival prediction. To bridge the modality gap and provide a comprehensive view of distinct data modalities, we employ an agnostic-specific learning strategy that learns both the commonality across modalities and the uniqueness of each modality. Moreover, a cross-modal fusion network integrates multimodal information by modeling modality correlations and diminishing semantic redundancy in a divide-and-conquer manner. Extensive experimental results on three TCGA datasets demonstrate that ASML outperforms existing multimodal cancer survival prediction methods.
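The agnostic-specific idea described in the abstract can be sketched with toy encoders: a shared (modality-agnostic) projection captures what the modalities have in common, while per-modality (specific) projections keep what is unique to each. This is a minimal NumPy illustration with made-up shapes, not the ASML architecture; the one-layer `encoder`, the weight matrices, and the concatenation fusion are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, W):
    """Stand-in one-layer encoder: ReLU(x @ W)."""
    return np.maximum(x @ W, 0.0)

# Toy features for two modalities (e.g. pathology and genomics), 4 patients.
path_feats = rng.normal(size=(4, 16))
gene_feats = rng.normal(size=(4, 16))

# A shared projection models cross-modal commonality; per-modality
# projections model each modality's uniqueness.
W_shared = rng.normal(size=(16, 8))
W_path = rng.normal(size=(16, 8))
W_gene = rng.normal(size=(16, 8))

agnostic_path = encoder(path_feats, W_shared)
agnostic_gene = encoder(gene_feats, W_shared)
specific_path = encoder(path_feats, W_path)
specific_gene = encoder(gene_feats, W_gene)

# Fused patient representation: concatenate agnostic and specific views.
fused = np.concatenate(
    [agnostic_path, agnostic_gene, specific_path, specific_gene], axis=1)
print(fused.shape)  # (4, 32)
```

In the actual framework the projections would be trained jointly with a survival objective; here they are random and serve only to show the data flow.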
JBHI Journal 2025 Journal Article
Accurate cancer survival prediction is crucial in devising optimal treatment plans and offering individualized care to improve clinical outcomes. Recent research confirms that integrating heterogeneous cancer data, such as histopathological images and genomic data, can enhance our understanding of cancer progression and provide a multimodal perspective on patient survival chances. However, existing methods often overlook the fundamental aspects of multimodal data, i.e., consistency and complementarity, which significantly hinders advancements in cancer survival prediction. To address this issue, we present DRLSurv, a novel multimodal deep learning method that leverages disentangled representation learning for precise cancer survival prediction. Through dedicated deep encoding networks, DRLSurv decomposes each modality into modality-invariant and modality-specific representations, which are mapped to common and unique feature subspaces to simultaneously mine the distinct aspects of cancer multimodal data. Moreover, our method introduces a subspace-based proximity contrastive loss and a re-disentanglement loss, ensuring the successful decomposition of consistent and complementary information while maintaining multimodal fidelity during the learning of disentangled representations. Both quantitative analyses and visual assessments on different datasets validate the superiority of DRLSurv over existing survival prediction approaches, demonstrating its powerful capability to exploit enriched survival-related information from cancer multimodal data. Therefore, DRLSurv not only offers a unified and comprehensive deep learning framework for advancing multimodal survival prediction, but also provides valuable insights for cancer prognosis and survival analysis.
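The disentanglement objective can be illustrated with a toy contrastive-style computation: modality-invariant views of the same patient should agree, while an invariant view and a modality-specific view should not overlap. The embeddings and the loss form below are hypothetical, chosen only to show the intuition; they are not the paper's subspace-based proximity contrastive loss.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

# Hypothetical disentangled embeddings for one patient.
inv_path = np.array([1.0, 0.9, 0.1])   # modality-invariant, pathology
inv_gene = np.array([0.9, 1.0, 0.0])   # modality-invariant, genomics
spec_path = np.array([0.0, 0.1, 1.0])  # modality-specific, pathology

# Pull invariant views of the same patient together; penalize overlap
# between invariant and specific representations.
pull = 1.0 - cosine(inv_path, inv_gene)       # small when views agree
push = max(0.0, cosine(inv_path, spec_path))  # large when subspaces mix
loss = pull + push
```

With these toy vectors the invariant views nearly coincide, so the pull term is close to zero and the residual loss is dominated by the small invariant/specific overlap.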
JBHI Journal 2025 Journal Article
Accurate cancer survival prediction is crucial for oncologists to determine therapeutic plans, which directly influence treatment efficacy and patient survival outcomes. Recently, multimodal fusion-based prognostic methods have demonstrated effectiveness for survival prediction by fusing diverse cancer-related data from different medical modalities, e.g., pathological images and genomic data. However, these works still face significant challenges. First, most approaches attempt multimodal fusion with a simple one-shot fusion strategy, which is insufficient to explore the complex interactions underlying highly disparate multimodal data. Second, current methods for investigating multimodal interactions face a capability-efficiency dilemma, i.e., the difficult balance between powerful modeling capability and applicable computational efficiency, thus impeding effective multimodal fusion. In this study, to address these challenges, we propose an innovative multi-shot interactive fusion method named MIF for precise survival prediction using pathological and genomic data. In particular, a novel multi-shot fusion framework promotes multimodal fusion by decomposing it into successive fusing stages, integrating modalities in a progressive way. Moreover, to address the capability-efficiency dilemma, various affinity-based interactive modules are introduced to synergize with the multi-shot framework. Specifically, by harnessing comprehensive affinity information as guidance for mining interactions, the proposed interactive modules can efficiently generate low-dimensional discriminative multimodal representations. Extensive experiments on different cancer datasets show that our method not only achieves state-of-the-art performance through effective multimodal fusion, but also possesses high computational efficiency compared to existing survival prediction methods.
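One "shot" of affinity-guided interaction can be sketched as a cross-modal similarity matrix that weights how one modality's tokens update the other's; stacking such steps gives a multi-shot, progressive fusion. The token counts, dimensions, and the scaled-softmax affinity below are illustrative assumptions, not MIF's actual modules.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
# Hypothetical token sets: 4 pathology patch embeddings and 3 gene-group
# embeddings in a shared 8-dim space.
path_tokens = rng.normal(size=(4, 8))
gene_tokens = rng.normal(size=(3, 8))

# Affinity matrix: each pathology token's similarity to every genomic
# token, normalized so each row is a distribution over genomic evidence.
affinity = softmax(path_tokens @ gene_tokens.T / np.sqrt(8))

# One interaction shot: residual update of pathology tokens with the
# affinity-weighted genomic information.
path_updated = path_tokens + affinity @ gene_tokens
```

A second shot would repeat the same pattern on the updated tokens (possibly in the reverse direction), which is how a one-shot fusion decomposes into successive stages.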
JBHI Journal 2025 Journal Article
Accurate classification of nuclei is a crucial step in advancing pathology image analysis for disease diagnosis and treatment. Recently, prompt learning has shown great promise in universal nuclei classification across multiple datasets, as it can harvest common knowledge by representing nuclei semantics across different sources. However, remarkable intra- and cross-dataset variability in nuclei categories, together with their complicated semantic relationships with numerous nuclei, is often observed across image instances, leading to two key challenges: instance variability and semantic ambiguity. In this paper, we propose a universal language-guided nuclei classification framework (ULNC) that leverages prompt-based language supervision to overcome these obstacles at the instance level. To address the instance variability issue, we introduce an innovative prompt learning approach that fully exploits the unique contextual information of each image instance and generates instance-aware text embeddings highly adaptable to the categorical semantics of varied data sources. Additionally, to tackle the problem of semantic ambiguity, we employ a local vision-language matching loss that explicitly reinforces semantic connections between localized image regions and text prompts for nuclei categories, thus promoting the universal model's ability to learn discriminative image features for generalized nuclei classification. Extensive experiments conducted on several public datasets demonstrate that ULNC outperforms state-of-the-art methods in both accuracy and generalization to unseen domains, highlighting its potential for robust nuclei classification across diverse datasets.
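A local vision-language matching loss of the kind the abstract mentions can be sketched as a temperature-scaled cross-entropy between region embeddings and per-category text embeddings. The embeddings, labels, and the 0.07 temperature below are illustrative assumptions in the style of CLIP-like matching, not ULNC's exact loss.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def normalize(x):
    """L2-normalize rows so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
# Hypothetical: 3 localized image regions, text embeddings for 4 nuclei
# categories, shared 16-dim embedding space.
regions = rng.normal(size=(3, 16))
texts = rng.normal(size=(4, 16))
labels = np.array([0, 2, 1])  # ground-truth category of each region

# Score each region against every category prompt; cross-entropy pulls
# each region toward its own category's text embedding.
logits = normalize(regions) @ normalize(texts).T / 0.07  # temperature
loss = -np.mean([np.log(softmax(l)[y]) for l, y in zip(logits, labels)])
```

Minimizing this loss tightens the semantic connection between local image regions and category prompts, which is the mechanism the abstract credits for resolving semantic ambiguity.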
JBHI Journal 2024 Journal Article
Deep neural networks have been successfully adopted as powerful tools for nuclei detection in histopathology images, but they require training and testing data to follow the same probability distribution. However, domain shift among histopathology images is widespread in real-world applications and severely deteriorates the detection performance of deep neural networks. Despite the encouraging results of existing domain adaptation methods, challenges remain for the cross-domain nuclei detection task. First, given the tiny size of nuclei, it is very difficult to obtain sufficient nuclei features, which negatively affects feature alignment. Second, because annotations are unavailable in the target domain, some extracted features contain background pixels and are thereby indiscriminative, which can largely confuse the alignment procedure. To address these challenges, in this article we propose an end-to-end graph-based nuclei feature alignment (GNFA) method for boosting cross-domain nuclei detection. Concretely, sufficient nuclei features are generated by a nuclei graph convolutional network (NGCN) that aggregates information from adjacent nuclei upon construction of a nuclei graph. In addition, an importance learning module (ILM) is designed to further select discriminative nuclei features, mitigating the negative influence of background pixels in the target domain during alignment. By utilizing the sufficient and discriminative node features generated by GNFA, our method can successfully perform feature alignment and effectively alleviate the domain shift problem for nuclei detection. Extensive experiments on multiple adaptation scenarios reveal that our method achieves state-of-the-art performance in cross-domain nuclei detection compared with existing domain adaptation methods.
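The core NGCN idea, enriching each nucleus's feature by aggregating its graph neighbors, can be sketched as a single mean-aggregation step over a nearest-neighbor nuclei graph. The centroids, features, and `k=1` neighborhood below are toy assumptions, not the paper's network.

```python
import numpy as np

# Toy nucleus centroids and 2-dim features for 5 detected nuclei;
# the first three are clustered, the last two sit far away.
centroids = np.array([[0, 0], [1, 0], [0, 2], [5, 5], [6, 5]], dtype=float)
feats = np.array([[1, 0], [0.9, 0.1], [1.1, -0.1], [0, 1], [0.1, 0.9]])

def aggregate(centroids, feats, k=1):
    """One GCN-style step: average each nucleus with its k nearest neighbors."""
    n = len(centroids)
    out = np.zeros_like(feats)
    for i in range(n):
        d = np.linalg.norm(centroids - centroids[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]  # skip self (distance 0)
        out[i] = (feats[i] + feats[nbrs].sum(axis=0)) / (k + 1)
    return out

agg = aggregate(centroids, feats)
print(agg.shape)  # (5, 2)
```

Even this toy step shows the benefit: a nucleus whose own feature is noisy inherits evidence from its spatial neighbors, giving the alignment procedure richer node features to work with.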
AIIM Journal 2022 Journal Article
JBHI Journal 2022 Journal Article
Accurate histological subtype classification between adenocarcinoma (ADC) and squamous cell carcinoma (SCC) using computed tomography (CT) images is of great importance in assisting clinicians to determine treatment and therapy plans for non-small cell lung cancer (NSCLC) patients. Although current deep learning approaches have achieved promising progress in this field, they often struggle to capture effective tumor representations due to inadequate training data and consequently show limited performance. In this study, we propose a novel and effective reconstruction-assisted feature encoding network (RAFENet) for histological subtype classification, leveraging an auxiliary image reconstruction task to provide extra guidance and regularization for enhanced tumor feature representations. Unlike existing reconstruction-assisted methods that directly use generalizable features obtained from a shared encoder for the primary task, RAFENet employs a dedicated task-aware encoding module to refine the generalizable features. Specifically, a cascade of cross-level non-local blocks is introduced to progressively refine generalizable features at different levels with the aid of lower-level task-specific information, thereby learning multi-level task-specific features tailored to histological subtype classification. Moreover, in addition to the widely adopted pixel-wise reconstruction loss, we introduce a powerful semantic consistency loss to explicitly supervise the training of RAFENet, combining a feature consistency loss and a prediction consistency loss to ensure semantic invariance during image reconstruction. Extensive experimental results show that RAFENet effectively addresses issues that existing reconstruction-based methods cannot resolve and consistently outperforms other state-of-the-art methods on both public and in-house NSCLC datasets.
Supplementary material is available at https://github.com/lhch1994/Rafenet_sup_material.
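The semantic consistency loss described above, feature consistency plus prediction consistency between an image and its reconstruction, can be sketched with stand-in networks. The `encode`/`classify` functions, shapes, and noise level are all hypothetical; only the loss structure (sum of the two mean-squared consistency terms) follows the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    """Stand-in feature encoder: ReLU(x @ W)."""
    return np.maximum(x @ W, 0.0)

def classify(f, V):
    """Stand-in classifier head producing logits."""
    return f @ V

W = rng.normal(size=(32, 8))
V = rng.normal(size=(8, 2))  # 2 classes: ADC vs SCC

ct = rng.normal(size=(1, 32))                  # original CT patch (flattened)
recon = ct + 0.01 * rng.normal(size=(1, 32))   # its (near-perfect) reconstruction

# Semantic consistency: the reconstruction should yield the same features
# and the same predictions as the original image.
f_orig, f_rec = encode(ct, W), encode(recon, W)
feat_loss = float(((f_orig - f_rec) ** 2).mean())
pred_loss = float(((classify(f_orig, V) - classify(f_rec, V)) ** 2).mean())
semantic_loss = feat_loss + pred_loss
```

Because the toy reconstruction differs from the original only by small noise, both consistency terms stay near zero, which is exactly the behavior the loss rewards during training.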
YNICL Journal 2021 Journal Article
IJCAI Conference 2021 Conference Paper
Due to the limited knowledge carried by queries, traditional dialogue systems often face the dilemma of generating boring responses, leading to poor user experience. To alleviate this issue, this paper proposes a novel infobox knowledge-aware dialogue generation approach, HITA-Graph, with three unique features. First, open-domain infobox tables that describe entities with relevant attributes are adopted as the knowledge source. An order-irrelevant Hierarchical Infobox Table Encoder is proposed to represent an infobox table at three levels of granularity. In addition, an Infobox-Dialogue Interaction Graph Network is built to effectively integrate the infobox context and the dialogue context into a unified infobox representation. Second, a Hierarchical Infobox Attribute Attention mechanism is developed to access the encoded infobox knowledge at different levels of granularity. Last but not least, a Dynamic Mode Fusion strategy is designed to allow the decoder to select a vocabulary word or copy a word from the given infobox/query. We extract infobox tables from Chinese Wikipedia and construct an infobox knowledge base. Extensive evaluation on a publicly released Chinese corpus demonstrates the superior performance of our approach against several representative methods.
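The Dynamic Mode Fusion idea, letting the decoder either generate a vocabulary word or copy one from the infobox/query, can be sketched as a gated mixture of two distributions, in the style of pointer-generator networks. The tiny vocabulary, the scores, and the fixed gate value are illustrative assumptions, not HITA-Graph's learned components.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical vocabulary; the last two words also appear in the source
# infobox/query and thus receive high copy-attention scores.
vocab = ["the", "capital", "of", "france", "is", "paris"]
gen_scores = np.array([0.1, 0.2, 0.1, 0.0, 0.3, 0.1])   # decoder logits
copy_scores = np.array([0.0, 0.0, 0.0, 2.0, 0.0, 3.0])  # attention over source

# A gate (fixed here for illustration; learned per step in practice)
# decides how much probability mass comes from generating vs. copying.
gate = 0.4
p = gate * softmax(gen_scores) + (1 - gate) * softmax(copy_scores)
print(vocab[int(np.argmax(p))])  # prints "paris"
```

With the copy distribution concentrated on source words, the fused distribution favors "paris" even though the generation logits alone are nearly uniform, which is how copying injects entity knowledge into the response.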
JBHI Journal 2020 Journal Article
Glioblastoma multiforme (GBM) is one of the most malignant brain tumors, with very poor prognosis. To improve patients' clinical treatment and their quality of life after surgery, researchers have developed numerous in silico models and tools for predicting GBM prognosis based on molecular datasets, with great success. However, pathology still plays the most critical role in clinical cancer diagnosis and prognosis at present. Recent advances in storing and processing histopathological images have drawn researchers' attention, and models based on histopathological images have been developed that show great potential for computer-aided pathological diagnosis. However, models that combine both molecular data and histopathological images to predict GBM prognosis with high accuracy are not yet available. In our previous research, we used a simple MKL method to integrate multi-omics data and successfully improved GBM prognosis prediction. In this paper, we develop a novel multiple kernel learning (MKL) method, named histopathological integrating multiple kernel learning (HI-MKL), that can efficiently integrate both histopathological images and multi-omics data. Using datasets from The Cancer Genome Atlas project, we build a system that predicts GBM prognosis with high accuracy. Our research shows that HI-MKL is an accurate, robust, and generalizable MKL method that performs well on the GBM prognosis task.
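The MKL integration principle, building one base kernel per data source and combining them with non-negative weights, can be shown in a few lines. The RBF kernels, feature shapes, `gamma` values, and fixed weights below are illustrative assumptions; HI-MKL learns its combination weights rather than fixing them.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """RBF (Gaussian) kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

rng = np.random.default_rng(1)
# Hypothetical per-modality features for 6 patients: histopathology-image
# descriptors and one omics profile.
img = rng.normal(size=(6, 10))
omics = rng.normal(size=(6, 20))

# Combine per-modality base kernels with non-negative weights summing to 1
# (fixed here for illustration; learned in an MKL solver).
K_img = rbf_kernel(img, gamma=0.1)
K_omics = rbf_kernel(omics, gamma=0.05)
weights = np.array([0.6, 0.4])
K = weights[0] * K_img + weights[1] * K_omics

# A valid combined kernel stays symmetric and positive semidefinite,
# so it can feed any kernel method (e.g. an SVM-based prognosis model).
assert np.allclose(K, K.T)
assert np.linalg.eigvalsh(K).min() > -1e-9
```

The combined kernel `K` is what a downstream kernel classifier would consume, which is how heterogeneous modalities end up in a single predictive model.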
YNICL Journal 2020 Journal Article
IROS Conference 2012 Conference Paper
Connectionist Central Pattern Generator (CCPG) models help explain how the CPG neural mechanism functions, and their relatively small complexity makes them suitable for controlling snake-like robots. However, few CCPG models have been constructed to generate a snake-like robot's three-dimensional gaits, which are important for adaptation, and the gait generation ability of existing models is very limited. Following the CPG mechanism, a hierarchical CCPG model (HCCPG) with small complexity is proposed to better implement three-dimensional gaits. The HCCPG has a two-layer structure: a basic rhythmic signal generation layer and an output signal modulation layer. The HCCPG can generate three-dimensional gaits well and is extendable. Based on the HCCPG, a three-dimensional gait control method is proposed. Simulations and experiments validate this method.
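The general CPG principle behind such models, rhythmic signals generated by coupled oscillators with a fixed phase lag along the body, can be sketched with a chain of phase oscillators. This is a generic CPG illustration, not the HCCPG's connectionist architecture; all parameter values (`freq`, `amp`, `phase_lag`, coupling gain `k`) are assumptions.

```python
import math

def cpg_chain(n_joints=6, steps=200, dt=0.02,
              freq=1.0, amp=0.5, phase_lag=math.pi / 3, k=2.0):
    """Integrate a chain of phase-coupled oscillators and return joint angles."""
    phases = [i * phase_lag for i in range(n_joints)]
    history = []
    for _ in range(steps):
        new = []
        for i, p in enumerate(phases):
            dp = 2 * math.pi * freq  # intrinsic oscillation rate
            # Couple each oscillator to its predecessor so the chain
            # settles into a travelling wave with the desired phase lag.
            if i > 0:
                dp += k * math.sin(phases[i - 1] - p + phase_lag)
            new.append(p + dp * dt)
        phases = new
        history.append([amp * math.sin(p) for p in phases])
    return history

angles = cpg_chain()  # steps x n_joints joint-angle commands
```

Driving alternate joints in horizontal and vertical planes with two such chains is one common way rhythmic signals like these are turned into three-dimensional snake gaits.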
ICRA Conference 2012 Conference Paper
Stair-climbing is a necessary capability for mobile robots. This paper presents an online control method for the stair-climbing of a transformable tracked robot, Amoeba-II, a modular robot composed of heterogeneous mechanism modules. Based on a reasonable decomposition and kinematic analysis of the stair-climbing process, coordinating the rotations of the modules can reduce the slippage between tracks and terrain. To ensure that the robot can climb stairs with sufficient capability and stability, a stair-climbing criterion has been established based on a force analysis of each stage of the stair-climbing procedure. Meanwhile, an interference-avoiding criterion has been set up to prevent interference between the robot's non-tracked module and the stair. Stair-climbing experiments have been carried out to verify the validity of the online stair-climbing control method for the transformable tracked robot.
IROS Conference 2012 Conference Paper
This paper presents an optimal design method for a new robot, the amphibious transformable robot, which can not only perform reconfiguration but also carry out tasks in amphibious environments. To satisfy a range of performance requirements in aquatic and terrestrial environments, a multi-objective optimization method is adopted so that the robot achieves optimal comprehensive performance in the amphibious environment. Based on the kinematic and dynamic analysis of the robot, the multi-objective optimization problem for the mechanism parameter design is established from the mapping relationships between the performance indexes, and a Multi-Objective Genetic Algorithm is used to obtain the Pareto solution set. Based on a combination weighting method for multi-attribute decision-making, a result is extracted and used to direct the mechanism design of the amphibious transformable robot, Amoeba-II. A maneuverability experiment with Amoeba-II in the amphibious environment verifies the validity and applicability of the mechanism parameter design method based on the Multi-Objective Genetic Algorithm.
IROS Conference 2010 Conference Paper
Mobile robots often perform dangerous missions such as planetary exploration, reconnaissance, anti-terrorism, and rescue. They are therefore required to move in complex and unpredictable environments where the ground may be soft or hard, even or uneven. To access such terrains, a novel robot (NEZA-I) with a self-adaptive mobile mechanism is proposed and developed. It consists of a control system unit and two symmetric transformable wheel-track (TWT) units. Each TWT unit is driven by only one servo motor and can efficiently move over rough terrain by changing its locomotion mode and transforming its track configuration, meaning that the mobile mechanism of NEZA-I is self-adaptive to irregular environments. The paper proposes the design concept of NEZA-I, presents its structure and drive system, and describes the self-adaptive principle of the mobile mechanism on rough terrains. The locomotion mode and posture of the mobile mechanism are analyzed by means of simulation. Finally, basic experiments verify the mobility of NEZA-I.
IROS Conference 2006 Conference Paper
A reconfigurable modular planetary robot system (RMPRS) consists of a parent body and multiple asymmetric wheel-manipulator child-robot modules. Each module can independently locomote and manipulate, and has a defined posture orientation and locomotion direction. The modules have reconfiguration capability, so a group of modules can construct a variety of configurations. The aim of robot reconfiguration is to generate a better configuration with respect to directional locomotion adapted to the environment. A module state vector (MSV) and a configuration state matrix (CSM) are presented and constructed to represent the asymmetric modules and the configurations, and to support the transformation operations that trigger the elementary motions of the modules and the reconfiguration. An algorithm for optimizing the assembly reconfiguration of discrete modules is proposed, and the result is evaluated through numerical simulation in an example.
ICRA Conference 2005 Conference Paper
A new Reconfigurable Planetary Robot System (RPRS) is introduced in this paper. The locomotion mechanism, especially the static force analysis and the climbing ability of different configurations of multiple child-robots, is presented in detail. The basic configurations of two-child-robot systems are given in three modes: connecting in series with the arm in front, connecting in series with the arm in back, and combining into a loop with the grasper. Simulation results for these three configurations, based on static analysis, demonstrate that climbing ability is closely correlated with configuration. Comparing the results shows that the loop configuration performs best in slope climbing. Actual experiments with the child-robot system corroborate the simulation results, and an interesting phenomenon emerged: all the configurations can climb steeper gradients than the simulations predict. This phenomenon reveals a characteristic of the novel child-robot architecture.
IROS Conference 2005 Conference Paper
A reconfigurable planetary rover system (RPRS) is presented, consisting of a parent body and several child robots. Each child robot, composed of an arm part and a wheel part, has two moving modes: a locomotion mode and a manipulation mode. According to the mechanical characteristics, we propose two methods for the motion planning of swerving locomotion. Experimental results show that the robot can turn during locomotion by adjusting the arm's attitude, but the child robot's left-hand turning radius during locomotion is much larger than its right-hand turning radius, and the effect is not obvious. A spot-turning method is therefore presented that uses the difference between the radial and tangential frictional forces of the ground on the direction wheel, which is important for the robot's autonomous locomotion.