Author name cluster

Tong Zhao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

25 papers

2 author rows

AAAI Conference 2026 Conference Paper

D-GARA: A Dynamic Benchmarking Framework for GUI Agent Robustness in Real-World Anomalies

Sen Chen
Tong Zhao
Yi Bin
Fei Ma
Wenqi Shao
Zheng Wang

Developing intelligent agents capable of operating a wide range of Graphical User Interfaces (GUIs) with human-level proficiency is a key milestone on the path toward Artificial General Intelligence. However, most existing datasets and benchmarks for training and evaluating GUI agents are static and idealized, failing to reflect the complexity and unpredictability of real-world environments, particularly the presence of anomalies. To bridge this research gap, we propose D-GARA, a dynamic benchmarking framework, to evaluate Android GUI agent robustness in real-world anomalies. D-GARA introduces a diverse set of real-world anomalies that GUI agents commonly face in practice, including interruptions such as permission dialogs, battery warnings, and update prompts. Based on the D-GARA framework, we construct and annotate a benchmark featuring commonly used Android applications with embedded anomalies to support broader community research. Comprehensive experiments and results demonstrate substantial performance degradation in state-of-the-art GUI agents when exposed to anomaly-rich environments, highlighting the need for robustness-aware learning. D-GARA is modular and extensible, supporting the seamless integration of new tasks, anomaly types, and interaction scenarios to meet specific evaluation goals.

AAAI Conference 2026 Conference Paper

ST-VLM: A Spatial-to-Image Multimodal Spatial-Temporal Prediction Framework with Vision-Language Model

Tong Zhao
Junping Du
Zhe Xue
Meiyu Liang
Aijing Li
Xiaolong Meng
Dandan Liu

Spatial-temporal prediction plays a crucial role in various domains, including intelligent transportation and environmental monitoring. Although large language models have shown advantages in long-range dependency modeling and excellent generalization ability for forecasting, they have limited understanding of spatial-temporal features. Especially for spatial features, most existing methods still simplify the spatial-temporal prediction task into multiple independent temporal prediction tasks, failing to effectively encode the dynamic evolution of spatial relations. To address these problems, we propose ST-VLM (Spatial-Temporal Forecasting with Vision-Language Model), a novel framework that leverages visual representations to encode the dynamic spatial dependencies within spatial-temporal data and integrates multi-modal information to enhance prediction. This framework transforms spatial-temporal features into three modalities (vision, text, and time series), enhances cross-modal fusion through an attention-aware fusion mechanism in the first layer of the Vision-Language Model (VLM), and optimizes multi-modal feature interaction via adaptive fine-tuning strategies. After fusion, the multi-modal embeddings are used for the final spatial-temporal prediction task. Extensive experiments demonstrate that ST-VLM achieves state-of-the-art performance across various datasets. In particular, the framework exhibits promising results in few-shot scenarios, verifying its strong generalization ability.

NeurIPS Conference 2025 Conference Paper

A Pre-training Framework for Relational Data with Information-theoretic Principles

Quang Truong
Zhikai Chen
Mingxuan Ju
Tong Zhao
Neil Shah
Jiliang Tang

Relational databases underpin critical infrastructure across a wide range of domains, yet the design of generalizable pre-training strategies for learning from relational databases remains an open challenge due to task heterogeneity. Specifically, there exist many possible downstream tasks, as tasks are defined based on relational schema graphs, temporal dependencies, and SQL-defined label logics. An effective pre-training framework is desired to take these factors into account in order to obtain task-aware representations. By incorporating knowledge of the underlying distribution that drives label generation, downstream tasks can benefit from relevant side-channel information. To bridge this gap, we introduce Task Vector Estimation (TVE), a novel pre-training framework that constructs predictive supervisory signals via set-based aggregation over schema traversal graphs, explicitly modeling next-window relational dynamics. We formalize our approach through an information-theoretic lens, demonstrating that task-informed representations retain more relevant signals than those obtained without task priors. Extensive experiments on the RelBench benchmark show that TVE consistently outperforms traditional pre-training baselines. Our findings advocate for pre-training objectives that encode task heterogeneity and temporal structure as design principles for predictive modeling on relational databases. Our code is publicly available at https://github.com/quang-truong/task-vector-estimation.

IJCAI Conference 2025 Conference Paper

CSAHFL: Clustered Semi-Asynchronous Hierarchical Federated Learning for Dual-layer Non-IID in Heterogeneous Edge Computing Networks

Aijing Li
Junping Du
Dandan Liu
Yingxia Shao
Tong Zhao
Guanhua Ye

Federated Learning (FL) enables collaborative model training across distributed devices without sharing raw data. Hierarchical Federated Learning (HFL) is a new paradigm of FL that leverages the Edge Servers (ESs) layer as an intermediary to perform partial local model aggregation in proximity, reducing core network transmission overhead. However, HFL faces new challenges: (1) The two-stage aggregation process between client-edge and edge-cloud results in a dual-layer non-IID issue, which may significantly compromise model training accuracy. (2) The heterogeneity and mobility of clients further impact model training efficiency. To address these challenges, we propose a novel Clustered Semi-Asynchronous Hierarchical Federated Learning (CSAHFL) framework that integrates adaptive semi-asynchronous intra-cluster aggregation at the client-edge layer and dynamic distribution-aware inter-cluster aggregation at the edge-cloud layer, collaboratively enhancing model performance and scalability in heterogeneous and mobile environments. We conduct experiments under varying degrees of dual-layer non-IID in both static and high-mobility scenarios. The results demonstrate significant advantages of CSAHFL over representative state-of-the-art methods.

NeurIPS Conference 2025 Conference Paper

Exploring Landscapes for Better Minima along Valleys

Tong Zhao
Jiacheng Li
Yuanchang Zhou
Guangming Tan
Weile Jia

Finding lower and better-generalizing minima is crucial for deep learning. However, most existing optimizers stop searching the parameter space once they reach a local minimum. Given the complex geometric properties of the loss landscape, it is difficult to guarantee that such a point is the lowest or provides the best generalization. To address this, we propose an adaptor "E" for gradient-based optimizers. The adapted optimizer tends to continue exploring along landscape valleys (areas with low and nearly identical losses) in order to search for potentially better local minima even after reaching a local minimum. This approach increases the likelihood of finding a lower and flatter local minimum, which is often associated with better generalization. We also provide a proof of convergence for the adapted optimizers in both convex and non-convex scenarios for completeness. Finally, we demonstrate their effectiveness in an important but notoriously difficult training scenario, large-minibatch training, where Lamb is the benchmark optimizer. Our testing results show that the adapted Lamb, ALTO, increases the test accuracy (generalization) of the current state-of-the-art optimizer by an average of 2.5% across a variety of large-batch training tasks. This work potentially opens a new research direction in the design of optimization algorithms.

ICRA Conference 2025 Conference Paper

Is Linear Feedback on Smoothed Dynamics Sufficient for Stabilizing Contact-Rich Plans?

Yuki Shirai
Tong Zhao
H. J. Terry Suh
Huaijiang Zhu
Xinpei Ni
Jiuguang Wang
Max Simchowitz
Tao Pang

Designing planners and controllers for contact-rich manipulation is extremely challenging as contact violates the smoothness conditions that many gradient-based controller synthesis tools assume. Contact smoothing approximates a non-smooth system with a smooth one, allowing one to use these synthesis tools more effectively. However, applying classical control synthesis methods to smoothed contact dynamics remains relatively under-explored. This paper analyzes the efficacy of linear controller synthesis using differential simulators based on contact smoothing. We introduce natural baselines for leveraging contact smoothing to compute (a) open-loop plans robust to uncertain conditions and/or dynamics, and (b) feedback gains to stabilize around open-loop plans. Using robotic bimanual whole-body manipulation as a testbed, we perform extensive empirical experiments on over 300 trajectories and analyze why LQR seems insufficient for stabilizing contact-rich plans.

TMLR Journal 2025 Journal Article

Graph Neural Networks (GNNs) are prominent in graph machine learning and have shown state-of-the-art performance in Link Prediction (LP) tasks. Nonetheless, recent studies show that GNNs struggle to produce good results on low-degree nodes despite their overall strong performance. In practical applications of LP, like recommendation systems, improving performance on low-degree nodes is critical, as it amounts to tackling the cold-start problem of improving the experiences of users with few observed interactions. In this paper, we investigate improving GNNs' LP performance on low-degree nodes while preserving their performance on high-degree nodes and propose a simple yet surprisingly effective augmentation technique called NodeDup. Specifically, NodeDup duplicates low-degree nodes and creates links between nodes and their own duplicates before following the standard supervised LP training scheme. By leveraging a "multi-view" perspective for low-degree nodes, NodeDup shows significant LP performance improvements on low-degree nodes without compromising any performance on high-degree nodes. Additionally, as a plug-and-play augmentation module, NodeDup can be easily applied on existing GNNs with very light computational cost. Extensive experiments show that NodeDup achieves 38.49%, 13.34%, and 6.76% relative improvements on isolated, low-degree, and warm nodes, respectively, on average across all datasets compared to GNNs and the existing cold-start methods.
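The duplication step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `node_dup`, the `degree_threshold` parameter, and its default value are hypothetical, and the sketch only covers the graph-structure side (how node features are assigned to duplicates is not shown).

```python
def node_dup(edges, num_nodes, degree_threshold=2):
    """Illustrative sketch: duplicate low-degree nodes and link each to its duplicate.

    `edges` is a list of undirected (u, v) pairs over node ids 0..num_nodes-1.
    `degree_threshold` is a hypothetical cut-off for "low-degree" (an assumption).
    Returns the augmented edge list and a map from original node to duplicate id.
    """
    # Count the degree of every node in the original graph.
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    augmented = list(edges)
    duplicate_of = {}
    next_id = num_nodes  # duplicates receive fresh ids after the original nodes
    for node in range(num_nodes):
        if degree[node] <= degree_threshold:
            duplicate_of[node] = next_id
            augmented.append((node, next_id))  # link the node to its own duplicate
            next_id += 1
    return augmented, duplicate_of


# Node 1 is a hub; node 4 is isolated.
edges = [(0, 1), (1, 2), (1, 3)]
augmented_edges, duplicate_of = node_dup(edges, num_nodes=5)
```

In this toy graph every node except the hub receives a duplicate, and the augmented edge list would then be passed to an otherwise unchanged link-prediction training loop.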

NeurIPS Conference 2025 Conference Paper

V2X-Radar: A Multi-modal Dataset with 4D Radar for Cooperative Perception

Lei Yang
Xinyu Zhang
Jun Li
Chen Wang
Jiaqi Ma
Zhiying Song
Tong Zhao
Ziying Song

Modern autonomous vehicle perception systems often struggle with occlusions and limited perception range. Previous studies have demonstrated the effectiveness of cooperative perception in extending the perception range and overcoming occlusions, thereby enhancing the safety of autonomous driving. In recent years, a series of cooperative perception datasets have emerged; however, these datasets primarily focus on cameras and LiDAR, neglecting 4D Radar, a sensor used in single-vehicle autonomous driving to provide robust perception in adverse weather conditions. In this paper, to bridge the gap created by the absence of 4D Radar datasets in cooperative perception, we present V2X-Radar, the first large-scale, real-world multi-modal dataset featuring 4D Radar. The V2X-Radar dataset is collected using a connected vehicle platform and an intelligent roadside unit equipped with 4D Radar, LiDAR, and multi-view cameras. The collected data encompasses sunny and rainy weather conditions, spanning daytime, dusk, and nighttime, as well as various typical challenging scenarios. The dataset consists of 20K LiDAR frames, 40K camera images, and 20K 4D Radar data, including 350K annotated boxes across five categories. To support various research domains, we have established V2X-Radar-C for cooperative perception, V2X-Radar-I for roadside perception, and V2X-Radar-V for single-vehicle perception. Furthermore, we provide comprehensive benchmarks across these three sub-datasets.

NeurIPS Conference 2024 Conference Paper

How Does Message Passing Improve Collaborative Filtering?

Clark M. Ju
William Shiao
Zhichun Guo
Yanfang Ye
Yozen Liu
Neil Shah
Tong Zhao

Collaborative filtering (CF) has exhibited prominent results for recommender systems and been broadly utilized for real-world applications. A branch of research enhances CF methods by message passing (MP) used in graph neural networks, due to its strong capabilities of extracting knowledge from graph-structured data, like user-item bipartite graphs that naturally exist in CF. They assume that MP helps CF methods in a manner akin to its benefits for graph-based learning tasks in general (e.g., node classification). However, even though MP empirically improves CF, whether or not this assumption is correct still needs verification. To address this gap, we formally investigate why MP helps CF from multiple perspectives and show that many assumptions made by previous works are not entirely accurate. With our curated ablation studies and theoretical analyses, we discover that (i) MP improves the CF performance primarily by additional representations passed from neighbors during the forward pass instead of additional gradient updates to neighbor representations during the model back-propagation and (ii) MP usually helps low-degree nodes more than high-degree nodes. Utilizing these novel findings, we present Test-time Aggregation for Collaborative Filtering, namely TAG-CF, a test-time augmentation framework that only conducts MP once at inference time. The key novelty of TAG-CF is that it effectively utilizes graph knowledge while circumventing most of the notorious computational overheads of MP. Besides, TAG-CF is extremely versatile and can be used as a plug-and-play module to enhance representations trained by different CF supervision signals. Evaluated on six datasets (i.e., five academic benchmarks and one real-world industrial dataset), TAG-CF consistently improves the recommendation performance of CF methods without graph by up to 39.2% on cold users and 31.7% on all users, with little to no extra computational overheads. Furthermore, compared with trending graph-enhanced CF methods, TAG-CF delivers comparable or even better performance with less than 1% of their total training times. Our code is publicly available at https://github.com/snap-research/Test-time-Aggregation-for-CF.
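The idea of conducting message passing only once at inference time can be illustrated with a minimal sketch. It is not the TAG-CF implementation: the function name `aggregate_once` is hypothetical, and the choice of adding an unweighted mean of neighbor embeddings is an assumption made for illustration, since the abstract does not specify the normalization.

```python
def aggregate_once(user_emb, item_emb, interactions):
    """Illustrative sketch: one round of neighbor aggregation over trained CF embeddings.

    `user_emb` and `item_emb` are lists of equal-length float vectors produced by
    any CF model trained without a graph; `interactions` is a list of
    (user_id, item_id) pairs. Uniform mean aggregation is an assumption here.
    """
    user_items = {u: [] for u in range(len(user_emb))}
    item_users = {i: [] for i in range(len(item_emb))}
    for u, i in interactions:
        user_items[u].append(i)
        item_users[i].append(u)

    def add_neighbor_mean(own, neighbor_vectors):
        # Nodes without neighbors keep their original representation.
        if not neighbor_vectors:
            return list(own)
        count = len(neighbor_vectors)
        return [
            own[d] + sum(vec[d] for vec in neighbor_vectors) / count
            for d in range(len(own))
        ]

    # Both sides aggregate from the ORIGINAL embeddings, i.e. a single MP step.
    new_user = [
        add_neighbor_mean(user_emb[u], [item_emb[i] for i in user_items[u]])
        for u in range(len(user_emb))
    ]
    new_item = [
        add_neighbor_mean(item_emb[i], [user_emb[u] for u in item_users[i]])
        for i in range(len(item_emb))
    ]
    return new_user, new_item


users = [[1.0, 0.0], [0.0, 1.0]]
items = [[2.0, 2.0], [4.0, 0.0]]
new_users, new_items = aggregate_once(users, items, [(0, 0), (0, 1)])
```

Because no gradients are involved, a step like this runs after training is finished and leaves the training procedure of the underlying CF model untouched.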

IROS Conference 2024 Conference Paper

Similarity Distance-Based Label Assignment for Tiny Object Detection

Shuohao Shi
Qiang Fang
Xin Xu
Tong Zhao

Tiny object detection is becoming one of the most challenging tasks in computer vision because of the limited object size and lack of information. The label assignment strategy is a key factor affecting the accuracy of object detection. Although there are some effective label assignment strategies for tiny objects, most of them focus on reducing the sensitivity to the bounding boxes to increase the number of positive samples and have some fixed hyperparameters that need to be set. However, more positive samples may not necessarily lead to better detection results; in fact, excessive positive samples may lead to more false positives. In this paper, we introduce a simple but effective strategy named the Similarity Distance (SimD) to evaluate the similarity between bounding boxes. This proposed strategy not only considers both location and shape similarity but also learns hyperparameters adaptively, ensuring that it can adapt to different datasets and various object sizes in a dataset. Our approach can be simply applied in common anchor-based detectors in place of the IoU for label assignment and Non Maximum Suppression (NMS). Extensive experiments on four mainstream tiny object detection datasets demonstrate the superior performance of our method; in particular, on AI-TOD it surpasses the state-of-the-art competitors by 1.8 AP points overall and by 4.1 AP points on very tiny objects. Code is available at: https://github.com/cszzshi/simd.

YNIMG Journal 2024 Journal Article

VAEEG: Variational auto-encoder for extracting EEG representation

Tong Zhao
Yi Cui
Taoyun Ji
Jiejian Luo
Wenling Li
Jun Jiang
Zaifen Gao
Wenguang Hu

NeurIPS Conference 2023 Conference Paper

Data-Centric Learning from Unlabeled Graphs with Diffusion Model

Gang Liu
Eric Inae
Tong Zhao
Jiaxin Xu
Tengfei Luo
Meng Jiang

Graph property prediction tasks are important and numerous. While each task offers only a small set of labeled examples, unlabeled graphs have been collected from various sources and at a large scale. A conventional approach is training a model with the unlabeled graphs on self-supervised tasks and then fine-tuning the model on the prediction tasks. However, the self-supervised task knowledge may not align, and sometimes conflicts, with what the predictions need. In this paper, we propose to extract the knowledge underlying the large set of unlabeled graphs as a specific set of useful data points to augment each property prediction model. We use a diffusion model to fully utilize the unlabeled graphs and design two new objectives to guide the model's denoising process with each task's labeled data to generate task-specific graph examples and their labels. Experiments demonstrate that our data-centric approach performs significantly better than fifteen existing methods on fifteen tasks. The performance improvement brought by unlabeled data is visible in the form of the generated labeled examples, unlike in self-supervised learning.

NeurIPS Conference 2023 Conference Paper

Demystifying Structural Disparity in Graph Neural Networks: Can One Size Fit All?

Haitao Mao
Zhikai Chen
Wei Jin
Haoyu Han
Yao Ma
Tong Zhao
Neil Shah
Jiliang Tang

Recent studies on Graph Neural Networks (GNNs) provide both empirical and theoretical evidence supporting their effectiveness in capturing structural patterns on both homophilic and certain heterophilic graphs. Notably, most real-world homophilic and heterophilic graphs are comprised of a mixture of nodes in both homophilic and heterophilic structural patterns, exhibiting a structural disparity. However, the analysis of GNN performance with respect to nodes exhibiting different structural patterns, e.g., homophilic nodes in heterophilic graphs, remains rather limited. In the present study, we provide evidence that Graph Neural Networks (GNNs) on node classification typically perform admirably on homophilic nodes within homophilic graphs and heterophilic nodes within heterophilic graphs while struggling on the opposite node set, exhibiting a performance disparity. We theoretically and empirically identify effects of GNNs on testing nodes exhibiting distinct structural patterns. We then propose a rigorous, non-i.i.d. PAC-Bayesian generalization bound for GNNs, revealing reasons for the performance disparity, namely the aggregated feature distance and homophily ratio difference between training and testing nodes. Furthermore, we demonstrate the practical implications of our new findings via (1) elucidating the effectiveness of deeper GNNs; and (2) revealing an overlooked distribution shift factor on graph out-of-distribution problem and proposing a new scenario accordingly.

IROS Conference 2023 Conference Paper

Efficient Deep Learning of Robust, Adaptive Policies using Tube MPC-Guided Data Augmentation

Tong Zhao
Andrea Tagliabue
Jonathan P. How

The deployment of agile autonomous systems in challenging, unstructured environments requires adaptation capabilities and robustness to uncertainties. Existing robust and adaptive controllers, such as those based on model predictive control (MPC), can achieve impressive performance at the cost of heavy online onboard computations. Strategies that efficiently learn robust and onboard-deployable policies from MPC have emerged, but they still lack fundamental adaptation capabilities. In this work, we extend an existing efficient Imitation Learning (IL) algorithm for robust policy learning from MPC with the ability to learn policies that adapt to challenging model/environment uncertainties. The key idea of our approach consists in modifying the IL procedure by conditioning the policy on a learned lower-dimensional model/environment representation that can be efficiently estimated online. We tailor our approach to the task of learning an adaptive position and attitude control policy to track trajectories under challenging disturbances on a multirotor. Evaluations in simulation show that a high-quality adaptive policy can be obtained in about 1.3 hours. We additionally empirically demonstrate rapid adaptation to in- and out-of-training-distribution uncertainties, achieving a 6.1 cm average position error under wind disturbances that correspond to about 50% of the weight of the robot, and that are 36% larger than the maximum wind seen during training.

NeurIPS Conference 2023 Conference Paper

GraphPatcher: Mitigating Degree Bias for Graph Neural Networks via Test-time Augmentation

Mingxuan Ju
Tong Zhao
Wenhao Yu
Neil Shah
Yanfang Ye

Recent studies have shown that graph neural networks (GNNs) exhibit strong biases towards the node degree: they usually perform satisfactorily on high-degree nodes with rich neighbor information but struggle with low-degree nodes. Existing works tackle this problem by deriving either designated GNN architectures or training strategies specifically for low-degree nodes. Though effective, these approaches unintentionally create an artificial out-of-distribution scenario, where models mainly or even only observe low-degree nodes during the training, leading to a downgraded performance for high-degree nodes that GNNs originally perform well at. In light of this, we propose a test-time augmentation framework, namely GraphPatcher, to enhance test-time generalization of any GNNs on low-degree nodes. Specifically, GraphPatcher iteratively generates virtual nodes to patch artificially created low-degree nodes via corruptions, aiming at progressively reconstructing target GNN's predictions over a sequence of increasingly corrupted nodes. Through this scheme, GraphPatcher not only learns how to enhance low-degree nodes (when the neighborhoods are heavily corrupted) but also preserves the original superior performance of GNNs on high-degree nodes (when lightly corrupted). Additionally, GraphPatcher is model-agnostic and can also mitigate the degree bias for either self-supervised or supervised GNNs. Comprehensive experiments are conducted over seven benchmark datasets and GraphPatcher consistently enhances common GNNs' overall performance by up to 3.6% and low-degree performance by up to 6.5%, significantly outperforming state-of-the-art baselines. The source code is publicly available at https://github.com/jumxglhf/GraphPatcher.

AAAI Conference 2023 Conference Paper

RLEKF: An Optimizer for Deep Potential with Ab Initio Accuracy

Siyu Hu
Wentao Zhang
Qiuchen Sha
Feng Pan
Lin-Wang Wang
Weile Jia
Guangming Tan
Tong Zhao

It is imperative to accelerate the training of neural network force fields such as Deep Potential, which usually requires thousands of images based on first-principles calculation and a couple of days to generate an accurate potential energy surface. To this end, we propose a novel optimizer named reorganized layer extended Kalman filtering (RLEKF), an optimized version of global extended Kalman filtering (GEKF) with a strategy of splitting big and gathering small layers to overcome the O(N^2) computational cost of GEKF. This strategy provides an approximation of the dense weights error covariance matrix with a sparse diagonal block matrix for GEKF. We implement both RLEKF and the baseline Adam in our alphaDynamics package and numerical experiments are performed on 13 unbiased datasets. Overall, RLEKF converges faster with slightly better accuracy. For example, a test on a typical system, bulk copper, shows that RLEKF converges faster by both the number of training epochs (×11.67) and wall-clock time (×1.19). Besides, we theoretically prove that the updates of weights converge and thus guard against the exploding gradient problem. Experimental results verify that RLEKF is not sensitive to the initialization of weights. RLEKF sheds light on other AI-for-science applications where training a large neural network (with tons of thousands of parameters) is a bottleneck.

IROS Conference 2022 Conference Paper

Beyond the Limit Automated Driving with Performance Constrained Reachability Analysis

Tong Zhao
Ekim Yurtsever
Giorgio Rizzoni

Professional human drivers usually have more than one driving strategy to handle incoming traffic situations. These different strategies activate different performance characteristics of the vehicle, enabling the driver to minimize the risk in a variety of situations by optimizing the strategy selection. In the same spirit, we define a novel concept of strategy-wise performance metric and creatively combine this performance metric with reachability analysis to evaluate candidate control strategies. Such a performance evaluation produces solid guarantees on which strategies will not qualify for the given traffic scenario. Then we automate the strategy selection process by weighing and minimizing the overall risk of each strategy candidate.

TIST Journal 2022 Journal Article

Federated Dynamic Graph Neural Networks with Secure Aggregation for Video-based Distributed Surveillance

Meng Jiang
Taeho Jung
Ryan Karl
Tong Zhao

Distributed surveillance systems have the ability to detect, track, and snapshot objects moving around in a certain space. The systems generate video data from multiple personal devices or street cameras. Intelligent video-analysis models are needed to learn dynamic representations of the objects for detection and tracking. Can we exploit the structural and dynamic information without storing the spatiotemporal video data at a central server that leads to a violation of user privacy? In this work, we introduce Federated Dynamic Graph Neural Network (Feddy), a distributed and secured framework to learn the object representations from graph sequences: (1) It aggregates structural information from nearby objects in the current graph as well as dynamic information from those in the previous graph. It uses a self-supervised loss of predicting the trajectories of objects. (2) It is trained in a federated learning manner. The centrally located server sends the model to user devices. Local models on the respective user devices learn and periodically send their learning to the central server without ever exposing the user’s data to the server. (3) Studies showed that the aggregated parameters could be inspected though decrypted when broadcast to clients for model synchronizing, after the server performed a weighted average. We design an appropriate aggregation mechanism of secure aggregation primitives that can protect the security and privacy in federated learning with scalability. Experiments on four video camera datasets as well as simulation demonstrate that Feddy achieves great effectiveness and security.

AAAI Conference 2021 Conference Paper

Data Augmentation for Graph Neural Networks

Tong Zhao
Yozen Liu
Leonardo Neves
Oliver Woodford
Meng Jiang
Neil Shah

Data augmentation has been widely used to improve generalizability of machine learning models. However, comparatively little work studies data augmentation for graphs. This is largely due to the complex, non-Euclidean structure of graphs, which limits possible manipulation operations. Augmentation operations commonly used in vision and language have no analogs for graphs. Our work studies graph data augmentation for graph neural networks (GNNs) in the context of improving semi-supervised node-classification. We discuss practical and theoretical motivations, considerations and strategies for graph data augmentation. Our work shows that neural edge predictors can effectively encode class-homophilic structure to promote intra-class edges and demote inter-class edges in a given graph structure, and our main contribution introduces the GAUG graph data augmentation framework, which leverages these insights to improve performance in GNN-based node classification via edge prediction. Extensive experiments on multiple benchmarks show that augmentation via GAUG improves performance across GNN architectures and datasets.

AAAI Conference 2020 Conference Paper

Fast and Efficient Boolean Matrix Factorization by Geometric Segmentation

Changlin Wan
Wennan Chang
Tong Zhao
Mengya Li
Sha Cao
Chi Zhang

Boolean matrices have been used to represent digital information in many fields, including bank transactions, crime records, natural language processing, protein-protein interaction, etc. Boolean matrix factorization (BMF) aims to find an approximation of a binary matrix as the Boolean product of two low rank Boolean matrices, which could generate a vast amount of information for the patterns of relationships between the features and samples. Inspired by binary matrix permutation theories and geometric segmentation, we developed a fast and efficient BMF approach, called MEBF (Median Expansion for Boolean Factorization). Overall, MEBF adopted a heuristic approach to locate binary patterns presented as submatrices that are dense in 1's. At each iteration, MEBF permutes the rows and columns such that the permuted matrix is approximately Upper Triangular-Like (UTL) with the so-called Simultaneous Consecutive-ones Property (SC1P). The largest submatrix dense in 1's would lie on the upper triangular area of the permuted matrix, and its location was determined based on a geometric segmentation of a triangle. We compared MEBF with other state-of-the-art approaches on data scenarios with different density and noise levels. MEBF demonstrated superior performance in lower reconstruction error and higher computational efficiency, as well as more accurate density patterns than popular methods such as ASSO, PANDA and Message Passing. We demonstrated the application of MEBF on both binary and non-binary data sets, and revealed its further potential in knowledge retrieval and data denoising.

NeurIPS Conference 2020 Conference Paper

Geometric All-way Boolean Tensor Decomposition

Changlin Wan
Wennan Chang
Tong Zhao
Sha Cao
Chi Zhang

Boolean tensors have been broadly utilized in representing high dimensional logical data collected on spatial, temporal and/or other relational domains. Boolean Tensor Decomposition (BTD) factorizes a binary tensor into the Boolean sum of multiple rank-1 tensors, which is an NP-hard problem. Existing BTD methods have been limited by their high computational cost in applications to large scale or higher order tensors. In this work, we presented a computationally efficient BTD algorithm, namely Geometric Expansion for all-order Tensor Factorization (GETF), that sequentially identifies the rank-1 basis components for a tensor from a geometric perspective. We conducted rigorous theoretical analysis on the validity as well as algorithmic efficiency of GETF in decomposing all-order tensors. Experiments on both synthetic and real-world data demonstrated that GETF has significantly improved performance in reconstruction accuracy and extraction of latent structures, and that it is an order of magnitude faster than other state-of-the-art methods.

PDF Details

IJCAI Conference 2018 Conference Paper

Beyond Polarity: Interpretable Financial Sentiment Analysis with Hierarchical Query-driven Attention

Ling Luo
Xiang Ao
Feiyang Pan
Jin Wang
Tong Zhao
Ningzi Yu
Qing He

Sentiment analysis has played a significant role in financial applications in recent years. The informational and emotive aspects of news texts may affect the prices, volatilities, trade volumes, and even potential risks of financial subjects. Previous studies in this field mainly focused on identifying polarity (e.g., positive or negative). However, because financial decisions broadly require justification, polarity alone cannot provide enough evidence for human decision-making processes. Hence an explainable solution is in urgent demand. In this paper, we present an interpretable neural network framework for financial sentiment analysis. First, we design a hierarchical model to learn the representation of a document at multiple granularities. In addition, we propose a query-driven attention mechanism to suit the unique characteristics of financial documents. With domain-specific questions provided by financial analysts, we can discover different spotlights for queries on different aspects. We conduct extensive experiments on a real-world dataset. The results demonstrate that our framework can learn a better representation of the document and unearth meaningful clues in response to different users' preferences. It also outperforms the state-of-the-art methods on sentiment prediction for financial documents.

PDF Details

IJCAI Conference 2016 Conference Paper

Modeling the Homophily Effect between Links and Communities for Overlapping Community Detection

Hongyi Zhang
Tong Zhao
Irwin King
Michael R. Lyu

Overlapping community detection has drawn much attention recently since it allows nodes in a network to have multiple community memberships. A standard framework to deal with overlapping community detection is Matrix Factorization (MF). Although all existing MF-based approaches use links as input to identify communities, the relationship between links and communities is still under-investigated. Most of the approaches only view links as consequences of communities (community-to-link) but fail to explore how nodes' community memberships can be represented by their linked neighbors (link-to-community). In this paper, we propose a Homophily-based Nonnegative Matrix Factorization (HNMF) to model both-sided relationships between links and communities. From the community-to-link perspective, we apply a preference-based pairwise function by assuming that nodes with common communities have a higher probability to build links than those without common communities. From the link-to-community perspective, we propose a new community representation learning with network embedding by assuming that linked nodes have similar community representations. We conduct experiments on several real-world networks and the results show that our HNMF model is able to find communities with better quality compared with state-of-the-art baselines.

PDF Details

AAAI Conference 2016 Conference Paper

STELLAR: Spatial-Temporal Latent Ranking for Successive Point-of-Interest Recommendation

Shenglin Zhao
Tong Zhao
Haiqin Yang
Michael Lyu
Irwin King

Successive point-of-interest (POI) recommendation in location-based social networks (LBSNs) has become a significant task, since it helps users navigate a number of candidate POIs and provides the best POI recommendations based on users' most recent check-in knowledge. However, all existing methods for successive POI recommendation focus only on modeling the correlation between POIs based on users' check-in sequences and ignore the important fact that successive POI recommendation is a time-subtle recommendation task. In fact, even with the same previous check-in information, users would prefer different successive POIs at different times. To capture the impact of time on successive POI recommendation, in this paper we propose a spatial-temporal latent ranking (STELLAR) method to explicitly model the interactions among user, POI, and time. In particular, the proposed STELLAR model is built upon a ranking-based pairwise tensor factorization framework with fine-grained modeling of user-POI, POI-time, and POI-POI interactions for successive POI recommendation. Moreover, we propose a new interval-aware weight utility function to differentiate the correlations of successive check-ins, which breaks the time interval constraint in prior work. Evaluations on two real-world datasets demonstrate that the STELLAR model outperforms the state-of-the-art successive POI recommendation model by about 20% in Precision@5 and Recall@5.

PDF Details

ICRA Conference 2011 Conference Paper

Steady state and dynamic modeling of RO desalination modules and system using EES

Tong Zhao
Ran Niu
Ming Su
Todd Anderson

In this article, we introduce a powerful software tool, Engineering Equation Solver (EES), and apply it to create steady-state and dynamic models of a reverse osmosis (RO) desalination system. EES is a general equation-solving program that can numerically solve thousands of coupled nonlinear algebraic equations. It can also be used to solve differential equations and optimization problems. In RO desalination system design, a number of coupled differential equations related to mass balance and momentum balance need to be solved to develop a system model. Hence, applying EES to solve these differential equations is an efficient and effective way to build an RO desalination system model. Compared with MATLAB, EES has the advantages of easy programming and fast convergence. It significantly reduces the time spent on programming to solve nonlinear equations, so researchers can focus on optimal design and analysis of the RO system.

Details