Arrow Research — Search

Author name cluster

Hui Qian

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

27 papers
1 author row

Possible papers (27)

AAAI Conference 2026 Conference Paper

Mass Concept Erasure in Diffusion Models with Concept Hierarchy

  • Jiahang Tu
  • Ye Li
  • Yiming Wu
  • Hanbin Zhao
  • Chao Zhang
  • Hui Qian

The success of diffusion models has raised concerns about the generation of unsafe or harmful content, prompting concept erasure approaches that fine-tune modules to suppress specific concepts while preserving general generative capabilities. However, as the number of erased concepts grows, these methods often become inefficient and ineffective, since each concept requires a separate set of fine-tuned parameters and may degrade the overall generation quality. In this work, we propose a supertype-subtype concept hierarchy that organizes erased concepts into a parent–child structure. Each erased concept is treated as a child node, and semantically related concepts (e.g., macaw and bald eagle) are grouped under a shared parent node, referred to as a supertype concept (e.g., bird). Rather than erasing concepts individually, we introduce an effective and efficient group-wise suppression method, where semantically similar concepts are grouped and erased jointly by sharing a single set of learnable parameters. During the erasure phase, standard diffusion regularization is applied to preserve the denoising process in unmasked regions. To mitigate the degradation of supertype generation caused by excessive erasure of semantically related subtypes, we propose a novel method called Supertype-Preserving Low-Rank Adaptation (SuPLoRA), which encodes the supertype concept information in the frozen down-projection matrix and updates only the up-projection matrix during erasure. Theoretical analysis demonstrates the effectiveness of SuPLoRA in mitigating generation performance degradation. We construct a more challenging benchmark that requires simultaneous erasure of concepts across diverse domains, including celebrities, objects, and pornographic content. Comprehensive experiments demonstrate that our method achieves a superior balance between effective multi-concept erasure and the preservation of desirable generative performance.
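
A minimal PyTorch sketch of the SuPLoRA idea described above: the down-projection is frozen (imagined here as encoding supertype-concept directions; random initialization stands in for that detail, which the paper specifies) and only the up-projection receives gradients during erasure. Names and initialization are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn

class SuPLoRALinear(nn.Module):
    """Sketch: LoRA whose down-projection stays frozen during erasure."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)
        # Frozen down-projection: a stand-in for the supertype encoding.
        self.down = nn.Parameter(torch.randn(rank, base.in_features) / rank,
                                 requires_grad=False)
        # Trainable up-projection, zero-initialized so erasure starts
        # from the unmodified base model.
        self.up = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.down.T) @ self.up.T
```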

NeurIPS Conference 2025 Conference Paper

Rebalancing Return Coverage for Conditional Sequence Modeling in Offline Reinforcement Learning

  • Wensong Bai
  • Chufan Chen
  • Yichao Fu
  • Qihang Xu
  • Chao Zhang
  • Hui Qian

Recent advancements in offline reinforcement learning (RL) have underscored the capabilities of conditional sequence modeling (CSM), a paradigm that models the action distribution conditioned on both historical trajectories and target returns associated with each state. However, due to the imbalanced return distribution caused by suboptimal datasets, CSM grapples with a serious distributional shift problem when conditioning on high returns. While recent approaches attempt to empirically tackle this challenge through return rebalancing techniques such as weighted sampling and value-regularized supervision, the relationship between return rebalancing and the performance of CSM methods is not well understood. In this paper, we reveal that both expert-level and full-spectrum return coverage critically influence the performance and sample efficiency of CSM policies. Building on this finding, we devise a simple yet effective return-coverage rebalancing mechanism that can be seamlessly integrated into common CSM frameworks, including the most widely used one, Decision Transformer (DT). The resulting CSM algorithm, referred to as Return-rebalanced Value-regularized Decision Transformer (RVDT), integrates both implicit and explicit return-coverage rebalancing mechanisms and achieves state-of-the-art performance in the D4RL experiments.
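
A weighted-sampling flavor of return rebalancing, as mentioned above, can be sketched in NumPy: inverse-frequency weights over return bins flatten the return histogram (full-spectrum coverage) and a boost on the top quantile emphasizes near-expert trajectories (expert-level coverage). The bin count, quantile, and boost factor are assumptions for illustration, not RVDT's exact mechanism.

```python
import numpy as np

def rebalanced_sampling_weights(returns, n_bins=20, expert_quantile=0.9,
                                expert_boost=2.0):
    """Illustrative return-coverage rebalancing over trajectory returns."""
    returns = np.asarray(returns, dtype=float)
    edges = np.histogram_bin_edges(returns, bins=n_bins)
    bins = np.digitize(returns, edges[1:-1])      # bin index per trajectory
    counts = np.bincount(bins, minlength=n_bins)
    w = 1.0 / counts[bins]                        # flatten the return histogram
    w[returns >= np.quantile(returns, expert_quantile)] *= expert_boost
    return w / w.sum()                            # sampling distribution
```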

AAAI Conference 2025 Conference Paper

TextToucher: Fine-Grained Text-to-Touch Generation

  • Jiahang Tu
  • Hao Fu
  • Fengyu Yang
  • Hanbin Zhao
  • Chao Zhang
  • Hui Qian

Tactile sensation plays a crucial role in the development of multi-modal large models and embodied intelligence. To collect tactile data at as low a cost as possible, a series of studies have attempted to generate tactile images by vision-to-touch image translation. However, compared to the text modality, visual modality-driven tactile generation cannot accurately depict human tactile sensation. In this work, we analyze the characteristics of tactile images in detail from two granularities: object-level (tactile texture, tactile shape) and sensor-level (gel status). We model these granularities of information through text descriptions and propose a fine-grained Text-to-Touch generation method (TextToucher) to generate high-quality tactile samples. Specifically, we introduce a multimodal large language model to build the text sentences about object-level tactile information and employ a set of learnable text prompts to represent the sensor-level tactile information. To better guide the tactile generation process with the built text information, we fuse the dual grains of text information and explore various dual-grain text conditioning methods within the diffusion transformer architecture. Furthermore, we propose a Contrastive Text-Touch Pre-training (CTTP) metric to precisely evaluate the quality of text-driven generated tactile data. Extensive experiments demonstrate the superiority of our TextToucher method.
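
A minimal sketch of dual-grain text conditioning as the abstract describes it: object-level tokens come from an encoded caption, sensor-level information is a bank of learnable prompt embeddings, and both are combined into one conditioning sequence. The dimensions, prompt count, and plain concatenation below are assumptions, not the paper's exact fusion method.

```python
import torch
import torch.nn as nn

class DualGrainCondition(nn.Module):
    """Learnable sensor-level prompts appended to caption tokens."""

    def __init__(self, d_model: int = 768, n_sensor_prompts: int = 8):
        super().__init__()
        self.sensor_prompts = nn.Parameter(
            torch.randn(n_sensor_prompts, d_model) * 0.02)

    def forward(self, caption_tokens):          # (batch, seq, d_model)
        b = caption_tokens.shape[0]
        prompts = self.sensor_prompts.unsqueeze(0).expand(b, -1, -1)
        # One conditioning sequence for the diffusion transformer.
        return torch.cat([caption_tokens, prompts], dim=1)
```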

NeurIPS Conference 2024 Conference Paper

BELM: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models

  • Fangyikang Wang
  • Hubery Yin
  • Yuejiang Dong
  • Huminhao Zhu
  • Chao Zhang
  • Hanbin Zhao
  • Hui Qian
  • Chen Li

The inversion of diffusion model sampling, which aims to find the corresponding initial noise of a sample, plays a critical role in various tasks. Recently, several heuristic exact inversion samplers have been proposed to address the inexact inversion issue in a training-free manner. However, the theoretical properties of these heuristic samplers remain unknown and they often exhibit mediocre sampling quality. In this paper, we introduce a generic formulation of exact inversion samplers, the Bidirectional Explicit Linear Multi-step (BELM) samplers, which includes all previously proposed heuristic exact inversion samplers as special cases. The BELM formulation is derived from the variable-stepsize-variable-formula linear multi-step method via integrating a bidirectional explicit constraint. We highlight that this bidirectional explicit constraint is the key to mathematically exact inversion. We systematically investigate the Local Truncation Error (LTE) within the BELM framework and show that the existing heuristic designs of exact inversion samplers yield sub-optimal LTE. Consequently, we propose the Optimal BELM (O-BELM) sampler through the LTE minimization approach. We conduct additional analysis to substantiate the theoretical stability and global convergence property of the proposed optimal sampler. Comprehensive experiments demonstrate that our O-BELM sampler establishes the exact inversion property while achieving high-quality sampling. Additional experiments in image editing and image interpolation highlight the extensive potential of applying O-BELM in a variety of applications.
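
The bidirectional explicit constraint can be illustrated with a toy two-step relation in which the noise network is evaluated only at the interior point shared by both directions; the same linear relation then solves explicitly for either endpoint, so inversion is exact by algebra alone. The coefficients and the stand-in noise predictor below are placeholders, not the O-BELM values.

```python
import torch

def eps(x, t):                  # stand-in for a noise-prediction network
    return torch.tanh(x) * (1.0 + t)

a, b, c = 0.4, 0.7, -0.1        # placeholder multi-step coefficients

def backward_step(x_next, x_cur, t):   # sampling direction
    return a * x_next + b * x_cur + c * eps(x_cur, t)

def forward_step(x_prev, x_cur, t):    # exact inversion of the same relation
    return (x_prev - b * x_cur - c * eps(x_cur, t)) / a

x_next, x_cur, t = torch.randn(4), torch.randn(4), 0.5
x_prev = backward_step(x_next, x_cur, t)
assert torch.allclose(forward_step(x_prev, x_cur, t), x_next, atol=1e-5)
```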

NeurIPS Conference 2024 Conference Paper

D-LLM: A Token Adaptive Computing Resource Allocation Strategy for Large Language Models

  • Yikun Jiang
  • Huanyu Wang
  • Lei Xie
  • Hanbin Zhao
  • Chao Zhang
  • Hui Qian
  • John C. Lui

Large language models have shown an impressive societal impact owing to their excellent understanding and logical reasoning skills. However, such strong ability relies on a huge amount of computing resources, which makes it difficult to deploy LLMs on computing resource-constrained platforms. Currently, LLMs process each token equivalently, but we argue that not every word is equally important. Some words should not be allocated excessive computing resources, particularly dispensable terms in simple questions. In this paper, we propose a novel dynamic inference paradigm for LLMs, namely D-LLM, which adaptively allocates computing resources during token processing. We design a dynamic decision module for each transformer layer that decides whether a network unit should be executed or skipped. Moreover, we tackle the issue of adapting D-LLMs to real-world applications, specifically concerning the missing KV-cache when layers are skipped. To overcome this, we propose a simple yet effective eviction policy to exclude the skipped layers from subsequent attention calculations. The eviction policy not only enables D-LLMs to be compatible with prevalent applications but also saves considerable storage resources. Experimentally, D-LLMs show superior performance in terms of computational cost and KV storage utilization, reducing computational cost and KV storage by up to 45% on Q&A, summarization, and math-solving tasks, and by 50% on commonsense reasoning tasks.
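
A toy sketch of the two mechanisms the abstract names: a per-layer decision module gates which tokens a layer executes, and the gate's mask doubles as an eviction mask so skipped tokens never enter that layer's subsequent attention. Running the layer on all tokens and masking afterwards is only for clarity; the module names and gate design are assumptions.

```python
import torch
import torch.nn as nn

class SkipGate(nn.Module):
    """Hypothetical per-layer decision module: one execute/skip bit per token."""

    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, 1)

    def forward(self, h):                       # h: (batch, seq, d_model)
        return self.proj(h).squeeze(-1) > 0     # True = execute this layer

def dynamic_layer(layer, h, kv_mask, gate):
    execute = gate(h)                           # (batch, seq) bool
    # Skipped tokens pass through unchanged (a real system would gather
    # only the executed tokens instead of masking afterwards).
    out = torch.where(execute.unsqueeze(-1), layer(h), h)
    kv_mask = kv_mask & execute                 # evict skipped tokens' KV
    return out, kv_mask
```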

AAAI Conference 2024 Conference Paper

GAD-PVI: A General Accelerated Dynamic-Weight Particle-Based Variational Inference Framework

  • Fangyikang Wang
  • Huminhao Zhu
  • Chao Zhang
  • Hanbin Zhao
  • Hui Qian

Particle-based Variational Inference (ParVI) methods approximate the target distribution by iteratively evolving finite weighted particle systems. Recent advances in ParVI methods reveal the benefits of accelerated position update strategies and dynamic weight adjustment approaches. In this paper, we propose the first ParVI framework that possesses both accelerated position updates and dynamic weight adjustment simultaneously, named the General Accelerated Dynamic-Weight Particle-based Variational Inference (GAD-PVI) framework. Generally, GAD-PVI simulates the semi-Hamiltonian gradient flow on a novel Information-Fisher-Rao space, which yields an additional decrease in the local functional dissipation. GAD-PVI is compatible with different dissimilarity functionals and associated smoothing approaches under three information metrics. Experiments on both synthetic and real-world data demonstrate the faster convergence and reduced approximation error of GAD-PVI methods over the state-of-the-art.

NeurIPS Conference 2024 Conference Paper

Solving Zero-Sum Markov Games with Continuous State via Spectral Dynamic Embedding

  • Chenhao Zhou
  • Zebang Shen
  • Chao Zhang
  • Hanbin Zhao
  • Hui Qian

In this paper, we propose a provably efficient natural policy gradient algorithm called Spectral Dynamic Embedding Policy Optimization (SDEPO) for two-player zero-sum stochastic Markov games with continuous state space and finite action space. In the policy evaluation procedure of our algorithm, a novel kernel embedding method is employed to construct a finite-dimensional linear approximation to the state-action value function. We explicitly analyze the approximation error in policy evaluation, and show that SDEPO achieves an Õ(1/((1-γ)³ε)) last-iterate convergence to the ε-optimal Nash equilibrium, which is independent of the cardinality of the state space. The complexity result matches the best-known results for global convergence of policy gradient algorithms in the single-agent setting. Moreover, we also propose a practical variant of SDEPO to deal with continuous action spaces, and empirical results demonstrate the practical superiority of the proposed method.

AAAI Conference 2023 Conference Paper

CDMA: A Practical Cross-Device Federated Learning Algorithm for General Minimax Problems

  • Jiahao Xie
  • Chao Zhang
  • Zebang Shen
  • Weijie Liu
  • Hui Qian

Minimax problems arise in a wide range of important applications including robust adversarial learning and Generative Adversarial Network (GAN) training. Recently, algorithms for minimax problems in the Federated Learning (FL) paradigm have received considerable interest. Existing federated algorithms for general minimax problems require full aggregation (i.e., aggregation of local model information from all clients) in each training round. Thus, they are inapplicable to an important setting of FL known as the cross-device setting, which involves numerous unreliable mobile/IoT devices. In this paper, we develop the first practical algorithm, named CDMA, for general minimax problems in the cross-device FL setting. CDMA is based on a Start-Immediately-With-Enough-Responses mechanism, in which the server first signals a subset of clients to perform local computation and then starts to aggregate the local results reported by clients once it receives responses from enough clients in each round. With this mechanism, CDMA is resilient to low client availability. In addition, CDMA incorporates a lightweight global correction into the local update steps of clients, which mitigates the impact of slow network connections. We establish theoretical guarantees for CDMA under different choices of hyperparameters and conduct experiments on AUC maximization, robust adversarial network training, and GAN training tasks. Theoretical and experimental results demonstrate the efficiency of CDMA.
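
The Start-Immediately-With-Enough-Responses round can be simulated in a few lines: signal a subset of clients, then aggregate as soon as a quorum of responses arrives, so stragglers and dropped devices never block the round. All client methods and the aggregator below (start_local_update, has_result, result, average_deltas) are assumed names for illustration.

```python
import random
import time

def cdma_round(server_model, clients, signal_fraction=0.3, quorum=8,
               timeout_s=5.0):
    """Illustrative round of aggregate-on-quorum cross-device training."""
    pending = set(random.sample(clients, int(signal_fraction * len(clients))))
    for c in pending:
        c.start_local_update(server_model)              # assumed client API
    responses, deadline = [], time.time() + timeout_s
    while len(responses) < quorum and time.time() < deadline:
        done = {c for c in pending if c.has_result()}   # assumed client API
        responses += [c.result() for c in done]
        pending -= done
    return average_deltas(responses)                    # assumed aggregator
```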

AAAI Conference 2023 Conference Paper

Towards Optimal Randomized Strategies in Adversarial Example Game

  • Jiahao Xie
  • Chao Zhang
  • Weijie Liu
  • Wensong Bai
  • Hui Qian

The vulnerability of deep neural network models to adversarial example attacks is a practical challenge in many artificial intelligence applications. A recent line of work shows that the use of randomization in adversarial training is the key to finding optimal strategies against adversarial example attacks. However, in a fully randomized setting where both the defender and the attacker can use randomized strategies, there is no efficient algorithm for finding such an optimal strategy. To fill the gap, we propose the first algorithm of its kind, called FRAT, which models the problem with a new infinite-dimensional continuous-time flow on probability distribution spaces. FRAT maintains a lightweight mixture of models for the defender, with the flexibility to efficiently update mixing weights and model parameters at each iteration. Furthermore, FRAT utilizes lightweight sampling subroutines to construct a random strategy for the attacker. We prove that the continuous-time limit of FRAT converges to a mixed Nash equilibrium in a zero-sum game formed by a defender and an attacker. Experimental results also demonstrate the efficiency of FRAT on the CIFAR-10 and CIFAR-100 datasets.

IJCAI Conference 2022 Conference Paper

DPVI: A Dynamic-Weight Particle-Based Variational Inference Framework

  • Chao Zhang
  • Zhijian Li
  • Xin Du
  • Hui Qian

The recently developed Particle-based Variational Inference (ParVI) methods drive the empirical distribution of a set of fixed-weight particles towards a given target distribution by iteratively updating the particles' positions. However, the fixed-weight restriction greatly confines the empirical distribution's approximation ability, especially when the particle number is limited. In this paper, we propose to dynamically adjust particles' weights according to a Fisher-Rao reaction flow. We develop a general Dynamic-weight Particle-based Variational Inference (DPVI) framework according to a novel continuous composite flow, which evolves the positions and weights of particles simultaneously. We show that the mean-field limit of our composite flow is actually a Wasserstein-Fisher-Rao gradient flow of the associated dissimilarity functional. By using different finite-particle approximations in our general framework, we derive several efficient DPVI algorithms. The empirical results demonstrate the superiority of our derived DPVI algorithms over their fixed-weight counterparts.
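
A minimal sketch of the dynamic-weight half of such a scheme: a discretized Fisher-Rao reaction step grows or shrinks each particle's weight according to how its first-variation value compares with the weighted average, then renormalizes onto the simplex. The exponential-Euler discretization and step size are assumptions for illustration.

```python
import numpy as np

def fisher_rao_weight_update(weights, first_variation, step=0.1):
    """One reaction step; first_variation[i] is the functional derivative
    of the dissimilarity evaluated at particle i."""
    drift = first_variation - np.dot(weights, first_variation)  # centered term
    weights = weights * np.exp(-step * drift)  # shrink above-average particles
    return weights / weights.sum()             # stay on the probability simplex
```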

AAAI Conference 2022 Conference Paper

From One to All: Learning to Match Heterogeneous and Partially Overlapped Graphs

  • Weijie Liu
  • Hui Qian
  • Chao Zhang
  • Jiahao Xie
  • Zebang Shen
  • Nenggan Zheng

Recent years have witnessed a flurry of research activity in graph matching, which aims at finding the correspondence of nodes across two graphs and lies at the heart of many artificial intelligence applications. However, matching heterogeneous graphs with partial overlap remains a challenging problem in real-world applications. This paper proposes the first practical learning-to-match method to meet this challenge. The proposed unsupervised method adopts a novel partial optimal transport paradigm to learn a transport plan and node embeddings simultaneously. In a from-one-to-all manner, the entire learning procedure is decomposed into a series of easy-to-solve sub-procedures, each of which only handles the alignment of a single type of nodes. A mechanism for searching the transport mass is also proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art graph matching methods.

AAAI Conference 2021 Conference Paper

A Hybrid Stochastic Gradient Hamiltonian Monte Carlo Method

  • Chao Zhang
  • Zhijian Li
  • Zebang Shen
  • Jiahao Xie
  • Hui Qian

Recent theoretical analyses reveal that existing Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods need large mini-batches of samples (exponentially dependent on the dimension) to reduce the mean square error of gradient estimates and ensure non-asymptotic convergence guarantees when the target distribution has a nonconvex potential function. In this paper, we propose a novel SG-MCMC algorithm, called the Hybrid Stochastic Gradient Hamiltonian Monte Carlo (HSG-HMC) method, which needs merely one sample per iteration and possesses a simple structure with only one hyperparameter. This improvement leverages a hybrid stochastic gradient estimator that exploits historical stochastic gradient information to control the mean square error. Theoretical analyses show that our method obtains the best-known overall sample complexity to achieve ε-accuracy in terms of the 2-Wasserstein distance for sampling from distributions with nonconvex potential functions. Empirical studies on both simulated and real-world datasets demonstrate the advantage of our method.
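
An estimator of the kind described, blending the fresh stochastic gradient with a recursively corrected previous estimate from a single drawn sample per iteration, can be sketched as follows. The STORM-style form and blend weight are assumptions about the general technique, not necessarily the paper's exact estimator.

```python
def hybrid_gradient(grad, sample, x_t, x_prev, v_prev, alpha=0.1):
    """grad(x, sample): stochastic gradient of the potential at x for one
    data sample; the same sample is evaluated at x_t and x_prev, which is
    what keeps the estimator's mean square error under control."""
    g_t = grad(x_t, sample)
    correction = v_prev + g_t - grad(x_prev, sample)
    return alpha * g_t + (1 - alpha) * correction
```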

IJCAI Conference 2021 Conference Paper

SHPOS: A Theoretical Guaranteed Accelerated Particle Optimization Sampling Method

  • Zhijian Li
  • Chao Zhang
  • Hui Qian
  • Xin Du
  • Lingwei Peng

Recently, the Stochastic Particle Optimization Sampling (SPOS) method was proposed to solve the particle-collapsing pitfall of deterministic particle-based Variational Inference methods by utilizing stochastic overdamped Langevin dynamics to enhance exploration. In this paper, we propose an accelerated particle optimization sampling method called Stochastic Hamiltonian Particle Optimization Sampling (SHPOS). Compared to the first-order dynamics used in SPOS, SHPOS adopts augmented second-order dynamics, which involve an extra momentum term to achieve acceleration. We establish a non-asymptotic convergence analysis for SHPOS, and show that it enjoys a faster convergence rate than SPOS. Besides, we also propose a variance-reduced stochastic gradient variant of SHPOS for tasks with large-scale datasets and complex models. Experiments on both synthetic and real data validate our theory and demonstrate the superiority of SHPOS over the state-of-the-art.
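
For intuition, one Euler step of generic second-order (underdamped) Langevin dynamics with a momentum variable looks like the sketch below; the actual SHPOS dynamics additionally carry the particle-interaction term inherited from SPOS, which is omitted here, and the step size and friction are illustrative.

```python
import numpy as np

def second_order_langevin_step(x, v, grad_U, eta=1e-2, gamma=1.0, rng=None):
    """One momentum-augmented Langevin update for a single particle."""
    rng = rng or np.random.default_rng()
    noise = rng.standard_normal(np.shape(v))
    v = v - eta * (gamma * v + grad_U(x)) + np.sqrt(2.0 * gamma * eta) * noise
    x = x + eta * v
    return x, v
```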

IJCAI Conference 2020 Conference Paper

Accelerating Stratified Sampling SGD by Reconstructing Strata

  • Weijie Liu
  • Hui Qian
  • Chao Zhang
  • Zebang Shen
  • Jiahao Xie
  • Nenggan Zheng

In this paper, a novel stratified sampling strategy is designed to accelerate mini-batch SGD. We derive a new iteration-dependent surrogate which bounds the stochastic variance from above. To keep the strata minimizing this surrogate with high probability, a stochastic stratifying algorithm is adopted in an adaptive manner; that is, in each iteration, strata are reconstructed only if an easily verifiable condition is met. Based on this novel sampling strategy, we propose an accelerated mini-batch SGD algorithm named SGD-RS. Our theoretical analysis shows that the convergence rate of SGD-RS is superior to the state-of-the-art. Numerical experiments corroborate our theory and demonstrate that SGD-RS achieves at least 3.48-times speed-ups compared to vanilla mini-batch SGD.
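
The stratified estimator underneath such a scheme can be sketched as follows: draw within each stratum and combine per-stratum mean gradients with size weights, which stays unbiased and has lower variance than uniform sampling when strata are internally homogeneous. The stratum layout and per-stratum batch size are assumptions; reconstructing the strata when the paper's verifiable condition fires would happen outside this function.

```python
import numpy as np

def stratified_gradient(grad_i, strata, per_stratum=4, rng=None):
    """grad_i(j): gradient for sample j; strata: list of index lists."""
    rng = rng or np.random.default_rng()
    n = sum(len(s) for s in strata)
    g = 0.0
    for s in strata:
        idx = rng.choice(s, size=per_stratum, replace=True)
        # Size-weighted per-stratum mean keeps the estimator unbiased.
        g = g + (len(s) / n) * np.mean([grad_i(j) for j in idx], axis=0)
    return g
```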

AAAI Conference 2020 Conference Paper

Aggregated Gradient Langevin Dynamics

  • Chao Zhang
  • Jiahao Xie
  • Zebang Shen
  • Peilin Zhao
  • Tengfei Zhou
  • Hui Qian

In this paper, we explore a general Aggregated Gradient Langevin Dynamics (AGLD) framework for Markov Chain Monte Carlo (MCMC) sampling. We investigate the non-asymptotic convergence of AGLD with a unified analysis for different data accessing (e.g., random access, cyclic access, and random reshuffle) and snapshot updating strategies, under convex and nonconvex settings respectively. It is the first time that bounds for I/O-friendly strategies such as cyclic access and random reshuffle have been established in the MCMC literature. The theoretical results also indicate that methods in AGLD possess the merits of both low per-iteration computational complexity and short mixing time. Empirical studies demonstrate that our framework allows us to derive novel schemes to generate high-quality samples for large-scale Bayesian posterior learning tasks.
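
A SAGA-style member of such a framework can be sketched in a few lines: a table stores the most recent gradient seen for each sample, one entry is refreshed per step, and injected Gaussian noise keeps the chain sampling rather than optimizing. How the index stream i is produced (random, cyclic, or reshuffled) selects the data-access strategy; the step size and table form are illustrative assumptions.

```python
import numpy as np

def aggregated_langevin_step(x, table, grad_i, i, eta=1e-3, rng=None):
    """table: (n, dim) array of the last gradient seen per sample."""
    rng = rng or np.random.default_rng()
    g_new = grad_i(x, i)
    g_est = g_new - table[i] + table.mean(axis=0)  # aggregated gradient estimate
    table[i] = g_new                               # refresh one table entry
    return x - eta * g_est + np.sqrt(2.0 * eta) * rng.standard_normal(x.shape)
```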

AAAI Conference 2020 Conference Paper

Efficient Projection-Free Online Methods with Stochastic Recursive Gradient

  • Jiahao Xie
  • Zebang Shen
  • Chao Zhang
  • Boyu Wang
  • Hui Qian

This paper focuses on projection-free methods for solving smooth Online Convex Optimization (OCO) problems. Existing projection-free methods either achieve suboptimal regret bounds or have high per-round computational costs. To fill this gap, two efficient projection-free online methods called ORGFW and MORGFW are proposed for solving stochastic and adversarial OCO problems, respectively. By employing a recursive gradient estimator, our methods achieve optimal regret bounds (up to a logarithmic factor) while possessing low per-round computational costs. Experimental results demonstrate the efficiency of the proposed methods compared to the state of the art.
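
One round of such a method can be sketched with a recursive gradient estimator followed by a single linear-minimization-oracle call, which is what makes it projection-free. The l1-ball oracle and the step-size schedules below are illustrative assumptions.

```python
import numpy as np

def l1_lmo(d, radius=1.0):
    """Linear minimization oracle over the l1 ball: one cheap coordinate pick."""
    v = np.zeros_like(d)
    k = np.argmax(np.abs(d))
    v[k] = -radius * np.sign(d[k])
    return v

def recursive_fw_round(x, x_prev, d_prev, grad, sample, t):
    rho = 1.0 / (t + 1) ** (2.0 / 3.0)                    # estimator mixing weight
    g = grad(x, sample)
    d = g + (1.0 - rho) * (d_prev - grad(x_prev, sample))  # recursive estimate
    v = l1_lmo(d)
    sigma = 1.0 / np.sqrt(t + 1)                          # Frank-Wolfe step size
    return x + sigma * (v - x), d
```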

IJCAI Conference 2018 Conference Paper

JUMP: a Jointly Predictor for User Click and Dwell Time

  • Tengfei Zhou
  • Hui Qian
  • Zebang Shen
  • Chao Zhang
  • Chengwei Wang
  • Shichen Liu
  • Wenwu Ou

With the recent proliferation of recommendation systems, there has been a lot of interest in session-based prediction methods, particularly those based on Recurrent Neural Networks (RNNs) and their variants. However, existing methods either ignore dwell time prediction, which plays an important role in measuring a user's engagement with content, or fail to process very short or noisy sessions. In this paper, we propose a joint predictor, JUMP, for both user click and dwell time in session-based settings. To map its input into a feature vector, JUMP adopts a novel three-layered RNN structure which includes a fast-slow layer for very short sessions and an attention layer for noisy sessions. Experiments demonstrate that JUMP outperforms state-of-the-art methods in both user click and dwell time prediction.

IJCAI Conference 2017 Conference Paper

Accelerated Doubly Stochastic Gradient Algorithm for Large-scale Empirical Risk Minimization

  • Zebang Shen
  • Hui Qian
  • Tongzhou Mu
  • Chao Zhang

Nowadays, algorithms with fast convergence, small memory footprints, and low per-iteration complexity are particularly favorable for artificial intelligence applications. In this paper, we propose a doubly stochastic algorithm with a novel accelerating multi-momentum technique to solve large-scale empirical risk minimization problems for learning tasks. While enjoying a provably superior convergence rate, in each iteration the algorithm only accesses a mini-batch of samples and meanwhile updates a small block of variable coordinates, which substantially reduces the amount of memory reference when both a massive sample size and ultra-high dimensionality are involved. Specifically, to obtain an ε-accurate solution, our algorithm requires only O(log(1/ε)/√ε) overall computation for the general convex case and O((n + √(nκ)) log(1/ε)) for the strongly convex case. Empirical studies on huge-scale datasets are conducted to illustrate the efficiency of our method in practice.

IJCAI Conference 2017 Conference Paper

Tensor Completion with Side Information: A Riemannian Manifold Approach

  • Tengfei Zhou
  • Hui Qian
  • Zebang Shen
  • Chao Zhang
  • Congfu Xu

By restricting the iterate to a nonlinear manifold, the recently proposed Riemannian optimization methods prove to be both efficient and effective in low-rank tensor completion problems. However, existing methods fail to exploit the easily accessible side information, due to the format mismatch. Consequently, there is still room for improvement. To fill the gap, in this paper, a novel Riemannian model is proposed to tightly integrate the original model and the side information by overcoming their inconsistency. For this model, an efficient Riemannian conjugate gradient descent solver is devised based on a new metric that captures the curvature of the objective. Numerical experiments suggest that our method is more accurate than the state-of-the-art without compromising efficiency.

IJCAI Conference 2016 Conference Paper

Adaptive Variance Reducing for Stochastic Gradient Descent

  • Zebang Shen
  • Hui Qian
  • Tengfei Zhou
  • Tongzhou Mu

Variance Reducing (VR) stochastic methods are fast-converging alternatives to classical Stochastic Gradient Descent (SGD) for solving large-scale regularized finite-sum problems, especially when a highly accurate solution is required. One critical step in VR is function sampling. State-of-the-art VR algorithms such as SVRG and SAGA employ either Uniform Probability (UP) or Importance Probability (IP), both of which are deficient in reducing the variance and hence lead to a suboptimal convergence rate. In this paper, we propose a novel sampling scheme that explicitly computes some Adaptive Probability (AP) at each iteration. Analysis shows that, equipped with AP, both SVRG and SAGA yield provably better convergence rates than the ones with UP or IP, which is confirmed in experiments. Additionally, to cut down the per-iteration computational load, an efficient variant is proposed by utilizing AP periodically, whose performance is empirically validated.
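
The adaptive-probability idea can be sketched as follows: sample each component function with probability proportional to a current bound on its gradient norm, and importance-reweight the drawn gradient so the estimator stays unbiased for the mean gradient. The norm-bound input is an assumption standing in for whatever quantity the algorithm tracks.

```python
import numpy as np

def adaptive_probabilities(grad_norm_bounds):
    p = np.asarray(grad_norm_bounds, dtype=float)
    return p / p.sum()

def sample_and_reweight(grad_i, x, p, rng=None):
    """Draw index i ~ p and return an unbiased estimate of the mean gradient."""
    rng = rng or np.random.default_rng()
    n = len(p)
    i = rng.choice(n, p=p)
    return grad_i(x, i) / (n * p[i]), i
```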

AAAI Conference 2016 Conference Paper

An Alternating Proximal Splitting Method with Global Convergence for Nonconvex Structured Sparsity Optimization

  • Shubao Zhang
  • Hui Qian
  • Xiaojin Gong

In many learning tasks with structural properties, structured sparse modeling usually leads to better interpretability and higher generalization performance. While great efforts have focused on convex regularization, recent studies show that nonconvex regularizers can outperform their convex counterparts in many situations. However, the resulting nonconvex optimization problems are still challenging, especially for structured sparsity-inducing regularizers. In this paper, we propose a splitting method for solving nonconvex structured sparsity optimization problems. The proposed method alternates between a gradient step and an easily solvable proximal step, and thus enjoys low per-iteration computational complexity. We prove that the whole sequence generated by the proposed method converges to a critical point with at least a sublinear convergence rate, relying on the Kurdyka-Łojasiewicz inequality. Experiments on both simulated and real-world data sets demonstrate the efficiency and efficacy of the proposed method.
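
The alternation the abstract describes is a proximal-gradient step; with the l1 norm standing in for the regularizer it reads as below, and a structured or nonconvex penalty would swap in its own (still easily solvable) proximal operator. The l1 choice and step size are assumptions for illustration.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def splitting_step(x, grad_f, lam=0.1, step=0.5, prox=soft_threshold):
    """Gradient step on the smooth loss, then a proximal step on the penalty."""
    return prox(x - step * grad_f(x), step * lam)
```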

IJCAI Conference 2016 Conference Paper

Constrained Preference Embedding for Item Recommendation

  • Xin Wang
  • Congfu Xu
  • Yunhui Guo
  • Hui Qian

To learn users' preferences, their feedback information is commonly modeled as scalars and integrated into matrix factorization (MF) based algorithms. In MF techniques, the preference degree is computed as the product of user and item vectors, and is thus also represented by a scalar. In this paper, by contrast, we express users' feedback as constrained vectors, and call the idea constrained preference embedding (CPE); it means that we regard users, items, and all user behavior as vectors. We find that this viewpoint is more flexible and powerful than traditional MF for item recommendation. For example, under the proposed assumption, users' heterogeneous actions can be coherently mined because all entities and actions can be transferred to a space of the same dimension. In addition, CPE is able to model feedback of uncertain preference degree. To test our assumption, we propose two models based on CPE for item recommendation, called CPE-s and CPE-ps, and show that the popular pair-wise ranking model BPR-MF can be deduced from CPE-s by some restrictions and variations. In experiments, we test CPE and the proposed algorithms and demonstrate their effectiveness.

AAAI Conference 2016 Conference Paper

Fast Hybrid Algorithm for Big Matrix Recovery

  • Tengfei Zhou
  • Hui Qian
  • Zebang Shen
  • Congfu Xu

The large-scale Nuclear Norm penalized Least Squares problem (NNLS) is frequently encountered in the estimation of low-rank structures. In this paper we accelerate the solution procedure by combining non-smooth convex optimization with smooth Riemannian methods. Our method comprises two phases. In the first phase, we use the Alternating Direction Method of Multipliers (ADMM) both to identify the fixed-rank manifold where an optimum resides and to provide an initializer for the subsequent refinement. In the second phase, two superlinearly convergent Riemannian methods, Riemannian Newton (NT) and Riemannian Conjugate Gradient descent (CG), are adopted to improve the approximation over the fixed-rank manifold. We prove that our hybrid method of ADMM and NT (HADMNT) converges to an optimum of NNLS at least quadratically. Experiments on large-scale collaborative filtering datasets demonstrate the very competitive performance of these fast hybrid methods compared to the state of the art.

IJCAI Conference 2015 Conference Paper

Simple Atom Selection Strategy for Greedy Matrix Completion

  • Zebang Shen
  • Hui Qian
  • Tengfei Zhou
  • Song Wang

In this paper we focus on the greedy matrix completion problem. A simple atom selection strategy is proposed to find the optimal atom in each iteration by alternating minimization. Based on this per-iteration strategy, we devise a greedy algorithm and establish an upper bound on the approximation error. To evaluate different weight refinement methods, several variants are designed. We prove that our algorithm and three of its variants have the property of linear convergence. Experiments on recommendation and image recovery are conducted for empirical evaluation, with promising results. The proposed algorithm takes only 700 seconds to process the Yahoo Music dataset on a PC, and achieves a root mean square error of 24.5 on the test set.
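
Atom selection by alternating minimization can be sketched as a masked power iteration: alternately refit the two factors of a rank-one atom u vᵀ against the residual on the observed entries. The zero-filled residual and fixed iteration count are simplifying assumptions, not necessarily the paper's exact subproblem.

```python
import numpy as np

def rank_one_atom(residual, mask, n_iters=20, rng=None):
    """Fit the dominant rank-one atom of the masked residual."""
    rng = rng or np.random.default_rng()
    R = residual * mask                 # observed-entry residual, zero-filled
    v = rng.standard_normal(R.shape[1])
    for _ in range(n_iters):            # alternating least-squares updates
        u = R @ v
        u /= np.linalg.norm(u) + 1e-12
        v = R.T @ u
        v /= np.linalg.norm(v) + 1e-12
    return u, v                         # the atom's weight is refined separately
```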

AAAI Conference 2014 Conference Paper

Using The Matrix Ridge Approximation to Speedup Determinantal Point Processes Sampling Algorithms

  • Shusen Wang
  • Chao Zhang
  • Hui Qian
  • Zhihua Zhang

The determinantal point process (DPP) is an important probabilistic model that has extensive applications in artificial intelligence. The exact sampling algorithm for DPPs requires the full eigenvalue decomposition of the kernel matrix, which has high time and space complexities. This prohibits the application of DPPs to large-scale datasets. Previous work has applied the Nyström method to speed up the DPP sampling algorithm, and error bounds have been established for the approximation. In this paper we employ the matrix ridge approximation (MRA) to speed up the DPP sampling algorithm, showing that our approach MRA-DPP has a stronger error bound than Nyström-DPP. In certain circumstances our MRA-DPP is provably exact, whereas Nyström-DPP is far from the ground truth. Finally, experiments on several real-world datasets show that our MRA-DPP is more accurate than the other approximation approaches.

AAAI Conference 2013 Conference Paper

A Concave Conjugate Approach for Nonconvex Penalized Regression with the MCP Penalty

  • Shubao Zhang
  • Hui Qian
  • Wei Chen
  • Zhihua Zhang

The minimax concave penalty (MCP) has been demonstrated to be effective in nonconvex penalization for feature selection. In this paper we propose a novel construction approach for MCP. In particular, we show that MCP can be derived from a concave conjugate of the Euclidean distance function. This construction approach in turn leads us to an augmented Lagrange multiplier method for solving the penalized regression problem with MCP. In our method each tuning parameter corresponds to a feature, and these tuning parameters can be automatically updated. We also develop a d.c. (difference of convex functions) programming approach for the penalized regression problem. We find that the augmented Lagrange multiplier method degenerates into the d.c. programming method under specific conditions. Experimental analysis is conducted on a set of simulated data, and the results are encouraging.
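
For reference, the standard scalar form of the MCP with regularization parameter λ and concavity parameter γ (Zhang, 2010) is:

```latex
P_{\gamma}(t;\lambda) =
\begin{cases}
\lambda|t| - \dfrac{t^{2}}{2\gamma}, & |t| \le \gamma\lambda, \\[4pt]
\dfrac{\gamma\lambda^{2}}{2}, & |t| > \gamma\lambda.
\end{cases}
```

It behaves like the ℓ1 penalty near zero and flattens to a constant for large coefficients, which is what removes the ℓ1 penalty's estimation bias on large signals.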

AAAI Conference 2013 Conference Paper

Large-Scale Hierarchical Classification via Stochastic Perceptron

  • Dehua Liu
  • Bojun Tu
  • Hui Qian
  • Zhihua Zhang

Hierarchical classification (HC) plays a significant role in machine learning and data mining. However, most state-of-the-art HC algorithms suffer from high computational costs. To improve efficiency, we propose a stochastic perceptron (SP) algorithm in the large-margin framework. In particular, a stochastic choice procedure is devised to decide the direction of the next iteration. We prove that after finitely many iterations the SP algorithm yields a sub-optimal solution with high probability if the input instances are separable. For large-scale and high-dimensional data sets, we reformulate SP into a kernel version (KSP), which dramatically reduces the memory space needed. The KSP algorithm has the merit of low space complexity as well as low time complexity. The experimental results show that our KSP approach achieves almost the same accuracy as contemporary algorithms on real-world data sets, but with much less CPU running time.