Author name cluster

Laiwan Chan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers

2 author rows

UAI Conference 2019 Conference Paper

Domain Generalization via Multidomain Discriminant Analysis

Shoubo Hu
Kun Zhang 0001
Zhitang Chen
Laiwan Chan

Domain generalization (DG) aims to incorporate knowledge from multiple source domains into a single model that could generalize well on unseen target domains. This problem is ubiquitous in practice since the distributions of the target data may rarely be identical to those of the source data. In this paper, we propose Multidomain Discriminant Analysis (MDA) to address DG of classification tasks in general situations. MDA learns a domain-invariant feature transformation that aims to achieve appealing properties, including a minimal divergence among domains within each class, a maximal separability among classes, and overall maximal compactness of all classes. Furthermore, we provide the bounds on excess risk and generalization error by learning theory analysis. Comprehensive experiments on synthetic and real benchmark datasets demonstrate the effectiveness of MDA.


NeurIPS Conference 2018 Conference Paper

Causal Inference and Mechanism Clustering of A Mixture of Additive Noise Models

Shoubo Hu
Zhitang Chen
Vahid Partovi Nia
Laiwan Chan
Yanhui Geng

The inference of the causal relationship between a pair of observed variables is a fundamental problem in science, and most existing approaches are based on a single causal model. In practice, however, observations are often collected from multiple sources with heterogeneous causal models due to certain uncontrollable factors, which renders causal analysis results obtained by a single model questionable. In this paper, we generalize the Additive Noise Model (ANM) to a mixture model, which consists of a finite number of ANMs, and provide the condition of its causal identifiability. To conduct model estimation, we propose the Gaussian Process Partially Observable Model (GPPOM), and incorporate independence enforcement into it to learn the latent parameter associated with each observation. Causal inference and clustering according to the underlying generating mechanisms of the mixture model are addressed in this work. Experiments on synthetic and real data demonstrate the effectiveness of our proposed approach.


TIST Journal 2015 Journal Article

Causal Discovery on Discrete Data with Extensions to Mixture Model

Furui Liu
Laiwan Chan

In this article, we deal with the causal discovery problem on discrete data. First, we present a causal discovery method for traditional additive noise models that identifies the causal direction by analyzing the supports of the conditional distributions. Then, we present a causal mixture model to address the problem that the function transforming cause to effect varies across the observations. We propose a novel method called Support Analysis (SA) for causal discovery with the mixture model. Experiments using synthetic and real data are presented to demonstrate the performance of our proposed algorithm.


NeurIPS Conference 2012 Conference Paper

Causal discovery with scale-mixture model for spatiotemporal variance dependencies

Zhitang Chen
Kun Zhang
Laiwan Chan

In conventional causal discovery, structural equation models (SEM) are directly applied to the observed variables, meaning that the causal effect can be represented as a function of the direct causes themselves. However, in many real-world problems, there are significant dependencies in the variances or energies, which indicates that causality may take place at the level of variances or energies. In this paper, we propose a probabilistic causal scale-mixture model with spatiotemporal variance dependencies to represent a specific type of generating mechanism of the observations. In particular, the causal mechanism, including contemporaneous and temporal causal relations in variances or energies, is represented by a Structural Vector AutoRegressive model (SVAR). We prove the identifiability of this model under the non-Gaussian assumption on the innovation processes. We also propose algorithms to estimate the involved parameters and discover the contemporaneous causal structure. Experiments on synthetic and real-world data are conducted to show the applicability of the proposed model and algorithms.


TIST Journal 2012 Journal Article

Learning Causal Relations in Multivariate Time Series Data

Zhenxing Wang
Laiwan Chan

Many applications naturally involve time series data, and vector autoregression (VAR) and structural VAR (SVAR) are the dominant tools for investigating relations between variables in time series. In the first part of this work, we show that the SVAR method is incapable of identifying contemporaneous causal relations for Gaussian processes. In addition, least squares estimators become unreliable when the scales of the problems are large and observations are limited. In the remaining part, we propose an approach to apply Bayesian network learning algorithms to identify SVARs from time series data in order to capture both temporal and contemporaneous causal relations, and avoid high-order statistical tests. The difficulty of applying Bayesian network learning algorithms to time series is that the sizes of the networks corresponding to time series tend to be large, and high-order statistical tests are required by Bayesian network learning algorithms in this case. To overcome the difficulty, we show that the search space of conditioning sets d-separating two vertices should be a subset of the Markov blankets. Based on this fact, we propose an algorithm enabling us to learn Bayesian networks locally, and make the largest order of statistical tests independent of the scales of the problems. Empirical results show that our algorithm outperforms existing methods in terms of both efficiency and accuracy.
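
Illustrative sketch (not from the paper): the abstract refers to least squares estimation of VAR models. Assuming a two-variable VAR(1) of the form x[t+1] = A x[t] + noise, the least squares estimate of the lag matrix is A = (sum of x[t+1] x[t]^T) (sum of x[t] x[t]^T)^(-1). The function names `simulate_var1` and `fit_var1` are hypothetical, and the sketch covers only the temporal (lagged) coefficients, not the contemporaneous relations or the Bayesian network approach the paper proposes.

```python
def simulate_var1(a, start, steps):
    """Generate a noise-free 2-variable series x[t+1] = A x[t] (toy data only)."""
    series = [list(start)]
    for _ in range(steps):
        x = series[-1]
        series.append([
            a[0][0] * x[0] + a[0][1] * x[1],
            a[1][0] * x[0] + a[1][1] * x[1],
        ])
    return series


def fit_var1(series):
    """Least squares estimate of the 2x2 lag matrix A in x[t+1] = A x[t] + noise."""
    past, future = series[:-1], series[1:]
    # Gram matrix G = sum_t x[t] x[t]^T
    g = [[sum(p[i] * p[j] for p in past) for j in range(2)] for i in range(2)]
    # Cross moment C = sum_t x[t+1] x[t]^T
    c = [[sum(f[i] * p[j] for f, p in zip(future, past)) for j in range(2)]
         for i in range(2)]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    if abs(det) < 1e-12:
        # Too few or collinear observations: the estimate is not defined.
        raise ValueError("Gram matrix is (near) singular")
    g_inv = [[g[1][1] / det, -g[0][1] / det],
             [-g[1][0] / det, g[0][0] / det]]
    # A = C G^{-1}
    return [[sum(c[i][k] * g_inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


true_a = [[0.5, 0.2], [0.0, 0.8]]
estimate = fit_var1(simulate_var1(true_a, [1.0, 1.0], 10))
```

The singular Gram matrix check hints at the failure mode the abstract mentions: with many variables and few observations, the matrix being inverted becomes ill-conditioned and the estimate unreliable.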


JMLR Journal 2008 Journal Article

Minimal Nonlinear Distortion Principle for Nonlinear Independent Component Analysis

Kun Zhang
Laiwan Chan

It is well known that solutions to the nonlinear independent component analysis (ICA) problem are highly non-unique. In this paper we propose the "minimal nonlinear distortion" (MND) principle for tackling the ill-posedness of nonlinear ICA problems. MND prefers the nonlinear ICA solution with the estimated mixing procedure as close as possible to linear, among all possible solutions. It also helps to avoid local optima in the solutions. To achieve MND, we exploit a regularization term to minimize the mean square error between the nonlinear mixing mapping and the best-fitting linear one. The effect of MND on the inherent trivial and non-trivial indeterminacies in nonlinear ICA solutions is investigated. Moreover, we show that local MND is closely related to the smoothness regularizer penalizing large curvature, which provides another useful regularization condition for nonlinear ICA. Experiments on synthetic data show the usefulness of the MND principle for separating various nonlinear mixtures. Finally, as an application, we use nonlinear ICA with MND to separate daily returns of a set of stocks in Hong Kong, and the linear causal relations among them are successfully discovered. The resulting causal relations give some interesting insights into the stock market. Such a result cannot be achieved by linear ICA. Simulation studies also verify that when doing causality discovery, sometimes one should not ignore the nonlinear distortion in the data generation procedure, even if it is weak.
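
Illustrative sketch (not from the paper): the regularization term described above measures the mean square error between a nonlinear mapping and its best-fitting linear approximation. The one-dimensional version below is a simplification for illustration; the function name `mnd_penalty` and the choice to center both variables are assumptions, and the paper's multivariate formulation is not reproduced here.

```python
def mnd_penalty(sources, mixtures):
    """Mean square error between a sampled scalar mapping and its best linear fit.

    `sources` are inputs y_i and `mixtures` are outputs x_i = f(y_i).
    Both are centered, the least squares slope is fitted, and the mean
    squared residual is returned. A purely linear f gives a penalty of zero.
    """
    n = len(sources)
    mean_y = sum(sources) / n
    mean_x = sum(mixtures) / n
    centered_y = [y - mean_y for y in sources]
    centered_x = [x - mean_x for x in mixtures]
    # Best-fitting linear map x ~ slope * y in the least squares sense
    slope = (sum(y * x for y, x in zip(centered_y, centered_x))
             / sum(y * y for y in centered_y))
    return sum((x - slope * y) ** 2
               for y, x in zip(centered_y, centered_x)) / n


ys = [-2.0, -1.0, 0.0, 1.0, 2.0]
linear_penalty = mnd_penalty(ys, [3.0 * y + 1.0 for y in ys])
cubic_penalty = mnd_penalty(ys, [y + 0.5 * y ** 3 for y in ys])
```

Here `linear_penalty` vanishes because the mapping is already linear, while `cubic_penalty` is positive; adding such a term to a training objective would favor solutions whose mixing mapping stays close to linear.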


ICML Conference 2007 Conference Paper

Nonlinear independent component analysis with minimal nonlinear distortion

Kun Zhang 0001
Laiwan Chan


JMLR Journal 2004 Journal Article

The Minimum Error Minimax Probability Machine

Kaizhu Huang
Haiqin Yang
Irwin King
Michael R. Lyu
Laiwan Chan

We construct a distribution-free Bayes optimal classifier called the Minimum Error Minimax Probability Machine (MEMPM) in a worst-case setting, i.e., under all possible choices of class-conditional densities with a given mean and covariance matrix. By assuming no specific distributions for the data, our model is thus distinguished from traditional Bayes optimal approaches, where an assumption on the data distribution is a must. This model is extended from the Minimax Probability Machine (MPM), a recently-proposed novel classifier, and is demonstrated to be the general case of MPM. Moreover, it includes another special case named the Biased Minimax Probability Machine, which is appropriate for handling biased classification. One appealing feature of MEMPM is that it contains an explicit performance indicator, i.e., a lower bound on the worst-case accuracy, which is shown to be tighter than that of MPM. We provide conditions under which the worst-case Bayes optimal classifier converges to the Bayes optimal classifier. We demonstrate how to apply a more general statistical framework to estimate model input parameters robustly. We also show how to extend our model to nonlinear classification by exploiting kernelization techniques. A series of experiments on both synthetic data sets and real world benchmark data sets validates our proposition and demonstrates the effectiveness of our model.
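
Illustrative sketch (not from the paper): for a fixed linear decision rule a^T z = b, the worst-case probability of landing on the correct side, taken over all distributions with a given mean and covariance, has the closed form kappa^2 / (1 + kappa^2), where kappa is the signed margin divided by sqrt(a^T Sigma a). This is the standard bound used by minimax probability machines. The function names are hypothetical, and the prior weight `theta` used to combine the two per-class bounds is an assumption that the abstract does not state.

```python
def worst_case_bound(a, b, mean, cov, side):
    """Lower bound on P(side * (a^T z - b) >= 0) over all distributions
    with the given mean and (positive definite) covariance.

    side=+1 bounds P(a^T z >= b); side=-1 bounds P(a^T z <= b).
    """
    margin = side * (sum(ai * mi for ai, mi in zip(a, mean)) - b)
    if margin <= 0:
        # The class mean is on the wrong side: no guarantee is possible.
        return 0.0
    dim = len(a)
    variance = sum(a[i] * cov[i][j] * a[j]
                   for i in range(dim) for j in range(dim))
    kappa_sq = margin * margin / variance
    return kappa_sq / (1.0 + kappa_sq)


def weighted_worst_case_accuracy(a, b, class_x, class_y, theta):
    """Combine per-class bounds with an assumed prior weight theta for class x."""
    alpha = worst_case_bound(a, b, class_x[0], class_x[1], side=+1)
    beta = worst_case_bound(a, b, class_y[0], class_y[1], side=-1)
    return theta * alpha + (1.0 - theta) * beta


identity = [[1.0, 0.0], [0.0, 1.0]]
accuracy = weighted_worst_case_accuracy(
    [1.0, 0.0], 0.0,
    ([2.0, 0.0], identity),   # class x: mean and covariance
    ([-1.0, 0.0], identity),  # class y: mean and covariance
    theta=0.5,
)
```

In this toy setup the per-class bounds are 0.8 and 0.5, so the combined value is 0.65. Allowing the two bounds to differ, rather than forcing them equal, is one way to read the abstract's claim that MEMPM generalizes MPM.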
