Author name cluster

Yale Chang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers

2 author rows

NeurIPS Conference 2019 Conference Paper

Solving Interpretable Kernel Dimensionality Reduction

Chieh Wu
Jared Miller
Yale Chang
Mario Sznaier
Jennifer Dy

Kernel dimensionality reduction (KDR) algorithms find a low dimensional representation of the original data by optimizing kernel dependency measures that are capable of capturing nonlinear relationships. The standard strategy is to first map the data into a high dimensional feature space using kernels prior to a projection onto a low dimensional space. While KDR methods can be easily solved by keeping the most dominant eigenvectors of the kernel matrix, its features are no longer easy to interpret. Alternatively, Interpretable KDR (IKDR) is different in that it projects onto a subspace \textit{before} the kernel feature mapping, therefore, the projection matrix can indicate how the original features linearly combine to form the new features. Unfortunately, the IKDR objective requires a non-convex manifold optimization that is difficult to solve and can no longer be solved by eigendecomposition. Recently, an efficient iterative spectral (eigendecomposition) method (ISM) has been proposed for this objective in the context of alternative clustering. However, ISM only provides theoretical guarantees for the Gaussian kernel. This greatly constrains ISM's usage since any kernel method using ISM is now limited to a single kernel. This work extends the theoretical guarantees of ISM to an entire family of kernels, thereby empowering ISM to solve any kernel method of the same objective. In identifying this family, we prove that each kernel within the family has a surrogate $\Phi$ matrix and the optimal projection is formed by its most dominant eigenvectors. With this extension, we establish how a wide range of IKDR applications across different learning paradigms can be solved by ISM. To support reproducible results, the source code is made publicly available on \url{https: //github. com/ANONYMIZED}.

PDF Details

JMLR Journal 2017 Journal Article

A Robust-Equitable Measure for Feature Ranking and Selection

A. Adam Ding
Jennifer G. Dy
Yi Li
Yale Chang

In many applications, not all the features used to represent data samples are important. Often only a few features are relevant for the prediction task. The choice of dependence measures often affect the final result of many feature selection methods. To select features that have complex nonlinear relationships with the response variable, the dependence measure should be equitable, a concept proposed by Reshef et al. (2011); that is, the dependence measure treats linear and nonlinear relationships equally. Recently, Kinney and Atwal (2014) gave a mathematical definition of self- equitability. In this paper, we introduce a new concept of robust-equitability and identify a robust- equitable copula dependence measure, the robust copula dependence (RCD) measure. RCD is based on the $L_1$-distance of the copula density from uniform and we show that it is equitable under both equitability definitions. We also prove theoretically that RCD is much easier to estimate than mutual information. Because of these theoretical properties, the RCD measure has the following advantages compared to existing dependence measures: it is robust to different relationship forms and robust to unequal sample sizes of different features. Experiments on both synthetic and real-world data sets confirm the theoretical analysis, and illustrate the advantage of using the dependence measure RCD for feature selection. [abs] [ pdf ][ bib ] &copy JMLR 2017. ( edit, beta )

PDF Details

AAAI Conference 2017 Conference Paper

Informative Subspace Learning for Counterfactual Inference

Yale Chang
Jennifer Dy

Inferring causal relations from observational data is widely used for knowledge discovery in healthcare and economics. To investigate whether a treatment can affect an outcome of interest, we focus on answering counterfactual questions of this type: what would a patient’s blood pressure be had he/she recieved a different treatment? Nearest neighbor matching (NNM) sets the counterfactual outcome of any treatment (control) sample to be equal to the factual outcome of its nearest neighbor in the control (treatment) group. Although being simple, ﬂexible and interpretable, most NNM approaches could be easily misled by variables that do not affect the outcome. In this paper, we address this challenge by learning subspaces that are predictive of the outcome variable for both the treatment group and control group. Applying NNM in the learned subspaces leads to more accurate estimation of the counterfactual outcomes and therefore treatment effects. We introduce an informative subspace learning algorithm by maximizing the nonlinear dependence between the candidate subspace and the outcome variable measured by the Hilbert-Schmidt Independence Criterion (HSIC). We propose a scalable estimator of HSIC, called HSIC-RFF that reduces the quadratic computational and storage complexities (with respect to the sample size) of the naive HSIC implementation to linear through constructing random Fourier features. We also prove an upper bound on the approximation error of the HSIC-RFF estimator. Experimental results on simulated datasets and real-world datasets demonstrate our proposed approach outperforms existing NNM approaches and other commonly used regression-based methods for counterfactual inference.

PDF Details

ICML Conference 2017 Conference Paper

Multiple Clustering Views from Multiple Uncertain Experts

Yale Chang
Junxiang Chen
Michael H. Cho
Peter J. Castaldi
Edwin K. Silverman
Jennifer G. Dy

Expert input can improve clustering performance. In today’s collaborative environment, the availability of crowdsourced multiple expert input is becoming common. Given multiple experts’ inputs, most existing approaches can only discover one clustering structure. However, data is multi-faced by nature and can be clustered in different ways (also known as views). In an exploratory analysis problem where ground truth is not known, different experts may have diverse views on how to cluster data. In this paper, we address the problem on how to automatically discover multiple ways to cluster data given potentially diverse inputs from multiple uncertain experts. We propose a novel Bayesian probabilistic model that automatically learns the multiple expert views and the clustering structure associated with each view. The benefits of learning the experts’ views include 1) enabling the discovery of multiple diverse clustering structures, and 2) improving the quality of clustering solution in each view by assigning higher weights to experts with higher confidence. In our approach, the expert views, multiple clustering structures and expert confidences are jointly learned via variational inference. Experimental results on synthetic datasets, benchmark datasets and a real-world disease subtyping problem show that our proposed approach outperforms competing baselines, including meta clustering, semi-supervised clustering, semi-crowdsourced clustering and consensus clustering.

Details