Arrow Research search

Author name cluster

Lifeng Zhang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
1 author row

Possible papers

4

AAAI Conference 2022 Conference Paper

Categorical Neighbour Correlation Coefficient (CnCor) for Detecting Relationships between Categorical Variables

  • Lifeng Zhang
  • Shimo Yang
  • Hongxun Jiang

Categorical data is common and, however, special in that its possible values exist only on a nominal scale so that many statistical operations such as mean, variance, and covariance become not applicable. Following the basic idea of the neighbour correlation coefficient (nCor), in this study, we propose a new measure named the categorical nCor (CnCor) to examine the association between categorical variables through using indicator functions to reform the distance metric and product-moment correlation coefficient. The proposed measure is easy to compute, and enables a direct test of statistical dependence without the need of converting the qualitative variables to quantitative ones. Compared with previous approaches, it is much more robust and effective in dealing with multi-categorical target variables especially when highly nonlinear relationships occur in the multivariate case. We also applied the CnCor to implementing feature selection by the scheme of backward elimination. Finally, extensive experiments performed on both synthetic and real-world datasets are conducted to demonstrate the outstanding performance of the proposed methods, and draw comparisons with state-ofthe-art association measures and feature selection algorithms.

NeurIPS Conference 2021 Conference Paper

Absolute Neighbour Difference based Correlation Test for Detecting Heteroscedastic Relationships

  • Lifeng Zhang

It is a challenge to detect complicated data relationships thoroughly. Here, we propose a new statistical measure, named the absolute neighbour difference based neighbour correlation coefficient, to detect the associations between variables through examining the heteroscedasticity of the unpredictable variation of dependent variables. Different from previous studies, the new method concentrates on measuring nonfunctional relationships rather than functional or mixed associations. Either used alone or in combination with other measures, it enables not only a convenient test of heteroscedasticity, but also measuring functional and nonfunctional relationships separately that obviously leads to a deeper insight into the data associations. The method is concise and easy to implement that does not rely on explicitly estimating the regression residuals or the dependencies between variables so that it is not restrict to any kind of model assumption. The mechanisms of the correlation test are proved in theory and demonstrated with numerical analyses.

AAAI Conference 2020 Conference Paper

Systematically Exploring Associations among Multivariate Data

  • Lifeng Zhang

Detecting relationships among multivariate data is often of great importance in the analysis of high-dimensional data sets, and has received growing attention for decades from both academic and industrial fields. In this study, we propose a statistical tool named the neighbor correlation coefficient (nCor), which is based on a new idea that measures the local continuity of the reordered data points to quantify the strength of the global association between variables. With sufficient sample size, the new method is able to capture a wide range of functional relationship, whether it is linear or nonlinear, bivariate or multivariate, main effect or interaction. The score of nCor roughly approximates the coefficient of determination (R2 ) of the data which implies the proportion of variance in one variable that is predictable from one or more other variables. On this basis, three nCor based statistics are also proposed here to further characterize the intra and inter structures of the associations from the aspects of nonlinearity, interaction effect, and variable redundancy. The mechanisms of these measures are proved in theory and demonstrated with numerical analyses.