
Author name cluster

Jie Shen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
2 author rows

Possible papers

12

TMLR Journal 2025 Journal Article

Towards Efficient Contrastive PAC Learning

  • Jie Shen

We study contrastive learning under the PAC learning framework. While a series of recent works have shown statistical results for learning under the contrastive loss, based either on the VC-dimension or Rademacher complexity, their algorithms are either inherently inefficient or do not imply PAC guarantees. In this paper, we consider contrastive learning of the fundamental concept of linear representations. Surprisingly, even in such a basic setting, the existence of efficient PAC learners is largely open. We first show that the problem of contrastive PAC learning of linear representations is intractable to solve in general. We then show that it can be relaxed to a semi-definite program when the distance between contrastive samples is measured by the $\ell_2$-norm. We further establish generalization guarantees based on Rademacher complexity, and connect them to PAC guarantees under certain contrastive large-margin conditions. To the best of our knowledge, this is the first efficient PAC learning algorithm for contrastive learning.
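
As a rough illustration of why the $\ell_2$ case becomes tractable (a minimal sketch, not the paper's exact program; the margin, synthetic data, and trace bound are assumptions): with squared $\ell_2$ distances, $d(x, x') = (x - x')^\top M (x - x')$ is linear in $M = W^\top W$, so a margin-based contrastive objective can be posed as a semi-definite program over $M \succeq 0$.

```python
import cvxpy as cp
import numpy as np

d, n = 10, 50
rng = np.random.default_rng(0)
anchors = rng.normal(size=(n, d))
positives = anchors + 0.1 * rng.normal(size=(n, d))   # synthetic similar pairs
negatives = rng.normal(size=(n, d))                   # synthetic dissimilar pairs

M = cp.Variable((d, d), PSD=True)  # stands in for W^T W

def sq_dist(a, b):
    return cp.quad_form(a - b, M)  # affine in M since a - b is constant

# Hinge-style contrastive loss: each positive pair should be closer than the
# corresponding negative pair by a unit margin.
losses = [cp.pos(1 + sq_dist(anchors[i], positives[i])
                   - sq_dist(anchors[i], negatives[i])) for i in range(n)]
problem = cp.Problem(cp.Minimize(cp.sum(losses) / n),
                     [cp.trace(M) <= d])  # bound M so the SDP stays well-posed
problem.solve()
```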

AAAI Conference 2024 Conference Paper

Efficient Look-Up Table from Expanded Convolutional Network for Accelerating Image Super-resolution

  • Kai Yin
  • Jie Shen

The look-up table (LUT) has recently shown its practicality and effectiveness in super-resolution (SR) tasks due to its low computational cost and hardware independence. However, most existing methods focus on improving SR performance, neglecting the demand for high-speed SR on edge devices with limited compute. In this paper, we propose an efficient expanded convolution (EC) layer, which expands the output size of a regular convolution to enlarge the receptive field (RF) indirectly, so that the size of the LUT corresponding to the network grows only linearly as the RF increases. Additionally, after introducing the EC, multiple LUTs are merged into one LUT, achieving faster running speed while maintaining SR performance. More specifically, we expand the coverage of the convolutional output so that the output at the current position covers the target position and its surroundings, forming overlapping sliding windows at the output end. The overlapping parts of the sliding windows are summed to produce the output, thereby enlarging the effective RF. Moreover, by expanding the numerical range of the accumulated results and rescaling them to [0, 255], the method mitigates quantization error at the output. Experiments indicate that the proposed method performs better than the baseline and is faster than other LUT-based SR methods.
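
A minimal 1-D sketch of the overlap-and-sum idea described above (shapes and names are hypothetical, not the paper's implementation): each input window emits an expanded output patch, and overlapping patches are summed at the output end.

```python
import numpy as np

def expanded_conv_1d(x, weight):
    """x: (n,); weight: (k, e) -- each length-k window emits a length-e patch."""
    k, e = weight.shape
    n = len(x)
    out = np.zeros(n + e - 1)
    for i in range(n - k + 1):
        patch = x[i:i + k] @ weight   # expanded output patch for this window
        out[i:i + e] += patch         # overlap-add with neighbouring patches
    return out
```

With e > 1, every output position receives contributions from several neighbouring input windows, so a table indexed by small length-k input patches behaves as if it had a larger RF.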

NeurIPS Conference 2022 Conference Paper

List-Decodable Sparse Mean Estimation

  • Shiwei Zeng
  • Jie Shen

Robust mean estimation is one of the most important problems in statistics: given a set of samples in $\mathbb{R}^d$ where an $\alpha$ fraction are drawn from some distribution $D$ and the rest are adversarially corrupted, we aim to estimate the mean of $D$. A surge of recent research interest has focused on the list-decodable setting where $\alpha \in (0, \frac12]$, and the goal is to output a finite number of estimates among which at least one approximates the target mean. In this paper, we consider the setting where the underlying distribution $D$ is Gaussian with a $k$-sparse mean. Our main contribution is the first polynomial-time algorithm that enjoys sample complexity $O\big(\mathrm{poly}(k, \log d)\big)$, i.e., poly-logarithmic in the dimension. One of our core algorithmic ingredients is the use of low-degree sparse polynomials to filter outliers, which may find further applications.
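
A toy filtering step in the spirit of the polynomial-based outlier removal mentioned above (illustrative only; the support selection, score, and threshold are assumptions, not the paper's algorithm): score each sample with a degree-2 polynomial supported on a few coordinates, and drop samples whose scores far exceed what a Gaussian would produce.

```python
import numpy as np

def sparse_quadratic_filter(X, k, thresh=9.0):
    """X: (n, d) samples; k: sparsity budget for the polynomial's support."""
    centered = X - np.median(X, axis=0)               # robust centering
    support = np.argsort(centered.var(axis=0))[-k:]   # coordinates with excess variance
    scores = (centered[:, support] ** 2).sum(axis=1)  # degree-2 score on the support
    return X[scores <= thresh * k]                    # inliers score about k in expectation
```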

TMLR Journal 2022 Journal Article

Uncertainty-Based Active Learning for Reading Comprehension

  • Jing Wang
  • Jie Shen
  • Xiaofei Ma
  • Andrew Arnold

Recent years have witnessed a surge of successful applications of machine reading comprehension. Of central importance to these tasks is the availability of massive amounts of labeled data, which facilitates the training of large-scale neural networks. However, in many real-world problems, annotated data are expensive to gather, not only because of time and budget costs, but also because of domain-specific restrictions such as privacy for healthcare data. In this regard, we propose an uncertainty-based active learning algorithm for reading comprehension, which interleaves data annotation and model updating to reduce the demand for labeling. Our key techniques are two-fold: 1) an unsupervised uncertainty-based sampling scheme that queries the labels of the most informative instances with respect to the currently learned model; and 2) an adaptive loss minimization paradigm that simultaneously fits the data and controls the degree of model updating. We demonstrate on benchmark datasets that 25% fewer labeled samples suffice to achieve similar, or even improved, performance. Our results show strong evidence that for label-demanding scenarios, the proposed approach offers practical guidance on data collection and model training.
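
A minimal sketch of the annotate-and-update loop that technique 1) describes (generic entropy-based uncertainty sampling; the `model`, `oracle`, and batch size are placeholder assumptions, not the paper's code):

```python
import numpy as np

def entropy(probs):
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

def active_learning_loop(model, pool, oracle, rounds=10, batch=32):
    X, y = [], []
    for _ in range(rounds):
        probs = model.predict_proba(pool)           # current model's predictions
        idx = np.argsort(entropy(probs))[-batch:]   # most uncertain instances
        X.extend(pool[idx]); y.extend(oracle(pool[idx]))  # query their labels
        pool = np.delete(pool, idx, axis=0)
        model.fit(np.array(X), np.array(y))         # interleaved model update
    return model
```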

NeurIPS Conference 2020 Conference Paper

Efficient active learning of sparse halfspaces with arbitrary bounded noise

  • Chicheng Zhang
  • Jie Shen
  • Pranjal Awasthi

We study active learning of homogeneous $s$-sparse halfspaces in $\mathbb{R}^d$ under the setting where the unlabeled data distribution is isotropic log-concave and each label is flipped with probability at most $\eta$ for a parameter $\eta \in \big[0, \frac12\big)$, known as bounded noise. Even in the presence of mild label noise, i.e., when $\eta$ is a small constant, this is a challenging problem, and only recently have label complexity bounds of the form $\tilde{O}\big(s \cdot \mathrm{polylog}(d, \frac{1}{\epsilon})\big)$ been established in [Zhang 2018] for computationally efficient algorithms. In contrast, under high levels of label noise, the label complexity bounds achieved by computationally efficient algorithms are much worse: the best known result [Awasthi et al. 2016] provides a computationally efficient algorithm with label complexity $\tilde{O}\big((s \ln d/\epsilon)^{\mathrm{poly}(1/(1-2\eta))}\big)$, which is label-efficient only when the noise rate $\eta$ is a fixed constant. In this work, we substantially improve on this by designing a polynomial-time algorithm for active learning of $s$-sparse halfspaces with a label complexity of $\tilde{O}\big(\frac{s}{(1-2\eta)^4} \mathrm{polylog}(d, \frac{1}{\epsilon})\big)$. This is the first efficient algorithm with label complexity polynomial in $\frac{1}{1-2\eta}$ in this setting, which is label-efficient even for $\eta$ arbitrarily close to $\frac12$. Our active learning algorithm and its theoretical guarantees also immediately translate to new state-of-the-art label and sample complexity results for full-dimensional active and passive halfspace learning under arbitrary bounded noise.
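
A heavily simplified sketch of the two ingredients this line of work combines, margin-based label queries and hard thresholding to keep iterates sparse (the update rule, schedules, and constants are assumptions for illustration; the actual algorithm and its noise handling are far more involved):

```python
import numpy as np

def hard_threshold(w, s):
    out = np.zeros_like(w)
    keep = np.argsort(np.abs(w))[-s:]
    out[keep] = w[keep]
    return out

def active_sparse_halfspace(X, oracle, s, rounds=20, band=0.5, lr=0.1):
    w = np.zeros(X.shape[1]); w[0] = 1.0
    for _ in range(rounds):
        query = np.argsort(np.abs(X @ w))[:64]   # points closest to the boundary
        for i in query:
            y = oracle(i)                        # possibly noisy label in {-1, +1}
            if y * (X[i] @ w) <= band:           # margin-violation update
                w += lr * y * X[i]
        w = hard_threshold(w, s)                 # keep the iterate s-sparse
        w /= np.linalg.norm(w)
        band *= 0.8; lr *= 0.8                   # shrink the sampling band
    return w
```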

JMLR Journal 2018 Journal Article

A Tight Bound of Hard Thresholding

  • Jie Shen
  • Ping Li

This paper is concerned with the hard thresholding operator, which sets all but the $k$ largest absolute elements of a vector to zero. We establish a tight bound that quantitatively characterizes the deviation of the thresholded solution from a given signal. Our theoretical result is universal in the sense that it holds for all choices of parameters, and the underlying analysis depends only on fundamental arguments in mathematical optimization. We discuss the implications for two domains. Compressed sensing: based on this crucial estimate, we clarify the connection between the restricted isometry property (RIP) and the sparsity parameter for a large family of hard-thresholding-based algorithms, which yields an improved RIP condition, especially when the true sparsity is unknown. This suggests that, in essence, many more kinds of sensing matrices, or fewer measurements, are admissible for the data acquisition procedure. Machine learning: for large-scale machine learning, a significant yet challenging problem is learning accurate sparse models efficiently. In stark contrast to prior work that used the $\ell_1$-relaxation to promote sparsity, we present a novel stochastic algorithm that performs hard thresholding in each iteration, hence ensuring such parsimonious solutions. Equipped with the developed bound, we prove global linear convergence for a number of prevalent statistical models under mild assumptions, even though the problem turns out to be non-convex.
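
The operator itself is simple to state in code. The sketch below shows hard thresholding as defined in the abstract, plus a generic "gradient step, then threshold" stochastic iteration (the step size and gradient interface are assumptions, not the paper's exact algorithm):

```python
import numpy as np

def hard_threshold(x, k):
    """Set all but the k largest absolute elements of x to zero."""
    out = np.zeros_like(x)
    keep = np.argpartition(np.abs(x), -k)[-k:]   # indices of k largest |x_i|
    out[keep] = x[keep]
    return out

def stochastic_iht_step(w, grad_fn, batch, k, lr=0.01):
    """One iteration: stochastic gradient step followed by hard thresholding."""
    return hard_threshold(w - lr * grad_fn(w, batch), k)
```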

NeurIPS Conference 2017 Conference Paper

Partial Hard Thresholding: Towards A Principled Analysis of Support Recovery

  • Jie Shen
  • Ping Li

In machine learning and compressed sensing, it is of central importance to understand when a tractable algorithm recovers the support of a sparse signal from its compressed measurements. In this paper, we present a principled analysis of the support recovery performance of a family of hard thresholding algorithms. To this end, we appeal to the partial hard thresholding (PHT) operator proposed recently by Jain et al. [IEEE Trans. Information Theory, 2017]. We show that under proper conditions, PHT recovers an arbitrary $s$-sparse signal within $O(s\kappa \log \kappa)$ iterations, where $\kappa$ is an appropriate condition number. By specializing the PHT operator, we obtain the best known results for hard thresholding pursuit and orthogonal matching pursuit with replacement. Experiments on simulated data complement our theoretical findings and also illustrate the effectiveness of PHT compared to other popular recovery methods.
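
A hedged sketch of a PHT-style support update: relative to the current support, at most $r$ coordinates are allowed to be swapped in per iteration (this greedy realization is only illustrative; see Jain et al. 2017 for the precise operator):

```python
import numpy as np

def partial_hard_threshold(z, support, k, r):
    """Keep k entries of z, admitting at most r indices outside `support` (a set)."""
    new_support, swaps = [], 0
    for i in np.argsort(-np.abs(z)):      # indices by decreasing magnitude
        if len(new_support) == k:
            break
        if i in support:                  # retained coordinate
            new_support.append(i)
        elif swaps < r:                   # admit at most r new coordinates
            new_support.append(i); swaps += 1
    out = np.zeros_like(z)
    out[new_support] = z[new_support]
    return out
```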

NeurIPS Conference 2014 Conference Paper

Online Optimization for Max-Norm Regularization

  • Jie Shen
  • Huan Xu
  • Ping Li

The max-norm regularizer has been extensively studied in the last decade, as it promotes effective low-rank estimation of the underlying data. However, max-norm regularized problems are typically formulated and solved in a batch manner, which prevents them from processing big data due to potential memory bottlenecks. In this paper, we propose an online algorithm for solving max-norm regularized problems that is scalable to large problems. In particular, we consider the matrix decomposition problem as an example, although our analysis also applies to other problems such as matrix completion. The key technique in our algorithm is to reformulate the max-norm in a matrix factorization form, consisting of a basis component and a coefficient component. In this way, we can solve for the optimal basis and coefficients alternately. We prove that the basis produced by our algorithm converges to a stationary point asymptotically. Experiments demonstrate encouraging results for the effectiveness and robustness of our algorithm. See the full paper at arXiv:1406.3190.
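
A minimal sketch of one streamed step of the alternating scheme described above (the per-sample ridge solve, step size, and row-norm bound are assumptions, not the paper's updates): the max-norm constraint on a factorization $Z = LR^\top$ can be enforced by bounding the row norms of the two factors, so each incoming sample gets a coefficient vector fit against the current basis, followed by a projected gradient step on the basis.

```python
import numpy as np

def project_rows(M, bound):
    """Clip each row of M to l2-norm at most `bound` (row-norm constraint)."""
    norms = np.maximum(np.linalg.norm(M, axis=1, keepdims=True), 1e-12)
    return M * np.minimum(1.0, bound / norms)

def online_maxnorm_step(L, z, lam=0.1, lr=0.01, bound=1.0):
    """One streamed sample z: fit its coefficients, then update the basis L."""
    r = np.linalg.solve(L.T @ L + lam * np.eye(L.shape[1]), L.T @ z)
    grad = np.outer(L @ r - z, r)     # gradient of 0.5 * ||L r - z||^2 in L
    return project_rows(L - lr * grad, bound), r
```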

LORI Conference 2009 Conference Paper

Oppositional Logic

  • Guoping Du
  • Hongguang Wang
  • Jie Shen

In the intuitionistic logic system, the constructive negation operator complies with the law of contradiction but not the law of excluded middle. In da Costa's paraconsistent logic system, the paraconsistent negation operator complies with the law of excluded middle but not the law of contradiction. Setting aside the classical negation operator, both intuitionistic logic and da Costa's paraconsistent logic build their systems by directly introducing new negation operators on top of positive propositional logic. This paper attempts to make the constructive and paraconsistent negation operators satisfy the conditions mentioned above within a classical logical system. Oppositional logic is an extension of classical propositional logic: it is obtained from classical propositional logic by adding a unary connective * and introducing definitions of two further unary connectives ∆ and ∇. In the oppositional logic system, there are four kinds of negation: the classical negation ¬, complying with both the law of contradiction and the law of excluded middle; the constructive negation ∇, complying with the law of contradiction but not the law of excluded middle; the paraconsistent negation ∆, complying with the law of excluded middle but not the law of contradiction; and the dialectical negation *, complying with neither. This paper proves the soundness and completeness theorems of oppositional logic. It also draws the following conclusions: (1) oppositional logic can serve as a tool for paraconsistent theory and intuitionistic theory, and the famous Duns Scotus law fails for the paraconsistent negation and the dialectical negation; (2) in oppositional logic, with respect to the unary connectives ¬, *, ∇, and ∆, A is in contradictory opposition with ¬A, in subaltern opposition with *A, in contrary opposition with ∇A, and in subcontrary opposition with ∆A. In this sense, we call the logical system described above oppositional logic.

ICRA Conference 1988 Conference Paper

A motion control scheme for a biped robot to climb sloping surfaces

  • Yuan F. Zheng
  • Jie Shen
  • Fred R. Sias Jr.

A scheme to enable a biped robot to climb sloping surfaces is proposed. By means of sensing devices, namely position sensors on the joints and force sensors underneath the heel and toe, a biped robot called SD-2 is able to detect the transition of the supporting terrain from a flat floor to a sloping surface and to walk on the slope. Since biped locomotion in the frontal plane differs little between level ground and a slope, the study concentrates on the sagittal plane.
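
A toy illustration of the sensing idea in the abstract (thresholds and sensor interfaces are entirely hypothetical): if the toe force sensor engages while the heel carries markedly less load, the supporting surface ahead of the foot is rising, which can signal a floor-to-slope transition in the sagittal plane.

```python
def detect_slope_transition(heel_force, toe_force, contact_min=5.0, ratio=1.5):
    """Return True if heel/toe force readings suggest the foot has met an incline."""
    if toe_force < contact_min:        # toe not yet in firm contact
        return False
    return toe_force > ratio * heel_force
```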