Arrow Research search

Author name cluster

Kfir Levy

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers
1 author row

Possible papers

11

JBHI Journal 2026 Journal Article

Novel Deep Learning Model to Estimate Knee Flexion and Adduction Moments With Wearable IMUs During Treadmill and Overground Walking

  • Alon Sabaty
  • Adi Fishman
  • Shani Batcir
  • Tian Tan
  • Peter B. Shull
  • Kfir Levy
  • Arielle G. Fischer

A major issue after total knee replacement (TKR) surgery is asymmetric gait kinetics, which increases knee loads on the non-operated knee. This imbalance accelerates osteoarthritis (OA) progression, often leading to a second contralateral TKR. There is a clear need for an advanced wearable system with multiple sensors to accurately estimate gait kinetics in natural environments. This study aims to develop a machine learning framework that exclusively uses wearable inertial measurement units (IMUs) during overground and treadmill walking to estimate knee flexion moment (KFM) and knee adduction moment (KAM), significant biomechanical factors linked to OA. We introduce a novel deep learning model that combines a Long Short-Term Memory (LSTM)-based Autoencoder and Variational Gaussian Process (VGP) to estimate the mean and uncertainty region of the KAM and KFM. Seventeen healthy participants performed treadmill walking trials, while a separate group of seventeen healthy participants performed overground walking trials for model training and validation. Results demonstrated Root Mean Square Errors (RMSE) of 0.49%BW $\cdot$ BH (body weight × body height) and 0.73%BW $\cdot$ BH for KAM and KFM, respectively, during treadmill walking, and 0.74%BW $\cdot$ BH and 0.49%BW $\cdot$ BH for KAM and KFM, respectively, during overground walking, which is more accurate than existing approaches. The proposed model with wearable IMUs could enable knee health monitoring and rehabilitation for these key biomechanical factors linked to the progression of knee joint pathologies outside of traditional biomechanical laboratories with large, tethered equipment and into clinics, hospitals, and the community.
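As a point of reference for the reported units, knee moments are commonly normalized by body weight × body height, giving errors in %BW·BH. A minimal sketch of that normalization and the RMSE metric, using illustrative values only (not the study's data):

```python
import numpy as np

def normalize_moment(moment_nm, body_weight_n, body_height_m):
    """Express a knee moment (N*m) as a percentage of body weight x body height (%BW*BH)."""
    return 100.0 * moment_nm / (body_weight_n * body_height_m)

def rmse(pred, true):
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(true)) ** 2)))

# Illustrative values only: a 700 N (~71 kg), 1.75 m participant.
true_kam = normalize_moment(np.array([30.0, 35.0, 40.0]), 700.0, 1.75)
pred_kam = normalize_moment(np.array([31.0, 34.0, 41.0]), 700.0, 1.75)
err = rmse(pred_kam, true_kam)  # RMSE in %BW*BH units
```

Normalizing this way makes errors comparable across participants of different sizes, which is why the paper reports RMSE in %BW·BH rather than in N·m.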

AAAI Conference 2024 Conference Paper

Solving Non-rectangular Reward-Robust MDPs via Frequency Regularization

  • Uri Gadot
  • Esther Derman
  • Navdeep Kumar
  • Maxence Mohamed Elfatihi
  • Kfir Levy
  • Shie Mannor

In robust Markov decision processes (RMDPs), the reward and the transition dynamics are assumed to lie in a given uncertainty set. By targeting maximal return under the most adversarial model from that set, RMDPs address performance sensitivity to misspecified environments. Yet, to preserve computational tractability, the uncertainty set is traditionally structured independently for each state. This so-called rectangularity condition is motivated solely by computational concerns; as a result, it lacks a practical incentive and may lead to overly conservative behavior. In this work, we study coupled-reward RMDPs where the transition kernel is fixed, but the reward function lies within an $\alpha$-radius of a nominal one. We draw a direct connection between this type of non-rectangular reward-robust MDP and policy visitation frequency regularization. We introduce a policy-gradient method and prove its convergence. Numerical experiments illustrate the learned policy's robustness and its less conservative behavior compared to rectangular uncertainty.
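The connection the abstract describes can be illustrated numerically: with the transition kernel fixed and the reward confined to an $L_2$ ball of radius $\alpha$, the worst-case return is the nominal return minus $\alpha$ times a norm of the policy's visitation frequencies. A hedged sketch on a toy two-state chain (all values illustrative, not from the paper):

```python
import numpy as np

def occupancy_measure(P_pi, mu, gamma):
    """Discounted state-visitation frequencies: d = (1-gamma) * mu^T (I - gamma P_pi)^{-1}."""
    n = P_pi.shape[0]
    return (1.0 - gamma) * np.linalg.solve((np.eye(n) - gamma * P_pi).T, mu)

def robust_return(r_nominal, d, alpha, gamma):
    """Worst-case return over rewards in an L2 ball of radius alpha around the
    nominal reward: nominal return minus alpha times the L2 norm of the
    visitation frequencies -- i.e., a visitation-frequency regularizer."""
    return (d @ r_nominal - alpha * np.linalg.norm(d)) / (1.0 - gamma)

# Toy two-state chain under some fixed policy.
gamma = 0.9
P_pi = np.array([[0.9, 0.1], [0.1, 0.9]])
mu = np.array([1.0, 0.0])   # initial state distribution
r0 = np.array([1.0, 0.0])   # nominal reward
d = occupancy_measure(P_pi, mu, gamma)
nominal = d @ r0 / (1.0 - gamma)
worst = robust_return(r0, d, alpha=0.2, gamma=gamma)
```

The closed form follows from duality of the $L_2$ norm: the adversary's optimal reward perturbation points opposite to the visitation frequencies, so robustness reduces to regularizing by their norm.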

EWRL Workshop 2022 Workshop Paper

$Q$-Learning for $L_p$ Robust Markov Decision Processes

  • Navdeep Kumar
  • Kaixin Wang
  • Kfir Levy
  • Shie Mannor

Robust Markov Decision Processes (MDPs) are a powerful tool for sequential decision-making problems where system parameters are partially known, changing, or adversarial. Recently, there have been works aimed at solving $sa$- and $s$-rectangular robust MDPs. These methods are model-based and can potentially be generalized to model-free settings. We formally propose a model-free algorithm for $sa$- and $s$-rectangular $L_p$ robust MDPs and provide its convergence guarantees. The proposed model-free algorithm can be combined with existing deep RL techniques such as DQN to solve challenging problems.
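As a hedged illustration of the model-free setting — the reward-robust, $sa$-rectangular special case only, not the paper's full $L_p$ operator — a tabular Q-learning step whose target is penalized by the uncertainty radius:

```python
import numpy as np

def robust_q_update(Q, s, a, r, s_next, alpha, gamma, lr):
    """One tabular Q-learning step where the observed reward is penalized by the
    uncertainty radius alpha (the adversary's worst case for an sa-rectangular
    reward ball is simply r - alpha)."""
    target = (r - alpha) + gamma * np.max(Q[s_next])
    Q[s, a] += lr * (target - Q[s, a])
    return Q

# Toy single-state MDP: action 0 loops back with reward 1.
Q = np.zeros((1, 1))
for _ in range(500):
    Q = robust_q_update(Q, s=0, a=0, r=1.0, s_next=0, alpha=0.1, gamma=0.9, lr=0.5)
# Robust fixed point: (1 - 0.1) / (1 - 0.9) = 9, below the nominal value of 10.
```

Because the update only touches sampled transitions, it plugs into standard deep Q-learning by applying the same penalized target inside the DQN loss.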

NeurIPS Conference 2021 Conference Paper

Faster Neural Network Training with Approximate Tensor Operations

  • Menachem Adelman
  • Kfir Levy
  • Ido Hakimi
  • Mark Silberstein

We propose a novel technique for faster deep neural network training which systematically applies sample-based approximation to the constituent tensor operations, i.e., matrix multiplications and convolutions. We introduce new sampling techniques, study their theoretical properties, and prove that they provide the same convergence guarantees when applied to SGD training. We apply approximate tensor operations to single and multi-node training of MLP and CNN networks on MNIST, CIFAR-10 and ImageNet datasets. We demonstrate up to 66% reduction in the amount of computations and communication, and up to 1.37x faster training time while maintaining negligible or no impact on the final test accuracy.
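One standard instance of sample-based approximation of a matrix product is column-row importance sampling; the sketch below (illustrative, not necessarily the paper's exact scheme) samples column-row pairs with probabilities proportional to their norm products, which pays off when a few pairs dominate the product:

```python
import numpy as np

def sampled_matmul(A, B, k, rng):
    """Unbiased approximation of A @ B: sample k column-row pairs with
    probabilities proportional to the product of their norms, then rescale."""
    norms = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = norms / norms.sum()
    idx = rng.choice(A.shape[1], size=k, p=p)
    return sum(np.outer(A[:, j], B[j, :]) / (k * p[j]) for j in idx)

rng = np.random.default_rng(0)
scales = 0.9 ** np.arange(100)                 # a few column-row pairs dominate
A = rng.standard_normal((20, 100)) * scales
B = scales[:, None] * rng.standard_normal((100, 20))
approx = sampled_matmul(A, B, k=80, rng=rng)
rel_err = np.linalg.norm(approx - A @ B) / np.linalg.norm(A @ B)
```

Dividing each sampled outer product by $k\,p_j$ makes the estimator unbiased, which is what lets SGD retain its convergence guarantees under such approximations.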

NeurIPS Conference 2021 Conference Paper

STORM+: Fully Adaptive SGD with Recursive Momentum for Nonconvex Optimization

  • Kfir Levy
  • Ali Kavis
  • Volkan Cevher

In this work we investigate stochastic non-convex optimization problems where the objective is an expectation over smooth loss functions, and the goal is to find an approximate stationary point. The most popular approach to handling such problems is variance reduction techniques, which are also known to obtain tight convergence rates, matching the lower bounds in this case. Nevertheless, these techniques require a careful maintenance of anchor points in conjunction with appropriately selected ``mega-batchsizes''. This leads to a challenging hyperparameter tuning problem, that weakens their practicality. Recently, [Cutkosky and Orabona, 2019] have shown that one can employ recursive momentum in order to avoid the use of anchor points and large batchsizes, and still obtain the optimal rate for this setting. Yet, their method called $\rm{STORM}$ crucially relies on the knowledge of the smoothness, as well as a bound on the gradient norms. In this work we propose $\rm{STORM}^{+}$, a new method that is completely parameter-free, does not require large batch-sizes, and obtains the optimal $O(1/T^{1/3})$ rate for finding an approximate stationary point. Our work builds on the $\rm{STORM}$ algorithm, in conjunction with a novel approach to adaptively set the learning rate and momentum parameters.
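The recursive-momentum estimator that $\rm{STORM}^{+}$ builds on evaluates the gradient at both the current and previous iterate with the same fresh sample, so the correction term has low variance. A minimal sketch on a noisy 1-D quadratic, with hand-picked (non-adaptive) step size and momentum — the adaptivity of $\rm{STORM}^{+}$ itself is not reproduced here:

```python
import numpy as np

def noisy_grad(x, xi):
    # Stochastic gradient of f(x) = 0.5 * x^2, with additive noise sample xi.
    return x + xi

rng = np.random.default_rng(1)
eta, a = 0.1, 0.5          # hand-picked; STORM+ sets these adaptively
x = 5.0
xi = rng.normal(scale=0.1)
d = noisy_grad(x, xi)      # d_0 is a plain stochastic gradient
for _ in range(200):
    x_next = x - eta * d
    xi = rng.normal(scale=0.1)   # one fresh sample per step ...
    # ... evaluated at BOTH iterates, so the noise in the correction cancels:
    d = noisy_grad(x_next, xi) + (1.0 - a) * (d - noisy_grad(x, xi))
    x = x_next
```

Reusing the same sample at $x_t$ and $x_{t-1}$ is what removes the need for anchor points and mega-batches: the difference of the two gradients is small when consecutive iterates are close.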

NeurIPS Conference 2017 Conference Paper

Continuous DR-submodular Maximization: Structure and Algorithms

  • An Bian
  • Kfir Levy
  • Andreas Krause
  • Joachim Buhmann

DR-submodular continuous functions are important objectives with wide real-world applications spanning MAP inference in determinantal point processes (DPPs), and mean-field inference for probabilistic submodular models, amongst others. DR-submodularity captures a subclass of non-convex functions that enables both exact minimization and approximate maximization in polynomial time. In this work we study the problem of maximizing non-monotone DR-submodular continuous functions under general down-closed convex constraints. We start by investigating geometric properties that underlie such objectives, e.g., a strong relation between (approximately) stationary points and global optimum is proved. These properties are then used to devise two optimization algorithms with provable guarantees. Concretely, we first devise a ``two-phase'' algorithm with 1/4 approximation guarantee. This algorithm allows the use of existing methods for finding (approximately) stationary points as a subroutine, thus, harnessing recent progress in non-convex optimization. Then we present a non-monotone Frank-Wolfe variant with 1/e approximation guarantee and sublinear convergence rate. Finally, we extend our approach to a broader class of generalized DR-submodular continuous functions, which captures a wider spectrum of applications. Our theoretical findings are validated on synthetic and real-world problem instances.
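A Frank-Wolfe iteration of the kind the abstract mentions can be sketched on a separable concave (hence DR-submodular) objective over the box $[0,1]^n$. This shows only the basic monotone continuous-greedy step, not the paper's non-monotone variant with its shrunken feasible region:

```python
import numpy as np

def frank_wolfe_box(grad_f, n, steps):
    """Continuous-greedy Frank-Wolfe over the box [0,1]^n: each step moves a
    1/steps fraction toward the best feasible vertex for the current gradient,
    so the iterate never leaves the box."""
    x = np.zeros(n)
    for _ in range(steps):
        g = grad_f(x)
        v = (g > 1e-9).astype(float)  # linear maximizer over [0,1]^n (tiny tolerance)
        x += v / steps
    return x

# Separable concave (hence DR-submodular) objective: f(x) = sum(a*x - 0.5*b*x^2).
a = np.array([1.0, 0.3, -0.2])
b = np.array([2.0, 1.0, 1.0])
x_star = frank_wolfe_box(lambda x: a - b * x, n=3, steps=100)
# Coordinate-wise box optimum is clip(a/b, 0, 1) = [0.5, 0.3, 0.0].
```

Each update only solves a linear problem over the constraint set, which is what keeps Frank-Wolfe methods attractive for the down-closed convex constraints studied in the paper.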

NeurIPS Conference 2017 Conference Paper

Online to Offline Conversions, Universality and Adaptive Minibatch Sizes

  • Kfir Levy

We present an approach towards convex optimization that relies on a novel scheme which converts adaptive online algorithms into offline methods. In the offline optimization setting, our derived methods are shown to obtain favourable adaptive guarantees which depend on the harmonic sum of the queried gradients. We further show that our methods implicitly adapt to the objective's structure: in the smooth case fast convergence rates are ensured without any prior knowledge of the smoothness parameter, while still maintaining guarantees in the non-smooth setting. Our approach has a natural extension to the stochastic setting, resulting in a lazy version of SGD (stochastic GD), where minibatches are chosen adaptively depending on the magnitude of the gradients, thus providing a principled approach to choosing minibatch sizes.
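The adaptive-minibatch idea — average more samples when the gradient signal is weak — can be sketched schematically. The batch-size rule below is an illustrative stand-in, not the paper's exact formula:

```python
import numpy as np

def adaptive_minibatch_sgd(grad_fn, x0, eta, budget, rng, c=0.5):
    """Schematic lazy SGD: average `batch` stochastic gradients per step, and
    enlarge the batch when the averaged gradient is small (a weak signal needs
    more averaging to stay informative). `c` is an illustrative scale constant."""
    x, batch, used = x0, 1, 0
    while used + batch <= budget:
        g = np.mean([grad_fn(x, rng) for _ in range(batch)])
        used += batch
        x -= eta * g
        batch = min(budget, max(1, int(np.ceil(c / max(abs(g), 1e-8)))))
    return x

rng = np.random.default_rng(3)
# f(x) = 0.5 * x^2 with noisy gradient queries; batch grows as x nears the optimum.
x = adaptive_minibatch_sgd(lambda x, rng: x + rng.normal(scale=0.1),
                           x0=5.0, eta=0.5, budget=300, rng=rng)
```

The "lazy" aspect is that parameter updates become rarer (and better averaged) as the iterate approaches the optimum, where individual stochastic gradients are mostly noise.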

NeurIPS Conference 2016 Conference Paper

k*-Nearest Neighbors: From Global to Local

  • Oren Anava
  • Kfir Levy

The weighted k-nearest neighbors algorithm is one of the most fundamental non-parametric methods in pattern recognition and machine learning. The question of setting the optimal number of neighbors as well as the optimal weights has received much attention throughout the years; nevertheless, this problem seems to have remained unsettled. In this paper we offer a simple approach to locally weighted regression/classification, where we make the bias-variance tradeoff explicit. Our formulation enables us to phrase a notion of optimal weights, and to find these weights, as well as the optimal number of neighbors, efficiently and adaptively for each data point whose value we wish to estimate. The applicability of our approach is demonstrated on several datasets, showing superior performance over standard locally weighted methods.
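For context, the standard distance-weighted k-NN baseline the paper improves on looks like this — a fixed k and heuristic inverse-distance weights, rather than the per-query optimal weights and adaptive neighbor count derived in the paper:

```python
import numpy as np

def weighted_knn_predict(X, y, x_query, k):
    """Distance-weighted k-NN regression: the k closest training points vote,
    with weights inversely proportional to their distance from the query."""
    d = np.linalg.norm(X - x_query, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + 1e-8)   # heuristic weights; the paper derives optimal ones
    w /= w.sum()
    return float(w @ y[idx])

# Tiny 1-D regression example: labels equal the inputs.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 2.0, 3.0])
pred = weighted_knn_predict(X, y, np.array([1.1]), k=2)
```

Both k and the inverse-distance rule here are global choices; the paper's point is that making them per-query, via an explicit bias-variance objective, is both tractable and more accurate.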

NeurIPS Conference 2015 Conference Paper

Beyond Convexity: Stochastic Quasi-Convex Optimization

  • Elad Hazan
  • Kfir Levy
  • Shai Shalev-Shwartz

Stochastic convex optimization is a basic and well studied primitive in machine learning. It is well known that convex and Lipschitz functions can be minimized efficiently using Stochastic Gradient Descent (SGD). The Normalized Gradient Descent (NGD) algorithm is an adaptation of Gradient Descent which updates according to the direction of the gradients, rather than the gradients themselves. In this paper we analyze a stochastic version of NGD and prove its convergence to a global minimum for a wider class of functions: we require the functions to be quasi-convex and locally-Lipschitz. Quasi-convexity broadens the concept of unimodality to multiple dimensions and allows for certain types of saddle points, which are a known hurdle for first-order optimization methods such as gradient descent. Locally-Lipschitz functions are only required to be Lipschitz in a small region around the optimum. This assumption circumvents gradient explosion, which is another known hurdle for gradient descent variants. Interestingly, unlike the vanilla SGD algorithm, the stochastic normalized gradient descent algorithm provably requires a minimal minibatch size.
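The NGD update itself is a one-line change to gradient descent: step along $g/\|g\|$ instead of $g$. A minimal sketch on a quasi-convex 1-D function whose gradient explodes near the optimum — exactly the regime where normalization helps:

```python
import numpy as np

def ngd(grad, x0, eta, steps):
    """Normalized Gradient Descent: step along the gradient direction only,
    discarding its magnitude."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = grad(x)
        n = np.linalg.norm(g)
        if n == 0:
            break
        x = x - eta * g / n
    return x

# f(x) = sqrt(|x|) is quasi-convex but not convex; its gradient blows up near 0
# and vanishes far away. NGD's fixed step length is unaffected by either.
x = ngd(lambda x: np.sign(x) * 0.5 / np.sqrt(np.abs(x) + 1e-12),
        np.array([4.0]), eta=0.05, steps=200)
```

Plain gradient descent with a fixed step size would overshoot wildly near the minimum of this function; NGD instead homes in to within one step length of it.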

NeurIPS Conference 2015 Conference Paper

Fast Rates for Exp-concave Empirical Risk Minimization

  • Tomer Koren
  • Kfir Levy

We consider Empirical Risk Minimization (ERM) in the context of stochastic optimization with exp-concave and smooth losses---a general optimization framework that captures several important learning problems including linear and logistic regression, learning SVMs with the squared hinge-loss, portfolio selection and more. In this setting, we establish the first evidence that ERM is able to attain fast generalization rates, and show that the expected loss of the ERM solution in $d$ dimensions converges to the optimal expected loss at a rate of $d/n$. This rate matches existing lower bounds up to constants and improves by a $\log{n}$ factor upon the state-of-the-art, which is only known to be attained by an online-to-batch conversion of computationally expensive online algorithms.

NeurIPS Conference 2014 Conference Paper

Bandit Convex Optimization: Towards Tight Bounds

  • Elad Hazan
  • Kfir Levy

Bandit Convex Optimization (BCO) is a fundamental framework for decision making under uncertainty, which generalizes many problems from the realm of online and statistical learning. While the special case of linear cost functions is well understood, a gap in the attainable regret for BCO with nonlinear losses remains an important open question. In this paper we take a step towards understanding the best attainable regret bounds for BCO: we give an efficient and near-optimal regret algorithm for BCO with strongly-convex and smooth loss functions. In contrast to previous works on BCO that use time-invariant exploration schemes, our method employs an exploration scheme that shrinks with time.
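The classic single-query gradient estimator used in BCO evaluates the loss at one randomly perturbed point; shrinking the perturbation radius $\delta$ over time is the kind of exploration schedule the abstract refers to. A hedged sketch of the estimator, verified on a linear loss where it is exactly unbiased:

```python
import numpy as np

def one_point_grad_estimate(f, x, delta, rng):
    """Single-query bandit gradient estimate: (d/delta) * f(x + delta*u) * u for a
    uniformly random unit vector u; unbiased for a delta-smoothed version of f."""
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    return (d / delta) * f(x + delta * u) * u

rng = np.random.default_rng(2)
a = np.array([1.0, -2.0, 0.5])
f = lambda z: a @ z              # linear loss: true gradient is a everywhere
est = np.mean([one_point_grad_estimate(f, np.zeros(3), 0.1, rng)
               for _ in range(20000)], axis=0)
# The average of many estimates recovers the gradient a.
```

The estimator's variance grows as $\delta$ shrinks, so the exploration radius trades bias against variance; scheduling it to shrink over time is what tightens the regret for strongly-convex, smooth losses.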