Arrow Research search

Author name cluster

Nikos Vlassis

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

35 papers
2 author rows

Possible papers

35

RLC Conference 2025 Conference Paper

Adaptive Submodular Policy Optimization

  • Branislav Kveton
  • Anup Rao
  • Viet Dac Lai
  • Nikos Vlassis
  • David Arbour

We propose KL-regularized policy optimization for adaptive submodular maximization, which is a framework for decision making under uncertainty with submodular rewards. Policy optimization of adaptive submodular functions justifies a surprisingly simple and efficient policy gradient update, where the optimized action only affects its immediate reward but not the future ones. It also allows us to learn adaptive submodular policies with large action spaces, such as those represented by large language models (LLMs). We prove that our policies monotonically improve as the regularization diminishes and converge to the optimal greedy policy. Our experiments show major gains in statistical efficiency, in both synthetic problems and LLMs.

RLJ Journal 2025 Journal Article

Adaptive Submodular Policy Optimization

  • Branislav Kveton
  • Anup Rao
  • Viet Dac Lai
  • Nikos Vlassis
  • David Arbour

We propose KL-regularized policy optimization for adaptive submodular maximization, which is a framework for decision making under uncertainty with submodular rewards. Policy optimization of adaptive submodular functions justifies a surprisingly simple and efficient policy gradient update, where the optimized action only affects its immediate reward but not the future ones. It also allows us to learn adaptive submodular policies with large action spaces, such as those represented by large language models (LLMs). We prove that our policies monotonically improve as the regularization diminishes and converge to the optimal greedy policy. Our experiments show major gains in statistical efficiency, in both synthetic problems and LLMs.

AAAI Conference 2024 Conference Paper

Distributional Off-Policy Evaluation for Slate Recommendations

  • Shreyas Chaudhari
  • David Arbour
  • Georgios Theocharous
  • Nikos Vlassis

Recommendation strategies are typically evaluated by using previously logged data, employing off-policy evaluation methods to estimate their expected performance. However, for strategies that present users with slates of multiple items, the resulting combinatorial action space renders many of these methods impractical. Prior work has developed estimators that leverage the structure in slates to estimate the expected off-policy performance, but the estimation of the entire performance distribution remains elusive. Estimating the complete distribution allows for a more comprehensive evaluation of recommendation strategies, particularly along the axes of risk and fairness that employ metrics computable from the distribution. In this paper, we propose an estimator for the complete off-policy performance distribution for slates and establish conditions under which the estimator is unbiased and consistent. This builds upon prior work on off-policy evaluation for slates and off-policy distribution estimation in reinforcement learning. We validate the efficacy of our method empirically on synthetic data as well as on a slate recommendation simulator constructed from real-world data (MovieLens-20M). Our results show a significant reduction in estimation variance and improved sample efficiency over prior work across a range of slate structures.

NeurIPS Conference 2021 Conference Paper

Control Variates for Slate Off-Policy Evaluation

  • Nikos Vlassis
  • Ashok Chandrashekar
  • Fernando Amat
  • Nathan Kallus

We study the problem of off-policy evaluation from batched contextual bandit data with multidimensional actions, often termed slates. The problem is common to recommender systems and user-interface optimization, and it is particularly challenging because of the combinatorially-sized action space. Swaminathan et al. (2017) have proposed the pseudoinverse (PI) estimator under the assumption that the conditional mean rewards are additive in actions. Using control variates, we consider a large class of unbiased estimators that includes as specific cases the PI estimator and (asymptotically) its self-normalized variant. By optimizing over this class, we obtain new estimators with risk improvement guarantees over both the PI and the self-normalized PI estimators. Experiments with real-world recommender data as well as synthetic data validate these improvements in practice.

IJCAI Conference 2019 Conference Paper

Marginal Posterior Sampling for Slate Bandits

  • Maria Dimakopoulou
  • Nikos Vlassis
  • Tony Jebara

We introduce a new Thompson sampling-based algorithm, called marginal posterior sampling, for online slate bandits, that is characterized by three key ideas. First, it postulates that the slate-level reward is a monotone function of the marginal unobserved rewards of the base actions selected in the slates's slots, but it does not attempt to estimate this function. Second, instead of maintaining a slate-level reward posterior, the algorithm maintains posterior distributions for the marginal reward of each slot's base actions and uses the samples from these marginal posteriors to select the next slate. Third, marginal posterior sampling optimizes at the slot-level rather than the slate-level, which makes the approach computationally efficient. Simulation results establish substantial advantages of marginal posterior sampling over alternative Thompson sampling-based approaches that are widely used in the domain of web services.

ICML Conference 2019 Conference Paper

More Efficient Off-Policy Evaluation through Regularized Targeted Learning

  • Aurélien Bibaut
  • Ivana Malenica
  • Nikos Vlassis
  • Mark J. van der Laan

We study the problem of off-policy evaluation (OPE) in Reinforcement Learning (RL), where the aim is to estimate the performance of a new policy given historical data that may have been generated by a different policy, or policies. In particular, we introduce a novel doubly-robust estimator for the OPE problem in RL, based on the Targeted Maximum Likelihood Estimation principle from the statistical causal inference literature. We also introduce several variance reduction techniques that lead to impressive performance gains in off-policy evaluation. We show empirically that our estimator uniformly wins over existing off-policy evaluation methods across multiple RL environments and various levels of model misspecification. Finally, we further the existing theoretical analysis of estimators for the RL off-policy estimation problem by showing their $O_P(1/\sqrt{n})$ rate of convergence and characterizing their asymptotic distribution.

ICML Conference 2019 Conference Paper

On the Design of Estimators for Bandit Off-Policy Evaluation

  • Nikos Vlassis
  • Aurélien Bibaut
  • Maria Dimakopoulou
  • Tony Jebara

Off-policy evaluation is the problem of estimating the value of a target policy using data collected under a different policy. Given a base estimator for bandit off-policy evaluation and a parametrized class of control variates, we address the problem of computing a control variate in that class that reduces the risk of the base estimator. We derive the population risk as a function of the class parameters and we establish conditions that guarantee risk improvement. We present our main results in the context of multi-armed bandits, and we propose a simple design for contextual bandits that gives rise to an estimator that is shown to perform well in multi-class cost-sensitive classification datasets.

AAMAS Conference 2018 Conference Paper

Capacity-aware Sequential Recommendations

  • Frits de Nijs
  • Georgios Theocharous
  • Nikos Vlassis
  • Mathijs M. de Weerdt
  • Matthijs T. J. Spaan

Personalized recommendations are increasingly important to engage users and guide them through large systems, for example when recommending points of interest to tourists visiting a popular city. To maximize long-term user experience, the system should consider issuing recommendations sequentially, since by observing the user’s response to a recommendation, the system can update its estimate of the user’s (latent) interests. However, as traditional recommender systems target individuals, their effect on a collective of users can unintentionally overload capacity. Therefore, recommender systems should not only consider the users’ interests, but also the effect of recommendations on the available capacity. The structure in such a constrained, multi-agent, partially observable decision problem can be exploited by a novel belief-space sampling algorithm which bounds the size of the state space by a limit on regret. By exploiting the stationary structure of the problem, our algorithm is significantly more scalable than existing approximate solvers. Moreover, by explicitly considering the information value of actions, this algorithm significantly improves the quality of recommendations over an extension of posterior sampling reinforcement learning to the constrained multi-agent case. We show how to decouple constraint satisfaction from sequential recommendation policies, resulting in algorithms which issue recommendations to thousands of agents while respecting constraints.

NeurIPS Conference 2018 Conference Paper

Scalar Posterior Sampling with Applications

  • Georgios Theocharous
  • Zheng Wen
  • Yasin Abbasi Yadkori
  • Nikos Vlassis

We propose a practical non-episodic PSRL algorithm that unlike recent state-of-the-art PSRL algorithms uses a deterministic, model-independent episode switching schedule. Our algorithm termed deterministic schedule PSRL (DS-PSRL) is efficient in terms of time, sample, and space complexity. We prove a Bayesian regret bound under mild assumptions. Our result is more generally applicable to multiple parameters and continuous state action problems. We compare our algorithm with state-of-the-art PSRL algorithms on standard discrete and continuous problems from the literature. Finally, we show how the assumptions of our algorithm satisfy a sensible parameterization for a large class of problems in sequential recommendations.

NeurIPS Conference 2016 Conference Paper

A posteriori error bounds for joint matrix decomposition problems

  • Nicolo Colombo
  • Nikos Vlassis

Joint matrix triangularization is often used for estimating the joint eigenstructure of a set M of matrices, with applications in signal processing and machine learning. We consider the problem of approximate joint matrix triangularization when the matrices in M are jointly diagonalizable and real, but we only observe a set M' of noise perturbed versions of the matrices in M. Our main result is a first-order upper bound on the distance between any approximate joint triangularizer of the matrices in M' and any exact joint triangularizer of the matrices in M. The bound depends only on the observable matrices in M' and the noise level. In particular, it does not depend on optimization specific properties of the triangularizer, such as its proximity to critical points, that are typical of existing bounds in the literature. To our knowledge, this is the first a posteriori bound for joint matrix decomposition. We demonstrate the bound on synthetic data for which the ground truth is known.

IJCAI Conference 2016 Conference Paper

Matching via Dimensionality Reduction for Estimation of Treatment Effects in Digital Marketing Campaigns

  • Sheng Li
  • Nikos Vlassis
  • Jaya Kawale
  • Yun Fu

A widely used method for estimating counterfactuals and causal treatment effects from observational data is nearest-neighbor matching. This typically involves pairing each treated unit with its nearest-in-covariates control unit, and then estimating an average treatment effect from the set of matched pairs. Although straightforward to implement, this estimator is known to suffer from a bias that increases with the dimensionality of the covariate space, which can be undesirable in applications that involve high-dimensional data. To address this problem, we propose a novel estimator that first projects the data to a number of random linear subspaces, and it then estimates the median treatment effect by nearest-neighbor matching in each subspace. We empirically compute the mean square error of the proposed estimator using semi-synthetic data, and we demonstrate the method on real-world digital marketing campaign data. The results show marked improvement over baseline methods.

IJCAI Conference 2016 Conference Paper

Practical Linear Models for Large-Scale One-Class Collaborative Filtering

  • Suvash Sedhain
  • Hung Bui
  • Jaya Kawale
  • Nikos Vlassis
  • Branislav Kveton
  • Aditya Krishna Menon
  • Trung Bui
  • Scott Sanner

Collaborative filtering has emerged as the de facto approach to personalized recommendation problems. However, a scenario that has proven difficult in practice is the one-class collaborative filtering case (OC-CF), where one has examples of items that a user prefers, but no examples of items they do not prefer. In such cases, it is desirable to have recommendation algorithms that are personalized, learning-based, and highly scalable. Existing linear recommenders for OC-CF achieve good performance in benchmarking tasks, but they involve solving a large number of a regression subproblems, limiting their applicability to large-scale problems. We show that it is possible to scale up linear recommenders to big data by learning an OC-CF model in a randomized low-dimensional embedding of the user-item interaction matrix. Our algorithm, Linear-FLow, achieves state-of-the-art performance in a comprehensive set of experiments on standard benchmarks as well as real data.

ICML Conference 2016 Conference Paper

Tensor Decomposition via Joint Matrix Schur Decomposition

  • Nicolò Colombo
  • Nikos Vlassis

We describe an approach to tensor decomposition that involves extracting a set of observable matrices from the tensor and applying an approximate joint Schur decomposition on those matrices, and we establish the corresponding first-order perturbation bounds. We develop a novel iterative Gauss-Newton algorithm for joint matrix Schur decomposition, which minimizes a nonconvex objective over the manifold of orthogonal matrices, and which is guaranteed to converge to a global optimum under certain conditions. We empirically demonstrate that our algorithm is faster and at least as accurate and robust than state-of-the-art algorithms for this problem.

UAI Conference 2015 Conference Paper

Stable Spectral Learning Based on Schur Decomposition

  • Nicolò Colombo
  • Nikos Vlassis

Spectral methods are a powerful tool for inferring the parameters of certain classes of probability distributions by means of standard eigenvalueeigenvector decompositions. Spectral algorithms can be orders of magnitude faster than loglikelihood based and related iterative methods, and, thanks to the uniqueness of the spectral decomposition, they enjoy global optimality guarantees. In practice, however, the applicability of spectral methods is limited due to their sensitivity to model misspecification, which can cause instability issues in the case of non-exact models. We present a new spectral approach that is based on the Schur triangularization of an observable matrix, and we carry out the corresponding theoretical analysis. Our main result is a bound on the estimation error that is shown to depend linearly on the condition number of the ground-truth conditional probability matrix and inversely on the eigengap of an observable matrix. Numerical experiments show that the proposed method is more stable, and performs better in general, than the classical spectral approach using direct matrix diagonalization.

AAMAS Conference 2008 Conference Paper

Exploiting Locality of Interaction in Factored Dec-POMDPs

  • Frans Oliehoek
  • Matthijs Spaan
  • Shimon Whiteson
  • Nikos Vlassis

Decentralized partially observable Markov decision processes (Dec-POMDPs) constitute an expressive framework for multiagent planning under uncertainty, but solving them is provably intractable. We demonstrate how their scalability can be improved by exploiting locality of interaction between agents in a factored representation. Factored Dec-POMDP representations have been proposed before, but only for Dec- POMDPs whose transition and observation models are fully independent. Such strong assumptions simplify the planning problem, but result in models with limited applicability. By contrast, we consider general factored Dec-POMDPs for which we analyze the model dependencies over space (locality of interaction) and time (horizon of the problem). We also present a formulation of decomposable value functions. Together, our results allow us to exploit the problem structure as well as heuristics in a single framework that is based on collaborative graphical Bayesian games (CGBGs). A preliminary experiment shows a speedup of two orders of magnitude.

ICAPS Conference 2008 Conference Paper

Multiagent Planning Under Uncertainty with Stochastic Communication Delays

  • Matthijs T. J. Spaan
  • Frans A. Oliehoek
  • Nikos Vlassis

We consider the problem of cooperative multiagent planning under uncertainty, formalized as a decentralized partially observable Markov decision process (Dec-POMDP). Unfortunately, in these models optimal planning is provably intractable. By communicating their local observations before they take actions, agents synchronize their knowledge of the environment, and the planning problem reduces to a centralized POMDP. As such, relying on communication significantly reduces the complexity of planning. In the real world however, such communication might fail temporarily. We present a step towards more realistic communication models for Dec-POMDPs by proposing a model that: (1) allows that communication might be delayed by one or more time steps, and (2) explicitly considers future probabilities of successful communication. For our model, we discuss how to efficiently compute an (approximate) value function and corresponding policies, and we demonstrate our theoretical results with encouraging experiments.

AAMAS Conference 2007 Conference Paper

Q-Value Functions for Decentralized POMDPs

  • Frans A. Oliehoek
  • Nikos Vlassis

Planning in single-agent models like MDPs and POMDPs can be carried out by resorting to Q-value functions: a (near-) optimal Q-value function is computed in a recursive manner by dynamic programming, and then a policy is extracted from this value function. In this paper we study whether similar Q-value functions can be defined in decentralized POMDP models (Dec-POMDPs), what the cost of computing such value functions is, and how policies can be extracted from such value functions. Using the framework of Bayesian games, we argue that searching for the optimal Q-value function may be as costly as exhaustive policy search. Then we analyze various approximate Q-value functions that allow efficient computation. Finally, we describe a family of algorithms for extracting policies from such Q-value functions.

NeurIPS Conference 2006 Conference Paper

Accelerated Variational Dirichlet Process Mixtures

  • Kenichi Kurihara
  • Max Welling
  • Nikos Vlassis

Dirichlet Process (DP) mixture models are promising candidates for clustering applications where the number of clusters is unknown a priori. Due to compu- tational considerations these models are unfortunately unsuitable for large scale data-mining applications. We propose a class of deterministic accelerated DP mixture models that can routinely handle millions of data-cases. The speedup is achieved by incorporating kd-trees into a variational Bayesian algorithm for DP mixtures in the stick-breaking representation, similar to that of Blei and Jordan (2005). Our algorithm differs in the use of kd-trees and in the way we handle truncation: we only assume that the variational distributions are fixed at their pri- ors after a certain level. Experiments show that speedups relative to the standard variational algorithm can be significant.

ICML Conference 2006 Conference Paper

An analytic solution to discrete Bayesian reinforcement learning

  • Pascal Poupart
  • Nikos Vlassis
  • Jesse Hoey
  • Kevin Regan 0001

Reinforcement learning (RL) was originally proposed as a framework to allow agents to learn in an online fashion as they interact with their environment. Existing RL algorithms come short of achieving this goal because the amount of exploration required is often too costly and/or too time consuming for online learning. As a result, RL is mostly used for offline learning in simulated environments. We propose a new algorithm, called BEETLE, for effective online learning that is computationally efficient while minimizing the amount of exploration. We take a Bayesian model-based approach, framing RL as a partially observable Markov decision process. Our two main contributions are the analytical derivation that the optimal value function is the upper envelope of a set of multivariate polynomials, and an efficient point-based value iteration algorithm that exploits this simple parameterization.

JMLR Journal 2006 Journal Article

Collaborative Multiagent Reinforcement Learning by Payoff Propagation

  • Jelle R. Kok
  • Nikos Vlassis

In this article we describe a set of scalable techniques for learning the behavior of a group of agents in a collaborative multiagent setting. As a basis we use the framework of coordination graphs of Guestrin, Koller, and Parr (2002a) which exploits the dependencies between agents to decompose the global payoff function into a sum of local terms. First, we deal with the single-state case and describe a payoff propagation algorithm that computes the individual actions that approximately maximize the global payoff function. The method can be viewed as the decision-making analogue of belief propagation in Bayesian networks. Second, we focus on learning the behavior of the agents in sequential decision-making tasks. We introduce different model-free reinforcement-learning techniques, unitedly called Sparse Cooperative Q -learning, which approximate the global action-value function based on the topology of a coordination graph, and perform updates using the contribution of the individual agents to the maximal global action value. The combined use of an edge-based decomposition of the action-value function and the payoff propagation algorithm for efficient action selection, result in an approach that scales only linearly in the problem size. We provide experimental evidence that our method outperforms related multiagent reinforcement-learning methods based on temporal differences. [abs] [ pdf ][ bib ] &copy JMLR 2006. ( edit, beta )

JMLR Journal 2006 Journal Article

Point-Based Value Iteration for Continuous POMDPs

  • Josep M. Porta
  • Nikos Vlassis
  • Matthijs T.J. Spaan
  • Pascal Poupart

We propose a novel approach to optimize Partially Observable Markov Decisions Processes (POMDPs) defined on continuous spaces. To date, most algorithms for model-based POMDPs are restricted to discrete states, actions, and observations, but many real-world problems such as, for instance, robot navigation, are naturally defined on continuous spaces. In this work, we demonstrate that the value function for continuous POMDPs is convex in the beliefs over continuous state spaces, and piecewise-linear convex for the particular case of discrete observations and actions but still continuous states. We also demonstrate that continuous Bellman backups are contracting and isotonic ensuring the monotonic convergence of value-iteration algorithms. Relying on those properties, we extend the algorithm, originally developed for discrete POMDPs, to work in continuous state spaces by representing the observation, transition, and reward models using Gaussian mixtures, and the beliefs using Gaussian mixtures or particle sets. With these representations, the integrals that appear in the Bellman backup can be computed in closed form and, therefore, the algorithm is computationally feasible. Finally, we further extend to deal with continuous action and observation sets by designing effective sampling approaches. [abs] [ pdf ][ bib ] &copy JMLR 2006. ( edit, beta )

ICRA Conference 2005 Conference Paper

Planning with Continuous Actions in Partially Observable Environments

  • Matthijs T. J. Spaan
  • Nikos Vlassis

We present a simple randomized POMDP al gorithm for planning with continuous actions in partially observable environments. Our algorithm operates on a set of reachable belief points, sampled by letting the robot interact randomly with the environment. We perform value iteration steps, ensuring that in each step the value of all sampled belief points is improved. The idea here is that by sampling actions from a continuous action space we can quickly improve the value of all belief points in the set. We demonstrate the viability of our algorithm on two sets of experiments: one involving an active localization task and one concerning robot navigation in a perceptually aliased of fice environment.

ICRA Conference 2004 Conference Paper

A Point-based POMDP Algorithm for Robot Planning

  • Matthijs T. J. Spaan
  • Nikos Vlassis

We present an approximate POMDP solution method for robot planning in partially observable environments. Our algorithm belongs to the family of point-based value iteration solution techniques for POMDP, in which planning is performed only on a sampled set of reachable belief points. We describe a simple, randomized procedure that performs value update steps that strictly improve the value of all belief points in each step. We demonstrate our algorithm on a robotic delivery task in an office environment and on several benchmark problems, for which we compute solutions that are very competitive to those of state-of-the-art methods in terms of speed and solution quality.

NeurIPS Conference 2004 Conference Paper

Newscast EM

  • Wojtek Kowalczyk
  • Nikos Vlassis

We propose a gossip-based distributed algorithm for Gaussian mixture learning, Newscast EM. The algorithm operates on network topologies where each node observes a local quantity and can communicate with other nodes in an arbitrary point-to-point fashion. The main difference between Newscast EM and the standard EM algorithm is that the M-step in our case is implemented in a decentralized manner: (random) pairs of nodes repeatedly exchange their local parameter estimates and com- bine them by (weighted) averaging. We provide theoretical evidence and demonstrate experimentally that, under this protocol, nodes converge ex- ponentially fast to the correct estimates in each M-step of the EM algo- rithm.

NeurIPS Conference 2003 Conference Paper

Non-linear CCA and PCA by Alignment of Local Models

  • Jakob Verbeek
  • Sam Roweis
  • Nikos Vlassis

We propose a non-linear Canonical Correlation Analysis (CCA) method which works by coordinating or aligning mixtures of linear models. In the same way that CCA extends the idea of PCA, our work extends re- cent methods for non-linear dimensionality reduction to the case where multiple embeddings of the same underlying low dimensional coordi- nates are observed, each lying on a different high dimensional manifold. We also show that a special case of our method, when applied to only a single manifold, reduces to the Laplacian Eigenmaps algorithm. As with previous alignment schemes, once the mixture models have been estimated, all of the parameters of our model can be estimated in closed form without local optima in the learning. Experimental results illustrate the viability of the approach as a non-linear extension of CCA.

ICRA Conference 2002 Conference Paper

Auxiliary Particle Filter Robot Localization from High-Dimensional Sensor Observations

  • Nikos Vlassis
  • Bas Terwijn
  • Ben J. A. Kröse

We apply the auxiliary particle filter algorithm of Pitt and Shephard (1999) to the problem of robot localization. To deal with the high-dimensional sensor observations (images) and an unknown observation model, we propose the use of an inverted nonparametric observation model computed by nearest neighbor conditional density estimation. We show that the proposed model can lead to a fully adapted optimal filter, and is able to successfully handle image occlusion and robot kidnap. The proposed algorithm is very simple to implement and exhibits a high degree of robustness in practice. We report experiments involving robot localization from omnidirectional vision in an indoor environment.

ICRA Conference 2001 Conference Paper

Edge-based Features from Omnidirectional Images for Robot Localization

  • Nikos Vlassis
  • Yoichi Motomura
  • Isao Hara
  • Hideki Asoh

We propose a method for extracting low-dimensional features from omnidirectional images to be used for robot localization and navigation. Edge detection is combined with thresholding to locate sharp edge pixels, the coordinates of which are fed into a Parzen density estimator (1962) to compute the edge spatial density. The use of the fast Fourier transform makes this density estimate feasible in real-time, while principal component analysis further drops the dimensionality of the resulting feature vector to a manageable number. We show experimental results from a Nomad XR4000 robot in an office environment.

ICRA Conference 2001 Conference Paper

Learning Task-relevant Features from Robot Data

  • Nikos Vlassis
  • Roland Bunschoten
  • Ben J. A. Kröse

Feature extraction from robot sensor data is a standard way to deal with the high dimensionality and redundancy of such data. In order to get optimal task-relevant features, PCA must be replaced by a supervised projection method. In this paper we extend our previously proposed supervised linear feature extraction method (2000) in two ways: 1) the projection matrix is optimized simultaneously over all columns under the constraint of orthonormality; and 2) a Jacobi parametrization of the matrix allows the use of unconstrained nonlinear optimization algorithms. The new algorithm is more efficient and many times faster than the old version. We show experimental results in extracting features from panoramic images of a mobile robot. The results compare favorably to the PCA solutions.

ICRA Conference 2000 Conference Paper

Supervised Linear Feature Extraction for Mobile Robot Localization

  • Nikos Vlassis
  • Yoichi Motomura
  • Ben J. A. Kröse

We are seeking linear projections of supervised high-dimensional robot observations and an appropriate environment model that optimize the robot localization task. We show that an appropriate risk function to minimize is the conditional entropy of the robot positions given the projected observations. We propose a method of iterative optimization through a probabilistic model based on kernel smoothing. To obtain good starting optimization solutions we use canonical correlation analysis. We apply our method on a real experiment involving a mobile robot equipped with an omnidirectional camera in an office setup.

IROS Conference 1999 Conference Paper

Robot environment modeling via principal component regression

  • Nikos Vlassis
  • Ben J. A. Kröse

A key issue in mobile robot applications involves building a map of the environment to be used by the robot for localization and path planning. We propose a framework for robot map building which is based on principal component regression, a statistical method for extracting low-dimensional dependencies between a set of input and target values. A supervised set of robot positions (inputs) and associated high-dimensional sensor measurements (targets) are assumed. A set of globally uncorrelated features of the original sensor measurements are obtained by applying principal component analysis on the target set. A parametrized model of the conditional density function of the sensor features given the robot positions is built based on an unbiased estimation procedure that fits interpolants for both the mean and the variance of each feature independently. The simulation results show that the average Bayesian localization error is an increasing function of the principal component index.

ICRA Conference 1998 Conference Paper

A Sensory Uncertainty Field Model for Unknown and Non-Stationary Mobile Robot Environments

  • Nikos Vlassis
  • Panayotis Tsanakas

A sensory uncertainty field (SUF) is a model of the localization uncertainty of a mobile robot. The value of the SUF at a specific robot configuration q expresses the expected uncertainty of the robot at q, as this would be measured by some localization procedure. Path planning over the SUF provides a way for better localization, and thus fewer failures, during navigation. In this paper we extend the original notion of a SUF to unknown and non-stationary environments. We propose a self-organizing neural network model that is capable of building and maintaining an estimation of the SUF while the robot moves around its free space, based on some dynamic localization information, e. g. , Kalman filtering. The attractive feature of our algorithm is its capability of handling both unknown and dynamic, i. e. , non-stationary, environments. We present a method for polygonal approximation of the resulting SUF by using the Delaunay triangulation.

IROS Conference 1998 Conference Paper

Dynamic sensory probabilistic maps for mobile robot localization

  • Nikos Vlassis
  • George K. Papakonstantinou
  • Panayotis Tsanakas

In order to localize itself a mobile robot tries to match its sensory information at any instant against a prior environment model, the map. A probabilistic map can be regarded as a model that stores at each robot configuration q the probability density function of the sensor readings at q. By combining the knowledge of its current position, the new-coming sensory information, and the probabilistic map the robot is capable of improving its prior position estimate. In this paper we propose a novel sensor model and a method for maintaining a probabilistic map in cases of dynamic environments. When the environment structure changes, the map must adapt to this change by modifying the sensor densities, at the respective configurations. We propose a combined algorithm for map update and robot localization.