Arrow Research search
Back to AAAI

AAAI 2023

Diversity Maximization in the Presence of Outliers

Conference Paper AAAI Technical Track on Search and Optimization Artificial Intelligence

Abstract

Given a set X of n points in a metric space, the problem of diversity maximization is to extract a set S of k points from X so that the diversity of S is maximized. This problem is essential in AI-related fields, such as web search, databases, recommender systems, and data mining. Although there have been extensive studies of this problem, these studies assume that X is clean. This usually does not hold, because real-world datasets usually contain outliers. The state-of-the-art algorithm for the diversity maximization problem is based on furthest point retrieval, which is too sensitive to outliers. We therefore address the problem of diversity maximization with outliers and propose two algorithms with performance guarantee. The first algorithm runs in O((k+z)n) time, guarantees 1/2-approximation, and returns no outliers, where z is the number of outliers. The second algorithm runs in O(kz) time (which is independent of n), guarantees 1/6(1+epsilon)-approximation, and returns no outliers with constant probability. We conduct experiments on real datasets to demonstrate the effectiveness and efficiency of our algorithms.

Authors

Keywords

  • CSO: Constraint Optimization
  • DMKM: Scalability, Parallel & Distributed Systems
  • SO: Other Foundations of Search & Optimization

Context

Venue
AAAI Conference on Artificial Intelligence
Archive span
1980-2026
Indexed papers
28718
Paper id
670911301426508866