Author name cluster

Ron Kohavi

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers

2 author rows

ICML Conference 1998 Conference Paper

The Case against Accuracy Estimation for Comparing Induction Algorithms

Foster J. Provost
Tom Fawcett
Ron Kohavi

ICML Conference 1997 Conference Paper

Option Decision Trees with Majority Votes

Ron Kohavi
Clayton Kunz

AIJ Journal 1997 Journal Article

Wrappers for feature subset selection

Ron Kohavi
George H. John

In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider how the algorithm and the training set interact. We explore the relation between optimal feature subset selection and relevance. Our wrapper method searches for an optimal feature subset tailored to a particular algorithm and a domain. We study the strengths and weaknesses of the wrapper approach and show a series of improved designs. We compare the wrapper approach to induction without feature subset selection and to Relief, a filter approach to feature subset selection. Significant improvement in accuracy is achieved for some datasets for the two families of induction algorithms used: decision trees and Naive-Bayes.
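The wrapper idea in the abstract above, scoring each candidate feature subset by running the learning algorithm itself, can be illustrated with a minimal sketch. The search strategy (greedy forward selection), the learner (1-nearest-neighbour), and the estimator (leave-one-out accuracy) are assumptions chosen for brevity, not the paper's setup, and the names `loo_accuracy_1nn` and `wrapper_forward_selection` are hypothetical.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

Row = Sequence[float]


def loo_accuracy_1nn(X: List[Row], y: List[int], features: FrozenSet[int]) -> float:
    """Leave-one-out accuracy of 1-nearest-neighbour restricted to `features`."""
    if not features:
        return 0.0  # convention for this sketch: the empty subset scores zero
    correct = 0
    for i in range(len(X)):
        best_j, best_d = -1, float("inf")
        for j in range(len(X)):
            if i == j:
                continue
            # squared Euclidean distance over the selected features only
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)


def wrapper_forward_selection(
    X: List[Row],
    y: List[int],
    evaluate: Callable[[List[Row], List[int], FrozenSet[int]], float] = loo_accuracy_1nn,
) -> Tuple[FrozenSet[int], float]:
    """Greedily add the feature that most improves the learner's estimated accuracy."""
    n_features = len(X[0])
    selected: FrozenSet[int] = frozenset()
    best_score = evaluate(X, y, selected)
    while True:
        candidates = [
            (evaluate(X, y, selected | {f}), f)
            for f in range(n_features)
            if f not in selected
        ]
        if not candidates:
            break
        # highest score wins; ties go to the lower feature index
        score, f = max(candidates, key=lambda c: (c[0], -c[1]))
        if score <= best_score:
            break  # no candidate improves the estimate, so stop
        selected, best_score = selected | {f}, score
    return selected, best_score
```

Because `evaluate` is a parameter, the same search can wrap any learner, which is the point of tailoring the subset to a particular algorithm and training set.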

ICML Conference 1996 Conference Paper

Bias Plus Variance Decomposition for Zero-One Loss Functions

Ron Kohavi
David H. Wolpert

IJCAI Conference 1995 Conference Paper

A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection

Ron Kohavi

We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), ten-fold cross-validation may be better than the more expensive leave-one-out cross-validation. We report on a large-scale experiment, over half a million runs of C4.5 and a Naive-Bayes algorithm, to estimate the effects of different parameters on these algorithms on real-world datasets. For cross-validation, we vary the number of folds and whether the folds are stratified or not; for bootstrap, we vary the number of bootstrap samples. Our results indicate that for real-world datasets similar to ours, the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
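As an illustration of the stratified cross-validation the abstract recommends, here is a minimal sketch that splits instance indices into k folds with roughly equal class proportions and then estimates accuracy. The function names and the `train_and_predict` callback are hypothetical, and the split is deterministic (no shuffling) to keep the example short; in practice the instances would normally be shuffled first.

```python
from collections import defaultdict
from typing import Callable, Hashable, List, Sequence


def stratified_folds(labels: Sequence[Hashable], k: int = 10) -> List[List[int]]:
    """Split indices into k folds, dealing each class round-robin across folds."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    position = 0  # carried across classes so fold sizes stay balanced
    for label in by_class:
        for idx in by_class[label]:
            folds[position % k].append(idx)
            position += 1
    return folds


def cv_accuracy(
    X: Sequence,
    y: Sequence[Hashable],
    train_and_predict: Callable[[list, list, list], list],
    k: int = 10,
) -> float:
    """Estimate accuracy by training on k-1 folds and testing on the held-out fold."""
    correct = 0
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predictions = train_and_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        correct += sum(p == y[i] for p, i in zip(predictions, test_idx))
    return correct / len(y)
```

Setting `k` to the number of instances turns the same routine into leave-one-out cross-validation, the more expensive alternative the abstract compares against.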

ICML Conference 1995 Conference Paper

Automatic Parameter Selection by Minimizing Estimated Error

Ron Kohavi
George H. John

IJCAI Conference 1995 Conference Paper

Oblivious Decision Trees, Graphs, and Top-Down Pruning

Ron Kohavi
Chia-Hsin Li

We describe a supervised learning algorithm, EODG, that uses mutual information to build an oblivious decision tree. The tree is then converted to an Oblivious read-Once Decision Graph (OODG) by merging nodes at the same level of the tree. For domains that are appropriate for both decision trees and OODGs, performance is approximately the same as that of C4.5, but the number of nodes in the OODG is much smaller. The merging phase that converts the oblivious decision tree to an OODG provides a new way of dealing with the replication problem and a new pruning mechanism that works top down starting from the root. The pruning mechanism is well suited for finding symmetries and aids in recovering from splits on irrelevant features that may happen during the tree construction.
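An oblivious decision tree tests the same feature at every node of a given level, so choosing a level amounts to choosing one feature for the whole tree. The sketch below shows one way mutual information could drive that choice for discrete features. It is an assumption-laden illustration, not EODG itself: the abstract does not give the exact criterion, and `mutual_information` and `best_level_feature` are hypothetical names.

```python
from collections import Counter
from math import log2
from typing import Hashable, List, Sequence


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # sum over observed pairs of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items()
    )


def best_level_feature(
    rows: Sequence[Sequence[Hashable]],
    labels: Sequence[Hashable],
    chosen: List[int],
    candidates: List[int],
) -> int:
    """Pick the feature to test at the next level of an oblivious tree.

    Each candidate is scored by the mutual information between the label and
    the joint value of the already chosen features plus the candidate. Since
    the chosen features are fixed, this ranks candidates the same way as the
    conditional mutual information given the current levels.
    """
    def score(f: int) -> float:
        keyed = [tuple(r[g] for g in chosen) + (r[f],) for r in rows]
        return mutual_information(keyed, labels)

    return max(candidates, key=score)
```

For example, with `rows = [(0, 0), (0, 1), (1, 0), (1, 1)]` and `labels = [0, 0, 1, 1]`, feature 0 determines the label while feature 1 carries no information, so `best_level_feature(rows, labels, [], [0, 1])` selects feature 0.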

ICML Conference 1995 Conference Paper

Supervised and Unsupervised Discretization of Continuous Features

James Dougherty
Ron Kohavi
Mehran Sahami

AAAI Conference 1994 Conference Paper

Bottom-Up Induction of Oblivious Read-Once Decision Graphs: Strengths and Limitations

Ron Kohavi

We report improvements to HOODG, a supervised learning algorithm that induces concepts from labelled instances using oblivious, read-once decision graphs as the underlying hypothesis representation structure. While it is shown that the greedy approach to variable ordering is locally optimal, we also show an inherent limitation of all bottom-up induction algorithms, including HOODG, that construct such decision graphs bottom-up by minimizing the width of levels in the resulting graph. We report our empirical experiments that demonstrate the algorithm's generalization power.

ICML Conference 1994 Conference Paper

Irrelevant Features and the Subset Selection Problem

George H. John
Ron Kohavi
Karl Pfleger
