Arrow Research search

Author name cluster

Simon Kasif

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
2 author rows

Possible papers

12

AIIM Journal 2010 Journal Article

Predicting malaria interactome classifications from time-course transcriptomic data along the intraerythrocytic developmental cycle

  • Antonina Mitrofanova
  • Samantha Kleinberg
  • Jane Carlton
  • Simon Kasif
  • Bud Mishra

Objective Even though a vaccine for malaria infections has been under intense study for many years, it has resisted several different lines of attack attempted by biologists. More than half of Plasmodium proteins still remain uncharacterized and therefore cannot be used in clinical trials. The task is further complicated by the metamorphic life-cycle of the parasite, which allows for rapid evolutionary changes and diversity among related strains, thus making precise targeting of the appropriate proteins for vaccination a technical challenge. We propose an automated method for predicting functions for the malaria parasite, which capitalizes on the importance of the intraerythrocytic developmental cycle data and expression changes during its five phases, as determined computationally by our segmentation algorithm. Materials and methods Our method combines temporal gene expression profiles with protein–protein interaction data, sequence similarity scores, and metabolic pathway information to produce a set of predicted protein functions that can be used as targets for vaccine development. We use a Bayesian approach, which assigns a probability of having (or not having) a particular function to each protein, given the various sources of evidence. In our method, each data source is represented by either a functional linkage graph or a categorical feature vector. Results and conclusions The methods are tested on Plasmodium falciparum, the species responsible for the deadliest malaria infections. The algorithm was able to assign meaningful functions to 628 out of 1439 previously unannotated proteins, which are first-choice candidates for experimental vaccine research. We conclude that analyzing time-course gene expression profiles in separate phases leads to much higher prediction accuracy when compared with Pearson correlation coefficients computed across the time course as a whole. Additionally, we demonstrate that temporal expression profiles alone are able to improve the predictive power of the integrated data.

FOCS Conference 2002 Conference Paper

Learning a Hidden Matching

  • Noga Alon
  • Richard Beigel
  • Simon Kasif
  • Steven Rudich
  • Benny Sudakov

We consider the problem of learning a matching (i. e. , a graph in which all vertices have degree 0 or 1) in a model where the only allowed operation is to query whether a set of vertices induces an edge. This is motivated by a problem that arises in molecular biology. In the deterministic nonadaptive setting, we prove a ( 1/2 +o(1))(n/2) upper bound and a nearly matching 0. 32(n/2) lower bound for the minimum possible number of queries. In contrast, if we allow randomness then we obtain (by a randomized, nonadaptive algorithm) a much lower O(n log n) upper bound, which is best possible (even for randomized fully adaptive algorithms).

AIJ Journal 1998 Journal Article

A probabilistic framework for memory-based reasoning

  • Simon Kasif
  • Steven Salzberg
  • David Waltz
  • John Rachlin
  • David W. Aha

In this paper, we propose a probabilistic framework for memory-based reasoning (MBR). The framework allows us to clarify the technical merits and limitations of several recently published MBR methods and to design new variants. The proposed computational framework consists of three components: a specification language to define an adaptive notion of relevant context for a query; mechanisms for retrieving this context; and local learning procedures that are used to induce the desired action from this context. We primarily focus on actions in the form of a classification. Based on the framework we derive several analytical and empirical results that shed light on MBR algorithms. We introduce the notion of an MBR transform, and discuss its utility for learning algorithms. We also provide several perspectives on memory-based reasoning from a multi-disciplinary point of view.

UAI Conference 1995 Conference Paper

Logarithmic-Time Updates and Queries in Probabilistic Networks

  • Arthur L. Delcher
  • Adam J. Grove
  • Simon Kasif
  • Judea Pearl

In this paper we propose a dynamic data structure that supports efficient algorithms for updating and querying singly connected Bayesian networks (causal trees and polytrees). In the conventional algorithms, new evidence in absorbed in time O(1) and queries are processed in time O(N), where N is the size of the network. We propose a practical algorithm which, after a preprocessing phase, allows us to answer queries in time O(log N) at the expense of O(logn N) time per evidence absorption. The usefulness of sub-linear processing time manifests itself in applications requiring (near) real-time response over large probabilistic databases.

AIJ Journal 1994 Journal Article

Local consistency in parallel constraint satisfaction networks

  • Simon Kasif
  • Arthur L. Delcher

In this paper we present several basic techniques for achieving parallel execution of constraint networks. The major result supported by our investigations is that the parallel complexity of constraint networks is critically dependent on subtle properties of the network that do not influence its sequential complexity.

AIJ Journal 1990 Journal Article

On the parallel complexity of discrete relaxation in constraint satisfaction networks

  • Simon Kasif

Constraint satisfaction networks have been shown to be a very useful tool for knowledge representation in Artificial Intelligence applications. These networks often utilize local constraint propagation techniques to achieve local consistency (consistent labeling in vision). Such methods have been used extensively in the context of image understanding and interpretation, as well as planning, natural language analysis and truth maintenance systems. In this paper we study the parallel complexity of discrete relaxation, one of the most commonly used constraint propagation techniques. Since the constraint propagation procedures such as discrete relaxation appear to operate locally, it has been previously believed that the relaxation approach for achieving local consistency has a natural parallel solution. Our analysis suggests that a parallel solution is unlikely to improve the known sequential solutions by much. Specifically, we prove that the problem solved by discrete relaxation (arc consistency) is log-space complete for P (the class of polynomial-time deterministic sequential algorithms). Intuitively, this implies that discrete relaxation is inherently sequential and it is unlikely that we can solve the polynomial-time version of the consistent labeling problem in logarithmic time by using only a polynomial number of processors. Some practical implications of our result are discussed. We also provide a two-way transformation between AND/OR graphs, propositional Horn satisfiability and local consistency in constraint networks that allows us to develop optimal linear-time algorithms for local consistency in constraint networks.

AAAI Conference 1986 Conference Paper

On the Parallel Complexity of Some Constraint Satisfaction Problems

  • Simon Kasif

Constraint satisfaction networks have been shown to be a very useful tool for knowledge representation in Artificial Intelligence applications. These networks often utilize local constraint propagation techniques to achieve global consistency (consistent labelling in vision). Such methods have been used extensively in the context of image understanding and interpretation, as well as planning, natural language analysis and commonsense reasoning. In this paper we study the parallel complexity of discrete relaxation, one of the most commonly used constraint satisfaction techniques. Since the constraint propagation procedures such as discrete relaxation appear to operate locally, it has been previously believed that the relaxation approach for achieving global consistency has a natural parallel solution. Our analysis suggests that a parallel solution is unlikely to improve by much the known sequential solutions. Specifically, we prove that the problem solved by discrete relaxation is log-space complete for P (the class of polynomial time deterministic sequential algorithms). Intuitively, this implies that discrete relaxation is inherently sequential and it is unlikely that we can solve the polynomial time version of the consistent labelling problem in logarithmic time by using only a polynomial number of processors. Some practical implications of our result are discussed.