Arrow Research search

Author name cluster

Craig Knoblock

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
1 author row

Possible papers

IS Journal 2018 Journal Article

Investigative Knowledge Discovery for Combating Illicit Activities

  • Mayank Kejriwal
  • Pedro Szekely
  • Craig Knoblock

Developing scalable, semi-automatic approaches to derive insights from a domain-specific Web corpus is a longstanding research problem in the knowledge discovery community. The problem is particularly challenging in illicit fields, such as human trafficking, where traditional assumptions concerning information representation are frequently violated. In this article, we describe an end-to-end investigative knowledge discovery system for illicit Web domains. We built and evaluated a prototype, involving separate components for information extraction, semantic modeling and query execution, on a real-world human trafficking Web corpus containing 1.3 million pages, with promising results.

AAAI Conference 2018 Conference Paper

Load Scheduling of Simple Temporal Networks Under Dynamic Resource Pricing

  • T. K. Satish Kumar
  • Zhi Wang
  • Anoop Kumar
  • Craig Rogers
  • Craig Knoblock

We study load scheduling of simple temporal networks (STNs) under dynamic pricing of resources. We are given a set of processes and a set of simple temporal constraints between their execution times, i.e., an STN. Each process uses a certain amount of resource for execution. The unit price of the resource is a function of time, f(t). The goal is to find a schedule of a given STN that trades off makespan minimization against cost minimization within a user-specified suboptimality bound. We provide a polynomial-time algorithm for solving the load scheduling problem when f(t) is piecewise constant. This has important applications in many real-world domains including the smart home and smart grid domains. We then study the dependency of the unit price of the resource on time as well as the total demand at that time. This leads to a further characterization of tractable, NP-hard, and conjectured tractable cases.
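
To make the ingredients of this problem concrete, the following is a minimal sketch and not the paper's algorithm. It assumes an STN is given as constraints of the form t_j - t_i <= w, checks consistency with the standard Floyd-Warshall negative-cycle test, and computes what one process costs under a piecewise-constant f(t). The names `stn_consistent`, `price_cost`, `breakpoints`, and `prices` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from itertools import product

INF = float("inf")


def stn_consistent(n, constraints):
    """Return True if the STN over n time points has no negative cycle.

    constraints: iterable of (i, j, w), each meaning t_j - t_i <= w.
    Uses Floyd-Warshall on the distance graph (illustrative only).
    """
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j, w in constraints:
        d[i][j] = min(d[i][j], w)
    # product() varies its first element slowest, so k is the outer loop.
    for k, i, j in product(range(n), repeat=3):
        if d[i][k] + d[k][j] < d[i][j]:
            d[i][j] = d[i][k] + d[k][j]
    return all(d[i][i] >= 0 for i in range(n))


def price_cost(start, duration, rate, breakpoints, prices):
    """Cost of one process drawing `rate` units of resource per time unit
    over [start, start + duration), where f(t) = prices[k] on
    [breakpoints[k], breakpoints[k + 1]) and the last price extends forever.

    Assumes breakpoints is sorted and breakpoints[0] <= start.
    """
    end = start + duration
    cost = 0.0
    k = bisect_right(breakpoints, start) - 1  # price segment containing start
    t = start
    while t < end:
        seg_end = breakpoints[k + 1] if k + 1 < len(breakpoints) else INF
        step = min(end, seg_end) - t
        cost += rate * prices[k] * step
        t += step
        k += 1
    return cost


# Two time points with 3 <= t_1 - t_0 <= 10: consistent.
ok = stn_consistent(2, [(0, 1, 10), (1, 0, -3)])
# Requiring t_1 - t_0 <= 2 and t_1 - t_0 >= 3 at once: inconsistent.
bad = stn_consistent(2, [(0, 1, 2), (1, 0, -3)])
# Price 1.0 on [0, 4), 3.0 on [4, 8), 2.0 from 8 onward.
cost = price_cost(start=2, duration=4, rate=2, breakpoints=[0, 4, 8], prices=[1.0, 3.0, 2.0])
```

In this sketch, a scheduler would compare candidate start times that satisfy the STN by summing `price_cost` over all processes. How the makespan and cost objectives are traded off within the suboptimality bound is left to the paper.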

AAAI Conference 2011 Conference Paper

Using Conditional Random Fields to Exploit Token Structure and Labels for Accurate Semantic Annotation

  • Aman Goel
  • Craig Knoblock
  • Kristina Lerman

Automatic semantic annotation of structured data enables unsupervised integration of data from heterogeneous sources but is difficult to perform accurately due to the presence of many numeric fields and proper-noun fields that do not allow reference-based approaches and the absence of natural language text that prevents the use of language-based approaches. In addition, several of these semantic types have multiple heterogeneous representations, while sharing syntactic structure with other types. In this work, we propose a new approach to use conditional random fields (CRFs) to perform semantic annotation of structured data that takes advantage of the structure and labels of the tokens for higher accuracy of field labeling, while still allowing the use of exact inference techniques. We compare our approach with a linear-CRF based model that only labels fields and also with a regular-expression based approach.

AAAI Conference 1999 Short Paper

Active Learning for Hierarchical Wrapper Induction

  • Ion Muslea
  • Steve Minton
  • Craig Knoblock
  • University of Southern California

As an alternative to manually writing extraction rules, we created STALKER, which is a wrapper induction algorithm that learns high-accuracy extraction rules. The major novelty introduced by STALKER is the concept of hierarchical wrapper induction: the extraction of the relevant data is performed in a hierarchical manner based on the embedded catalog tree (ECT), which is a user-provided description of the information to be extracted.