Arrow Research search

Author name cluster

Randy Goebel

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

20 papers
2 author rows

Possible papers (20)

ECAI Conference 2023 · Conference Paper

From Intermediate Representations to Explanations: Exploring Hierarchical Structures in NLP

  • Housam Khalifa Bashier Babiker
  • Mi-Young Kim
  • Randy Goebel

Interpretation methods for learned models used in natural language processing (NLP) applications usually provide support for local (specific) explanations, such as quantifying the contribution of each word to the predicted class, but they typically ignore potential interactions among the word tokens. Unlike currently popular methods, we propose a deep model that uses feature attribution and dependency identification to learn interpretable representations that support the creation of hierarchical explanations. Hierarchical explanations also provide a basis for visualizing how words and phrases are combined at different levels of abstraction, which enables end-users to better understand the prediction process of a deep network. Our study uses multiple well-known datasets to demonstrate the effectiveness of our approach, and provides both automatic and human evaluation.

AAAI Conference 2023 · Conference Paper

The Sufficiency of Off-Policyness and Soft Clipping: PPO Is Still Insufficient according to an Off-Policy Measure

  • Xing Chen
  • Dongcui Diao
  • Hechang Chen
  • Hengshuai Yao
  • Haiyin Piao
  • Zhixiao Sun
  • Zhiwei Yang
  • Randy Goebel

The popular Proximal Policy Optimization (PPO) algorithm approximates the solution in a clipped policy space. Do better policies exist outside of this space? By using a novel surrogate objective that employs the sigmoid function (which provides an interesting way of exploration), we found that the answer is "YES", and that the better policies are in fact located very far from the clipped space. We show that PPO is insufficient in "off-policyness", according to an off-policy metric called DEON. Our algorithm explores in a much larger policy space than PPO, and it maximizes the Conservative Policy Iteration (CPI) objective better than PPO during training. To the best of our knowledge, all current PPO methods have the clipping operation and optimize in the clipped policy space. Our method is the first of this kind, which advances the understanding of CPI optimization and policy gradient methods. Code is available at https://github.com/raincchio/P3O.
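The clipping restriction the abstract critiques is easy to see numerically. The sketch below implements only the standard PPO clipped surrogate (the paper's sigmoid-based objective is not reproduced here): for a positive advantage, every ratio at or beyond the clip boundary yields the same objective value, so optimization has no incentive to move the policy further off-policy.

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    # Standard PPO clipped surrogate: min(r*A, clip(r, 1-eps, 1+eps)*A).
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)

# With advantage > 0, the surrogate is flat for every ratio >= 1 + eps:
# all three calls below return the same value, illustrating the "clipped
# policy space" the abstract refers to.
for r in (1.2, 1.5, 3.0):
    print(r, ppo_clip_objective(r, advantage=1.0))
```

This is a minimal illustration of the clipping operation itself, not of the authors' P3O method or the DEON metric.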

TCS Journal 2020 · Journal Article

Approximation algorithms for the three-machine proportionate mixed shop scheduling

  • Longcheng Liu
  • Yong Chen
  • Jianming Dong
  • Randy Goebel
  • Guohui Lin
  • Yue Luo
  • Guanqun Ni
  • Bing Su

A mixed shop is a manufacturing infrastructure designed to process a mixture of a set of flow-shop jobs and a set of open-shop jobs. Mixed shops are in general much more complex to schedule than flow-shops and open-shops, and have been studied since the 1980s. We consider the three-machine proportionate mixed shop problem, denoted M3 | prpt | Cmax, in which “proportionate” means that each job has equal processing times on all three machines. Koulamas and Kyparisis (2015) [6] showed that the problem is solvable in polynomial time in some very special cases; for the non-solvable case, they proposed a 5/3-approximation algorithm. In this paper, we first present an improved 4/3-approximation algorithm and show that this ratio of 4/3 is asymptotically tight; when the largest job is a flow-shop job, we then present a fully polynomial-time approximation scheme (FPTAS). On the negative side, while the F3 | prpt | Cmax problem is polynomial-time solvable, we show an interesting hardness result: adding one open-shop job to the job set makes the problem NP-hard if this open-shop job is larger than any flow-shop job. We are able to design an FPTAS for this special case too.

TCS Journal 2020 · Journal Article

Open-shop scheduling for unit jobs under precedence constraints

  • Yong Chen
  • Randy Goebel
  • Guohui Lin
  • Bing Su
  • An Zhang

We study open-shop scheduling for unit jobs under precedence constraints, where if one job precedes another job then it has to be finished before the other job can start to be processed. For the three-machine open-shop to minimize the makespan, we first present a simple 5/3-approximation algorithm based on a partition of the job set into agreeable layers using the natural layered representation of the precedence graph, which is directed acyclic. We then show a greedy algorithm to reduce the number of singleton-job layers, resulting in an improved partition, which leads to a 4/3-approximation algorithm. Both approximation algorithms apply to the general m-machine open-shops too.
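The "natural layered representation" of the precedence graph that the abstract builds on can be sketched as a longest-path layering of the DAG. The snippet below (with an assumed edge-list input format) shows only this partition step, not the 5/3- or 4/3-approximation algorithms themselves.

```python
from collections import defaultdict, deque

def layer_jobs(num_jobs, precedences):
    # Layer of a job = length of the longest precedence chain ending at it.
    # Jobs within one layer are pairwise unordered, so they can be
    # scheduled together as an "agreeable" block.
    succs = defaultdict(list)
    indeg = [0] * num_jobs
    for a, b in precedences:  # job a must finish before job b starts
        succs[a].append(b)
        indeg[b] += 1
    layer = [0] * num_jobs
    queue = deque(j for j in range(num_jobs) if indeg[j] == 0)
    while queue:  # Kahn-style topological sweep over the DAG
        j = queue.popleft()
        for k in succs[j]:
            layer[k] = max(layer[k], layer[j] + 1)
            indeg[k] -= 1
            if indeg[k] == 0:
                queue.append(k)
    layers = defaultdict(list)
    for j, depth in enumerate(layer):
        layers[depth].append(j)
    return [layers[d] for d in sorted(layers)]
```

For example, `layer_jobs(5, [(0, 2), (1, 2), (2, 3), (1, 4)])` partitions jobs 0 and 1 into the first layer, 2 and 4 into the second, and 3 into the third.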

AAAI Conference 2019 · Short Paper

A Multi-Task Learning Framework for Abstractive Text Summarization

  • Yao Lu
  • Linqing Liu
  • Zhile Jiang
  • Min Yang
  • Randy Goebel

We propose a Multi-task learning approach for Abstractive Text Summarization (MATS), motivated by the fact that humans have no difficulty performing such a task because they have capabilities in multiple domains. Specifically, MATS consists of three components: (i) a text categorization model that learns rich category-specific text representations using a bi-LSTM encoder; (ii) a syntax labeling model that learns to improve the syntax-aware LSTM decoder; and (iii) an abstractive text summarization model that shares its encoder and decoder with the text categorization and the syntax labeling tasks, respectively. In particular, the abstractive text summarization model benefits significantly from the additional text categorization and syntax knowledge. Our experimental results show that MATS outperforms the competitors.

TCS Journal 2018 · Journal Article

An approximation scheme for minimizing the makespan of the parallel identical multi-stage flow-shops

  • Weitian Tong
  • Eiji Miyano
  • Randy Goebel
  • Guohui Lin

In the parallel k-stage flow-shops problem, we are given m identical k-stage flow-shops and a set of jobs. Each job can be processed by any one of the flow-shops, but switching between flow-shops is not allowed. The objective is to minimize the makespan, which is the finishing time of the last job. This problem generalizes the classical parallel identical machine scheduling (where k = 1) and the classical flow-shop scheduling (where m = 1) problems, and thus it is NP-hard. We present a polynomial-time approximation scheme (PTAS) for the problem when m and k are fixed constants. The key technique is to partition the jobs into big jobs and small jobs, enumerate over all feasible schedules for the big jobs, and handle the small jobs by solving a linear program and employing a “sliding” method. Such a technique has been used in the design of PTAS for several flow-shop scheduling variants. Our main contributions are the non-trivial application of this technique and a valid answer to an open question in the literature.

TCS Journal 2016 · Journal Article

Smoothed heights of tries and patricia tries

  • Weitian Tong
  • Randy Goebel
  • Guohui Lin

Tries and patricia tries are two popular data structures for storing strings. Let H_n denote the height of the trie (respectively, the patricia trie) on a set of n strings. Under the uniform distribution model on the strings, it is well known that H_n / log n → 2 for tries and H_n / log n → 1 for patricia tries as n approaches infinity. Nevertheless, in the worst case, the height of a trie can be unbounded and the height of a patricia trie is in Θ(n). To better understand the practical performance of both tries and patricia tries, we investigate these two classical data structures in a smoothed analysis model. Given a set S = {s_1, s_2, …, s_n} of n binary strings, we perturb the set by adding i.i.d. Bernoulli random noise to each bit of every string. We show that the resulting smoothed heights of the trie and the patricia trie are both in Θ(log n).
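For distinct strings, a trie must branch one level below the deepest shared prefix, so its height is one plus the maximum pairwise longest-common-prefix length; that makes the smoothing model easy to simulate. The sketch below implements the Bernoulli bit-flip perturbation from the abstract as an illustration, not the paper's analysis.

```python
import random

def lcp(a, b):
    # Length of the longest common prefix of two strings.
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def trie_height(strings):
    # For distinct strings, trie height = 1 + max pairwise LCP length.
    return 1 + max(lcp(a, b) for i, a in enumerate(strings)
                   for b in strings[i + 1:])

def perturb(strings, p, rng):
    # Smoothed-analysis model: flip each bit independently with prob. p.
    flip = {"0": "1", "1": "0"}
    return ["".join(flip[b] if rng.random() < p else b for b in s)
            for s in strings]

# An adversarial set sharing a long prefix forces a tall trie; Bernoulli
# noise typically breaks the shared prefix up early.
bad = ["0" * 19 + "0", "0" * 19 + "1", "0" * 18 + "10"]
print(trie_height(bad))  # 20: two of the strings agree on 19 leading bits
print(trie_height(perturb(bad, 0.1, random.Random(0))))
```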

TIST Journal 2015 · Journal Article

Recognition of Patient-Related Named Entities in Noisy Tele-Health Texts

  • Mi-Young Kim
  • Ying Xu
  • Osmar R. Zaiane
  • Randy Goebel

We explore methods for effectively extracting information from clinical narratives that are captured in a public health consulting phone service called HealthLink. Our research investigates the application of state-of-the-art natural language processing and machine learning to clinical narratives to extract information of interest. The currently available data consist of dialogues constructed by nurses while consulting patients by phone. Since the data are interviews transcribed by nurses during phone conversations, they include a significant volume and variety of noise. When we extract the patient-related information from the noisy data, we have to remove or correct at least two kinds of noise: explicit noise, which includes spelling errors, unfinished sentences, omission of sentence delimiters, and variants of terms, and implicit noise, which includes non-patient information and untrustworthy information from patients. To filter explicit noise, we propose our own biomedical term detection/normalization method: it resolves misspellings, term variations, and arbitrary abbreviation of terms by nurses. In detecting temporal terms, temperature, and other types of named entities (which show patients’ personal information such as age and sex), we propose a bootstrapping-based pattern learning process to detect a variety of arbitrary variations of named entities. To address implicit noise, we propose a dependency path-based filtering method. The result of our denoising is the extraction of normalized patient information, and we visualize the named entities by constructing a graph that shows the relations between them. The objective of this knowledge discovery task is to identify associations between biomedical terms and to clearly expose the trends of patients’ symptoms and concerns; the experimental results show that we achieve reasonable performance with our noise reduction methods.

TCS Journal 2014 · Journal Article

Approximating the maximum multiple RNA interaction problem

  • Weitian Tong
  • Randy Goebel
  • Tian Liu
  • Guohui Lin

RNA interactions are fundamental in many cellular processes, where two or more RNA molecules can be involved. Multiple RNA interactions are also believed to be much more complex than pairwise interactions. Recently, multiple RNA interaction prediction has been formulated as a maximization problem. Here we extensively examine this optimization problem under several biologically meaningful interaction models. We present a polynomial time algorithm for the problem when the order of interacting RNAs is known and pseudoknot interactions are allowed; for the general problem without an assumed RNA order, we prove the NP-hardness for both variants (allowing and disallowing pseudoknot interactions), and present a constant ratio approximation algorithm for each of them.

TCS Journal 2014 · Journal Article

Approximating the minimum independent dominating set in perturbed graphs

  • Weitian Tong
  • Randy Goebel
  • Guohui Lin

We investigate the minimum independent dominating set in perturbed graphs g(G, p) of an input graph G = (V, E), obtained by negating the existence of edges independently with a probability p > 0. The minimum independent dominating set (MIDS) problem does not admit a polynomial running time approximation algorithm with worst-case performance ratio n^{1−ϵ} for any ϵ > 0. We prove that the size of the minimum independent dominating set in g(G, p), denoted i(g(G, p)), is asymptotically almost surely in Θ(log |V|). Furthermore, we show that the probability of i(g(G, p)) ⩾ 4|V|p is no more than 2^{−|V|}, and present a simple greedy algorithm of proven worst-case performance ratio 4|V|p and with polynomial expected running time.
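Any maximal independent set is automatically an independent dominating set, which is what a greedy construction exploits. The sketch below shows a generic max-degree greedy step of this kind (the exact selection rule is an assumption; the paper's analysis concerns its behaviour on perturbed graphs g(G, p), which is not reproduced here).

```python
def greedy_ids(vertices, edges):
    # Build a maximal independent set greedily: repeatedly pick a
    # highest-degree remaining vertex, then discard it and its
    # neighbours.  Every vertex is either chosen or adjacent to a chosen
    # vertex, so the result both is independent and dominates the graph.
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining = set(vertices)
    chosen = []
    while remaining:
        v = max(remaining, key=lambda x: len(adj[x] & remaining))
        chosen.append(v)
        remaining -= adj[v] | {v}
    return chosen
```

On a star graph the greedy step picks the centre and stops; on a three-vertex path it picks the middle vertex, in both cases returning a minimum independent dominating set.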

TCS Journal 2014 · Journal Article

On the approximability of the exemplar adjacency number problem for genomes with gene repetitions

  • Zhixiang Chen
  • Bin Fu
  • Randy Goebel
  • Guohui Lin
  • Weitian Tong
  • Jinhui Xu
  • Boting Yang
  • Zhiyu Zhao

In this paper, we apply a measure, the exemplar adjacency number, which complements and extends the well-studied breakpoint distance between two permutations, to measure the similarity between two genomes (or, in general, between any two sequences drawn from the same alphabet). For two genomes G and H drawn from the same set of n gene families and containing gene repetitions, we consider the corresponding Exemplar Adjacency Number problem (EAN), in which we delete duplicated genes from G and H such that the resultant exemplar genomes (permutations) G and H have the maximum adjacency number. We obtain the following results. First, we prove that the one-sided 2-repetitive EAN problem, i.e., when one of G and H is given exemplar and each gene occurs in the other genome at most twice, can be linearly reduced from the Maximum Independent Set problem. This implies that EAN does not admit any O(n^{0.5−ϵ})-approximation algorithm, for any ϵ > 0, unless P = NP. This hardness result also implies that EAN, parameterized by the optimal solution value, is W[1]-hard. Secondly, we show that the two-sided 2-repetitive EAN problem has an O(n^{0.5})-approximation algorithm, which is tight up to a constant factor.

TCS Journal 2011 · Journal Article

Size-constrained tree partitioning: Approximating the multicast k-tree routing problem

  • Zhipeng Cai
  • Randy Goebel
  • Guohui Lin

In the multicast k-tree routing problem, a data copy is sent from the source node to at most k destination nodes in every transmission. The goal is to minimize the total cost of sending data to all destination nodes, which is measured as the sum of the costs of all routing trees. This problem was formulated out of optical networking and has applications in general multicasting. Several approximation algorithms, with increasing performance, have been proposed in the last several years; the most recent ones rely heavily on a tree partitioning technique. In this paper, we present a further improved approximation algorithm along this line. The algorithm has a worst-case performance ratio of (5/4)ρ + 3/2, where ρ denotes the best approximation ratio for the Steiner minimum tree problem. The proofs of the technical routing lemmas also provide some insights into why such a performance ratio could be the best possible one can get using this tree partitioning technique.

IJCAI Conference 2009 · Conference Paper

  • Shane Bergsma
  • Dekang Lin
  • Randy Goebel

Web-scale data has been used in a diverse range of language research. Most of this research has used web counts for only short, fixed spans of context. We present a unified view of using web counts for lexical disambiguation. Unlike previous approaches, our supervised and unsupervised systems combine information from multiple and overlapping segments of context. On the tasks of preposition selection and context-sensitive spelling correction, the supervised system reduces disambiguation error by 20-24% over the current state-of-the-art.

IJCAI Conference 2007 · Conference Paper

  • Abhaya C. Nayak
  • Randy Goebel
  • Mehmet A. Orgun

The importance of contraction for belief change notwithstanding, the literature on iterated belief change has by and large centered on the issue of iterated belief revision, ignoring the problem of iterated belief contraction. In this paper we examine iterated belief contraction in a principled way, starting with Qualified Insertion, a proposal by Hans Rott. We show that a judicious combination of Qualified Insertion with a well-known Factoring principle leads to what is arguably a pivotal principle of iterated belief contraction. We show that this principle is satisfied by the account of iterated belief contraction modelled by Lexicographic State Contraction, and outline its connection with Lexicographic Revision, Darwiche-Pearl's account of revision, as well as Spohn's Ordinal ranking theory. Keywords: Belief Change, Information State Change, Iterated Belief Contraction.

UAI Conference 1990 · Conference Paper

Integrating probabilistic, taxonomic and causal knowledge in abductive diagnosis

  • Dekang Lin
  • Randy Goebel

We propose an abductive diagnosis theory that integrates probabilistic, causal and taxonomic knowledge. Probabilistic knowledge allows us to select the most likely explanation; causal knowledge allows us to make reasonable independence assumptions; taxonomic knowledge allows causation to be modeled at different levels of detail, and allows observations to be described at different levels of precision. Unlike most other approaches, where a causal explanation is a hypothesis that one or more causative events occurred, we define an explanation of a set of observations to be an occurrence of a chain of causation events. These causation events constitute a scenario where all the observations are true. We show that the probabilities of the scenarios can be computed from the conditional probabilities of the causation events. Abductive reasoning is inherently complex even if only modest expressive power is allowed. However, our abduction algorithm is exponential only in the number of observations to be explained, and is polynomial in the size of the knowledge base. This contrasts with many other abduction procedures that are exponential in the size of the knowledge base.

NMR Workshop 1989 · Conference Paper

Non-Monotonic Reasoning in Temporal Domains: The Knowledge Independence Problem

  • Scott D. Goodwin
  • Randy Goebel

Much interest has been focused on nonmonotonic reasoning in temporal domains since Hanks and McDermott discovered that intuitive temporal representations give rise to the multiple extension problem. Here we consider nonmonotonic reasoning in temporal domains from the perspective of the Theorist hypothetical reasoning framework. We show how this framework can be applied to temporal reasoning in a simple and intuitive way to solve many of the problems posed in the recent literature, such as the Yale Shooting problem, Kautz's Vanishing Car problem, Haugh's Assassin problem, and Haugh's Robot problem. The basis of our solution to these problems is the characterization of the notion that the past is independent of the future (temporal independence) and the provision of two additional modes of reasoning: conditional explanation and prediction. The problem of representing and reasoning about temporal independence is an instance of a more general problem which we call the knowledge independence problem. In this paper, we provide a preliminary definition of the knowledge independence problem; we leave to future work further development of the obvious connections with statistical independence. Using our preliminary definition, we show how to represent and reason about temporal independence and how this solves many temporal reasoning problems.