Author name cluster

Michael Wick

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers

1 author row

AAAI Conference 2019 Conference Paper

Gradient-Based Inference for Networks with Output Constraints

Jay Yoon Lee
Sanket Vaibhav Mehta
Michael Wick
Jean-Baptiste Tristan
Jaime Carbonell

Practitioners apply neural networks to increasingly complex problems in natural language processing, such as syntactic parsing and semantic role labeling, that have rich output structures. Many such structured-prediction problems require deterministic constraints on the output values; for example, in sequence-to-sequence syntactic parsing, we require that the sequential outputs encode valid trees. While hidden units might capture such properties, the network is not always able to learn such constraints from the training data alone, and practitioners must then resort to post-processing. In this paper, we present an inference method for neural networks that enforces deterministic constraints on outputs without performing rule-based post-processing or expensive discrete search. Instead, in the spirit of gradient-based training, we enforce constraints with gradient-based inference (GBI): for each input at test time, we nudge continuous model weights until the network’s unconstrained inference procedure generates an output that satisfies the constraints. We study the efficacy of GBI on three tasks with hard constraints: semantic role labeling, syntactic parsing, and sequence transduction. In each case, the algorithm not only satisfies constraints, but improves accuracy, even when the underlying network is state-of-the-art.

NeurIPS Conference 2019 Conference Paper

Unlocking Fairness: a Trade-off Revisited

Michael Wick
Swetasudha Panda
Jean-Baptiste Tristan

The prevailing wisdom is that a model's fairness and its accuracy are in tension with one another. However, there is a pernicious modeling-evaluating dualism bedeviling fair machine learning in which phenomena such as label bias are appropriately acknowledged as a source of unfairness when designing fair models, only to be tacitly abandoned when evaluating them. We investigate fairness and accuracy, but this time under a variety of controlled conditions in which we vary the amount and type of bias. We find, under reasonable assumptions, that the tension between fairness and accuracy is illusory, and vanishes as soon as we account for these phenomena during evaluation. Moreover, our results are consistent with an opposing conclusion: fairness and accuracy are sometimes in accord. This raises the question: might there be a way to harness fairness to improve accuracy after all? Since most notions of fairness are with respect to the model's predictions and not the ground truth labels, this provides an opportunity to see if we can improve accuracy by harnessing appropriate notions of fairness over large quantities of unlabeled data with techniques like posterior regularization and generalized expectation. Indeed, we find that semi-supervision improves not only fairness but also accuracy, and has advantages over existing in-processing methods that succumb to selection bias on the training set.

AAAI Conference 2016 Conference Paper

Minimally-Constrained Multilingual Embeddings via Artificial Code-Switching

Michael Wick
Pallika Kanani
Adam Pocock

We present a method that consumes a large corpus of multilingual text and produces a single, unified word embedding in which the word vectors generalize across languages. In contrast to current approaches that require language identification, our method is agnostic about the languages with which the documents in the corpus are expressed, and does not rely on parallel corpora to constrain the spaces. Instead, we utilize a small set of human-provided word translations, which are often freely and readily available. We can encode such word translations as hard constraints in the model’s objective functions; however, we find that we can more naturally constrain the space by allowing words in one language to borrow distributional statistics from context words in another language. We achieve this via a process we term artificial code-switching. As the name suggests, we induce code-switching so that words across multiple languages appear in contexts together. Not only do embedding models trained on code-switched data learn common cross-lingual structure, the common structure allows an NLP model trained in a source language to generalize to multiple target languages (achieving up to 80% of the accuracy of models trained with target-language data).
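The idea of inducing code-switching from a small translation dictionary can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `code_switch`, the `switch_prob` parameter, and the toy dictionary are all assumptions made for illustration, and the abstract does not specify how replacements are chosen.

```python
import random

def code_switch(tokens, translations, switch_prob, rng):
    """Return a copy of `tokens` in which each token that has known
    translations is swapped for one of them with probability `switch_prob`.

    `translations` maps a word to a list of translations (hypothetical
    format). Tokens without an entry are always kept unchanged.
    """
    switched = []
    for token in tokens:
        options = translations.get(token)
        if options and rng.random() < switch_prob:
            # Borrow a word from another language so that it shares
            # context words with the original sentence.
            switched.append(rng.choice(options))
        else:
            switched.append(token)
    return switched

# Toy dictionary and sentence, for illustration only.
translations = {"cat": ["gato", "chat"], "house": ["casa", "maison"]}
sentence = "the cat sat near the house".split()

rng = random.Random(0)  # seeded so repeated runs give the same output
print(code_switch(sentence, translations, 0.5, rng))
```

Under these assumptions, running such a pass over a corpus before training an ordinary embedding model would place translated words in shared contexts, which is the effect the abstract describes.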

NeurIPS Conference 2011 Conference Paper

Query-Aware MCMC

Michael Wick
Andrew McCallum

Traditional approaches to probabilistic inference such as loopy belief propagation and Gibbs sampling typically compute marginals for all the unobserved variables in a graphical model. However, in many real-world applications the user's interests are focused on a subset of the variables, specified by a query. In this case it would be wasteful to uniformly sample, say, one million variables when the query concerns only ten. In this paper we propose a query-specific approach to MCMC that accounts for the query variables and their generalized mutual information with neighboring variables in order to achieve higher computational efficiency. Surprisingly, there has been almost no previous work on query-aware MCMC. We demonstrate the success of our approach with positive experimental results on a wide range of graphical models.
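The contrast between uniform and query-focused variable selection can be sketched as follows. This is only an illustration, not the paper's method: it swaps in graph distance to the nearest query variable as a crude stand-in for the mutual-information measure the abstract mentions, and the function name `query_aware_weights` and the `decay` parameter are hypothetical.

```python
from collections import deque

def query_aware_weights(neighbors, query, decay=0.5):
    """Return a selection distribution over variables that favors
    variables close to the query.

    `neighbors` maps every variable to its adjacent variables in the
    graphical model. A variable at graph distance d from the nearest
    query variable gets unnormalized score decay**d (an assumed proxy,
    not mutual information). Variables unreachable from the query get 0.
    """
    # Breadth-first search outward from all query variables at once.
    dist = {v: 0 for v in query}
    frontier = deque(dist)
    while frontier:
        v = frontier.popleft()
        for u in neighbors[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                frontier.append(u)

    scores = {v: decay ** dist[v] if v in dist else 0.0 for v in neighbors}
    total = sum(scores.values())
    return {v: s / total for v, s in scores.items()}

# Toy chain model 0 - 1 - 2 - 3 with a query on variable 0.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
weights = query_aware_weights(chain, query=[0])
print(weights)
```

A sampler could then draw the next variable to resample with `random.choices(list(weights), weights=list(weights.values()))` instead of uniformly, so that most updates are spent near the query.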
