Arrow Research search

Author name cluster

Mark Dredze

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
2 author rows

Possible papers


TMLR Journal 2025 Journal Article

Can Optimization Trajectories Explain Multi-Task Transfer?

  • David Mueller
  • Mark Dredze
  • Nicholas Andrews

Despite the widespread adoption of multi-task training in deep learning, little is understood about how multi-task learning (MTL) affects generalization. Prior work has conjectured that the negative effects of MTL are due to optimization challenges that arise during training, and many optimization methods have been proposed to improve multi-task performance. However, recent work has shown that these methods fail to consistently improve multi-task generalization. In this work, we seek to improve our understanding of these failures by empirically studying how MTL impacts the optimization of tasks, and whether this impact can explain the effects of MTL on generalization. We show that MTL results in a generalization gap (a gap in generalization at comparable training loss) between single-task and multi-task trajectories early into training. However, we find that factors of the optimization trajectory previously proposed to explain generalization gaps in single-task settings cannot explain the generalization gaps between single-task and multi-task models. Moreover, we show that the amount of gradient conflict between tasks is correlated with negative effects on task optimization, but is not predictive of generalization. Our work sheds light on the underlying causes for failures in MTL and, importantly, raises questions about the role of general-purpose multi-task optimization algorithms.
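The "gradient conflict" quantity discussed in this abstract is commonly measured as the cosine similarity between per-task gradients, with negative values indicating conflict. A minimal illustrative sketch (function name and example values are hypothetical, not the paper's code):

```python
import numpy as np

def gradient_conflict(g_a, g_b):
    """Cosine similarity between two task gradients.

    Negative values indicate conflicting tasks: stepping along one
    task's gradient locally increases the other task's loss.
    """
    g_a = np.asarray(g_a, dtype=float)
    g_b = np.asarray(g_b, dtype=float)
    return float(g_a @ g_b / (np.linalg.norm(g_a) * np.linalg.norm(g_b)))

# Gradients pointing in nearly opposite directions conflict strongly:
print(gradient_conflict([1.0, 0.0], [-1.0, 0.1]))  # close to -1
```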

AAAI Conference 2016 Conference Paper

Collective Supervision of Topic Models for Predicting Surveys with Social Media

  • Adrian Benton
  • Michael Paul
  • Braden Hancock
  • Mark Dredze

This paper considers survey prediction from social media. We use topic models to correlate social media messages with survey outcomes and to provide an interpretable representation of the data. Rather than rely on fully unsupervised topic models, we use existing aggregated survey data to inform the inferred topics, a class of topic model supervision referred to as collective supervision. We introduce and explore a variety of topic model variants and provide an empirical analysis, with conclusions of the most effective models for this task.

IS Journal 2014 Journal Article

Social Media Analytics for Smart Health

  • Ahmed Abbasi
  • Donald Adjeroh
  • Mark Dredze
  • Michael J. Paul
  • Fatemeh Mariam Zahedi
  • Huimin Zhao
  • Nitin Walia
  • Hemant Jain

This special section of "Trends & Controversies" focuses on social media analytics for smart health. The introduction, called "Social Media Analytics for Smart Health," is provided by Ahmed Abbasi and Donald Adjeroh. Then Mark Dredze and Michael J. Paul have written "Natural Language Processing for Health and Social Media." Next, Fatemeh "Mariam" Zahedi, Huimin Zhao, Nitin Walia, Hemant Jain, Patrick Sanvanson, and Reza Shaker discuss "Treating Patients' Real Avatars in the Virtual Medical Office." The fourth selection, by Marco D. Huesch, is "Social Media versus Privacy and Credibility." The last piece, written by Donald Adjeroh, Richard Beal, Ahmed Abbasi, Wanhong Zheng, Marie Abate, and Arun Ross, is "Signal Fusion for Social Media Analysis of Adverse Drug Events."

JMLR Journal 2012 Journal Article

Confidence-Weighted Linear Classification for Text Categorization

  • Koby Crammer
  • Mark Dredze
  • Fernando Pereira

Confidence-weighted online learning is a generalization of margin-based learning of linear classifiers in which the margin constraint is replaced by a probabilistic constraint based on a distribution over classifier weights that is updated online as examples are observed. The distribution captures a notion of confidence on classifier weights, and in some cases it can also be interpreted as replacing a single learning rate by adaptive per-weight rates. Confidence-weighted learning was motivated by the statistical properties of natural-language classification tasks, where most of the informative features are relatively rare. We investigate several versions of confidence-weighted learning that use a Gaussian distribution over weight vectors, updated at each observed example to achieve high probability of correct classification for the example. Empirical evaluation on a range of text-categorization tasks shows that our algorithms improve over other state-of-the-art online and batch methods, learn faster in the online setting, and lead to better classifier combination for a type of distributed training commonly used in cloud computing.
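The probabilistic constraint described above has a standard closed form: with weights drawn as $w \sim \mathcal{N}(\mu, \Sigma)$, the signed margin $y\,(w \cdot x)$ is itself Gaussian with mean $y\,(\mu \cdot x)$ and variance $x^\top \Sigma x$, so the chance constraint reduces to a deterministic margin condition:

```latex
\Pr_{w \sim \mathcal{N}(\mu,\Sigma)}\bigl[\, y\,(w \cdot x) \ge 0 \,\bigr] \ge \eta
\quad\Longleftrightarrow\quad
y\,(\mu \cdot x) \ge \phi \sqrt{x^\top \Sigma x},
\qquad \phi = \Phi^{-1}(\eta),
```

where $\Phi$ is the standard normal CDF. This is the reduction used throughout the confidence-weighted learning line of work; the right-hand side makes clear how the covariance acts as adaptive per-weight confidence.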

NeurIPS Conference 2012 Conference Paper

Factorial LDA: Sparse Multi-Dimensional Text Models

  • Michael Paul
  • Mark Dredze

Multi-dimensional latent variable models can capture the many latent factors in a text corpus, such as topic, author perspective and sentiment. We introduce factorial LDA, a multi-dimensional latent variable model in which a document is influenced by K different factors, and each word token depends on a K-dimensional vector of latent variables. Our model incorporates structured word priors and learns a sparse product of factors. Experiments on research abstracts show that our model can learn latent factors such as research topic, scientific discipline, and focus (e.g., methods vs. applications). Our modeling improvements reduce test perplexity and improve human interpretability of the discovered factors.

IS Journal 2012 Journal Article

How Social Media Will Change Public Health

  • Mark Dredze

Recent work in machine learning and natural language processing has studied the health content of tweets and demonstrated the potential for extracting useful public health information from their aggregation. This article examines the types of health topics discussed on Twitter, and how tweets can both augment existing public health capabilities and enable new ones. The author also discusses key challenges that researchers must address to deliver high-quality tools to the public health community.

IJCAI Conference 2009 Conference Paper

Suggesting Email View Filters for Triage and Search

  • Mark Dredze
  • Bill N. Schilit
  • Peter Norvig

Growing email volumes cause flooded inboxes and swollen email archives, making search and new email processing difficult. While emails have rich metadata, such as recipients and folders, suitable for creating filtered views, it is often difficult to choose appropriate filters for new inbox messages without first examining messages. In this work, we consider a system that automatically suggests relevant view filters to the user for the currently viewed messages. We propose several ranking algorithms for suggesting useful filters. Our results suggest that such systems help users quickly filter groups of inbox messages and find messages more easily during search.

NeurIPS Conference 2009 Conference Paper

Adaptive Regularization of Weight Vectors

  • Koby Crammer
  • Alex Kulesza
  • Mark Dredze

We present AROW, a new online learning algorithm that combines several properties of successful online learning algorithms: large margin training, confidence weighting, and the capacity to handle non-separable data. AROW performs adaptive regularization of the prediction function upon seeing each new instance, allowing it to perform especially well in the presence of label noise. We derive a mistake bound, similar in form to the second order perceptron bound, which does not assume separability. We also relate our algorithm to recent confidence-weighted online learning techniques and empirically show that AROW achieves state-of-the-art performance and notable robustness in the case of non-separable data.
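The AROW update has a simple closed form; a minimal sketch with a diagonal covariance (a common simplification of the full-matrix version in the paper; class and parameter names here are illustrative, not the authors' code, and `r` is the regularization hyperparameter):

```python
import numpy as np

class AROW:
    """Adaptive Regularization of Weights, diagonal-covariance sketch."""

    def __init__(self, dim, r=1.0):
        self.mu = np.zeros(dim)    # mean weight vector
        self.sigma = np.ones(dim)  # diagonal of the covariance (per-weight confidence)
        self.r = r                 # regularization strength

    def predict(self, x):
        return 1 if self.mu @ x >= 0 else -1

    def update(self, x, y):
        """One AROW step on example (x, y), with y in {-1, +1}."""
        margin = y * (self.mu @ x)
        if margin >= 1.0:          # no hinge loss: no update
            return
        confidence = x @ (self.sigma * x)        # x^T Sigma x
        beta = 1.0 / (confidence + self.r)
        alpha = (1.0 - margin) * beta            # hinge-loss-scaled step size
        self.mu += alpha * y * (self.sigma * x)  # move mean toward correct label
        self.sigma -= beta * (self.sigma * x) ** 2  # shrink variance on seen features
```

Features that are updated often get small variance and hence small future steps, which is the adaptive regularization the abstract refers to; the hinge gate keeps the update passive on confidently correct examples.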

ICML Conference 2008 Conference Paper

Confidence-weighted linear classification

  • Mark Dredze
  • Koby Crammer
  • Fernando C. N. Pereira

We introduce confidence-weighted linear classifiers, which add parameter confidence information to linear classifiers. Online learners in this setting update both classifier parameters and the estimate of their confidence. The particular online algorithms we study here maintain a Gaussian distribution over parameter vectors and update the mean and covariance of the distribution with each instance. Empirical evaluation on a range of NLP tasks shows that our algorithm improves over other state-of-the-art online and batch methods, learns faster in the online setting, and lends itself to better classifier combination after parallel training.

NeurIPS Conference 2008 Conference Paper

Exact Convex Confidence-Weighted Learning

  • Koby Crammer
  • Mark Dredze
  • Fernando Pereira

Confidence-weighted (CW) learning [6], an online learning method for linear classifiers, maintains a Gaussian distribution over weight vectors, with a covariance matrix that represents uncertainty about weights and correlations. Confidence constraints ensure that a weight vector drawn from the hypothesis distribution correctly classifies examples with a specified probability. Within this framework, we derive a new convex form of the constraint and analyze it in the mistake bound model. Empirical evaluation with both synthetic and text data shows our version of CW learning achieves lower cumulative and out-of-sample errors than commonly used first-order and second-order online methods.

AAAI Conference 2006 Conference Paper

Activity-Centric Email: A Machine Learning Approach

  • Nicholas Kushmerick
  • Mark Dredze

Our use of ordinary desktop applications (such as email, Web, calendars) is often a manifestation of the activities with which we are engaged. Planning a conference trip involves submitting travel expense forms and visiting airline and hotel sites. Renovating a kitchen involves sketches, product specifications, emails with the architect, and spreadsheets for tracking expenses. Every enterprise has (often implicit) processes for managing customer queries, requesting maintenance, hiring a new employee, purchasing equipment, and so on. Unfortunately, ordinary desktop applications do not know anything about these activities. Within an enterprise, many activities have been formalized into business workflows such as hiring or ordering equipment. However, the way people interact with these workflows is often through email and desktop applications. If these applications are not aware of the activity context, people bear the burden of organizing their information into activities, typically using crude techniques such as manual search, file directories, and email folders/threads. Email has emerged as the primary tool for people to communicate about their work and manage activities. Motivated by the importance of email in conducting activities, we have recently developed several machine learning algorithms for automatically discovering and tracking activities in email. We observe that activities come in many forms, from structured workflows to informal person-to-person communication. In this paper, we summarize our efforts to provide automated assistance with two types of activities: rigid structured activities, and unstructured conversational activities.