Author name cluster

Vinith Misra

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.
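
The grouping rule above amounts to nothing more than a normalization key. A hypothetical Python sketch (the function name and normalization are assumptions, not Arrow's actual code):

    def cluster_key(author_name: str) -> str:
        # Bucket papers by case-insensitive exact name match only;
        # no further identity disambiguation is performed.
        return author_name.strip().casefold()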

2 papers
1 author row

Possible papers (2)

AAAI 2020 Conference Paper

Simplify-Then-Translate: Automatic Preprocessing for Black-Box Translation

  • Sneha Mehta
  • Bahareh Azarnoush
  • Boris Chen
  • Avneesh Saluja
  • Vinith Misra
  • Ballav Bihani
  • Ritwik Kumar

Black-box machine translation systems have proven incredibly useful for a variety of applications, yet by design they are hard to adapt, tune to a specific domain, or build on top of. In this work, we introduce a method to improve such systems via automatic pre-processing (APP) using sentence simplification. We first propose a method to automatically generate a large in-domain paraphrase corpus through back-translation with a black-box MT system; this corpus is used to train a paraphrase model that “simplifies” the original sentence into a form more conducive to translation. The model is then used to preprocess source sentences of multiple low-resource language pairs. We show that this preprocessing leads to better translation performance than translating the non-preprocessed source sentences. We further perform a side-by-side human evaluation to verify that translations of the simplified sentences are better than those of the originals. Finally, we provide guidance on which language pairs are best suited for generating the simplification-model corpora: we investigate the relationship between the ease of translation of a language pair (as measured by BLEU) and the quality of the simplification model trained on back-translations of that pair (as measured by SARI), and tie this into the downstream task of low-resource translation.
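
The back-translation recipe described here can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the paper's implementation: `translate` is a hypothetical stand-in for any black-box MT API, and the pivot language is an arbitrary choice.

    from typing import Callable, Iterable, List, Tuple

    def build_paraphrase_corpus(
        sentences: Iterable[str],
        translate: Callable[[str, str, str], str],  # (text, src_lang, tgt_lang) -> text
        pivot_lang: str = "fr",
    ) -> List[Tuple[str, str]]:
        """Round-trip each sentence through the black-box MT system.

        Back-translations tend to be simpler, more MT-friendly
        paraphrases, so the (original, back-translation) pairs can
        train a "simplification" model for preprocessing.
        """
        pairs = []
        for original in sentences:
            pivot = translate(original, "en", pivot_lang)    # forward pass
            simplified = translate(pivot, pivot_lang, "en")  # back-translation
            pairs.append((original, simplified))
        return pairs

At inference time it is the trained paraphrase model, not the round trip itself, that rewrites each source sentence before it is sent to the black-box system.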

AAAI 2018 Conference Paper

Bernoulli Embeddings for Graphs

  • Vinith Misra
  • Sumit Bhatia

Just as semantic hashing (Salakhutdinov and Hinton 2009) can accelerate information retrieval, binary-valued embeddings can significantly reduce latency in the retrieval of graphical data. We introduce a simple but effective model for learning such binary vectors for nodes in a graph. By treating the embedding bits as independent coin flips of varying bias, we can apply continuous optimization techniques to an approximation of the expected loss. Embeddings optimized in this fashion consistently outperform the quantization of both spectral graph embeddings and various learned real-valued embeddings, on both ranking and pre-ranking tasks across a variety of datasets.
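
The “coin flip” relaxation has a compact concrete form. The numpy sketch below is illustrative, under assumptions of ours rather than the paper's exact objective: each bit's bias is the sigmoid of a learnable logit, and the margin-based loss on expected Hamming distance is one plausible choice.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def expected_hamming(p_u, p_v):
        # For independent Bernoulli bits, E[bit_u XOR bit_v] is
        # p_u(1 - p_v) + p_v(1 - p_u); sum over the dimensions.
        return np.sum(p_u * (1.0 - p_v) + p_v * (1.0 - p_u))

    rng = np.random.default_rng(0)
    n_nodes, n_bits = 100, 16
    logits = rng.normal(size=(n_nodes, n_bits))  # learnable parameters

    def pairwise_loss(edges, non_edges, margin=4.0):
        # Pull linked nodes together and push unlinked nodes at least
        # `margin` expected bit-flips apart (hinge on expected distance).
        p = sigmoid(logits)
        pull = sum(expected_hamming(p[u], p[v]) for u, v in edges)
        push = sum(max(0.0, margin - expected_hamming(p[u], p[v]))
                   for u, v in non_edges)
        return pull + push

    # After optimizing `logits` (e.g., by SGD on the loss above),
    # round each bias to obtain the binary codes used for retrieval.
    codes = (sigmoid(logits) > 0.5).astype(np.uint8)

Because the expected loss is differentiable in the logits, standard gradient methods apply even though the final codes are discrete, which is the point of the relaxation.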