Arrow Research search

Author name cluster

Eric Tzeng

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
2 author rows

Possible papers

ICML Conference 2018 Conference Paper

CyCADA: Cycle-Consistent Adversarial Domain Adaptation

  • Judy Hoffman
  • Eric Tzeng
  • Taesung Park
  • Jun-Yan Zhu
  • Phillip Isola
  • Kate Saenko
  • Alexei A. Efros
  • Trevor Darrell

Domain adaptation is critical for success in new, unseen environments. Adversarial adaptation models have shown tremendous progress towards adapting to new environments by focusing either on discovering domain invariant representations or by mapping between unpaired image domains. While feature space methods are difficult to interpret and sometimes fail to capture pixel-level and low-level domain shifts, image space methods sometimes fail to incorporate high level semantic knowledge relevant for the end task. We propose a model which adapts between domains using both generative image space alignment and latent representation space alignment. Our approach, Cycle-Consistent Adversarial Domain Adaptation (CyCADA), guides transfer between domains according to a specific discriminatively trained task and avoids divergence by enforcing consistency of the relevant semantics before and after adaptation. We evaluate our method on a variety of visual recognition and prediction settings, including digit classification and semantic segmentation of road scenes, advancing state-of-the-art performance for unsupervised adaptation from synthetic to real world driving domains.
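The cycle-consistency idea named in the abstract above is commonly written as a reconstruction penalty on images translated to the other domain and back. The abstract gives no formula, so the following is only a sketch in the usual form for unpaired image translation. The symbols $G_{S\to T}$ and $G_{T\to S}$ (the two image-space generators) and $X_S$ and $X_T$ (the source and target image sets) are assumed notation, introduced here for illustration.

```latex
\mathcal{L}_{\mathrm{cyc}}(G_{S\to T}, G_{T\to S})
  = \mathbb{E}_{x_s \sim X_S}\!\left[ \lVert G_{T\to S}(G_{S\to T}(x_s)) - x_s \rVert_1 \right]
  + \mathbb{E}_{x_t \sim X_T}\!\left[ \lVert G_{S\to T}(G_{T\to S}(x_t)) - x_t \rVert_1 \right]
```

Per the abstract, CyCADA combines this kind of term with a task-driven constraint that keeps the relevant semantics consistent before and after adaptation. That constraint is not sketched here.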

JMLR Journal 2016 Journal Article

Large Scale Visual Recognition through Adaptation using Joint Representation and Multiple Instance Learning

  • Judy Hoffman
  • Deepak Pathak
  • Eric Tzeng
  • Jonathan Long
  • Sergio Guadarrama
  • Trevor Darrell
  • Kate Saenko

A major barrier towards scaling visual recognition systems is the difficulty of obtaining labeled images for large numbers of categories. Recently, deep convolutional neural networks (CNNs) trained using 1.2M+ labeled images have emerged as clear winners on object classification benchmarks. Unfortunately, only a small fraction of those labels are available with bounding box localization for training the detection task and even fewer pixel level annotations are available for semantic segmentation. It is much cheaper and easier to collect large quantities of image-level labels from search engines than it is to collect scene-centric images with precisely localized labels. We develop methods for learning large scale recognition models which exploit joint training over both weak (image-level) and strong (bounding box) labels and which transfer learned perceptual representations from strongly-labeled auxiliary tasks. We provide a novel formulation of a joint multiple instance learning method that includes examples from object-centric data with image-level labels when available, and also performs domain transfer learning to improve the underlying detector representation. We then show how to use our large scale detectors to produce pixel level annotations. Using our method, we produce a >7.6K category detector and release code and models at lsda.berkeleyvision.org.

ICRA Conference 2015 Conference Paper

Beyond lowest-warping cost action selection in trajectory transfer

  • Dylan Hadfield-Menell
  • Alex X. Lee
  • Chelsea Finn
  • Eric Tzeng
  • Sandy Han Huang
  • Pieter Abbeel

We consider the problem of learning from demonstrations to manipulate deformable objects. Recent work [1], [2], [3] has shown promising results that enable robotic manipulation of deformable objects through learning from demonstrations. Their approach is able to generalize from a single demonstration to new test situations, and suggests a nearest neighbor approach to select a demonstration to adapt to a given test situation. Such a nearest neighbor approach, however, ignores important aspects of the problem: brittleness (versus robustness) of demonstrations when generalized through this process, and the extent to which a demonstration makes progress towards a goal. In this paper, we frame the problem of selecting which demonstration to transfer as an options Markov decision process (MDP). We present max-margin Q-function estimation: an approach to learn a Q-function from expert demonstrations. Our learned policies account for variability in robustness of demonstrations and the sequential nature of our tasks. We developed two knot-tying benchmarks to experimentally validate the effectiveness of our proposed approach. The selection strategy described in [2] achieves success rates of 70% and 54%, respectively. Our approach performs significantly better, with success rates of 88% and 76%, respectively.

ICML Conference 2014 Conference Paper

DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition

  • Jeff Donahue
  • Yangqing Jia
  • Oriol Vinyals
  • Judy Hoffman
  • Ning Zhang 0014
  • Eric Tzeng
  • Trevor Darrell

We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks. Our generic tasks may differ significantly from the originally trained tasks and there may be insufficient labeled or unlabeled data to conventionally train or adapt a deep architecture to the new tasks. We investigate and visualize the semantic clustering of deep convolutional features with respect to a variety of such tasks, including scene recognition, domain adaptation, and fine-grained recognition challenges. We compare the efficacy of relying on various network levels to define a fixed feature, and report novel results that significantly outperform the state-of-the-art on several important vision challenges. We are releasing DeCAF, an open-source implementation of these deep convolutional activation features, along with all associated network parameters to enable vision researchers to be able to conduct experimentation with deep representations across a range of visual concept learning paradigms.

NeurIPS Conference 2014 Conference Paper

LSDA: Large Scale Detection through Adaptation

  • Judy Hoffman
  • Sergio Guadarrama
  • Eric Tzeng
  • Ronghang Hu
  • Jeff Donahue
  • Ross Girshick
  • Trevor Darrell
  • Kate Saenko

A major challenge in scaling object detection is the difficulty of obtaining labeled images for large numbers of categories. Recently, deep convolutional neural networks (CNNs) have emerged as clear winners on object classification benchmarks, in part due to training with 1.2M+ labeled classification images. Unfortunately, only a small fraction of those labels are available for the detection task. It is much cheaper and easier to collect large quantities of image-level labels from search engines than it is to collect detection data and label it with precise bounding boxes. In this paper, we propose Large Scale Detection through Adaptation (LSDA), an algorithm which learns the difference between the two tasks and transfers this knowledge to classifiers for categories without bounding box annotated data, turning them into detectors. Our method has the potential to enable detection for the tens of thousands of categories that lack bounding box annotations, yet have plenty of classification data. Evaluation on the ImageNet LSVRC-2013 detection challenge demonstrates the efficacy of our approach. This algorithm enables us to produce a >7.6K detector by using available classification data from leaf nodes in the ImageNet tree. We additionally demonstrate how to modify our architecture to produce a fast detector (running at 2 fps for the 7.6K detector). Models and software are available at
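The core idea in this abstract, learning the classifier-to-detector difference on categories that have both kinds of labels and transferring it to categories that only have classification data, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names, the use of flat weight vectors, cosine similarity, and averaging over the `k` most similar categories are not taken from the abstract and are not the released LSDA code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def adapt_classifier(target_clf, clf_weights, det_weights, k=2):
    """Hypothetical helper: shift a classifier weight vector toward a detector.

    clf_weights / det_weights map category name -> weight vector. Categories
    present in both dicts play the role of those with bounding box data.
    """
    # Categories for which the classifier-to-detector difference is observable.
    shared = [c for c in clf_weights if c in det_weights]

    # Pick the k shared categories whose classifiers look most like the target.
    nearest = sorted(
        shared,
        key=lambda c: cosine(target_clf, clf_weights[c]),
        reverse=True,
    )[:k]

    # Average the observed (detector - classifier) offsets of those neighbours.
    offset = [0.0] * len(target_clf)
    for c in nearest:
        for i in range(len(target_clf)):
            offset[i] += (det_weights[c][i] - clf_weights[c][i]) / len(nearest)

    # Apply the averaged offset; with no neighbours the classifier is unchanged.
    return [w + o for w, o in zip(target_clf, offset)]
```

Calling `adapt_classifier(new_category_weights, clf_weights, det_weights)` returns a shifted copy of the weights for a category that has no entry in `det_weights`. Restricting the average to similar categories is a design choice of this sketch, made on the assumption that related categories change in similar ways between the two tasks.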

IROS Conference 2014 Conference Paper

Unifying scene registration and trajectory optimization for learning from demonstrations with application to manipulation of deformable objects

  • Alex X. Lee
  • Sandy Han Huang
  • Dylan Hadfield-Menell
  • Eric Tzeng
  • Pieter Abbeel

Recent work [1], [2] has shown promising results in enabling robotic manipulation of deformable objects through learning from demonstrations. Their method computes a registration from training scene to test scene, and then applies an extrapolation of this registration to the training scene gripper motion to obtain the gripper motion for the test scene. The warping cost of scene-to-scene registrations is used to determine the nearest neighbor from a set of training demonstrations. Then once the gripper motion has been generalized to the test situation, they apply trajectory optimization [3] to plan for the robot motions that will track the predicted gripper motions. In many situations, however, the predicted gripper motions cannot be followed perfectly due to, for example, joint limits or obstacles. In this case the past work finds a path that minimizes deviation from the predicted gripper trajectory as measured by its Euclidean distance for position and angular distance for orientation. Measuring the error this way during the motion planning phase, however, ignores the underlying structure of the problem, namely the idea that rigid registrations are preferred to generalize from training scene to test scene. Deviating from the gripper trajectory predicted by the extrapolated registration effectively changes the warp induced by the registration in the part of the space where the gripper trajectories are. The main contribution of this paper is an algorithm that considers this effective final warp as the criterion to optimize for in a unified optimization that simultaneously considers the scene-to-scene warping and the robot trajectory (which were separated into two sequential steps by the past work). This results in an approach that adjusts to infeasibility in a way that adapts directly to the geometry of the scene and minimizes the introduction of additional warping cost. In addition, this paper proposes to learn the motion of the gripper pads, whereas past work considered the motion of a coordinate frame attached to the gripper as a whole. This enables learning more precise grasping motions. Our experiments, which consider the task of knot tying, show that both unified optimization and explicit consideration of gripper pad motion result in improved performance.