Arrow Research search

Author name cluster

Minghai Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers
1 author row

Possible papers

2

AAAI Conference 2024 Short Paper

Flow-Event Autoencoder: Event Stream Object Recognition Dataset Generation with Arbitrary High Temporal Resolution

  • Minghai Chen

Event camera has unique advantages in high temporal resolution and dynamic range and has shown potentials in several computer vision tasks. However, due to the novelty of this hardware, there’s a lack of large benchmark DVS event-stream datasets, including datasets for object recognition. In this work, we proposed an encoder-decoder method to augment event stream dataset from image and optical flow with arbitrary temporal resolution for object recognition task. We believe this proposed method can be generalized well in augmenting event stream vision data for object recognition and will help advance the development of event vision paradigm.

AAAI Conference 2017 Conference Paper

Reference Based LSTM for Image Captioning

  • Minghai Chen
  • Guiguang Ding
  • Sicheng Zhao
  • Hui Chen
  • Qiang Liu
  • Jungong Han

Image captioning is an important problem in artificial intelligence, related to both computer vision and natural language processing. There are two main problems in existing methods: in the training phase, it is difficult to find which parts of the captions are more essential to the image; in the caption generation phase, the objects or the scenes are sometimes misrecognized. In this paper, we consider the training images as the references and propose a Reference based Long Short Term Memory (R-LSTM) model, aiming to solve these two problems in one goal. When training the model, we assign different weights to different words, which enables the network to better learn the key information of the captions. When generating a caption, the consensus score is utilized to exploit the reference information of neighbor images, which might fix the misrecognition and make the descriptions more natural-sounding. The proposed R-LSTM model outperforms the state-of-the-art approaches on the benchmark dataset MS COCO and obtains top 2 position on 11 of the 14 metrics on the online test server.