Author name cluster

Hongsong Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers

2 author rows

ECAI Conference 2020 Conference Paper

Behavior Based Dynamic Summarization on Product Aspects via Reinforcement Neighbour Selection

Zheng Gao 0001
Lujun Zhao
Heng Huang
Hongsong Li
Changlong Sun
Luo Si
Xiaozhong Liu 0001

Dynamic summarization on product aspects, as a newly proposed topic, is an important task in E-commerce for tracking and understanding the nature of products. This can benefit both customers and sellers in different downstream tasks, such as explainable recommendations. However, most existing research works focus on analyzing static product reviews but miss dynamic sentiment changes. In this paper, we propose an innovative multi-task model to sample neighbour products whose information is simultaneously utilized to generate product summarization. In detail, a reinforcement learning approach selects neighbour products from a group of seed products by considering their pairwise similarities calculated from user behaviors. Meanwhile, a generative model helps to summarize product aspects via product descriptive phrases and selected neighbour products’ sentimental phrases. To the best of our knowledge, this is the first work that studies dynamic product summarization leveraging user behaviors instead of self-reviews. This means that the proposed approach can naturally address the cold-start scenario where few recent product reviews are available. Extensive experiments are conducted with real-world reviews plus behavior data to validate the proposed method against several strong alternatives.


AAAI Conference 2020 Conference Paper

Cross-Lingual Low-Resource Set-to-Description Retrieval for Global E-Commerce

Juntao Li
Chang Liu
Jian Wang
Lidong Bing
Hongsong Li
Xiaozhong Liu
Dongyan Zhao
Rui Yan

With the prosperity of cross-border e-commerce, there is an urgent demand for intelligent approaches that assist e-commerce sellers in offering local products to consumers all over the world. In this paper, we explore a new task of cross-lingual information retrieval, i.e., cross-lingual set-to-description retrieval in cross-border e-commerce, which involves matching product attribute sets in the source language with persuasive product descriptions in the target language. We manually collect a new and high-quality paired dataset, where each pair contains an unordered product attribute set in the source language and an informative product description in the target language. As the dataset construction process is both time-consuming and costly, the new dataset comprises only 13.5k pairs, which is a low-resource setting and can be viewed as a challenging testbed for model development and evaluation in cross-border e-commerce. To tackle this cross-lingual set-to-description retrieval task, we propose a novel cross-lingual matching network (CLMN) with the enhancement of context-dependent cross-lingual mapping upon the pre-trained monolingual BERT representations. Experimental results indicate that our proposed CLMN yields impressive results on the challenging task and that the context-dependent cross-lingual mapping on BERT yields noticeable improvement over the pre-trained multi-lingual BERT model.


IJCAI Conference 2011 Conference Paper

Short Text Conceptualization Using a Probabilistic Knowledgebase

Yangqiu Song
Haixun Wang
Zhongyuan Wang
Hongsong Li
Weizhu Chen

Most text mining tasks, such as clustering, are dominated by statistical approaches that treat text as a bag of words. Semantics in the text is largely ignored in the mining process, and the mining results are often not easily interpretable. One particular challenge faced by such approaches is short text understanding, as short text lacks enough content from which a statistical conclusion can be drawn. For example, traditional topic analysis methods consider topic segments with tens of hundreds of words. Latent topic modeling, such as latent Dirichlet allocation, also requires sufficient words to infer document topic distribution. We enhance machine learning algorithms by first giving the machine a probabilistic knowledgebase that contains as big, rich, and consistent concepts (of worldly facts) as those in our mental world. Then a Bayesian inference mechanism is developed to conceptualize words and short text. We conducted comprehensive tests of our method on conceptualizing sets of text terms, as well as clustering Twitter messages (tweets), which are typically approximately ten words long. Compared to latent semantic topic modeling and four other kinds of methods that use WordNet, Freebase and Wikipedia (category links and explicit semantic analysis), we show significant improvements in terms of tweet clustering accuracy.
