Author name cluster

Weipeng Yan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers

2 author rows

ICLR Conference 2023 Conference Paper

DynaMS: Dynamic Margin Selection for Efficient Deep Learning

Jiaxing Wang
Yong Li
Jingwei Zhuo
Xupeng Shi
Weizhong Zhang
Lixing Gong
Tong Tao
Pengzhang Liu

The great success of deep learning is largely driven by training over-parameterized models on massive datasets. To avoid excessive computation, extracting and training only on the most informative subset is drawing increasing attention. Nevertheless, it is still an open question how to select such a subset on which the model trained generalizes on par with the full data. In this paper, we propose dynamic margin selection (DynaMS). DynaMS leverages the distance from candidate samples to the classification boundary to construct the subset, and the subset is dynamically updated during model training. We show that DynaMS converges with large probability, and for the first time show both in theory and practice that dynamically updating the subset can result in better generalization over previous works. To reduce the additional computation incurred by the selection, a light parameter sharing proxy (PSP) is designed. PSP is able to faithfully evaluate instances with respect to the current model, which is necessary for dynamic selection. Extensive analysis and experiments demonstrate the superiority of the proposed approach in data selection against many state-of-the-art counterparts on benchmark datasets.
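
As a rough illustration of the margin-based selection described in this abstract, the sketch below scores each sample by a logit gap and keeps the samples with the smallest gap. It is a minimal sketch under stated assumptions, not the authors' implementation: the logit gap is only a stand-in for the distance to the classification boundary, keeping the samples nearest the boundary is an assumed selection rule, and the names `margin` and `select_by_margin` are hypothetical.

```python
def margin(logits, label):
    """Logit gap for one sample: true-class logit minus the largest other logit.

    Assumption: this gap is used as a proxy for distance to the boundary.
    """
    others = [z for k, z in enumerate(logits) if k != label]
    return logits[label] - max(others)


def select_by_margin(all_logits, labels, keep):
    """Return the indices of the `keep` samples with the smallest margin."""
    margins = [margin(z, y) for z, y in zip(all_logits, labels)]
    order = sorted(range(len(margins)), key=lambda i: margins[i])
    return sorted(order[:keep])


# Toy logits from a hypothetical 3-class model.
logits = [
    [4.0, 0.5, 0.1],  # label 0, far from the boundary
    [1.2, 1.0, 0.3],  # label 0, close to the boundary
    [0.2, 2.0, 1.9],  # label 1, close to the boundary
    [0.1, 0.2, 3.0],  # label 2, far from the boundary
]
labels = [0, 0, 1, 2]

subset = select_by_margin(logits, labels, keep=2)
print(subset)  # [1, 2]
```

To make the selection dynamic in the sense of the abstract, `select_by_margin` would be called again at intervals during training with logits from the current model, so the subset follows the moving decision boundary.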

AAAI Conference 2021 Conference Paper

Probing Product Description Generation via Posterior Distillation

Haolan Zhan
Hainan Zhang
Hongshen Chen
Lei Shen
Zhuoye Ding
Yongjun Bao
Weipeng Yan
Yanyan Lan

In product description generation (PDG), the user-cared aspect is critical for the recommendation system, as it can not only improve users’ experience but also attract more clicks. High-quality customer reviews can be considered an ideal source for mining user-cared aspects. However, in reality, a large number of new products (known as long-tailed commodities) cannot gather a sufficient number of customer reviews, which poses a big challenge for the product description generation task. Existing works tend to generate the product description solely from item information, i.e., product attributes or title words, which leads to tedious content that cannot attract customers effectively. To tackle this problem, we propose an adaptive posterior network based on the Transformer architecture that can utilize user-cared information from customer reviews. Specifically, we first extend the self-attentive Transformer encoder to encode product titles and attributes. Then, we apply an adaptive posterior distillation module to utilize useful review information, which integrates user-cared aspects into the generation process. Finally, we apply a Transformer-based decoding phase with a copy mechanism to automatically generate the product description. In addition, we collect a large-scale Chinese product description dataset to support our work and further research in this field. Experimental results show that our model is superior to traditional generative models in both automatic indicators and human evaluation.

IJCAI Conference 2020 Conference Paper

An Attention-based Model for Conversion Rate Prediction with Delayed Feedback via Post-click Calibration

Yumin Su
Liang Zhang
Quanyu Dai
Bo Zhang
Jinyao Yan
Dan Wang
Yongjun Bao
Sulong Xu

Conversion rate (CVR) prediction is becoming increasingly important in the multi-billion dollar online display advertising industry. It has two major challenges: firstly, the scarce user history data is very complicated and non-linear; secondly, the time delay between the clicks and the corresponding conversions can be very large, e.g., ranging from seconds to weeks. Existing models usually suffer from such scarce and delayed conversion behaviors. In this paper, we propose a novel deep learning framework to tackle the two challenges. Specifically, we extract the pre-trained embedding from impressions/clicks to assist in conversion models and propose an inner/self-attention mechanism to capture the fine-grained personalized product purchase interests from the sequential click data. Besides, to overcome the time-delay issue, we calibrate the delay model by learning a dynamic hazard function with the abundant post-click data, which is more in line with the real distribution. Empirical experiments with real-world user behavior data prove the effectiveness of the proposed method.
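
To make the delayed-feedback problem concrete, the sketch below uses a constant-hazard (exponential) delay model, which is a deliberately simpler stand-in for the dynamic hazard function the abstract mentions and is not the paper's method. Given a click with no conversion observed so far, it computes the probability that a conversion will still arrive. The function name and the parameters `p`, `hazard`, and `elapsed` are hypothetical.

```python
import math


def prob_eventual_conversion(p, hazard, elapsed):
    """Posterior probability that an as-yet-unconverted click will convert.

    p       -- predicted conversion probability for the click (assumed given)
    hazard  -- constant hazard rate of the delay distribution (assumption)
    elapsed -- time since the click with no conversion observed
    """
    # Probability that a converting click has not converted yet after `elapsed`.
    still_waiting = math.exp(-hazard * elapsed)
    # Bayes' rule over the two explanations for "no conversion observed":
    # the click never converts (1 - p), or it converts later (p * still_waiting).
    return p * still_waiting / ((1.0 - p) + p * still_waiting)


fresh = prob_eventual_conversion(p=0.2, hazard=0.5, elapsed=0.0)
stale = prob_eventual_conversion(p=0.2, hazard=0.5, elapsed=10.0)
print(fresh > stale)  # True
```

A click with no elapsed time keeps its prior probability `p`, while a long silence pushes the estimate toward zero. A learned, time-varying hazard would replace the constant `hazard` here.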

NeurIPS Conference 2020 Conference Paper

Kalman Filtering Attention for User Behavior Modeling in CTR Prediction

Hu Liu
Jing Lu
Xiwei Zhao
Sulong Xu
Hao Peng
Yutong Liu
Zehua Zhang
Jian Li

Click-through rate (CTR) prediction is one of the fundamental tasks for e-commerce search engines. As search becomes more personalized, it is necessary to capture the user interest from rich behavior data. Existing user behavior modeling algorithms develop different attention mechanisms to emphasize query-relevant behaviors and suppress irrelevant ones. Despite being extensively studied, these attentions still suffer from two limitations. First, conventional attentions mostly limit the attention field only to a single user's behaviors, which is not suitable in e-commerce where users often hunt for new demands that are irrelevant to any historical behaviors. Second, these attentions are usually biased towards frequent behaviors, which is unreasonable since high frequency does not necessarily indicate great importance. To tackle the two limitations, we propose a novel attention mechanism, termed Kalman Filtering Attention (KFAtt), that considers the weighted pooling in attention as a maximum a posteriori (MAP) estimation. By incorporating a prior, KFAtt resorts to global statistics when few user behaviors are relevant. Moreover, a frequency capping mechanism is incorporated to correct the bias towards frequent behaviors. Offline experiments on both benchmark and a 10 billion scale real production dataset, together with an online A/B test, show that KFAtt outperforms all compared state-of-the-art methods. KFAtt has been deployed in the ranking system of JD.com, one of the largest B2C e-commerce websites in China, serving the main traffic of hundreds of millions of active users.
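
The idea of treating weighted pooling as MAP estimation with a prior can be illustrated with the standard scalar Gaussian case. The sketch below is not taken from the paper: it assumes a Gaussian prior and independent Gaussian observation noise, works on scalars instead of embedding vectors, and leaves out the frequency capping mechanism. The name `map_pool` and its parameters are hypothetical.

```python
def map_pool(prior_mean, prior_var, values, noise_vars):
    """MAP estimate of a scalar under a Gaussian prior and Gaussian noise.

    Each observed value is weighted by its precision (1 / noise variance),
    and the prior mean acts as one extra observation with precision
    1 / prior_var.
    """
    weighted_sum = prior_mean / prior_var
    total_precision = 1.0 / prior_var
    for value, noise_var in zip(values, noise_vars):
        weighted_sum += value / noise_var
        total_precision += 1.0 / noise_var
    return weighted_sum / total_precision


# With no relevant behaviors, the estimate falls back to the prior mean.
print(map_pool(0.5, 1.0, [], []))  # 0.5

# One observation as reliable as the prior lands halfway between them.
print(map_pool(0.0, 1.0, [1.0], [1.0]))  # 0.5
```

In this reading, a low noise variance plays the role of a high attention weight, and the prior term is what lets the estimate fall back on global statistics when few behaviors are relevant.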

AAAI Conference 2018 Conference Paper

Telepath: Understanding Users from a Human Vision Perspective in Large-Scale Recommender Systems

Yu Wang
Jixing Xu
Aohan Wu
Mantian Li
Yang He
Jinghe Hu
Weipeng Yan

Designing an e-commerce recommender system that serves hundreds of millions of active users is a daunting challenge. To the best of our knowledge, the complex brain activity mechanism behind human shopping activities has never been considered in existing recommender systems. From a human vision perspective, we found two key factors that affect users’ behaviors: items’ attractiveness and their matching degrees with users’ interests. This paper proposes Telepath, a vision-based bionic recommender system model, which simulates human brain activities in shopping decisions and thus understands users from this perspective. The core of Telepath is a complex deep neural network with multiple subnetworks. In practice, the Telepath model has been launched in JD’s recommender system and advertising system and outperformed the former state-of-the-art method. For one of the major item recommendation blocks on the JD app, click-through rate (CTR), gross merchandise value (GMV) and orders have been increased by 1.59%, 8.16% and 8.71% respectively by Telepath. For several major ad publishers of the JD demand-side platform, CTR, GMV and return on investment have been increased by 6.58%, 61.72% and 65.57% respectively by the first launch of Telepath, and further increased by 2.95%, 41.75% and 41.37% respectively by the second launch.
