Author name cluster

Le Ma

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers

1 author row

EAAI Journal 2026 Journal Article

Anomaly detection of bamboo chopsticks based on cross-domain mixup augmentation and self-supervised learning

Le Ma
Zizheng Cheng
Zhenjie Yao
Guanfeng Du
Lingfang Sun


EAAI Journal 2026 Journal Article

Lightweight method of foreign matter detection in coal conveying based on improved you only look once version 8 and embedded equipment

Guanfeng Du
Hongzheng Zhang
Yupeng Luo
Zhibo Bao
Zhiwei Li
Mingxin Zhou
Zhelin Liu
Shengxian Cao


EAAI Journal 2026 Journal Article

Study on the detection of pulverized coal and silica impurities in air pipeline based on improved U-Net

Guanfeng Du
Hongzheng Zhang
Yupeng Luo
Shengxian Cao
Gong Wang
Le Ma


IJCAI Conference 2025 Conference Paper

AI-Assisted Human-Pet Artistic Musical Co-Creation for Wellness Therapy

Zihao Wang
Le Ma
Yuhang Jin
Yongsheng Feng
Xin Pan
Shulei Ji
Kejun Zhang

This paper explores AI-mediated human-pet musical co-creation from an interdisciplinary perspective, leveraging recent advancements in animal-assisted therapy. These advancements have shown significant psychosocial benefits, especially in reducing anxiety and enhancing social engagement. Building on these findings, this study employs pet vocal timbres as 'digital avatars' to enhance emotional investment during the music creation process. We propose PetCoCre, a novel system that applies pet vocal timbres in three distinct character paradigms within AI music creation: (1) PetRhythm, which uses pet voices as rhythmic percussion through beat synchronization; (2) PetMelody, which enables pet voices to act as melodic instruments via pitch-shifting alignment; and (3) PetVocalia, which uses pet vocal timbres as the target timbre for singing voice conversion (SVC), where the converted singing voice replaces the original singer's voice while preserving the original semantic content. Beyond these character paradigms, we propose SaMoye, the first open-source, high-quality zero-shot SVC model, which overcomes the zero-shot limitations of existing methods by employing mixed speaker embeddings for timbre enhancement and leveraging a large-scale singing voice dataset. We collected dog and cat vocalization data from pet stores and conducted experiments with 30 participants. Results demonstrate that the human-pet co-creation mode led to significant enhancements in pleasure and creative satisfaction compared to solo AI music generation, along with a significant reduction in participants' anxiety levels. Through collaborative art creation, this research pioneers new paradigms for animal-assisted therapeutic interventions and expands the boundaries of AI-assisted creative collaboration.


AAAI Conference 2025 Conference Paper

SongGLM: Lyric-to-Melody Generation with 2D Alignment Encoding and Multi-Task Pre-Training

Jiaxing Yu
Xinda Wu
Yunfei Xu
Tieyao Zhang
Songruoyao Wu
Le Ma
Kejun Zhang

Lyric-to-melody generation aims to automatically create melodies based on given lyrics, which requires capturing the complex and subtle correlations between them. However, previous work usually faces two main challenges: 1) lyric-melody alignment modeling, which is often simplified to one-syllable/word-to-one-note alignment or otherwise suffers from low alignment accuracy; 2) lyric-melody harmony modeling, which usually relies heavily on intermediates or strict rules, limiting the model's capabilities and generative diversity. In this paper, we propose SongGLM, a lyric-to-melody generation system that leverages 2D alignment encoding and multi-task pre-training based on the General Language Model (GLM) to guarantee the alignment and harmony between lyrics and melodies. Specifically, 1) we introduce a unified symbolic song representation for lyrics and melodies with word-level and phrase-level (2D) alignment encoding to capture the lyric-melody alignment; 2) we design a multi-task pre-training framework with hierarchical blank infilling objectives (n-gram, phrase, and long span), and incorporate lyric-melody relationships into the extraction of harmonized n-grams to ensure the lyric-melody harmony. We also construct a large-scale lyric-melody paired dataset comprising over 200,000 English song pieces for pre-training and fine-tuning. The objective and subjective results indicate that SongGLM can generate melodies from lyrics with significant improvements in both alignment and harmony, outperforming all previous baseline methods.
