Author name cluster

Mi Tian

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers

2 author rows

AAAI Conference 2025 Conference Paper

Cognitive Fluctuations Enhanced Attention Network for Knowledge Tracing

Mingliang Hou
Xueyi Li
Teng Guo
Zitao Liu
Mi Tian
Renqiang Luo
Weiqi Luo

Knowledge tracing (KT) uses the historical records of students' learning interactions to predict their performance on forthcoming questions. Central to this process is the modeling of human cognition to gain deeper insights into how knowledge is acquired and retained. Human cognition is characterized by two key features: long-term cognitive trends, reflecting the gradual accumulation and stabilization of knowledge over time, and short-term cognitive fluctuations, which arise from transient factors such as forgetting or momentary lapses in attention. Although existing attention-based KT models effectively capture long-term cognitive trends, they often fail to adequately address short-term cognitive fluctuations. These limitations lead to overly smoothed cognitive features and reduced model performance, especially when the test data length exceeds the training data length. To address these problems, we propose FlucKT, a novel short-term cognitive fluctuations enhanced attention network for KT tasks. FlucKT improves the attention mechanism in two ways. First, it uses a decomposition-based layer with causal convolution to separate and dynamically reweight long-term and short-term cognitive features. Second, it introduces a kernelized bias attention score penalty to enhance focus on short-term fluctuations, improving length generalization capabilities. Our contributions are validated through extensive experiments on three real-world datasets, demonstrating significant improvements in length generalization and prediction performance.
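
The trend/fluctuation split described in the abstract can be illustrated with a rough sketch. This is not FlucKT's actual layer: it assumes a fixed, uniform causal moving average as a stand-in for the causal convolution, and the function name `causal_decompose` and the `window` parameter are hypothetical.

```python
def causal_decompose(x, window=3):
    """Split a sequence into a causal trend and a residual fluctuation.

    Assumption: the trend at step t is the mean of the last `window`
    values up to and including t (no future values are used). A learned
    causal convolution would replace this fixed uniform kernel.
    """
    trend = []
    for t in range(len(x)):
        start = max(0, t - window + 1)   # never look past the start
        past = x[start:t + 1]            # only current and earlier steps
        trend.append(sum(past) / len(past))
    # Whatever the smooth trend does not explain is the fluctuation.
    fluct = [xi - ti for xi, ti in zip(x, trend)]
    return trend, fluct


# Toy response sequence (1.0 = correct, 0.0 = incorrect); hypothetical data.
responses = [1.0, 1.0, 0.0, 1.0]
trend, fluct = causal_decompose(responses, window=2)
print(trend)  # [1.0, 1.0, 0.5, 0.5]
print(fluct)  # [0.0, 0.0, -0.5, 0.5]
```

The single lapse at the third step barely moves the trend but shows up clearly in the fluctuation component, which is the kind of signal the abstract says overly smoothed features lose.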


AAAI Conference 2025 Conference Paper

Rethinking and Improving Student Learning and Forgetting Processes for Attention based Knowledge Tracing Models

Youheng Bai
Xueyi Li
Zitao Liu
Yaying Huang
Mi Tian
Weiqi Luo

Knowledge tracing (KT) models students' knowledge states and predicts their future performance based on their historical interaction data. However, attention-based KT models struggle to accurately capture diverse forgetting behaviors in ever-growing interaction sequences. First, existing models use uniform time decay matrices, conflating forgetting representations with problem relevance. Second, the fixed-length window prediction paradigm fails to model continuous forgetting processes in expanding sequences. To address these challenges, this paper introduces LefoKT, a unified architecture that enhances attention-based KT models by incorporating the proposed relative forgetting attention. LefoKT improves forgetting modeling through relative forgetting attention to decouple forgetting patterns from problem relevance. It also enhances attention-based KT models' length extrapolation capability for capturing continuous forgetting processes in ever-growing interaction sequences. Extensive experimental results on three datasets validate the effectiveness of LefoKT.


NeurIPS Conference 2025 Conference Paper

SolidGeo: Measuring Multimodal Spatial Math Reasoning in Solid Geometry

Peijie Wang
Chao Yang
Zhong-Zhi Li
Fei Yin
Dekang Ran
Mi Tian
Zhilong Ji
Jinfeng Bai

Geometry is a fundamental branch of mathematics and plays a crucial role in evaluating the reasoning capabilities of multimodal large language models (MLLMs). However, existing multimodal mathematics benchmarks mainly focus on plane geometry and largely ignore solid geometry, which requires spatial reasoning and is more challenging than plane geometry. To address this critical gap, we introduce SolidGeo, the first large-scale benchmark specifically designed to evaluate the performance of MLLMs on mathematical reasoning tasks in solid geometry. SolidGeo consists of 3,113 real-world K–12 and competition-level problems, each paired with visual context and annotated with difficulty levels and fine-grained solid geometry categories. Our benchmark covers a wide range of 3D reasoning subjects such as projection, unfolding, spatial measurement, and spatial vector, offering a rigorous testbed for assessing solid geometry. Through extensive experiments, we observe that MLLMs encounter substantial challenges in solid geometry math tasks, with a considerable performance gap relative to human capabilities on SolidGeo. Moreover, we analyze the performance, inference efficiency, and error patterns of various models, offering insights into the solid geometric mathematical reasoning capabilities of MLLMs. We hope SolidGeo serves as a catalyst for advancing MLLMs toward deeper geometric reasoning and spatial intelligence. The dataset is released at https://huggingface.co/datasets/HarryYancy/SolidGeo/


AAAI Conference 2025 Conference Paper

What Are Step-Level Reward Models Rewarding? Counterintuitive Findings from MCTS-Boosted Mathematical Reasoning

Yiran Ma
Zui Chen
Tianqiao Liu
Mi Tian
Zhuo Liu
Zitao Liu
Weiqi Luo

Step-level reward models (SRMs) can significantly enhance mathematical reasoning performance through process supervision or step-level preference alignment based on reinforcement learning. The performance of SRMs is pivotal, as they serve as critical guidelines, ensuring that each step in the reasoning process is aligned with desired outcomes. Recently, AlphaZero-like methods, where Monte Carlo Tree Search (MCTS) is employed for automatic step-level preference annotation, have proven particularly effective. However, the precise mechanisms behind the success of SRMs remain largely unexplored. To address this gap, this study delves into the counterintuitive aspects of SRMs, particularly focusing on MCTS-based approaches. Our findings reveal that the removal of natural language descriptions of thought processes has minimal impact on the efficacy of SRMs. Furthermore, we demonstrate that SRMs are adept at assessing the complex logical coherence present in mathematical language but have difficulty doing so in natural language. These insights provide a nuanced understanding of the core elements that drive effective step-level reward modeling in mathematical reasoning. By shedding light on these mechanisms, this study offers valuable guidance for developing more efficient and streamlined SRMs, which can be achieved by focusing on the crucial parts of mathematical reasoning.


ICRA Conference 2020 Conference Paper

3D Scene Geometry-Aware Constraint for Camera Localization with Deep Learning

Mi Tian
Qiong Nie
Hao Shen

Camera localization is a fundamental and key component of autonomous driving vehicles and mobile robots, allowing them to localize themselves globally for further environment perception, path planning and motion control. Recently, end-to-end approaches based on convolutional neural networks have been widely studied, with the aim of matching or even exceeding traditional 3D-geometry-based methods. In this work, we propose a compact network for absolute camera pose regression. Inspired by those traditional methods, a 3D scene geometry-aware constraint is also introduced by exploiting all available information including motion, depth and image contents. We add this constraint as a regularization term to our proposed network by defining a pixel-level photometric loss and an image-level structural similarity loss. To benchmark our method, different challenging scenes including indoor and outdoor environments are tested with our proposed approach and state-of-the-art methods. The experimental results demonstrate significant performance improvement of our method in both prediction accuracy and convergence efficiency.
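
The two regularization terms named in the abstract can be sketched in simplified form. This is an illustration only, not the paper's formulation: it assumes the two images are already aligned (the warping from pose and depth is omitted), treats each image as a flat list of intensities in [0, 1], computes SSIM from global statistics rather than local windows, and uses `1 - SSIM` as the structural loss. The function names are hypothetical.

```python
def photometric_l1(x, y):
    """Pixel-level photometric loss: mean absolute intensity difference."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def ssim_global(x, y, data_range=1.0):
    """Structural similarity computed from whole-image statistics.

    Uses the standard SSIM formula with the usual stabilizing constants
    C1 = (0.01 * L)^2 and C2 = (0.03 * L)^2, where L is the data range.
    """
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def structural_loss(x, y):
    """Image-level loss (assumed form): zero when the images are identical."""
    return 1.0 - ssim_global(x, y)


# Hypothetical 3-pixel "images".
target = [0.1, 0.5, 0.9]
flipped = [0.9, 0.5, 0.1]
print(photometric_l1(target, target), structural_loss(target, target))
print(photometric_l1(target, flipped), structural_loss(target, flipped))
```

In a training setup of this kind, the two terms would typically be weighted and added to the pose regression loss; the weights are a design choice not stated in the abstract.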


IJCAI Conference 2020 Conference Paper

Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling

Daniel Stoller
Mi Tian
Sebastian Ewert
Simon Dixon

Convolutional neural networks (CNNs) with dilated filters such as the Wavenet or the Temporal Convolutional Network (TCN) have shown good results in a variety of sequence modelling tasks. While their receptive field grows exponentially with the number of layers, computing the convolutions over very long sequences of features in each layer is time and memory-intensive, and prohibits the use of longer receptive fields in practice. To increase efficiency, we make use of the "slow feature" hypothesis stating that many features of interest are slowly varying over time. For this, we use a U-Net architecture that computes features at multiple time-scales and adapt it to our auto-regressive scenario by making convolutions causal. We apply our model ("Seq-U-Net") to a variety of tasks including language and audio generation. In comparison to TCN and Wavenet, our network consistently saves memory and computation time, with speed-ups for training and inference of over 4x in the audio generation experiment in particular, while achieving a comparable performance on real-world tasks.
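
The causal and dilated convolutions mentioned in the abstract can be illustrated with a minimal sketch. This is not the Seq-U-Net code: it assumes a single channel, implicit left zero-padding, and a stack in which layer i uses dilation 2**i (the common Wavenet/TCN pattern). The function names are hypothetical.

```python
def causal_conv1d(x, weights, dilation=1):
    """Causal 1D convolution over a list of numbers.

    output[t] depends only on x[t], x[t - d], x[t - 2d], ... so no
    future samples leak into the prediction at step t.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # positions before the start act as zeros
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, num_layers):
    """Receptive field of a stack whose layer i uses dilation 2**i."""
    return 1 + (kernel_size - 1) * (2 ** num_layers - 1)


signal = [1, 2, 3, 4]
print(causal_conv1d(signal, [1, 1]))              # [1.0, 3.0, 5.0, 7.0]
print(causal_conv1d(signal, [1, 1], dilation=2))  # [1.0, 2.0, 4.0, 6.0]
print(receptive_field(2, 10))                     # 1024
```

The receptive field doubles with each added layer, but every layer still processes the full-length sequence, which is the cost the abstract's multi-scale U-Net design is meant to reduce.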


TIST Journal 2016 Journal Article

Towards Music Structural Segmentation across Genres

Mi Tian
Mark B. Sandler

This article addresses the problem of how different audio features and segmentation methods work with different music genres. A new annotated corpus of Chinese traditional Jingju music is presented. We incorporate this dataset with two existing music datasets from the literature in an integrated retrieval system to evaluate existing features, structural hypotheses, and segmentation algorithms outside a Western bias. A harmonic-percussive source separation technique is introduced to the feature extraction process and brings significant improvement to the segmentation. Results show that different features capture the structural patterns of different music genres in different ways. Novelty- or homogeneity-based segmentation algorithms and timbre features can surpass the investigated alternatives for the structure analysis of Jingju due to its lack of harmonic repetition patterns. Findings indicate that the design of audio features and segmentation algorithms, as well as contextual information related to the music corpora, should be considered together in an effective segmentation system.
