Arrow Research search

Author name cluster

Junsheng Zhou

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
2 author rows

Possible papers (14)

AAAI Conference 2026 Conference Paper

Counterfactual Question Generation Uncovering Learner Contradictions

  • Bo Zhang
  • Hao Yu
  • Wenjie Dong
  • Yuhang Yang
  • Dezhuang Miao
  • Fengyi Song
  • Yanhui Gu
  • Xiaoming Zhang

Conventional feedback, even when accompanied by brief explanations, rarely uncovers the hidden contradictions that trigger a learner's mistake. We bridge this gap with counterfactual question generation (CFQG): given a learner's answer, generate a follow-up question that deliberately contradicts it, compelling the learner to confront the underlying conflict. CFQG thus transforms assessment from passive scoring into an interactive, contradiction-centered dialogue that supports knowledge repair. To automate CFQG, we propose GapProbe, which probes the knowledge gap between a learner's belief and curated facts through a knowledge graph (KG), then designs counterfactual questions (CFQs) that negate the belief. Identifying contradiction-aware triples, and more importantly, selecting those most likely to confuse the learner, are highly challenging in large-scale KGs. GapProbe tackles these challenges with an iterative ProConB cycle coupled with a schema-aware KGMap. By caching one- and multi-hop schema patterns of the KG, KGMap provides a "roadmap" that guides LLMs to jump directly to deep, contradiction-aware triples, going beyond traditional step-wise graph traversal. We present the CFQG benchmark and corresponding metrics for evaluating how generated CFQs trigger, focus, and deepen learner reflection through explicit contradictions. Experiments on multiple datasets and LLMs show that GapProbe boosts LLM reasoning over KGs and generates follow-up questions that consistently promote deeper and more focused learner reflection.
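
A rough illustration of the schema-caching idea described in this abstract: the minimal Python sketch below builds a KGMap-like pattern cache (the names, the triple format, and the two-hop limit are all assumptions for illustration, not the paper's actual design), pre-computing one- and two-hop relation patterns so a prompt can expose them to an LLM without step-wise graph traversal.

```python
# Hypothetical sketch of a KGMap-style schema cache (names, triple format and
# the two-hop limit are assumptions): pre-compute one- and two-hop relation
# patterns so a prompt can expose them to an LLM without step-wise traversal.
from collections import defaultdict

def build_schema_map(schema_triples):
    """schema_triples: iterable of (head_type, relation, tail_type)."""
    one_hop = defaultdict(set)    # head_type -> {relation}
    two_hop = defaultdict(set)    # head_type -> {(rel_1, rel_2)}
    edges = defaultdict(set)      # head_type -> {(relation, tail_type)}
    for h, r, t in schema_triples:
        one_hop[h].add(r)
        edges[h].add((r, t))
    for h, outs in edges.items():
        for r1, mid in outs:
            for r2, _ in edges.get(mid, ()):   # chain through the middle type
                two_hop[h].add((r1, r2))
    return one_hop, two_hop

# Example: a tiny physics curriculum schema.
one_hop, two_hop = build_schema_map([
    ("Force", "causes", "Acceleration"),
    ("Acceleration", "changes", "Velocity"),
])
print(two_hop["Force"])   # {('causes', 'changes')}
```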

NeurIPS Conference 2025 Conference Paper

U-CAN: Unsupervised Point Cloud Denoising with Consistency-Aware Noise2Noise Matching

  • Junsheng Zhou
  • XingYu Shi
  • Haichuan Song
  • Yi Fang
  • Yu-Shen Liu
  • Zhizhong Han

Point clouds captured by scanning sensors are often perturbed by noise, which has a highly negative impact on downstream tasks (e.g., surface reconstruction and shape understanding). Previous works mostly focus on training neural networks with noisy-clean point cloud pairs to learn denoising priors, which requires extensive manual effort. In this work, we introduce U-CAN, an Unsupervised framework for point cloud denoising with Consistency-Aware Noise2Noise matching. Specifically, we leverage a neural network to infer a multi-step denoising path for each point of a shape or scene with a noise-to-noise matching scheme. We achieve this with a novel loss which enables statistical reasoning over multiple noisy point cloud observations. We further introduce a novel constraint on denoised geometry consistency for learning consistency-aware denoising patterns. We show that the proposed constraint is a general term that is not limited to the 3D domain and can also contribute to 2D image denoising. Our evaluations under widely used benchmarks in point cloud denoising, upsampling, and image denoising show significant improvements over the state-of-the-art unsupervised methods, and U-CAN also produces results comparable with supervised methods. Project page: https://gloriasze.github.io/U-CAN/.
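
A minimal, hypothetical sketch of the Noise2Noise-style matching idea from this abstract (not the authors' implementation): the denoised output derived from one noisy observation is matched against different noisy observations of the same surface, so no clean ground truth is ever needed.

```python
# Minimal, hypothetical sketch of a Noise2Noise matching loss (not the
# authors' implementation): the denoised output is matched against *other*
# noisy observations of the same surface, so no clean data is required.
import torch

def noise2noise_loss(denoised, other_noisy):
    """denoised: (N, 3) predicted points; other_noisy: (M, 3) another scan."""
    d = torch.cdist(denoised, other_noisy)        # (N, M) pairwise distances
    return d.min(dim=1).values.mean()             # match to nearest observation

# Statistical reasoning: average the loss over several noisy observations.
obs = [torch.randn(1024, 3) for _ in range(4)]    # stand-in noisy scans
denoised = obs[0]                                 # stand-in network output
loss = torch.stack([noise2noise_loss(denoised, o) for o in obs[1:]]).mean()
```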

AAAI Conference 2025 Conference Paper

What Is a Good Question? Assessing Question Quality via Meta-Fact Checking

  • Bo Zhang
  • Jianghua Zhu
  • Chaozhuo Li
  • Hao Yu
  • Li Kong
  • Zhan Wang
  • Dezhuang Miao
  • Xiaoming Zhang

Knowledge-based questions are typically employed to evaluate the knowledge boundaries of LLMs; meanwhile, numerous studies focus on question generation as a means to enhance the capabilities of both models and individuals. However, there is a lack of in-depth exploration of what constitutes a good question from the perspective of knowledge cognition. This paper proposes aligning the complete knowledge underlying questions with educational criteria effectively employed in physics courses, thereby developing novel knowledge-intensive metrics of question quality. To this end, we propose Meta-Fact Checking (MFC), which transforms questions into knowledge graph (KG) triples using LLMs with few-shot prompting, then quantifies question quality based on the patterns observed in these triples. MFC introduces a novel interaction mechanism for KGs that communicates meta-facts, illustrating the types of knowledge that a KG can offer the LLM for reasoning about questions, rather than relying solely on the original triples. Compared to the retrieve-while-reasoning routine, this strategy ensures that MFC remains unaffected by unexplored triples that the LLM has not yet encountered in the KG. Experiments across multiple datasets and LLMs demonstrate that MFC significantly improves the accuracy and efficiency of both question answering and question assessment. This research marks a pioneering effort to automate the evaluation of question quality based on cognitive capabilities.
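
A sketch of the question-to-triples step, assuming a generic LLM completion callable (the prompt wording, the worked example, and the parsing are illustrative assumptions, not the paper's actual prompts):

```python
# Hypothetical sketch of the question-to-triples step via few-shot prompting
# (the prompt wording, example, and parsing are illustrative assumptions).
FEW_SHOT = """Question: What force keeps planets in orbit?
Triples: (gravity, acts_on, planets); (gravity, causes, orbital_motion)

Question: {question}
Triples:"""

def question_to_triples(question, llm_complete):
    """llm_complete: any callable mapping a prompt string to a completion."""
    raw = llm_complete(FEW_SHOT.format(question=question))
    triples = []
    for chunk in raw.split(";"):
        parts = [p.strip(" ()\n") for p in chunk.split(",")]
        if len(parts) == 3:
            triples.append(tuple(parts))
    return triples  # question quality is then scored from triple patterns
```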

ICRA Conference 2024 Conference Paper

3D-OAE: Occlusion Auto-Encoders for Self-Supervised Learning on Point Clouds

  • Junsheng Zhou
  • Xin Wen 0003
  • Baorui Ma
  • Yu-Shen Liu
  • Yue Gao 0002
  • Yi Fang 0006
  • Zhizhong Han

Manual annotation for large-scale point clouds is still tedious and unavailable for many harsh real-world tasks. Self-supervised learning, which pre-trains deep neural networks on raw and unlabeled data, is a promising approach to address this issue. Existing works usually rely on auto-encoders to establish self-supervision through a self-reconstruction schema. However, previous auto-encoders merely focus on global shapes and do not distinguish local from global geometric features. To address this problem, we present a novel and efficient self-supervised point cloud representation learning framework, named 3D Occlusion Auto-Encoder (3D-OAE), to facilitate the detailed supervision inherent in local regions and global shapes. We propose to randomly occlude some local patches of point clouds and establish supervision by inpainting the occluded patches using the remaining ones. Specifically, we design an asymmetrical encoder-decoder architecture based on the standard Transformer, where the encoder operates only on the visible subset of patches to learn local patterns, and a lightweight decoder leverages these visible patterns to infer the missing geometries via self-attention. We find that occluding a very high proportion of the input point cloud (e.g., 75%) still yields nontrivial self-supervisory performance, which enables 3-4 times faster training while also improving accuracy. Experimental results show that our approach outperforms the state-of-the-art on a diverse range of downstream discriminative and generative tasks. Code is available at https://github.com/junshengzhou/3D-OAE.
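
A minimal PyTorch sketch of the random patch-occlusion step described above (patch grouping is simplified to random centers plus k-nearest neighbors, an assumption rather than the paper's exact pipeline):

```python
# Minimal sketch of the random patch-occlusion step (patch grouping is
# simplified to random centers + k-nearest neighbors, an assumption rather
# than the paper's exact pipeline).
import torch

def occlude_patches(points, num_patches=64, patch_size=32, ratio=0.75):
    """points: (N, 3). Returns (visible_patches, occluded_patches)."""
    centers = points[torch.randperm(points.shape[0])[:num_patches]]  # (P, 3)
    knn = torch.cdist(centers, points).topk(
        patch_size, dim=1, largest=False).indices                   # (P, k)
    patches = points[knn]                                            # (P, k, 3)
    perm = torch.randperm(num_patches)
    num_occluded = int(ratio * num_patches)
    # The encoder sees only the visible patches; the decoder inpaints the rest.
    return patches[perm[num_occluded:]], patches[perm[:num_occluded]]

visible, occluded = occlude_patches(torch.randn(2048, 3))
```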

NeurIPS Conference 2024 Conference Paper

Binocular-Guided 3D Gaussian Splatting with View Consistency for Sparse View Synthesis

  • Liang Han
  • Junsheng Zhou
  • Yu-Shen Liu
  • Zhizhong Han

Novel view synthesis from sparse inputs is a vital yet challenging task in 3D computer vision. Previous methods explore 3D Gaussian Splatting with neural priors (e.g., depth priors) as additional supervision, demonstrating promising quality and efficiency compared to NeRF-based methods. However, the neural priors from 2D pretrained models are often noisy and blurry, and struggle to precisely guide the learning of radiance fields. In this paper, we propose a novel method for synthesizing novel views from sparse views with Gaussian Splatting that does not require external priors as supervision. Our key idea lies in exploiting the self-supervision inherent in the binocular stereo consistency between each pair of binocular images constructed with disparity-guided image warping. To this end, we additionally introduce a Gaussian opacity constraint which regularizes the Gaussian locations and avoids Gaussian redundancy, improving the robustness and efficiency of inferring 3D Gaussians from sparse views. Extensive experiments on the LLFF, DTU, and Blender datasets demonstrate that our method significantly outperforms the state-of-the-art methods.
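
A hypothetical sketch of the disparity-guided image warping mentioned above (standard stereo warping via grid_sample; names and conventions are assumptions): each pixel of one view is sampled from its binocular partner shifted by the disparity, so the two views can be compared photometrically.

```python
# Hypothetical sketch of disparity-guided stereo warping (standard technique;
# names and conventions are assumptions): reconstruct the left view by
# sampling the right view at x - d for each pixel.
import torch
import torch.nn.functional as F

def warp_by_disparity(right_img, disparity):
    """right_img: (B, C, H, W); disparity: (B, 1, H, W), in pixels."""
    B, _, H, W = right_img.shape
    xs = torch.linspace(-1.0, 1.0, W).view(1, 1, W).expand(B, H, W)
    ys = torch.linspace(-1.0, 1.0, H).view(1, H, 1).expand(B, H, W)
    # Shift the normalized x coordinate by the disparity.
    xs = xs - 2.0 * disparity.squeeze(1) / (W - 1)
    grid = torch.stack([xs, ys], dim=-1)                   # (B, H, W, 2)
    left_recon = F.grid_sample(right_img, grid, align_corners=True)
    # A photometric loss between left_recon and the true left view gives the
    # binocular stereo consistency supervision described in the abstract.
    return left_recon
```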

NeurIPS Conference 2024 Conference Paper

DiffGS: Functional Gaussian Splatting Diffusion

  • Junsheng Zhou
  • Weiqi Zhang
  • Yu-Shen Liu

3D Gaussian Splatting (3DGS) has shown convincing performance in rendering speed and fidelity, yet generating Gaussian Splatting remains challenging due to its discrete and unstructured nature. In this work, we propose DiffGS, a general Gaussian generator based on latent diffusion models. DiffGS is a powerful and efficient 3D generative model capable of generating Gaussian primitives in arbitrary numbers for high-fidelity rendering with rasterization. The key insight is to represent Gaussian Splatting in a disentangled manner via three novel functions that model Gaussian probabilities, colors, and transforms. Through this novel disentanglement, we represent the discrete and unstructured 3DGS with continuous Gaussian Splatting functions, and then train a latent diffusion model to generate these functions both unconditionally and conditionally. Meanwhile, we introduce a discretization algorithm to extract Gaussians in arbitrary numbers from the generated functions via octree-guided sampling and optimization. We explore DiffGS for various tasks, including unconditional generation; conditional generation from text, images, and partial 3DGS; and Point-to-Gaussian generation. We believe that DiffGS provides a new direction for flexibly modeling and generating Gaussian Splatting. Project page: https://junshengzhou.github.io/DiffGS.
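
A toy sketch of the functional view described above (all field names are hypothetical stand-ins for trained networks): three fields map a query location to Gaussian probability, color, and transform, and primitives are extracted wherever the probability field is high.

```python
# Toy sketch of the functional 3DGS view (field names are hypothetical
# stand-ins for trained networks): query candidate locations, keep those
# where the Gaussian probability field is high, then read off attributes.
import torch

def extract_gaussians(prob_fn, color_fn, transform_fn, queries, thresh=0.5):
    """queries: (N, 3) candidates, e.g. from octree-guided sampling."""
    keep = prob_fn(queries) > thresh          # Gaussian probability field
    centers = queries[keep]
    return centers, color_fn(centers), transform_fn(centers)

# Stand-in fields: probability high near the origin, random attributes.
prob_fn = lambda q: torch.sigmoid(1.0 - q.norm(dim=-1))
color_fn = lambda q: torch.rand(q.shape[0], 3)
transform_fn = lambda q: torch.rand(q.shape[0], 7)   # scale + rotation quat
centers, colors, transforms = extract_gaussians(
    prob_fn, color_fn, transform_fn, torch.randn(4096, 3))
```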

AAAI Conference 2024 Conference Paper

Learning Continuous Implicit Field with Local Distance Indicator for Arbitrary-Scale Point Cloud Upsampling

  • Shujuan Li
  • Junsheng Zhou
  • Baorui Ma
  • Yu-Shen Liu
  • Zhizhong Han

Point cloud upsampling aims to generate dense and uniformly distributed point sets from a sparse point cloud, which plays a critical role in 3D computer vision. Previous methods typically split a sparse point cloud into several local patches, upsample the patch points, and merge all upsampled patches. However, these methods often produce holes, outliers, or non-uniformity due to the splitting and merging process, which does not maintain consistency among local patches. To address these issues, we propose a novel approach that learns an unsigned distance field guided by local priors for point cloud upsampling. Specifically, we train a local distance indicator (LDI) that predicts the unsigned distance from a query point to a local implicit surface. Utilizing the learned LDI, we learn an unsigned distance field to represent the sparse point cloud with patch consistency. At inference time, we randomly sample queries around the sparse point cloud and project them onto the zero-level set of the learned implicit field to generate a dense point cloud. Since the implicit field is naturally continuous, it inherently enables arbitrary-scale upsampling without retraining for different scales. We conduct comprehensive experiments on both synthetic data and real scans, and report state-of-the-art results under widely used benchmarks. Project page: https://lisj575.github.io/APU-LDI
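
A minimal sketch of projecting sampled queries onto the zero-level set of a learned unsigned distance field, using the standard update q' = q - f(q) * grad f(q) / |grad f(q)| (the step count and the toy sphere UDF below are assumptions for illustration, not the paper's code):

```python
# Minimal sketch of projecting queries onto the zero-level set of a learned
# UDF via q' = q - f(q) * grad(f)(q) (normalized). The sphere UDF below is a
# stand-in for a trained LDI-guided field, purely for illustration.
import torch

def project_to_surface(udf, queries, steps=5):
    """udf: callable (N, 3) -> (N,) unsigned distances; queries: (N, 3)."""
    q = queries.clone().requires_grad_(True)
    for _ in range(steps):
        d = udf(q)
        grad = torch.autograd.grad(d.sum(), q)[0]
        grad = grad / (grad.norm(dim=-1, keepdim=True) + 1e-8)
        q = (q - d.unsqueeze(-1) * grad).detach().requires_grad_(True)
    return q.detach()

# Toy stand-in: unsigned distance to the unit sphere.
udf = lambda p: (p.norm(dim=-1) - 1.0).abs()
queries = 1.5 * torch.randn(512, 3)        # random queries around the surface
dense = project_to_surface(udf, queries)   # points now lie on the sphere
```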

AAAI Conference 2024 Conference Paper

NeuSurf: On-Surface Priors for Neural Surface Reconstruction from Sparse Input Views

  • Han Huang
  • Yulun Wu
  • Junsheng Zhou
  • Ge Gao
  • Ming Gu
  • Yu-Shen Liu

Recently, neural implicit functions have demonstrated remarkable results in multi-view reconstruction. However, most existing methods are tailored for dense views and perform poorly when dealing with sparse views. Several recent methods generalize implicit reconstruction to the sparse-view setting, but they still suffer from high training costs and are only valid under carefully selected perspectives. In this paper, we propose a novel sparse view reconstruction framework that leverages on-surface priors to achieve highly faithful surface reconstruction. Specifically, we design several constraints on global geometry alignment and local geometry refinement for jointly optimizing coarse shapes and fine details. To achieve this, we train a neural network to learn a global implicit field from the on-surface points obtained from SfM and then leverage it as a coarse geometric constraint. To exploit local geometric consistency, we project on-surface points onto seen and unseen views, treating the consistency loss of projected features as a fine geometric constraint. Experimental results on the DTU and BlendedMVS datasets in two prevalent sparse settings demonstrate significant improvements over the state-of-the-art methods.
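
A short sketch of the projection step, assuming a standard pinhole camera model (variable names are illustrative): on-surface SfM points are projected into a view so that features at the projected pixels can be compared across seen and unseen views.

```python
# Sketch of pinhole projection of on-surface SfM points into a view (standard
# geometry; names are illustrative). Sampling features at the projected pixels
# in seen and unseen views yields the fine geometric consistency constraint.
import torch

def project_points(points, K, R, t):
    """points: (N, 3) world coordinates; K: (3, 3) intrinsics;
    R: (3, 3), t: (3,) world-to-camera extrinsics."""
    cam = points @ R.t() + t          # world -> camera frame
    uv = cam @ K.t()                  # camera -> homogeneous pixel coords
    return uv[:, :2] / uv[:, 2:3]     # perspective divide -> (N, 2) pixels
```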

ICLR Conference 2024 Conference Paper

Uni3D: Exploring Unified 3D Representation at Scale

  • Junsheng Zhou
  • Jinsheng Wang
  • Baorui Ma
  • Yu-Shen Liu
  • Tiejun Huang 0003
  • Xinlong Wang

Scaling up representations for images and text has been extensively investigated in the past few years and has led to revolutions in vision and language learning. However, scalable representations for 3D objects and scenes remain relatively unexplored. In this work, we present Uni3D, a 3D foundation model that explores unified 3D representation at scale. Uni3D uses a 2D-initialized ViT, pretrained end-to-end, to align 3D point cloud features with image-text aligned features. Via this simple architecture and pretext task, Uni3D can leverage abundant 2D pretrained models as initialization and image-text aligned models as the target, unlocking the great potential of 2D model zoos and scaling-up strategies for the 3D world. We efficiently scale Uni3D up to one billion parameters and set new records on a broad range of 3D tasks, such as zero-shot classification, few-shot classification, open-world understanding, and zero-shot part segmentation. We show that the strong Uni3D representation also enables applications such as 3D painting and retrieval in the wild. We believe that Uni3D provides a new direction for exploring both the scaling and efficiency of 3D representations.
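
A hypothetical sketch of the alignment objective as a standard CLIP-style contrastive loss (Uni3D's actual training code may differ): point cloud embeddings are pulled toward the frozen image-text embeddings of the same object and pushed away from those of other objects in the batch.

```python
# Hypothetical CLIP-style alignment objective (Uni3D's exact code may differ):
# point cloud embeddings from the scaled-up ViT are contrasted against frozen
# image-text embeddings of the same objects.
import torch
import torch.nn.functional as F

def contrastive_align(pc_emb, target_emb, temperature=0.07):
    """pc_emb, target_emb: (B, D) L2-normalized embeddings of matched pairs."""
    logits = pc_emb @ target_emb.t() / temperature   # (B, B) similarities
    labels = torch.arange(pc_emb.shape[0])           # diagonal entries match
    return (F.cross_entropy(logits, labels)
            + F.cross_entropy(logits.t(), labels)) / 2

pc = F.normalize(torch.randn(8, 512), dim=-1)         # point cloud features
clip_feat = F.normalize(torch.randn(8, 512), dim=-1)  # frozen CLIP features
loss = contrastive_align(pc, clip_feat)
```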

NeurIPS Conference 2024 Conference Paper

Zero-Shot Scene Reconstruction from Single Images with Deep Prior Assembly

  • Junsheng Zhou
  • Yu-Shen Liu
  • Zhizhong Han

Large language and vision models have been leading a revolution in visual computing. By greatly scaling up the sizes of data and model parameters, large models learn deep priors that lead to remarkable performance on various tasks. In this work, we present deep prior assembly, a novel framework that assembles diverse deep priors from large models for scene reconstruction from single images in a zero-shot manner. We show that this challenging task can be done without extra knowledge, simply by generalizing each deep prior to one sub-task. To this end, we introduce novel methods for pose, scale, and occlusion parsing, which are the keys to enabling deep priors to work together robustly. Deep prior assembly does not require any 3D or 2D data-driven training for the task and demonstrates superior performance in generalizing priors to open-world scenes. We conduct evaluations on various datasets, and report analyses as well as numerical and visual comparisons with the latest methods to show our superiority. Project page: https://junshengzhou.github.io/DeepPriorAssembly.

NeurIPS Conference 2023 Conference Paper

Differentiable Registration of Images and LiDAR Point Clouds with VoxelPoint-to-Pixel Matching

  • Junsheng Zhou
  • Baorui Ma
  • Wenyuan Zhang
  • Yi Fang
  • Yu-Shen Liu
  • Zhizhong Han

Cross-modality registration between 2D images captured by cameras and 3D point clouds from LiDARs is a crucial task in computer vision and robotics. Previous methods estimate 2D-3D correspondences by matching point and pixel patterns learned by neural networks, and use Perspective-n-Point (PnP) to estimate the rigid transformation during post-processing. However, these methods struggle to robustly map points and pixels into a shared latent space, since points and pixels have very different characteristics and their patterns are learned in different manners (MLP and CNN); they also fail to supervise the transformation directly, since PnP is non-differentiable, which leads to unstable registration results. To address these problems, we propose to learn a structured cross-modality latent space to represent pixel features and 3D features via a differentiable probabilistic PnP solver. Specifically, we design a triplet network to learn VoxelPoint-to-Pixel matching, where we represent 3D elements using both voxels and points to learn the cross-modality latent space with pixels. We build both the voxel and pixel branches on CNNs to operate convolutions on voxels/pixels represented in grids, and integrate an additional point branch to regain the information lost during voxelization. We train our framework end-to-end by imposing supervision directly on the predicted pose distribution with a probabilistic PnP solver. To explore distinctive patterns of cross-modality features, we design a novel loss with adaptive-weighted optimization for cross-modality feature description. Experimental results on the KITTI and nuScenes datasets show significant improvements over the state-of-the-art methods.
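
A simplified, hypothetical sketch of the cross-modality matching objective as a triplet-style loss (the paper's adaptive weighting and probabilistic PnP supervision are omitted here): matched point/pixel features are pulled together, and the hardest non-matching pair is pushed apart by a margin.

```python
# Simplified cross-modality matching loss (hypothetical; omits the paper's
# adaptive weighting and the probabilistic PnP supervision): matched 2D-3D
# feature pairs are pulled together, hardest negatives pushed apart.
import torch
import torch.nn.functional as F

def cross_modal_triplet(point_feat, pixel_feat, margin=0.3):
    """point_feat, pixel_feat: (N, D) features of N matched point-pixel pairs."""
    d = torch.cdist(F.normalize(point_feat, dim=-1),
                    F.normalize(pixel_feat, dim=-1))   # (N, N) distances
    pos = d.diagonal()                                 # matched pairs
    # Hardest negative per point: closest non-matching pixel feature.
    neg = (d + 1e6 * torch.eye(d.shape[0])).min(dim=1).values
    return F.relu(pos - neg + margin).mean()
```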

AAAI Conference 2023 Conference Paper

NeAF: Learning Neural Angle Fields for Point Normal Estimation

  • Shujuan Li
  • Junsheng Zhou
  • Baorui Ma
  • Yu-Shen Liu
  • Zhizhong Han

Normal estimation for unstructured point clouds is an important task in 3D computer vision. Current methods achieve encouraging results by mapping local patches to normal vectors or learning local surface fitting with neural networks. However, these methods do not generalize well to unseen scenarios and are sensitive to parameter settings. To resolve these issues, we propose an implicit function that learns an angle field around the normal of each point in the spherical coordinate system, dubbed Neural Angle Fields (NeAF). Instead of directly predicting the normal of an input point, we predict the angle offset between the ground-truth normal and a randomly sampled query normal. This strategy pushes the network to observe more diverse samples, which leads to higher prediction accuracy and greater robustness. To predict normals from the learned angle fields at inference time, we randomly sample query vectors in a unit spherical space and take the vectors with minimal angle values as the predicted normals. To further leverage the prior learned by NeAF, we propose to refine the predicted normal vectors by minimizing the angle offsets. Experimental results on synthetic data and real scans show significant improvements over the state-of-the-art under widely used benchmarks. Project page: https://lisj575.github.io/NeAF/.
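
A sketch of NeAF-style inference as described above, with an oracle angle field standing in for the trained network (an assumption purely for the demo): sample query directions on the unit sphere, predict each query's angle offset, and keep the query with the smallest offset.

```python
# Sketch of NeAF-style inference: sample query normals on the unit sphere and
# keep the one with the smallest predicted angle offset. The oracle field
# below replaces a trained network purely for demonstration.
import torch
import torch.nn.functional as F

def predict_normal(angle_field, num_queries=256):
    queries = F.normalize(torch.randn(num_queries, 3), dim=-1)  # unit sphere
    offsets = angle_field(queries)          # (Q,) predicted angle offsets
    return queries[offsets.argmin()]        # query closest to the true normal

true_normal = F.normalize(torch.randn(3), dim=0)
# Oracle angle field: the exact angle between a query and the true normal.
angle_field = lambda q: torch.acos((q @ true_normal).clamp(-1.0, 1.0))
normal = predict_normal(angle_field)        # approaches true_normal as Q grows
```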

NeurIPS Conference 2022 Conference Paper

Learning Consistency-Aware Unsigned Distance Functions Progressively from Raw Point Clouds

  • Junsheng Zhou
  • Baorui Ma
  • Yu-Shen Liu
  • Yi Fang
  • Zhizhong Han

Surface reconstruction for point clouds is an important task in 3D computer vision. Most of the latest methods resolve this problem by learning signed distance functions (SDFs) from point clouds, which limits them to reconstructing shapes or scenes with closed surfaces. Other methods represent shapes or scenes with open surfaces using unsigned distance functions (UDFs) learned from large-scale ground-truth unsigned distances. However, the learned UDF struggles to provide smooth distance fields near the surface due to the discontinuous nature of point clouds. In this paper, we propose a novel method to learn consistency-aware unsigned distance functions directly from raw point clouds. We achieve this by learning to move 3D queries to reach the surface with a field consistency constraint, which also enables progressively estimating a more accurate surface. Specifically, we train a neural network to gradually infer the relationship between 3D queries and the approximated surface by searching for the moving target of queries in a dynamic way, which results in a consistent field around the surface. Meanwhile, we introduce a polygonization algorithm to extract surfaces directly from the gradient field of the learned UDF. Experimental results on surface reconstruction for synthetic and real scan data show significant improvements over the state-of-the-art under widely used benchmarks.
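
A hypothetical sketch of the query-moving supervision (simplified; the paper's dynamic target search and field consistency constraint are only approximated): each query is moved along the negative field gradient by its predicted distance and should land on the raw point cloud.

```python
# Hypothetical sketch of the query-moving supervision (simplified): move each
# query along the negative UDF gradient by its predicted distance; the moved
# query should land on the raw point cloud. create_graph=True lets the loss
# backpropagate through the gradient step into the network.
import torch

def moving_loss(udf, queries, point_cloud):
    """udf: a trainable network (N, 3) -> (N,); point_cloud: (M, 3) raw scan."""
    q = queries.clone().requires_grad_(True)
    d = udf(q)
    grad = torch.autograd.grad(d.sum(), q, create_graph=True)[0]
    grad = grad / (grad.norm(dim=-1, keepdim=True) + 1e-8)
    moved = q - d.unsqueeze(-1) * grad            # pull queries onto the surface
    # Moved queries should coincide with their nearest raw points.
    return torch.cdist(moved, point_cloud).min(dim=1).values.mean()
```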

IJCAI Conference 2013 Conference Paper

Efficient Latent Structural Perceptron with Hybrid Trees for Semantic Parsing

  • Junsheng Zhou
  • Juhong Xu
  • Weiguang Qu

Discriminative structured prediction models have been widely used in many natural language processing tasks, but applying them to semantic parsing is challenging. In this paper, by introducing hybrid trees as latent structure variables to close the gap between input sentences and output representations, we formulate semantic parsing as a structured prediction problem based on the latent-variable perceptron model, with a tree edit-distance loss as the optimization criterion. The proposed approach maintains the advantage of a discriminative model in accommodating flexible combinations of features and naturally incorporates an efficient decoding algorithm for learning and inference. Furthermore, to enhance the efficiency and accuracy of inference, we design an effective approach based on the vector space model to extract a smaller subset of relevant MR productions for test examples. Experimental results on a publicly available corpus show that our approach significantly outperforms previous systems.
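
A toy sketch of a latent structured perceptron update in the spirit described above (a generic formulation, not the paper's system; the decoder and feature extractor are user-supplied callables): decode the best latent hybrid tree for both the prediction and the gold output, then update weights by the feature difference.

```python
# Toy latent structured perceptron update (generic; decode/features are
# user-supplied). decode(x, weights, y=None) returns the best-scoring
# (output, latent_tree) pair; fixing y=gold_y yields the best latent hybrid
# tree consistent with the gold meaning representation.
def perceptron_update(weights, features, decode, x, gold_y, lr=1.0):
    pred_y, pred_h = decode(x, weights)          # unconstrained decoding
    _, gold_h = decode(x, weights, y=gold_y)     # latent tree for the gold MR
    if pred_y != gold_y:
        for f, v in features(x, gold_y, gold_h).items():
            weights[f] = weights.get(f, 0.0) + lr * v    # promote gold features
        for f, v in features(x, pred_y, pred_h).items():
            weights[f] = weights.get(f, 0.0) - lr * v    # demote predicted ones
    return weights
```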