Arrow Research search

Author name cluster

Tianhao Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers
2 author rows

Possible papers

11

NeurIPS Conference 2025 Conference Paper

CADGrasp: Learning Contact and Collision Aware General Dexterous Grasping in Cluttered Scenes

  • Jiyao Zhang
  • Zhiyuan Ma
  • Tianhao Wu
  • Zeyuan Chen
  • Hao Dong

Dexterous grasping in cluttered environments presents substantial challenges due to the high degrees of freedom of dexterous hands, occlusion, and potential collisions arising from diverse object geometries and complex layouts. To address these challenges, we propose CADGrasp, a two-stage algorithm for general dexterous grasping using single-view point cloud inputs. In the first stage, we predict a scene-decoupled, contact- and collision-aware representation—sparse IBS—as the optimization target. Sparse IBS compactly encodes the geometric and contact relationships between the dexterous hand and the scene, enabling stable and collision-free dexterous grasp pose optimization. To enhance the prediction of this high-dimensional representation, we introduce an occupancy-diffusion model with voxel-level conditional guidance and force closure score filtering. In the second stage, we develop several energy functions and ranking strategies for optimization based on sparse IBS to generate high-quality dexterous grasp poses. Extensive experiments in both simulated and real-world settings validate the effectiveness of our approach, demonstrating its capability to mitigate collisions while maintaining a high grasp success rate across diverse objects and complex scenes.

NeurIPS Conference 2024 Conference Paper

Expressive Gaussian Human Avatars from Monocular RGB Video

  • Hezhen Hu
  • Zhiwen Fan
  • Tianhao Wu
  • Yihan Xi
  • Seoyoung Lee
  • Georgios Pavlakos
  • Zhangyang Wang

Nuanced expressiveness, especially through detailed hand and facial expressions, is pivotal for enhancing the realism and vitality of digital human representations. In this work, we aim to learn expressive human avatars from a monocular RGB video, a setting that introduces new challenges in capturing and animating fine-grained details. To this end, we introduce EVA, a drivable human model that can recover fine details based on 3D Gaussians and an expressive parametric human model, SMPL-X. Focused on enhancing expressiveness, our work makes three key contributions. First, we highlight the importance of aligning the SMPL-X model with the video frames for effective avatar learning. Recognizing the limitations of current methods for estimating SMPL-X parameters from in-the-wild videos, we introduce a reconstruction module that significantly improves the image-model alignment. Second, we propose a context-aware adaptive density control strategy, which adaptively adjusts the gradient thresholds to accommodate the varied granularity across body parts. Third, we develop a feedback mechanism that predicts per-pixel confidence to better guide the optimization of the 3D Gaussians. Extensive experiments on two benchmarks demonstrate the superiority of our approach both quantitatively and qualitatively, especially on fine-grained hand and facial details. We make our code available at the project website: https://evahuman.github.io.

NeurIPS Conference 2023 Conference Paper

A Reduction-based Framework for Sequential Decision Making with Delayed Feedback

  • Yunchang Yang
  • Han Zhong
  • Tianhao Wu
  • Bin Liu
  • Liwei Wang
  • Simon S. Du

We study stochastic delayed feedback in general single-agent and multi-agent sequential decision making, which includes bandits, single-agent Markov decision processes (MDPs), and Markov games (MGs). We propose a novel reduction-based framework, which turns any multi-batched algorithm for sequential decision making with instantaneous feedback into a sample-efficient algorithm that can handle stochastic delays in sequential decision making. By plugging different multi-batched algorithms into our framework, we provide several examples demonstrating that our framework not only matches or improves existing results for bandits, tabular MDPs, and tabular MGs, but also provides the first line of studies on delays in sequential decision making with function approximation. In summary, we provide a complete set of sharp results for single-agent and multi-agent sequential decision making with delayed feedback.
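The reduction described in this abstract (plugging a multi-batched algorithm into a delay-handling wrapper) can be sketched with a toy bandit example. Everything below is illustrative: `BatchedGreedy` is a deliberately simple stand-in for a multi-batched algorithm, not the paper's method, and the delay model is a hypothetical choice.

```python
import heapq
import random

class BatchedGreedy:
    """Toy multi-batched bandit algorithm: commits to a batch of arm pulls
    up front and updates its statistics only at batch boundaries."""
    def __init__(self, n_arms, batch_size):
        self.n_arms, self.batch_size = n_arms, batch_size
        self.sums = [0.0] * n_arms
        self.counts = [0] * n_arms

    def next_batch(self):
        # Explore uniformly until every arm has feedback, then exploit.
        if min(self.counts) == 0:
            return [i % self.n_arms for i in range(self.batch_size)]
        best = max(range(self.n_arms),
                   key=lambda a: self.sums[a] / self.counts[a])
        return [best] * self.batch_size

    def update(self, feedback):
        for arm, reward in feedback:
            self.sums[arm] += reward
            self.counts[arm] += 1

def run_with_delays(algo, means, delay_fn, n_batches, rng):
    """Reduction sketch: rewards arrive delay_fn(t) steps late, are buffered
    in a priority queue, and are released to the batched learner only at
    batch boundaries, once they have actually arrived."""
    pending, t, total = [], 0, 0.0
    for _ in range(n_batches):
        for arm in algo.next_batch():
            t += 1
            reward = rng.gauss(means[arm], 0.1)
            total += reward
            heapq.heappush(pending, (t + delay_fn(t), arm, reward))
        arrived = []
        while pending and pending[0][0] <= t:
            _, arm, reward = heapq.heappop(pending)
            arrived.append((arm, reward))
        algo.update(arrived)
    return total
```

Because the learner only ever consumes feedback at batch ends, delays shorter than the remaining batch length cost nothing, which is the intuition behind the framework's sample-efficiency guarantees.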

NeurIPS Conference 2023 Conference Paper

Learning Score-based Grasping Primitive for Human-assisting Dexterous Grasping

  • Tianhao Wu
  • Mingdong Wu
  • Jiyao Zhang
  • Yunchong Gan
  • Hao Dong

The use of anthropomorphic robotic hands for assisting individuals in situations where human hands may be unavailable or unsuitable has gained significant importance. In this paper, we propose a novel task called human-assisting dexterous grasping, which aims to train a policy for controlling a robotic hand's fingers to assist users in grasping objects. Unlike conventional dexterous grasping, this task presents a more complex challenge, as the policy needs to adapt to diverse user intentions in addition to the object's geometry. We address this challenge by proposing an approach consisting of two sub-modules: a hand-object-conditional grasping primitive called Grasping Gradient Field (GraspGF), and a history-conditional residual policy. GraspGF learns 'how' to grasp by estimating the gradient of a synthesised set of successful grasping examples, while the residual policy determines 'when' and at what speed the grasping action should be executed based on the trajectory history. Experimental results demonstrate the superiority of our proposed method compared to baselines, highlighting its user-awareness and practicality in real-world applications. The code and demonstrations can be viewed at https://sites.google.com/view/graspgf.

NeurIPS Conference 2023 Conference Paper

Neural Fields with Hard Constraints of Arbitrary Differential Order

  • Fangcheng Zhong
  • Kyle Fogarty
  • Param Hanji
  • Tianhao Wu
  • Alejandro Sztrajman
  • Andrew Spielberg
  • Andrea Tagliasacchi
  • Petra Bosilj

While deep learning techniques have become extremely popular for solving a broad range of optimization problems, methods to enforce hard constraints during optimization, particularly on deep neural networks, remain underdeveloped. Inspired by the rich literature on meshless interpolation and its extension to spectral collocation methods in scientific computing, we develop a series of approaches for enforcing hard constraints on neural fields, which we refer to as Constrained Neural Fields (CNF). The constraints can be specified as a linear operator applied to the neural field and its derivatives. We also design specific model representations and training strategies for problems where standard models may encounter difficulties, such as conditioning of the system, memory consumption, and capacity of the network when being constrained. Our approaches are demonstrated in a wide range of real-world applications. Additionally, we develop a framework that enables highly efficient model and constraint specification, which can be readily applied to any downstream task where hard constraints need to be explicitly satisfied during optimization.
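The core idea of this abstract (constraints expressed as a linear operator on the field, enforced exactly during optimization) admits a minimal sketch in the linear-in-features case, where the KKT system yields hard equality constraints in closed form. This is a simplified illustration under assumed notation, not the paper's neural-field formulation.

```python
import numpy as np

def fit_constrained(phi_data, y, phi_con, c, lam=1e-6):
    """Least-squares fit of a linear-in-features model w.phi(x) subject to
    exact linear equality constraints A w = c, via the KKT system:

        minimize ||Phi w - y||^2 + lam ||w||^2   s.t.   A w = c

    Since derivatives of phi are also linear in w, A can encode constraints
    on the field's value or any of its derivatives at chosen points."""
    Phi, A = np.asarray(phi_data), np.asarray(phi_con)
    y, c = np.asarray(y), np.asarray(c)
    d, m = Phi.shape[1], A.shape[0]
    H = 2.0 * (Phi.T @ Phi + lam * np.eye(d))
    # KKT block system: [H  A^T; A  0] [w; mu] = [2 Phi^T y; c]
    KKT = np.block([[H, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([2.0 * Phi.T @ y, c])
    return np.linalg.solve(KKT, rhs)[:d]
```

For example, fitting a line to data while pinning its value at x = 0 to exactly 5 satisfies the constraint to solver precision regardless of the data, which is the "hard" (rather than penalty-based) behaviour the paper targets.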

AAAI Conference 2023 Conference Paper

Practical Markov Boundary Learning without Strong Assumptions

  • Xingyu Wu
  • Bingbing Jiang
  • Tianhao Wu
  • Huanhuan Chen

Theoretically, the Markov boundary (MB) is the optimal solution for feature selection. However, existing MB learning algorithms often fail to identify some critical features in real-world feature selection tasks, mainly because the strict assumptions of existing algorithms, on either data distribution, variable types, or correctness of criteria, cannot be satisfied in application scenarios. This paper takes further steps toward opening the door to real-world applications for MB. In particular, we contribute a practical MB learning strategy that maintains feasibility and effectiveness on real-world data where variables can be numerical or categorical with linear or nonlinear, pairwise or multivariate relationships. Specifically, the equivalence between MB and the minimal conditional covariance operator (CCO) is investigated, which inspires us to design the objective function based on the predictability evaluation of the mapping variables in a reproducing kernel Hilbert space. Based on this, a kernel MB learning algorithm is proposed, in which nonlinear multivariate dependence can be considered without extra requirements on data distribution or variable types. Extensive experiments demonstrate the efficacy of these contributions.

NeurIPS Conference 2022 Conference Paper

D^2NeRF: Self-Supervised Decoupling of Dynamic and Static Objects from a Monocular Video

  • Tianhao Wu
  • Fangcheng Zhong
  • Andrea Tagliasacchi
  • Forrester Cole
  • Cengiz Oztireli

Given a monocular video, segmenting and decoupling dynamic objects while recovering the static environment is a widely studied problem in machine intelligence. Existing solutions usually approach this problem in the image domain, limiting their performance and understanding of the environment. We introduce Decoupled Dynamic Neural Radiance Field (D^2NeRF), a self-supervised approach that takes a monocular video and learns a 3D scene representation which decouples moving objects, including their shadows, from the static background. Our method represents the moving objects and the static background by two separate neural radiance fields with only one allowing for temporal changes. A naive implementation of this approach leads to the dynamic component taking over the static one as the representation of the former is inherently more general and prone to overfitting. To this end, we propose a novel loss to promote correct separation of phenomena. We further propose a shadow field network to detect and decouple dynamically moving shadows. We introduce a new dataset containing various dynamic objects and shadows and demonstrate that our method can achieve better performance than state-of-the-art approaches in decoupling dynamic and static 3D objects, occlusion and shadow removal, and image segmentation for moving objects. Project page: https://d2nerf.github.io/

NeurIPS Conference 2022 Conference Paper

Generalization Bounds for Estimating Causal Effects of Continuous Treatments

  • Xin Wang
  • Shengfei Lyu
  • Xingyu Wu
  • Tianhao Wu
  • Huanhuan Chen

We focus on estimating causal effects of continuous treatments (e.g., dosage in medicine), also known as the dose-response function. Existing methods in causal inference for continuous treatments using neural networks are effective and to some extent reduce selection bias, which is introduced by non-randomized treatments among individuals and might lead to covariate imbalance and thus unreliable inference. To theoretically support the alleviation of selection bias in the setting of continuous treatments, we exploit the re-weighting schema and the Integral Probability Metric (IPM) distance to derive an upper bound on the counterfactual loss of estimating the average dose-response function (ADRF), and herein the IPM distance builds a bridge from a source (factual) domain to an infinite number of target (counterfactual) domains. We provide a discretized approximation of the IPM distance with a theoretical guarantee in the practical implementation. Based on the theoretical analyses, we also propose a novel algorithm, called Average Dose-response estiMatIon via re-weighTing schema (ADMIT). ADMIT simultaneously learns a re-weighting network, which aims to alleviate the selection bias, and an inference network, which makes factual and counterfactual estimations. In addition, the effectiveness of ADMIT is empirically demonstrated in both synthetic and semi-synthetic experiments by outperforming the existing benchmarks.
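The "discretized approximation of the IPM distance" mentioned in this abstract can be illustrated with a toy version: partition the continuous treatment range into bins and compare each bin's covariate distribution against the pooled (factual) one. The 1-D Wasserstein distance below is one concrete instance of an IPM (the 1-Lipschitz function class); the paper's exact discretization and metric choice may differ.

```python
import numpy as np

def wasserstein_1d(x, y, n_grid=200):
    """Empirical 1-Wasserstein distance between two 1-D samples: the IPM
    over 1-Lipschitz functions, equal to the average absolute difference
    of the two quantile functions."""
    q = (np.arange(n_grid) + 0.5) / n_grid
    return float(np.mean(np.abs(np.quantile(x, q) - np.quantile(y, q))))

def discretized_ipm(doses, covariates, n_bins=5):
    """Illustrative discretization: split the continuous treatment into
    quantile bins and average the IPM between each bin's covariates and
    the pooled covariate distribution. Large values signal that covariates
    are imbalanced across treatment levels (selection bias)."""
    doses, covariates = np.asarray(doses), np.asarray(covariates)
    edges = np.quantile(doses, np.linspace(0.0, 1.0, n_bins + 1))
    total, used = 0.0, 0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (doses >= lo) & (doses <= hi)
        if mask.sum() > 1:
            total += wasserstein_1d(covariates[mask], covariates)
            used += 1
    return total / max(used, 1)
```

When covariates are independent of the dose, the per-bin distributions match the pooled one and the discretized IPM stays near zero; when covariates track the dose, it grows, which is the quantity a re-weighting network would try to drive down.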

NeurIPS Conference 2022 Conference Paper

Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning

  • Yuanpei Chen
  • Tianhao Wu
  • Shengjie Wang
  • Xidong Feng
  • Jiechuan Jiang
  • Zongqing Lu
  • Stephen McAleer
  • Hao Dong

Achieving human-level dexterity is an important open problem in robotics. However, dexterous-hand manipulation tasks, even at the baby level, are challenging to solve through reinforcement learning (RL). The difficulty lies in the high degrees of freedom and the required cooperation among heterogeneous agents (e.g., the joints of the fingers). In this study, we propose the Bimanual Dexterous Hands Benchmark (Bi-DexHands), a simulator that involves two dexterous hands with tens of bimanual manipulation tasks and thousands of target objects. Tasks in Bi-DexHands are first designed to match human-level motor skills according to the cognitive-science literature, and are then built in Isaac Gym; this enables highly efficient RL training, reaching 30,000+ FPS on a single NVIDIA RTX 3090. We provide a comprehensive benchmark for popular RL algorithms under different settings, including multi-agent RL, offline RL, multi-task RL, and meta-RL. Our results show that PPO-style on-policy algorithms can learn to solve simple manipulation tasks equivalent to those of a 48-month-old human baby (e.g., catching a flying object, opening a bottle), while multi-agent RL can further help to learn manipulations that require skilled bimanual cooperation (e.g., lifting a pot, stacking blocks). Despite this success on individual tasks, existing RL algorithms fail in most of the multi-task and few-shot settings when it comes to mastering multiple manipulation skills, which calls for further development from the RL community. Our project is open-sourced at https://github.com/PKU-MARL/DexterousHands.

NeurIPS Conference 2020 Conference Paper

Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot

  • Jingtong Su
  • Yihang Chen
  • Tianle Cai
  • Tianhao Wu
  • Ruiqi Gao
  • Liwei Wang
  • Jason D. Lee

Network pruning is a method for reducing test-time computational resource requirements with minimal performance degradation. The conventional wisdom on pruning algorithms suggests that: (1) pruning methods exploit information from training data to find good subnetworks; and (2) the architecture of the pruned network is crucial for good performance. In this paper, we conduct sanity checks for these beliefs on several recent unstructured pruning methods and surprisingly find that: (1) a set of methods which aim to find good subnetworks of the randomly-initialized network (which we call "initial tickets") hardly exploit any information from the training data; and (2) for the pruned networks obtained by these methods, randomly changing the preserved weights in each layer, while keeping the total number of preserved weights unchanged per layer, does not affect the final performance. These findings inspire us to choose a series of simple data-independent prune ratios for each layer, and to randomly prune each layer accordingly to obtain a subnetwork (which we call "random tickets"). Experimental results show that our zero-shot random tickets outperform or attain similar performance to existing "initial tickets". In addition, we identify one existing pruning method that passes our sanity checks. We hybridize the ratios in our random tickets with this method and propose a new method called "hybrid tickets", which achieves further improvement.
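The "random tickets" construction described in this abstract (data-independent per-layer keep ratios, weights pruned uniformly at random within each layer) can be sketched as mask generation. This is a minimal illustration assuming dense layers represented as arrays; the paper's specific ratio schedules are not reproduced here.

```python
import numpy as np

def random_ticket_masks(layer_shapes, keep_ratios, rng):
    """Generate binary pruning masks without touching any training data:
    for each layer, keep a fixed fraction of weights chosen uniformly at
    random, preserving the per-layer count of surviving weights."""
    masks = []
    for shape, keep in zip(layer_shapes, keep_ratios):
        n = int(np.prod(shape))
        k = int(round(keep * n))
        flat = np.zeros(n, dtype=bool)
        # Select exactly k surviving positions uniformly at random.
        flat[rng.choice(n, size=k, replace=False)] = True
        masks.append(flat.reshape(shape))
    return masks
```

Because the masks depend only on the layer shapes and the chosen ratios, this is a zero-shot procedure: no forward or backward pass over the data is needed before pruning, which is precisely what the paper's sanity checks probe.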

IROS Conference 2019 Conference Paper

End-to-End Driving Model for Steering Control of Autonomous Vehicles with Future Spatiotemporal Features

  • Tianhao Wu
  • Ao Luo
  • Rui Huang 0008
  • Hong Cheng 0002
  • Yang Zhao 0024

End-to-end deep learning has gained considerable interest for autonomous driving in both academia and industry, especially in the decision-making process. One critical issue in the decision-making process of autonomous vehicles is steering control. Researchers have already trained different artificial neural networks to predict the steering angle from a front-facing camera data stream. However, existing end-to-end methods only consider the spatiotemporal relation on a single layer and lack the ability to extract future spatiotemporal information. In this paper, we propose an end-to-end driving model based on a Convolutional Long Short-Term Memory (Conv-LSTM) neural network with a Multi-scale Spatiotemporal Integration (MSI) module, which aims to encode spatiotemporal information from different scales for steering angle prediction. Moreover, we employ future sequential information to enhance the spatiotemporal features of the end-to-end driving model. We demonstrate the efficiency of the proposed end-to-end driving model on the public Udacity dataset in comparison with several existing methods. Experimental results show that the proposed model performs better than existing methods, especially in some complex scenarios. Furthermore, we evaluate the proposed driving model on a real-time autonomous vehicle, and results show that it predicts the steering angle with high accuracy compared to a skilled human driver.