Arrow Research search

Author name cluster

Chao Zhou

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
2 author rows

Possible papers

TMLR Journal 2025 Journal Article

©Plug-in Authorization for Human Copyright Protection in Text-to-Image Model

  • Chao Zhou
  • Huishuai Zhang
  • Jiang Bian
  • Weiming Zhang
  • Nenghai Yu

This paper addresses the contentious issue of copyright infringement in images generated by text-to-image models, sparking debates among AI developers, content creators, and legal entities. State-of-the-art models create high-quality content without crediting original creators, causing concern in the artistic community and model providers. To mitigate this, we propose the ©Plug-in Authorization framework, introducing three operations: addition, extraction, and combination. Addition involves training a ©plug-in for specific copyright, facilitating proper credit attribution. The extraction allows creators to reclaim copyright from infringing models, and the combination enables users to merge different ©plug-ins. These operations act as permits, incentivizing fair use and providing flexibility in authorization. We present innovative approaches, "Reverse LoRA" for extraction and "EasyMerge" for seamless combination. Experiments in artist-style replication and cartoon IP recreation demonstrate ©plug-ins' effectiveness, offering a valuable solution for human copyright protection in the age of generative AIs. The code is available at https://github.com/zc1023/-Plug-in-Authorization.git

IJCAI Conference 2025 Conference Paper

Accelerating Diffusion-based Super-Resolution with Dynamic Time-Spatial Sampling

  • Rui Qin
  • Qijie Wang
  • Ming Sun
  • Haowei Zhu
  • Chao Zhou
  • Bin Wang

Diffusion models have gained attention for their success in modeling complex distributions, achieving impressive perceptual quality in SR tasks. However, existing diffusion-based SR methods often suffer from high computational costs, requiring numerous iterative steps for training and inference. Existing acceleration techniques, such as distillation and solver optimization, are generally task-agnostic and do not fully leverage the specific characteristics of low-level tasks like super-resolution (SR). In this study, we analyze the frequency- and spatial-domain properties of diffusion-based SR methods, revealing key insights into the temporal and spatial dependencies of high-frequency signal recovery. Specifically, high-frequency details benefit from concentrated optimization during early and late diffusion iterations, while spatially textured regions demand adaptive denoising strategies. Building on these observations, we propose the Time-Spatial-aware Sampling strategy (TSS) for the acceleration of Diffusion SR without any extra training cost. TSS combines Time Dynamic Sampling (TDS), which allocates more iterations to refining textures, and Spatial Dynamic Sampling (SDS), which dynamically adjusts strategies based on image content. Extensive evaluations across multiple benchmarks demonstrate that TSS achieves state-of-the-art (SOTA) performance with significantly fewer iterations, improving MUSIQ scores by 0.2 to 3.0 and outperforming the current acceleration methods with only half the number of steps.

ICML Conference 2025 Conference Paper

Mirror, Mirror of the Flow: How Does Regularization Shape Implicit Bias?

  • Tom Jacobs
  • Chao Zhou
  • Rebekka Burkholz

Implicit bias plays an important role in explaining how overparameterized models generalize well. Explicit regularization like weight decay is often employed in addition to prevent overfitting. While both concepts have been studied separately, in practice, they often act in tandem. Understanding their interplay is key to controlling the shape and strength of implicit bias, as it can be modified by explicit regularization. To this end, we incorporate explicit regularization into the mirror flow framework and analyze its lasting effects on the geometry of the training dynamics, covering three distinct effects: positional bias, type of bias, and range shrinking. Our analytical approach encompasses a broad class of problems, including sparse coding, matrix sensing, single-layer attention, and LoRA, for which we demonstrate the utility of our insights. To exploit the lasting effect of regularization and highlight the potential benefit of dynamic weight decay schedules, we propose to switch off weight decay during training, which can improve generalization, as we demonstrate in experiments.
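The idea of switching off weight decay partway through training can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the one-dimensional quadratic loss, the hypothetical names `weight_decay_at` and `train_quadratic`, and all hyperparameter values. It only shows the mechanics of such a schedule and does not reproduce the paper's mirror flow analysis or its experiments.

```python
def weight_decay_at(step, switch_off_step, base_decay):
    """Hypothetical schedule: constant weight decay, then zero after switch_off_step."""
    return base_decay if step < switch_off_step else 0.0


def train_quadratic(steps, switch_off_step, lr=0.1, base_decay=0.5, target=1.0, w=0.0):
    """Gradient descent on the toy loss 0.5 * (w - target)**2 with scheduled weight decay."""
    for step in range(steps):
        grad = w - target  # gradient of the toy loss
        decay = weight_decay_at(step, switch_off_step, base_decay)
        w -= lr * (grad + decay * w)  # weight decay adds decay * w to the gradient
    return w


w_always_on = train_quadratic(steps=500, switch_off_step=500)   # decay never switched off
w_switched_off = train_quadratic(steps=500, switch_off_step=100)  # decay off after step 100
```

In this toy problem, permanent weight decay pulls the solution toward zero (to `target / (1 + base_decay)`), while switching it off lets the weight reach `target`. The lasting geometric effects described in the abstract arise only in richer parameterizations and are not captured here.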

NeurIPS Conference 2025 Conference Paper

Neural B-frame Video Compression with Bi-directional Reference Harmonization

  • Yuxi Liu
  • jin dengchao
  • Shuai Huo
  • Jiawen Gu
  • Chao Zhou
  • Huihui Bai
  • Ming Lu
  • Zhan Ma

Neural video compression (NVC) has made significant progress in recent years, while neural B-frame video compression (NBVC) remains underexplored compared to P-frame compression. NBVC can adopt bi-directional reference frames for better compression performance. However, NBVC's hierarchical coding may complicate continuous temporal prediction, especially at some hierarchical levels with a large frame span, which could cause the contribution of the two reference frames to be unbalanced. To optimize reference information utilization, we propose a novel NBVC method, termed Bi-directional Reference Harmonization Video Compression (BRHVC), with the proposed Bi-directional Motion Converge (BMC) and Bi-directional Contextual Fusion (BCF). BMC converges multiple optical flows in motion compression, leading to more accurate motion compensation on a larger scale. Then BCF explicitly models the weights of reference contexts under the guidance of motion compensation accuracy. With more efficient motions and contexts, BRHVC can effectively harmonize bi-directional references. Experimental results indicate that our BRHVC outperforms previous state-of-the-art NVC methods, even surpassing the traditional coding, VTM-RA (under random access configuration), on the HEVC datasets. The source code is released at https://github.com/kwai/NVC.

NeurIPS Conference 2025 Conference Paper

Pay Attention to Small Weights

  • Chao Zhou
  • Tom Jacobs
  • Advait Gadhikar
  • Rebekka Burkholz

Finetuning large pretrained neural networks is known to be resource-intensive, both in terms of memory and computational cost. To mitigate this, a common approach is to restrict training to a subset of the model parameters. By analyzing the relationship between gradients and weights during finetuning, we observe a notable pattern: large gradients are often associated with small-magnitude weights. This correlation is more pronounced in fine-tuning settings than in training from scratch. Motivated by this observation, we propose NanoAdam, which dynamically updates only the small-magnitude weights during fine-tuning and offers several practical advantages: first, the criterion is gradient-free, meaning the parameter subset can be determined without gradient computation; second, it preserves large-magnitude weights, which are likely to encode critical features learned during pre-training, thereby reducing the risk of catastrophic forgetting; third, it permits the use of larger learning rates and consistently leads to better generalization performance in experiments. We demonstrate this for both NLP and vision tasks.
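The gradient-free selection criterion described above can be sketched as follows. This is a minimal illustration, not the NanoAdam implementation: the function names `small_weight_mask` and `masked_sgd_step`, the fixed `fraction`, and the use of a plain SGD update in place of an Adam-style update are all assumptions made here for brevity.

```python
def small_weight_mask(weights, fraction):
    """Select the given fraction of weights with the smallest magnitude.

    The mask depends only on the weights, so no gradient is needed to build it.
    """
    k = int(len(weights) * fraction)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    chosen = set(order[:k])
    return [i in chosen for i in range(len(weights))]


def masked_sgd_step(weights, grads, mask, lr):
    """Update only the selected weights; large-magnitude weights stay frozen.

    Plain SGD is used here as a stand-in for an Adam-style update.
    """
    return [w - lr * g if m else w for w, g, m in zip(weights, grads, mask)]


weights = [0.9, -0.05, 0.3, 0.01, -1.2, 0.08]
grads = [0.1, 0.4, -0.2, 0.5, 0.05, -0.3]
mask = small_weight_mask(weights, fraction=0.5)
updated = masked_sgd_step(weights, grads, mask, lr=0.1)
```

Because the mask is computed from weight magnitudes alone, it can be refreshed during training without an extra backward pass, which is the practical advantage the abstract points to.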

AAAI Conference 2025 Conference Paper

Plug-and-Play Tri-Branch Invertible Block for Image Rescaling

  • Jingwei Bao
  • Jinhua Hao
  • Pengcheng Xu
  • Ming Sun
  • Chao Zhou
  • Shuyuan Zhu

High-resolution (HR) images are commonly downscaled to low-resolution (LR) to reduce bandwidth, followed by upscaling to restore their original details. Recent advancements in image rescaling algorithms have employed invertible neural networks (INNs) to create a unified framework for downscaling and upscaling, ensuring a one-to-one mapping between LR and HR images. Traditional methods, utilizing dual-branch based vanilla invertible blocks, process high-frequency and low-frequency information separately, often relying on specific distributions to model high-frequency components. However, processing the low-frequency component directly in the RGB domain introduces channel redundancy, limiting the efficiency of image reconstruction. To address these challenges, we propose a plug-and-play tri-branch invertible block (T-InvBlocks) that decomposes the low-frequency branch into luminance (Y) and chrominance (CbCr) components, reducing redundancy and enhancing feature processing. Additionally, we adopt an all-zero mapping strategy for high-frequency components during upscaling, focusing essential rescaling information within the LR image. Our T-InvBlocks can be seamlessly integrated into existing rescaling models, improving performance in both general rescaling tasks and scenarios involving lossy compression. Extensive experiments confirm that our method advances the state of the art in HR image reconstruction.
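The luminance/chrominance split mentioned above can be illustrated with a standard color transform. The sketch below uses the full-range BT.601 (JFIF) RGB-to-YCbCr conversion; whether the paper uses this exact variant is an assumption, and the function name `rgb_to_ycbcr` is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) conversion for 8-bit RGB values.

    Y carries luminance; Cb and Cr carry chrominance, centered at 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


# A gray pixel repeats the same value in all three RGB channels,
# but in YCbCr the information sits in Y and the chroma stays neutral.
gray = rgb_to_ycbcr(128, 128, 128)
red = rgb_to_ycbcr(255, 0, 0)
```

For the gray pixel, both chroma components land at the neutral value 128, which gives a small picture of the channel redundancy that processing directly in RGB would carry along.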

JBHI Journal 2025 Journal Article

PMMNet: A Dual Branch Fusion Network of Point Cloud and Multi-View for Intracranial Aneurysm Classification and Segmentation

  • Ruifen Cao
  • Dongwei Zhang
  • Pijing Wei
  • Yun Ding
  • Chunhou Zheng
  • Dayu Tan
  • Chao Zhou

Intracranial aneurysm (IA) is a vascular disease of the brain arteries caused by pathological vascular dilation, which can result in subarachnoid hemorrhage if ruptured. Automatic classification and segmentation of intracranial aneurysms are essential for their diagnosis and treatment. However, the majority of current research is focused on two-dimensional images, ignoring the 3D spatial information that is also critical. In this work, we propose a novel dual-branch fusion network called the Point Cloud and Multi-View Medical Neural Network (PMMNet) for IA classification and segmentation. Specifically, one branch based on 3D point clouds extracts spatial features, whereas the other branch based on multi-view images acquires 2D pixel features. Ultimately, the two types of features are fused for IA classification and segmentation. To extract both local and global features from 3D point clouds, a Multilayer Perceptron (MLP) and the attention mechanism are used in parallel. In addition, an SPSA module is proposed for multi-view image feature learning, which extracts finer channel and spatial multi-scale features from 2D images. In experiments on the IntrA dataset, the proposed PMMNet outperforms other state-of-the-art methods, demonstrating its strength on this medical 3D dataset. We also obtain competitive results on public datasets, including ModelNet40, ModelNet10, and ShapeNetPart, which further validate the robustness and generality of the PMMNet.

NeurIPS Conference 2025 Conference Paper

Sign-In to the Lottery: Reparameterizing Sparse Training

  • Advait Gadhikar
  • Tom Jacobs
  • Chao Zhou
  • Rebekka Burkholz

The performance gap between training sparse neural networks from scratch (PaI) and dense-to-sparse training presents a major roadblock for efficient deep learning. According to the Lottery Ticket Hypothesis, PaI hinges on finding a problem-specific parameter initialization. As we show, to this end, determining correct parameter signs is sufficient. Yet, they remain elusive to PaI. To address this issue, we propose Sign-In, which employs a dynamic reparameterization that provably induces sign flips. Such sign flips are complementary to the ones that dense-to-sparse training can accomplish, making Sign-In an orthogonal method. While our experiments and theory suggest performance improvements of PaI, they also carve out the main open challenge to close the gap between PaI and dense-to-sparse training.

ICML Conference 2025 Conference Paper

Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion

  • Anle Ke
  • Xu Zhang 0027
  • Tong Chen 0004
  • Ming Lu 0003
  • Chao Zhou
  • Jiawen Gu
  • Zhan Ma 0001

Existing multimodal large model-based image compression frameworks often rely on a fragmented integration of semantic retrieval, latent compression, and generative models, resulting in suboptimal performance in both reconstruction fidelity and coding efficiency. To address these challenges, we propose a residual-guided ultra lowrate image compression named ResULIC, which incorporates residual signals into both semantic retrieval and the diffusion-based generation process. Specifically, we introduce Semantic Residual Coding (SRC) to capture the semantic disparity between the original image and its compressed latent representation. A perceptual fidelity optimizer is further applied for superior reconstruction quality. Additionally, we present the Compression-aware Diffusion Model (CDM), which establishes an optimal alignment between bitrates and diffusion time steps, improving compression-reconstruction synergy. Extensive experiments demonstrate the effectiveness of ResULIC, achieving superior objective and subjective performance compared to state-of-the-art diffusion-based methods with -80.7%, -66.3% BD-rate saving in terms of LPIPS and FID.

ICRA Conference 2022 Conference Paper

An Indeterministic Vision-Based State Observer for Growing Magnetic Microrobot Motion Status Estimation

  • Zhiyong Sun 0002
  • Yu Cheng 0006
  • Chao Zhou
  • Erkang Cheng
  • Gengliang Chen
  • Lixin Dong
  • Bo Song

To date, untethered micro/nanorobots have attracted considerable attention due to their unique potential for in-vivo applications such as targeted therapy. One of the most promising types of micro/nanorobots is the class of ferromagnetic microrobots, which can be efficiently actuated via gradient or rotational magnetic fields generated by less costly electromagnetic coil systems. For successful operation, locomotion control of magnetic microrobots is non-trivial. Modern controllers commonly require motion status-based feedback. To fully utilize those advanced approaches, the motion state of a microrobot must be supplied; however, obtaining it is still challenging in some cases. During locomotion, one ferromagnetic microrobot can combine with others to form a larger unstructured one, namely a growing magnetic microrobot (GMM), whose dynamic behavior keeps changing, so model-based observers are not applicable. Besides, tracking and estimating the states of such unstructured time-varying GMMs in complex surroundings is challenging, especially under uneven sampling. To accurately estimate the GMM motion status in a complex environment via micro-vision, this study develops an indeterministic observer that leverages the discriminative correlation filter with channel/spatial reliability (CSR-DCF) and the variable-step finite-time sliding mode (FSM-V) state estimation theory. Experiments verify that the proposed observation scheme can effectively estimate the motion states of a GMM moving through obstacle-laden surroundings.

IROS Conference 2022 Conference Paper

Modeling and Characterization of Artificial Bacteria Flagella with Micro-structured Soft-magnetic Teeth

  • Zejie Yu
  • Chaojian Hou
  • Shuideng Wang
  • Kun Wang 0036
  • Donglei Chen
  • Wenqi Zhang 0004
  • Zhi Qu
  • Zhiyong Sun 0002

Sub-structures such as micro-structured magnetic teeth fabricated with an artificial bacteria flagellum (ABF) are designed to achieve more motion modes, higher precision, and better controllability. To this end, a more precise model that accounts for the non-circular cross-sectional features is set up, rather than simplifying the structure as a helical filament with a circular cross-section as in previous investigations, making it possible to include the effects of the sub-structures in the motion equation. Analyses and experiments verified the model's correctness. Besides the geometric effects, our experimental observations also show an anomalous step-out frequency in an ABF. This asynchronous motion is attributed to the lag of magnetization with respect to the external rotating magnetic field due to the geometries and the soft-magnetic materials of the ribbons, which differs from the regular asynchronous motion caused solely by the low Reynolds number of the fluid around microscopic swimmers. While the lag of magnetization can be further attributed to the soft-magnetic materials adopted, the feasibility of arranging the easy axis will enable many new possibilities, which is of particular interest for generating more modes for swarms, such as cascaded stepping out of ABFs with the same nominal overall sizes, and for more precise positioning using stepping motion.

ICRA Conference 2021 Conference Paper

3D Periodic Magnetic Servoing System for Microrobot Actuation Using Decoupled Asynchronous Repetitive Control Approach

  • Zhiyong Sun 0002
  • Yu Cheng 0006
  • Chao Zhou
  • Erkang Cheng
  • Gengliang Chen
  • Lixin Dong
  • Bo Song

To date, untethered microrobots have received tremendous attention for playing irreplaceable roles as maneuverable tools in fields such as microfabrication and biomanipulation. Such untethered tiny robots are typically actuated by magnetic field-based approaches, including gradient and rotational methods. Compared to the gradient method, the rotational approach requires much less magnetic field strength to actuate magnetic microrobots efficiently. To actuate microrobots as desired, a precise periodic magnetic field should be provided. To generate a precise periodic magnetic field with enhanced strength, this paper develops a prototype 3D magnetic servoing system based on integrated solenoids, whose performance is enhanced by employing iron cores and an extended number of coils. Each solenoid is equipped with a Hall sensor that provides a real-time feedback signal for precise magnetic field control. To precisely regulate this setup, a decoupled asynchronous repetitive control (DARC) scheme is established to generate a desired 3D periodic magnetic field with noise-level tracking error even when execution opportunities are randomly missed. Experimental results demonstrate the effectiveness of the proposed magnetic servoing system, which is promising for characterizing the dynamic properties of magnetic microrobots.

ICRA Conference 2003 Conference Paper

Mobile robot self-localization based on global visual appearance features

  • Chao Zhou
  • Yucheng Wei
  • Tieniu Tan

The paper presents a novel method for mobile robot localization using visual appearance features. A multidimensional histogram is used to describe the global appearance features of an image such as colors, edge density, gradient magnitude, textures and so on. The matching of histograms determines the location of the robot. The method has been evaluated in an indoor environment, and the system correctly determines the location of 82.9% of the input scene images.
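Localization by histogram matching can be sketched in a few lines. The abstract does not state the matching measure or the histogram layout, so the code below makes simplifying assumptions: a one-dimensional intensity histogram stands in for the multidimensional appearance histogram, histogram intersection is used as the similarity, and the names `histogram`, `intersection`, and `localize` as well as the pixel data are hypothetical.

```python
def histogram(pixels, bins=4, max_value=256):
    """Normalized intensity histogram of a flat list of 8-bit pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    total = len(pixels)
    return [c / total for c in counts]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalized histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def localize(query_pixels, reference):
    """Return the reference location whose stored histogram best matches the query image."""
    q = histogram(query_pixels)
    return max(reference, key=lambda name: intersection(q, reference[name]))


# Toy reference map: one stored histogram per known location.
reference = {
    "corridor": histogram([10, 20, 30, 40, 200, 210]),
    "office": histogram([120, 130, 140, 150, 160, 250]),
}
location = localize([15, 25, 35, 50, 190, 220], reference)
```

A real system would build the histogram over several feature dimensions (color, edge density, gradient magnitude, texture) and store many reference views per location; the nearest-histogram lookup stays the same.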