Arrow Research search

Author name cluster

Bin Fu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

28 papers
1 author row

Possible papers

28

AAAI Conference 2026 Conference Paper

GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and a Comprehensive Multimodal Dataset Towards General Medical AI

  • Tianbin Li
  • Yanzhou Su
  • Wei Li
  • Bin Fu
  • Zhe Chen
  • Ziyan Huang
  • Guoan Wang
  • Chenglong Ma

Despite significant advancements in general AI, its effectiveness in the medical domain is limited by the lack of specialized medical knowledge. To address this, we formulate GMAI-VL-5.5M, a multimodal medical dataset created by converting hundreds of specialized medical datasets with various annotations into high-quality image-text pairs. This dataset offers comprehensive task coverage, diverse modalities, and rich image-text data. Building upon this dataset, we develop GMAI-VL, a 7B-parameter general medical vision-language model, with a three-stage training strategy that enhances the integration of visual and textual information. This approach significantly improves the model's ability to process multimodal data, supporting accurate diagnoses and clinical decision-making. Experiments show that GMAI-VL achieves state-of-the-art performance across various multimodal medical tasks, including visual question answering and medical image diagnosis.

EAAI Journal 2025 Journal Article

A dual-branch convolutional neural network with domain-informed attention for arrhythmia classification of 12-lead electrocardiograms

  • Rucheng Jiang
  • Bin Fu
  • Renfa Li
  • Rui Li
  • Danny Z. Chen
  • Yan Liu
  • Guoqi Xie
  • Keqin Li

The automatic classification of arrhythmia is an important task in the intelligent auxiliary diagnosis of an electrocardiogram. Its efficiency and accuracy are vital for practical deployment and applications in the medical field. For the 12-lead electrocardiogram, we know that the comprehensive utilization of lead characteristics is key to enhancing diagnostic accuracy. However, existing classification methods (1) neglect the similarities and differences between the limb lead group and the precordial lead group; (2) the commonly adopted attention mechanisms struggle to capture the domain characteristics in an electrocardiogram. To address these issues, we propose a new dual-branch convolutional neural network with domain-informed attention, which is novel in two ways. First, it adopts a dual-branch network to extract intra-group similarities and inter-group differences of limb and precordial leads. Second, it proposes a domain-informed attention mechanism to embed the critical domain knowledge of electrocardiogram, multiple RR (R wave to R wave) intervals, into coordinated attention to adaptively assign attention weights to key segments, thereby effectively capturing the characteristics of the electrocardiogram domain. Experimental results show that our method achieves an F1-score of 0. 905 and a macro area under the curve of 0. 936 on two widely used large-scale datasets, respectively. Compared to state-of-the-art methods, our method shows significant performance improvements with a drastic reduction in model parameters.

TCS Journal 2025 Journal Article

Reachability in restricted chemical reaction networks

  • Robert M. Alaniz
  • Bin Fu
  • Timothy Gomez
  • Elise Grizzell
  • Andrew Rodriguez
  • Marco Rodriguez
  • Robert Schweller
  • Tim Wylie

The popularity of molecular computation has given rise to several models of abstraction, one of the more recent ones being Chemical Reaction Networks (CRNs). These are equivalent to other popular computational models, such as Vector Addition Systems and Petri-Nets, and restricted versions are equivalent to Population Protocols. This paper continues the work on core reachability questions related to Chemical Reaction Networks; given two configurations, can one reach the other according to the system's rules? With no restrictions, reachability was recently shown to be Ackermann-complete, which resolved a decades-old problem. In this work, we fully characterize monotone reachability problems based on various restrictions such as the allowed rule size, the number of rules that may create a species (k-source), the number of rules that may consume a species (k-consuming), the volume, and whether the rules have an acyclic production order (feed-forward). We show PSPACE-completeness of reachability with only bimolecular reactions in two-source and two-consuming rules. This proves hardness of reachability in a restricted form of Population Protocols. This is accomplished using new techniques within the motion planning framework. We give several important results for feed-forward CRNs, where rules are single-source or single-consuming. We show that reachability is solvable in polynomial time as long as the system does not contain special void or autogenesis rules. We then fully characterize all systems of this type and show that with void/autogenesis rules, or more than one source and one consuming, the problems become NP-complete. Finally, we show several interesting special cases of CRNs based on these restrictions or slight relaxations and note future significant open questions related to this taxonomy.

NeurIPS Conference 2024 Conference Paper

GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI

  • Pengcheng Chen
  • Jin Ye
  • Guoan Wang
  • Yanjun Li
  • Zhongying Deng
  • Wei Li
  • Tianbin Li
  • Haodong Duan

Large Vision-Language Models (LVLMs) are capable of handling diverse data types such as imaging, text, and physiological signals, and can be applied in various fields. In the medical field, LVLMs have a high potential to offer substantial assistance for diagnosis and treatment. Before that, it is crucial to develop benchmarks to evaluate LVLMs' effectiveness in various medical applications. Current benchmarks are often built upon specific academic literature, mainly focusing on a single domain, and lacking varying perceptual granularities. Thus, they face specific challenges, including limited clinical relevance, incomplete evaluations, and insufficient guidance for interactive LVLMs. To address these limitations, we developed the GMAI-MMBench, the most comprehensive general medical AI benchmark with well-categorized data structure and multi-perceptual granularity to date. It is constructed from 284 datasets across 38 medical image modalities, 18 clinical-related tasks, 18 departments, and 4 perceptual granularities in a Visual Question Answering (VQA) format. Additionally, we implemented a lexical tree structure that allows users to customize evaluation tasks, accommodating various assessment needs and substantially supporting medical AI research and applications. We evaluated 50 LVLMs, and the results show that even the advanced GPT-4o only achieves an accuracy of 53. 96\%, indicating significant room for improvement. Moreover, we identified five key insufficiencies in current cutting-edge LVLMs that need to be addressed to advance the development of better medical applications. We believe that GMAI-MMBench will stimulate the community to build the next generation of LVLMs toward GMAI.

NeurIPS Conference 2024 Conference Paper

MeshXL: Neural Coordinate Field for Generative 3D Foundation Models

  • Sijin Chen
  • Xin Chen
  • Anqi Pang
  • Xianfang Zeng
  • Wei Cheng
  • Yijun Fu
  • Fukun Yin
  • Zhibin Wang

The polygon mesh representation of 3D data exhibits great flexibility, fast rendering speed, and storage efficiency, which is widely preferred in various applications. However, given its unstructured graph representation, the direct generation of high-fidelity 3D meshes is challenging. Fortunately, with a pre-defined ordering strategy, 3D meshes can be represented as sequences, and the generation process can be seamlessly treated as an auto-regressive problem. In this paper, we validate Neural Coordinate Field (NeurCF), an explicit coordinate representation with implicit neural embeddings, is a simple-yet-effective representation for large-scale sequential mesh modeling. After that, we present MeshXL, a family of generative pre-trained auto-regressive models that addresses 3D mesh generation with modern large language model approaches. Extensive experiments show that MeshXL is able to generate high-quality 3D meshes, and can also serve as foundation models for various down-stream applications.

AAAI Conference 2024 Conference Paper

Point2Real: Bridging the Gap between Point Cloud and Realistic Image for Open-World 3D Recognition

  • Hanxuan Li
  • Bin Fu
  • Ruiping Wang
  • Xilin Chen

Recognition in open-world scenarios is an important and challenging field, where Vision-Language Pre-training paradigms have greatly impacted the 2D domain. This inspires a growing interest in introducing 2D pre-trained models, such as CLIP, into the 3D domain to enhance the ability of point cloud understanding. Considering the difference between discrete 3D point clouds and real-world 2D images, reducing the domain gap is crucial. Some recent works project point clouds onto a 2D plane to enable 3D zero-shot capabilities without training. However, this simplistic approach leads to an unclear or even distorted geometric structure, limiting the potential of 2D pre-trained models in 3D. To address the domain gap, we propose Point2Real, a training-free framework based on the realistic rendering technique to automate the transformation of the 3D point cloud domain into the Vision-Language domain. Specifically, Point2Real leverages a shape recovery module that devises an iterative ball-pivoting algorithm to convert point clouds into meshes, narrowing the gap in shape at first. To simulate photo-realistic images, a set of refined textures as candidates is applied for rendering, where the CLIP confidence is utilized to select the suitable one. Moreover, to tackle the viewpoint challenge, a heuristic multi-view adapter is implemented for feature aggregation, which exploits the depth surface as an effective indicator of view-specific discriminability for recognition. We conduct experiments on ModelNet10, ModelNet40, and ScanObjectNN datasets, and the results demonstrate that Point2Real outperforms other approaches in zero-shot and few-shot tasks by a large margin.

NeurIPS Conference 2023 Conference Paper

Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation

  • Zibo Zhao
  • Wen Liu
  • Xin Chen
  • Xianfang Zeng
  • Rui Wang
  • Pei Cheng
  • Bin Fu
  • Tao Chen

We present a novel alignment-before-generation approach to tackle the challenging task of generating general 3D shapes based on 2D images or texts. Directly learning a conditional generative model from images or texts to 3D shapes is prone to producing inconsistent results with the conditions because 3D shapes have an additional dimension whose distribution significantly differs from that of 2D images and texts. To bridge the domain gap among the three modalities and facilitate multi-modal-conditioned 3D shape generation, we explore representing 3D shapes in a shape-image-text-aligned space. Our framework comprises two models: a Shape-Image-Text-Aligned Variational Auto-Encoder (SITA-VAE) and a conditional Aligned Shape Latent Diffusion Model (ASLDM). The former model encodes the 3D shapes into the shape latent space aligned to the image and text and reconstructs the fine-grained 3D neural fields corresponding to given shape embeddings via the transformer-based decoder. The latter model learns a probabilistic mapping function from the image or text space to the latent shape space. Our extensive experiments demonstrate that our proposed approach can generate higher-quality and more diverse 3D shapes that better semantically conform to the visual or textural conditional inputs, validating the effectiveness of the shape-image-text-aligned space for cross-modality 3D shape generation.

TCS Journal 2023 Journal Article

Streaming approximation scheme for minimizing total completion time on parallel machines subject to varying processing capacity

  • Bin Fu
  • Yumei Huo
  • Hairong Zhao

We study the problem of minimizing total completion time on parallel machines subject to varying processing capacity. In this paper, we develop an approximation scheme for the problem under the data stream model where the input data is massive and cannot fit into memory and thus can only be scanned a few times. Our algorithm can compute an approximate value of the optimal total completion time in one pass and output the schedule with the approximate value in two passes.

EAAI Journal 2023 Journal Article

Train a central traffic prediction model using local data: A spatio-temporal network based on federated learning

  • Hao Huang
  • Zhiqun Hu
  • Yueting Wang
  • Zhaoming Lu
  • Xiangming Wen
  • Bin Fu

With the growing complexity of onboard sensors and the widespread deployment of road sensors, deep learning enables fine-grained traffic prediction using massive amounts of raw traffic data, which facilitates accurate analysis of traffic information in the Internet of Vehicles (IoV). However, most existing studies focus on using all the local data to jointly build a prediction model, facing severe challenges of data security and privacy concerns as well as substantial communication overhead. To address these challenges, in this paper, we propose the Spatial-Temporal Traffic Prediction Network based on federated learning (F-STTP-Net), which only updates the model parameters to the centralized server without any private data. Firstly, we design a sub-area division method, which divides the road network into sub-areas with different macroscopic fundamental diagram properties. Then, we propose a local training model for each sub-area, which uses the graph attention network (GAT) and the long short-term memory (LSTM) to capture the spatio-temporal dependence of the road network. The model uses the branch structure to predict the traffic volume of each intersection in the sub-area. Finally, the local models are aggregated based on federated learning to form a powerful central model, which bridges the constraints on global data sharing and privacy guarantee. We conduct experiments on the real-life dataset in Xuchang Lotus Lake 5G automated vehicles demonstration area to demonstrate that F-STTP-Net can achieve excellent prediction performance without the interaction of sub-area raw data. In addition, the proposed model has a strong generalization ability and can be quickly transferred to a new sub-area.

NeurIPS Conference 2022 Conference Paper

Hierarchical Normalization for Robust Monocular Depth Estimation

  • Chi Zhang
  • Wei Yin
  • Billzb Wang
  • Gang Yu
  • Bin Fu
  • Chunhua Shen

In this paper, we address monocular depth estimation with deep neural networks. To enable training of deep monocular estimation models with various sources of datasets, state-of-the-art methods adopt image-level normalization strategies to generate affine-invariant depth representations. However, learning with the image-level normalization mainly emphasizes the relations of pixel representations with the global statistic in the images, such as the structure of the scene, while the fine-grained depth difference may be overlooked. In this paper, we propose a novel multi-scale depth normalization method that hierarchically normalizes the depth representations based on spatial information and depth distributions. Compared with previous normalization strategies applied only at the holistic image level, the proposed hierarchical normalization can effectively preserve the fine-grained details and improve accuracy. We present two strategies that define the hierarchical normalization contexts in the depth domain and the spatial domain, respectively. Our extensive experiments show that the proposed normalization strategy remarkably outperforms previous normalization methods, and we set new state-of-the-art on five zero-shot transfer benchmark datasets.

TCS Journal 2021 Journal Article

Approximate set union via approximate randomization

  • Bin Fu
  • Pengfei Gu
  • Yuming Zhao

We develop a randomized approximation algorithm for the size of set union problem | A 1 ∪ A 2 ∪. .. ∪ A m |, which is given a list of sets A 1, .. ., A m with approximate set size m i for A i with m i ∈ ( ( 1 − β L ) | A i |, ( 1 + β R ) | A i | ), and biased random generators with probability Prob ( x = RandomElement ( A i ) ) ∈ [ 1 − α L | A i |, 1 + α R | A i | ] for each input set A i and element x ∈ A i, where i = 1, 2, .. ., m and α L, α R, β L, β R ∈ ( 0, 1 ). The approximation ratio for | A 1 ∪ A 2 ∪. .. ∪ A m | is in the range [ ( 1 − ϵ ) ( 1 − α L ) ( 1 − β L ), ( 1 + ϵ ) ( 1 + α R ) ( 1 + β R ) ] for any ϵ ∈ ( 0, 1 ). The complexity of the algorithm is measured by both time complexity and round complexity. One round of the algorithm has non-adaptive accesses to those RandomElement ( A i ) functions 1 ≤ i ≤ m, and membership queries ( x ∈ A i?) to input sets A i with 1 ≤ i ≤ m. Our algorithm gives an approximation scheme with O ( m ⋅ ( log ⁡ m ) 7 ) running time and O ( log ⁡ m ) rounds in contrast to the existing algorithm [1] that needs Ω ( m ) rounds in the worst case with O ( ( 1 + ϵ ) m / ϵ 2 ) running time, where m is the number of sets. Our algorithm gives a flexible tradeoff with time complexity O ( m 1 + ξ ) and round complexity O ( 1 ξ ) for any ξ ∈ ( 0, 1 ). Our algorithm runs sublinear in time under certain condition that each element in A 1 ∪ A 2 ∪. .. ∪ A m belongs to m a sets for any fixed a > 0, to our best knowledge, we have not seen any sublinear results about this problem. Our algorithm can handle input sets that can generate random elements with bias, and its approximation ratio depends on the bias. We prove that it is #P-hard to count the number of lattice points in a set of balls, and we also show that there is no polynomial time algorithm to approximate the number of lattice points in the intersection of n-dimensional balls unless P=NP. As applications of our algorithm, we propose approximation algorithms for counting the number of lattice points in a union of high dimensional balls and for the maximal coverage problem with balls.

TCS Journal 2020 Journal Article

An approximation algorithm for the l-pseudoforest deletion problem

  • Mugang Lin
  • Qilong Feng
  • Bin Fu
  • Jianxin Wang

An l-pseudoforest is a graph each of whose connected components is at most l edges removal being a tree. The l-Pseudoforest Deletion problem is to delete a vertex set P of minimum weight from a given vertex-weighted graph G = ( V, E ) such that the remaining graph G [ V ∖ P ] is an l-pseudoforest. The Feedback Vertex Set problem is a special case of the l-Pseudoforest Deletion problem with l = 0. In this paper, we present a polynomial time 4l-approximation algorithm for the l-Pseudoforest Deletion problem with l ≥ 1 by using the local ratio technique. When l = 1, we get a better approximation ratio 2 for the problem by further analyzing the local ratio, which matches the current best constant approximation factor for the Feedback Vertex Set problem.

AAAI Conference 2020 Conference Paper

Dynamic Sampling Network for Semantic Segmentation

  • Bin Fu
  • Junjun He
  • Zhengfu Zhang
  • Yu Qiao

Sampling is a basic operation of modern convolutional neural networks (CNN) since down-sampling operators are employed to enlarge the receptive field while up-sampling operators are adopted to increase resolution. Most existing deep segmentation networks employ regular grid sampling operators, which can be suboptimal for semantic segmentation task due to large shape and scale variance. To address this problem, this paper proposes a Context Guided Dynamic Sampling (CGDS) module to obtain an effective representation with rich shape and scale information by adaptively sampling useful segmentation information in spatial space. Moreover, we utilize the multi-scale contextual representations to guide the sampling process. Therefore, our CGDS can adaptively capture shape and scale information according to not only the input feature map but also the multi-scale semantic context. CGDS provides a plug-and-play module which can be easily incorporated in deep segmentation networks. We incorporate our proposed CGDS module into Dynamic Sampling Network (DSNet) and perform extensive experiments on segmentation datasets. Experimental results show that our CGDS significantly improves semantic segmentation performance and achieves state-of-the-art performance on PASCAL VOC 2012 and ADE20K datasets. Our model achieves 85. 2% mIOU on PASCAL VOC 2012 test set without MS COCO dataset pre-trained and 46. 4% on ADE20K validation set. The codes will become publicly available after publication.

AAAI Conference 2020 Conference Paper

Multi-Point Semantic Representation for Intent Classification

  • Jinghan Zhang
  • Yuxiao Ye
  • Yue Zhang
  • Likun Qiu
  • Bin Fu
  • Yang Li
  • Zhenglu Yang
  • Jian Sun

Detecting user intents from utterances is the basis of natural language understanding (NLU) task. To understand the meaning of utterances, some work focuses on fully representing utterances via semantic parsing in which annotation cost is labor-intentsive. While some researchers simply view this as intent classification or frequently asked questions (FAQs) retrieval, they do not leverage the shared utterances among different intents. We propose a simple and novel multi-point semantic representation framework with relatively low annotation cost to leverage the fine-grained factor information, decomposing queries into four factors, i. e. , topic, predicate, object/condition, query type. Besides, we propose a compositional intent bi-attention model under multi-task learning with three kinds of attention mechanisms among queries, labels and factors, which jointly combines coarse-grained intent and fine-grained factor information. Extensive experiments show that our framework and model significantly outperform several state-of-the-art approaches with an improvement of 1. 35%-2. 47% in terms of accuracy.

TCS Journal 2017 Journal Article

Concentration independent random number generation in tile self-assembly

  • Cameron T. Chalk
  • Bin Fu
  • Eric Martinez
  • Robert Schweller
  • Tim Wylie

In this paper we introduce the robust random number generation problem where the goal is to design an abstract tile assembly system (aTAM system) whose terminal assemblies can be split into n partitions such that a resulting assembly of the system lies within each partition with probability 1/n, regardless of the relative concentration assignment of the tile types in the system. First, we show this is possible for n = 2 (a robust fair coin flip) within the aTAM, and that such systems guarantee a worst case O ( 1 ) space usage. We accompany our primary construction with variants that show trade-offs in space complexity, initial seed size, temperature, tile complexity, bias, and extensibility, and also prove some negative results. As an application, we combine our coin-flip system with a result of Chandran, Gopalkrishnan, and Reif to show that for any positive integer n, there exists a O ( log ⁡ n ) tile system that assembles a constant-width linear assembly of expected length n for any concentration assignment. We then extend our robust fair coin flip result to solve the problem of robust random number generation in the aTAM for all n. Two variants of robust random bit generation solutions are presented: an unbounded space solution and a bounded space solution which incurs a small bias. Further, we consider the harder scenario where tile concentrations change arbitrarily at each assembly step and show that while this is not possible in the aTAM, the problem can be solved by exotic tile assembly models from the literature.

AAAI Conference 2017 Conference Paper

Generalized Ambiguity Decompositions for Classification with Applications in Active Learning and Unsupervised Ensemble Pruning

  • Zhengshen Jiang
  • Hongzhi Liu
  • Bin Fu
  • Zhonghai Wu

Error decomposition analysis is a key problem for ensemble learning. Two commonly used error decomposition schemes, the classic Ambiguity Decomposition and Bias-Variance- Covariance decomposition, are only suitable for regression tasks with square loss. We generalized the classic Ambiguity Decomposition from regression problems with square loss to classification problems with any loss functions that are twice differentiable, including the logistic loss in Logistic Regression, the exponential loss in Boosting methods, and the 0- 1 loss in many other classification tasks. We further proved several important properties of the Ambiguity term, armed with which the Ambiguity terms of logistic loss, exponential loss and 0-1 loss can be explicitly computed and optimized. We further discussed the relationship between margin theory, “good” and “bad” diversity theory and our theoretical results, and provided some new insights for ensemble learning. We demonstrated the applications of our theoretical results in active learning and unsupervised ensemble pruning, and the experimental results confirmed the effectiveness of our methods.

IS Journal 2016 Journal Article

Point-of-Interest Recommendations via a Supervised Random Walk Algorithm

  • Guandong Xu
  • Bin Fu
  • Yanhui Gu

Recently, location-based social networks (LBSNs) such as Foursquare and Whrrl have emerged as a new application for users to establish personal social networks and review various points of interest (POIs), triggering a new recommendation service aimed at helping users locate more preferred POIs. Although users' check-in activities could be explicitly considered as user ratings, in turn being utilized directly for collaborative filtering-based recommendations, such solutions don't differentiate the sentiment of reviews accompanying check-ins, resulting in unsatisfactory recommendations. This article proposes a new POI recommendation framework by simultaneously incorporating user check-ins and reviews along with side information into a tripartite graph and predicting personalized POI recommendations via a sentiment-supervised random walk algorithm. The experiments conducted on real data demonstrate the superiority of this approach in comparison with state-of-the-art techniques.

TCS Journal 2016 Journal Article

The label cut problem with respect to path length and label frequency

  • Peng Zhang
  • Bin Fu

Given a graph with labels defined on edges and a source-sink pair ( s, t ), the Label s - t Cut problem asks for a minimum number of labels such that the removal of edges with these labels disconnects s and t. Similarly, the Global Label Cut problem asks for a minimum number of labels to disconnect G itself. For these two problems, we identify two useful parameters, i. e. , l max, the maximum length of any s - t path (only applies to Label s - t Cut), and f max, the maximum number of appearances of any label in the graph (applies to the two problems). We show that l max = 2 and f max = 2 are two complexity thresholds for Label s - t Cut. Furthermore, we give (i) an O ⁎ ( c k ) time parameterized algorithm for Label s - t Cut with l max bounded from above, where parameter k is the number of labels in a solution, and c is a constant with l max − 1 < c < l max, (ii) a combinatorial l max -approximation algorithm for Label s - t Cut, and (iii) a polynomial time exact algorithm for Global Label Cut with f max bounded from above.

TCS Journal 2015 Journal Article

Competitive algorithms for unbounded one-way trading

  • Francis Y.L. Chin
  • Bin Fu
  • Jiuling Guo
  • Shuguang Han
  • Jueliang Hu
  • Minghui Jiang
  • Guohui Lin
  • Hing-Fung Ting

In the one-way trading problem, a seller has L units of product to be sold to a sequence σ of buyers u 1, u 2, …, u σ arriving online and he needs to decide, for each u i, the amount of product to be sold to u i at the then-prevailing market price p i. The objective is to maximize the seller's revenue. We note that all previous algorithms for the problem need to impose some artificial upper bound M and lower bound m on the market prices, and the seller needs to know either the values of M and m, or their ratio M / m, at the outset. This paper gives a one-way trading algorithm that does not impose any bounds on market prices and whose performance guarantee depends directly on the input. In particular, we give a class of one-way trading algorithms such that for any positive integer h and any positive number ϵ, we have an algorithm A h, ϵ that has competitive ratio O ( log ⁡ r ⁎ ( log ( 2 ) ⁡ r ⁎ ) … ( log ( h − 1 ) ⁡ r ⁎ ) ( log ( h ) ⁡ r ⁎ ) 1 + ϵ ) if the value of r ⁎ = p ⁎ / p 1, the ratio of the highest market price p ⁎ = max i ⁡ p i and the first price p 1, is large and satisfies log ( h ) ⁡ r ⁎ > 1, where log ( i ) ⁡ x denotes the application of the logarithm function i times to x; otherwise, A h, ϵ has a constant competitive ratio Γ h. We also show that our algorithms are near optimal by showing that given any positive integer h and any one-way trading algorithm A, we can construct a sequence of buyers σ with log ( h ) ⁡ r ⁎ > 1 such that the ratio between the optimal revenue and the revenue obtained by A is Ω ( log ⁡ r ⁎ ( log ( 2 ) ⁡ r ⁎ ) … ( log ( h − 1 ) ⁡ r ⁎ ) ( log ( h ) ⁡ r ⁎ ) ). A special case of the one-way trading is also studied, in which the L units of product are comprised of L items, each of which must be sold atomically (or equivalently, the amount of product sold to each buyer must be an integer). Furthermore, a complementary problem to the one-way trading problem, say, the one-way buying problem, is studied in this paper. In the one-way buying problem, a buyer wants to purchase one unit of product through a sequence of n sellers v 1, v 2, …, v n arriving online, and she needs to decide the fraction to purchase from each v i at the then-prevailing market price p i. Her objective is to minimize the cost. The optimal competitive algorithms whose performance guarantees depend only on the lowest market price p ⁎ = min i ⁡ p i, and one of M and φ, the price fluctuation ratio, are presented.

TCS Journal 2014 Journal Article

On the approximability of the exemplar adjacency number problem for genomes with gene repetitions

  • Zhixiang Chen
  • Bin Fu
  • Randy Goebel
  • Guohui Lin
  • Weitian Tong
  • Jinhui Xu
  • Boting Yang
  • Zhiyu Zhao

In this paper, we apply a measure, exemplar adjacency number, which complements and extends the well-studied breakpoint distance between two permutations, to measure the similarity between two genomes (or in general, between any two sequences drawn from the same alphabet). For two genomes G and H drawn from the same set of n gene families and containing gene repetitions, we consider the corresponding Exemplar Adjacency Number problem (EAN), in which we delete duplicated genes from G and H such that the resultant exemplar genomes (permutations) G and H have the maximum adjacency number. We obtain the following results. First, we prove that the one-sided 2-repetitive EAN problem, i. e. , when one of G and H is given exemplar and each gene occurs in the other genome at most twice, can be linearly reduced from the Maximum Independent Set problem. This implies that EAN does not admit any O ( n 0. 5 − ϵ ) -approximation algorithm, for any ϵ > 0, unless P = NP. This hardness result also implies that EAN, parameterized by the optimal solution value, is W[1]-hard. Secondly, we show that the two-sided 2-repetitive EAN problem has an O ( n 0. 5 ) -approximation algorithm, which is tight up to a constant factor.

TCS Journal 2013 Journal Article

Algebraic data retrieval algorithms for multi-channel wireless data broadcast

  • Xiaofeng Gao
  • Zaixin Lu
  • Weili Wu
  • Bin Fu

Wireless data broadcast is an important data dissemination method for distributing public information to mobile users. Due to the exponentially increasing number of mobile network users, it is necessary to develop efficient data retrieval protocols for end users to download data items effectively. In this paper, we concentrate on investigating scheduling algorithms for retrieving a set of data items from a multichannel wireless data broadcast system. As we know, the most important issues in mobile computing are energy efficiency and query response efficiency. However, in data broadcast the objectives of reducing access latency and energy cost can be contradictive to each other. Consequently, we define a new problem named Minimum Constraint Data Retrieval Problem (MCDR). We prove that MCDR is NP-hard, and then show a fixed parameter tractable algorithm which can balance two factors together. It has computational time O ( 2 k ( h n t ) O ( 1 ) ), where n is the number of channels, k is the number of required data items, t is the maximal time slot, and h is the maximal number of channel switches.

TCS Journal 2013 Journal Article

On testing monomials in multivariate polynomials

  • Zhixiang Chen
  • Bin Fu
  • Yang Liu
  • Robert Schweller

This paper presents a summary of our initial work on developing a theory of testing monomials in multivariate polynomials. The central question is to ask whether a polynomial represented by certain economically compact structure has a multilinear monomial in its sum-product expansion. The complexity aspects of this problem and its variants are investigated with two objectives. One is to understand how this problem relates to critical problems in complexity, and if so to what extent. The other is to exploit possibilities of applying algebraic properties of polynomials to the study of those problems. A series of results about Π Σ Π and Π Σ polynomials is obtained in this paper, laying a basis for further study along this line. Several randomized and deterministic algorithms are devised for testing multilinear monomials or p -monomials in certain respective types of polynomials, where p is prime.

TCS Journal 2012 Journal Article

Coordinated scheduling of production and delivery with production window and delivery capacity constraints

  • Bin Fu
  • Yumei Huo
  • Hairong Zhao

This paper addresses the problem of coordinated scheduling of production and delivery subject to the production window constraint and the delivery capacity constraint. We have a planning horizon consisting of one or more delivery times each with a unique delivery capacity. There is a set of jobs each with a committed delivery time, processing time, production window, and profit. The company can earn the profit only if the job is processed in its production window and delivered before its committed delivery time. From the company point of view, we are interested in selecting a subset of jobs to process and deliver so as to maximize the total profit subject to the delivery capacity constraint. We consider both the single delivery time case and the multiple delivery time case. In both cases, the problem is strongly NP-hard since the subproblems at the production stage and at the delivery stage are both strongly NP-hard. Our goal is to design approximation algorithms. Suppose the jobs are k -disjoint, that is, the jobs can be partitioned into k lists of jobs such that the jobs in each list have disjoint production windows. When k is a constant, we developed the first PTAS for the single delivery case. For multiple delivery time case, we also develop a PTAS when the number of delivery times is a constant as well.

TCS Journal 2011 Journal Article

Linear and sublinear time algorithms for the basis of abelian groups

  • Li Chen
  • Bin Fu

It is well known that every finite abelian group G can be represented as a direct product of cyclic groups: G ≅ G 1 × G 2 × ⋯ × G t, where each G i is a cyclic group of order p j for some prime p and integer j ≥ 1. If a i generates the cyclic group of G i, i = 1, 2, …, t, then the elements a 1, a 2, …, a t are called a basis of G. We show a randomized algorithm such that given a set of generators M = { x 1, …, x k } for an abelian group G and the prime factorization of order ord ( x i ) ( i = 1, …, k ), it computes a basis of G in O ( | M | ( log n ) 2 + ∑ i = 1 t n i p i n i / 2 ) time, where n = | G | has prime factorization p 1 n 1 p 2 n 2 ⋯ p t n t (which is not a part of input). This generalizes Buchmann and Schmidt’s algorithm that takes O ( | M | | G | ) time. In another model, all elements in an abelian group are put into a list as a part of input. We obtain an O ( n ) time deterministic algorithm and a sublinear time randomized algorithm for computing a basis of an abelian group.

TCS Journal 2009 Journal Article

Exponential inapproximability and FPTAS for scheduling with availability constraints

  • Bin Fu
  • Yumei Huo
  • Hairong Zhao

We investigate the problems of scheduling n weighted jobs to one or more identical machines with the constraint that the machines may be unavailable in some specified time intervals. The objective is to find a schedule that minimizes the total weighted completion time. We consider both non-resumable and resumable schedules. Our first contributions concern approximability. For both resumable problem and non-resumable problem, we show that they cannot be approximated within an exponential factor by any polynomial time algorithm for multiple machines where each of them has an unavailable interval, even if the weight of each job equals to its processing time. Additionally, the non-resumable problem is also exponentially inapproximable for a single machine with two or more unavailable intervals. Then we develop the first FPTASs for the problems with a single unavailable interval among all machines. The running time is O ( c n log d n ( 1 ϵ log w ) d ) for the non-resumable problem, and O ( c n log d n ( 1 ϵ log w ) d + 1 ) for the resumable problem, where w is the product of the total weight and the total processing time of all jobs, c is the number of machines that are always available and d = 6 c + 12. Thus our results give a clear boundary delineating the inapproximable cases and approximable cases. When there is a single machine and w = O ( n log n O ( 1 ) ), our algorithms greatly improve the current results. Note that instead of conventional ways of sequentially processing the jobs, our fast schemes process jobs in a divide-and-conquer fashion, which greatly reduces the running time. This may give some insight for some other related problems.

TCS Journal 1993 Journal Article

Exponential-time and subexponential-time sets

  • Shouwen Tang
  • Bin Fu
  • Tian Liu

In this paper, we prove that the symmetric difference of a ⩽P k-parity-hard set for E and a subexponential-time-computable set is still ⩽P k-parity-hard for E. This remains true for ⩽P m-hard set for E since a 1-parity reduction is a many—one reduction. In addition, we show that this property fails to hold for some other types of reductions. We introduce and study the notions of E-complete kernel and E-hard kernel.

TCS Journal 1993 Journal Article

On symmetric differences of NP-hard sets with weakly P-selective sets

  • Bin Fu
  • Hong-zhou Li

The symmetric differences of NP-hard sets with weakly-P-selective sets are investigated. We show that if there exist a weakly-P-selective set A and an NP-⩽P m-hard set H such that H - AϵP btt(sparse) and A — HϵP m(sparse) then P = NP. So no NP-⩽P m-hard set has sparse symmetric difference with any weakly-P-selective set unless P = NP. The proof of our main result is an interesting application of the tree prunning techniques (Fortune 1979; Mahaney 1982). In addition, we show that there exists a P-selective set which has exponentially dense symmetric difference with every set in P btt(sparse).