Latest Computer Vision Research Papers

Research on image recognition, object detection, image segmentation, and visual understanding using deep learning techniques.


Motion 3-to-4: 3D Motion Reconstruction for 4D Synthesis

Hongyuan Chen, Xingyu Chen, Youjia Zhang +2 more

We present Motion 3-to-4, a feed-forward framework for synthesising high-quality 4D dynamic objects from a single monocular video and an optional 3D reference mesh. While recent advances have significantly improved 2D, video, and 3D content generation, 4D synthesis remains difficult due to limited t...

4D dynamic objects, monocular video, 3D reference mesh, canonical reference mesh, motion latent representation, +3 more
Jan 20, 2026

Implicit Neural Representation Facilitates Unified Universal Vision Encoding

Matthew Gwilliam, Xiao Wang, Xuefeng Hu +1 more

Models for image representation learning are typically designed for either recognition or generation. Various forms of contrastive learning help models learn to convert images to embeddings that are useful for classification, detection, and segmentation. On the other hand, models can be trained to r...
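A minimal sketch of the contrastive objective referenced above (InfoNCE-style, in PyTorch) is shown below. It illustrates the general technique only; the function name, embedding shapes, and temperature are illustrative assumptions, not this paper's implementation.

```python
# Illustrative InfoNCE-style contrastive loss over paired image embeddings.
# Not this paper's code; all names and hyperparameters are assumptions.
import torch
import torch.nn.functional as F

def info_nce_loss(emb_a: torch.Tensor, emb_b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """emb_a, emb_b: (N, D) embeddings of two views of the same N images."""
    emb_a = F.normalize(emb_a, dim=-1)
    emb_b = F.normalize(emb_b, dim=-1)
    logits = emb_a @ emb_b.t() / temperature            # (N, N) cosine similarities
    targets = torch.arange(emb_a.size(0), device=emb_a.device)
    # The diagonal (matching pairs) are positives; every other pair is a negative.
    return F.cross_entropy(logits, targets)

if __name__ == "__main__":
    a, b = torch.randn(8, 128), torch.randn(8, 128)     # stand-ins for encoder outputs
    print(info_nce_loss(a, b).item())
```

Embeddings trained this way are what the abstract means by representations useful for classification, detection, and segmentation: nearby points in the embedding space correspond to semantically similar images.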

contrastive learning, image representation learning, recognition, generation, hyper-network, +5 more
Jan 20, 2026

VideoMaMa: Mask-Guided Video Matting via Generative Prior

Sangbeom Lim, Seoung Wug Oh, Jiahui Huang +3 more

Generalizing video matting models to real-world videos remains a significant challenge due to the scarcity of labeled data. To address this, we present the Video Mask-to-Matte Model (VideoMaMa), which converts coarse segmentation masks into pixel-accurate alpha mattes by leveraging pretrained video diffu...

video matting, video diffusion models, pseudo-labeling, Matting Anything in Video, SAM2, +3 more
Jan 20, 2026

CoDance: An Unbind-Rebind Paradigm for Robust Multi-Subject Animation

Shuai Tan, Biao Gong, Ke Ma +5 more

Character image animation is gaining significant importance across various domains, driven by the demand for robust and flexible multi-subject rendering. While existing methods excel in single-person animation, they struggle to handle arbitrary subject counts, diverse character types, and spatial mi...

Unbind-Rebind framework, pose shift encoder, stochastic perturbations, location-agnostic motion representation, semantic guidance, +3 more
Jan 16, 2026

VQ-Seg: Vector-Quantized Token Perturbation for Semi-Supervised Medical Image Segmentation

Sicheng Yang, Zhaohu Xing, Lei Zhu

Consistency learning with feature perturbation is a widely used strategy in semi-supervised medical image segmentation. However, many existing perturbation methods rely on dropout, and thus require a careful manual tuning of the dropout rate, which is a sensitive hyperparameter and often difficult t...
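For context, the dropout-based feature-perturbation consistency baseline this abstract refers to looks roughly like the PyTorch sketch below. This is a generic illustration under assumed encoder/decoder interfaces, not VQ-Seg's method, and the vector-quantized token perturbation itself is not reproduced here.

```python
# Generic consistency regularization on unlabeled images with a dropout-based
# feature perturbation (the baseline the abstract criticizes). Illustrative only.
import torch
import torch.nn.functional as F

def consistency_loss(encoder, decoder, unlabeled_images, drop_rate: float = 0.5):
    feats = encoder(unlabeled_images)                   # (N, C, H, W) features
    with torch.no_grad():
        clean_logits = decoder(feats)                   # prediction on clean features
    perturbed = F.dropout2d(feats, p=drop_rate)         # drop_rate is the sensitive hyperparameter
    perturbed_logits = decoder(perturbed)               # prediction on perturbed features
    # Encourage the two predictions to agree.
    return F.mse_loss(perturbed_logits.softmax(dim=1), clean_logits.softmax(dim=1))

if __name__ == "__main__":
    enc = torch.nn.Conv2d(3, 16, 3, padding=1)          # toy stand-ins for a real
    dec = torch.nn.Conv2d(16, 4, 1)                     # segmentation encoder/decoder
    x = torch.randn(2, 3, 64, 64)
    print(consistency_loss(enc, dec, x).item())
```

As the title suggests, VQ-Seg's idea is to perturb vector-quantized feature tokens instead of applying dropout, avoiding the drop-rate tuning the abstract highlights.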

vector quantization, feature perturbation, consistency learning, semi-supervised medical image segmentation, dropout, +4 more
Jan 15, 2026

Medical SAM3: A Foundation Model for Universal Prompt-Driven Medical Image Segmentation

Chongcong Jiang, Tianxingjian Ding, Chuhan Song +7 more

Promptable segmentation foundation models such as SAM3 have demonstrated strong generalization capabilities through interactive and concept-based prompting. However, their direct applicability to medical image segmentation remains limited by severe domain shifts, the absence of privileged spatial pr...
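As a rough illustration of what interactive, prompt-driven segmentation inference looks like in practice, the sketch below uses the original segment-anything package with a single point prompt as a stand-in. It is not SAM3's or Medical SAM3's API (which this listing does not show), and the checkpoint path, image, and prompt coordinates are placeholders.

```python
# Point-prompted segmentation with the original segment-anything package,
# used here only as a stand-in for the promptable-foundation-model workflow.
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Placeholder checkpoint path; the official SAM weights must be downloaded separately.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

image = np.zeros((512, 512, 3), dtype=np.uint8)    # stand-in for an RGB image or slice
predictor.set_image(image)

# One foreground click at (x, y) = (256, 256) acts as the interactive prompt.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[256, 256]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
print(masks.shape, scores)
```

The domain-shift problem the abstract raises is that prompting like this transfers poorly to medical modalities without adaptation, which is what Medical SAM3 targets.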

foundation model, prompt-driven segmentation, medical image segmentation, SAM3, fine-tuning, +7 more
Jan 15, 2026

Alterbute: Editing Intrinsic Attributes of Objects in Images

Tal Reiss, Daniel Winter, Matan Cohen +4 more

We introduce Alterbute, a diffusion-based method for editing an object's intrinsic attributes in an image. We allow changing color, texture, material, and even the shape of an object, while preserving its perceived identity and scene context. Existing approaches either rely on unsupervised priors th...

diffusion-based method, intrinsic attributes, object editing, identity preservation, visual named entities, +5 more
Jan 15, 2026