Weakly and Self-Supervised Class-Agnostic Motion Prediction for Autonomous Driving
Communication-Efficient Federated Multi-View Clustering
Efficient High-Order Spatial Interactions for Visual Perception
Foundation Model for Skeleton-Based Human Action Understanding
Unsupervised Gaze Representation Learning by Switching Features
An End-to-End Depth-Based Pipeline for Selfie Image Rectification
Creating Multimodal Interactive Digital Twin Characters From Videos: A Dataset and Baseline
I-Filtering: Implicit Filtering for Learning Neural Distance Functions From 3D Point Clouds
Make Identity Indistinguishable: Utility-Preserving Face Dataset Publication With Provable Privacy Guarantees
Active Learning for Multiple Target Models
Graph-Oriented Instruction Tuning of Large Language Models for Generic Graph Mining
SparseTSF: Lightweight and Robust Time Series Forecasting via Sparse Modeling
GAN-Based Domain Adaptation for Image-Aware Layout Generation in Advertising Poster Design
Deep Lookup Network
Structure-Induced Gradient Regulation for Generalizable Vision-Language Models
Partial Multiview Incomplete Multilabel Learning via Uncertainty-Driven Reliable Dynamic Fusion
Hierarchical Spherical CNNs With Lifting-Based Adaptive Wavelets for Pooling and Unpooling
Hypergraph-Based High-Order Correlation Analysis for Large-Scale Long-Tailed Data Classification
PRANCE: Joint Token-Optimization and Structural Channel-Pruning for Adaptive ViT Inference
Sentence-Level Relation Semantics Learning via Contrastive Sentences
Compositional Generative Model of Unbounded 4D Cities
Spherical Vision Transformers for Audio-Visual Saliency Prediction in 360° Videos
EBSnoR: Event-Based Snow Removal by Optimal Dwell Time Thresholding
Bidirectional Beta-Tuned Diffusion Model
MovieChat+: Question-Aware Sparse Memory for Long Video Question Answering
Learning Heterogeneous Mixture of Scene Experts for Large-Scale Neural Radiance Fields
PMGT-VR: A Decentralized Proximal-Gradient Algorithmic Framework With Variance Reduction
Parse Trees Guided LLM Prompt Compression
Task-Distributionally Robust Data-Free Meta-Learning
Improving Generalized Visual Grounding With Instance-Aware Joint Learning
Reinterpreting Hypergraph Kernels: Insights Through Homomorphism Analysis
ADA-Track++: End-to-End Multi-Camera 3D Multi-Object Tracking With Alternating Detection and Association
Segmenting the Motion Components of a Video: A Long-Term Unsupervised Model
H2OT: Hierarchical Hourglass Tokenizer for Efficient Video Pose Transformers
Transfer Learning of Stochastic Kriging for Individualized Prediction
Toward Effective Knowledge Distillation: Navigating Beyond Small-Data Pitfall
Object Detection Data Synthesis via Box-to-Image Generation Based on Diffusion Models
Semantic-Assisted Object Clustering for Multi-Modal Referring Video Segmentation
Mask-DiFuser: A Masked Diffusion Model for Unified Unsupervised Image Fusion
MV2DFusion: Leveraging Modality-Specific Object Semantics for Multi-Modal 3D Detection
SNNTracker: Online High-Speed Multi-Object Tracking With Spike Camera
A Unified Perspective for Loss-Oriented Imbalanced Learning via Localization
Translating Images to Road Network: A Sequence-to-Sequence Perspective
End-to-End Autonomous Driving Without Costly Modularization and 3D Manual Annotation
REST: Holistic Learning for End-to-End Semantic Segmentation of Whole-Scene Remote Sensing Imagery
Bayesian Unsupervised Disentanglement of Anatomy and Geometry for Deep Groupwise Image Registration
HSIGene: A Foundation Model for Hyperspectral Image Generation
Efficient 3D Surface Super-Resolution via Normal-Based Multimodal Restoration
Revisiting Transferable Adversarial Images: Systemization, Evaluation, and New Insights
Self-Guidance: Boosting Flow and Diffusion Generation on Their Own