StyleDiffusion: Prompt-embedding inversion for text-based editing
Towards depth foundation models: Recent trends in vision-based depth estimation
A comprehensive survey on the research and development of RGB-T salient object detection
Joint point cloud upsampling and cleaning with octree-based CNNs
Neural scene baking for permutation invariant transparency rendering with real-time global illumination
Neural reconstruction and super-resolution for foveated real-time rendering
EG-HumanNeRF: Efficient generalizable human NeRF utilizing human prior for sparse view
DragTex: Generative point-based texture editing on 3D mesh
StoreSketcher: An interactive framework for planning commercial retail scene layout
See more, know more: Richer prior knowledge for novel class discovery
Multi-color compressive hologram synthesis with learned wave propagation
M2HF: Multi-branch multi-modal hybrid fusion for text–video retrieval
Open-vocabulary camouflaged object segmentation with cascaded vision language models
PraNet-V2: Dual-supervised reverse attention for medical image segmentation
FEDNet: A feature-enhanced diffusion network for efficient and universal texture synthesis
Immersive analytics meets artificial intelligence: A systematic review
A review of learning based visual relocalization methods
Video-Bench: A comprehensive benchmark and toolkit for evaluating video-based large language models
FaceCLIP: CLIP-driven accurate and detailed 3D face reconstruction from a single image
Attention-guided reference point shifting for Gaussian-mixture-based partial point set registration
Flow-deformation-aware point cloud completion network for 3D metal bent tube
BoostPoint: Boosting point cloud backbones with image pre-training for 3D understanding
Real-time woven fabric rendering using SGGX fitting
Sketchformer++: A hierarchical transformer architecture for vector sketch representation
Language interprets vision: Adaptive encoding and decoding for referring image segmentation
A multi-scale yarn appearance model with fiber details
Pyramid-angular-constraint network for light field super-resolution
Personalized image generation with deep generative models: A decade survey
Gaussian-plus-SDF SLAM: High-fidelity 3D reconstruction at 150+ fps
GarTrans: Transformer-based architecture for dynamic and detailed garment deformation
Learning multi-grained interpretable latent representation for 3D face manipulation
Human pose estimation with general contact
VarGes: Improving variation in co-speech 3D gesture generation via StyleCLIPS
PuzzleSorter: Certainty-aware visual restoration of multiple cultural artifacts
Continuous indexed points for multivariate volume visualization
GRIG: Data-efficient generative residual image inpainting
Adaptive content-aware correction for wide-angle portrait photos
Multi-task gradual inference with a single encoder–decoder network for automatic portrait matting
Heuristic weakly supervised 3D human pose estimation
Remote sensing tuning: A survey
PCAC-GAN: A sparse-tensor-based generative adversarial network for 3D point cloud attribute compression
A discrete microfacet model for transparent glints rendering
FRNeRF: Fusion and regularization fields for dynamic view synthesis
BDA: Bi-directional attention for zero-shot learning
Prediction of scene plausibility
Class incremental learning via feature space calibration
LDSwap: A semantic-related latent code disentangling method in StyleSpace towards high-resolution face swapping
MDFP-Net: A model-driven deep neural network for Fourier ptychography
TransCeption: Enhancing medical image segmentation with an inception-like transformer design for efficient feature fusion
JVCSR+: Adaptively learned video compressive sensing reconstruction with joint in-loop reference enhancement and out-loop super-resolution