Table of Contents
DFNet: Enhance Absolute Pose Regression with Direct Feature Matching
Cornerformer: Purifying Instances for Corner-Based Detectors
Robust Object Detection with Inaccurate Bounding Boxes
Efficient Decoder-Free Object Detection with Transformers
Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection
ReAct: Temporal Action Detection with Relational Queries
Towards Accurate Active Camera Localization
Camera Pose Auto-Encoders for Improving Pose Regression
Improving the Intra-Class Long-Tail in 3D Detection via Rare Example Mining
Bagging Regional Classification Activation Maps for Weakly Supervised Object Localization
UC-OWOD: Unknown-Classified Open World Object Detection
RayTran: 3D Pose Estimation and Shape Reconstruction of Multiple Objects from Videos with Ray-Traced Transformers
GTCaR: Graph Transformer for Camera Re-Localization
3D Object Detection with a Self-Supervised Lidar Scene Flow Backbone
Open Vocabulary Object Detection with Pseudo Bounding-Box Labels
Few-Shot Object Detection by Knowledge Distillation Using Bag-of-Visual-Words Representations
SALISA: Saliency-Based Input Sampling for Efficient Video Object Detection
ECO-TR: Efficient Correspondences Finding via Coarse-to-Fine Refinement
Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting
Long-Tailed Instance Segmentation Using Gumbel Optimized Loss
DetMatch: Two Teachers Are Better than One for Joint 2D and 3D Semi-Supervised Object Detection
ObjectBox: From Centers to Boxes for Anchor-Free Object Detection
Is Geometry Enough for Matching in Visual Localization?
SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds
PCR-CG: Point Cloud Registration via Deep Explicit Color and Geometry
GLAMD: Global and Local Attention Mask Distillation for Object Detectors
FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection
Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles
Class-Agnostic Object Detection with Multi-modal Transformer
Enhancing Multi-modal Features Using Local Self-Attention for 3D Object Detection
Object Detection As Probabilistic Set Prediction
Weakly-Supervised Temporal Action Detection for Fine-Grained Videos with Hierarchical Atomic Actions
Neural Correspondence Field for Object Pose Estimation
On Label Granularity and Object Localization
OIMNet++: Prototypical Normalization and Localization-Aware Learning for Person Search
Out-of-Distribution Identification: Let Detector Tell Which I Am Not Sure
Learning with Free Object Segments for Long-Tailed Instance Segmentation
Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction
3D Random Occlusion and Multi-layer Projection for Deep Multi-Camera Pedestrian Localization
A Simple Single-Scale Vision Transformer for Object Detection and Instance Segmentation
Simple Open-Vocabulary Object Detection with Vision Transformers