Recently, the full paper "R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object", published at the top-tier peer-reviewed conference AAAI 2021, was ranked first among the AAAI'21 Most Influential Papers announced by the PaperDigest website. The main authors are Prof. Junchi Yan and his Ph.D. student Xue Yang, of the Institute of Artificial Intelligence and the Department of Computer Science and Engineering at Shanghai Jiao Tong University. The model, R3Det, is a neural network for oriented object detection that addresses the problem of locating objects with multiple orientations and separating them from the background accurately and quickly. It is well suited to text detection, aerial object detection, and similar tasks.
The AAAI Conference on Artificial Intelligence is one of the top international academic conferences in the field of artificial intelligence (21.4% acceptance rate in 2021, 1692/7911). PaperDigest is a sci-tech knowledge graph and text analysis platform for tracking, summarizing and searching scientific literature. It was initially developed by researchers at Tokyo Institute of Technology in 2018 and maintains one of the world's largest scientific and technological knowledge graphs. The PaperDigest team analyzes all papers published at AAAI over the past three years and presents a list of the 15 most influential papers for each year. The list is constructed automatically from citations in research papers and granted patents, and is updated frequently to reflect the latest changes. It is regarded as one of the most authoritative third-party lists.
Object detection is one of the basic tasks in computer vision. It refers to accurately locating and recognizing objects in a given image, and is widely used for security checks and face authentication in stations, airports and museums, for automatic text extraction and recognition on cards and documents, and for detecting and recognizing cars, pedestrians and traffic signs in autonomous driving. However, due to the complexity of real scenes, it is often difficult to locate objects that appear in multiple orientations, so oriented object detection has always been a very challenging task. Focusing on objects with large aspect ratios, dense arrangements and drastic scale changes, this study proposes an end-to-end cascaded oriented object detector called Refined Rotated RetinaNet Detector (R3Det). R3Det detects objects quickly and accurately through progressive coarse-to-fine regression, and integrates a feature refinement module that provides more accurate features to improve detection performance.
Figure 1 shows the overall structure of R3Det. The core of the feature refinement module is to re-encode the position information of the current refined bounding box into the corresponding feature points through pixel-wise feature interpolation, thereby achieving feature reconstruction and alignment.
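The interpolation step described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the (cx, cy, w, h, angle) box format, the use of five key points (the box center and its four corners), and the summation of the sampled features are all assumptions made for illustration, and any learned layers of the actual module are omitted.

```python
import numpy as np


def rotated_box_keypoints(cx, cy, w, h, angle_rad):
    """Return the center and four corners of a rotated box as a (5, 2) array of (x, y)."""
    # Offsets in the box's own frame; the first row is the center.
    local = np.array([
        [0.0, 0.0],
        [-w / 2, -h / 2],
        [w / 2, -h / 2],
        [w / 2, h / 2],
        [-w / 2, h / 2],
    ])
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, -s], [s, c]])
    # Rotate each offset, then translate to the box center.
    return local @ rot.T + np.array([cx, cy])


def bilinear_sample(feat, x, y):
    """Bilinearly interpolate a (C, H, W) feature map at a continuous location (x, y)."""
    _, height, width = feat.shape
    # Clamp so the neighbouring pixels always lie inside the map.
    x = float(np.clip(x, 0, width - 1))
    y = float(np.clip(y, 0, height - 1))
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    dx, dy = x - x0, y - y0
    top = (1 - dx) * feat[:, y0, x0] + dx * feat[:, y0, x1]
    bottom = (1 - dx) * feat[:, y1, x0] + dx * feat[:, y1, x1]
    return (1 - dy) * top + dy * bottom


def refine_feature(feat, box):
    """Sum the interpolated features at the five key points of one refined box.

    Assumption of this sketch: the box is (cx, cy, w, h, angle_rad) in
    feature-map coordinates, and the five samples are simply added together.
    """
    points = rotated_box_keypoints(*box)
    return sum(bilinear_sample(feat, x, y) for x, y in points)
```

For a hypothetical feature map `feat` of shape (C, H, W), `refine_feature(feat, (4.0, 4.0, 2.0, 2.0, 0.3))` returns a length-C vector that encodes where the refined box lies on the map, which is the alignment idea the paragraph above describes.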
R3Det provides innovative ideas and methods for solving the problem of feature misalignment in oriented object detection. The effectiveness of the proposed method is verified on three remote sensing datasets (DOTA, HRSC2016 and UCAS-AOD) and one scene text dataset (ICDAR2015). In the future, it can be applied to face recognition, aerial images, medical images, autonomous driving and other scenarios for more accurate oriented object detection and analysis.
Figure 2. Detection visualization on remote sensing images. R3Det accurately locates aircraft facing different directions at an airport.
In the past three years, Prof. Junchi Yan's research group has published a series of oriented object detection papers at top artificial intelligence conferences and journals, including ICCV19, ECCV20, AAAI21, CVPR21, ICML21, NeurIPS21, and IJCV22. The research group has also released two open source frameworks for oriented object detection, MMRotate and AlphaRotate, which have become the most popular open source frameworks in the field.