Dates
Thursday, October 06, 2022 - 10:00am to 11:30am
Location
Room 220, New CS Building
Event Description

Abstract: Visual counting is an important visual task with wide applications. However, most previous counting work considers only specific categories, such as crowds or cells in images for object counting, or simple gym actions such as pull-ups for event counting. Most category-specific object counting methods require dot annotations for millions of objects across several thousand training images for a single category, making it hard to obtain a general visual counter that can handle a large number of visual categories. To address this challenge, some recent works perform few-shot counting using the spatial similarity between an exemplar object and the query image. However, variations in the size, color, and aspect ratio of the target objects, together with high object density and occlusion, lead to significant, unavoidable errors for similarity-based few-shot visual counters.

Given these limitations of current few-shot visual counters, we propose a novel interactive framework for counting objects in images, in which a human user provides feedback to improve the accuracy of the counting result. Furthermore, our approach can plug into any similarity-based visual counter without retraining. The core components of our framework are an intuitive visualizer to collect the user's feedback and an effective mechanism to utilize that feedback. In each iteration of our method, we visualize the density map produced by the counting algorithm to help the user understand the current prediction. For intuitive visualization, we develop a novel method to segment the density map into non-overlapping regions in which the number of objects can be easily verified. The user provides feedback by first selecting a region with obvious counting errors and then specifying an estimated range for the number of objects in that region. To improve the counting result, we develop an interactive adaptation loss that forces the visual counter to output a predicted count within the range provided by the user.

Experiments on several counting benchmark datasets, including crowd counting and few-shot counting, show that our interactive method can reduce the mean absolute error of state-of-the-art counting methods by approximately 20% to 30% with a few mouse clicks.
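The abstract names an interactive adaptation loss but does not give its form. Below is a minimal sketch of one plausible range-based formulation, assuming a PyTorch counter that outputs a density map; the function name interactive_adaptation_loss, the binary region_mask, and the hinge-style penalty are illustrative assumptions, not the presented method.

import torch

def interactive_adaptation_loss(density_map: torch.Tensor,
                                region_mask: torch.Tensor,
                                lo: float, hi: float) -> torch.Tensor:
    """Penalize the predicted count in a user-selected region when it
    falls outside the user-provided range [lo, hi]; zero inside it.
    (Hypothetical sketch; the talk's actual loss may differ.)"""
    # Predicted count = sum of the density map over the selected region.
    pred_count = (density_map * region_mask).sum()
    # Hinge-style terms: only violations of the range are penalized.
    below = torch.clamp(lo - pred_count, min=0.0)
    above = torch.clamp(pred_count - hi, min=0.0)
    return below + above

Under this reading, a few gradient steps minimizing such a term on the deployed counter would nudge the count in the flagged region into the user's range, which is consistent with the claim that the framework plugs into a similarity-based counter without full retraining.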

Event Title
Ph.D. Research Proficiency Presentation: Yifeng Huang, 'Counting and recounting of what matter'