Dates
Thursday, January 18, 2024 - 08:30am to Thursday, January 18, 2024 - 09:30am
Location
NCS 220
Event Description

Abstract: Video cameras are widespread, and the use of machine learning for video analytics has grown increasingly popular, with applications across a broad spectrum of areas, including surveillance, traffic analysis, and autonomous driving. Traditionally, machine learning models are static post-deployment; they lack the capability to self-adapt to individual scenes or self-improve to address the challenges presented by ever-changing video streams. My thesis posits that a video stream contains a wealth of information about the camera and the scene, which, if properly harnessed, can enable a model to self-improve.

In this proposal talk, I will demonstrate that a model can indeed learn to self-improve in terms of timeliness, precision, or efficiency by employing self-supervised training signals acquired concurrently with task execution. First, I will illustrate how a model that recognizes specific target events only after they occur can be converted into an early event detector by leveraging a video stream and updating the detector through delayed self-supervision. Second, I will introduce a framework that improves the detection performance of an object detector across various new scenes through self-supervised object labeling and by ensuring consistency between objects and backgrounds. Third, I will present a method that makes the scene-adaptive object detection framework more efficient by integrating a mixture-of-experts architecture, which bolsters detection performance through self-distillation training while maintaining a low computational footprint and model size. Lastly, I will discuss the proposed work for my thesis, which aims to enable an object detector to adapt to diverse scenes without altering the original model's weights, since dynamic model updates are often impractical in real-world applications where the model must remain fixed post-optimization.

Event Title
Ph.D. Proposal Defense: 'Self-Supervised Improvement of Event and Object Detection from Video Streams', Zekun Zhang