Dates
Friday, July 08, 2022 - 11:00am to Friday, July 08, 2022 - 12:30pm
Location
Zoom
Event Description

ABSTRACT:

Despite the ever-changing nature of computing systems, operating systems and storage systems still follow architectures, algorithms, and structures built decades ago.  Modern software stacks generate complicated and dynamic workloads that run on statically configured storage stacks.  To provide the best performance for such varied, dynamic workloads, we need self-adaptive, dynamically configured storage systems.  However, the current design principles of storage and operating systems provide no supporting mechanism for self-adaptability.

One possible way to achieve the self-adaptability needed in storage and operating systems is to approach operating system problems with machine learning assistance.  Researchers have tried using machine learning to solve operating system problems; however, existing solutions are either not practical or not versatile enough.  Therefore, we propose a complete pipeline for building machine learning models that improve operating system components, especially I/O subsystems and their performance.  First, we provide a low-overhead, high-fidelity data-collection framework to trace and collect data from inside operating systems.  We then develop a lightweight and efficient machine learning (ML) framework that can run at the kernel level and tune kernel parameters to improve I/O performance.

We have applied our machine learning framework, called KML, to tune disk readahead sizes according to workload-type predictions.  We used RocksDB as our benchmarking platform.  We can improve I/O performance for RocksDB's benchmark workloads, including realistic ones (e.g., Facebook's mixgraph), by up to 2.3x.  We also include another use case: tuning the NFS rsize parameter.  We observed performance improvements of as much as 15x for the NFS rsize use case.

It is our thesis that operating systems have many heuristics built largely by hand over many years, and yet operating systems cannot easily adapt to changing environment and workload conditions; therefore, we believe that compact and efficient machine learning engines should become first-class citizens inside operating systems and be used to improve I/O subsystems.

https://stonybrook.zoom.us/j/94696940975?pwd=UVh3TUk1UmhSZGFKSk1TN0JYekx3dz09
Meeting ID: 946 9694 0975
Passcode: 728144

Event Title
Ph.D. Proposal Defense: Ibrahim Akgun, "Using Machine Learning to Improve Operating Systems' I/O Subsystems"