Introduction
The rapid advancement of AI has highlighted the critical role of machine learning systems in training and deploying large models. Optimizing these systems is essential, both to improve user experience and to reduce operational costs. At Sea AI Lab, our research focuses on developing novel techniques to improve machine learning systems, with three primary objectives:
- Maximizing Throughput – Increasing system efficiency to reduce commercial costs. Higher throughput means processing more tasks at the same cost, completing the same tasks at a lower cost, or both.
- Minimizing Latency – Enhancing responsiveness for end users and AI developers. Throughput and latency often conflict: batching requests improves throughput but makes each request wait longer. We aim to strike a balance between the two.
- Overcoming Resource Constraints – Reducing memory consumption and other limitations to enable deployment in environments where it would otherwise be impractical or inefficient.
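
The tension between throughput and latency mentioned in the list above can be illustrated with a toy cost model. This is a minimal sketch, not a description of any system built at Sea AI Lab: it assumes that processing a batch costs a fixed overhead plus a per-request cost, and the function names and the constants `overhead_ms` and `per_item_ms` are hypothetical values chosen only for illustration. Queueing delay while a batch fills up is ignored.

```python
def batch_time_ms(batch_size, overhead_ms=10.0, per_item_ms=1.0):
    """Hypothetical cost model: fixed overhead plus a per-request cost."""
    return overhead_ms + per_item_ms * batch_size


def throughput_rps(batch_size, overhead_ms=10.0, per_item_ms=1.0):
    """Requests completed per second when batches run back to back."""
    return 1000.0 * batch_size / batch_time_ms(batch_size, overhead_ms, per_item_ms)


def latency_ms(batch_size, overhead_ms=10.0, per_item_ms=1.0):
    """Time each request waits for its batch to finish (queueing ignored)."""
    return batch_time_ms(batch_size, overhead_ms, per_item_ms)


if __name__ == "__main__":
    # Larger batches amortize the fixed overhead, so throughput rises,
    # but every request in the batch waits for the whole batch to finish.
    for b in (1, 8, 64):
        print(f"batch={b:3d}  throughput={throughput_rps(b):7.1f} req/s  "
              f"latency={latency_ms(b):5.1f} ms")
```

Under these assumed constants, both throughput and per-request latency grow with the batch size, which is why a practical system has to choose an operating point rather than optimize one metric alone.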