
Events

Foundations and Frontiers: Operations Research for Large Language Model Inference
Seminar Calendar, Information and Innovation Management

22 Nov 2024 | 2:00 p.m. — 3:30 p.m.
KK 315, K. K. Leung Building, HKU
SPEAKER

Mr. Zijie (Jerry) Zhou
Ph.D. Candidate in MIT ORC & LIDS
Massachusetts Institute of Technology

ABSTRACT

Large Language Model (LLM) inference involves the computational techniques and strategies used to process input prompts and generate responses through a large language model. This field intersects significantly with online optimization, particularly in areas like online batching, scheduling, and resource allocation, making it a topic of keen interest to the OR/OM community. In this talk, I will introduce a foundational model for managing computational tasks on a single GPU, focusing on reducing redundant computations through a special memory-saving mechanism. This mechanism temporarily stores information from each word the model processes into the KV (key-value) cache to avoid recalculating it repeatedly. However, as more words are processed, this storage can quickly reach its limit. When this happens, the system incurs substantial extra costs by reprocessing tasks. We optimize batching and scheduling strategies to manage KV cache memory usage and minimize the inference latency to improve efficiency and sustainability.
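The KV cache pressure described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the speaker's algorithm: the names `Request` and `simulate` are hypothetical, each token is assumed to occupy one unit of KV cache, every running request decodes exactly one token per step, output lengths are assumed known in advance, and requests are admitted in arrival order only if their full footprint (prompt plus output) can be reserved.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    prompt_len: int   # tokens in the input prompt
    output_len: int   # tokens to generate (assumed known, for illustration)


def simulate(requests, capacity):
    """Hypothetical FIFO batching with conservative KV cache reservation.

    Returns (finish_step per request, peak KV cache usage in tokens).
    """
    for r in requests:
        if r.prompt_len + r.output_len > capacity:
            raise ValueError("request can never fit in the KV cache")

    waiting = deque(enumerate(requests))
    running = {}                      # request index -> tokens generated so far
    reserved = 0                      # KV units reserved for running requests
    finish_step = [0] * len(requests)
    peak = 0
    step = 0

    while waiting or running:
        # Admit requests in arrival order while their full footprint fits.
        while waiting:
            idx, req = waiting[0]
            need = req.prompt_len + req.output_len
            if reserved + need > capacity:
                break
            waiting.popleft()
            running[idx] = 0
            reserved += need

        step += 1
        # Each running request decodes one token, growing its KV footprint.
        for idx in running:
            running[idx] += 1
        used = sum(requests[i].prompt_len + g for i, g in running.items())
        peak = max(peak, used)

        # Release the cache held by requests that have finished.
        for idx in list(running):
            req = requests[idx]
            if running[idx] >= req.output_len:
                reserved -= req.prompt_len + req.output_len
                del running[idx]
                finish_step[idx] = step

    return finish_step, peak


if __name__ == "__main__":
    reqs = [Request(4, 2), Request(3, 3), Request(2, 1)]
    print(simulate(reqs, capacity=12))
    print(simulate(reqs, capacity=6))
```

Shrinking `capacity` forces requests to wait for cache space, so their finish steps move later. Reserving the full footprint up front avoids overflow entirely but can leave memory idle, which is the kind of trade-off that batching and scheduling policies try to manage.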
We address this challenge by first analyzing a semi-online model, where all prompts arrive initially and must be processed sequentially. For this case, we develop a polynomial-time algorithm that achieves exact optimality. Next, we examine the fully online setting with sequential prompt arrivals. For adversarial sequences, we demonstrate that no algorithm can achieve a constant competitive ratio. For stochastic arrivals, we present a fast algorithm that guarantees constant regret, using a novel framework based on compensated coupling to prove it. Finally, using the Vidur simulator on a public conversation dataset, we compare our algorithm to benchmark algorithms on two linked A100 GPUs with the Llama-70B model. After optimizing benchmark parameters, we find that in high-demand scenarios, our algorithm's average latency grows only one-third as fast as the best benchmark, and in low-demand cases, only one-eighth as fast.
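For readers unfamiliar with the two performance measures named above, the standard textbook definitions for a cost-minimization problem are sketched below. The notation is an assumption introduced here for illustration ($\mathrm{ALG}$ and $\mathrm{OPT}$ denote the total latency of an online algorithm and of an optimal benchmark, $\sigma$ an arrival sequence, $T$ the horizon); the talk may use a different benchmark or normalization.

```latex
% Competitive ratio (adversarial arrivals): ALG is c-competitive if
\mathrm{ALG}(\sigma) \;\le\; c \cdot \mathrm{OPT}(\sigma)
\quad \text{for every arrival sequence } \sigma .

% "No constant competitive ratio" means that for every online algorithm
\sup_{\sigma} \frac{\mathrm{ALG}(\sigma)}{\mathrm{OPT}(\sigma)} \;=\; \infty .

% Regret (stochastic arrivals) over a horizon T:
\mathrm{Regret}(T) \;=\; \mathbb{E}\big[\mathrm{ALG}_T\big] - \mathbb{E}\big[\mathrm{OPT}_T\big],
\qquad
\text{constant regret: } \mathrm{Regret}(T) \le C \ \text{ for all } T,
\ \text{with } C \text{ independent of } T .
```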


©2025, HKU Business School. All Rights Reserved.