Benchmarking Large Language Models on Network Optimization
Paper Title:
Benchmarking Large Language Models on Network Optimization
Authors:
Ethan Cui, Foothill College, USA
Abstract:
The most recent large language models (LLMs) show impressive problem-solving capabilities on tasks such as code generation and mathematical problem solving. However, most current benchmarks evaluate these models only on academic or competition-style questions, leaving gaps in our understanding of their capabilities. This paper introduces the first benchmark that evaluates LLMs on network optimization problems from operations research, including shortest-path routing, min-cost flow, and multicommodity flow. Such problems test LLMs’ abilities to follow structured instructions, conform to complex constraints, and carry out long-term reasoning on complex tasks. By controlling the complexity of synthetically generated instances, we reveal critical differences in the reasoning and planning capabilities of seven frontier models. Our work bridges the Operations Research (OR) and LLM communities, positioning network optimization as a rigorous and scalable framework for evaluating and advancing LLM reasoning.
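To give a concrete sense of the kind of benchmark the abstract describes, the sketch below shows one plausible way a synthetic shortest-path instance could be generated, turned into a plain-text prompt for an LLM, and paired with a reference optimum for grading. This is an illustrative assumption, not the authors' actual pipeline; the function name make_shortest_path_instance and the complexity knobs n_nodes and edge_prob are hypothetical.

```python
# Minimal sketch (assumed, not the paper's code) of a synthetic
# shortest-path benchmark instance with a solver-computed reference answer.
import random
import networkx as nx

def make_shortest_path_instance(n_nodes=8, edge_prob=0.4, seed=0):
    rng = random.Random(seed)
    # Random directed graph; size and density act as complexity controls.
    G = nx.gnp_random_graph(n_nodes, edge_prob, seed=seed, directed=True)
    for u, v in G.edges():
        G[u][v]["weight"] = rng.randint(1, 10)  # integer edge costs

    # Pick a source/target pair that is actually connected.
    pairs = [(s, t) for s in G for t in G if s != t and nx.has_path(G, s, t)]
    source, target = rng.choice(pairs)

    # Reference optimum from Dijkstra, used to grade the model's answer.
    opt_cost = nx.dijkstra_path_length(G, source, target, weight="weight")

    # Plain-text prompt listing the edges, as an LLM might receive it.
    edge_lines = "\n".join(
        f"{u} -> {v} (cost {d['weight']})" for u, v, d in G.edges(data=True)
    )
    prompt = (
        f"Find the minimum-cost path from node {source} to node {target} "
        f"in this directed graph:\n{edge_lines}\nReport the total cost."
    )
    return prompt, opt_cost

if __name__ == "__main__":
    prompt, opt_cost = make_shortest_path_instance()
    print(prompt)
    print("Reference optimal cost:", opt_cost)
```

Varying parameters such as n_nodes and edge_prob is one way the difficulty of generated instances could be scaled, in the spirit of the controlled-complexity evaluation described in the abstract.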
Keywords:
Network Flow Optimization, Large Language Models, Model Evaluation
Volume URL: https://airccse.com/oraj/vol12.html
Pdf URL: https://airccse.com/oraj/papers/12225oraj01.pdf