Benchmarking Large Language Models on Network Optimization

Paper Title:

Benchmarking Large Language Models on Network Optimization


Authors:

Ethan Cui, Foothill College, USA


Abstract:

The most recent large language models (LLMs) show impressive problem-solving capabilities in tasks such as code generation and solving math problems. However, most current benchmarks for these models test only academic or competition-style questions, leaving gaps in our understanding of LLM capabilities. This paper introduces the first benchmark that evaluates LLMs on network optimization problems from operations research, including shortest-path routing, min-cost flow, and multicommodity flow. Such problems test an LLM’s ability both to follow structured instructions and to conform to complex constraints, and they require long-term reasoning on complex tasks. By controlling the complexity of synthetically generated instances, we reveal critical differences in the reasoning and planning capabilities of seven frontier models. Our work bridges the Operations Research (OR) and LLM communities, positioning network optimization as a rigorous and scalable framework for evaluating and advancing LLM reasoning.


Keywords:

Network Flow Optimization, Large Language Model, Model Evaluation


Volume URL: https://airccse.com/oraj/vol12.html


Pdf URL: https://airccse.com/oraj/papers/12225oraj01.pdf




