Benchmarking Large Language Models on Network Optimization
Paper Title:
Benchmarking Large Language Models on Network Optimization
Authors:
Ethan Cui, Foothill College, USA
Abstract:
The most recent large language models (LLMs) show impressive problem-solving capabilities on tasks such as code generation and mathematical problem solving. However, most current benchmarks evaluate these models only on academic or competition-style questions, leaving gaps in our understanding of their capabilities. This paper introduces the first benchmark that evaluates LLMs on network optimization problems from operations research, including shortest-path routing, min-cost flow, and multicommodity flow. Such problems test LLMs’ abilities to follow structured instructions, conform to complex constraints, and carry out long-term reasoning on complex tasks. By controlling the complexity of synthetically generated instances, we reveal critical differences in the reasoning and planning capabilities of seven frontier models. Our work bridges the Operations Research (OR) and LLM communities, positioning network optimization as a rigorous and scalable framework for evaluating and advancing LLM reasoning.
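To give a concrete sense of the kind of benchmark the abstract describes, the sketch below shows one plausible way a synthetic shortest-path instance could be generated, turned into a plain-text prompt for an LLM, and paired with a reference optimum for grading. This is an illustrative assumption, not the authors' actual pipeline; the function name make_shortest_path_instance and the complexity knobs n_nodes and edge_prob are hypothetical.

```python
# Minimal sketch (assumed, not the paper's code) of a synthetic
# shortest-path benchmark instance with a solver-computed reference answer.
import random
import networkx as nx

def make_shortest_path_instance(n_nodes=8, edge_prob=0.4, seed=0):
    rng = random.Random(seed)
    # Random directed graph; size and density act as complexity controls.
    G = nx.gnp_random_graph(n_nodes, edge_prob, seed=seed, directed=True)
    for u, v in G.edges():
        G[u][v]["weight"] = rng.randint(1, 10)  # integer edge costs

    # Pick a source/target pair that is actually connected.
    pairs = [(s, t) for s in G for t in G if s != t and nx.has_path(G, s, t)]
    source, target = rng.choice(pairs)

    # Reference optimum from Dijkstra, used to grade the model's answer.
    opt_cost = nx.dijkstra_path_length(G, source, target, weight="weight")

    # Plain-text prompt listing the edges, as an LLM might receive it.
    edge_lines = "\n".join(
        f"{u} -> {v} (cost {d['weight']})" for u, v, d in G.edges(data=True)
    )
    prompt = (
        f"Find the minimum-cost path from node {source} to node {target} "
        f"in this directed graph:\n{edge_lines}\nReport the total cost."
    )
    return prompt, opt_cost

if __name__ == "__main__":
    prompt, opt_cost = make_shortest_path_instance()
    print(prompt)
    print("Reference optimal cost:", opt_cost)
```

Varying parameters such as n_nodes and edge_prob is one way the difficulty of generated instances could be scaled, in the spirit of the controlled-complexity evaluation described in the abstract.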
Keywords:
Network Flow Optimization, Large Language Models, Model Evaluation
Volume URL: https://airccse.com/oraj/vol12.html
Pdf URL: https://airccse.com/oraj/papers/12225oraj01.pdf