%0 Conference Article
%T INFRASTRUCTURE-LEVEL NEURAL NETWORK INFERENCE OPTIMIZATION: MODEL SERVING PLATFORMS AND DISTRIBUTED COMPUTING
%A Merzlyakov, N.V.
%A Semkin, A.A.
%A Matviychuk, B.S.
%A Sedykh, D.A.
%A Dudnik, S.P.
%A Vytovtov, P.D.
%K neural network inference, model serving, Triton Inference Server, Ray Serve, vLLM, dynamic batching, PagedAttention, autoscaling, Kubernetes, distributed computing
%J MODELING INFORMATION SYSTEMS AND TECHNOLOGIES – 2026
%D 2026
%P 9
%I FSBE Institution of Higher Education Voronezh State University of Forestry and Technologies named after G.F. Morozov