TY  - CONF
TI  - INFRASTRUCTURE-LEVEL NEURAL NETWORK INFERENCE OPTIMIZATION: MODEL SERVING PLATFORMS AND DISTRIBUTED COMPUTING
KW  - neural network inference
KW  - model serving
KW  - Triton Inference Server
KW  - Ray Serve
KW  - vLLM
KW  - dynamic batching
KW  - PagedAttention
KW  - autoscaling
KW  - Kubernetes
KW  - distributed computing
JO  - MODELING INFORMATION SYSTEMS AND TECHNOLOGIES – 2026
AU  - Merzlyakov, N.V.
AU  - Semkin, A.A.
AU  - Matviychuk, B.S.
AU  - Sedykh, D.A.
AU  - Dudnik, S.P.
AU  - Vytovtov, P.D.
PY  - 2026
PB  - FSBE Institution of Higher Education Voronezh State University of Forestry and Technologies named after G.F. Morozov