AWS Premier Tier Partner leverages its AI Services Competency and expertise to help founders cut LLM costs using ...
New deployment data from four inference providers shows where the savings actually come from — and what teams should evaluate ...
Researchers at the Tokyo-based startup Sakana AI have developed a new technique that enables language models to use memory more efficiently, helping enterprises cut the costs of building applications ...
Nvidia noted that cost per token went from 20 cents on the older Hopper platform to 10 cents on Blackwell. Moving to Blackwell’s native low-precision NVFP4 format further reduced the cost to just 5 ...
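Taken at face value, 20 cents per token would be absurd; such figures are conventionally quoted per million tokens, and the cut-off final price reads as 5 cents. A minimal sketch under those stated assumptions, with a hypothetical monthly volume, turns the halvings into dollars:

```python
# Illustrative arithmetic only. Assumptions: the quoted prices are per
# million tokens (the standard billing unit), the truncated final figure
# is 5 cents, and the monthly volume is hypothetical.
MONTHLY_TOKENS_MILLIONS = 50_000  # hypothetical: 50B tokens/month

usd_per_million_tokens = {
    "Hopper": 0.20,
    "Blackwell": 0.10,
    "Blackwell + NVFP4": 0.05,  # assumed completion of the truncated "5 ..."
}

for platform, price in usd_per_million_tokens.items():
    print(f"{platform:>18}: ${MONTHLY_TOKENS_MILLIONS * price:,.0f}/month")
```

At the assumed 50B tokens a month, the move from Hopper to NVFP4 on Blackwell is the difference between $10,000 and $2,500 in serving cost.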
A new technical paper titled “Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference” was published by researchers at University of Cambridge, Imperial College London ...
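The paper itself isn't excerpted here, but the "memory wall" framing is easy to ground with the standard back-of-envelope KV-cache formula; the model shape, context length, and batch size below are illustrative assumptions, not the paper's setup:

```python
# Back-of-envelope KV-cache sizing using the standard formula; the shape
# below is an illustrative 70B-class config, not taken from the paper.
n_layers, n_kv_heads, head_dim = 80, 8, 128   # grouped-query attention: 8 KV heads
bytes_per_elem = 2                            # fp16/bf16 cache entries
seq_len, batch = 128_000, 4                   # long-context agentic workload

# 2x accounts for storing both keys and values per layer
kv_bytes = 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem * batch
print(f"KV cache alone: {kv_bytes / 2**30:.1f} GiB")  # -> 156.2 GiB
```

Even with grouped-query attention trimming the KV heads to 8, a four-way batch at 128K context needs about 156 GiB for the cache alone, which is why long-context agentic workloads run into memory before compute.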
BELLEVUE, Wash.--(BUSINESS WIRE)--MangoBoost, a provider of cutting-edge system solutions designed to maximize AI data center efficiency, is announcing the launch of Mango LLMBoost™, system ...
Marketing, technology, and business leaders today are asking an important question: how do you optimize for large language models (LLMs) like ChatGPT, Gemini, and Claude? LLM optimization is taking ...
TENCENT (00700.HK)'s Hunyuan LLM AI Infra team announced the launch of HPC-Ops, an open-source production-grade ...
NVIDIA Boosts LLM Inference Performance With New TensorRT-LLM Software Library. As companies like d-Matrix squeeze into the lucrative artificial intelligence market with ...
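TensorRT-LLM exposes a high-level Python LLM API on top of its engine tooling; a minimal sketch follows, with the model name and sampling values as placeholder assumptions rather than anything from the article:

```python
from tensorrt_llm import LLM, SamplingParams

# Sketch of TensorRT-LLM's high-level LLM API; the model and sampling
# settings here are illustrative placeholders, not recommendations.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
sampling = SamplingParams(max_tokens=64, temperature=0.7)

outputs = llm.generate(["Why does low-precision inference cut cost per token?"], sampling)
print(outputs[0].outputs[0].text)
```

On first load the library builds an optimized engine for the local GPU, which is where the advertised throughput gains come from.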
XDA Developers on MSN
I served a 200 billion parameter LLM from a Lenovo workstation the size of a Mac Mini
This mini PC is small and ridiculously powerful.
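The claim looks less surprising after the weight-memory arithmetic; the article's exact setup isn't shown in the snippet, so the numbers below are generic quantization math:

```python
# Weight-memory arithmetic for a 200B-parameter model. Approximate only:
# ignores KV cache, activations, and runtime overhead (assumptions).
params = 200e9
for label, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    gib = params * bits / 8 / 2**30
    print(f"{label:>5}: ~{gib:,.0f} GiB of weights")
```

At 4 bits per weight, 200B parameters come to roughly 93 GiB, within reach of a compact workstation carrying 128 GB of memory, before accounting for KV cache and runtime overhead.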