A report from Cast AI finds that average GPU utilization across Kubernetes clusters on major cloud services is only 5%, raising concerns over wasted capacity and escalating costs. At that level, companies in the tech sector are estimated to be paying for roughly 20 times the GPU capacity they actually use, which points to significant inefficiencies in resource management.
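The 20-times figure follows directly from the 5% utilization number: if only a fraction of the paid-for capacity does useful work, the ratio of capacity bought to capacity used is the reciprocal of that fraction. A minimal Python sketch of that arithmetic, where the function name `overpay_factor` is an illustrative assumption and not something taken from the report:

```python
def overpay_factor(utilization: float) -> float:
    """Return paid-for capacity divided by capacity actually used.

    `utilization` is a fraction in (0, 1], e.g. 0.05 for 5%.
    This helper is a hypothetical illustration, not the report's method.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in the range (0, 1]")
    return 1.0 / utilization


if __name__ == "__main__":
    # 5% is the average GPU utilization cited in the article.
    print(f"{overpay_factor(0.05):.0f}x")
```

By the same reasoning, a cluster running at 50% utilization would be paying for about twice the capacity it uses.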
The findings show a sharp increase in overprovisioning over the past year, with CPU overprovisioning rising from 40% to 69% and memory overprovisioning reaching 79%. Utilization also declined: CPU utilization fell from 10% to 8%, and memory utilization dropped from 23% to 20%. Together, these metrics reflect a troubling trend in which organizations reserve far more resources than they need.
The report also details the financial implications of idle resources. Idle GPUs cost dollars per hour, whereas idle CPUs cost only cents per hour. This gap makes better resource management more urgent, particularly as GPU prices see their first increase since 2006. AWS raised H200 Capacity Block prices by 15% in January 2026 due to supply and demand factors, breaking a two-decade trend of declining prices.
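To make the dollars-versus-cents contrast concrete, here is a rough Python sketch of how idle spend scales with hourly rate and utilization. The hourly rates below are hypothetical placeholders chosen only to match the orders of magnitude the article describes; they are not prices from the report or from any provider. The 5% and 8% utilization figures and the 15% increase come from the article.

```python
def idle_cost_per_hour(hourly_rate: float, count: int, utilization: float) -> float:
    """Hourly spend on the unused share of `count` identical resources."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in the range [0, 1]")
    return hourly_rate * count * (1.0 - utilization)


def apply_increase(price: float, percent: float) -> float:
    """Return `price` after a percentage increase."""
    return price * (1.0 + percent / 100.0)


# Hypothetical rates (assumptions, not data from the report).
GPU_RATE = 4.00  # dollars per GPU-hour
CPU_RATE = 0.04  # dollars per vCPU-hour

if __name__ == "__main__":
    gpu_idle = idle_cost_per_hour(GPU_RATE, count=1, utilization=0.05)
    cpu_idle = idle_cost_per_hour(CPU_RATE, count=1, utilization=0.08)
    print(f"idle cost per GPU-hour:  ${gpu_idle:.2f}")
    print(f"idle cost per vCPU-hour: ${cpu_idle:.4f}")
    # Effect of a 15% price increase on the assumed GPU rate.
    print(f"GPU rate after +15%:     ${apply_increase(GPU_RATE, 15):.2f}")
```

With these assumed rates, nearly the whole hourly price of a GPU is spent on idle time, and any price increase raises that idle spend by the same proportion.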
Laurent Gil, co-founder and President of Cast AI, emphasized the severity of the situation, stating, “At 5% utilization, the math doesn’t work.” He noted that although advanced AI tools are available for managing applications, they are not being used effectively to optimize the underlying infrastructure.
A minority of organizations report much higher GPU utilization, reaching 49% on H200s and 30% on H100s. The report attributes these results to automation practices rather than to superior hardware alone. Many other organizations overprovision as a safety margin, which often drives up costs.
Companies appear reluctant to change their resource management habits, continuing to pay excessive fees instead of using existing automation tools such as automated rightsizing and GPU sharing. This resistance undermines efforts to improve operational efficiency within the tech industry.
The report suggests that changing these practices could significantly reduce unnecessary spending, yet many organizations seem willing to keep them despite the escalating costs.








