The accelerating adoption of artificial intelligence (AI) and machine learning (ML) workloads is dramatically reshaping cloud infrastructure demands, leading to an urgent need for efficient resource management. However, a significant gap persists in the industry’s willingness to automate crucial "right-sizing" decisions for cloud resources, particularly within Kubernetes environments. This hesitation, despite the growing pressure of escalating cloud bills, presents a critical challenge for organizations aiming to optimize costs and performance.
Yasmin Rajabi, chief operating officer at CloudBolt, articulated this dichotomy, highlighting an imbalance in how automation is perceived. "We’re happy to automate decisions that result in more productivity and processes – but what about when it comes to turning the dial to the left? For some reason, there’s hesitation there," Rajabi stated in an interview with The New Stack. "Trust is super-high when it comes to traditional automation, but there’s still a lot of caution when it comes to right-sizing. The same engineers who are deploying multiple times a day through CI/CD aren’t questioning [automation] anymore, but when it comes to delegating right-sizing to the machine, the bar to earn that trust is much higher."
This disconnect is underscored by recent industry data. According to the March 2026 CloudBolt Research Report, an overwhelming 89% of organizations now identify automated right-sizing processes as a priority. This heightened focus is directly attributed to the soaring cloud expenditures driven by GPU-intensive AI workloads. Historically, the imperative to maintain constant availability often led to over-provisioning, with the associated higher cloud bills seen as a necessary trade-off for guaranteed uptime. However, the current economic climate and the unbridled growth of AI compute needs have made this an untenable strategy for many.
Despite this acknowledged priority, the practical implementation of automated right-sizing lags significantly. The same CloudBolt report reveals that 71% of Kubernetes engineers still mandate human review for resource optimization, while only 27% of organizations permit automated adjustments to CPU and memory allocations. This indicates a substantial gap between recognizing the need for automated right-sizing and embedding it into operational workflows. The motivation to cut costs and improve efficiency is evident, but the trust required to fully delegate these adjustments to automated systems remains elusive.
The AI Workload Catalyst
The surge in AI and ML adoption has been a primary driver for this shift in cloud resource strategy. Training complex AI models and deploying inference engines are notoriously resource-intensive, often requiring substantial GPU power and large amounts of memory. While these workloads promise transformative innovation and competitive advantages, their operational costs can quickly spiral out of control if not meticulously managed. Organizations are finding themselves caught between the need to accelerate AI initiatives and the financial implications of inefficient cloud resource allocation.
This situation creates a pressing need for sophisticated automation that can dynamically adjust resources based on actual demand, rather than relying on static, often oversized, allocations. The challenge lies not only in identifying when resources are over-provisioned but also in accurately predicting future needs and safely scaling down without impacting performance or availability. The complexity of modern cloud-native applications, particularly those orchestrated by Kubernetes, adds another layer of difficulty.
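To make the idea of sizing from actual demand concrete, here is a minimal Python sketch. It is not taken from the CloudBolt report or from any vendor's product: the function names, the 95th-percentile choice, the 15% headroom and the 20% tolerance are all illustrative assumptions, and usage samples are assumed to be in the same unit as the request (for example CPU millicores).

```python
import math


def recommend_request(samples, percentile=95.0, headroom=0.15):
    """Suggest a resource request from observed usage samples.

    Hypothetical helper: takes the nearest-rank percentile of the samples
    and adds a safety margin on top. All defaults are assumptions.
    """
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    # Nearest-rank percentile: the smallest sample such that at least
    # `percentile` percent of all samples are at or below it.
    rank = max(1, math.ceil(percentile * len(ordered) / 100.0))
    observed = ordered[rank - 1]
    return observed * (1.0 + headroom)


def is_over_provisioned(current_request, recommended, tolerance=0.20):
    """Flag a workload whose request exceeds the recommendation by more
    than `tolerance` (an assumed threshold, expressed as a fraction)."""
    return current_request > recommended * (1.0 + tolerance)


if __name__ == "__main__":
    # Twenty made-up CPU usage samples in millicores: 10, 20, ..., 200.
    usage = [10 * i for i in range(1, 21)]
    suggestion = recommend_request(usage)
    print(f"suggested request: {suggestion:.1f}m")
    print("over-provisioned at 1000m:", is_over_provisioned(1000, suggestion))
```

A percentile with headroom is only one possible policy. It reacts to observed history and does nothing to predict future demand, which is the harder half of the problem described above.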
Building Trust in Automated Resource Optimization
Rajabi emphasized that right-sizing is not a simple task but a multi-dimensional problem. "It spans increasingly complex workloads in increasingly complex environments, so that when something goes wrong, it feels almost impossible to reverse," she explained. Automating right-sizing effectively therefore requires trust that is first built within individual teams and then scaled across the entire organization.
The fragility of this trust was also a key point raised by Rajabi. "It takes a long time to build up trust in an automation solution, and it’s very fast to eliminate or significantly undermine that trust," she warned. "It takes one production incident to take an application team from being willing to entertain automated resourcing to absolutely not, ‘not on my application, we’re special.’" This highlights the critical importance of robust testing, transparent processes, and effective rollback strategies to maintain confidence in automated systems.
The Kubernetes Context
Kubernetes, with its declarative configuration and orchestration capabilities, is the de facto standard for deploying and managing containerized applications, including AI workloads. However, its inherent complexity can also make resource optimization a challenging endeavor. Engineers often grapple with intricate configuration files, dynamic scaling policies, and the potential for cascading failures if resource adjustments are mismanaged.
The data suggests that while engineers are comfortable with automation in areas like continuous integration and continuous delivery (CI/CD), delegating decisions about resource allocation – the "turning the dial to the left" aspect Rajabi mentioned – requires a higher threshold of confidence. This is understandable, as mismanaging resource allocation can lead to application downtime, degraded performance, and significant financial repercussions.
Expert Insights and Upcoming Discussion
To delve deeper into this critical issue, The New Stack will host a live discussion on Wednesday, June 24, at 9 a.m. Pacific/5 p.m. BST. The webinar will feature Yasmin Rajabi from CloudBolt and Reid Vandewiele, product lead at StormForge, who will explore the urgency of the right-sizing gap, particularly in the context of Kubernetes workloads for AI.
This session aims to give attendees actionable insights into measuring their organization’s automation maturity, along with strategies for building and maintaining trust in automated resource management over time. Discussions are expected to cover best practices for handling CPU throttling and out-of-memory (OOM) events, and for implementing rollback patterns that mitigate the risks of automated adjustments.
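As a rough illustration of what a rollback pattern for automated adjustments might look like, the Python sketch below pairs a bounded step size with a post-change health check. It is an assumption-laden example, not the approach the speakers will present: the `Observation` fields, the 10% step limit and the 5-point throttling threshold are hypothetical, and how the metrics are collected is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Health signals for one workload over a fixed window (hypothetical)."""
    oom_kills: int          # OOM kills seen during the window
    throttled_ratio: float  # fraction of CPU periods throttled, 0.0 to 1.0


def next_request(current, target, max_step=0.10):
    """Move the request toward `target`, changing it by at most
    `max_step` (a fraction of the current value) per adjustment."""
    lower = current * (1.0 - max_step)
    upper = current * (1.0 + max_step)
    return min(max(target, lower), upper)


def should_roll_back(before, after, max_throttle_increase=0.05):
    """Decide whether to revert an adjustment.

    Reverts on any new OOM kill, or when throttling rises by more than
    `max_throttle_increase` (an assumed threshold) after the change.
    """
    if after.oom_kills > before.oom_kills:
        return True
    return (after.throttled_ratio - before.throttled_ratio) > max_throttle_increase


if __name__ == "__main__":
    previous = 1000.0  # current CPU request in millicores (made-up value)
    proposed = next_request(previous, target=400.0)
    print(f"step from {previous:.0f}m to {proposed:.0f}m")

    baseline = Observation(oom_kills=0, throttled_ratio=0.02)
    after_change = Observation(oom_kills=0, throttled_ratio=0.10)
    if should_roll_back(baseline, after_change):
        print(f"reverting to {previous:.0f}m")
```

Small steps keep any single change easy to reverse, and recording the previous value before each step is what makes the revert trivial. Both choices speak to the concern quoted earlier that a failed adjustment can feel almost impossible to undo.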
Broader Implications and Future Outlook
The implications of failing to bridge the Kubernetes right-sizing trust gap are significant. Organizations that cannot effectively automate resource optimization risk continuing to overspend on cloud infrastructure, thereby impacting their profitability and ability to invest in further innovation. This could also lead to a competitive disadvantage for companies that are unable to efficiently scale their AI initiatives.
Conversely, organizations that successfully build trust in automated right-sizing can unlock substantial cost savings, improve the performance and reliability of their applications, and gain greater agility in responding to changing market demands. This improved efficiency can free up valuable engineering resources to focus on higher-value tasks, such as developing new features and driving business growth.
The upcoming webinar on June 24 offers a timely opportunity for industry professionals to gain a clearer understanding of the true cost of their AI workloads and to develop a strategic plan that removes the guesswork from cloud provisioning. By addressing the trust gap head-on, organizations can pave the way for more efficient, cost-effective, and scalable cloud operations in the era of AI. The conversation is not just about saving money; it’s about fundamentally rethinking how we leverage cloud resources to power the next wave of technological advancement.
