Did Big Tech Waste Billions? NVIDIA Challenges the AI Craze Over Large Language Model Investments
- EricTheVogi
- Aug 21
- 3 min read
NVIDIA quietly told the world that LLMs may not be the right approach to the future buildout of agentic AI. In a recent research report, they argued that much of big tech's spending may have been wasted on the premise that a model's scale equates to its capability. Perhaps all the massive data centers and the investments in energy, infrastructure, and management were, to say the least, overdone.

We conducted a brief analysis of the report, released two months ago, to understand why NVIDIA thinks that requiring LLMs for all agentic AI tasks is a poor deployment of computational resources, resulting in economic inefficiency and widespread environmental unsustainability. Rather than picking one or the other, they envision hybrid systems that default to SLMs and selectively call on LLMs for tasks requiring broader reasoning, blending the two to enable agents that are both capable and cost‑effective. In sum, with rising costs and environmental concerns, normalizing SLM use in agent workflows would be the key to more responsible, sustainable AI.
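To make that idea concrete, here is a minimal sketch of what SLM-first routing with selective LLM fallback might look like. The function names (`call_slm`, `call_llm`, `needs_broader_reasoning`) and the escalation heuristic are placeholders of our own, not anything taken from NVIDIA's report.

```python
# Minimal sketch of SLM-first routing with selective LLM fallback.
# call_slm, call_llm, and needs_broader_reasoning are hypothetical
# stand-ins, not part of any NVIDIA tooling.

def needs_broader_reasoning(task: str) -> bool:
    """Cheap heuristic: escalate long, open-ended requests to the LLM."""
    return len(task.split()) > 200 or "plan" in task.lower()

def call_slm(task: str) -> str:
    return f"[SLM answer to: {task[:40]}...]"   # placeholder for a local small model

def call_llm(task: str) -> str:
    return f"[LLM answer to: {task[:40]}...]"   # placeholder for a hosted large model

def route(task: str) -> str:
    # Default to the cheap, local SLM; only pay for the LLM when the
    # task looks like it needs broader reasoning.
    if needs_broader_reasoning(task):
        return call_llm(task)
    return call_slm(task)

if __name__ == "__main__":
    print(route("Extract the invoice total from this receipt text."))
```

In a real deployment the routing decision would likely come from a learned classifier or the SLM's own confidence rather than a keyword check, but the cost logic is the same: the expensive model is the exception, not the default.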
Newer small language models now have the capabilities of earlier large models, and their reasoning can be further strengthened at inference (when a finished, trained model is being used) through techniques like ‘self-consistency’ (sampling several independent solutions to the same problem and keeping the most common answer) and ‘verifier feedback’ (having responses checked by a separate model). In agentic workflows, where specialization and continuous refinement matter most, these advantages prove particularly impactful for SLMs. Ultimately, modern training and prompting methods show that capability, not raw model size, is the real constraint on building better agents.
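As a rough illustration of how self-consistency works, the toy sketch below samples several answers to the same question and keeps the majority vote; `sample_answer` is our own placeholder for a stochastic call to a small model (temperature above zero), not code from the report.

```python
# Toy illustration of 'self-consistency': sample several answers to the
# same question and keep the most common one.
import random
from collections import Counter

def sample_answer(question: str) -> str:
    # Placeholder: a real call would sample a full reasoning chain from an SLM.
    return random.choice(["42", "42", "41"])

def self_consistent_answer(question: str, n_samples: int = 5) -> str:
    votes = Counter(sample_answer(question) for _ in range(n_samples))
    return votes.most_common(1)[0][0]   # majority vote across samples

print(self_consistent_answer("What is 6 x 7?"))
```

Verifier feedback works similarly in spirit: instead of voting, each candidate answer is passed to a separate checking model, and only answers that pass are kept.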
They referred to a “Lego-like” composition of agentic intelligence, where scaling out with small, specialized experts rather than scaling up monolithic models produces systems that are more cost‑efficient, easier to debug and deploy, and better suited to the operational diversity of real-world agents. This makes it easy to add new skills and adjust to changing requirements, while the smaller size of each expert lowers the cost of building, adapting, and deploying it. That affordability and agility enable the practical use of multiple specialized expert models tailored to different agentic routines. For example, rapid iteration (improvement through ongoing cycles) and adaptation (the system’s ability to change in response to external influences) allow systems to meet evolving user needs by supporting new behaviors, accommodating updated output formats, and ensuring compliance with changing local regulations.
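One way to picture that “Lego-like” assembly is a registry of small experts keyed by routine, as in the sketch below; adding a skill means registering one more expert rather than retraining a monolith. The routine names and expert functions are illustrative placeholders of our own.

```python
# Sketch of 'Lego-like' composition: a registry of small, specialized experts
# keyed by routine, so new skills can be added without touching a monolith.
# The expert functions below are hypothetical placeholders.
from typing import Callable, Dict

EXPERTS: Dict[str, Callable[[str], str]] = {}

def register(routine: str):
    def wrap(fn: Callable[[str], str]):
        EXPERTS[routine] = fn          # adding a skill = registering one more small expert
        return fn
    return wrap

@register("summarize")
def summarizer(text: str) -> str:
    return f"[summary of {len(text)} chars]"

@register("extract_dates")
def date_extractor(text: str) -> str:
    return "[dates found: ...]"

def handle(routine: str, payload: str) -> str:
    expert = EXPERTS.get(routine)
    if expert is None:
        raise KeyError(f"No expert registered for routine '{routine}'")
    return expert(payload)

print(handle("summarize", "Quarterly report text..."))
```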
On-device AI capabilities are growing as well: advances in edge deployment allow SLMs to run locally on standard GPUs, which reduces latency while improving privacy and control. When LLMs are used for most tasks, much of the signal is sparse, with only a fraction of the parameters activated per input. This effect is less pronounced in SLMs, implying that their smaller scale may yield greater efficiency, since fewer parameters contribute to inference cost without diminishing output quality.
Another key advantage of merging LLMs with flexible SLMs is the democratization of agent development. Lower costs and accessibility allow more individuals and organizations to build models for their own deployment, expanding the diversity of perspectives and societal needs represented. This broader participation reduces systemic bias, fosters competition and innovation, and accelerates progress in the field.
This research analysis highlights that SLMs are often better suited than LLMs for many applications. Even as systems scale, small models retain advantages such as cross‑device agility, and their design and development continue to build on foundations established by LLM research, which means the industry isn’t due for a major overhaul soon. On benchmarks measuring agentic utility, SLMs routinely outperform larger models, which supposedly makes them particularly well aligned with industrial needs. Barriers to adoption remain, driven largely by existing investments in LLMs, but these are likely to diminish as the economic benefits of deploying SLMs in agentic workflows become clearer. Given the modular nature of agentic systems, it is natural to expect a gradual shift from reliance on LLM generalists toward specialized SLMs across many interfaces.
As long as strict policies on AI are avoided and these benefits are realized, small language models could grow into new markets that today's models cannot easily reach. That would likely change how current large language models, and the companies behind them, operate. This will be an interesting shift to watch for, so keep up to date by joining a Vogi Group.