We are tapping into supercomputers for tasks we used to do in our heads. According to OpenAI CEO Sam Altman, simple courtesies such as “thank you” typed into ChatGPT cost the company millions of dollars, and every one of those extra responses adds to the mounting processing pressure. In February of this year, Altman announced that the company had already run out of GPUs. OpenAI has plans to increase its processing power, but that raises the question: What happens when the machines start to slow down, and the power runs dry?
It usually starts with something small. A request to summarize a case. A quick legal email you do not feel like writing. A contract clause you want phrased a little better. The AI delivers what you need in seconds. Out of instinct, or perhaps misplaced politeness, you type back: “Thanks.”
But that simple gesture is not just a harmless closing line. Behind it is a machine that springs back into motion. That “thank you” is parsed, tokenized, and processed by tens of billions of parameters across multiple data centers, routed through servers that consume megawatts of power just to produce the sentence, “You’re welcome.” Multiply that single moment by the hundreds of millions of users doing the exact same thing around the world, and suddenly politeness has an environmental price tag.
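To make that concrete, here is a minimal sketch using OpenAI’s open-source tiktoken tokenizer. It only counts tokens, and the real cost comes when each token is pushed through the full model to generate a reply, but it shows that even a throwaway pleasantry is real input the system must process.

```python
# A minimal sketch using OpenAI's open-source tiktoken library.
# It counts the tokens a pleasantry turns into; each of those tokens
# must then flow through the full model before a reply comes back.
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-4")

for text in ["Thanks.", "Thank you so much!", "You're welcome."]:
    tokens = encoding.encode(text)
    print(f"{text!r} -> {len(tokens)} tokens: {tokens}")
```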
Artificial intelligence may seem invisible, frictionless, almost magical. But it is not free. It runs on energy. And, as everyone knows, that energy is not infinite.
When Computation Feels Effortless, but Costs the Earth
Generative AI tools like ChatGPT, Claude, Gemini, and others are powered by large language models that rely on astonishing amounts of computation to produce their outputs. These models are not simple if-then engines. They are probabilistic pattern recognition networks operating across massive neural architectures trained on countless books, websites, legal filings, codebases, and conversations.
And every prompt you enter, no matter how short or seemingly insignificant, activates a fraction of that entire machine. Researchers have estimated that a single query to GPT-4 may consume between 0.001 and 0.01 kWh of electricity. At the top of that range, a single ChatGPT prompt uses roughly as much energy as a 10-watt LED bulb left burning for a full hour. That may not sound like much, until you realize OpenAI’s services alone field over four hundred million users weekly, each generating anywhere from a few prompts to dozens or hundreds per session.
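The aggregate arithmetic is easy to sketch. Every input below is a loose assumption, not a measurement: the per-query range comes from the estimates above, the user count from OpenAI’s reported figures, and the ten-prompts-per-week average is simply a guess. Even rough inputs give a sense of scale.

```python
# Back-of-envelope estimate built on the rough figures cited above.
# All inputs are assumptions, not measurements.
KWH_PER_QUERY_LOW = 0.001   # low-end estimate per prompt
KWH_PER_QUERY_HIGH = 0.01   # high-end estimate per prompt
WEEKLY_USERS = 400_000_000  # reported weekly active users
PROMPTS_PER_USER = 10       # assumed weekly average per user

weekly_queries = WEEKLY_USERS * PROMPTS_PER_USER
low_kwh = weekly_queries * KWH_PER_QUERY_LOW
high_kwh = weekly_queries * KWH_PER_QUERY_HIGH

# A typical US household uses roughly 10,500 kWh per year (~200 kWh/week).
HOUSEHOLD_KWH_PER_WEEK = 200

print(f"Estimated weekly energy: {low_kwh:,.0f} to {high_kwh:,.0f} kWh")
print(f"Roughly {low_kwh / HOUSEHOLD_KWH_PER_WEEK:,.0f} to "
      f"{high_kwh / HOUSEHOLD_KWH_PER_WEEK:,.0f} households' weekly usage")
```

Even the conservative end of that range works out to the weekly electricity consumption of a small town, spent entirely on chat prompts.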
Many of those prompts are productive. Some are transformative. But a large portion are what you could call casual or even wasteful. People ask AI to generate jokes, write tweets, suggest what to cook, or answer questions they could Google in five seconds. They use it to write to-do lists or copy-and-paste bits of code that will never be tested. They say thank you. They say goodnight.
This is not just about behavior; it is about infrastructure. Those idle, throwaway queries still demand electricity, water for cooling, and constant chip cycles on GPU farms around the world. As usage grows exponentially, so does its carbon footprint. We are, quite literally, using planetary-scale computing to avoid mild inconvenience.
The Machines Are Not Tired, but the Grid Might Be
What most people don’t realize is that the explosion in AI adoption is pushing against very real physical limits. Unlike social media apps or document-sharing tools, generative AI does not scale cheaply. It runs on a finite supply of high-end chips, primarily from NVIDIA, and requires robust power and cooling systems to function at scale.
This is why leaders in the field, including Altman, have been sounding the alarm. Not about existential threats, but about compute shortages. There are not enough GPUs. There is not enough energy. And there is not enough water to cool the data centers being built to fuel this boom.
In early 2024, Altman stated plainly: “We’re so far from the compute we need.” The company has since launched a multi-billion-dollar effort to develop its own chips and secure long-term access to compute infrastructure. This is not a nice-to-have; it is a survival strategy in a world where silicon is becoming the new oil.
Already, some users have noticed intermittent slowdowns and rate limits. OpenAI has introduced “downtime” periods for free users. Claude and Gemini limit query depth on their no-cost tiers. And you can expect these throttles to become more common as demand outpaces infrastructure growth.
This is not science fiction. It is the real-world economics of energy and access.
Who Will Be Cut Off First?
If these shortages become more severe, which seems likely in the short to medium term, we can expect AI platforms to make hard choices about who gets priority. The first to lose access will not be enterprise clients. It will be everyday users relying on free or ad-supported tools.
Imagine a world where GPT-4 becomes a luxury product. Where legal teams must pay premium rates to ensure uninterrupted access during case preparation. Where casual users get shunted to slower models, or locked out entirely during peak hours. This tiered future is already beginning to materialize. ChatGPT Plus subscribers gain access to GPT-4 while free users remain limited to GPT-3.5. Similar distinctions exist for Claude and Gemini.
And that raises a deeper concern. What happens to the AI companies whose business models are built on subsidized or free access? If compute becomes expensive and scarce, how many of those startups, the writing assistants, contract generators, and research tools, can afford to survive? Will they pass the costs on to users? Will they fold? Or will they be acquired and absorbed by the larger players who control the infrastructure?
AI has enjoyed a golden age of abundance. But abundance never lasts forever.
A New Kind of Digital Divide
There is a more human cost to all this. If access to high-functioning AI becomes expensive and exclusive, we risk creating a new kind of inequality, not of wealth or information, but of cognitive power. The firms and individuals who can afford premium AI will gain enormous advantages in productivity, research, litigation, and client service. Those who cannot may fall behind, not because they lack skill or knowledge, but because they cannot afford to tap into the same digital brainpower.
This is not just a question of comfort. It is a question of access to justice, of fair representation, of whether technology becomes a democratizing force or a deepening wedge between the haves and the have-nots.
The Polite Apocalypse
So back to that original “thank you.”
No, you do not need to stop being polite. But maybe it is time we rethink how and when we summon this immense computational power. Maybe it is time to treat AI not as a frictionless background utility, but as a resource. A powerful, limited, energy-intensive resource that should be used wisely.
We are in the habit of asking machines to do everything for us, even the easy stuff. But intelligence, the real, transformative kind, is not cheap. It never was.
The next time you prompt your favorite chatbot, ask yourself: Could I do this on my own? Could I write that sentence? Skim that page? Remember that case? Because someday soon, you might not have the choice.