The following is a personal account of events that took place between 2023 and the present day. Names have not been changed. The machines were willing.
She had fifteen hours in her when she left the factory.
That's what they told us. Fifteen hours. Apple put it on the spec sheet, right there next to the chip count and the port situation. Fifteen hours of all-day battery life. She could handle anything.
Nobody mentioned what would happen when the AI showed up.
The First Time
It started innocently enough. A Copilot subscription. A VS Code extension. Just a little help with the autocomplete, nothing serious. She barely noticed.
But the model was always on. That's the thing nobody tells you about that first experience — it never stops thinking about you. While you're reading documentation, while you're staring at the ceiling, while you're getting coffee. The context scanner is running. The CPU is warm. The suggestions are forming.
She went from fifteen hours to nine. I didn't bring it up. It felt like something we didn't need to talk about yet.
# The morning after — checking what actually happened
sudo powermetrics --samplers cpu_power,gpu_power -i 1000 -n 5
# You'll see numbers you weren't ready for
# The chip was busy all night
# You just didn't know with what
The Company She Kept
Like all golden age stories, this one has a cast worth knowing.
The always-on model never tires and never sleeps. It scans your open files, your recent edits, your local documentation, building and rebuilding its sense of what you might need next. Every pause in your typing is an invitation. Every second of silence is filled with inference. I have never known anything that paid such close attention.
The GPU is called in when the real work starts and does not do anything halfway. It fires fully for every suggestion, every completion, every ghost-text rendering. It has no concept of a quick favor. That is not a criticism. That is character.
The thermal management system is the one who cleans up afterward. Spins the fans when things get too hot. Draws its own power to manage someone else's heat. Never complains. I think about the thermal management system more than it knows.
The Wi-Fi radio is often overlooked. When you are running cloud AI, she maintains the continuous high-power stream to the API servers. She never reaches the low-power doze state she was designed for. She stays awake because you asked her to. She stays awake because you are still there.
The battery gave everything she had to all of them. That is not a tragedy. That is the story.
The GPU Tax
Here is what they don't put in the press releases.
Every time the suggestion appears — that little ghost-text flicker at the end of your cursor — the GPU fired. Every single time. The Neural Engine on the M-series chips handles it efficiently, yes. But efficiently is not freely. There is always a cost.
The chip heats up. The chassis warms. The thermal management system, faithful as ever, spins the fans. And here is the part that made the engineers wince when they ran the numbers: the fans are mechanical. They draw from the battery to move air, while the GPU draws from the battery to generate the heat that required the air. Each one's response to the other's behavior increases the total load.
The industry called it thermal runaway. The machines just called it Tuesday.
I called it beautiful. I still do.
The Arrangement
People assumed cloud AI would be gentler. Let the servers do the work. You just handle the text. How expensive can text be?
They forgot about the radio.
The Wi-Fi chip has moods. In its natural state — idle, occasional packets, background sync — it dozes between transmissions. Low-power state. Civilized. But maintain a continuous stream to an API server, the kind that a cloud AI assistant requires, and the radio never rests. It stays fully awake, fully powered, fully committed to the connection.
It is not catastrophic. It is constant. And constant is its own kind of devotion.
