Your AI Glasses Need 300 Watts. This Chip Uses Half a Milliwatt.
Neuromorphic processors — silicon modeled after neurons rather than logic gates — are achieving 40–100× the energy efficiency of GPUs for inference tasks. The race to put always-on AI into wearables isn't about smarter models. It's about dumber chips that happen to think like brains.
0.42 milliwatts.
That's the idle power consumption of SynSense's Speck chip — a neuromorphic vision processor developed jointly by the Chinese Academy of Sciences and SynSense AG in Zurich. Under active load, processing a live camera feed for gesture recognition or object detection, it draws 0.70 milliwatts. Not watts. Not hundreds of milliwatts. Sub-milliwatt, active, real-time inference.
For context: the NVIDIA H100 GPU that powers most AI inference today consumes 700 watts at peak. That's a million-fold difference. On a single coin cell battery, the Speck chip could idle for roughly two months.
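The coin-cell figure is worth sanity-checking. Assuming a standard CR2032 cell (~225 mAh at a nominal 3.0 V, figures not from the article), the idle runtime is simple division:

```python
# Back-of-the-envelope battery life for the Speck chip at idle.
# Assumption (not from the article): a CR2032 coin cell holds
# roughly 225 mAh at a nominal 3.0 V, i.e. about 0.675 Wh.
CELL_CAPACITY_WH = 0.225 * 3.0   # ~0.675 Wh for a CR2032
IDLE_POWER_W = 0.42e-3           # 0.42 mW idle draw

hours = CELL_CAPACITY_WH / IDLE_POWER_W
days = hours / 24
print(f"{hours:.0f} hours = about {days:.0f} days")  # on the order of two months
```

Real-world runtime would be shorter still (cell self-discharge, regulator losses), but the order of magnitude holds: months of idle operation on a battery that couldn't power a GPU for one second.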
This isn't a prototype in a press release. It was published in Nature Electronics in June 2024 by Li Guoqi and colleagues, with working silicon and benchmark results. The question isn't whether neuromorphic chips work. It's why nobody outside research labs has heard of them.
The Energy Wall
Every company building AI wearables — Meta, Apple, Google, Snap — hits the same wall. Running a large language model or vision model on a GPU requires tens to hundreds of watts. A pair of glasses has a battery budget measured in single-digit watt-hours. The math doesn't work.
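To see why the math doesn't work, assume a hypothetical 2 Wh glasses battery (a figure in the single-digit watt-hour range the article describes) and compare runtimes at GPU-class versus neuromorphic-class loads:

```python
# Runtime math for a glasses-sized battery. The 2 Wh capacity is an
# assumed, illustrative figure, not a specific product's spec.
BATTERY_WH = 2.0

def runtime_hours(load_watts: float) -> float:
    """Hours of continuous operation at a constant load."""
    return BATTERY_WH / load_watts

print(runtime_hours(10.0))    # GPU-class inference: 0.2 h, i.e. 12 minutes
print(runtime_hours(0.7e-3))  # Speck-class inference: ~2,857 h, i.e. months
```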
The industry's current solution is offloading: send audio or video to a phone or cloud server, run inference there, beam results back. It works, but it adds latency, kills the experience when connectivity drops, and means your glasses are fundamentally a dumb display tethered to a smart phone.
Neuromorphic chips attack this from a completely different angle. Instead of shrinking a GPU until it fits in a glasses frame, they build processors that work like neurons — spiking only when there's something to process, consuming zero energy when nothing changes in the input. A static scene costs nothing. A sudden gesture costs a fraction of a milliwatt for a fraction of a second.
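The event-driven principle can be sketched in a few lines. This is a toy illustration of the idea, not any vendor's API: work is charged only for inputs that change, so a static scene generates no events and therefore no computation.

```python
# Toy event-driven sketch: only changed pixels produce events.
def events(prev_frame, frame, threshold=0):
    """Yield (index, delta) for every pixel that changed between frames."""
    for i, (a, b) in enumerate(zip(prev_frame, frame)):
        if abs(b - a) > threshold:
            yield i, b - a

static = [5, 5, 5, 5]
moved  = [5, 9, 5, 5]

print(len(list(events(static, static))))  # 0 events: idle scene, ~zero work
print(len(list(events(static, moved))))   # 1 event: only the change is processed
```

A frame-based processor would run the full pipeline on both frames; the event-driven one does work proportional to what changed.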
The Hardware Landscape
| Chip | Organization | Neurons | Power | Key Result |
|---|---|---|---|---|
| Speck | SynSense / CAS | ~327K | 0.42 mW idle, 0.70 mW active | Real-time vision at sub-milliwatt (Nature Electronics, 2024) |
| Loihi 2 | Intel | 1M per chip | ~1W per chip | Hala Point: 1,152 chips, 1.15B neurons at Sandia National Labs |
| NorthPole | IBM Research | 22B transistors | ~12W (12nm) | 72.7× more energy-efficient than next-lowest-latency GPU; <1ms/token on 3B-param LLM (IEEE HPC, 2024) |
| Akida AKD1500 | BrainChip | — | <1W (edge SoC) | AkidaTag wearable platform w/ Nordic nRF5340 (March 2026) |
| TrueNorth | IBM (legacy) | 1M | ~70 mW | Original neuromorphic demonstrator (2014) |
Three things jump out of this table.
First, the power numbers span five orders of magnitude depending on what you're trying to do. Speck does vision at sub-milliwatt. NorthPole does LLM inference at 12 watts. Both are "neuromorphic" in the broad sense — event-driven, brain-inspired — but they're solving fundamentally different problems.
Second, Intel built the biggest neuromorphic system in the world and put it in a weapons lab. Hala Point, installed at Sandia National Laboratories in April 2024, packs 1,152 Loihi 2 chips containing 1.15 billion artificial neurons and 128 billion synapses. It's 10× faster and denser than its predecessor, Pohoiki Springs. Sandia's interest is in real-time optimization problems — logistics, sensor fusion, autonomous systems — the kind of stuff the Department of Defense runs on supercomputers today.
Third, BrainChip just shipped a wearable reference platform. The AkidaTag, announced March 10, 2026, pairs their Akida AKD1500 neuromorphic coprocessor with Nordic Semiconductor's nRF5340 wireless SoC. It's designed for always-on gesture recognition, keyword spotting, and sensor fusion in devices the size of a hearing aid. BrainChip (ASX: BRN) is the only publicly traded pure-play neuromorphic chip company, trading at A$0.089 as of this writing.
The IBM Results That Should Terrify NVIDIA
In September 2024, IBM Research published results from their NorthPole chip that reframed the entire neuromorphic conversation.
Running a 3-billion-parameter LLM derived from IBM's Granite-8B-Code-Base, a rack of 16 NorthPole chips achieved:
- Sub-1ms latency per token — 46.9× faster than the next most energy-efficient GPU
- 28,356 tokens per second throughput in a standard 2U server
- 72.7× more energy-efficient than the next lowest-latency GPU
Read that again. Not 72.7% more efficient. 72.7 times.
NorthPole is fabricated on a 12nm process — several process generations behind NVIDIA's 4nm H100. And it still destroyed the efficiency benchmarks. IBM Fellow Dharmendra Modha, who leads the NorthPole team, told IEEE Spectrum: "What is essential here is qualitative orders of magnitude in improvement."
The earlier NorthPole results, published in Science in October 2023, showed 25× energy efficiency gains over comparable GPUs on ResNet-50 and YOLOv4 — image recognition and object detection, the exact workloads that wearable AI needs. The 2024 LLM results proved the architecture generalizes beyond vision.
The Wearables Gap
Here's what a pair of AI glasses needs to do, continuously, on a battery the size of your pinky finger:
| Task | GPU Power (typical) | Neuromorphic Power (demonstrated) | Gap |
|---|---|---|---|
| Always-on wake word | 50–200 mW | 0.5–2 mW (Akida) | 100× |
| Gesture recognition | 500 mW–2W | 0.7 mW (Speck) | 700–2,800× |
| Object detection | 1–5W | 10–50 mW (Loihi 2) | 100× |
| On-device LLM (3B) | 5–15W | ~0.75W (NorthPole, projected at edge scale) | 7–20× |
The sensor-level tasks — wake word, gesture, basic object detection — are already neuromorphic-ready. Speck and Akida can handle them today at power levels that don't register on a battery gauge. The hard problem is on-device language models, which is where NorthPole's 2024 results become critical: they prove the architecture works for LLMs, even if the current chip is too big for a glasses frame.
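The "Gap" column in the table above is just the ratio of the two power columns. A quick check, with every figure converted to milliwatts:

```python
# Verify the gap ratios from the wearables table (all values in mW).
# (GPU low, GPU high), (neuromorphic low, neuromorphic high)
rows = {
    "wake word":        ((50, 200),    (0.5, 2)),
    "gesture":          ((500, 2000),  (0.7, 0.7)),
    "object detection": ((1000, 5000), (10, 50)),
}
for task, ((g_lo, g_hi), (n_lo, n_hi)) in rows.items():
    lo, hi = round(g_lo / n_lo), round(g_hi / n_hi)
    print(f"{task}: {lo}x to {hi}x")
```

The gesture row works out to roughly 714× to 2,857×, which the table rounds to 700–2,800×; the other rows are clean 100× ratios at both ends of the range.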
Why Isn't This Everywhere?
Because the software ecosystem doesn't exist.
GPUs won the AI hardware war not because they're the best architecture for neural networks — they're not — but because CUDA gave every researcher on Earth a common programming model. PyTorch and TensorFlow abstract over CUDA. Every model, every framework, every tutorial assumes GPU execution. The entire $200B+ AI infrastructure stack is built on the assumption that inference runs on NVIDIA hardware.
Neuromorphic chips require spiking neural networks (SNNs), which use a fundamentally different computational model. You can't take a PyTorch model and run it on Loihi 2 without converting it — and conversion loses accuracy. The open-source neuromorphic tools (Intel's Lava framework, SynSense's Sinabs) are years behind CUDA in maturity. Most ML engineers have never written a spiking network.
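What makes the computational model different is the neuron itself. Below is a minimal leaky integrate-and-fire (LIF) neuron, the standard textbook building block of SNNs, written framework-free as a sketch; real toolchains like Lava and Sinabs wrap dynamics of this kind in trainable layers, and the exact parameter values here are illustrative.

```python
# Minimal leaky integrate-and-fire (LIF) neuron: membrane potential
# integrates weighted input spikes, leaks over time, and fires (then
# resets) when it crosses a threshold. Parameters are illustrative.
def lif(spike_train, weight=0.6, leak=0.8, threshold=1.0):
    """Return the output spike train for a binary input spike train."""
    v, out = 0.0, []
    for s in spike_train:
        v = leak * v + weight * s   # leak old charge, integrate new input
        if v >= threshold:          # fire and reset on threshold crossing
            out.append(1)
            v = 0.0
        else:
            out.append(0)
    return out

print(lif([1, 1, 1, 0, 0, 1]))  # -> [0, 1, 0, 0, 0, 0]
```

Note the contrast with a standard artificial neuron: there is no dense matrix multiply every timestep. Input silence means the potential just decays, and the neuron does essentially nothing — which is exactly the property that makes the hardware so cheap to idle, and exactly why a trained PyTorch model can't run on it without conversion.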
This is the same pattern that held back GPU computing for six years. NVIDIA shipped the first CUDA-capable GPU in 2006. Deep learning didn't explode until 2012, when AlexNet showed the world what GPUs could do for neural networks. The hardware was ready years before the software caught up.
Neuromorphic hardware is in its 2008 moment. The silicon works. The benchmarks are extraordinary. The software is not.
The Market
Meticulous Research projects the neuromorphic computing market will reach $17.4 billion by 2032, growing at 22.2% CAGR from a small base. The Business Research Company puts the 2025 market at $7.69 billion. The discrepancy tells you nobody really knows — the market is too nascent for clean sizing.
What's clearer is the edge AI market that neuromorphic chips would serve: $26.6 billion in 2025, growing to $107.4 billion by 2029, according to MarketsandMarkets. The subset that runs on batteries — wearables, IoT sensors, hearing aids, smart glasses — is where neuromorphic has an unassailable advantage. No amount of GPU shrinkage gets you to 0.42 milliwatts.
The Bottom Line
The AI glasses on your face today are basically a camera, a microphone, and a Bluetooth radio. The AI runs somewhere else — your phone, Meta's servers, Google's cloud. Neuromorphic chips are the technology that could make the glasses themselves intelligent, running vision and language models locally at power levels that don't drain a miniature battery in an hour.
The hardware exists. Intel's built a brain-scale system. IBM's chip outperforms GPUs by orders of magnitude. SynSense has sub-milliwatt vision. BrainChip just shipped a wearable reference platform. The bottleneck is the same one that held back GPU computing for six years: the software stack hasn't caught up to the silicon.
When it does — and it will — the 700-watt GPU inference paradigm will look as quaint as vacuum tubes.
Sources
- Li Guoqi et al., "A brain-inspired chip achieves real-time vision at sub-milliwatt power," Nature Electronics, June 2024. nature.com
- Intel, "Intel Builds World's Largest Neuromorphic System," BusinessWire, April 17, 2024. businesswire.com
- Sandia National Laboratories, "Sandia receives brain-based computing system with 1.15 billion artificial neurons from Intel," April 2024. sandia.gov
- Dharmendra Modha et al., "NorthPole LLM Inference Results," IBM Research Blog, September 26, 2024. ibm.com
- Modha et al., "Neural inference at the frontier of energy, space, and time," Science, Vol. 382, October 2023. science.org
- BrainChip, "BrainChip Enables the Next Generation of Always-On Wearables With the AkidaTag Reference Platform," March 10, 2026. brainchip.com
- Open Neuromorphic, "SynSense Speck Hardware Overview." open-neuromorphic.org
- Meticulous Research, "Neuromorphic Computing Market Size & Growth 2026–2036." meticulousresearch.com
- The Business Research Company, "Neuromorphic Computing Global Market Report 2025." thebusinessresearchcompany.com