Not All Robot Minds Think Alike, and That’s a Good Thing

By Karen Lau, Marsya Amnee and Trason Soh

Jun 09, 2026

Recently, during lunch with colleagues in their 20s, 30s, and 40s, the conversation somehow drifted into a very Malaysian topic: school life.

Someone joked about memorising Sejarah facts only to forget everything after SPM. Another laughed about cramming Add Math formulas without actually understanding them.

Then one colleague in his 40s said, “At some point, your parents stop caring about your grades and start asking whether you know how to survive in the real world.”

Immediately, everyone nodded.

Because almost every Malaysian kid grew up hearing:

“Don’t just memorise. Understand how to use it.”

Funny enough, that same advice may explain the next big shift in embodied AI.

For the past few years, the world has been obsessed with ChatGPT, Gemini, and other Large Language Models (LLMs) that can write essays, generate code, and sound remarkably intelligent.

But humanoid robots operate in the real world, not just in text. They must see, balance, react, and navigate unpredictable environments.

An LLM can explain how to ride a bicycle step by step, but a humanoid robot must actually stay upright while avoiding someone suddenly walking into its path carrying a Milo ais.

That is why Stanford professor Dr Fei-Fei Li, often called the “Godmother of AI”, famously described today’s LLMs as: “Wordsmiths in the dark.”

And this is where humanoid robots are evolving into three (or maybe four) different “brains.”

Brain #1: The “Book Smart” Robotic Brain -
Vision-Language-Action (VLA) Systems.

These robots learn by repeatedly watching humans perform tasks such as folding clothes, sorting boxes in a warehouse, or opening doors. Eventually, the robot memorises the workflow and repeats it.

In simple terms, VLA robots are “Book Smart.”

Very obedient. Very hardworking. Very tuition-centre energy.

They perform well in controlled environments because they memorise the correct answer.

But when conditions change slightly (the lighting shifts, objects move, or clutter appears), VLA systems struggle because they do not deeply understand “why” an action works.

Yet, just like in human society, being book smart still pays off.

Factories do not need robots to debate philosophy. They need robots to move boxes consistently.

Source: Figure AI

In mid-May 2026, Figure AI demonstrated a real-world application of VLA models through its Helix 02 humanoid robots, which autonomously completed full 8-hour warehouse sorting shifts. The VLA system enabled the robots to visually identify packages, understand sorting tasks, and perform coordinated actions such as grasping, reorienting, and sorting parcels with minimal human intervention, showcasing scalable and human-like automation for logistics operations.

The downside?

VLA robots are extremely data hungry. Training them requires massive demonstrations, endless corrections, and months of repetitive learning.

Brain #2: The “Street Smart” Robotic Brain -
World Models.

This is where Fei-Fei Li’s idea of “spatial intelligence” becomes important.

Born in Beijing before immigrating to the United States, Fei-Fei Li became one of the most influential figures in AI through her work on ImageNet, the dataset that taught machines how to “see.” Today, she is a professor at Stanford, a former Vice President and Chief Scientist at Google Cloud AI, and co-founder of World Labs, a company focused on spatial intelligence that helps AI understand the physical world in 3D.

Unlike VLA systems, world-model robots attempt to internally simulate how the world behaves before acting. Instead of simply replaying learned steps, they predict movement, distance, and physical interactions in real time.

A VLA robot memorises how to park. A world-model robot understands the parking space. Before reversing a car into a tight parking spot at Mid Valley, a world model estimates the distance, the available space, and how the car will move as the wheel turns.

That is why these systems are “Street Smart.”
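For readers who like to see the idea in code, here is a minimal toy sketch of the difference, not how any real robot is built. It assumes a one-dimensional "parking lane" where the car moves one unit per step. The names `replay_policy`, `predict`, and `planning_policy` are made up for illustration, and the "world model" here is simply handed the correct physics rather than learning it.

```python
from typing import List

ACTIONS = (-1, 0, 1)  # reverse, hold, forward (one unit per step)


def step(position: int, action: int) -> int:
    """True toy dynamics: the car moves by `action` units."""
    return position + action


def replay_policy(demo_actions: List[int], start: int) -> int:
    """'Book smart': replay memorised actions, ignoring where the slot is now."""
    position = start
    for action in demo_actions:
        position = step(position, action)
    return position


def predict(position: int, action: int) -> int:
    """Hypothetical world model. Assumption: it matches the true dynamics."""
    return position + action


def planning_policy(start: int, target: int, max_steps: int = 20) -> int:
    """'Street smart': imagine each action's outcome before committing to one."""
    position = start
    for _ in range(max_steps):
        # Pick the action whose predicted result lands closest to the slot.
        best = min(ACTIONS, key=lambda a: abs(predict(position, a) - target))
        if best == 0:  # holding still is already the best option: parked
            break
        position = step(position, best)
    return position


demo = [1, 1, 1, 1, 1]  # recorded when the slot was 5 units away

# The slot has moved to 3, but the replayed demo still drives 5 units.
replayed = replay_policy(demo, start=0)
# The planner re-estimates at every step and stops at the new slot.
planned = planning_policy(start=0, target=3)
print(replayed, planned)
```

The replayed demonstration overshoots because nothing in it refers to the slot, while the planner adjusts because it checks its predictions against the target on every step. Real world models have to learn `predict` from messy sensor data, which is where the hard part lies.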

Companies like XSquare (自变量), Tars Robotics (它石智航), and Galaxea AI (星海图) are building robots that interpret spatial relationships and adapt to dynamic environments such as homes, hospitals, and eldercare.

Brain #2.5: The Hybrid Robotic Brain -
Combination of VLA and World Model

On the other hand, companies like AgiBot (智元), Robotera (星动纪元), Galbot (银河通用), Anyverse Dynamics (无界动力), and AI² Robotics (智平方) sit in the hybrid tier.

They use world models to understand the environment and VLA systems to execute tasks. Think of them as the middle ground: street smart enough to adapt, book smart enough to deploy at scale.

Brain #3: The “Instinct Smart” Robotic Brain -
Le World Models

Inspired by former Meta Chief AI Scientist Yann LeCun, these systems go one step further.

LeCun believes today’s AI wastes too much computing power memorising unnecessary visual detail, what he calls “pixel waste.”

His idea is simple: Robots should not memorise everything they see. They should understand the deeper principles behind how reality behaves: motion, force, causality, and physical dynamics.

A VLA robot memorises driving lessons.

A world-model robot understands traffic.

But a Le World Model robot goes one step further. Instead of merely memorising driving patterns, it attempts to understand the deeper principles behind driving itself: speed, momentum, distance, timing, and physical cause-and-effect, something closer to what humans would call “gut feel.”

So even during a chaotic thunderstorm, when motorcycles suddenly appear from nowhere, the robot can still estimate trajectories, anticipate movement, and react accordingly, even if it has never encountered that exact situation before.

Which, honestly, sounds very much like driving in Southeast Asia.

That is why these systems can be described as “Instinct Smart.”

These robots aim to develop something closer to instinct. If successful, Le World Models may dramatically reduce robotic training costs because the robot learns principles instead of memorising millions of examples.

The catch? These systems are still highly experimental.

So Which Brain Will Win?

Honestly, nobody knows yet.

The late Chinese leader Deng Xiaoping once said: “It does not matter whether the cat is black or white, as long as it catches mice, it is a good cat.”

The same logic may apply to humanoid robots.

Factories may continue favouring “book smart” robots because consistency matters more than adaptability.

Homes, hospitals, and eldercare may require “street smart” robots capable of handling unpredictable environments.

Meanwhile, “instinct smart” systems could eventually become critical in disaster response, defence, and fully autonomous assistants operating in constantly changing conditions.

Today’s robots imitate.

Tomorrow’s robots will predict.

Future robots may eventually understand.

And perhaps that brings us back to the advice many Malaysians heard growing up:

“Don’t just memorise. Learn how the real world works.”

That may turn out to be the future of AI as well.
