Large Action Models (LAMs) are advanced AI systems that can understand language and take actions in both digital and physical environments, surpassing traditional Large Language Models (LLMs) which only generate text. LAMs integrate multiple modalities such as language, vision, and motor control, enabling them to perform tasks autonomously and interact with their surroundings. Their applications range from household robotics to assistive technology and they represent a significant step towards achieving Artificial General Intelligence (AGI).

Uploaded by

mohasinmoosi777
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
17 views13 pages

Lam

Large Action Models (LAMs) are advanced AI systems that can understand language and take actions in both digital and physical environments, surpassing traditional Large Language Models (LLMs) which only generate text. LAMs integrate multiple modalities such as language, vision, and motor control, enabling them to perform tasks autonomously and interact with their surroundings. Their applications range from household robotics to assistive technology and they represent a significant step towards achieving Artificial General Intelligence (AGI).

Uploaded by

mohasinmoosi777
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 13

Technical Seminar

On

LARGE ACTION MODELS (LAMs)

Under the Guidance of:
Prof. Mohammed Siraj B
Assistant Professor
Dept. of AIML

Presented By:
MOHASIN ASIF C K
4DM21AI038
INTRODUCTION
• Large Action Models (LAMs) are advanced AI systems designed not only to
understand language, like Large Language Models (LLMs), but also to take
actions in digital or physical environments.
• LAMs go beyond just talking: they take action.
• While traditional LLMs such as ChatGPT specialize in understanding and
generating text, LAMs are designed to understand instructions and act on
them, physically or digitally.
• LAMs combine language, vision, planning, and action into a unified system.
• A LAM does not just read or write: it can see through cameras or screens,
understand the situation, plan a sequence of steps, and execute them
effectively.
LARGE ACTION MODELS 2
LAMs VS LLMs VS AGENTS
LARGE LANGUAGE MODELS (LLMs)
• LLMs such as ChatGPT, GPT-4, and Bard are trained on
massive text data and excel at understanding
questions, holding conversations, writing content,
summarizing, translating, and more, all in natural
language.
• But they are passive: they only respond with words
and do not take real actions or interact with the
environment.
• Example: Ask an LLM "How do I book a flight?"
and it gives you a step-by-step explanation.

LAMs VS LLMs VS AGENTS
AGENTS
• Agents are task-oriented systems that can use LLMs
internally to understand goals and then perform
multi-step actions, such as searching the web, using
APIs, writing code, or calling functions.
• Agents usually follow a loop of "Think → Plan →
Act" and often use tools or plugins to complete
tasks. They may still need significant supervision and
are often specialized for specific workflows.
• Example: An AI agent could search for flights,
compare prices, and send you a booking link using
a browser plugin or API access.
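The "Think → Plan → Act" loop described above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: `FlightSearchTool` plays the role of a real browser plugin or API, and the "think" step would normally query an LLM rather than format a string.

```python
class FlightSearchTool:
    """Hypothetical tool (stands in for a browser plugin or flight API)."""
    def run(self, query: str) -> str:
        return f"3 flights found for '{query}'"

def agent_loop(goal: str, tools: dict, max_steps: int = 3) -> list:
    trace = []
    for step in range(max_steps):
        # Think: interpret the goal (a real agent would query an LLM here).
        thought = f"step {step}: need results for '{goal}'"
        # Plan: pick a tool for the current sub-task.
        tool = tools["flight_search"]
        # Act: execute the tool and observe the result.
        observation = tool.run(goal)
        trace.append(observation)
        if "found" in observation:  # goal satisfied, so stop the loop
            break
    return trace

trace = agent_loop("MYS to BLR on Friday",
                   {"flight_search": FlightSearchTool()})
```

The loop terminates as soon as the observation satisfies the goal, which is what distinguishes an agent from an LLM that simply emits one answer.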

LAMs VS LLMs VS AGENTS
LARGE ACTION MODELS (LAMs)
• LAMs take agents a step further. Instead of just
calling tools or APIs, they interact with the
environment directly — whether that’s a web
interface, a 3D game, or a real robot.
• LAMs combine language, vision, memory, reasoning,
and motor control into one unified system. They are
built to handle real-time feedback and can act
autonomously in complex environments.
• Example: A LAM could see a room through a camera,
understand the instruction “put the red cup on the
table,” and physically move a robotic arm to complete
the task.
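The camera-to-arm example can be illustrated with a toy perceive → plan → act cycle. The `RobotArm` stub and the `scene` dictionary are invented for illustration; a real LAM would consume camera frames and drive learned motor policies rather than string matching.

```python
class RobotArm:
    """Toy actuator that just logs the motions it is asked to perform."""
    def __init__(self):
        self.log = []
    def pick(self, obj):
        self.log.append(f"pick {obj}")
    def place(self, obj, target):
        self.log.append(f"place {obj} on {target}")

def run_lam(instruction: str, scene: dict, arm: RobotArm) -> list:
    # Perceive: locate the object and target mentioned in the instruction.
    obj = next(o for o in scene["objects"] if o in instruction)
    target = next(t for t in scene["surfaces"] if t in instruction)
    # Plan: a two-step motion plan.
    plan = [("pick", obj), ("place", obj, target)]
    # Act: execute each step (a real system would verify with feedback).
    for step in plan:
        getattr(arm, step[0])(*step[1:])
    return arm.log

arm = RobotArm()
log = run_lam("put the red cup on the table",
              {"objects": ["red cup"], "surfaces": ["table"]}, arm)
```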

WHY DO WE NEED LAMs?

1. Language Alone Isn't Enough


• LLMs can answer questions, but they can’t take action. In real life, humans don’t just
talk — we do. We open doors, move objects, and solve problems through action.
• LAMs bring that same ability to machines — the power to understand and act.

2. Bridging the Gap Between AI and the Real World


• LAMs allow AI to interact with tools, robots, software, and environments in a
meaningful, goal-driven way.
• Whether it is controlling a drone, navigating a website, or manipulating real objects,
LAMs act as intelligent assistants in the physical and digital world.

REAL WORLD APPLICATIONS
1. Household Robotics: LAMs can power smart home robots that respond to
natural language and perform physical tasks, like "clean the table" or "sort
the laundry."
2. Web Automation and Digital Assistance: LAMs can control software
interfaces, filling out forms, sending emails, creating reports, or booking
tickets, by understanding user intent and navigating digital environments like
a human user.
3. Assistive Technology for Disabled Users: LAMs can enable AI-powered
systems that help users with physical or visual impairments by controlling
devices, reading screens, or physically interacting with objects on their behalf.
4. Autonomous Scientific Research or Lab Assistance: LAMs could be used in
research labs to carry out multi-step procedures, like preparing chemical
samples, running tests, or recording results.
5. Warehouse and Industrial Robotics: In logistics and manufacturing, LAMs
can power robots to move products, sort packages, or assemble parts by
combining spatial awareness, task planning, and precision movement.
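The web-automation application (point 2) boils down to mapping a parsed user intent onto a sequence of UI actions. `PageStub` and its selectors below are hypothetical; a real system would drive an actual browser through an automation layer.

```python
class PageStub:
    """Simulated web page that records fills and a submit click."""
    def __init__(self):
        self.fields = {}
        self.submitted = False
    def fill(self, selector, value):
        self.fields[selector] = value
    def click(self, selector):
        if selector == "#submit":
            self.submitted = True

def automate_form(intent: dict, page: PageStub) -> bool:
    # Map the parsed intent onto concrete UI actions, like a human user would.
    actions = [("fill", "#name", intent["name"]),
               ("fill", "#email", intent["email"]),
               ("click", "#submit")]
    for act in actions:
        getattr(page, act[0])(*act[1:])
    return page.submitted

page = PageStub()
ok = automate_form({"name": "Asha", "email": "asha@example.com"}, page)
```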

ADVANTAGES OF LAMs

1. Multimodal Intelligence: LAMs can understand text, see images or
video, and perceive environments all at once.
2. End-to-End Task Execution: LAMs don't just answer questions;
they complete full tasks from start to finish.
3. Autonomy and Adaptability: LAMs can adjust actions based on real-
time feedback.
4. Reduced Human Effort & Automation: LAMs can automate both
physical labor and digital workflows, saving time and effort.
5. Generalization Across Tasks: Once trained, LAMs can perform
multiple tasks without needing to be reprogrammed.
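Adapting to real-time feedback (advantage 3) is, at its simplest, a closed control loop: act, observe the error, correct, repeat. `GripperSim` below is an invented toy environment whose feedback signal is just the signed position error.

```python
class GripperSim:
    """Toy environment: reports how far a grasp attempt missed the target."""
    def __init__(self, true_pos: float):
        self.true_pos = true_pos
    def try_grasp(self, pos: float) -> float:
        return self.true_pos - pos  # feedback: signed position error

def grasp_with_feedback(env: GripperSim, guess: float, tol: float = 0.01):
    attempts = 0
    while attempts < 10:
        error = env.try_grasp(guess)
        attempts += 1
        if abs(error) <= tol:  # close enough: the grasp succeeds
            return guess, attempts
        guess += error         # correct the next attempt using feedback
    return guess, attempts

pos, n = grasp_with_feedback(GripperSim(true_pos=0.42), guess=0.10)
```

An open-loop system would execute its first guess and stop; the feedback loop is what lets the same code succeed even when the initial estimate is wrong.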

CURRENT LAMS

1. RT-X (by Google DeepMind and partners): "RT" stands for "Robotic
Transformer." RT-X is a family of models trained on data from
22 different robot types, across real and simulated environments.
2. WebArena (by Carnegie Mellon et al.): A simulated web
environment for training and evaluating agents that interact with
websites using natural language.
3. Voyager (Minecraft + GPT-4): Uses GPT-4 to play Minecraft
autonomously. It explores, builds tools, and improves itself over
time.
4. ALOHA (by Stanford): A low-cost bimanual teleoperation system
for robotic arms. Policies trained on its real video and sensor data
allow robots to pour drinks, open drawers, or push objects around
based on human commands.
FUTURE SCOPE OF LARGE ACTION MODELS
(LAMs)
• General-Purpose Robots for Everyday Life: LAMs could power intelligent
household robots capable of performing a wide range of daily tasks, such as
cooking, cleaning, organizing, helping the elderly, or even teaching children.
• Unified Digital and Physical AI Assistants: LAMs may serve as true
personal AI agents that operate seamlessly across your phone, computer, and
smart home, handling your emails, booking appointments, ordering
groceries, or even helping you cook dinner by controlling your smart kitchen
devices.
• High-Impact Fields (Healthcare, Disaster Response, Space): LAMs could
be used in critical environments where human presence is risky, such as
disaster zones, remote healthcare assistance, or even extraterrestrial
exploration.
• Towards Artificial General Intelligence (AGI): LAMs represent a step
toward true generalist AI: systems that can handle any task a human can do,
across both the digital and physical world.
CONCLUSION
Large Action Models (LAMs) represent a major leap in artificial intelligence by enabling
systems that can understand, plan, and act in both the physical and digital world. Unlike
traditional AI that only processes information, LAMs bridge language and action — turning
instructions into real-world outcomes. As research progresses, LAMs will unlock new
possibilities in robotics, personal assistants, science, and more. They mark the beginning of
AI systems that don’t just think — they do.

"In the future, you won’t just talk to AI — you’ll work with it, walk with it, and maybe
even live with it."

REFERENCES

• Tao, Y., Yang, J., Ding, D., & Erickson, Z. (2025). LAMS: LLM-Driven Automatic
Mode Switching for Assistive Teleoperation. arXiv. https://arxiv.org/abs/2501.08558
• Wang, L., Yang, F., Zhang, C., Lu, J., Qian, J., He, S., Zhao, P., Qiao, B., Huang, R.,
Qin, S., Su, Q., Ye, J., Zhang, Y., Lou, J.-G., Lin, Q., Rajmohan, S., Zhang, D., &
Zhang, Q. (2025). Large Action Models: From Inception to Implementation. arXiv.
https://arxiv.org/abs/2412.10047
• Zhou, W., Wu, Z., Li, Y., Zhou, Z., Chen, W., Zhao, J., & Qiu, X. (2024). xLAM: A
Family of Large Action Models to Empower AI Agent Systems. arXiv.
https://arxiv.org/abs/2409.03215

THANK YOU
