Source: JND

Microsoft LAM: As AI development advances, large language models (LLMs) have grown rapidly, powering chatbots, text generation, and code writing. LLMs are proficient at understanding and generating text, but they have gaps when it comes to performing tasks in real-world environments. To address this, researchers at Microsoft have created what they call a Large Action Model (LAM), an AI model capable of operating Windows programs on its own.


What do we understand by LAMs?

Traditional AI models primarily process and generate text, but LAMs take things further: they can turn user requests into real actions, ranging from operating software to controlling robots. The concept itself is not new, but Microsoft's LAM is the first model specifically tuned to work with Microsoft Office products. LAMs gained traction in the first half of 2024, when Rabbit launched an AI device capable of interacting with mobile applications without user intervention.

LAM models can understand inputs such as text, voice, or images, convert those requests into detailed step-by-step plans, and adjust their approach in real time. Simply put, LAMs are AIs designed to act, not just to understand inputs.

The research paper "Large Action Models: From Inception to Implementation" explains that these models are designed to interact with both digital and physical environments. Instead of simply asking an AI how to make a PowerPoint presentation, you could ask it to open the app, create the slides, and format them according to your preferences. At their core, Large Action Models (LAMs) integrate three key elements: understanding intent, meaning they accurately interpret user commands; action generation, allowing them to plan and execute steps; and dynamic adaptation, enabling them to adjust based on feedback from their surroundings.
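The three elements above can be pictured as a simple agent loop. This is only an illustrative sketch; every class and function name here is an assumption for clarity, not part of Microsoft's actual LAM implementation.

```python
# Illustrative sketch of the three LAM elements: intent understanding,
# action generation, and dynamic adaptation. All names are hypothetical.

def understand_intent(user_request: str) -> str:
    """Interpret a natural-language request as a task goal (stubbed)."""
    return "create_presentation" if "PowerPoint" in user_request else "unknown"

def generate_actions(goal: str) -> list[str]:
    """Plan the concrete steps needed to achieve the goal (stubbed)."""
    plans = {
        "create_presentation": ["open_app", "create_slides", "format_slides"],
    }
    return plans.get(goal, [])

def execute_with_adaptation(actions: list[str]) -> list[str]:
    """Execute each step; a real agent would re-plan on environment feedback."""
    completed = []
    for action in actions:
        succeeded = True  # placeholder for feedback from the environment
        if not succeeded:
            continue  # dynamic adaptation: skip or re-plan a failed step
        completed.append(action)
    return completed

goal = understand_intent("Make a PowerPoint presentation about Q3 sales")
steps = execute_with_adaptation(generate_actions(goal))
print(steps)  # the three planned steps, executed in order
```

In a real LAM, each stage would be backed by a trained model and live feedback from the Windows GUI rather than hard-coded stubs.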

How are LAMs developed?

Creating LAMs is a far more complex task than building LLMs, and their development takes place in five stages. Data forms the foundation of any AI, and LAMs require two types: task-plan data, which captures the high-level steps for a task, such as opening a Word document and highlighting text; and task-action data, which consists of concrete, executable steps. During training, the models go through supervised fine-tuning, reinforcement learning, and imitation learning. Before deployment, they are thoroughly tested in controlled environments, then integrated into agent systems, such as Windows GUI agents, to interact with various environments. Finally, the models are put to the test in real-world scenarios to assess their adaptability and performance.
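The distinction between the two data types can be sketched as simple records. The field names and values below are assumptions chosen for illustration, not Microsoft's actual training-data schema.

```python
# Hypothetical examples of the two LAM training-data types.

# Task-plan data: the high-level steps for completing a task.
task_plan = {
    "task": "highlight text in a Word document",
    "plan": [
        "open the Word document",
        "select the target text",
        "apply highlighting",
    ],
}

# Task-action data: concrete, executable operations for the same task.
task_actions = [
    {"action": "launch", "target": "WINWORD.EXE"},
    {"action": "select_text", "start": 120, "end": 180},
    {"action": "click", "target": "Highlight button"},
]

# Each high-level plan step maps to one or more executable actions.
assert len(task_plan["plan"]) == len(task_actions)
```

The key difference is granularity: task-plan data describes *what* to do in human terms, while task-action data describes *how* to do it at the level of individual UI operations.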


LAMs are certainly a big deal and a major evolutionary step from text generation to action-driven AI agents. From automating workflows to assisting people with disabilities, LAMs are not just smarter AIs but AIs that can be used effectively in day-to-day tasks. As the technology continues to evolve, LAMs may soon become the standard for AI systems across all sectors.
