We believe Generative AI is the future of Intelligent Document Processing and offer a technical roadmap to get there.
The emergence of Generative AI and Large Language Models (LLMs) is a technology trigger that is arguably having a more drastic impact than any other technology in recent history.
But LLMs are not only useful as writing assistants or in chatbot interfaces. Many document processing tasks, such as data entry, order booking, invoice processing, and mailroom processing, can benefit from these models.
The recent rise of “function calls” or “commands” enables these LLMs to connect to external resources such as internet search, database lookups, or code snippet execution. This can transform task-specific or generative LLMs into goal-seeking “agents” that take actions.
For business users performing document processing tasks, generative agents will lead to a tremendous amount of value:
We believe Generative AI and generative agents are the future of Intelligent Document Processing and offer a technical roadmap to get there.
Metamaze is a platform for creating, training, evaluating, and deploying private LLMs for document and email processing pipelines. Our customers have been training and running more than 800 private LLMs in production, with thousands of versions in between. In the coming months, we aim to extend our framework from supervised to generative LLMs and agents.
An AI-powered “Generative Agent” is software designed to perform tasks autonomously by using various machine learning algorithms, models, and external services. Agents learn from previous interactions to optimize their performance in achieving specific tasks or goals. Agents act in a predefined environment (a known set of possible actions) based on a given input (the document or e-mail).
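To make the idea of a known set of possible actions concrete, here is a minimal sketch of how a function call proposed by a model could be dispatched. The action names (`lookup_customer`, `route_to_human`), the JSON shape of the call, and the in-memory customer table are all assumptions made for illustration; they do not describe Metamaze's implementation or any specific vendor's API.

```python
import json
from typing import Callable, Dict

# Hypothetical actions. Real ones would call a database, an ERP system, or a mailbox.
def lookup_customer(name: str) -> dict:
    customers = {"Acme NV": {"customer_id": "C-001"}}  # stand-in for a database
    return customers.get(name, {"customer_id": None})

def route_to_human(reason: str) -> dict:
    return {"routed": True, "reason": reason}

# The predefined environment: the only actions the agent is allowed to take.
ACTIONS: Dict[str, Callable[..., dict]] = {
    "lookup_customer": lookup_customer,
    "route_to_human": route_to_human,
}

def execute_function_call(raw_call: str) -> dict:
    """Run one model-proposed call, refusing anything outside the known action set."""
    call = json.loads(raw_call)
    name = call.get("name")
    if name not in ACTIONS:
        # Unknown actions are never executed; a human takes over instead.
        return route_to_human(f"unknown action: {name}")
    return ACTIONS[name](**call.get("arguments", {}))

# Assumed model output: a JSON object naming the action and its arguments.
model_output = '{"name": "lookup_customer", "arguments": {"name": "Acme NV"}}'
result = execute_function_call(model_output)
```

Restricting execution to an explicit allow-list is one simple way to keep humans in control: anything the model proposes outside that list falls back to manual handling.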
“Taking actions” refers to the agent’s ability to make decisions and perform tasks, which can range from simple to complex.
Crucial properties of an agent are
By using Generative AI and agents, IDP platforms can offer many more capabilities besides “extraction” or “recognition”. This can greatly enhance the breadth of repetitive work that can be reliably automated – with humans still in control and in the loop.
Companies will benefit from giving their business users the capability to train, evaluate, and deploy their own private document-processing agents based on generative AI and LLMs, with less IT involvement and better results.
For companies building intelligent document processing pipelines, generative LLMs are valuable because they can
Integrating with RPA platforms would make agents even more powerful.
For companies to be able to fully capitalize on Generative AI for their document processing needs, we believe there are three major technological leaps that need to be addressed to build a platform for creating private, generative LLMs:
Generic, one-size-fits-all solutions fall short for complex, customer-specific processes, which is why you need an Adaptive IDP platform where you can train your own models on your own data.
Document processing tasks such as data entry, order booking, and invoice entry are typically performed by users who do not have specialized IT or AI knowledge. But current open-source or commercial solutions do not allow users to train or fine-tune, evaluate, deploy, and scale generative LLMs without coding skills. We believe in democratizing access to private generative LLMs by automating those steps and giving power to business users.
For the past 5 years, Metamaze has built an exceptional MLOps framework for training, evaluating, and deploying private LLMs. As a result, we have the software and experience needed to scale up to private generative LLMs.
Current state-of-the-art foundation models fall into two broad categories:
A clear gap remains: giving text-based LLMs access to richer layout information while maintaining the accuracy of specialized OCR models.
Metamaze has built custom supervised LLMs that extend pretrained multilingual text-only models with rich layout information, drastically improving performance. We plan on using similar techniques but scaling them up to generative LLMs as foundation models.
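As a rough illustration of what "layout information" can mean in practice, the sketch below pairs each OCR word with a bounding box rescaled to a fixed 0 to 1000 grid, so that positions are comparable across page sizes. This convention is borrowed from publicly documented layout-aware models and is an assumption here; the article does not say how Metamaze encodes layout, and the `OcrWord` type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1)

@dataclass
class OcrWord:
    text: str
    box: Box  # coordinates in page pixels, as an OCR engine might return them

def normalize_box(box: Box, page_width: int, page_height: int, scale: int = 1000) -> Box:
    """Rescale pixel coordinates to a page-size-independent 0..scale grid."""
    x0, y0, x1, y1 = box
    return (
        x0 * scale // page_width,
        y0 * scale // page_height,
        x1 * scale // page_width,
        y1 * scale // page_height,
    )

def to_model_inputs(words: List[OcrWord], page_width: int, page_height: int) -> List[Dict]:
    """Attach a normalized box to every word so a model can use position as well as text."""
    return [
        {"text": w.text, "bbox": normalize_box(w.box, page_width, page_height)}
        for w in words
    ]

words = [
    OcrWord("Invoice", (100, 50, 300, 90)),
    OcrWord("Total:", (100, 700, 200, 740)),
]
inputs = to_model_inputs(words, page_width=1000, page_height=2000)
```

A layout-aware model would typically turn these coordinates into additional embeddings alongside the token embeddings; that step is omitted here to keep the sketch dependency-free.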
Every process is different. Any Adaptive IDP platform needs to be able to tailor its behavior to a company’s existing or target workflow. Often, this means custom code for external data lookup, interpretative decision-making, applying business logic, or data validation.
While it is great that you can customize every aspect of the pipeline, these steps are often hard-coded by developers and engineers. This has two big disadvantages:
Clearly, that is not how an Adaptive IDP system should work.
An end-to-end agent can learn from business users by being shown a couple of examples of which actions are appropriate, requiring no IT development.
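One conceivable way to learn from a couple of examples is in-context (few-shot) prompting, where the examples supplied by business users are placed in front of the new document. The sketch below only builds such a prompt; the `ActionExample` type, the prompt wording, and the action names are hypothetical, and the article does not state whether Metamaze uses prompting, fine-tuning, or another technique.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ActionExample:
    document_text: str  # what the business user saw
    action: str         # what the business user decided to do

def build_few_shot_prompt(examples: List[ActionExample], new_document: str) -> str:
    """Turn user-provided examples into a prompt that asks for the next action."""
    parts = ["Choose the appropriate action for each document."]
    for example in examples:
        parts.append(f"Document: {example.document_text}\nAction: {example.action}")
    # The final entry leaves the action blank for the model to complete.
    parts.append(f"Document: {new_document}\nAction:")
    return "\n\n".join(parts)

examples = [
    ActionExample("Invoice from a known supplier, total matches the PO.", "book_invoice"),
    ActionExample("Invoice without a PO number.", "ask_sender_for_po"),
]
prompt = build_few_shot_prompt(examples, "Invoice from a new supplier, no PO number.")
```

Because the examples are plain data rather than code, a business user could add or correct them without any IT development.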
The following numbers combine the automation and validation-reason statistics of a multitude of projects over a fixed time frame. The projects vary in difficulty: some are so hard that they reach no more than 50% straight-through processing (STP), while others are fairly straightforward and achieve up to 99.2% STP.
Overall, the conclusion is that the global STP rate would rise from 76.6% to ~88.1%. That means ~50.6% less human validation is needed, freeing up time for more value-adding tasks!
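The relationship between the STP rate and the validation workload can be checked with a short calculation: the share of documents needing human validation is one minus the STP rate, and the saving is the relative drop in that share. The function name below is illustrative. Using only the rounded rates quoted above, the result comes out at roughly 49%, close to the ~50.6% stated; the gap presumably stems from rounding or from the underlying per-project figures, which are not reproduced here.

```python
def validation_reduction(stp_before: float, stp_after: float) -> float:
    """Relative drop in the share of documents that still need human validation."""
    validation_before = 1.0 - stp_before  # share validated by humans before
    validation_after = 1.0 - stp_after    # share validated by humans after
    return (validation_before - validation_after) / validation_before

# Rounded global STP rates from the text: 76.6% before, ~88.1% after.
reduction = validation_reduction(0.766, 0.881)
print(f"Human validation reduced by {reduction:.1%}")
```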
Generative agents can help by
Our principles:
The Metamaze roadmap w.r.t. LLMs, Generative AI, and agent-based intelligent document processing
Metamaze is a no-code Intelligent Document Processing platform that uses AI to automate every document and email workflow.