LandingAI's Latest Breakthrough in Document Intelligence
LandingAI, a prominent player in agentic vision AI technologies, has officially unveiled an upgraded version of its Agentic Document Extraction (ADE) framework. This latest iteration, utilizing the Document Pre-trained Transformer (DPT-2), marks a substantial advance in the way organizations can interact with and process their documents.
The Significance of DPT-2 in Document Processing
Since ADE's original launch just six months ago, organizations have used it to process billions of pages, and users report spending up to 90% less time retrieving vital information. DPT-2 builds on that foundation with enhanced capabilities for accurate data extraction from complex documents.
Traditional Large Language Models (LLMs) often falter when extracting detailed and complete information from visual formats. DPT-2 instead combines structured deep learning methods with agentic workflows. According to the company, this dual approach yields higher accuracy and is particularly effective for documents that mix text with tables lacking gridlines, invoices scanned at unusual angles, embedded signatures, and more.
Dan Maloney, the CEO of LandingAI, emphasizes the importance of visual data in organizational decision-making: "Documents contain the information that organizations need to make not only accurate but the best decisions possible. Key nuances can be lost if the visual representations are not adequately captured." This enhanced capability addresses existing gaps in document intelligence and streamlines workflows, especially in sectors where accuracy is critical, such as finance, healthcare, and insurance.
Innovative Features of DPT-2
The features integrated into DPT-2 are designed to refine document parsing and enhance the extraction process:
- Agentic Table Captioning: DPT-2 can now analyze large, complex tables, including those with merged cells and no gridlines, with unprecedented accuracy. It maintains cell integrity and alignment, enabling users to trace values back to their source.
- Refined Figure Captioning: Logos, seals, and minor figures are identified with precision, which avoids unnecessarily verbose captions.
- Smart Layout Detection: The model detects elements in messy scans more reliably, so fewer are overlooked during extraction. It can identify stamps within tables and process them separately, which is particularly valuable for compliance purposes.
- Expanded Chunk Ontology: Beyond standard text and tables, DPT-2 recognizes various document elements, including signatures, checkboxes, ID cards, barcodes, and QR codes, ensuring consistent classification of all document parts.
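To make the idea of a chunk ontology concrete, here is a minimal Python sketch of how typed, located chunks might be represented and queried. It is not LandingAI's data model: the `ChunkType` enum, the `Chunk` fields, the bounding-box convention, and the sample values are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class ChunkType(Enum):
    """Element kinds named in the article; this enum itself is hypothetical."""
    TEXT = "text"
    TABLE = "table"
    SIGNATURE = "signature"
    CHECKBOX = "checkbox"
    ID_CARD = "id_card"
    BARCODE = "barcode"
    QR_CODE = "qr_code"
    STAMP = "stamp"


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    chunk_type: ChunkType
    page: int
    # Assumed convention: (left, top, right, bottom) in page-relative units.
    box: Tuple[float, float, float, float]


def contains(outer: Chunk, inner: Chunk) -> bool:
    """True if inner's box lies entirely within outer's box on the same page."""
    o_left, o_top, o_right, o_bottom = outer.box
    i_left, i_top, i_right, i_bottom = inner.box
    return (
        outer.page == inner.page
        and o_left <= i_left
        and o_top <= i_top
        and i_right <= o_right
        and i_bottom <= o_bottom
    )


def stamps_in_tables(chunks: List[Chunk]) -> List[Chunk]:
    """Return stamp chunks that sit inside any table chunk."""
    tables = [c for c in chunks if c.chunk_type is ChunkType.TABLE]
    return [
        c for c in chunks
        if c.chunk_type is ChunkType.STAMP and any(contains(t, c) for t in tables)
    ]


# Made-up sample data, for illustration only.
sample = [
    Chunk("c1", ChunkType.TEXT, 1, (0.05, 0.05, 0.95, 0.15)),
    Chunk("c2", ChunkType.TABLE, 1, (0.05, 0.20, 0.95, 0.70)),
    Chunk("c3", ChunkType.STAMP, 1, (0.60, 0.50, 0.80, 0.65)),
    Chunk("c4", ChunkType.SIGNATURE, 1, (0.60, 0.80, 0.90, 0.90)),
]

counts = Counter(c.chunk_type for c in sample)
print(counts[ChunkType.TABLE], [c.chunk_id for c in stamps_in_tables(sample)])
```

Keeping a type and a location on every chunk is what would let a compliance workflow pull out a stamp that overlaps a table without losing the table itself.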
The DPT-2 model is currently available in preview and can be accessed via APIs for various applications, including:
- Parse: This feature converts documents into structured Markdown and semantic chunks for better readability and organization.
- Extract: This schema-driven extraction method leverages LLM reasoning and is designed to maintain a direct connection with the original document for accuracy.
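The "direct connection with the original document" can be pictured as every extracted field carrying a reference to the chunk it came from. The Python sketch below shows one way a caller might check that link. It does not call or reproduce the ADE APIs: the `ExtractedField` class, the `SCHEMA` dictionary, both helper functions, and the invoice values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str
    source_chunk_id: str  # hypothetical pointer back to the parsed chunk


# Hypothetical schema: the field names the caller expects to receive.
SCHEMA = {"invoice_number": str, "total": str, "due_date": str, "vendor": str}


def missing_fields(schema: Dict[str, type], fields: List[ExtractedField]) -> List[str]:
    """Schema fields that the extraction did not return at all."""
    returned = {f.name for f in fields}
    return [name for name in schema if name not in returned]


def ungrounded_fields(fields: List[ExtractedField],
                      chunks_by_id: Dict[str, str]) -> List[str]:
    """Fields whose value does not appear in the text of the cited chunk."""
    flagged = []
    for f in fields:
        chunk_text = chunks_by_id.get(f.source_chunk_id, "")
        if f.value not in chunk_text:
            flagged.append(f.name)
    return flagged


# Made-up parse output and extraction result, for illustration only.
chunks_by_id = {
    "c1": "Invoice No. INV-1042",
    "c2": "Total due: $1,250.00",
}
fields = [
    ExtractedField("invoice_number", "INV-1042", "c1"),
    ExtractedField("total", "$1,250.00", "c2"),
    ExtractedField("due_date", "2025-01-31", "c9"),  # cites a chunk that does not exist
]

print(missing_fields(SCHEMA, fields))
print(ungrounded_fields(fields, chunks_by_id))
```

A check like this is deliberately strict: any value that cannot be found verbatim in its cited chunk is flagged for review instead of being trusted.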
Support for Developers and Builders
Recognizing the growing demand for tailored document-processing solutions, LandingAI has also initiated a Builder Program. This program provides developers and organizations with essential tools and support to create applications powered by ADE APIs. Participants in the Builder Program gain priority assistance, early access to new features, increased rate limits, and go-to-market support, fostering a collaborative environment for innovation.
With resources like SDKs, cookbooks, and direct access to LandingAI experts, organizations can accelerate the development and deployment of production-ready solutions.
LandingAI, founded by Andrew Ng, former chief scientist at Baidu and a co-founder of Coursera, is well-positioned to lead the charge in transforming visual AI technologies. The ongoing evolution of its Agentic Document Extraction capability illustrates a commitment to enabling organizations to harness the potential of visual data more effectively.
As industries continue to evolve, solutions like LandingAI’s DPT-2 will play an indispensable role in enhancing operational efficiency and informed decision-making across various sectors. To learn more about their offerings, visit Landing.ai.