Archaic AI System
2025-06-09 02:15:19

Archaic Unveils High-Performance AI System Specializing in Japanese Document Processing

Introduction



Archaic, a company based in Shibuya, Tokyo, has developed a RAG (Retrieval-Augmented Generation) AI system designed specifically for Japanese business documents. In an evaluation on a public benchmark dataset, the system achieved the top score in both the manufacturing category and the overall average.

The Challenge of Complex Documents



A notable challenge many businesses face is the complexity of their documents, which often consist of not just text, but also charts, tables, and images. Traditional AI systems struggle with these multifaceted documents, often leading to incomplete responses or misinterpretation due to a lack of contextual understanding.

In response to these issues, Archaic has created a system that understands a document's overall structure and preserves the relationships between pieces of information when passing them to generative AI, thereby improving accuracy and usefulness in a business context.

Core Technologies of the Archaic RAG System



Central to the Archaic RAG system are two unique technologies that facilitate its performance:

1. Parser - Document Analysis Engine



The Parser automatically extracts text, figures, tables, and images from documents and breaks them down into meaningful units based on their structure and content. This allows the RAG system to process documents that are traditionally treated as unstructured by turning them into a format that generative AI can interpret.

2. Tree-Data Structure - Information Hierarchy Technology



Extracted information is organized into a hierarchical Tree structure that preserves context and relationships. This allows for precise and coherent outputs, taking into account the relationships between main text, figures, tables, and annotations.
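As an illustration only, the idea of keeping each extracted unit attached to its surrounding context can be sketched as a small tree. This is a minimal sketch under stated assumptions, not Archaic's implementation: the `Node` class, its `kind` labels, the `find` helper, and the sample manual content are all hypothetical and do not come from the announcement.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One unit of a parsed document (hypothetical schema, not Archaic's)."""

    kind: str  # e.g. "document", "section", "text", "table", "note"
    content: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def add(self, child: "Node") -> "Node":
        """Attach a child and remember its parent so context can be rebuilt."""
        child.parent = self
        self.children.append(child)
        return child

    def context_path(self) -> List[str]:
        """Return the content of every ancestor, root first, ending with this node."""
        path, node = [], self
        while node is not None:
            path.append(node.content)
            node = node.parent
        return list(reversed(path))


def find(root: Node, kind: str) -> List[Node]:
    """Depth-first search for all nodes of a given kind."""
    hits = [root] if root.kind == kind else []
    for child in root.children:
        hits.extend(find(child, kind))
    return hits


# Invented sample content, used only to exercise the structure.
doc = Node("document", "Pump Maintenance Manual")
sec = doc.add(Node("section", "3. Inspection Intervals"))
sec.add(Node("text", "Inspect seals according to the table below."))
table = sec.add(Node("table", "Seal type | Interval (months)"))
table.add(Node("note", "Intervals assume continuous operation."))

for node in find(doc, "table"):
    print(" > ".join(node.context_path()))
```

In this sketch a retrieved table is never returned on its own: `context_path` walks up to the root, so the section and document titles travel with it into the prompt.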

Advantages Over Traditional RAG



Traditional RAG systems primarily process text and often falter when faced with visual elements like charts and tables, which can lead to significant information loss and contextual gaps. In contrast, Archaic's system is designed to manage these elements more holistically, greatly enhancing the accuracy and comprehensiveness of responses.

Comparison Table: General RAG vs. Archaic RAG System



Feature | General RAG | Archaic RAG System
-------------------
Document Compatibility | Text-centric | Comprehensive, including charts, tables, images, and annotations
Context Understanding | Prone to fragmentation | Maintains meaning in hierarchical structure (Tree-Data)
Business Use Cases | FAQ/Knowledge | Complex documents like manuals, specifications, meeting notes
Dependence on LLM | Relies on LLM capabilities | Supplements LLM weaknesses through preprocessing and structuring

Performance Evaluation Methodology



To substantiate the effectiveness of its RAG system, Archaic used a publicly available benchmark dataset designed for evaluating Japanese RAG performance. The benchmark comprises 300 question-answer pairs across five industries (finance, information and communication, manufacturing, public sector, and retail), with 60 questions per industry, and compares systems by the accuracy of their generated answers.

The large language model used for verification was Claude 3.5 Sonnet. Keeping the model fixed means that differences in accuracy can be attributed to the RAG implementations rather than to the underlying LLM.
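The scoring described above can be sketched as a small per-category accuracy function. This is a hedged illustration, not the benchmark's actual harness: the record shape `(category, question, reference)`, the `answer` and `is_correct` callables, and the toy data are assumptions, and the announcement does not say how an answer was judged correct (exact match is used here only so the sketch runs).

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Tuple

# (category, question, reference_answer): hypothetical record shape.
QAItem = Tuple[str, str, str]


def accuracy_by_category(
    items: Iterable[QAItem],
    answer: Callable[[str], str],
    is_correct: Callable[[str, str], bool],
) -> Dict[str, float]:
    """Score one RAG pipeline; `answer` wraps retrieval plus a fixed LLM."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for category, question, reference in items:
        total[category] += 1
        if is_correct(answer(question), reference):
            correct[category] += 1
    return {category: correct[category] / total[category] for category in total}


# Toy stand-ins so the sketch runs; a real run would call the pipeline under test.
toy_items = [
    ("finance", "q1", "a1"),
    ("finance", "q2", "a2"),
    ("retail", "q3", "a3"),
]
toy_answers = {"q1": "a1", "q2": "wrong", "q3": "a3"}

scores = accuracy_by_category(
    toy_items,
    answer=lambda question: toy_answers[question],
    is_correct=lambda got, reference: got == reference,
)
print(scores)
```

Because the generator is passed in as `answer`, two retrieval pipelines can be compared by swapping only that argument while the LLM and the question set stay the same.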

Evaluation Results by Industry Category



The following summarizes the accuracy rates achieved:

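Industry Category | Accuracy Rate | Correct / Questions | Remarks
---------------------
Finance | 83.3% | 50 / 60 |
Information and Communication | 85.0% | 51 / 60 |
Manufacturing | 91.7% | 55 / 60 | ★ Highest Rating
Public Sector | 93.3% | 56 / 60 |
Retail | 93.3% | 56 / 60 |
Overall Average | 89.3% | 268 / 300 | ★ Highest Rating

★ marks the categories in which, according to Archaic, the system achieved the top score on the benchmark. It does not mark the highest value among the rows above.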

According to Archaic, the top scores in the manufacturing category and in the overall average indicate that the system's structural comprehension and preprocessing accuracy contribute significantly to the quality of its generated answers.
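The reported percentages follow directly from the correct-answer counts. The short check below recomputes them from the published figures (60 questions per category, 300 in total); only the variable names are introduced here.

```python
# Correct-answer counts as reported in the results (60 questions per category).
correct = {
    "Finance": 50,
    "Information and Communication": 51,
    "Manufacturing": 55,
    "Public Sector": 56,
    "Retail": 56,
}
QUESTIONS_PER_CATEGORY = 60

# Per-category accuracy in percent, rounded to one decimal place.
rates = {
    name: round(100 * count / QUESTIONS_PER_CATEGORY, 1)
    for name, count in correct.items()
}

total_correct = sum(correct.values())
total_questions = QUESTIONS_PER_CATEGORY * len(correct)
overall = round(100 * total_correct / total_questions, 1)

for name, rate in rates.items():
    print(f"{name}: {rate}%")
print(f"Overall: {total_correct} / {total_questions} = {overall}%")
```

Because every category has the same number of questions, the overall figure equals the mean of the five category rates.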

Developer Commentary



Zhaoxu Wang, CTO of Archaic, commented, "I believe RAG is transitioning beyond simple 'search + generate' methods into an era defined by 'structure understanding + search + generate.' The Archaic RAG employs a Tree-Data structure that faithfully represents the semantic relationships within complex documents, integrating graphical data alongside textual content. The core of our technology lies in how effectively we can convey meaningful context. I expect this advancement will facilitate the democratization of business knowledge and repurpose document knowledge effectively."

Future Prospects



Looking ahead, Archaic plans to expand its practical applications along three main axes:

1. Development of industry-specific optimization templates (for manufacturing, finance, municipalities, etc.)
2. Provision of AI solutions for manual generation, meeting-notes summarization, and knowledge search over unstructured documents
3. Collaborative enhancement of the benchmark dataset to help establish guidelines for RAG evaluation metrics

Through this innovative RAG system that emphasizes understanding structure and connecting meanings, Archaic aims to pioneer the future of knowledge utilization.

About Archaic



Founded on November 15, 2017, and located in Shibuya, Tokyo, Archaic aims to create a world where AI is as commonplace as electricity and water, so that everyone can use it seamlessly. The company has extensive expertise in deep learning and AI system development, and maintains a specialized team that builds custom AI for industry leaders.

The CEO, Jun Yokoyama, believes that making AI accessible in familiar environments will lower barriers and ultimately boost various sectors of Japan's economy.

For more information, visit Archaic's official website.

Contact



For inquiries regarding this announcement, please contact Archaic's public relations team at [email protected].

