Anote, Inc. Launches Groundbreaking Japanese LLM Project in New York

Anote, Inc., headquartered in New York, has announced the start of a significant initiative focused on the evaluation and refinement of Japanese large language models (LLMs). The project, run in collaboration with the Japanese company Channel Bridge, aims to give Japanese developers and researchers the tools and infrastructure to build and validate high-quality Japanese-language AI models.

Beginning today, Anote is opening applications for a limited pilot program aimed at AI development companies and research institutions based in Japan. This initiative represents a holistic approach towards human-centered AI development, seeking to make these advanced tools more accessible to a wider audience.

The Core of the Project

The project revolves around Anote's newly released end-to-end MLOps platform. This platform is designed to support the entire lifecycle of LLM implementation, from data annotation and fine-tuning to inference, evaluation, and integration. It allows users to construct the most suitable large language model tailored to their data.

Anote provides a unique evaluation framework in which users can compare zero-shot LLMs such as GPT, Claude, Llama 3, and Mistral against fine-tuned models trained on domain-specific data. The platform also includes a data annotation interface that transforms raw, unstructured data into a format compatible with LLMs, integrating expert knowledge into the training process to improve model accuracy.

Moreover, end users can incorporate their optimal LLM into private chatbots within their on-premise environments or utilize the fine-tuning software development kit (SDK) for additional integration capabilities.

Addressing Challenges in Japanese LLM Development

Currently, the training data for multilingual LLMs is 60-70% English, while Japanese accounts for only 3-5%. This disparity leads to significant performance degradation when AI is used for Japanese applications. Anote's project aims to tackle this issue by providing the tools and infrastructure needed to develop and validate high-quality Japanese AI models.

Some of the main challenges addressed by this project include:

- Lack of high-quality training data for Japanese and associated quality issues – Anote aims to assist in generating large-scale, high-quality Japanese training datasets.
- Absence of evaluation benchmarks for Japanese LLMs – The initiative will introduce Japan’s first public LLM assessment datasets, metrics, and leaderboards.
- Limited access to fine-tuning environments – The Anote platform will enable users to build and operate custom LLMs using their proprietary data.

Unique Features of Anote

1. End-to-End MLOps: Covers everything from annotation and fine-tuning (supporting LoRA, QLoRA, and RLHF) to inference and evaluation, all in one package.
2. Multi-Model Comparison: Users can compare fine-tuned models with GPT-4o, Claude 3.5, Llama 3, Mistral, and more, specifically using Japanese data.
3. Advanced Evaluation Framework: Model performance can be assessed through diverse metrics including Cosine Similarity, Rouge-L, LLM Eval, Answer Relevance, and Faithfulness.
4. Versatile Task Support: The platform caters to multiple tasks such as text classification, named entity recognition (NER), full-document QA, and prompt QA.
5. Instant Integration APIs & SDKs: The best-performing model can be integrated quickly into business environments.
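
For readers unfamiliar with two of the metrics named above, the following is a minimal sketch of Cosine Similarity and Rouge-L in plain Python. It is not Anote's implementation: the function names are hypothetical, the cosine score is computed over simple token counts (production systems typically compare embedding vectors), Rouge-L is shown in its common F1 form, and text is split into characters so that Japanese needs no tokenizer.

```python
import math
from collections import Counter


def cosine_similarity(a_tokens, b_tokens):
    """Cosine similarity between two token-count vectors (assumption: bag of tokens)."""
    ca, cb = Counter(a_tokens), Counter(b_tokens)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Rouge-L as the F1 of LCS-based precision and recall."""
    if not reference or not candidate:
        return 0.0
    lcs = lcs_length(reference, candidate)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return 2 * precision * recall / (precision + recall)


# Character-level comparison of a model answer against a reference answer.
reference = list("東京は日本の首都です")
candidate = list("日本の首都は東京です")
print(cosine_similarity(reference, candidate))
print(rouge_l_f1(reference, candidate))
```

The two example sentences contain the same characters in a different order, so the count-based cosine score ignores the reordering while Rouge-L penalizes it. That difference is one reason evaluation frameworks report several metrics side by side.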

Leveraging Few-Shot Learning

The platform uses few-shot learning to make high-accuracy predictions from a small number of labeled samples. Users can run supervised fine-tuning on labeled data and further improve model performance with RLHF/RLAIF, either through Anote’s data annotation interface or through the API.

The process comprises four major steps:
1. Upload: Create a new text dataset.
2. Customize: Set classification categories, extracted entities, questions, etc.
3. Annotate: Label a few edge cases so the LLM can learn from them through active learning.
4. Download: Export the results as a CSV file, or access the fine-tuned model through an API endpoint.
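
The four steps above can be illustrated with a small, self-contained sketch. It does not use Anote's API or reproduce its models: a nearest-centroid classifier over character bigrams stands in for the LLM, the sample texts and category names are invented for illustration, and "edge cases" are approximated by the prediction with the smallest score margin (a simple form of uncertainty sampling).

```python
import csv
import io
import math
from collections import Counter


def bigrams(text):
    """Character bigram counts; a crude feature that needs no Japanese tokenizer."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def predict(text, labeled):
    """Return (best_label, margin) from similarity to each category's centroid."""
    centroids = {}
    for example, label in labeled:
        centroids.setdefault(label, Counter()).update(bigrams(example))
    scores = sorted(
        ((cosine(bigrams(text), c), label) for label, c in centroids.items()),
        reverse=True,
    )
    best_score, best_label = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    return best_label, best_score - runner_up


def export_csv(rows):
    """Serialize (text, label, margin) rows to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["text", "label", "margin"])
    for text, label, margin in rows:
        writer.writerow([text, label, f"{margin:.3f}"])
    return buf.getvalue()


# 1. Upload: a new text dataset (hypothetical examples).
unlabeled = ["サービスが素晴らしい", "品質が最悪だった"]
# 2. Customize: the classification categories are "positive" and "negative".
# 3. Annotate: a few hand-labeled examples.
labeled = [("この製品は素晴らしい", "positive"), ("対応が最悪だった", "negative")]

rows = [(text, *predict(text, labeled)) for text in unlabeled]
# The least confident prediction is the next candidate for human labeling.
next_to_label = min(rows, key=lambda r: r[2])
# 4. Download: export the predictions as CSV.
print(export_csv(rows))
```

In an active-learning loop, the item in `next_to_label` would be sent back to the annotator, added to `labeled`, and the predictions recomputed.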

Private Fine-Tuning Chatbot

With the fine-tuned models, users can design accurate and private AI assistants for corporate use. The main steps involve:
1. Upload – Upload corporate documents.
2. Chat – Ask questions about the documents using LLMs like GPT, Claude, Llama2, and Mistral.
3. Evaluation – Answers are presented with their sources (citations) to mitigate hallucinations. An evaluation dashboard lets users confirm the efficacy of fine-tuning using metrics like Cosine Similarity, Rouge-L, LLM Eval, Answer Relevance, and Faithfulness.
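
The idea of returning an answer together with its source can be sketched as follows. This is an illustration under stated assumptions, not Anote's chatbot: the file names and document contents are hypothetical, retrieval uses Jaccard overlap of character bigrams rather than embeddings, and the retrieved passage is returned directly where a real assistant would pass it to an LLM as context.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # document name shown to the user as the citation
    text: str


def char_bigrams(text):
    """Set of character bigrams in the text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def retrieve(question, passages):
    """Return the passage with the highest Jaccard overlap with the question."""
    q = char_bigrams(question)

    def score(passage):
        b = char_bigrams(passage.text)
        union = q | b
        return len(q & b) / len(union) if union else 0.0

    return max(passages, key=score)


def answer_with_citation(question, passages):
    """Pair the supporting text with the document it came from."""
    best = retrieve(question, passages)
    # Placeholder: an LLM call would generate the answer from best.text here.
    return {"answer": best.text, "citation": best.source}


# Upload: hypothetical corporate documents.
passages = [
    Passage("handbook.txt", "有給休暇は年間20日です"),
    Passage("security.txt", "パスワードは90日ごとに変更します"),
]

# Chat: ask a question and show where the answer came from.
result = answer_with_citation("有給休暇は何日ですか", passages)
print(result["answer"], "(source:", result["citation"] + ")")
```

Because every answer carries the name of the document that supports it, a reader can check the claim against the original text, which is the mechanism the Evaluation step relies on to limit hallucinations.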

Partner Opportunities

Through this project, participating partners will be able to:

- Build models optimized for the Japanese language and context.
- Develop internal tools or products using fine-tuned LLMs.
- Contribute to the first public benchmarking dataset for Japanese NLP.
- Share achievements through the Anote leaderboard and collaborative research.

Application Overview

Target Group: Companies, startups, or research institutions in Japan involved in generative AI or LLM development.
Number of Participants: Up to 5 organizations.
Project Duration: June 1, 2025 – October 1, 2025.
Application Deadline: May 17, 2025.
How to Apply: Interested parties should apply using the contact information below.

Contact for Inquiry/Applications:
Anote, Inc. (Attn: Natan Vidra)
Email: [email protected]
Website: Anote.ai

Anote’s domestic partner:
Channel Bridge, Inc.
Email: [email protected]
Website: ch-bridge.com