Introduction to OrcaRouter
In a groundbreaking move, FlashLabs has announced the launch of OrcaRouter in Japan, an innovative adaptive routing gateway that optimizes access to over 200 large language models (LLMs). Developed in partnership with Continuum AI, OrcaRouter aims to reduce the costs related to AI inference by as much as 70%, all while maintaining output quality comparable to leading models. This comes at a time when many companies are struggling with AI implementation costs and complexities.
Market Context
FlashLabs CEO Yoichi Hosoi pointed out that businesses operating AI in production environments are likely overpaying, and often by more than double the appropriate amount. The existing AI gateways predominantly serve as mere conduits that inflate costs by adding margins when routing model requests. This method results in inefficiencies, especially when selecting an optimal model based on the intricacies of the input prompts.
In Japan, specific issues have been identified, such as the intricate procurement processes due to individual contracts with multiple LLM providers, the complications arising from currency exchange rates since invoicing is done strictly in USD, and the lack of tools that assist in cost optimization. These barriers have hindered the effective use of AI in many enterprises.
The Innovation of OrcaRouter: Adaptive Routing Explained
OrcaRouter distinguishes itself by consolidating API calls to over 200 LLMs, including those from giants such as OpenAI, Google, and Meta, into a single endpoint with one API key and invoice. Unlike other AI gateways that merely offer a variety of models, OrcaRouter's unprecedented routing engine assesses the best model per prompt dynamically.
Three Core Mechanisms of OrcaRouter
1.
Pre-emptive Classification: Before any request is sent, a rapid router model predicts which downstream LLM can process prompts while adhering to specified quality standards. For example, a simple task like summarizing an email doesn't require the sophistication of a model like GPT-5, while more complex tasks like refactoring lengthy code would.
2.
Continuing Learning System: Quality signals from success rates and user feedback continuously refine routing policies, allowing the system to become more efficient without any changes in customer code. This means the AI becomes smarter and more cost-effective every week.
3.
Real-Time Market Monitoring: OrcaRouter consistently tracks provider pricing, latency, error rates, and newly released models. If a new, cheaper alternative model becomes available for a specific task, OrcaRouter can immediately reroute requests without any need for reintegration or procurement reviews.
Benefits of Implementing OrcaRouter
Internal benchmarks indicate that OrcaRouter can cut inference expenses by 47% to 71% depending on workload configurations. Importantly, quality indicators measured from end-user perspectives show no degradation, making it a viable choice even for tasks that typically demand sophisticated processing.
Unique Features of OrcaRouter
- - Zero Markup Fees: Clients only pay the provider's base cost; OrcaRouter earns through a transparent flat-rate platform fee.
- - Single API for 200+ Models: Users benefit from access to various models through one endpoint, streamlining operations and reducing complexity.
- - Quick Migration: Transitioning to OrcaRouter requires updating just the Base URL and API key, with no modifications necessary to existing OpenAI SDK code.
Why Now is the Time for Japan
As of 2026, Japan is becoming a leading region for the rapid adoption of open-source AI technologies. The open-source voice model, Chroma, developed by Continuum AI, has been utilized by major companies like NTT Data and Xiaomi in recent quarters. Despite abundant opportunities, traditional barriers in closed API selections, such as complicated procurement processes and the absence of cost optimization tools, have limited corporate adoption. OrcaRouter addresses these issues from day one.
Features Tailored for the Japanese Market
- - Invoicing in Yen: Simplifies the accounting process for companies that previously operated only in USD.
- - Local Support: Fully localized management consoles and documents, with Japanese language support available during JST business hours.
- - Domestic Data Routing: Compliance with data localization requirements through AWS and Google Cloud Platform regions within Japan.
OrcaRouter stands to become Japan's first AI gateway equipped with these capabilities.
Leadership Perspectives
Yoichi Hosoi, founder of FlashLabs, stated, "Companies currently using AI in production are likely overpaying significantly. Adaptive routing offers a unique resolution to this issue. We ensure transparency about which model responds to a request, guaranteeing it is the most cost-effective option capable of delivering quality outcomes. Our enterprise clients see substantial weekly savings, often between 60% and 70%."
A representative from Continuum AI emphasized how most other gateways market their convenience while adding margins, and OrcaRouter was created to negate such structures. With FlashLabs already connecting with developers looking for transparency and predictability, they present the ideal partner for launching in Japan.
Launch and Pricing
Starting today, OrcaRouter is available at orcarouter.ai, with a pay-per-use model that charges only the provider cost plus a flat platform fee. An enterprise subscription includes features like private endpoints and Japanese-language SLA support. The first 100 companies registering through FlashLabs will receive a credit of 30,000 yen and a complimentary adaptive routing diagnostic report focused on their current AI expenditures.
About FlashLabs
FlashLabs aims to automate customer experiences and sales processes, eventually leading to self-sustainable systems. Through a hybrid of machine speed and human insights, they deliver outcomes that outperform traditional methods. The company also develops the Chroma open-source voice model utilized by various developers and machine learning engineers across notable firms.
- - Company Name: FlashLabs Inc.
- - Headquarters: Chiyoda, Tokyo
- - CEO: Yoichi Hosoi
- - Website: flashlabs.ai
About Continuum AI
Continuum AI is a research institution focused on foundational infrastructure for intelligent systems for the next decade. Their commitment lies in ensuring reliable AI advancements.
Contact for the press
For further inquiries, reach out to FlashLabs’ marketing department: