Visual Bank Launches Qlean Dataset - A Comprehensive Japanese Voice Dataset for AI Growth

Visual Bank and the Advent of Qlean Dataset

Visual Bank, a startup based in Shibuya, Tokyo, led by CEO Masayuki Nagai, is transforming the AI training landscape through its subsidiary, Amana Images. Their latest offering, the Qlean Dataset, aims to provide robust data solutions for AI development, featuring an extensive collection of over 70,000 hours of diverse Japanese voice recordings. This dataset can be readily utilized for various research and commercial AI applications.

What is Qlean Dataset?

The Qlean Dataset serves as a comprehensive resource for training and developing AI systems. By partnering with multiple audio data partners, Visual Bank has created an extensive library that includes a variety of voice data, catering to different usage scenarios. It is part of their broader strategy to expand their AI data offerings known as Data Recipes.

Explore Data Recipes

Data Recipes in Qlean Dataset

The Data Recipes within Qlean Dataset feature original, commercially usable data collections. They offer flexibility in combining data materials based on specific needs, accuracy levels, and deadlines. These recipes include both annotated and unannotated data. Collaborations with partners such as the Chiba Lotte Marines and Toyo Keizai Inc., along with a global network and new data acquisitions, enhance this data lineup. This comprehensive approach significantly reduces the burden of data collection and management for AI development, accelerating project timelines.

Overview of the Recently Added Japanese Voice Dataset

The newly added voice recordings encompass:

- Single Speaker: Monologues, readings from novels and stories, private narratives, educational lectures, cultural performances like rakugo, and conversational snippets in various cultural topics.
- Two Speakers: Business and private conversation simulations, natural dialogues among children, and medical scenarios involving conversations among doctors, nurses, and patients.
- Three or More Speakers: Group discussions in private or business contexts and audio from media, including TV and movie clips.

These audio datasets are crucial for improving the performance of general foundation models and applications in natural language processing, speech recognition, and conversational AI. Visual Bank ensures all datasets come with completed rights processing, allowing for commercial use in research and development comfortably.

Use Cases for the Japanese Voice Dataset

Single Speaker Audio

- Monologues & Readings: Suitable for assessing ASR (Automatic Speech Recognition) WER and adapting models to longer texts.
- Educational Lectures: Effective for validating ASR precision in academic fields and expanding training datasets with subject-specific terms.
- Cultural Performances: Valuable for training emotion classification models and TTS (Text-to-Speech) prosody control.

Two Speakers Audio

- Business Conversations & Simulations: Contains overlapping speech and ambiguous expressions, making it ideal for testing speaker separation accuracy and specialized chatbot training.
- Medical Conversations: Includes typical dialog structures and terminologies, perfect for ASR evaluation benchmarks and AI pre-training tasks in healthcare.

Three or More Speakers Audio

- Group Conversations: Features interruptions, suitable for improving Speaker Diarization models and evaluating multi-speaker ASR.
- Media Audio: Captures natural dialogues with background noise, crucial for validating ASR robustness and evaluating AI-generated responses for naturalness.

Additional Features of Qlean Dataset

1. Research and Commercial Use Compatibility: All datasets acquired by Qlean Dataset comply with privacy policies, ensuring security for both research and commercial applications.
2. Rapid Proposal of Data Recipes: Qlean Dataset’s unique presentation method allows for quick, cost-effective data procurement.
3. Customized Dataset Construction: For data requirements not covered by the Data Recipe lineup, Qlean Dataset offers tailored solutions, supporting specific regulatory needs.

Qlean Dataset Academia Support Program

Visual Bank is also launching a support program to provide datasets at no cost to academic and non-profit research teams. This initiative aims to address challenges in obtaining high-quality, rights-clear training data. Interested parties can explore this initiative through the Academia Support Program.

About Visual Bank

Visual Bank's mission is to unlock the potential of data through the development of advanced data infrastructures for AI applications. The company, which supports manga artists with AI tools called THE PEN, is developing the Qlean Dataset via its subsidiary, Amana Images. Visual Bank is also an approved participant in Japan's GENIAC research program, enhancing its commitment to social implementation.

For more information, visit:
Visual Bank
Amana Images