Qlean Dataset Launch
2025-11-20 06:56:02

Visual Bank Launches Qlean Dataset Featuring Rakugo Audio Corpus for AI Training

The Launch of Qlean Dataset's Rakugo Audio Corpus



Visual Bank Corporation, based in Tokyo's Minato Ward and led by CEO Shinobu Nagai, has recently unveiled a significant addition to its AI learning data solutions through its affiliate, Amana Images. The newly launched Qlean Dataset features a unique resource known as the 'Japanese Single Speaker Rakugo Audio Corpus'.

The dataset consists of live performances by Rakugo storytellers, Rakugo being a traditional Japanese form of comic storytelling, recorded in a single-speaker format. Each recording runs approximately 15 to 30 minutes and captures the rhythm, pacing, and intonation characteristic of Rakugo, along with audience sounds such as laughter and applause.

The Rakugo Audio Corpus is designed for Automatic Speech Recognition (ASR), spoken language understanding, and speech generation model development. Because the recordings capture natural speech in a live performance environment, they are particularly suited to evaluating how well AI models generalize and to validating voice processing technologies under real-world conditions.

Overview of the Rakugo Audio Corpus



Data Specifications


  • Data Type: Audio
  • Subject Attributes: Rakugo storyteller
  • Data Format: MP3
  • Total Duration: 447 hours (each audio segment is 15 to 30 minutes long)
  • Recording Environment: Indoor (performance venue)
  • Scenes Covered: Live performances of various genres, including both classic and modern Rakugo
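
The announcement does not state how many recordings the corpus contains, but the total duration and the per-segment length bound it. The following is a rough back-of-the-envelope sketch, assuming every segment falls within the stated 15 to 30 minute range; the function name is hypothetical and the result is an estimate, not a published figure.

```python
# Figures taken from the data specifications above.
TOTAL_HOURS = 447
MIN_SEGMENT_MINUTES = 15
MAX_SEGMENT_MINUTES = 30


def estimate_recording_count(total_hours, min_minutes, max_minutes):
    """Return (fewest, most) recordings consistent with the stated durations.

    Assumption: every segment is between min_minutes and max_minutes long.
    """
    total_minutes = total_hours * 60
    fewest = total_minutes // max_minutes  # all segments at the maximum length
    most = total_minutes // min_minutes    # all segments at the minimum length
    return fewest, most


fewest, most = estimate_recording_count(
    TOTAL_HOURS, MIN_SEGMENT_MINUTES, MAX_SEGMENT_MINUTES
)
print(f"Estimated number of recordings: {fewest} to {most}")
```

Under that assumption the corpus would hold somewhere between roughly 900 and 1,800 recordings.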

Usage Scenarios


The 'Japanese Single Speaker Rakugo Audio Corpus' opens doors to several practical applications within AI research and development:

1. Speech Recognition and Natural Language Understanding: The varied, natural storytelling speech in the recordings is well suited to training ASR models for realistic conditions, improving their accuracy in the presence of ambient sound.

2. Acoustic Event Detection and Communication Analysis: Because it includes audience reactions, the dataset provides a base for models focused on environmental sound detection, event classification, and speaker style estimation.

3. Voice Synthesis and Expression AI: The varied intonations, rhythm, and styles captured provide essential data for training voice synthesis models, helping to develop AI systems capable of rich expression in audio generation.

4. Cultural and Educational Applications: Clear vocal delivery and diverse intonations found in the recordings can be leveraged to create educational materials for Japanese language learners, focusing on listening comprehension and oral communication skills.

5. AI Applications Using Real Environment Data: Because the recordings were made in live performance venues, the dataset can be used to validate audio processing technologies under noisy conditions and to assess digital archive search and speech indexing models.
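
As an illustration of the ASR evaluation described in the first scenario, the sketch below scores a model transcript against a reference transcript using character error rate (CER). The announcement does not name an evaluation metric; CER is assumed here because Japanese is written without spaces between words, which makes character-level scoring a common choice. The function names are hypothetical, and the transcripts would come from the user's own ASR system and reference annotations.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = character edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Comparing CER on these live recordings with CER on clean studio speech is one simple way to see how much audience noise and performance-style delivery affect a model.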

About Qlean Dataset


The Qlean Dataset is an AI learning data solution available for commercial use, offered by Amana Images under Visual Bank. It provides a wide array of multimedia data formats, including images, videos, audio, 3D models, and text, to support both research and commercial applications. Ongoing collaboration with industry partners keeps the datasets aligned with current data demands.

By easing the burdens of data collection and curation in AI development environments and ensuring legally sound data usage, Qlean Dataset aims to facilitate the advancement of AI research in a myriad of fields.

Key Features of Qlean Dataset


  • Compliance with international laws and regulations (GDPR/CCPA) and consent obtained from all subjects
  • Rapid delivery of existing datasets within one day
  • Capability for custom data collection and unique data structuring

For more information on Qlean Dataset, please visit their website or their AI Data Recipe collection.

Company Information


Visual Bank aims to build next-generation data infrastructure that maximizes AI development capabilities. The company also provides a variety of support tools for artists and takes part in the national research and development program 'GENIAC' to accelerate societal implementation initiatives. For more details, visit the company page and the Amana Images page.


