Skywork-Reward-V2 Series: The Next Generation of Open-Source Reward Models Is Here

Skywork-Reward-V2: The Next Leap in Open-Source Reward Models



As of July 2025, Skywork AI has introduced the Skywork-Reward-V2 series, a new generation of open-source reward models. The series follows the first-generation models launched in September 2024 and aims to give the open-source community state-of-the-art tooling for reinforcement learning and human feedback systems.

Building on a Strong Foundation


The Skywork-Reward-V2 series is grounded in collaborative innovation and large-scale data. The first-generation models have been downloaded more than 750,000 times since their release on the HuggingFace platform and have helped raise performance across various benchmark evaluations. The rigorous testing they have undergone sets a high bar for their successors.

In this latest iteration, Skywork AI has focused on addressing the limitations of the preference datasets used in previous models. The company has unveiled the Skywork-SynPref-40M dataset, which comprises 40 million hybrid preference pairs. The dataset is intended to improve the efficacy and adaptability of the new models, which are specifically engineered to capture nuanced human preferences.
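For readers unfamiliar with the term, a preference pair couples one prompt with a preferred and a rejected response, and a reward model is trained to score the preferred one higher. The sketch below is illustrative only: the article does not state the training objective or the dataset schema, so the `PreferencePair` fields and the use of the Bradley-Terry pairwise loss (a common choice for reward models) are assumptions rather than Skywork's documented method.

```python
import math
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """Hypothetical record layout: a prompt plus a preferred and a rejected response."""
    prompt: str
    chosen: str
    rejected: str


def bradley_terry_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the chosen response beats the rejected one.

    Equals -log(sigmoid(reward_chosen - reward_rejected)), computed in a
    numerically stable way. Assumed objective, not confirmed by the article.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)) == log(1 + exp(-margin))
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    # For negative margins, rewrite to avoid overflow in exp(-margin).
    return -margin + math.log1p(math.exp(margin))


def pairwise_accuracy(scored_pairs):
    """Fraction of (chosen_reward, rejected_reward) pairs ranked correctly."""
    scored_pairs = list(scored_pairs)
    if not scored_pairs:
        raise ValueError("need at least one scored pair")
    correct = sum(1 for r_chosen, r_rejected in scored_pairs if r_chosen > r_rejected)
    return correct / len(scored_pairs)
```

The loss shrinks as the reward margin in favor of the chosen response grows, and pairwise accuracy is the kind of metric typically reported when reward models are compared on preference benchmarks.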

The Power of Human-Machine Collaboration


Skywork has developed a sophisticated two-stage data screening pipeline that integrates human oversight with machine efficiency. This dual approach begins with human annotators constructing a
