Popular Articles

Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment

3 months ago

RewardBench: Evaluating Reward Models for Language Modeling

2 months ago