language-mo

Here is 1 public repository matching this topic...

SJ9VRF / Reinforcement-Learning-for-Human-Feedback-RLHF

This repository implements a Reinforcement Learning from Human Feedback (RLHF) system using custom datasets. The project uses the trlX library to train a preference model that integrates human feedback directly into the optimization of language models.

language-model language-mo llms rlhf reinforcement-learning-from-human-feedback

Updated Aug 17, 2024
Python
