HapticAI

Methodology Followed


Last updated 1 year ago

Reinforcement Learning from Human Feedback (RLHF) is a complex concept to grasp, largely because it involves training multiple models across several stages of deployment, each with its own considerations and requirements. To make the topic more accessible, we deconstruct the training process into three fundamental steps, each of which plays a crucial role in a successful RLHF implementation:

  1. Pretraining a language model

  2. Gathering data and training a preference model

  3. Fine-tuning the language model with reinforcement learning
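This page does not detail how Haptic implements each step, so as an illustration of step 2, here is a minimal, self-contained sketch of training a toy preference model on pairwise human comparisons using a Bradley-Terry style loss, the formulation commonly used in RLHF reward modeling. All names here (`train_preference_model`, the single-weight scoring function) are hypothetical simplifications, not Haptic's actual system:

```python
import math

def bradley_terry_loss(r_chosen, r_rejected):
    # -log sigmoid(r_chosen - r_rejected): small when the human-preferred
    # response is scored higher than the rejected one.
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

def score(w, x):
    # Toy "preference model": a single weight scoring a 1-D feature.
    return w * x

def train_preference_model(pairs, lr=0.1, epochs=200):
    """pairs: list of (x_chosen, x_rejected) feature values, where the
    chosen response's feature should end up scoring higher."""
    w = 0.0
    for _ in range(epochs):
        for xc, xr in pairs:
            diff = score(w, xc) - score(w, xr)
            sig = 1.0 / (1.0 + math.exp(-diff))
            # Gradient of -log sigmoid(diff) with respect to w
            grad = -(1.0 - sig) * (xc - xr)
            w -= lr * grad
    return w
```

After training on pairs where the chosen response has the larger feature value, the learned weight becomes positive, so the model assigns higher reward to preferred responses; in step 3 that scalar reward would drive the reinforcement-learning fine-tuning of the language model.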

Figure: System architecture for reinforcement learning