References

The following papers and articles have inspired our current and upcoming builds of HapticAI:

  • WebGPT: Browser-assisted question-answering with human feedback (OpenAI, 2021)

  • ChatGPT: Optimizing Language Models for Dialogue (OpenAI, 2022)

  • Learning to summarize with human feedback (Stiennon et al., 2020)

  • Recursively Summarizing Books with Human Feedback (OpenAI Alignment Team, 2021)

  • Llama 2: Open Foundation and Fine-Tuned Chat Models (Touvron et al., 2023)

  • Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (Anthropic, 2022)

  • Fine-Tuning Language Models from Human Preferences (Ziegler et al., 2019)

  • InstructGPT: Training language models to follow instructions with human feedback (OpenAI Alignment Team, 2022)

  • GopherCite: Teaching language models to support answers with verified quotes (Menick et al., 2022)

  • Sparrow: Improving alignment of dialogue agents via targeted human judgements (Glaese et al., 2022)

  • Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning (Cohen et al., 2022)

  • Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization (Ramamurthy and Ammanabrolu et al., 2022)
