References

The following papers and articles have inspired the current and upcoming builds of HapticAI:

  • WebGPT: Browser-assisted question-answering with human feedback (OpenAI, 2021)
  • ChatGPT: Optimizing Language Models for Dialogue (OpenAI, 2022)
  • Learning to summarize with human feedback (Stiennon et al., 2020)
  • Recursively Summarizing Books with Human Feedback (OpenAI Alignment Team, 2021)
  • Llama 2 (Touvron et al., 2023)
  • Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (Anthropic, 2022)
  • Fine-Tuning Language Models from Human Preferences (Ziegler et al., 2019)
  • InstructGPT: Training language models to follow instructions with human feedback (OpenAI Alignment Team, 2022)
  • GopherCite: Teaching language models to support answers with verified quotes (Menick et al., 2022)
  • Sparrow: Improving alignment of dialogue agents via targeted human judgements (Glaese et al., 2022)
  • Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning (Cohen et al., 2022)
  • Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization (Ramamurthy and Ammanabrolu et al., 2022)