
References

The following papers and articles have inspired the current and upcoming builds of HapticAI:

  • WebGPT: Browser-assisted question-answering with human feedback (OpenAI, 2021)

  • ChatGPT: Optimizing Language Models for Dialogue (OpenAI, 2022)

  • Learning to summarize with human feedback (Stiennon et al., 2020)

  • Recursively Summarizing Books with Human Feedback (OpenAI Alignment Team, 2021)

  • Llama 2 (Touvron et al., 2023)

  • Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (Anthropic, 2022)

  • Fine-Tuning Language Models from Human Preferences (Ziegler et al., 2019)

  • InstructGPT: Training language models to follow instructions with human feedback (OpenAI Alignment Team, 2022)

  • GopherCite: Teaching language models to support answers with verified quotes (Menick et al., 2022)

  • Sparrow: Improving alignment of dialogue agents via targeted human judgements (Glaese et al., 2022)

  • Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning (Cohen et al., 2022)

  • Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization (Ramamurthy and Ammanabrolu et al., 2022)

