References
The following papers and articles have inspired the current and upcoming builds of HapticAI:
WebGPT: Browser-assisted question-answering with human feedback (OpenAI 2021)
ChatGPT: Optimizing Language Models for Dialogue (OpenAI 2022)
Learning to summarize with human feedback (Stiennon et al. 2020)
Recursively Summarizing Books with Human Feedback (OpenAI Alignment Team 2021)
Llama 2 (Touvron et al. 2023)
Fine-Tuning Language Models from Human Preferences (Ziegler et al. 2019)
InstructGPT: Training language models to follow instructions with human feedback (OpenAI Alignment Team 2022)
GopherCite: Teaching language models to support answers with verified quotes (Menick et al. 2022)
Sparrow: Improving alignment of dialogue agents via targeted human judgements (Glaese et al. 2022)
Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning (Cohen et al. 2022)
Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization (Ramamurthy and Ammanabrolu et al. 2022)