🥅 Objective behind Haptic

Our team believes that the innovative cryptoeconomic incentive and reward mechanisms pioneered by DePIN projects give us a unique opportunity to collect higher-quality human feedback than is achievable on web2 platforms like Amazon's Mechanical Turk. We can then use this superior feedback to enhance Large Language Model (LLM) capabilities through reinforcement learning.

Reinforcement Learning from Human Feedback (RLHF) addresses the challenge of defining what makes a great text response, a subjective and context-specific task. Traditional language models are usually trained with a next-token prediction loss such as cross-entropy. However, this objective may not capture the qualities that make a text compelling.
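To make the distinction concrete, here is a minimal, illustrative sketch in PyTorch; it is not part of Haptic's codebase, and all tensors and values are toy stand-ins. The first loss is the standard cross-entropy next-token prediction objective; the second is a Bradley-Terry-style preference loss of the kind commonly used to train RLHF reward models from human comparisons.

```python
import torch
import torch.nn.functional as F

# Toy vocabulary and a single "sentence" of token ids (stand-ins, not real data).
vocab_size = 10
tokens = torch.tensor([1, 4, 2, 7, 3])

# Random logits standing in for a language model's next-token predictions.
logits = torch.randn(len(tokens) - 1, vocab_size)

# Standard pre-training objective: cross-entropy next-token prediction loss.
next_token_loss = F.cross_entropy(logits, tokens[1:])

# RLHF-style signal: a reward model scores two candidate responses, and a
# Bradley-Terry preference loss pushes the human-preferred ("chosen")
# response's score above the rejected one.
reward_chosen = torch.tensor(1.3)    # stand-in reward for the preferred response
reward_rejected = torch.tensor(0.4)  # stand-in reward for the rejected response
preference_loss = -F.logsigmoid(reward_chosen - reward_rejected)

print(f"next-token loss: {next_token_loss.item():.3f}")
print(f"preference loss: {preference_loss.item():.3f}")
```

The point of the contrast: the next-token loss only rewards matching the training text token by token, while the preference loss is driven directly by human judgments of which response is better, which is the signal RLHF uses.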

Our objective is simple and straightforward: RLHF unlocks substantial model improvements, and we have seen it put to use at university research labs and at corporations like OpenAI, Microsoft, and Anthropic. Haptic is being built to make these tools, along with a network of high-quality human feedback providers, accessible to all AI developers and teams.
