Reinforcement Learning from Human Feedback
From ACT Wiki
imported>Doug Williamson (Create page - sources - Wikipedia - https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback#:~:text=In%20machine%20learning%2C%20reinforcement%20learning,learning%20(RL)%20through%20an%20optimization - ACT - https://www.treasurers.org/hub)
Latest revision as of 22:23, 11 May 2024
Information technology - software - natural language processing - artificial intelligence - chatbots - training.
(RLHF).
Reinforcement Learning from Human Feedback is a training process for machine learning.
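It uses human feedback, or human preferences, to rank or score instances of the behaviour or output of the system being trained, for example ChatGPT. These rankings or scores are typically used to train a reward model, which then guides further optimisation of the system through reinforcement learning.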
The human-supervised RLHF supplements an initial period of unsupervised training known as generative pre-training.
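As an illustration of how human rankings can be turned into numerical scores, the sketch below fits one score per candidate output from pairwise human choices, using the Bradley-Terry preference model. This is a simplified assumption for illustration only: the function name, the sample data and the training settings are hypothetical, and production systems such as ChatGPT train a neural network reward model rather than a table of scores.

```python
import math


def fit_preference_scores(num_items, preferences, steps=500, lr=0.1):
    """Fit one scalar score per item from pairwise human preferences.

    `preferences` is a list of (winner, loser) index pairs. Under the
    Bradley-Terry model, P(winner preferred) = sigmoid(score_w - score_l).
    Scores are fitted by plain gradient ascent on the log-likelihood.
    """
    scores = [0.0] * num_items
    for _ in range(steps):
        grads = [0.0] * num_items
        for winner, loser in preferences:
            diff = scores[winner] - scores[loser]
            p_win = 1.0 / (1.0 + math.exp(-diff))
            # Gradient of log(p_win) with respect to each score.
            grads[winner] += 1.0 - p_win
            grads[loser] -= 1.0 - p_win
        scores = [s + lr * g for s, g in zip(scores, grads)]
    return scores


# Hypothetical feedback: three candidate responses (0, 1, 2) and the
# choices human labellers made when shown them in pairs.
preferences = [(0, 1), (0, 2), (1, 2), (0, 1), (1, 2)]
scores = fit_preference_scores(3, preferences)

# Order the responses from most to least preferred.
ranking = sorted(range(3), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In a full RLHF pipeline, scores of this kind would act as the reward signal that the reinforcement learning step tries to maximise.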
See also
- Artificial intelligence (AI)
- Bot
- Chatbot
- ChatGPT
- Enterprise-wide resource planning system
- Generative pre-trained transformer (GPT)
- Google Gemini
- Information technology
- Machine learning
- Natural language
- Natural language processing
- Robotics
- Software
- Software robot