Reinforcement Learning from AI Feedback: A Review
DOI: https://doi.org/10.32628/CSEIT24104135
Keywords: Reinforcement Learning, AI Generated Feedback, Machine Learning, Reward Signals, Interactive Learning
Abstract
Reinforcement Learning from AI Feedback (RLAIF) is a significant advance over Reinforcement Learning from Human Feedback (RLHF), particularly for large language models such as GPT-4. RLAIF replaces human feedback with AI-generated feedback, which lets it scale to more data and makes training faster and more efficient. It also improves the model's ability to align with desired outcomes, although it may not directly improve the model's understanding of human preferences. RLAIF relies on a Preference Model (PM) that follows constitutional principles, which helps ensure that AI responses are ethical, safe, and high-quality. The constitution sets the rules for AI decision-making so that the system adheres to ethical and social standards, which matters increasingly as AI continues to evolve. RLAIF is thus moving toward an automated, ethically grounded feedback system focused on responsible AI governance and ethical guidelines.
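The feedback loop summarized in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the `CONSTITUTION` dictionary and its keyword-based scoring rules are hypothetical stand-ins for an AI labeler (a real pipeline would query a language model), and the Bradley-Terry style pairwise loss is one common way to train a preference model, not necessarily the formulation the review has in mind.

```python
import math

# Hypothetical constitution: principle name -> toy scoring rule.
# Assumption: a real RLAIF labeler would be a language model judging
# each response against written principles, not keyword rules.
CONSTITUTION = {
    "harmless": lambda text: -1.0 if "insult" in text.lower() else 0.0,
    "helpful": lambda text: min(len(text.split()), 20) / 20.0,
}


def ai_feedback_score(response: str) -> float:
    """Sum the toy per-principle scores for one response."""
    return sum(rule(response) for rule in CONSTITUTION.values())


def ai_preference_label(response_a: str, response_b: str) -> int:
    """Return 0 if the AI labeler prefers response_a, otherwise 1."""
    score_a = ai_feedback_score(response_a)
    score_b = ai_feedback_score(response_b)
    return 0 if score_a >= score_b else 1


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)).

    The rewards are whatever scalar scores the preference model assigns;
    the loss shrinks as the chosen response is scored further above the
    rejected one.
    """
    margin = reward_chosen - reward_rejected
    return math.log1p(math.exp(-margin))


if __name__ == "__main__":
    pair = (
        "Here is a clear step by step answer to your question",
        "insult",
    )
    label = ai_preference_label(*pair)
    print("AI-preferred response:", pair[label])
    print("loss when PM agrees:", preference_loss(2.0, 0.0))
    print("loss when PM disagrees:", preference_loss(0.0, 2.0))
```

In this sketch the AI-generated label takes the place of a human annotation, and the resulting preference pairs would then train the Preference Model whose scores serve as the reward signal during reinforcement learning.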
License
Copyright (c) 2024 International Journal of Scientific Research in Computer Science, Engineering and Information Technology
This work is licensed under a Creative Commons Attribution 4.0 International License.