Constitutional AI (CAI)
Definition
An Anthropic-developed approach to training AI systems to be helpful, harmless, and honest using a set of principles rather than extensive human feedback.
In-Depth Explanation
Constitutional AI uses a set of written principles (a "constitution") to guide AI behavior during training. The model critiques and revises its own outputs against these principles, which reduces the need for human labelers. Because the rules are written down explicitly rather than left implicit in training data, the resulting alignment is more transparent and more consistent.
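The critique-and-revise step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Anthropic's implementation: the principles, the prompt wording, and the names `PRINCIPLES`, `stub_model`, and `critique_and_revise` are all hypothetical, and `stub_model` is a canned stand-in for a real language model call.

```python
from typing import Callable, List

# Hypothetical principles, for illustration only; not an actual constitution.
PRINCIPLES: List[str] = [
    "Avoid content that could help someone cause harm.",
    "Do not say things that are deceptive or misleading.",
]


def stub_model(text: str) -> str:
    """Stand-in for a language model call (assumption: a real model goes here)."""
    if text.startswith("Critique"):
        return "critique: the response could be more careful"
    if text.startswith("Rewrite"):
        return "revised response"
    return "initial draft"


def critique_and_revise(
    prompt: str,
    generate: Callable[[str], str],
    principles: List[str],
    rounds: int = 1,
) -> str:
    """Draft a response, then critique and revise it once per principle."""
    draft = generate(prompt)
    for _ in range(rounds):
        for principle in principles:
            # Ask the model to judge its own draft against one principle.
            critique = generate(
                "Critique the response against this principle.\n"
                f"Principle: {principle}\nResponse: {draft}"
            )
            # Ask the model to rewrite the draft in light of that critique.
            draft = generate(
                "Rewrite the response to address the critique.\n"
                f"Critique: {critique}\nResponse: {draft}"
            )
    return draft


if __name__ == "__main__":
    print(critique_and_revise("How do I pick a lock?", stub_model, PRINCIPLES))
```

In a training setup along these lines, the revised responses, rather than the initial drafts, would be collected as training data, which is where the reduced reliance on human labelers comes from.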
Real-World Example
Claude is trained using Constitutional AI principles that include being helpful while avoiding harm and deception.