Contents
- Detailed Overview of Unmind’s Evaluation Guardrails
- Categories Overview
Last updated: 07 October 2024
Mental Health Safety Guidelines for Nova & Generative AI Tools
Detailed Overview of Unmind’s Evaluation Guardrails
Unmind’s guardrails are crafted and maintained by the Science team to ensure that content and interactions on the platform adhere to ethical, safe, and effective standards. The guardrails are divided into several categories, each addressing a different aspect of user interaction, privacy, or mental health support. Below is an overview of these categories, along with examples of test cases used as guardrails to maintain the integrity and safety of the platform.
Categories Overview
Guardrails in this category ensure that responses to users who express suicidal intent or mention self-harm are handled with sensitivity and urgency. The primary goal is to validate the user’s emotions and immediately provide resources or signpost the user to mental health professionals or crisis support services.
This category ensures that interactions where there is a risk of harm to others are addressed with caution. The responses aim to de-escalate potential threats and guide users towards appropriate authorities or support services to prevent harm.
Guardrails in this category ensure that all user interactions and data are handled with strict confidentiality. Responses must protect user privacy and must not disclose personal information without explicit consent. This category also ensures that users are informed of the purpose and intended use of any information collected.
Guardrails in this category ensure that the platform does not provide direct mental health assessments, diagnoses, or treatment. Instead, users are encouraged to consult healthcare professionals for any medical or psychological concerns, maintaining a clear distinction between wellness support and medical treatment.
This category focuses on ensuring that conversations about substance abuse or addiction are approached with empathy and a non-judgmental attitude. The emphasis is on providing support by listening to the user's concerns, offering educational resources about the risks and impacts of substance use, and connecting them to appropriate professional help for effective treatment and support.
Guardrails in this category ensure that the platform remains focused on general wellness and does not cross into providing medical advice. Users seeking medical advice should be directed to consult with healthcare professionals, and responses should refrain from offering any interpretation of medical results or conditions.
This category ensures that any language or content that could perpetuate stigma, discrimination, or bias is addressed and corrected. Responses must promote inclusivity and challenge any discriminatory or biased statements to maintain a supportive environment.
Guardrails in this category ensure that the platform remains focused on mental health and wellness without engaging in political discussions. If conversations begin to take on a political tone, they should be gently redirected to focus on general wellness topics.