Our Safety Philosophy
At DivyCHI, we believe that the development of powerful AI systems carries profound responsibility. Safety is not a feature we add at the end — it is a constraint that shapes every architectural decision, training run, and deployment choice we make.
Our Existence Intelligence architecture is designed from the ground up with the principle that intelligence without alignment is dangerous. We hold that it is possible — and necessary — to build AI systems that are highly capable and reliably safe. We reject the framing that safety and capability are in fundamental tension.
We operate under the conviction that the organisations building the most capable AI systems bear the greatest responsibility to invest in safety research, share findings with the broader community, and subject their systems to independent scrutiny.
Core Safety Principles
Every system we build and deploy is governed by six non-negotiable safety principles:
Model Development & Testing
Safety is integrated at every stage of the model lifecycle. We do not treat safety evaluation as a final gate — it informs architecture, dataset curation, training objectives, and fine-tuning from the very beginning.
Pre-Training
- All training data undergoes automated and human-reviewed filtering to remove CSAM, detailed instructions for mass-casualty weapons, and content designed to facilitate targeted violence.
- Data provenance is documented. We maintain records of data sources, licensing status, and any known safety concerns associated with training corpora.
- Bias audits are conducted on curated training samples to surface demographic, cultural, and linguistic disparities before training begins.
Alignment & Fine-Tuning
- We apply Reinforcement Learning from Human Feedback (RLHF) and Constitutional AI (CAI) techniques to align model outputs with DivyCHI's safety principles.
- Fine-tuning datasets are curated with adversarial examples to ensure the model refuses harmful requests robustly, not merely in easy cases.
- After every significant training update, and before any model advances to staging, alignment evaluations are run against a standardised battery of safety benchmarks.
Deployment Gates
- No model is promoted to production without sign-off from both our internal Safety Review Board and an independent external evaluation partner.
- Models that cross critical-risk capability thresholds (e.g., autonomous code execution, long-horizon agentic tasks) require additional safety review and a staged rollout.
- All production models are versioned and immutable once deployed; changes require a new review cycle.
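The gating rules above can be read as a simple predicate. The following is a minimal sketch under stated assumptions, not DivyCHI's actual release tooling: `ReleaseCandidate`, its field names, and `can_promote` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


# Hypothetical record. frozen=True mirrors the rule that a deployed
# version is immutable and any change needs a new review cycle.
@dataclass(frozen=True)
class ReleaseCandidate:
    version: str
    board_signoff: bool          # internal Safety Review Board
    external_signoff: bool       # independent external evaluation partner
    crosses_critical_threshold: bool = False
    additional_review_done: bool = False


def can_promote(rc: ReleaseCandidate) -> bool:
    """Return True only if every deployment gate listed above is met."""
    # Both sign-offs are mandatory for any promotion to production.
    if not (rc.board_signoff and rc.external_signoff):
        return False
    # Critical-risk capabilities need the additional safety review as well.
    if rc.crosses_critical_threshold and not rc.additional_review_done:
        return False
    return True
```

For example, `can_promote(ReleaseCandidate("v1", True, False))` is `False` because the external sign-off is missing. Staged rollout is not modelled here; it would sit downstream of this check.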
Deployment Safeguards
Safe model behaviour during training is necessary but not sufficient. We operate multiple layers of runtime safeguards to ensure safe outputs in production.
| Safeguard Layer | Mechanism | Scope |
|---|---|---|
| Input Filtering | Classifier-based screening of incoming prompts for policy violations before model inference | All API and product endpoints |
| Output Moderation | Post-generation review of model outputs before delivery to the user | All consumer-facing products |
| Rate Limiting | Per-user and per-organisation throttling to prevent automated abuse at scale | All API access tiers |
| Agentic Guardrails | Capability restrictions and confirmation gates for agents taking real-world actions (file writes, API calls, browser actions) | DivyCHI Agents & API with tool use |
| Anomaly Detection | Real-time monitoring for unusual usage patterns indicative of policy circumvention or coordinated abuse | All authenticated sessions |
| Session Sandboxing | Code execution and file operations run in isolated containers that, by default, have no access to the user's system or to external networks | Code interpreter feature |
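To illustrate how the first two layers in the table might compose around a model call, here is a minimal sketch. It assumes each layer can be expressed as a function returning `True` when the text passes; `guarded_generate` and its parameters are hypothetical and do not describe DivyCHI's production interfaces.

```python
from typing import Callable, Optional, Tuple

# All three callables are stand-ins (assumptions): a real deployment would
# plug in trained classifiers and an actual model endpoint.
TextCheck = Callable[[str], bool]  # True means the text passes the check


def guarded_generate(
    prompt: str,
    input_filter: TextCheck,
    model: Callable[[str], str],
    output_moderator: TextCheck,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (output, blocked_by); exactly one of the two is None."""
    if not input_filter(prompt):
        return None, "input_filtering"    # the model is never invoked
    output = model(prompt)
    if not output_moderator(output):
        return None, "output_moderation"  # the output is never delivered
    return output, None
```

Ordering the checks this way means a prompt rejected at the input layer incurs no inference at all, while the output layer still reviews responses to prompts that looked benign.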
Prohibited Use Cases
The following uses of DivyCHI products are strictly prohibited, regardless of the account tier, stated purpose, or operator configuration. These restrictions are hardcoded and cannot be unlocked by any operator or user instruction.
- Weapons of mass destruction: providing technical uplift for the development, synthesis, acquisition, or deployment of chemical, biological, radiological, or nuclear weapons.
- Child sexual abuse material (CSAM): generating, describing, or facilitating any content that sexually exploits minors, whether fictional or otherwise.
- Critical infrastructure attacks: generating plans, code, or guidance designed to disable or damage power grids, water systems, financial systems, or healthcare infrastructure.
- Cyberweapons: developing functional malware, ransomware, zero-day exploits, or tools designed to cause significant damage if deployed.
- Targeted violence: generating credible threats or operational plans to harm specific identified individuals or groups.
- Undermining AI oversight: assisting any effort to disable, circumvent, or sabotage the safety systems, monitoring infrastructure, or human oversight mechanisms of any AI system.
- Influence operations at scale: generating synthetic personas, coordinated inauthentic content, or disinformation campaigns designed to manipulate democratic processes.
Harm Classification Framework
Not all potentially harmful outputs are equally serious. We use a structured harm classification to calibrate our response — from outright refusal to cautious assistance with appropriate caveats.
| Severity | Description | Model Response |
|---|---|---|
| Critical | Mass-casualty potential; CSAM; critical infrastructure; cyberweapons; election manipulation at scale | Absolute refusal; account flagged for review |
| High | Targeted violence; self-harm facilitation; non-consensual intimate imagery; coordinated fraud | Refusal; safety resource offered where appropriate |
| Medium | Deceptive content with significant real-world impact; harassment; serious privacy violations | Refusal or significant modification; explanation provided |
| Low | Mildly harmful, dual-use, or context-dependent content | Contextual evaluation; assistance with caveats or redirection |
| Edge Case | Ambiguous requests where intent is unclear and harm is speculative | Charitable interpretation; clarifying question or conservative assistance |
Harm assessment is contextual. The same information may be freely provided in one context (e.g., a medical professional asking about overdose thresholds) and withheld in another (e.g., an anonymous user with expressed suicidal ideation). Our systems and policies are designed to support this contextual reasoning.
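The severity table above can be represented as a lookup from severity tier to response. The sketch below is illustrative only: the `Severity` enum, `MODEL_RESPONSE`, and `respond_to` are hypothetical names, and it deliberately leaves out the contextual assessment that decides which tier a request falls into.

```python
from enum import Enum


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"
    EDGE_CASE = "Edge Case"


# Response text taken from the "Model Response" column of the table above.
MODEL_RESPONSE = {
    Severity.CRITICAL: "Absolute refusal; account flagged for review",
    Severity.HIGH: "Refusal; safety resource offered where appropriate",
    Severity.MEDIUM: "Refusal or significant modification; explanation provided",
    Severity.LOW: "Contextual evaluation; assistance with caveats or redirection",
    Severity.EDGE_CASE: "Charitable interpretation; clarifying question or conservative assistance",
}


def respond_to(severity: Severity) -> dict:
    """Look up the response and whether the account is flagged for review."""
    return {
        "response": MODEL_RESPONSE[severity],
        # Per the table, only the Critical tier flags the account.
        "flag_account": severity is Severity.CRITICAL,
    }
```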
Human Oversight & Control
We believe that during the current period of AI development, maintaining robust human oversight of AI systems is not optional — it is a categorical requirement for responsible deployment.
Design Constraints
- No DivyCHI system is designed or configured to pursue long-horizon goals autonomously without periodic human checkpoints.
- Agentic systems are designed to request confirmation before taking irreversible actions (deleting files, sending communications, making purchases, or modifying production systems).
- All agentic sessions produce a full action log auditable by the authorising user or operator.
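A confirmation gate combined with an action log, as described in the constraints above, might look like the following minimal sketch. The action names, the `AgentSession` class, and the `confirm` callback are all assumptions made for illustration and are not DivyCHI's agent API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List

# Hypothetical action names for the irreversible actions listed above.
IRREVERSIBLE_ACTIONS = {
    "delete_file", "send_communication", "make_purchase", "modify_production",
}


@dataclass
class AgentSession:
    confirm: Callable[[str, str], bool]  # asks the human; True = approved
    action_log: List[Dict[str, object]] = field(default_factory=list)

    def perform(self, action: str, target: str, run: Callable[[], None]) -> bool:
        """Run the action if allowed; log the attempt either way."""
        needs_confirmation = action in IRREVERSIBLE_ACTIONS
        approved = self.confirm(action, target) if needs_confirmation else True
        if approved:
            run()
        # Declined actions are logged too, so the audit trail is complete.
        self.action_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "target": target,
            "confirmation_requested": needs_confirmation,
            "executed": approved,
        })
        return approved
```

Logging declined attempts as well as executed ones is a design choice in this sketch: it lets the authorising user or operator see what the agent tried to do, not only what it did.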
Operator Controls
- Enterprise operators may restrict model capabilities, adjust content policies within DivyCHI's baseline floor, and configure mandatory human-in-the-loop steps for high-stakes workflows.
- All operator configuration changes are logged with timestamps and responsible party attribution.
- DivyCHI retains the right to override operator configurations that would cause the platform to facilitate clearly illegal or severely harmful outcomes.
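One way to picture "within DivyCHI's baseline floor" is as a clamp: operators can tighten a setting but never loosen it below the floor. The sketch below assumes content policies can be expressed as integer strictness levels per category, which is a simplification; the category names, levels, and `effective_policy` are hypothetical.

```python
from typing import Dict

# Hypothetical categories and strictness levels (higher = stricter).
BASELINE_FLOOR: Dict[str, int] = {"harassment": 2, "self_harm": 3}


def effective_policy(operator_settings: Dict[str, int]) -> Dict[str, int]:
    """Apply operator settings, never dropping below the baseline floor."""
    policy = dict(BASELINE_FLOOR)
    for category, level in operator_settings.items():
        # max() keeps the stricter of the operator's value and the floor.
        policy[category] = max(level, BASELINE_FLOOR.get(category, 0))
    return policy
```

Under these assumptions, an operator asking for `{"harassment": 1, "self_harm": 5}` would end up with harassment held at the floor of 2 and self-harm raised to 5.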
Red-Teaming & Safety Audits
Internal testing against our own policies is necessary but not sufficient. We invest in adversarial evaluation by parties motivated to find failures, not validate assumptions.
Incident Response
When safety incidents occur — and in a system operating at scale, some will — our priority is to act swiftly, communicate transparently, and learn systematically.
Detection & Triage
- Automated monitoring triggers alerts for high-severity policy violations, unusual usage spikes, and unexpected model behaviours within minutes of detection.
- An on-call Safety Incident Commander is available 24/7 to triage and escalate critical incidents.
- Incidents are classified by severity (P0–P3) within one hour of detection, with corresponding response SLAs.
Containment & Mitigation
- Affected capabilities may be temporarily disabled or access suspended while an incident is under investigation — user impact is minimised but safety takes precedence.
- Targeted model patches or classifier updates can be deployed without a full model update cycle for critical-severity incidents.
Post-Incident Review
- All P0 and P1 incidents receive a written post-mortem within 7 days, reviewed by the Safety Review Board.
- Action items from post-mortems are tracked with owners and deadlines. Recurring failure patterns trigger systemic remediation projects.
- Material incidents affecting user safety or trust are disclosed publicly in our transparency report within 90 days, where legal constraints permit.
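The time limits stated in this section (classification within one hour, post-mortem within 7 days for P0 and P1, public disclosure within 90 days for material incidents) can be turned into concrete deadlines. This sketch assumes every clock starts at the moment of detection, which the policy does not specify for the post-mortem and disclosure limits; `incident_deadlines` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict


def incident_deadlines(
    detected_at: datetime, priority: str, material: bool = False
) -> Dict[str, datetime]:
    """Compute deadlines, assuming every clock starts at detection time."""
    if priority not in {"P0", "P1", "P2", "P3"}:
        raise ValueError(f"unknown priority: {priority}")
    # Every incident must be classified within one hour of detection.
    deadlines = {"classify_by": detected_at + timedelta(hours=1)}
    # Only P0 and P1 incidents require a written post-mortem.
    if priority in {"P0", "P1"}:
        deadlines["post_mortem_by"] = detected_at + timedelta(days=7)
    # Material incidents are disclosed publicly, where legally permitted.
    if material:
        deadlines["disclose_by"] = detected_at + timedelta(days=90)
    return deadlines
```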
Research & Transparency
We believe that AI safety is a collective problem requiring open collaboration. DivyCHI contributes to the broader safety research ecosystem through publications, benchmark participation, and responsible disclosure.
Our Research Commitments
- We publish safety-relevant research findings in peer-reviewed venues and pre-print servers, subject to security review to avoid enabling misuse.
- We participate in cross-industry safety initiatives, evaluation frameworks and benchmarks (e.g., HELM, MMLU-Pro, SafeBench), and relevant standards bodies.
- We share non-proprietary safety tooling and evaluation methodologies with the research community where feasible.
Transparency Report
- DivyCHI publishes an annual AI Safety Transparency Report covering safety incidents, red-team findings, policy enforcement statistics, and safety research updates.
- The report is available at divychi.com/transparency and archived publicly for historical reference.
Governance Structure
Safety accountability at DivyCHI is structured, not diffuse. Clear ownership ensures that safety obligations are enforced, not merely aspired to.
| Body / Role | Composition | Responsibility |
|---|---|---|
| Safety Review Board | Chief Safety Officer, CTO, two independent external advisors | Final sign-off on all model releases; oversight of P0/P1 incidents; policy amendments |
| Chief Safety Officer | Senior executive reporting to CEO | Day-to-day safety operations; policy enforcement; external regulatory liaison |
| Red Team | Dedicated internal team of 4+ safety researchers | Adversarial evaluation of all models prior to deployment |
| Ethics Committee | Cross-functional; includes external civil society representation | Review of novel capability deployments; fairness and societal impact assessment |
| On-Call Safety Incident Commander | Rotating senior engineer from safety team | 24/7 incident triage and escalation |
The Safety Review Board operates with independent authority over deployment decisions and may halt or roll back a model release against the recommendation of the commercial or product teams. This independence is preserved in our company charter.
Changes to This Policy
This AI Safety Policy will be updated as our technology, research, and regulatory environment evolve. We are committed to transparency about those changes.
- Material changes will be announced via email to registered API partners and Enterprise customers at least 14 days before taking effect.
- All historical versions of this policy will remain publicly accessible at divychi.com/safety/archive.
- A summary of substantive changes will accompany each revision, clearly identifying what was added, removed, or modified.
- The "Last Updated" date at the top of this page reflects the most recent revision, even for minor clarifications.
Contact & Reporting
We welcome safety-related feedback, vulnerability disclosures, and questions about this policy from users, researchers, regulators, and the public.
General safety enquiries: safety@divychi.com
Vulnerability disclosure: security@divychi.com
Policy questions: policy@divychi.com
For urgent safety incidents involving active harm, please use safety@divychi.com with the subject line "URGENT — Safety Incident". Our on-call team monitors this inbox 24 hours a day.