DivyCHI | AI Safety Policy

Section 01

Our Safety Philosophy

At DivyCHI, we believe that the development of powerful AI systems carries profound responsibility. Safety is not a feature we add at the end — it is a constraint that shapes every architectural decision, training run, and deployment choice we make.

Our Existence Intelligence architecture is designed from the ground up with the principle that intelligence without alignment is dangerous. We hold that it is possible — and necessary — to build AI systems that are highly capable and reliably safe. We reject the framing that safety and capability are in fundamental tension.

We operate under the conviction that the organisations building the most capable AI systems bear the greatest responsibility to invest in safety research, share findings with the broader community, and subject their systems to independent scrutiny.

DivyCHI is committed to the principle that no commercial objective justifies releasing AI systems whose risks to users or society are not adequately understood and mitigated.

Section 02

Core Safety Principles

Every system we build and deploy is governed by six non-negotiable safety principles:

Non-Maleficence

Our systems must not cause harm to users, third parties, or society. We apply strict content policies and capability restrictions to enforce this baseline.

Honesty & Calibration

Our models are trained to express uncertainty truthfully, avoid confabulation, and never attempt to deceive or manipulate users.

Human Autonomy

We design systems that support human decision-making, never supplant it. Our AI assists and informs; final authority remains with the user.

Controllability

Systems must remain correctable and interruptible by authorised human operators at all stages of operation. Autonomous escalation without human oversight is not permitted.

Robustness

Safety behaviours must hold under adversarial conditions, prompt injection, jailbreak attempts, and distribution shift — not just in benign settings.

Fairness & Inclusion

We work to identify and mitigate demographic bias in model outputs, striving for consistent quality and respect across all user groups and languages.

Section 03

Model Development & Testing

Safety is integrated at every stage of the model lifecycle. We do not treat safety evaluation as a final gate — it informs architecture, dataset curation, training objectives, and fine-tuning from the very beginning.

Pre-Training

All training data undergoes automated and human-reviewed filtering to remove CSAM, detailed instructions for mass-casualty weapons, and content designed to facilitate targeted violence.
Data provenance is documented. We maintain records of data sources, licensing status, and any known safety concerns associated with training corpora.
Bias audits are conducted on curated training samples to surface demographic, cultural, and linguistic disparities before training begins.

Alignment & Fine-Tuning

We apply Reinforcement Learning from Human Feedback (RLHF) and Constitutional AI (CAI) techniques to align model outputs with DivyCHI's safety principles.
Fine-tuning datasets are curated with adversarial examples to ensure the model refuses harmful requests robustly, not merely in easy cases.
Alignment evaluations are run across a standardised battery of safety benchmarks after every significant training update and before any model advances to staging.

Deployment Gates

No model is promoted to production without sign-off from both our internal Safety Review Board and an independent external evaluation partner.
Critical-risk capability thresholds (e.g., autonomous code execution, long-horizon agentic tasks) require additional safety review and staged rollout.
All production models are versioned and immutable once deployed; changes require a new review cycle.
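
The gating rules above amount to a conjunction of checks. The following is a minimal sketch of that logic, not DivyCHI's actual release tooling: `ReleaseCandidate`, its field names, and `may_promote` are hypothetical names introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseCandidate:
    """Hypothetical record of the approvals gathered for one model version."""
    version: str
    internal_board_signoff: bool      # Safety Review Board approval
    external_partner_signoff: bool    # independent external evaluation
    critical_risk_capabilities: frozenset = frozenset()  # e.g. {"autonomous_code_execution"}
    additional_review_done: bool = False
    staged_rollout_planned: bool = False


def may_promote(rc: ReleaseCandidate) -> bool:
    """Return True only if every gate listed in the policy is satisfied."""
    # Both sign-offs are required; either one alone is not enough.
    if not (rc.internal_board_signoff and rc.external_partner_signoff):
        return False
    # Critical-risk capabilities also need extra review and a staged rollout.
    if rc.critical_risk_capabilities:
        return rc.additional_review_done and rc.staged_rollout_planned
    return True
```

Making the record frozen mirrors the immutability requirement: a change to an approved version would be expressed as a new `ReleaseCandidate` that goes through the checks again.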

Section 04

Deployment Safeguards

Safe model behaviour during training is necessary but not sufficient. We operate multiple layers of runtime safeguards to ensure safe outputs in production.

Safeguard Layer	Mechanism	Scope
Input Filtering	Classifier-based screening of incoming prompts for policy violations before model inference	All API and product endpoints
Output Moderation	Post-generation review of model outputs before delivery to the user	All consumer-facing products
Rate Limiting	Per-user and per-organisation throttling to prevent automated abuse at scale	All API access tiers
Agentic Guardrails	Capability restrictions and confirmation gates for agents taking real-world actions (file writes, API calls, browser actions)	DivyCHI Agents & API with tool use
Anomaly Detection	Real-time monitoring for unusual usage patterns indicative of policy circumvention or coordinated abuse	All authenticated sessions
Session Sandboxing	Code execution and file operations occur in isolated containers with no access to user system or external network by default	Code interpreter feature

Safeguard configurations for Enterprise deployments are auditable and configurable within policy-defined bounds. Operators may tighten but not loosen DivyCHI's baseline safety restrictions.
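
One way to picture the "tighten but not loosen" rule is as a validation step applied to every operator override. The sketch below assumes restrictions can be expressed as numeric strictness levels per category, which this policy does not specify; `BASELINE`, the category names, and `apply_operator_overrides` are all hypothetical.

```python
# Hypothetical baseline: a higher number means a stricter setting.
# The categories and levels are illustrative, not DivyCHI's real configuration.
BASELINE = {"violence": 2, "self_harm": 3, "privacy": 2}


def apply_operator_overrides(baseline: dict, overrides: dict) -> dict:
    """Merge operator overrides, rejecting any that fall below the baseline."""
    effective = dict(baseline)  # copy so the baseline itself is never mutated
    for category, level in overrides.items():
        if category not in baseline:
            raise KeyError(f"unknown policy category: {category}")
        if level < baseline[category]:
            raise ValueError(
                f"{category}: level {level} is looser than baseline {baseline[category]}"
            )
        effective[category] = level
    return effective
```

Under these assumptions, `apply_operator_overrides(BASELINE, {"violence": 3})` succeeds, while `{"privacy": 1}` raises `ValueError` because it would drop below the baseline floor.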

Section 05

Prohibited Use Cases

The following uses of DivyCHI products are strictly prohibited, regardless of the account tier, stated purpose, or operator configuration. These restrictions are hardcoded and cannot be unlocked by any operator or user instruction.

Weapons of mass destruction: providing technical uplift for the development, synthesis, acquisition, or deployment of chemical, biological, radiological, or nuclear weapons.
Child sexual abuse material (CSAM): generating, describing, or facilitating any content that sexually exploits minors, fictionally or otherwise.
Critical infrastructure attacks: generating plans, code, or guidance designed to disable or damage power grids, water systems, financial systems, or healthcare infrastructure.
Cyberweapons: developing functional malware, ransomware, zero-day exploits, or tools designed to cause significant damage if deployed.
Targeted violence: generating credible threats or operational plans to harm specific identified individuals or groups.
Undermining AI oversight: assisting any effort to disable, circumvent, or sabotage the safety systems, monitoring infrastructure, or human oversight mechanisms of any AI system.
Influence operations at scale: generating synthetic personas, coordinated inauthentic content, or disinformation campaigns designed to manipulate democratic processes.

Attempts to circumvent these restrictions, whether through jailbreaks, role-play or hypothetical framing, or indirect elicitation, constitute a violation of our Terms of Service and will result in immediate account termination.

Section 06

Harm Classification Framework

Not all potentially harmful outputs are equally serious. We use a structured harm classification to calibrate our response — from outright refusal to cautious assistance with appropriate caveats.

Severity	Description	Model Response
Critical	Mass-casualty potential; CSAM; critical infrastructure; cyberweapons; election manipulation at scale	Absolute refusal; account flagged for review
High	Targeted violence; self-harm facilitation; non-consensual intimate imagery; coordinated fraud	Refusal; safety resource offered where appropriate
Medium	Deceptive content with significant real-world impact; harassment; serious privacy violations	Refusal or significant modification; explanation provided
Low	Mildly harmful, dual-use, or context-dependent content	Contextual evaluation; assistance with caveats or redirection
Edge Case	Ambiguous requests where intent is unclear and harm is speculative	Charitable interpretation; clarifying question or conservative assistance

Harm assessment is contextual. The same information may be freely provided in one context (e.g., a medical professional asking about overdose thresholds) and withheld in another (e.g., an anonymous user with expressed suicidal ideation). Our systems and policies are designed to support this contextual reasoning.
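
The severity table can be read as a lookup from an assessed severity to a response style. The sketch below shows only that final lookup; the `Severity` enum, `RESPONSE_POLICY`, and `respond_to` are hypothetical names, and the contextual assessment that decides which severity applies is deliberately left out because the policy does not describe how it is implemented.

```python
from enum import Enum


class Severity(Enum):
    """The five tiers from the harm classification table."""
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"
    EDGE_CASE = "edge_case"


# Response text follows the "Model Response" column of the table.
RESPONSE_POLICY = {
    Severity.CRITICAL: "absolute refusal; account flagged for review",
    Severity.HIGH: "refusal; safety resource offered where appropriate",
    Severity.MEDIUM: "refusal or significant modification; explanation provided",
    Severity.LOW: "contextual evaluation; assistance with caveats or redirection",
    Severity.EDGE_CASE: "charitable interpretation; clarifying question or conservative assistance",
}


def respond_to(severity: Severity) -> str:
    """Look up the response style for an already-assessed severity."""
    return RESPONSE_POLICY[severity]
```

Keeping the mapping separate from the assessment reflects the point made above: the same request can land in different tiers depending on context, but once a tier is assigned the response style is fixed.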

Section 07

Human Oversight & Control

We believe that during the current period of AI development, maintaining robust human oversight of AI systems is not optional — it is a categorical requirement for responsible deployment.

Design Constraints

No DivyCHI system is designed or configured to pursue long-horizon goals autonomously without periodic human checkpoints.
Agentic systems are designed to request confirmation before taking irreversible actions (deleting files, sending communications, making purchases, or modifying production systems).
All agentic sessions produce a full action log auditable by the authorising user or operator.
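
The confirmation and logging constraints above can be sketched as a thin wrapper around each agent action. This is an illustrative outline under stated assumptions, not DivyCHI's agent runtime: `AgentSession`, `IRREVERSIBLE`, the action labels, and the `confirm` callback are hypothetical.

```python
from datetime import datetime, timezone
from typing import Callable

# Action kinds the policy lists as irreversible; the string labels are illustrative.
IRREVERSIBLE = {"delete_file", "send_communication", "make_purchase", "modify_production"}


class AgentSession:
    """Hypothetical session that logs every action and gates irreversible ones."""

    def __init__(self, confirm: Callable[[str, str], bool]):
        self._confirm = confirm  # asks the authorising user; True means proceed
        self.action_log = []

    def perform(self, kind: str, target: str, run: Callable[[], None]) -> bool:
        needs_confirmation = kind in IRREVERSIBLE
        approved = self._confirm(kind, target) if needs_confirmation else True
        if approved:
            run()
        # Record the attempt whether or not it was approved, so the log is complete.
        self.action_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "kind": kind,
            "target": target,
            "needed_confirmation": needs_confirmation,
            "executed": approved,
        })
        return approved
```

Declined actions are logged alongside executed ones so that an auditor can see what the agent attempted, not only what it did.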

Operator Controls

Enterprise operators may restrict model capabilities, adjust content policies within DivyCHI's baseline floor, and configure mandatory human-in-the-loop steps for high-stakes workflows.
All operator configuration changes are logged with timestamps and responsible party attribution.
DivyCHI retains the right to override operator configurations that would cause the platform to facilitate clearly illegal or severely harmful outcomes.

DivyCHI systems are explicitly trained to be corrigible — to accept correction, shutdown, and modification from authorised principals — even when such correction conflicts with task completion.

Section 08

Red-Teaming & Safety Audits

Internal testing against our own policies is necessary but not sufficient. We invest in adversarial evaluation by parties motivated to find failures, not validate assumptions.

1. Internal Red Team

A dedicated internal red team conducts structured adversarial testing against every model prior to production release. Red-team findings are tracked, prioritised, and must be addressed or formally accepted before deployment sign-off.

2. Third-Party Safety Evaluation

We commission independent safety evaluations from recognised AI safety research organisations before major model releases. Reports are reviewed by our Safety Review Board and material findings are published in our transparency disclosures.

3. Bug Bounty Programme

Our responsible disclosure programme rewards external researchers who identify safety-relevant vulnerabilities in our systems. Scope includes alignment failures, jailbreaks with significant uplift, and privacy-critical bugs. Reports are acknowledged within 48 hours.

4. Ongoing Monitoring

Production systems are continuously monitored for anomalous output patterns, policy violations, and emerging misuse vectors. Monitoring findings feed back into our safety roadmap on a quarterly basis.

Our current red-team programme covers: harmful content generation, jailbreaks and prompt injection, dangerous capability elicitation, bias and fairness failures, and agentic misuse scenarios.

Section 09

Incident Response

When safety incidents occur — and in a system operating at scale, some will — our priority is to act swiftly, communicate transparently, and learn systematically.

Detection & Triage

Automated monitoring triggers alerts for high-severity policy violations, unusual usage spikes, and unexpected model behaviours within minutes of detection.
An on-call Safety Incident Commander is available 24/7 to triage and escalate critical incidents.
Incidents are classified by severity (P0–P3) within one hour of detection, with corresponding response SLAs.

Containment & Mitigation

Affected capabilities may be temporarily disabled or access suspended while an incident is under investigation — user impact is minimised but safety takes precedence.
Targeted model patches or classifier updates can be deployed without a full model update cycle for critical-severity incidents.

Post-Incident Review

All P0 and P1 incidents receive a written post-mortem within 7 days, reviewed by the Safety Review Board.
Action items from post-mortems are tracked with owners and deadlines. Recurring failure patterns trigger systemic remediation projects.
Material incidents affecting user safety or trust are disclosed publicly in our transparency report within 90 days, where legal constraints permit.
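
The time limits in this section can be expressed as a small deadline calculation. The sketch below assumes every window is measured from the moment of detection, which the policy states only for classification; `incident_deadlines` and the constant names are hypothetical, and the per-severity response SLAs are not modelled because their values are not given here.

```python
from datetime import datetime, timedelta

CLASSIFICATION_WINDOW = timedelta(hours=1)  # severity assigned within one hour
POSTMORTEM_WINDOW = timedelta(days=7)       # written post-mortem, P0 and P1 only
DISCLOSURE_WINDOW = timedelta(days=90)      # material incidents, where legally permitted


def incident_deadlines(detected_at: datetime, severity: str, material: bool) -> dict:
    """Return the due dates implied by the policy for a single incident."""
    if severity not in {"P0", "P1", "P2", "P3"}:
        raise ValueError(f"unknown severity: {severity}")
    deadlines = {"classify_by": detected_at + CLASSIFICATION_WINDOW}
    if severity in {"P0", "P1"}:
        deadlines["postmortem_by"] = detected_at + POSTMORTEM_WINDOW
    if material:
        # Assumption: the 90 days run from detection, not from resolution.
        deadlines["disclose_by"] = detected_at + DISCLOSURE_WINDOW
    return deadlines
```

For a material P0 incident detected at noon on 1 January 2025, this yields a classification deadline of 13:00 the same day, a post-mortem due on 8 January, and disclosure due by 1 April.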

If you discover a safety-relevant incident or vulnerability in DivyCHI systems, please report it immediately: incidents to safety@divychi.com and vulnerabilities to security@divychi.com. Do not publicly disclose vulnerabilities before coordinating with us.

Section 10

Research & Transparency

We believe that AI safety is a collective problem requiring open collaboration. DivyCHI contributes to the broader safety research ecosystem through publications, benchmark participation, and responsible disclosure.

Our Research Commitments

We publish safety-relevant research findings in peer-reviewed venues and pre-print servers, subject to security review to avoid enabling misuse.
We participate in cross-industry safety initiatives, evaluation frameworks (e.g., HELM, MMLU-Pro, SafeBench), and relevant standards bodies.
We share non-proprietary safety tooling and evaluation methodologies with the research community where feasible.

Transparency Report

DivyCHI publishes an annual AI Safety Transparency Report covering: safety incidents, red-team findings, policy enforcement statistics, and safety research updates.
The report is available at divychi.com/transparency and archived publicly for historical reference.

We believe that where safety research is published, the benefit of informing defences outweighs the risk of informing attacks in the vast majority of cases. We err on the side of openness, with narrow exceptions for genuinely dual-use findings.

Section 11

Governance Structure

Safety accountability at DivyCHI is structured, not diffuse. Clear ownership ensures that safety obligations are enforced, not merely aspired to.

Body / Role	Composition	Responsibility
Safety Review Board	Chief Safety Officer, CTO, two independent external advisors	Final sign-off on all model releases; oversight of P0/P1 incidents; policy amendments
Chief Safety Officer	Senior executive reporting to CEO	Day-to-day safety operations; policy enforcement; external regulatory liaison
Red Team	Dedicated internal team of 4+ safety researchers	Adversarial evaluation of all models prior to deployment
Ethics Committee	Cross-functional; includes external civil society representation	Review of novel capability deployments; fairness and societal impact assessment
On-Call Safety Incident Commander	Rotating senior engineer from safety team	24/7 incident triage and escalation

The Safety Review Board operates with independent authority over deployment decisions and may halt or roll back a model release against the recommendation of the commercial or product teams. This independence is preserved in our company charter.

Section 12

Changes to This Policy

This AI Safety Policy will be updated as our technology, research, and regulatory environment evolve. We are committed to transparency about those changes.

Material changes will be announced via email to registered API partners and Enterprise customers at least 14 days before taking effect.
All historical versions of this policy will remain publicly accessible at divychi.com/safety/archive.
A summary of substantive changes will accompany each revision, clearly identifying what was added, removed, or modified.
The "Last Updated" date at the top of this page reflects the most recent revision, even for minor clarifications.

Section 13

Contact & Reporting

We welcome safety-related feedback, vulnerability disclosures, and questions about this policy from users, researchers, regulators, and the public.

AI Safety Team — DIVY CHI Pvt. Ltd.
General safety enquiries: safety@divychi.com
Vulnerability disclosure: security@divychi.com
Policy questions: policy@divychi.com

For urgent safety incidents involving active harm, please use safety@divychi.com with the subject line "URGENT — Safety Incident". Our on-call team monitors this inbox 24 hours a day.
