Governance and Safety

How Communitas protects member agency, trust, and community legitimacy.

Undisclosed AI participation in communities triggers severe backlash and erodes trust. Engagement optimization without consent degrades the relationships it claims to improve. These are not theoretical risks — they are documented failures that have harmed real communities.

Communitas is designed against these failures. These principles are constraints, not aspirations.

Humans remain accountable for community decisions. AI drafts, suggests, and summarizes — it does not act autonomously in member-facing spaces. The goal is to make stewards more effective, not to replace their judgment.
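A minimal sketch of how this constraint can be encoded, assuming hypothetical `Suggestion` and `steward_review` names (illustrative, not Communitas's actual API): the AI only ever produces pending drafts, and a human decision is the sole transition to anything member-facing.

```python
from dataclasses import dataclass
from enum import Enum


class SuggestionStatus(Enum):
    PENDING = "pending"     # drafted by the AI, awaiting steward review
    APPROVED = "approved"   # a steward accepted it and acts on it themselves
    DECLINED = "declined"   # a steward rejected it; nothing reaches members


@dataclass
class Suggestion:
    """An AI-drafted suggestion. It never posts itself; a steward decides."""
    text: str
    rationale: str
    status: SuggestionStatus = SuggestionStatus.PENDING


def steward_review(suggestion: Suggestion, approve: bool) -> Suggestion:
    """The only path from draft to member-facing action runs through a human."""
    suggestion.status = (
        SuggestionStatus.APPROVED if approve else SuggestionStatus.DECLINED
    )
    return suggestion
```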

Members know that AI systems are involved, what those systems can see and suggest, and how to adjust or decline. No intervention reaches a member who hasn’t consented to receive it. Undisclosed AI participation is not a configuration option.
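One way to make "no intervention without consent" concrete is to fail closed on a per-member consent record. The sketch below is an assumption about shape, not the actual data model; the field names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConsentRecord:
    """What a member has agreed to receive. Absence of a record means no consent."""
    member_id: str
    allows_ai_suggestions: bool = False
    allows_nudges: bool = False


def may_deliver(intervention_kind: str, consent: Optional[ConsentRecord]) -> bool:
    """Fail closed: no consent record, or an unknown intervention kind, means no delivery."""
    if consent is None:
        return False
    if intervention_kind == "suggestion":
        return consent.allows_ai_suggestions
    if intervention_kind == "nudge":
        return consent.allows_nudges
    return False
```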

Synthetic agents are clearly identified as non-human. Every intervention is labeled with its source, the signals that triggered it, and the mechanism behind it. Members never have to guess whether they’re interacting with a person or a system.
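A disclosure label like this is one plausible way to carry the source, triggering signals, and mechanism alongside every intervention; the structure and example values below are illustrative assumptions, not a published schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisclosureLabel:
    """Attached to every member-facing intervention so nobody has to guess its origin."""
    source: str                           # always a non-human identifier, e.g. "communitas-assistant"
    triggering_signals: tuple[str, ...]   # the observations that prompted the intervention
    mechanism: str                        # e.g. "newcomer-welcome-template v3"


def render_disclosure(label: DisclosureLabel) -> str:
    """A plain-language footer shown next to the intervention itself."""
    signals = ", ".join(label.triggering_signals)
    return (
        f"Suggested by {label.source} (automated), "
        f"triggered by: {signals}; mechanism: {label.mechanism}."
    )
```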

Every AI suggestion and decision path is logged with enough context to support review, debugging, and longitudinal auditing. If a recommendation caused harm six months ago, you can trace exactly what happened and why.
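The unit of that auditability is a structured, append-only log entry with enough context to replay the decision. The entry below is a sketch under assumed field names; the point is the shape (what was seen, what was recommended, what the steward did, when), not the exact schema.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import TextIO


@dataclass
class DecisionLogEntry:
    """Enough context to reconstruct, months later, what was suggested and why."""
    suggestion_id: str
    model_version: str
    inputs_summary: str    # pseudonymized summary of the signals the model saw
    recommendation: str
    steward_action: str    # "approved", "declined", or "edited"
    timestamp: str = ""


def log_decision(entry: DecisionLogEntry, sink: TextIO) -> None:
    """Append one structured, timestamped line per decision; never overwrite history."""
    entry.timestamp = datetime.now(timezone.utc).isoformat()
    sink.write(json.dumps(asdict(entry)) + "\n")
```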

Communitas does not optimize for engagement, virality, or persuasion. Interventions focus on pro-social goals: bridging clusters, supporting newcomers, de-escalating conflict, and surfacing information. Covert influence is a bug, not a feature.
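Operationally, this can be enforced as an allowlist of objectives that the optimization and experiment layers will accept. The lists and function below are an illustrative assumption about how such a guard might look, not the system's actual configuration.

```python
# Pro-social objectives the optimization and experiment layers may target.
ALLOWED_OBJECTIVES = {
    "cluster_bridging",
    "newcomer_support",
    "conflict_deescalation",
    "information_surfacing",
}

# Objectives rejected outright, whatever the expected lift.
FORBIDDEN_OBJECTIVES = {"engagement", "virality", "persuasion"}


def validate_objective(objective: str) -> str:
    """Fail loudly rather than quietly optimizing for the wrong thing."""
    if objective in FORBIDDEN_OBJECTIVES:
        raise ValueError(f"'{objective}' is a forbidden optimization target")
    if objective not in ALLOWED_OBJECTIVES:
        raise ValueError(f"'{objective}' is not an approved pro-social objective")
    return objective
```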

The system uses the minimum data necessary. Raw text, interaction logs, and member data are handled with realistic threat models — not compliance checklists. Where aggregated or pseudonymized representations suffice, those are preferred.
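As one sketch of "pseudonymized representations are preferred": a keyed hash lets analyses join events on a stable token without storing raw identifiers, and the view builder keeps only the fields an aggregate needs. Function names and field choices here are assumptions for illustration.

```python
import hashlib
import hmac


def pseudonymize(member_id: str, secret_key: bytes) -> str:
    """Keyed hash so analyses can be joined without storing raw identifiers.

    The key lives outside the analytics store; rotating it breaks linkage on purpose.
    """
    return hmac.new(secret_key, member_id.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def minimal_view(raw_event: dict, secret_key: bytes) -> dict:
    """Keep only what an aggregate analysis needs; drop raw text entirely."""
    return {
        "member": pseudonymize(raw_event["member_id"], secret_key),
        "event_type": raw_event["event_type"],
        "channel": raw_event["channel"],
        # deliberately omitted: message text, display name, anything re-identifying
    }
```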

Communities set their own policies. Communitas operationalizes them — it doesn’t override them. Wherever possible, representative members participate in selecting metrics, choosing interventions, and reviewing experiment results.
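In code, "operationalizes, doesn't override" can mean that the community's own policy object is the outer boundary the system checks against and never widens. The policy fields below are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class CommunityPolicy:
    """Set by the community's stewards and members; Communitas only enforces it."""
    allowed_interventions: set[str]   # e.g. {"newcomer_welcome", "conflict_deescalation"}
    metrics_in_scope: set[str]        # chosen with member representatives
    review_board: tuple[str, ...]     # members who sign off on experiments


def is_permitted(intervention_kind: str, policy: CommunityPolicy) -> bool:
    """The system checks the community's policy; it never widens it."""
    return intervention_kind in policy.allowed_interventions
```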

When interventions cause harm — and some will — the system supports incident response, rollback, and transparent post-mortems. The standard for success is not perfect predictions. It’s legible, accountable processes for addressing problems when they arise.
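A hedged sketch of the incident side, assuming a hypothetical `Incident` record and a caller-supplied `disable_intervention` hook: the record ties the harm report back to the interventions involved, and rollback means disabling each of them and noting that it was done, with a post-mortem linked afterwards.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Incident:
    """A harm report tied back to the interventions that caused it."""
    incident_id: str
    affected_intervention_ids: list[str] = field(default_factory=list)
    description: str = ""
    rolled_back: bool = False
    postmortem_url: Optional[str] = None


def roll_back(incident: Incident, disable_intervention: Callable[[str], None]) -> Incident:
    """Disable every implicated intervention, then record that the rollback happened."""
    for intervention_id in incident.affected_intervention_ids:
        disable_intervention(intervention_id)
    incident.rolled_back = True
    return incident
```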


These constraints hold for every feature we ship: anything that weakens member agency, transparency, or trust is a bug, not a trade-off.