# Atlassian Enables Default Data Collection to Train AI

- Source: Hacker News top stories (Chinese translation by buzzing.cc)
- Author: kevcampb
- Published: 2026-04-20 23:09
- AIHOT link: https://aihot.virxact.com/items/cmo7czypv00ifslmlmb7triab
- Original link: https://letsdatascience.com/news/atlassian-enables-default-data-collection-to-train-ai-f71343d8

## AI Summary

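Atlassian has turned on data collection by default and will use user data to train AI models. Under the policy change, user interaction data is automatically included in the AI training pipeline unless the user actively opts out. The story was published on April 20, 2026, received 104 upvotes on Hacker News, and prompted discussion of enterprise data privacy and default permission settings.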

## Full Text

Atlassian will begin collecting customer metadata and in-app content from Jira, Confluence, and other cloud products by default on August 17, 2026, to train its AI offerings, including Rovo and Rovo Dev. The change affects roughly 300,000 customers. Metadata collection is mandatory on the Free, Standard, and Premium tiers and cannot be turned off on those plans, while Enterprise customers can opt out of both metadata and in-app collection. Collected data will be retained for up to seven years; in-app data is removed within 30 days of deletion or opt-out, and affected models are retrained within 90 days. Customers that use customer-managed keys, Atlassian Government Cloud, or Isolated Cloud, or that have HIPAA requirements, are excluded from collection.

### What happened

Atlassian is changing its data contribution policy so that, starting August 17, 2026, it will use customer metadata and in-app content from Jira, Confluence, and other Atlassian Cloud products to train its AI capabilities, including `Rovo` and `Rovo Dev`. The update applies to about 300,000 customers and implements tiered defaults: lower tiers cannot opt out of metadata collection, while Enterprise plans retain opt-out controls. Atlassian will retain contributed data for up to seven years.

### Technical details

Atlassian has defined two distinct data categories. Metadata covers de-identified signals such as readability and complexity scores, task classifications, semantic similarity metrics, story points, sprint end dates, and Jira Service Management SLA values. In-app data covers user-generated content: Confluence page titles and bodies; Jira issue titles, descriptions, and comments; and custom emoji names, custom status names, and workflow names. Atlassian says it will remove direct identifiers, aggregate data, and apply protections before using it for training. The company also documents retention and remediation rules: in-app data is removed within 30 days of opt-out or deletion, and any models trained on that data are retrained within 90 days to purge the contribution.
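The two remediation windows described above can be turned into concrete dates with a little date arithmetic. The following is a minimal sketch, not Atlassian tooling: the function and constant names are hypothetical, and it assumes both the 30-day and the 90-day window are counted from the opt-out or deletion date, which the article does not state explicitly.

```python
from datetime import date, timedelta
from typing import Dict

# Windows quoted in the article. Counting both from the same
# opt-out/deletion date is an assumption made for illustration.
IN_APP_REMOVAL_DAYS = 30
MODEL_RETRAIN_DAYS = 90


def remediation_deadlines(opt_out_date: date) -> Dict[str, date]:
    """Return the latest dates by which removal and retraining are due."""
    return {
        "in_app_data_removed_by": opt_out_date + timedelta(days=IN_APP_REMOVAL_DAYS),
        "models_retrained_by": opt_out_date + timedelta(days=MODEL_RETRAIN_DAYS),
    }


# Hypothetical opt-out date shortly after the policy takes effect.
deadlines = remediation_deadlines(date(2026, 9, 1))
for label, due in deadlines.items():
    print(f"{label}: {due.isoformat()}")
```

A governance team could record these dates alongside an opt-out request and follow up with the vendor once each one has passed.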

### Tiered defaults and exclusions

The defaults are explicitly tied to an organization's highest active plan. For Free and Standard customers, metadata contribution is always on with no opt-out, and in-app data contribution is on by default but configurable. Premium keeps metadata always on and in-app data off by default. Enterprise customers have both off by default and, unlike the lower tiers, retain control over metadata contribution. Customers using customer-managed encryption keys, Atlassian Government Cloud, or Atlassian Isolated Cloud, or with HIPAA obligations, are excluded from contribution entirely.
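The tier rules above amount to a small lookup table. The sketch below encodes them as the article describes them; the class, field, and flag names are hypothetical, and it models only the defaults the article states (for example, it does not say whether in-app contribution is configurable on Premium or Enterprise, so that is left out).

```python
from dataclasses import dataclass
from typing import Collection, Optional


@dataclass(frozen=True)
class ContributionDefaults:
    metadata_on: bool           # metadata contributed by default
    metadata_can_opt_out: bool  # whether admins can turn metadata off
    in_app_on: bool             # in-app content contributed by default


# Defaults per highest active plan, as summarized in the article.
DEFAULTS_BY_PLAN = {
    "Free": ContributionDefaults(True, False, True),
    "Standard": ContributionDefaults(True, False, True),
    "Premium": ContributionDefaults(True, False, False),
    "Enterprise": ContributionDefaults(False, True, False),
}

# Hypothetical flag names for the exclusion paths the article lists.
EXCLUDED_FLAGS = frozenset(
    {"customer_managed_keys", "government_cloud", "isolated_cloud", "hipaa"}
)


def defaults_for(
    highest_plan: str, flags: Collection[str] = ()
) -> Optional[ContributionDefaults]:
    """Return the default contribution settings, or None if excluded entirely."""
    if EXCLUDED_FLAGS.intersection(flags):
        return None
    return DEFAULTS_BY_PLAN[highest_plan]


print(defaults_for("Premium"))
print(defaults_for("Standard", {"hipaa"}))
```

Returning `None` for excluded organizations keeps "not contributing at all" distinct from "contributing with everything switched off by default".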

### Context and significance

This policy reverses Atlassian's prior posture, which stated that customer data would not be used to train or improve AI services. The change mirrors a broader industry shift where SaaS vendors harvest internal usage signals and content to bootstrap, fine-tune, or evaluate models, while promising de-identification and aggregated analytics. Practical benefits Atlassian lists include improved search relevance, better summaries, template suggestions, and agentic workflow optimizations. For practitioners, the update matters because it changes data provenance for models used in workplace tooling, and it alters compliance and procurement tradeoffs between price tiers and data control.

### Risks and trade-offs

The mandatory metadata collection for non-Enterprise customers raises privacy and governance concerns even if identifiers are removed, because telemetry such as story points and SLA metrics can reveal project structure and performance patterns. Retaining de-identified data for seven years enlarges the exposure surface over time and places a burden on customers that must audit long-term data retention. The documented exclusion paths for high-security customers and those with customer-managed keys are useful, but they require migration to higher-priced plans or specialized deployments.

### What to watch

Organizations should inventory their Atlassian tenants, identify the highest active plan per tenant to understand default contributions, update administrative settings during the rollout window, and evaluate whether to migrate to Enterprise or isolated deployments if they require full opt-out. From a product perspective, watch how Atlassian operationalizes the 90-day model retraining and whether the downstream LLM vendors used in Rovo state that they do not retain inputs. Expect customer pushback and potential regulatory scrutiny as this pattern spreads across enterprise SaaS vendors.
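The inventory step can be sketched as a short script. This is an illustration only: the tenant names and plan lists are made up, the data would in practice come from an organization's own records, and the plan ordering (Free below Standard below Premium below Enterprise) is an assumption about what "highest active plan" means.

```python
from typing import Dict, List

# Assumed ordering of plans from lowest to highest.
PLAN_RANK = {"Free": 0, "Standard": 1, "Premium": 2, "Enterprise": 3}


def highest_active_plan(plans: List[str]) -> str:
    """Pick the highest-ranked plan among a tenant's active plans."""
    return max(plans, key=lambda plan: PLAN_RANK[plan])


def tenants_without_metadata_opt_out(tenants: Dict[str, List[str]]) -> List[str]:
    """List tenants whose highest plan is below Enterprise.

    Per the article, these tiers cannot opt out of metadata contribution.
    """
    return sorted(
        name
        for name, plans in tenants.items()
        if highest_active_plan(plans) != "Enterprise"
    )


# Hypothetical inventory: tenant name -> plans of its active products.
inventory = {
    "acme-eng": ["Free", "Premium"],
    "acme-legal": ["Enterprise", "Standard"],
    "acme-lab": ["Standard"],
}

flagged = tenants_without_metadata_opt_out(inventory)
print(flagged)
```

Tenants on the resulting list are the ones where a move to Enterprise or an isolated deployment would need to be weighed against cost.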

### Scoring Rationale

This is a notable product-policy shift with practical consequences for thousands of enterprise users and practitioners who manage data governance and model provenance. It is not a frontier-model or regulatory landmark, but it materially changes data pipelines and compliance choices for teams using Atlassian products.

