Every enterprise AI conversation eventually hits the same wall. The technology is compelling, the use cases are clear, and then someone asks the question that stops everything: “But will our competitors see our data?”
Fair question. The right question. Industrial companies sit on sensitive production data—cost structures, efficiency metrics, financial models—that competitors would gladly steal. Before you plug any of that into an AI platform, you need answers you can actually defend, not vendor marketing.
We’ll walk through how the major AI platforms actually handle enterprise data, what happens when you connect Capstone via MCP, and why a governed data platform is the most reliable way to keep operational data under your control.
What MCP Is and Why It Matters for Data Security
Here’s the thing about modern AI: it doesn’t just magically access your systems. It goes through something called the Model Context Protocol (MCP), an open standard originally created by Anthropic and now governed by the Linux Foundation’s Agentic AI Foundation.
Think of MCP as a conversation layer between an AI assistant and your data platform. With Capstone, the AI client can query operational data, run calculations, and generate reports. But here’s the crucial part: the AI model never directly touches your raw database.
This matters because it’s where the real security lives. The AI doesn’t get unfettered access to your infrastructure. Instead, it sends a structured request to the Capstone MCP server, which checks permissions and governance rules before sending anything back. The model sees only what it’s supposed to see, nothing more.
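The gating pattern described above can be sketched in a few lines. Everything below (the role names, the metric names, the numbers) is a hypothetical illustration of the pattern, not Capstone's actual implementation:

```python
# Hypothetical sketch: a governed MCP-style handler that checks
# permissions BEFORE any data leaves the platform. The model only
# ever sees what this function returns.

ROLE_PERMISSIONS = {
    "ops_manager": {"production_volume", "total_cost"},
    "cfo_team": {"production_volume", "total_cost", "cost_per_tonne"},
}

SITE_DATA = {
    "site_3": {
        "production_volume": 12_500,
        "total_cost": 4_100_000,
        "cost_per_tonne": 328.0,
    },
}

def handle_tool_call(user_role: str, site: str, metric: str) -> dict:
    """Apply governance rules, then return only the authorized value."""
    allowed = ROLE_PERMISSIONS.get(user_role, set())
    if metric not in allowed:
        # The model never sees the value, only a structured refusal.
        return {"error": f"role '{user_role}' may not view '{metric}'"}
    return {"site": site, "metric": metric, "value": SITE_DATA[site][metric]}
```

The point of the sketch: the permission check lives in the platform, so the same request returns data for one role and a refusal for another, and the model has no path around it.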
All three vendors (Anthropic, OpenAI, and Microsoft) back MCP as an open standard. That means you’re not locked into one platform. Your IT team picks whatever they trust, and your governance rules move with you. (Learn more: When Your Dashboard Talks Back: MCP and Industrial AI)
How the Major AI Vendors Handle Enterprise Data
The most common fear we hear: “Will the AI vendor train their model on our data?” Let’s see what each platform actually commits to.
OpenAI (ChatGPT Business and Enterprise)
OpenAI’s enterprise privacy policy says they won’t use your business data for model training by default. If they ever want to, they ask first.
ChatGPT Enterprise runs in isolated workspaces with AES-256 encryption at rest and TLS 1.2+ in transit. They hold SOC 2 Type II compliance, offer Enterprise Key Management so you control the encryption keys, and let you pick data residency (US, Europe, UK, Japan, etc.). Your content stays yours. They don’t touch it except to run your queries.
Important Distinction
Consumer plans (ChatGPT Free, Plus, Pro) default to training on user data. The enterprise protections only apply to ChatGPT Business, ChatGPT Enterprise, and the API.
Anthropic (Claude for Enterprise)
Anthropic’s privacy policy is blunt: “By default, we will not use your inputs or outputs from our commercial products to train our models.” That covers Claude for Work, the API, and Claude Gov.
They go further. Inputs and outputs get deleted within 30 days automatically. Enterprise customers can negotiate zero data retention agreements, where inputs and outputs aren’t stored except where required by law or to combat misuse. (Anthropic does retain safety classifier results for policy enforcement.) That applies to API products and tools using your org’s API key, including Claude Code.
Anthropic built MCP specifically to keep the data platform in control, not the model. That architecture is the reason the model can’t randomly access your data in the first place.
Key Insight
Enterprise AI data protection depends on three independent layers: no model training, workspace isolation, and platform-level access control.
Important Distinction
In late 2025, Anthropic’s consumer plans (Claude Free, Pro, Max) changed to train on user data by default unless the user opts out. The enterprise protections only apply to commercial products: Team, Enterprise, API, and Gov plans.
Microsoft (Copilot for Microsoft 365 and Azure OpenAI)
Microsoft’s Copilot documentation is clear: they don’t train foundation models on your prompts, responses, or data. Copilot runs inside the Microsoft 365 boundary, using existing enterprise controls.
For Azure OpenAI Service, Microsoft won’t train or retrain its models on your data. They don’t hand your data to OpenAI. Everything processes within your chosen Azure region, backed by SOC 2, ISO 27001, and HIPAA certifications.
Bottom line: all three vendors publicly commit not to train on enterprise customer data. Your data isn’t going into their next model.
What Happens to Your Data on an Enterprise AI Plan?
When you query Capstone through any MCP-connected AI client, the flow is identical regardless of vendor. The AI sends a structured request to the Capstone MCP server. Capstone checks your permissions, pulls the authorized data, and sends back only what you’re supposed to see. The AI works from the MCP response, never from your database.
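To make "structured request" concrete, here is roughly what such an exchange looks like on the wire. MCP uses JSON-RPC 2.0 with a tools/call method; the tool name, arguments, and result text below are hypothetical illustrations, not Capstone's actual tool surface:

```python
import json

# Hypothetical MCP tool invocation. The envelope follows the general
# shape of MCP's JSON-RPC tools/call method; "query_metric" and its
# arguments are invented for illustration.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query_metric",
        "arguments": {"site": "site_3", "metric": "production_volume"},
    },
}

# The platform answers with content, never a database connection.
# The model works only from this serialized result.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [
            {"type": "text", "text": "Site 3 production volume: 12,500 t"}
        ]
    },
}

print(json.dumps(request, indent=2))
```

Note what is absent: there is no connection string, no SQL, no table handle in either direction. The AI asks a named question; the platform answers with text it has already authorized.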
What happens after that—storage, training, workspace isolation—varies by vendor. Let me lay out how they compare.
What gets stored
All three vendors store enterprise conversations for product functionality like chat history and continuity. All three also give enterprise administrators control over retention:
| | OpenAI (Enterprise) | Anthropic (Enterprise) | Microsoft (Azure OpenAI) |
|---|---|---|---|
| Default retention | Admin-configurable (90 days, 180 days, etc.) | Admin-configurable (minimum 30 days) | Up to 30 days for abuse monitoring |
| Zero retention | Available via Enterprise agreement | Available via zero data retention agreement | Available via Limited Access program |
| Deletion | Removed within 30 days | Removed within 30 days | Purged after retention period |
What happens to MCP connector data specifically
MCP data flows through the AI client but originates from Capstone. Each vendor applies additional protections to connector and tool data:
- OpenAI: Confirms that information accessed from connectors isn’t used for training by default for Business, Enterprise, and Edu customers.
- Anthropic: Confirms that raw content from MCP servers isn’t included in feedback data. Even if a user submits a bug report, the underlying data retrieved from Capstone through MCP is excluded.
- Microsoft: Azure OpenAI processes MCP tool calls within the same data boundary as other API calls. The same no-training and data residency commitments apply.
Can a competitor’s AI agent surface your private data?
This is the question that actually matters. The answer is no.
Here’s why—in three layers:
Layer 1: The AI model doesn’t learn from your data. Enterprise plans don’t train on inputs or outputs. Your proprietary information never seeps into the model. When a competitor asks the same model a question, it pulls from general training data—public websites, licensed datasets, research papers. It doesn’t know your conversations, your MCP queries, or your Capstone data. Your cost-per-unit, production volumes, margin structures? They’re not in the model at all.
Layer 2: Workspace isolation prevents cross-tenant access. Each enterprise workspace is completely walled off:
- OpenAI: Enterprise workspaces don’t leak. Other ChatGPT users can’t see your conversations or connector data.
- Anthropic: Each org’s workspace is isolated. Other Claude users, other companies, Anthropic staff—none of them can see your conversations or MCP queries.
- Microsoft: Azure OpenAI runs inside your tenant boundary. Your data never leaves it.
A competitor on the same vendor can’t query your workspace, browse it, or stumble into it. The isolation is structural—not just a permissions rulebook.
Layer 3: Capstone governs what the AI can see from the start. Before any vendor protection kicks in, Capstone’s role-based access control decides what data reaches the AI. MCP connections are authenticated and encrypted (TLS). The model sees only what Capstone explicitly returns for that authenticated user. Even if someone broke into your AI client somehow, Capstone would still block them at the MCP layer. No account, no permissions, no data.
The result: three independent layers (no training, workspace isolation, and platform-level governance). A competitor would need to breach all three at once. The only information in an AI model’s general knowledge is what you’ve publicly released: your website, press releases, industry reports.
Why Vendor Commitments Alone Aren’t Enough
Vendor policies matter, sure. But they’re not enough on their own. They tell you what happens at the AI platform layer; they don’t control what data reaches the AI in the first place. And policies shift. Anthropic’s consumer terms changed in late 2025 to allow training by default. Enterprise contracts evolve at every renewal. That’s precisely why the architecture matters more than any vendor promise.
Most data platforms already offer some form of access control. Column-level security, row-level security, and dynamic data masking are real capabilities in tools like Snowflake, Databricks, and SQL Server. They matter, and if you’re using them, good. But they all share a blind spot: they govern access to stored data, not what an AI can derive from it.
Column-level security can hide a cost_per_tonne column. But if the user can see total_cost and production_volume in separate columns, the AI will happily divide one by the other when asked. The derived metric leaks, even though the stored metric is locked down. Row-level security and dynamic masking have the same gap: they control what data the user sees, not what conclusions the AI draws from it.
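The leak is trivially easy to reproduce. In this sketch the values are invented, but the mechanics are exactly what the paragraph above describes: the stored metric is masked, the inputs are not, and simple division recovers the restricted number:

```python
# Hypothetical demonstration of the column-level security blind spot.
# The cost_per_tonne column is masked, but its inputs are visible.

visible_columns = {"total_cost": 4_100_000, "production_volume": 12_500}
masked_columns = {"cost_per_tonne"}  # hidden by column-level security

# The AI never reads the masked column, yet reproduces it exactly:
derived = visible_columns["total_cost"] / visible_columns["production_volume"]
print(f"cost per tonne: {derived:.2f}")  # the restricted metric, leaked
```

No exploit, no misconfiguration: the user had legitimate access to every input, and the restricted output fell straight out of them.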
That’s the difference between data-level governance and calculation-level governance. A platform like Capstone doesn’t just control which tables or columns reach the AI. It checks permissions on the derived metrics themselves, including the underlying formulas. If the user’s role doesn’t include cost-per-tonne as a metric, the AI can’t return it, regardless of whether the raw inputs are individually visible. The AI never opens the raw database.
Both approaches beat giving AI unrestricted access. Calculation-level governance is fundamentally harder to attack because your business logic sits inside the trust boundary, not just the data structure. When governance lives at the metric level, you control what the AI sees and what it can derive from it.
MCP is an open standard that all three vendors back, so you avoid lock-in. Switch from Claude to ChatGPT to Copilot, your Capstone configuration, data model, and governance rules stay the same. You’re not rebuilding when you change AI clients.
How Capstone Governs AI Access to Your Data
Capstone sits between your operational data and any AI client as a governance layer.
The difference: governance isn’t added afterward. It’s baked into the data model. Every metric, every formula, every aggregation has its own permission context. This is calculation-level governance, not table-level access control.
Why Calculation-Level Governance Matters
Let me show you where table-level governance breaks.
A regional operations manager needs access to site production volumes and top-level cost categories for their job. Table-level access control lets the AI see both tables. When the manager asks “What’s Site 3’s cost per tonne?”, the AI does the obvious calculation:
Cost per Tonne = Total Cost ÷ Production Volume
The manager has legitimate access to both inputs. The AI just divides one by the other. But cost-per-tonne is a restricted financial metric that should be shared only with the CFO’s team.
Table-level governance can’t stop this. The user has access to the input tables. The problem isn’t the data, it’s the derived metric.
Capstone works differently. When the AI queries “What was Site 3’s cost per tonne last month?”, Capstone checks more than just whether the user can see Site 3 and the cost table. It checks whether the user has permission to view the cost per tonne metric—including the underlying formula. If the user’s role doesn’t include that metric, Capstone blocks the result, regardless of whether the raw inputs are individually visible.
This works across the full formula chain. Capstone’s metric model maps formulas and dependencies—Gross Margin depends on Revenue and Total Cost, which depend on Fixed Cost and Variable Cost, down the hierarchy. The AI can traverse those relationships to answer complex questions (see how formula chain analysis works), but only within what the user is permitted to see. The same principle blocks ratio aggregation errors—the formula engine knows which metrics can be averaged and which must be recalculated from components.
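A dependency-aware permission check is the core of this idea, and it fits in a few lines. The metric graph and role below are hypothetical illustrations of the pattern, not Capstone's actual model:

```python
# Hypothetical sketch of calculation-level governance over a formula
# chain: a metric is answerable only if the user is granted the metric
# itself AND every metric in its dependency chain.

METRIC_DEPS = {
    "gross_margin": {"revenue", "total_cost"},
    "total_cost": {"fixed_cost", "variable_cost"},
    "cost_per_tonne": {"total_cost", "production_volume"},
}

def can_compute(metric: str, granted: set[str]) -> bool:
    """Walk the formula chain; deny if any link is missing a grant."""
    if metric not in granted:
        return False
    # Leaf metrics have no dependencies, so all() over an empty set passes.
    return all(can_compute(dep, granted) for dep in METRIC_DEPS.get(metric, set()))

# An ops role holding every INPUT, but not the restricted metric itself:
ops_role = {"total_cost", "fixed_cost", "variable_cost", "production_volume"}
print(can_compute("cost_per_tonne", ops_role))  # False: inputs are not enough
```

This is the key inversion relative to table-level access control: permission attaches to the derived metric, so visibility of the raw inputs no longer implies visibility of what can be computed from them.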
Role-Based Access, Tenant Isolation, and Audit
On top of calculation-level governance, Capstone enforces the enterprise security fundamentals you’d expect:
- Role-based access control: Every MCP request is authenticated and authorized against your permission model. The same rules that protect your dashboards also protect what the AI retrieves.
- Tenant isolation: In multi-tenant environments, Capstone isolates each client’s data at the platform level by design. Tenant boundaries lock in before any data reaches the MCP layer.
- Auditable access: Every query, retrieval, and calculation through MCP gets logged with timestamps, user identity, and purpose. Your compliance team can audit exactly what the AI touched.
- Constrained tool exposure: Admins control which MCP tools reach which roles. You might open reporting tools to everyone while locking financial modelling behind CFO access.
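The audit point above translates to a simple append-only pattern. The field names and entries below are hypothetical illustrations of what a per-request trail can capture, not Capstone's actual log schema:

```python
# Hypothetical sketch of an MCP access audit trail: every request is
# recorded with who asked, what tool ran, when, and why.

import json
from datetime import datetime, timezone

audit_log: list[str] = []

def log_mcp_access(user: str, tool: str, purpose: str, returned_rows: int) -> None:
    """Append one structured, timestamped record per MCP request."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "tool": tool,
        "purpose": purpose,
        "returned_rows": returned_rows,
    }
    audit_log.append(json.dumps(entry))  # append-only, queryable later

log_mcp_access("a.khan", "query_metric", "monthly production report", 42)
```

With a record per query, a compliance team can answer "what did the AI touch, for whom, and when" directly from the log rather than reconstructing it from vendor-side telemetry.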
Quick Answers to the Questions We Hear Most
“What about data in transit?”
MCP connections between Capstone and the AI client are authenticated and encrypted (TLS). Data is protected on the wire, not just at rest.
“Can users derive sensitive metrics from the raw data the AI can see?”
Not with calculation-level governance. Capstone checks permissions on derived metrics, not just on the input tables. If a user doesn’t have access to cost-per-tonne, the AI can’t return it — even if the underlying cost and production data are individually visible to that user. See the worked example above.
“What if we already have column-level security in our data warehouse?”
Column-level security controls which fields a user can see. It can’t control what the AI derives by combining fields the user does have access to. Calculation-level governance fills that gap by permissioning the derived metric itself, rather than just its inputs.
Questions to Bring to Your Security Review
If you’re evaluating any AI data platform, ours or anyone else’s, these are the questions that separate robust governance from marketing claims:
- Does the AI model connect directly to our database, or does it go through a governed intermediary? Direct access means you’re relying entirely on the AI vendor’s data handling promises. A governed layer means you control the trust boundary.
- Does access control extend to derived metrics, or only to tables and views? Table-level RBAC can’t prevent users from asking the AI to calculate restricted metrics from accessible input data. Calculation-level governance can.
- Can we audit exactly what data the AI accessed, when, and for whom? If there’s no query-level audit trail, you can’t demonstrate compliance or investigate incidents.
- What happens to our data after the AI processes it? Check retention periods, whether zero-retention options exist, and whether connector/tool data is treated differently from chat data.
- Is our workspace architecturally isolated, or just logically separated? Logical separation (permissions within a shared environment) is weaker than architectural isolation (separate tenant boundaries).
- Can we switch AI vendors without rebuilding our governance model? If governance is tied to a single AI platform, you’re locked in. Open standards like MCP make the AI layer replaceable.
- Does the platform enforce domain-specific calculation rules, or just pass through raw data? Platforms that understand your metric definitions (formula dependencies, aggregation rules, valid calculation paths) can prevent errors and unauthorised derivations that raw data layers can’t. (For more on why this matters, see how operational intelligence differs from data engineering.)
The Bottom Line
The real question isn’t whether AI is secure; it’s whether your architecture puts governance in the right place.
When the data platform controls access, enforces permissions, and audits queries, the choice of AI vendor becomes a matter of preference rather than a risk. Your IT team picks what they trust. Your security team enforces what they need. Your operations team gets answers in minutes, not hours.
Want to see what that looks like with your actual data? We can show you.
Ready to see how governed AI access protects your operational data? Book a 30-minute Capstone demonstration
Related reading: When Your Dashboard Talks Back: MCP and Industrial AI · Power BI and Capstone: Why You Need Both · Real-Time Is for Alarms, Daily Is for Decisions · Why Spreadsheets Lie About Cost Per Unit · Microsoft Fabric and Capstone: Different Problems, Different Tools
