Your Sensitivity Label Taxonomy Is Probably Too Complicated
I have been writing and speaking about information architecture, metadata management, and healthy taxonomies since 2006. If you go back through my content history, you will find me making essentially the same argument in different contexts across nearly two decades: the enemy of adoption is complexity. People do not avoid governance frameworks because they disagree with the goals. They avoid them because someone made participation feel like homework.
Sensitivity labels are no different. And right now, in the run-up to broad Microsoft 365 Copilot deployments, organizations are repeating an expensive mistake that the SharePoint and records management communities made a generation ago — building classification systems that are technically correct, politically comprehensive, and practically unusable.
If your users need a decision tree to pick a label, you didn’t build a classification system. You built an obstacle course.
The Overly Eager Designer Problem
You know this person. Every organization has one. They show up to the governance kickoff meeting with a spreadsheet, an enthusiasm for edge cases, and a conviction that the classification system needs to account for every possible data type, department, regulatory regime, and project-level nuance before it goes live.
Eighteen months later, the organization has 34 sensitivity labels with names like “Confidential — HR — Compensation — North America — Draft” and a mandatory labeling policy that requires users to make a classification decision before saving any document. Adoption data shows that 73% of users are selecting the default label regardless of content because it is the path of least resistance. The other 27% have found creative ways to avoid the system entirely.
This is not a hypothetical. It plays out with remarkable consistency across organizations of every size and industry. Microsoft’s own documentation quietly acknowledges the pattern, noting that as a best practice, organizations should try to keep the number of labels to a minimum (https://learn.microsoft.com/en-us/purview/sensitivity-labels). The platform technically supports over 1,000 labels per tenant. Nobody should ever get close to that number. Nobody should get close to 20.
Why Simple Wins
The research on this is unambiguous. Overly complex taxonomies collapse adoption. When organizations create dozens of hyper-specific labels, employees become paralyzed by choice and often default to the lowest level of protection (https://www.congruity360.com/blog/sensitivity-labels-for-ai/). That outcome is worse than having no labels at all, because it creates a false sense of coverage while providing none of the enforcement.
Microsoft’s own implementation guidance recommends no more than five top-level parent labels, each with up to five sub-labels, to keep the user interface manageable (https://www.syskit.com/governance-handbook/sensitivity-labels/best-practices-sensitivity-labels/). Even that ceiling is generous for most organizations. For the majority of Microsoft 365 tenants, a clean four-label schema covers the real-world classification decisions that users actually need to make:
Public: Content that is approved for external audiences, with no restrictions on sharing.
General: Internal content that does not require special handling. The default for most day-to-day work.
Confidential: Content restricted to internal audiences or defined partner groups, with sharing controls enforced.
Highly Confidential: Sensitive content requiring encryption, strict access controls, and DLP enforcement. Typical examples are HR, legal, finance, executive communications, and M&A activity.
That’s it. Four labels. Four clear, human-readable descriptions that any employee in any role can apply without consulting a decision tree. The goal is to create a common language for risk that translates across departments — whether you are in Finance or Engineering, “Confidential” should mean the same thing.
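To make the four-label schema concrete, here is a minimal sketch of it as an ordered data structure. The Label class, the priority numbers, and the helper functions are illustrative assumptions of mine, not anything Purview exposes; the point is only that four ordered entries with one-line descriptions are enough to reason about.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    name: str
    priority: int  # higher means more sensitive (illustrative ordering)
    description: str


# The four-label schema from the text, least to most sensitive.
SCHEMA = (
    Label("Public", 0, "Approved for external audiences; no sharing restrictions."),
    Label("General", 1, "Internal content with no special handling."),
    Label("Confidential", 2, "Restricted to internal audiences or defined partner groups."),
    Label("Highly Confidential", 3, "Requires encryption, strict access controls, and DLP."),
)

# "General" is described above as the default for day-to-day work.
DEFAULT_LABEL = SCHEMA[1]


def by_name(name: str) -> Label:
    """Look up a label by its display name."""
    for label in SCHEMA:
        if label.name == name:
            return label
    raise KeyError(name)


def is_more_sensitive(a: str, b: str) -> bool:
    """True when label a ranks above label b in the schema."""
    return by_name(a).priority > by_name(b).priority
```

If a comparison like is_more_sensitive("Confidential", "General") needs more than one line of explanation to an end user, the schema has probably already grown too large.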
What the Labels Actually Do
Here is where the conversation gets more technical, and where IT pros need to move past thinking of labels as digital stickers.
Each label in Microsoft Purview is a configuration object that can trigger protection actions across the Microsoft 365 ecosystem. When a user applies “Highly Confidential” to a document, the label can automatically encrypt the file, restrict who can open it, apply a visual watermark, block it from being shared externally, prevent it from being attached to email outside the organization, and trigger a DLP policy that blocks Copilot from processing its contents. All of that from a single user action: selecting a label.
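The idea of a label as a configuration object can be sketched in a few lines. This is a toy model, not the Purview API: the ProtectionActions fields and the per-label settings below are hypothetical names I chose to mirror the actions listed above, and the choice to give "Confidential" only a sharing restriction is an assumption for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionActions:
    # Hypothetical flags mirroring the protection actions named in the text.
    encrypt: bool = False
    watermark: bool = False
    block_external_sharing: bool = False
    block_external_email: bool = False
    exclude_from_copilot: bool = False


# Illustrative mapping only; a real tenant would define these in Purview.
ACTIONS = {
    "Public": ProtectionActions(),
    "General": ProtectionActions(),
    "Confidential": ProtectionActions(block_external_sharing=True),
    "Highly Confidential": ProtectionActions(
        encrypt=True,
        watermark=True,
        block_external_sharing=True,
        block_external_email=True,
        exclude_from_copilot=True,
    ),
}


def can_share_externally(label: str) -> bool:
    """One user choice (the label) decides the sharing outcome."""
    return not ACTIONS[label].block_external_sharing


def copilot_may_process(label: str) -> bool:
    """In this sketch, only the top label withholds content from Copilot."""
    return not ACTIONS[label].exclude_from_copilot
```

The user picks one of four names; everything else in the table is an administrator decision made once.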
The label writes metadata directly into the file itself, not into a database somewhere and not into a SharePoint column that gets stripped when someone downloads the file. The protection travels with the document regardless of where it goes. That architectural decision is what makes sensitivity labels the right foundation for AI governance. Copilot reads the label, respects the policy, and surfaces or withholds content accordingly. The label is the enforcement mechanism, not a reporting tag.
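You can see the "metadata travels with the file" point for yourself. To my knowledge, Office Open XML files (.docx, .xlsx, .pptx) are ZIP archives, and label metadata is stored as custom document properties in docProps/custom.xml with names beginning MSIP_Label_. Treat that prefix and location as an assumption to verify against a labeled file from your own tenant; the read_label_properties helper below is a hypothetical name, and this sketch does not handle encrypted files.

```python
import zipfile
import xml.etree.ElementTree as ET

CUSTOM_PROPS_PATH = "docProps/custom.xml"
LABEL_PREFIX = "MSIP_Label_"  # assumed prefix for label-related properties


def read_label_properties(source) -> dict:
    """Return {property_name: value} for label-related custom properties.

    source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        if CUSTOM_PROPS_PATH not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read(CUSTOM_PROPS_PATH))

    found = {}
    for prop in root:
        name = prop.get("name", "")
        if name.startswith(LABEL_PREFIX):
            # Each property wraps a single typed value element; collect its text.
            found[name] = "".join(prop.itertext()).strip()
    return found
```

Calling read_label_properties on a copy of a labeled document that has been downloaded out of SharePoint is a quick way to confirm that the label did not stay behind in a library column.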
Sub-labels can extend the schema for specific regulatory or operational needs without exploding the top-level experience. A “Highly Confidential” parent label might have a “GDPR” sub-label for organizations with specific European data protection obligations, or a “Board Only” sub-label for executive governance content. Users still see a manageable top-level list. The complexity lives in the configuration layer, not in the user interface.
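Here is a small sketch of that parent and sub-label arrangement, with a check against the five-by-five ceiling mentioned earlier. The TAXONOMY dictionary and the validate helper are hypothetical, and the limits are taken from the guidance cited above rather than enforced by any platform rule I can vouch for.

```python
# Ceiling from the guidance cited earlier: five parents, five sub-labels each.
MAX_PARENTS = 5
MAX_SUBLABELS = 5

# Parent label -> sub-labels. Only the top-level keys appear in the picker.
TAXONOMY = {
    "Public": [],
    "General": [],
    "Confidential": [],
    "Highly Confidential": ["GDPR", "Board Only"],
}


def top_level_choices(taxonomy: dict) -> list:
    """What the user sees first: parent labels only."""
    return list(taxonomy)


def validate(taxonomy: dict) -> list:
    """Return human-readable problems; an empty list means within limits."""
    problems = []
    if len(taxonomy) > MAX_PARENTS:
        problems.append(
            f"{len(taxonomy)} parent labels exceeds the limit of {MAX_PARENTS}"
        )
    for parent, subs in taxonomy.items():
        if len(subs) > MAX_SUBLABELS:
            problems.append(
                f"{parent!r} has {len(subs)} sub-labels, more than {MAX_SUBLABELS}"
            )
    return problems
```

Running a check like validate before every taxonomy change request is a cheap way to make the overly eager designer justify each addition.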
Auto-labeling closes the coverage gap that manual labeling always leaves open. Purview can scan content in SharePoint, OneDrive, and Exchange and apply labels automatically based on sensitive information types — credit card numbers, Social Security numbers, passport data, custom patterns defined by your organization — without requiring any user action at all (https://learn.microsoft.com/en-us/purview/data-gov-best-practices-sensitivity-labels). The honest number from organizations that have run Copilot readiness assessments is that roughly 3-5% of documents carry a label before the project starts. Auto-labeling is how you get from 5% to meaningful coverage without asking 10,000 employees to reclassify their entire document history.
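To give a feel for what pattern-based detection involves, here is a deliberately simplified sketch. It is not how Purview implements sensitive information types, which use richer patterns, supporting evidence, and confidence levels; the regular expressions, the detect and suggest_label names, and the decision to map any hit to "Highly Confidential" are all assumptions for illustration.

```python
import re

# Toy patterns; real sensitive information types are far more thorough.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum, used to filter out random digit runs."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect(text: str) -> set:
    """Return the set of sensitive data types found in the text."""
    hits = set()
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            hits.add("credit_card")
    if SSN_RE.search(text):
        hits.add("us_ssn")
    return hits


def suggest_label(text: str):
    """Suggest a label for unlabeled content, or None if nothing matched."""
    return "Highly Confidential" if detect(text) else None
```

Even this toy version shows why the checksum matters: a 16-digit invoice number should not push a document into the most restrictive label.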
Start Simple, Govern Forward
The instinct to build the comprehensive system first is understandable. It feels responsible. It feels like you are accounting for the complexity of the real world. What it actually does is shift the cognitive burden of governance onto the people least equipped to make fine-grained classification decisions — your end users — while simultaneously making it easy to game the system by always picking the path of least resistance.
Four labels, clearly defined, with enforcement actions attached and auto-labeling doing the heavy lifting on existing content: that is a governance system that works. Build the foundation first. You can always add sub-labels as specific regulatory or operational needs emerge. What you cannot easily do is convince 10,000 employees to start taking a 34-label system seriously after they have already decided it is not worth their time.
The taxonomy is not the point. Protection is the point. Keep the first one simple enough that the second one actually happens.




