Data Classification: Everything You Need to Know
4 Levels of Data Classification: Types, Compliance, & Best Practices
In the simplest terms, data classification streamlines and simplifies the ongoing management of your information. This is crucial because, depending on the type of data, a data steward must uphold the associated compliance and privacy obligations for governance, risk, and compliance (GRC), data governance and records management, and security and access control purposes.
For example, organizations subject to the Payment Card Industry’s Data Security Standard (PCI DSS), which applies to any organization that collects, processes, stores, or transmits credit card data, can segment their network to ease their compliance burden. Data classifications assist with determining and enforcing the segmentation, showing which network areas require the most security attention.
Data classification isn’t necessarily required on its own. However, like any other effort, management becomes much easier when you have visibility. And if specific management standards are required, then simplifying their adherence can provide significant benefits.
In this guide, we’ll break down the basics of data classification and its benefits.
Why is Classifying Data Necessary?
Numerous compliance frameworks pertain to protecting specific categories of data based upon the information and how it can be used. So, understanding where the information is located, what it contains, its value, and other relevant details directly correlates to successful regulatory adherence.
Classifying data is necessary because you can’t effectively manage what you don’t understand.
And nearly every organization is subject to regulatory compliance that requires data oversight. Several dedicated personnel (e.g., data controllers) are responsible for that administration, especially tracking where sensitive and protected data is stored. Their roles depend on understanding the types of data their organization possesses, which means they rely on an accurate data classification process.
Additionally, data classification is regularly used as a precursor for matching fields and information between different databases (i.e., “data mapping”). This is a crucial step for:
- Database migrations
- Mergers and acquisitions
- IT systems and other resource integrations
The importance of data classification is similar to a chef needing to know their ingredients. Some items, like meat or shellfish, could be dangerous with improper preparation or storage methods. If raw poultry isn’t recognized as a potential source of salmonella, it could be handled inappropriately and allowed to contaminate work surfaces and other areas.
Similarly, data classifications help responsible professionals and organizations avoid information management disasters.
Types of Data Classification
Broadly, data is classified into four categories based on how “sensitive” it is. Myriad characteristics determine whether the data element is sensitive information or not (and how much), but they generally relate to the consequences of unauthorized access. The more severe the consequences, the more sensitive the data element.
The standard four data classification levels are:
- Public data (least sensitive) – Any information that is already freely accessible to anyone or can be made so (e.g., public records, press releases, and promotional materials). This data type can be stored and shared by the data owner without restrictions or repercussions.
- Internal-only or internal data – Any information restricted to an organization’s employees or members (e.g., business plans, internal communications). This data type cannot be shared outside of the organization.
- Confidential data – Any information that requires elevated access permissions even within the organization but won’t result in legal consequences if that confidentiality is violated. This data is not accessible without specific, role-based rights and is not shareable unless the recipients have been granted the same.
- Restricted data (most sensitive) – Any information that carries significant legal and regulatory penalties should access violations occur, likely resulting in criminal charges and substantial fines. This data is generally protected under a compliance framework or would severely damage the organization if released (e.g., customers’ or employees’ sensitive personal data, proprietary research and development).
Classifying data according to the above categories allows those responsible for managing it to quickly determine whether the sensitivity-based protections are present and appropriately enforced.
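As a rough illustration of that check, the sketch below models the four levels as an ordered enumeration and compares the controls in place against a minimum set per level. The control names and the mapping are hypothetical examples chosen for this sketch, not requirements drawn from any framework.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """The four classification levels, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Hypothetical minimum controls per level (illustrative names only).
REQUIRED_CONTROLS = {
    Sensitivity.PUBLIC: set(),
    Sensitivity.INTERNAL: {"authentication"},
    Sensitivity.CONFIDENTIAL: {"authentication", "role_based_access"},
    Sensitivity.RESTRICTED: {
        "authentication",
        "role_based_access",
        "encryption_at_rest",
        "audit_logging",
    },
}


def missing_controls(level: Sensitivity, controls_in_place: set[str]) -> set[str]:
    """Return the controls the level calls for that are not yet in place."""
    return REQUIRED_CONTROLS[level] - controls_in_place


# A restricted dataset that so far only has login and role checks.
gaps = missing_controls(Sensitivity.RESTRICTED, {"authentication", "role_based_access"})
print(sorted(gaps))
```

Because the levels are ordered, comparisons such as `Sensitivity.RESTRICTED > Sensitivity.CONFIDENTIAL` can also be used to apply "at least this sensitive" rules.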
US Government Data Classifications
The four data classification categories also mirror those adopted by the US government:
- Confidential – Any data that would cause some damage to national security if released.
- Secret – Any data that would cause “serious” damage to national security if released.
- Top secret – Any data that would cause “exceptionally grave” damage to national security if released.
Technically, the US enforces a three-category system. However, any information that doesn’t fall into one of the above is considered “public data,” operating as a de facto fourth category.
What are the Common Requirements or Compliance for Classifying Data?
Compliance requirements generally apply to organizations based on their industry or location and often have a specific range of data use cases. Requirements will be codified within a regulatory framework outlining what data must be protected and how it must be protected.
Classifying data based on the relevant compliance frameworks helps organize entire databases or datasets and derive insights on managing them optimally. Following the proliferation of cloud technologies and services, data transfer and storage locations are more challenging to track and manage than ever. No longer can organizations rely on the knowledge that physical documents are stored in one heavily secured location.
Digitalization means sensitive data could end up anywhere. Failing to manage this reality properly can sink an organization under legal and financial penalties or under lasting damage to its reputation with clients and the public.
Some of the frameworks that require or benefit from data classification include:
- PCI DSS – As mentioned above, this framework relates to credit card and cardholder data (CHD) and applies to every organization that collects, processes, stores, or transmits it. The PCI DSS comprises 12 Requirements and an extensive listing of sub-requirements for protecting this data (e.g., card numbers, PINs, cardholder address).
- Notably, some PCI DSS obligations can be outsourced (e.g., with third-party payment gateways), but the outsourcing organization still bears the ultimate compliance responsibility.
- HIPAA – The Health Insurance Portability and Accountability Act sets out the technical, administrative, and physical safeguards with which organizations in and adjacent to healthcare must secure individuals’ protected health information (PHI). Whether or not data is considered PHI depends on the presence of any of these 18 designated “identifiers”:
- Names
- Social security numbers
- Addresses more specific than state-level residence
- Any relevant dates more specific than the year
- Telephone numbers
- Fax numbers
- Email addresses
- Medical record numbers
- Health plan beneficiary numbers
- Account and payment numbers
- Certificates or license numbers
- Vehicle identifiers and serial numbers
- Device identifiers and serial numbers
- Web URLs
- Internet Protocol (IP) addresses
- Finger- or voiceprints and other biometric data
- Photographic images (not limited to those revealing individuals’ faces)
- Any other data that could be used to identify a unique individual
- CMMC – One of the more recent frameworks, the Cybersecurity Maturity Model Certification, will be mandatory for organizations seeking contracts with the US Department of Defense (DoD) by 2026. It stipulates the standard for protecting federal contract information (FCI) and controlled unclassified information (CUI) that could significantly damage national security if accessed by unauthorized personnel.
- GDPR – The EU’s General Data Protection Regulation enumerates and standardizes the data privacy rights held by individuals in EU member states. Crucially, the GDPR is applicable regardless of where pertinent data is stored or the location of the possessing organization. All that matters is whether the “data subject” is a resident of a nation within the European Union or European Economic Area (EEA).
- CCPA – Like the GDPR, the California Consumer Privacy Act (CCPA) governs the privacy and security of California residents’ personal information obtained by commercial entities. Similar state-level consumer protections have been enacted in Colorado, Connecticut, Utah, and Virginia.
The frameworks presented here cover only some of the compliance obligations your organization may be subject to on top of local, state, and federal laws. Others include those for financial record keeping outlined in the Sarbanes-Oxley Act (SOX) and data security practices assessed during any of the various SOC audits overseen by the American Institute of Certified Public Accountants (AICPA).
Thorough research and legal consultation may be necessary to determine all your obligations. However, data classification greatly assists those ongoing efforts once you know what standards must be adhered to.
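One way to picture how classification supports that effort is a simple lookup from data categories to the frameworks discussed above. This is only a sketch: the category names are hypothetical labels invented for the example, and whether a framework truly applies to your organization still depends on the research and legal review mentioned above.

```python
# Hypothetical category labels mapped to the frameworks named in this guide.
FRAMEWORKS_BY_CATEGORY = {
    "cardholder_data": {"PCI DSS"},
    "protected_health_information": {"HIPAA"},
    "federal_contract_information": {"CMMC"},
    "controlled_unclassified_information": {"CMMC"},
    "eu_personal_data": {"GDPR"},
    "california_personal_information": {"CCPA"},
}


def applicable_frameworks(categories: list[str]) -> list[str]:
    """Collect the frameworks suggested by the categories found in a dataset."""
    frameworks: set[str] = set()
    for category in categories:
        # Categories with no known framework (e.g., public marketing copy) add nothing.
        frameworks |= FRAMEWORKS_BY_CATEGORY.get(category, set())
    return sorted(frameworks)


# A customer database holding card data and EU residents' details.
print(applicable_frameworks(["cardholder_data", "eu_personal_data", "marketing_copy"]))
```

Keeping the mapping in one place means a newly classified dataset immediately shows which obligations to review.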
Data Classification Best Practices
Achieving successful and efficient information management becomes easier by following data classification best practices:
- Leverage scanning tools – Locating data within your network and systems is challenging enough, and adding cloud environments only complicates the matter. Manually sifting through every possible data storage location is simply not feasible, so you should implement scanning tools. For example, some scanners will look for known PII metadata and formats, like 9-digit social security numbers or 16-digit credit card numbers.
- Maintain accurate data inventory with documentation – The documentation and inventorying step for managing data represents one of the best times to begin classifying it. As you’re already cataloging and organizing the information, there’s no better time to apply the four data classification levels.
- Create a security program and implement controls per NIST guidelines – The National Institute of Standards and Technology (NIST) has released and updated numerous “Special Publications” (SP) related to data security and implementing a data classification policy. Some include:
- SP 800-53 – Among the most comprehensive information security guidance documents published by NIST, the controls outlined in SP 800-53 are the federal government’s standard. Meeting them builds a firm compliance foundation for US organizations. SP 800-53A and SP 800-53B contain supplementary guidance and control baselines. ISO/IEC 27001 is a comparable standard for international purposes.
- SP 800-66 – Referenced by the Department of Health and Human Services for HIPAA compliance, SP 800-66 contains guidance on conducting risk assessments. Risks are ranked by both impact severity and likelihood to determine which risks an organization should prioritize (low, medium, or high risk). Use your data classifications to conduct periodic risk assessments that help shape your broader information security program.
- SP 800-171 and SP 800-172 – Derived from and largely mappable to SP 800-53, these two documents specifically apply to the protection of CUI for national security. Levels 2 and 3 of the CMMC are primarily based on them.
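To make the scanning idea from the first best practice more concrete, here is a minimal sketch of a pattern-based scanner. It assumes social security numbers appear in the hyphenated 3-2-4 format and card numbers as 16 digits with optional spaces or hyphens, and it adds a Luhn checksum (a technique not mentioned above) to filter out 16-digit strings that cannot be card numbers. Commercial scanning tools are considerably more thorough.

```python
import re

# Assumed formats: SSNs as 3-2-4 digits with hyphens; card numbers as
# 16 digits with an optional single space or hyphen between digits.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")


def luhn_valid(candidate: str) -> bool:
    """Apply the Luhn checksum to the digits in the candidate string."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def scan_text(text: str) -> list[tuple[str, str]]:
    """Return (kind, matched_text) pairs for likely SSNs and card numbers."""
    findings = [("ssn", m.group()) for m in SSN_PATTERN.finditer(text)]
    for match in CARD_PATTERN.finditer(text):
        if luhn_valid(match.group()):
            findings.append(("card_number", match.group()))
    return findings


sample = "SSN 123-45-6789, card 4111 1111 1111 1111, order 1234567812345678"
print(scan_text(sample))
```

The order number in the sample has 16 digits but fails the checksum, so it is not reported. Matches like these are candidates for review and classification, not proof that sensitive data is present.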
At a high level, data classification is a straightforward discipline. Most best practices revolve around implementing, maintaining, and leveraging those classifications for easier ongoing management.
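The impact-and-likelihood ranking described for SP 800-66 above can be sketched as a small scoring function. The three-point scale, the multiplication, and the thresholds below are assumptions made for illustration; they are not values taken from the NIST publication.

```python
from enum import IntEnum


class Rating(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def risk_level(impact: Rating, likelihood: Rating) -> Rating:
    """Combine impact and likelihood into one rating (assumed thresholds)."""
    score = impact * likelihood  # possible values: 1, 2, 3, 4, 6, 9
    if score >= 6:
        return Rating.HIGH
    if score >= 3:
        return Rating.MEDIUM
    return Rating.LOW


def prioritize(risks: list[tuple[str, Rating, Rating]]) -> list[str]:
    """Order risk names from highest to lowest impact-times-likelihood score."""
    ranked = sorted(risks, key=lambda r: r[1] * r[2], reverse=True)
    return [name for name, _, _ in ranked]


# Hypothetical risks tied to classified data stores.
risks = [
    ("public site defacement", Rating.LOW, Rating.MEDIUM),
    ("restricted database breach", Rating.HIGH, Rating.MEDIUM),
    ("internal wiki leak", Rating.MEDIUM, Rating.MEDIUM),
]
print(prioritize(risks))
```

Here the data classification feeds the impact rating: the more sensitive the data involved, the higher the impact assigned to a risk that touches it.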
Maintain Optimal Data Classification with DataGrail
There are two primary reasons to classify data: It simplifies compliance requirements and helps businesses grow their reputation. Unfortunately, many organizations forget that adhering to compliance standards can help facilitate future partnerships and revenue growth.
Your partners and customers want to interact with entities they can trust. At DataGrail, we believe robust and transparent data protection practices are the foundation of that trust, and that those practices rely on thorough classification. That’s why we’ve built a data privacy platform that helps streamline adherence to data privacy regulations and facilitates better business.
Reach out today to learn more about how we can help streamline your data privacy compliance!
Loyola University of Chicago. 18 HIPAA Identifiers. https://www.luc.edu/its/aboutits/itspoliciesguidelines/hipaainformation/18hipaaidentifiers/
National Conference of State Legislatures. State Laws Related to Digital Privacy. https://www.ncsl.org/research/telecommunications-and-information-technology/state-laws-related-to-internet-privacy.aspx
NSF. Eight Steps to the New Cybersecurity Maturity Model Certification (CMMC) Now Required by the DoD. https://www.nsf.org/knowledge-library/eight-steps-new-cybersecurity-maturity-model-certification-cmmc-required-dod
NIST. SP 800-53 Rev. 5. https://csrc.nist.gov/publications/detail/sp/800-53/rev-5/final
NIST. SP 800-66 Rev. 2 (Draft). https://csrc.nist.gov/publications/detail/sp/800-66/rev-2/draft
NIST. SP 800-171 Rev. 2. https://csrc.nist.gov/publications/detail/sp/800-171/rev-2/final
NIST. SP 800-172. https://csrc.nist.gov/publications/detail/sp/800-172/final
PCI Security Standards Council. PCI DSS Requirements and Testing Procedures Version 4.0. https://listings.pcisecuritystandards.org/documents/PCI-DSS-v4_0.pdf
USC Dornsife. What is classified information, and who gets to decide? https://dornsife.usc.edu/news/stories/2609/what-is-classified-information-and-who-gets-to-decide/