Explaining Data Classification for GDPR, HIPAA, and Beyond

Sam Noss, August 29, 2022

Organizations use data classification to identify and categorize the personal data they hold, such as names, addresses, email addresses, phone numbers, account numbers, IP addresses, and other personally identifiable information (PII).

Organizations that classify their data know where customer data and other sensitive information live, who has access to them, and how they are processed. This allows organizations to improve their data management practices and apply appropriate safeguards and data access controls.

The European Union’s General Data Protection Regulation (GDPR) requires organizations to protect sensitive information by ensuring that company data processing practices are consistent with the protection principles outlined in the regulation. Making data classification an essential component of a privacy program can allow for easier, more efficient GDPR compliance.

This blog provides information on data classification and outlines why businesses operating in the EU should focus on ongoing classification and data discovery efforts.

4 Types of Data Classification by Level of Sensitivity

The four standard data classification levels are:

Public data (least sensitive) – Any information that is already freely accessible to anyone or can be made so (e.g., public records, press releases, and promotional materials). Even though this data is public, its owner may still be subject to some regulations on how it is shared, stored, and organized.
Internal-only or internal data – Any information restricted to an organization’s employees or members (e.g., business plans, internal communications). This data type can’t be shared outside of the organization.
Confidential data – Any sensitive information that requires elevated access permissions even within the organization, but won’t result in legal consequences if confidentiality is violated. This data is inaccessible without specific, role-based rights and isn’t shareable unless the recipients have been granted those same access rights.
Restricted data (most sensitive) – Any information whose unauthorized access carries significant legal and regulatory penalties, likely including criminal charges and substantial fines. This data is generally protected under a compliance framework or would severely damage the organization if released (e.g., sensitive customer or employee personal data, proprietary research and development).

Classifying data according to the above categories allows those responsible for data management to quickly determine if the sensitivity-based protections are present and appropriately enforced.
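
As a rough illustration, the four levels can be modeled as an ordered type so that a check for missing protections becomes a simple lookup. This is only a sketch: the control names and the `REQUIRED_CONTROLS` table are hypothetical placeholders, not requirements drawn from any regulation.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """The four levels described above, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Hypothetical minimum safeguards per level; a real policy would define its own.
REQUIRED_CONTROLS = {
    Sensitivity.PUBLIC: set(),
    Sensitivity.INTERNAL: {"authentication"},
    Sensitivity.CONFIDENTIAL: {"authentication", "role_based_access"},
    Sensitivity.RESTRICTED: {"authentication", "role_based_access", "encryption_at_rest"},
}


def missing_controls(level: Sensitivity, controls_in_place: set[str]) -> set[str]:
    """Return the safeguards a dataset at this level still lacks."""
    return REQUIRED_CONTROLS[level] - controls_in_place
```

Calling `missing_controls(Sensitivity.RESTRICTED, {"authentication"})` would report that role-based access and encryption at rest are still missing under this assumed policy.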

Data Classification for GDPR: Why It Matters

Data classification helps organizations identify which personal data is subject to specific GDPR requirements, like obtaining explicit consent from data subjects, or notifying data subjects in the event of a data breach.

By classifying personal data, organizations can apply appropriate safeguards and controls to protect it and ensure compliance with GDPR. Unorganized data is difficult to manage and can lead to hefty fines, penalties, and irreparable damage to an organization’s reputation.

Nearly every organization is subject to regulatory compliance requiring some sort of data oversight.

Dedicated personnel (e.g., data controllers) are responsible for that oversight, which includes tracking where sensitive and protected data is stored. Their roles depend on understanding the types of data their organization possesses, which means they rely on an accurate data classification process and ongoing data discovery.

Data classification is regularly used as a precursor to matching fields and information between different databases (i.e., “data mapping”). This is a crucial step for:

Database migrations
Mergers and acquisitions
IT systems and other resource integrations
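
To make the idea concrete, here is a minimal sketch of data mapping driven by classification. It assumes each database already has an inventory that tags every field with a data category; the field names, categories, and the `map_fields` helper are all hypothetical.

```python
from typing import Optional

# Hypothetical field inventories: field name -> classified data category.
crm_fields = {"full_name": "name", "email_addr": "email", "phone": "phone_number"}
billing_fields = {"customer_name": "name", "contact_email": "email", "card_no": "card_number"}


def map_fields(source: dict[str, str], target: dict[str, str]) -> dict[str, Optional[str]]:
    """Pair each source field with the target field that shares its category.

    Source fields with no counterpart in the target map to None.
    """
    by_category = {category: field for field, category in target.items()}
    return {field: by_category.get(category) for field, category in source.items()}


mapping = map_fields(crm_fields, billing_fields)
# mapping == {"full_name": "customer_name", "email_addr": "contact_email", "phone": None}
```

Fields that map to `None` are the ones a migration or integration team would need to review by hand.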

Simply put, establishing security measures and implementing data classification practices can help organizations avoid information management disasters.

How Does the US Government Classify Confidential Data?

The four data classification levels loosely mirror the categories the US government uses for classified information:

Confidential — Any data that would cause some national security damage if released.
Secret — Any data that would cause “serious” national security damage if released.
Top secret — Any data that would cause “exceptionally grave” national security damage if released.

Technically, the US enforces a three-category system. However, any information that doesn’t fall into one of the above is considered “public data,” operating as a de facto fourth category.


Other Compliance Guidance for Classifying Data

Classifying data based on relevant compliance frameworks helps organize entire databases or datasets and derive insights for optimal data management.

Digitalization means sensitive data could end up anywhere. Failing to manage data sprawl can expose an organization to legal and financial penalties, client loss, or reputational damage.

Some of the frameworks that require or benefit from data classification include:

PCI DSS — This framework applies to every organization that collects, processes, stores, or transmits credit card and cardholder data (CHD). The PCI DSS comprises 12 Requirements and an extensive listing of sub-requirements for protecting data like card numbers, PINs, and cardholder addresses.
- Some PCI DSS obligations can be outsourced to third-party payment gateways, but the outsourcing organization still bears the ultimate compliance responsibility.
HIPAA – The Health Insurance Portability and Accountability Act (HIPAA) outlines the technical, administrative, and physical safeguards that healthcare and healthcare-adjacent organizations must use to secure protected health information (PHI). Whether data is considered PHI depends on the presence of any of these 18 designated “identifiers”:
- Names
- Social security numbers
- Addresses more specific than state-level residence
- Any relevant dates more specific than the year
- Telephone numbers
- Fax numbers
- Email addresses
- Medical record numbers
- Health plan beneficiary numbers
- Account and payment numbers
- Certificates or license numbers
- Vehicle identifiers and serial numbers
- Device identifiers and serial numbers
- Web URLs
- Internet Protocol (IP) addresses
- Finger- or voiceprints and other biometric data
- Photographic images (not limited to those revealing an individual’s face)
- Any additional data that could be used to identify a unique individual

CMMC — The Cybersecurity Maturity Model Certification will be mandatory for organizations seeking contracts with the US Department of Defense (DoD) by 2026. It specifies the standard for protecting federal contract information (FCI) and controlled unclassified information (CUI) that could significantly damage national security if accessed by unauthorized personnel.
GDPR – The EU’s General Data Protection Regulation (GDPR) outlines and standardizes the data privacy rights of individuals in the EU. Crucially, the GDPR applies regardless of where the data is stored or where the organization holding it is located. What matters is whether the “data subject” is located in the European Union or the European Economic Area (EEA).
CCPA – Like the GDPR, the California Consumer Privacy Act (CCPA) governs the privacy and security of California residents’ personal information collected by commercial entities. Similar state-level consumer protections have been enacted in Colorado, Connecticut, Utah, and Virginia.

These frameworks cover only some compliance obligations your organization may be subject to on top of local, state, and federal laws. Others include those for financial record keeping outlined in the Sarbanes-Oxley Act (SOX) and data security practices assessed during any of the various SOC certifications overseen by the American Institute of Certified Public Accountants (AICPA).

Thorough research and legal consultation may be necessary to determine all compliance obligations. However, data classification greatly assists ongoing efforts once you know to which standards you must adhere.
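
Once the applicable standards are known, the link between classified data categories and the frameworks discussed above can be recorded as a simple lookup. The sketch below is illustrative only: the category labels and the `frameworks_for` helper are hypothetical, and the table is not a substitute for legal review.

```python
# Illustrative only: which frameworks named above typically concern each data category.
# The category labels are hypothetical; legal review decides actual applicability.
FRAMEWORKS_BY_CATEGORY = {
    "cardholder_data": {"PCI DSS"},
    "protected_health_information": {"HIPAA"},
    "controlled_unclassified_information": {"CMMC"},
    "eu_personal_data": {"GDPR"},
    "california_personal_information": {"CCPA"},
}


def frameworks_for(categories: set[str]) -> set[str]:
    """Collect every framework that may apply to a dataset's categories."""
    applicable: set[str] = set()
    for category in categories:
        # Unknown categories contribute nothing rather than raising an error.
        applicable |= FRAMEWORKS_BY_CATEGORY.get(category, set())
    return applicable
```

A dataset tagged with both `cardholder_data` and `eu_personal_data` would then be flagged for PCI DSS and GDPR review.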

Data Classification Best Practices

Following data classification best practices makes efficient information management easier to achieve:

Leverage scanning tools — Locating data within your network and systems is challenging enough, and adding cloud environments only complicates the matter. Manually sifting through every possible data storage location is simply infeasible, so you should implement scanning tools. For example, some scanners will look for known PII metadata and formats, like 9-digit social security numbers or 16-digit credit card numbers.
Maintain accurate data inventory with documentation — The documentation and inventorying step for managing data represents one of the best times to begin classification. As you’re already cataloging and organizing the information, there’s no better time to apply the four data classification levels.
Create a security program and implement controls per NIST guidelines – The National Institute of Standards and Technology (NIST) has released numerous “Special Publications” (SP) related to data security and data classification policy implementation. Some include:
- SP 800-53 – Among the most comprehensive information security guidance documents published by NIST, the controls outlined in SP 800-53 are the federal government’s standard. Meeting them builds a firm compliance foundation for US organizations. SP 800-53A and SP 800-53B contain supplementary guidance and control baselines. ISO/IEC 27001 is a comparable standard for international purposes.
- SP 800-66 – Referenced by the Department of Health and Human Services for HIPAA compliance, SP 800-66 contains guidance on conducting risk assessments. Risks are ranked by both impact severity and likelihood (as low, medium, or high) to determine which an organization should prioritize. Use your data classifications to conduct periodic risk assessments that help shape your broader information security program.
- SP 800-171 and SP 800-172 — Derived from and largely mappable to SP 800-53, these two documents specifically apply to the protection of CUI for national security. Levels 2 and 3 of the CMMC are primarily based on these special publications.
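
The format-based scanning mentioned under the first best practice can be sketched in a few lines. This example assumes SSNs are written as 123-45-6789 and card numbers as 16 digits with optional spaces or hyphens; it adds a Luhn checksum to filter out 16-digit strings that are not plausible card numbers. Real scanning tools cover far more formats and contexts, so treat the patterns and the `scan_text` helper as illustrative.

```python
import re

# Simplified patterns, assumed for illustration: SSNs written as 123-45-6789
# and 16-digit card numbers with optional spaces or hyphens between groups.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def passes_luhn(digits: str) -> bool:
    """Luhn checksum, used to discard 16-digit strings that are not card numbers."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def scan_text(text: str) -> list[tuple[str, str]]:
    """Return (kind, match) pairs for every candidate found in the text."""
    findings = [("ssn", m.group()) for m in SSN_PATTERN.finditer(text)]
    for m in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if passes_luhn(digits):
            findings.append(("card", m.group()))
    return findings
```

Anything the scanner flags is a candidate for the restricted level until someone confirms otherwise; a pattern match alone does not prove the value is real PII.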

At a high level, data classification is a straightforward discipline. Best practices revolve around implementing, maintaining, and leveraging those classifications for easier ongoing organizational data management.
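
The risk ranking described under SP 800-66 can also be illustrated with a small sketch. The scoring scheme and thresholds below are assumptions made for this example and are not taken from the NIST publication; the `risk_rating` and `prioritize` helpers are hypothetical.

```python
LEVELS = {"low": 1, "medium": 2, "high": 3}


def risk_rating(impact: str, likelihood: str) -> str:
    """Combine impact and likelihood into a low/medium/high rating.

    The thresholds are an assumption for illustration, not taken from NIST.
    """
    score = LEVELS[impact] * LEVELS[likelihood]
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"


def prioritize(risks: list[dict]) -> list[dict]:
    """Sort risks so the highest-rated come first."""
    return sorted(
        risks,
        key=lambda r: LEVELS[risk_rating(r["impact"], r["likelihood"])],
        reverse=True,
    )
```

Under this assumed scheme, a risk with high impact and high likelihood rates as high, while one with high impact but low likelihood rates as medium.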

Data Classification With DataGrail

There are two primary reasons to classify data: It simplifies compliance requirements and helps businesses grow their reputations. Unfortunately, many organizations forget that adhering to compliance standards can help facilitate future partnerships and revenue growth.

Your partners and customers want to interact with entities they can trust. At DataGrail, we believe robust and transparent data protection practices are foundational for trust and rely on thorough classification. That’s why we’ve built a data privacy platform that streamlines adherence to data privacy regulations, builds trust, and outsmarts business risk.

Reach out today to learn more about how we can help streamline your data privacy compliance!

Sources:

Loyola University of Chicago. 18 HIPAA Identifiers. https://www.luc.edu/its/aboutits/itspoliciesguidelines/hipaainformation/18hipaaidentifiers/

National Conference of State Legislatures. State Laws Related to Digital Privacy. https://www.ncsl.org/research/telecommunications-and-information-technology/state-laws-related-to-internet-privacy.aspx

NSF. Eight Steps to the New Cybersecurity Maturity Model Certification (CMMC) Now Required by the DoD. https://www.nsf.org/knowledge-library/eight-steps-new-cybersecurity-maturity-model-certification-cmmc-required-dod

NIST. SP 800-53 Rev. 5. https://csrc.nist.gov/publications/detail/sp/800-53/rev-5/final

NIST. SP 800-66 Rev. 2 (Draft). https://csrc.nist.gov/publications/detail/sp/800-66/rev-2/draft

NIST. SP 800-171 Rev. 2. https://csrc.nist.gov/publications/detail/sp/800-171/rev-2/final

NIST. SP 800-172. https://csrc.nist.gov/publications/detail/sp/800-172/final

PCI Security Standards Council. PCI DSS Requirements and Testing Procedures Version 4.0. https://listings.pcisecuritystandards.org/documents/PCI-DSS-v4_0.pdf

USC Dornsife. What is classified information, and who gets to decide? https://dornsife.usc.edu/news/stories/2609/what-is-classified-information-and-who-gets-to-decide/
