
Data Minimization - Collect Only What You Need


Data minimization is the principle of collecting, processing, and retaining only the minimum amount of data necessary to achieve a specific purpose. GDPR Article 5(1)(c) states it explicitly: personal data must be "adequate, relevant and limited to what is necessary." It is one of the core principles of European data protection law. It is also more than a legal obligation: from a security standpoint it is a simple yet powerful defensive strategy, because data you do not hold cannot be leaked.

Why Data Minimization Strengthens Security

The essence of data minimization lies in "reducing the attack surface." If the amount of data retained is small, then in the event of a data breach, both the number of affected individuals and the types of data leaked are limited.

When retaining large amounts of data

The damage in case of a breach is enormous. Fines from regulators are also substantial. Storage costs, encryption costs, and the management burden of access control all increase.

When retaining only the minimum necessary data

The scope of impact in case of a breach is limited. The cost of regulatory compliance is also reduced. With less data to maintain, data quality is easier to manage, which can in turn improve analytical accuracy.

In the 2017 Equifax data breach, the Social Security numbers, dates of birth, and addresses of 147 million people were leaked. One factor that amplified the damage was the long-term retention of historical data unnecessary for credit checks. Had data minimization been thoroughly enforced, the volume and variety of leaked data could have been substantially reduced.

Concrete Implementation Examples

| Measure | Concrete action | Effect |
| --- | --- | --- |
| Reduce registration form fields | Allow registration with only an email address, and collect name, address, and phone number only when they become necessary | Higher registration rate and less retained data |
| Set data retention periods | Automatically delete access logs after 90 days and data of withdrawn users after 30 days | Prevents bloating of accumulated data |
| Separate data by purpose | Anonymize analytical data before moving it to a separate database, and delete personal information from the production DB | Reduces risk in the production environment |
| Minimize privileges | Show only the order history to customer support and hide credit card numbers | Reduces the risk of internal fraud |
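As an illustration of the "Minimize privileges" measure, the following is a minimal Python sketch of a per-role field allowlist. The role names, field names, and the `view_for_role` helper are hypothetical and not taken from any particular system; a real application would more likely enforce this at the database or API layer.

```python
from dataclasses import dataclass, asdict

# Hypothetical allowlist: which fields each role may see.
# Anything not listed is hidden (deny by default).
ALLOWED_FIELDS = {
    "support": {"customer_id", "order_history"},
    "billing": {"customer_id", "order_history", "card_last4"},
}


@dataclass
class CustomerRecord:
    customer_id: str
    email: str
    order_history: list
    card_number: str


def view_for_role(record: CustomerRecord, role: str) -> dict:
    """Return only the fields that the given role is allowed to see."""
    allowed = ALLOWED_FIELDS.get(role, set())  # unknown roles see nothing
    data = asdict(record)
    # Derive a masked value so that no role ever needs the full card number.
    data["card_last4"] = data["card_number"][-4:]
    return {key: value for key, value in data.items() if key in allowed}


record = CustomerRecord("c1", "a@example.com", ["order-1"], "4111111111111111")
support_view = view_for_role(record, "support")
```

Here `support_view` contains only `customer_id` and `order_history`. The full card number is never returned to any role, because `card_number` itself appears in no allowlist.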

Combining it with anonymization and data classification further enhances the effectiveness of data minimization. It is important to classify which data is confidential and which is unnecessary, and then run a cycle of actively deleting the unnecessary data.
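A minimal sketch of preparing a row for an analytics database is shown below, assuming a simple classification in which some fields are direct identifiers. The field names, the `to_analytics_row` helper, and the ISO-formatted `birth_date` are assumptions made for illustration. Note that replacing an ID with a keyed hash is pseudonymization rather than full anonymization: whoever holds the key can still link tokens back to users.

```python
import hashlib
import hmac

# Hypothetical classification: fields treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "address"}


def to_analytics_row(row: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and replace user_id with a keyed token."""
    out = {
        key: value
        for key, value in row.items()
        if key not in DIRECT_IDENTIFIERS and key != "user_id"
    }
    # Keyed hash (HMAC-SHA256): stable per user, not reversible without the key.
    out["user_token"] = hmac.new(
        secret_key, str(row["user_id"]).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # Generalize the birth date (assumed "YYYY-MM-DD") to the year only.
    if "birth_date" in out:
        out["birth_year"] = out.pop("birth_date")[:4]
    return out


row = {
    "user_id": 42,
    "name": "Alice",
    "email": "alice@example.com",
    "birth_date": "1990-05-17",
    "plan": "pro",
}
analytics_row = to_analytics_row(row, b"example-key")
```

The resulting row keeps only `plan`, `birth_year`, and `user_token`, which is enough for per-user aggregation without carrying names or contact details into the analytics environment.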

Trade-offs with Business

The biggest obstacle to data minimization is the request from business units to "keep it because we might use it in the future." There is a deeply ingrained belief that the more data you hold, the greater the business value, in areas such as marketing analytics, training machine learning models, and long-term trend analysis of customer behavior.

However, the more data you retain, the higher the management costs (storage, encryption, access control, audit logs) and the greater the legal risk in the event of a breach. GDPR fines for violating the basic processing principles can reach 20 million euros or 4% of total worldwide annual turnover, whichever is higher, so the risk posed by retaining unnecessary data cannot be ignored. Following the principle of privacy by design and building minimization into the design stage of data collection is the best approach to balancing business and privacy.
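To make the "whichever is higher" rule concrete, here is a small worked calculation in Python. The function name is hypothetical, and the result is only the statutory upper bound for a given turnover, not a prediction of what a regulator would actually impose.

```python
FIXED_CAP_EUR = 20_000_000  # fixed amount from the rule described above
TURNOVER_PERCENT = 4        # percentage of worldwide annual turnover


def gdpr_fine_cap_eur(annual_turnover_eur: int) -> int:
    """Upper bound of the fine: the higher of the fixed cap and 4% of turnover."""
    percent_based = annual_turnover_eur * TURNOVER_PERCENT // 100
    return max(FIXED_CAP_EUR, percent_based)


small_company_cap = gdpr_fine_cap_eur(100_000_000)    # 4% would be 4 million
large_company_cap = gdpr_fine_cap_eur(1_000_000_000)  # 4% is 40 million
```

For a company with 100 million euros in turnover the fixed 20 million euro amount applies, while for a company with 1 billion euros in turnover the 4% figure (40 million euros) becomes the cap.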

Designing a Data Retention Policy

To make data minimization an effective measure, a clearly documented data retention policy is essential. The policy should include the following elements.

  • Retention periods for each data category (e.g., transaction logs for 7 years, marketing data for 2 years)
  • A mechanism for automatically deleting data past its retention period (because manual deletion gets forgotten)
  • Exception provisions for data subject to legal retention obligations (tax records, litigation holds, etc.)
  • Special handling provisions for personally identifiable information
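The elements above can be sketched as a small Python model: a retention table per data category, an automatic expiry check, and an exception for records under a legal hold. The category names, the `Record` type, and the `purge` helper are hypothetical. The periods reuse the examples from this article, and treating a year as 365 days is a simplification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Retention period per data category (illustrative values).
RETENTION = {
    "transaction_log": timedelta(days=7 * 365),
    "marketing": timedelta(days=2 * 365),
    "access_log": timedelta(days=90),
}


@dataclass
class Record:
    record_id: str
    category: str
    created_at: datetime
    legal_hold: bool = False  # e.g. tax records or a litigation hold


def is_expired(record: Record, now: datetime) -> bool:
    """True if the record is past its retention period and not exempt."""
    if record.legal_hold:
        return False  # legal retention obligations override deletion
    period = RETENTION.get(record.category)
    if period is None:
        return False  # unclassified data is kept here; it should be reviewed
    return now - record.created_at > period


def purge(records: list, now: datetime) -> tuple:
    """Split records into those kept and the IDs of those to delete."""
    kept, deleted_ids = [], []
    for record in records:
        if is_expired(record, now):
            deleted_ids.append(record.record_id)
        else:
            kept.append(record)
    return kept, deleted_ids


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
records = [
    Record("a", "access_log", now - timedelta(days=91)),
    Record("b", "access_log", now - timedelta(days=91), legal_hold=True),
    Record("c", "marketing", now - timedelta(days=365)),
    Record("d", "transaction_log", now - timedelta(days=8 * 365)),
]
kept, deleted_ids = purge(records, now)
```

In this run, records `a` and `d` are selected for deletion, while `b` is kept because of its legal hold and `c` is still within its two-year period. Running such a job on a schedule is one way to avoid relying on manual deletion.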

Reading the articles "Balancing Privacy and Convenience," "Privacy Settings Guide," and "Data Breach Response" will give you a fuller picture of data protection. Data protection books on Amazon will help you deepen your understanding from both the regulatory and technical sides.
