The field of privacy has long been the realm of lawyers and consultants who are responsible for developing and implementing privacy policies, handling contractual risks to data, managing privacy risk in business processes, and ensuring effective notice and publishing practices. Privacy engineering, however, has developed to effectively manage and govern in practical terms how data is used in the associated products and technologies. As a discipline, privacy engineering is relatively new, with regulations such as the European Union's General Data Protection Regulation (GDPR) offering an incentive for organizations to implement privacy engineering policies and practices.
In this way, privacy engineering is often considered the technical side of the privacy profession: privacy engineers work to ensure that privacy considerations are integrated into product design, working as part of product teams, design teams, IT teams, security teams, and sometimes legal or compliance teams. Meanwhile, ISACA's publication Privacy in Practice 2021: Data Privacy Trends, Forecasts and Challenges notes that while boards of directors generally recognize the importance of strong privacy, especially in the face of hefty fines for violating privacy regulations and the reputational harm associated with the publicity of these violations, the adoption of privacy practices is not as widespread as might be assumed.
The report, based on a survey of more than 1,800 constituents, found that at least 52 percent of privacy professionals said their board of directors prioritized privacy, while nearly 50 percent of respondents said their privacy budgets were inadequate, compared to only 34 percent who said their privacy budgets were adequately funded. A further 64 percent cited poor training or lack of training as a common privacy failure, 53 percent listed failure to perform a risk analysis, and 50 percent listed bad or nonexistent detection of personal information.
The foundational principles of the field of privacy engineering are often attributed to Dr. Ann Cavoukian, the former Information and Privacy Commissioner of Ontario, who published a pamphlet on the foundational principles of privacy engineering in January 2011. These principles build on a concept Cavoukian developed in the 1990s called Privacy by Design (PbD), meaning that privacy is part of the design of a product or service.
These principles, also considered the foundational principles of Privacy by Design, include proactivity, privacy as default, embedded privacy, full functionality, end-to-end security, visibility and transparency, and respect for user privacy.
In this approach, proactivity means being proactive and not reactive, or preventative and not remedial. This includes anticipating and preventing privacy invasive events before they happen, rather than waiting for these risks to materialize.
Privacy as default in this approach means that the default behavior of a given piece of software or product is to protect the user's privacy: if an individual does nothing, their privacy remains protected. The individual's privacy is kept intact without requiring action on their part to protect it.
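As a minimal sketch of privacy as the default in code, the following hypothetical settings object defaults every sharing option to its most protective value, so a user who does nothing stays protected; the class and field names are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class PrivacySettings:
    """Hypothetical user settings: every field defaults to the most
    privacy-protective value, so inaction leaves the user protected."""
    share_usage_analytics: bool = False   # opt-in, never opt-out
    personalized_ads: bool = False        # off until explicit consent
    public_profile: bool = False          # private by default
    location_tracking: bool = False       # disabled by default

# A newly created account gets protective defaults without any user action.
settings = PrivacySettings()
assert not settings.personalized_ads
```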
Embedded privacy in this approach means that privacy is part of the design and architecture of both IT systems and business practices, rather than bolted on as needed after the fact.
In the Privacy by Design approach, full functionality means a positive-sum approach rather than zero-sum design. A positive-sum approach is capable of accommodating all legitimate interests and objectives in a "win-win" manner, avoiding false dichotomies, such as privacy versus security, and instead working to have both.
End-to-end security means embedding security and privacy throughout the entire lifecycle of the data involved, with strong security measures being part of the privacy from start to finish. This would also work to ensure that all data is securely retained and securely destroyed at the end of the process and in a timely fashion.
Visibility and transparency means working to assure all stakeholders that whatever business practice or technology is involved is operating according to stated promises and objectives, subject to independent verification, with component parts and operations remaining visible and transparent to users and providers alike.
Respect for user privacy means that all architects and operators work to keep the interests of the individual uppermost by offering such measures as strong privacy defaults, appropriate notice, and user-friendly options.
As laid out in the above principles, the paradigm of Privacy by Design has gained importance since its inclusion in privacy regulations, most notably the European Union's General Data Protection Regulation (GDPR). With its principle of embedding privacy into the design and architecture of an IT system, rather than integrating it as a separate component, PbD and its principles offer a solution to privacy concerns.
However, translating these principles into engineering practice can still be a challenge, as has been acknowledged by, among others, the European Union Agency for Network and Information Security (ENISA). Despite the GDPR being in effect since 2018, there remains no standard or agreed-upon best practice for how to integrate privacy into the software development lifecycle.
Other foundations for privacy engineering have been suggested, including the implementation of Fair Information Practice Principles (FIPPs), which were initially developed by a federal advisory committee in 1973, commissioned because of concern over the possible consequences of computerized data systems for the privacy of personal information. Different versions of FIPPs have been defined, with one version found in the Guidelines on the Protection of Privacy and Transborder Flows of Personal Data, developed by the Organisation for Economic Co-operation and Development (OECD) in 1980 and later replaced in 2013 by the OECD Privacy Framework.
The principles published in the OECD documents are often considered the most widely adopted privacy framework and have served as the basis for privacy laws worldwide.
2013 OECD Privacy Framework Privacy Principles
Often organizations combine Privacy by Design with the OECD Privacy Framework, and they rely on the following (and potentially other) activities to address privacy risks:
- Policy
- Risk assessments
- Notice
- Records management
- Accounting of disclosures
- Data flow mapping
- Data loss prevention
- Metrics
Implementing or operationalizing Privacy by Design as more than a philosophical framework within IT systems can be done through the following:
- Segmenting PbD into activities aligned with the systems engineering life cycle and supported by particular methods to account for privacy's distinctive characteristics.
- Defining and implementing requirements for addressing privacy risks within the systems engineering life cycle using architectural, technical point, and policy controls. Privacy requirements must be defined in terms of implementable system functionality and properties, and privacy risks, including those beyond compliance risks, must be identified and adequately addressed.
- Supporting deployed systems by aligning system usage and enhancement with a broader privacy program.
The goal of this kind of framework is to integrate privacy into an organization's existing systems engineering processes, rather than to create a new or separate process. The core privacy engineering activities are designed to be mapped to stages of the classical systems engineering life cycle. A mapping exists for other systems engineering life cycles, including agile development, because every life cycle includes the core activities in some form.
One important facet of privacy engineering is to treat privacy as more than a compliance issue, seeing it instead as a problem of strategy, management, and technology. Security systems are traditionally represented with a confidentiality, integrity, and availability triad. Privacy engineering objectives and controls can be expressed in a similar triad of predictability, manageability, and disassociability.
Predictability is meant to build trust and provide accountability, and it requires a good understanding of data handling in the system. The purpose is to eliminate surprises in later phases of system use: predictability means knowing how data is really used and helps avoid questions about why some data is being collected, which can be an important aspect of achieving transparency. Objectives of predictability can be met with technical solutions such as de-identification and anonymization. In general, predictability should also ensure that both system owners and users understand what is going on.
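As a minimal sketch of one such technical solution, the following hypothetical de-identification routine strips direct identifiers from a record and generalizes the birthdate; the field names are assumptions for illustration, not a standard schema:

```python
# Minimal de-identification sketch: strip or mask direct identifiers
# before records leave the system of origin.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "address"}

def deidentify(record: dict) -> dict:
    """Return a copy of the record with direct identifiers removed
    and the birthdate generalized to a year."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "birthdate" in out:                    # e.g. "1990-04-17" -> "1990"
        out["birthdate"] = str(out["birthdate"])[:4]
    return out

print(deidentify({"name": "A. User", "email": "a@example.com",
                  "birthdate": "1990-04-17", "plan": "basic"}))
# {'birthdate': '1990', 'plan': 'basic'}
```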
Manageability concerns individual phases of data processing, such as collection, storage, and change, and relates to accountability. Manageability works to ensure that it is possible to control data on all system layers and that this control is exercised in an accountable manner. This also works to ensure that the data is accurate and can be updated if needed. Manageability is also the layer of control offered to the user, often through privacy preferences. It can help answer the questions of whether a user has opted out from parts of data processing and whether the organization is respecting that opt-out.
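A manageability control such as an opt-out check might look like the following sketch, in which a simple in-memory preference store (hypothetical here; a real system would persist it) is consulted before any processing takes place:

```python
# Hypothetical preference store: processing code consults it before
# acting, so an opt-out recorded anywhere is actually respected.
OPT_OUTS: dict[str, set[str]] = {}  # user_id -> purposes opted out of

def record_opt_out(user_id: str, purpose: str) -> None:
    OPT_OUTS.setdefault(user_id, set()).add(purpose)

def may_process(user_id: str, purpose: str) -> bool:
    return purpose not in OPT_OUTS.get(user_id, set())

record_opt_out("u123", "marketing")
assert not may_process("u123", "marketing")  # opt-out is enforced
assert may_process("u123", "billing")        # unrelated processing continues
```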
Disassociability refers to identity, identifiers, and identifier linkage as design and protection challenges. This aspect of privacy engineering, done well, can offer real privacy-preserving functionality. It is also the privacy engineering layer where cryptography can be applied to obfuscate data, so privacy engineers here are required to stay up to date with research, technology, trends, and cryptographic techniques. Disassociability is also a layer in which techniques such as anonymity, de-identification, unlinkability, unobservability, and pseudonymity are important.
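One common building block for disassociability is pseudonymization with a keyed hash, sketched below under the assumption of per-domain secret keys; the domain names and key handling are illustrative only:

```python
import hashlib
import hmac
import os

# Keyed hashing (HMAC) replaces a direct identifier with a pseudonym that
# is stable within one domain but cannot be linked across domains that
# use different keys.
KEY_PER_DOMAIN = {"analytics": os.urandom(32), "support": os.urandom(32)}

def pseudonymize(user_id: str, domain: str) -> str:
    return hmac.new(KEY_PER_DOMAIN[domain], user_id.encode(),
                    hashlib.sha256).hexdigest()

a = pseudonymize("user-42", "analytics")
s = pseudonymize("user-42", "support")
assert a != s  # the same person is not linkable across the two domains
```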
Privacy engineering is an approach in which privacy is implemented in the design of systems, while Privacy by Design advocates for privacy to be one of the foremost goals of the engineering process. This works to ensure that privacy is part of the development of a product and that, in the drive to bring products to market before the competition, privacy is not overlooked. Part of the data that privacy engineering works to protect is personally identifiable information (PII), which can include:
- Name
- Gender
- Address
- Birthdate
- Phone number
- Vehicle description and number
- Occupation
- Place of business
- Geographic location and movement
- Email address
- Biometric data
For some consumers, it can be surprising to find that an activity as simple and common as a "like" on Facebook can be a reasonable predictor of age, gender, ethnic group, religion, marital status, sexual orientation, and political views. Responsibility for protecting PII is also unclear, with over half of consumers believing it is the task of the organization, while what constitutes privacy varies legally from one country to another. Similarly, the legal obligation for the protection and maintenance of collected data is fairly unclear and changes from one country to another.
Privacy enforcement has accelerated since the European Union's General Data Protection Regulation (GDPR) came into effect in 2018, and since then more than forty privacy laws have been enacted worldwide. These privacy regulations come with expensive fines: by mid-August of 2020, privacy violations and related fines in the European Union were estimated to be worth around 60 million euros. The maximum penalty against a company under the GDPR is up to 20 million euros, or up to 4 percent of the company's global turnover for the preceding fiscal year, whichever is higher.
Under the GDPR, data subjects, or consumers, must give explicit, unambiguous consent before the collection of their personal data. The definition of personal information is incredibly important in privacy legislation: in the United States, a computer's IP address is generally not considered personal data, but under the GDPR it is. The GDPR also requires organizations to notify supervisory authorities and data subjects within seventy-two hours in the event of a data breach affecting personal information. Data subjects are also given rights regarding their personal information, including the following:
- The right to be informed, including informed about the collection and use of personal data when the data is obtained
- The right to access their data, where a data subject can request a copy of their personal data; and data controllers must explain the means of collection, what is being processed, and with whom it is shared
- The right of rectification, in which a data subject has the right to ask an organization to rectify data if it is inaccurate or incomplete
- The right to erasure, in which data subjects can request the erasure of personal data related to them
- The right to restrict processing, in which data subjects have the right to request the restriction or suppression of their personal data
- The right to data portability, in which data subjects can have their data transferred from one system to another safely and securely (a minimal export sketch follows this list)
- The right to object, where data subjects can object to how their information is used for marketing, sales, or non-service-related purposes
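As a minimal sketch of supporting data portability in code, the following hypothetical function assembles a user's records from several stores into one machine-readable JSON document; the store names and lookup functions are assumptions:

```python
import json
from datetime import datetime, timezone

def export_user_data(user_id: str, stores: dict) -> str:
    """Assemble a user's records from each data store into a single
    machine-readable JSON document for a portability request.
    `stores` maps a store name to a lookup function; both are hypothetical."""
    payload = {
        "user_id": user_id,
        "exported_at": datetime.now(timezone.utc).isoformat(),
        "data": {name: lookup(user_id) for name, lookup in stores.items()},
    }
    return json.dumps(payload, indent=2)

print(export_user_data("u123", {
    "profile": lambda uid: {"plan": "basic"},
    "orders": lambda uid: [{"id": 1, "total": 9.99}],
}))
```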
Similarly, in Quebec, Canada, Bill 64, a data protection and privacy regulation, has set fines for breaching the regulation at $25 million or a similar 4 percent of global turnover for the previous fiscal year, with tougher breach-notification requirements and mandatory privacy assessments for any information system project or electronic service delivery project involving the collection, use, communication, keeping, or destruction of personal information. Similar to the GDPR, if data is to be transferred out of its jurisdiction, it must receive similar levels of protection at the destination and during transit, or it may not be transferred.
Data privacy legislation
With many new laws, regulations, and protections coming into effect, and with more anticipated as concerns around data have grown, there are specific engineering practices that can be utilized to increase an organization's capability to stay compliant with those laws and regulations. This can include adopting a clear technical framework, strengthening inconsistent or unreliable services, mapping and classifying data, maintaining version control for databases, automating where possible, and ensuring changes are backed up.
A clear technical framework is important as the messiness caused by poor systems design and operations built over time can create inevitable debt and cascading dysfunction downstream. Demonstrating compliance with any future privacy or data regulation will require an auditable source of truth for data processing, helping ensure that engineers and lawyers are not forced to make critical decisions on the basis of inaccurate assumptions or incomplete or misleading information. To solve such problems, an organization can adopt a core technical framework where privacy controls can be applied to any new or existing system.
For example, it should be easy for a developer creating a system to hook into a core personal data deletion process. In environments characterized by continual change, completeness requires dedicated and vigilant systems management. This goes for SaaS data stores too, as teams procure new vendors, and these systems should also interface with data governance workflows.
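A minimal sketch of such a deletion hook, assuming a simple in-process registry (a production version would need persistence, retries, and auditing), might look like this:

```python
from typing import Callable

# Hypothetical core deletion process: each system registers a handler
# once, and a single entry point fans the request out to every store.
_DELETION_HANDLERS: list[Callable[[str], None]] = []

def on_user_deletion(handler: Callable[[str], None]) -> None:
    """Called by each system (including SaaS integrations) at startup."""
    _DELETION_HANDLERS.append(handler)

def delete_user(user_id: str) -> None:
    for handler in _DELETION_HANDLERS:
        handler(user_id)  # production: retries, logging, dead-lettering

on_user_deletion(lambda uid: print(f"primary db: deleted {uid}"))
on_user_deletion(lambda uid: print(f"crm vendor: deletion requested for {uid}"))
delete_user("u123")
```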
Strengthening inconsistent and unreliable services can help infrastructure engineers keep important, business-critical systems up and running despite natural disasters, neighboring system outages, and configuration errors. An outage or poor performance in these services can allow failures in data deletion, logging, or user communication to escalate into compliance violations. And regardless of the specific deliverables, an organization needs to be confident that technical systems can reliably enable users to exercise their rights under the law, such as data access and portability requests, data deletion, or opt-out preferences.
Maintaining an effective failover and incident response protocol means engineering organizations classify the systems critical to privacy procedures. This classification can help provide a technical foundation for any additional legal requirements that may need to be built on later as regulations increase.
Mapping and classifying data is a basic requirement of existing privacy regulations, which already require organizations to be developing frameworks for tracking what data an organization has and where. However, because this is often a time- and engineering-intensive endeavor, it's a good idea to leverage as many opportunities as possible to refine, mature, and expand visibility into and understanding of the data that exists inside an organization. Existing privacy regulations and proposed measures all intend to ensure protections and controls for data usage, access, and deletion. This makes data mapping and classification critical for current and future laws; regardless of the final details dictated by law, compliance is only possible when an organization knows what data it has and where that data is housed.
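As a rough illustration of pattern-based classification, the sketch below flags values that look like common PII shapes; real discovery tools use many more signals, and these regular expressions are simplified assumptions:

```python
import re

# Very rough pattern-based classifier: scan free-text values for
# common PII shapes. Patterns here are illustrative, not exhaustive.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def classify(value: str) -> set[str]:
    return {label for label, rx in PII_PATTERNS.items() if rx.search(value)}

assert classify("reach me at jane@example.com") == {"email"}
assert classify("call +1 (555) 010-2345") == {"phone"}
```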
Ensuring an engineering organization uses the same version control tools for a database as it uses for application development can ensure that teams have access to the latest version of database code. Maintaining a single source of truth with a full audit trail is critical for regulatory compliance, should changes that impact user data need to be remediated or explained. This can be important for privacy, as code clashes and inconsistencies can break processes for delivering data rights to consumers, such as access to their personal data, and threaten an organization's compliance with the different regulations and laws governing privacy.
As an organization builds the infrastructure and ensures production engineering and version control are in place, the organization can begin to automate parts of the privacy processes further down the pipeline to make those processes more reliable. For labor-intensive privacy requirements such as data access or deletion, automation helps reduce manual errors, demonstrates a clear process, and provides an audit trail for consumers and regulatory assurance.
The data privacy and protection requirements also add extra concerns for data backups. For example, given that data should be held for no longer than necessary, it needs to be removed from backups as well as the original database itself. Backups also need to be protected and managed in a documented, compliant manner. Understanding, refining, and improving the way backups are maintained at an organization in between legislative sessions can pay off for privacy and protection mandates that are applied to the data used by an organization.
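One technique sometimes used to reconcile deletion requirements with backups is crypto-erasure: encrypt each user's data with a per-user key kept outside the backups, so destroying the key renders every backed-up copy unreadable without rewriting the backups themselves. A minimal sketch, assuming the third-party cryptography package, follows:

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Crypto-erasure sketch: per-user keys live outside the backups.
# Destroying a key makes every copy of that user's data -- including
# copies sitting in old backups -- unreadable.
user_keys = {"u123": Fernet.generate_key()}

def encrypt_for(user_id: str, data: bytes) -> bytes:
    return Fernet(user_keys[user_id]).encrypt(data)

backup_blob = encrypt_for("u123", b"order history ...")
del user_keys["u123"]  # "deletion": the backup blob is now undecryptable
```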
The proposal of protection goals for privacy engineering distinguishes between three protection goals long known within the IT security domain and three protection goals specific to privacy and data protection. In this context, it is important to note that the terms privacy and data protection are not synonymous. Amid the variety of definitions, cultural concepts, and translations, the distinction is often blurred, and not all sources distinguish between the two terms. However, the EU Charter of Fundamental Rights distinguishes between privacy and data protection.
This reflects a difference in perspective: privacy takes the perspective of an individual who tries to fend off undue control by others, while data protection takes the organizational perspective, namely the social context of information processing, in which self-determination and privacy are only possible if organizations are prevented from using or misusing their power advantage over people. In a real-world sense, this means that data protection works to secure real choice in markets, a functioning separation of powers in a constitutional state, democratic decision making, and free discourse. Data protection laws usually address this bigger objective by regulating the use of personal data in order to indirectly strengthen the fundamental rights of the individual in society.
The discipline of privacy engineering develops techniques and methods for both aspects: on one hand, these techniques can be used for the domestication of organizations that deal with personal data, and on the other hand, they provide immediate, effective protection of the personal data of those concerned. Information security is also important in supporting privacy engineering, but here the often predominant focus on the interests of the organization has to be shifted toward the rights of the individual. This is the intent of the concept of multilateral security, which aims to empower users and stresses that imposing disadvantageous compromises on users must be prevented. In this way, it is important for privacy engineers to meet the protection goals of both security and privacy and data protection.
A set of three security protection goals, confidentiality, integrity, and availability, has been developed through the traditional consideration of information security in IT systems. Known as the CIA triad, these three aspects are considered to be of critical importance in evaluating an IT system's security conditions.
Therein, confidentiality addresses the need for secrecy, or the non-disclosure of certain information to certain entities within the IT system. Integrity expresses the need for reliability and non-repudiation regarding a given piece of information, or the need for processing unmodified, authentic, and correct data. As an important subset of such data, identity-related information needs to be authentic to perform access control operations. Availability represents the need for data to be accessible, comprehensible, and processable. Where confidentiality addresses non-disclosure to unauthorized entities, availability requires explicit and full disclosure to authorized entities, and integrity measures work to make the distinction between authorized and unauthorized entities.
Each of these protection goals assumes an underlying IT system that is capable of supporting or limiting the particular protection goal, depending on the technical details of implementation. What is left unconsidered in the triad approach above is the real world, or the organizational and societal dimensions of the IT system and its possible impact on the privacy of individuals. This means that in order to evaluate an IT system's impact on privacy and data protection, the triad of security protection goals is complemented with three further privacy and data protection goals: unlinkability, transparency, and intervenability.
The protection goal of unlinkability refers to the property that privacy-relevant data cannot be linked across domains that are constituted by a common purpose and context. This implies that processes have to be operated in such a way that the privacy-relevant data are not linkable to privacy-relevant information outside of the domain. Unlinkability is related to the requirements of necessity and data minimization as well as purpose determination, purpose separation, and purpose binding.
The most effective method for unlinkability is data avoidance. Other methods for achieving or supporting unlinkability are data reduction, generalization, data hiding, separation, and isolation. The unlinkability protection goal needs to be considered early in the engineering phases, because design decisions may otherwise prevent proper realization of this goal.
This definition is broader than in most terminology papers or in the Common Criteria standard. Those publications tend to regard unlinkability as a specific property or goal of data minimization, alongside similarly framed concepts such as anonymity or unobservability. The broader definition is capable of encompassing even societal concepts such as division of power.
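Among the methods listed above, generalization can be sketched as coarsening quasi-identifiers so records are harder to link across domains; the bucket sizes and field names below are arbitrary choices for illustration:

```python
# Generalization sketch: coarsen quasi-identifiers so records from one
# domain cannot easily be linked to records elsewhere.
def generalize(record: dict) -> dict:
    out = dict(record)
    decade = (record["age"] // 10) * 10
    out["age"] = f"{decade}-{decade + 9}"     # exact age -> ten-year band
    out["zip"] = record["zip"][:3] + "**"     # keep only the region prefix
    return out

print(generalize({"age": 34, "zip": "90210", "diagnosis": "flu"}))
# {'age': '30-39', 'zip': '902**', 'diagnosis': 'flu'}
```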
The goal of transparency is defined as the property that all privacy-relevant data processing, including the legal, technical, and organizational setting, can be understood and reconstructed at any time. The information has to be available before, during, and after the processing takes place. Thus, transparency has to cover not only the actual processing, but also the planned processing and the time after the processing has taken place to know what happened.
The amount and level of information provided, and how it is communicated, has to be adapted to the capabilities of the target audience, whether that is the data-processing entity, the user, an auditor, or a supervisory authority. Transparency is related to the requirement of openness; furthermore, it is a prerequisite for accountability. Standard methods for achieving or supporting transparency comprise logging and reporting, documentation of the data processing, and user notifications.
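A transparency mechanism such as processing logging can be sketched as an append-only log of privacy-relevant operations, with hypothetical field names:

```python
import json
import time

# Append-only processing log: every privacy-relevant operation is
# recorded with its purpose and time, so processing can be
# reconstructed before, during, and after the fact.
AUDIT_LOG: list[str] = []

def log_processing(user_id: str, operation: str, purpose: str) -> None:
    AUDIT_LOG.append(json.dumps({
        "ts": time.time(), "user": user_id,
        "operation": operation, "purpose": purpose,
    }))

log_processing("u123", "read:address", "shipping")
log_processing("u123", "export", "portability-request")
print("\n".join(AUDIT_LOG))
```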
The protection goal of intervenability can be defined as the ability to intervene in any and all ongoing or planned privacy-relevant data processing. In particular, it applies to the individuals whose data is being processed in a system. This is intended to help in the effective enforcement of changes and corrective measures in a system, and often reflects the individual's rights to rectification and erasure of data, the right to withdraw consent, and the right to lodge a claim or to raise a dispute to achieve remedy.
Similar to other protection goals, intervenability is important for stakeholders, data-processing entities, and supervisory authorities in order to effectively influence or even stop data processing. For example, in a cloud application where the personal data of a service's customer has to be erased, the service provider must be able to enforce this erasure in a cloud run by a third party. Methods for achieving or supporting intervenability comprise the implementation of dedicated services for intervention, the definition of break-glass procedures, and means to override automated decisions.
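A minimal sketch of an intervention mechanism, assuming a shared halt list that every processing job checks, might look like this:

```python
# Intervention sketch: a shared flag set checked by every processing
# job, so a withdrawal of consent or a supervisory order can stop
# processing for one data subject across the whole pipeline.
HALTED: set[str] = set()

def halt_processing(user_id: str) -> None:
    HALTED.add(user_id)   # e.g. consent withdrawn or erasure requested

def process_record(user_id: str, record: dict) -> None:
    if user_id in HALTED:
        return            # the intervention wins over the normal pipeline
    ...                   # normal processing would go here

halt_processing("u123")
process_record("u123", {"event": "page_view"})  # silently skipped
```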
With the full set of protection goals, it may not be possible to satisfy each goal 100 percent simultaneously: if a system provides confidentiality, access to certain data will be restricted for certain entities, limiting its availability. Integrity conflicts with intervenability, as the former disallows subsequent changes to data and processes while the latter requires exactly that ability for subsequent modification. Transparency and unlinkability also conflict in nature, as the former intends to increase understanding of the data processing, for example by logging users' and administrators' actions, while the latter works to avoid such knowledge, as it could be misused for unintended linkage.
Each conflict can typically be mitigated depending on the considerations of a specific IT system, but in the general model of the privacy and data protection goals, these pairs of mutually affecting goals are represented as opponents. Beyond the three explicit conflicts, other interrelations among the six goals exist. For example, in order to exercise the protection goal of intervenability, a basic understanding of an IT system's functionality is required, which necessitates transparency into the system. Similarly, if information is not accessible, it also cannot be linked in unintended ways, pairing confidentiality and unlinkability.
The six protection goals and the induced requirement of harmonization among them can be found in several stages of legal implementation. For instance, the protection goal of transparency is directly implemented in the European General Data Protection Regulation, whereas the other protection goals can be derived from the regulation's articles. The triad of CIA protection goals is addressed by the demands for security, the protection goal of unlinkability is covered by purpose limitation and data minimization, and intervenability comprises data subject rights, data portability, and other control features such as consent.
The full system of protection goals has been implemented in the German federal state of Schleswig-Holstein as part of the state's data protection act, and an effort toward ISO standardization of this methodology is underway. Since the protection goals can be derived from the law on the one hand and are familiar to security engineers, at least in concept, on the other, they facilitate bridging the legal and technical communities, foster mutual understanding and cross-discipline collaboration, and provide a basis for successful privacy engineering.
To develop privacy engineering in a company's products or services, more than the related products and services need to be designed with privacy in mind; it often requires all internal company systems as well. Similar to security vulnerabilities, the ideal state is for privacy issues to be discovered through automation and orchestration, which would in turn allow those issues to be handled at scale. This requires data to be discovered automatically, classified, and linked to its owners. Data retention limits need to be self-enforced.
Relatedly, the organization's Data Subject Access Requests (DSARs) should route to the appropriate people for approval, with identity verification baked in, and should resolve automatically after approval. While this seems an unlikely capability for many organizations to achieve for something that has not been, and will not be, a core function, this is the gap that the industry of regulatory technology vendors is seeking to fill.
Cloud providers are developing privacy functionality in their products. Data stores like Amazon's DynamoDB support setting a time to live on data records so that they can automatically expire from tables. Azure's Data Catalog automatically aggregates data assets, and its advanced threat protection automatically discovers and labels data by sensitivity.
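For example, enabling DynamoDB's time-to-live feature from Python with boto3 might look like the following sketch, which assumes AWS credentials and an existing table named sessions with an expires_at attribute:

```python
import time

import boto3  # assumes AWS credentials are configured

dynamodb = boto3.client("dynamodb")

# Tell DynamoDB which attribute holds the expiry time (epoch seconds).
dynamodb.update_time_to_live(
    TableName="sessions",  # hypothetical table name
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
)

# Records written with expires_at are deleted automatically after expiry,
# enforcing a retention limit without a custom cleanup job.
dynamodb.put_item(
    TableName="sessions",
    Item={
        "session_id": {"S": "abc123"},
        "user_id": {"S": "u123"},
        "expires_at": {"N": str(int(time.time()) + 30 * 24 * 3600)},  # ~30 days
    },
)
```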
Privacy engineering vendors offer data discovery functions: products that help an organization find existing and incoming data, a necessary function for an organization to understand what data is being brought in, what data has to be kept private, and what existing data needs to continue to be secured or be discarded effectively.
Related to data discovery, data mapping offers an organization an understanding of where data is being held, where it is being transported to, and how it moves through a system. This can help a company stay compliant with privacy regulations such as the GDPR, which often require a company to transfer data under conditions that demand the data be treated the same at the destination as at the origin.
Once data has been discovered and mapped, other vendors offer data anonymization services. This can be done either after data has been ingested or during the ingestion process, depending on the needs of the company or the requirements of applicable regulations. Data anonymization services can remove all personally identifiable information (PII) and keep only the pieces of data necessary to keep a company compliant or to ease the burden of compliance.
In the context of privacy engineering, network monitoring can track not only the sources of data and where data is coming in, but also who has access to the data and who is accessing it at any given point. This can be especially important for companies offering services where outside users can access data, helping ensure those users cannot reach sensitive data and can only access the data the company intends them to access.