Impact assessment shows privacy risks in Microsoft Office ProPlus Enterprise
On behalf of the Dutch Ministry of Security and Justice, Privacy Company carried out a data protection impact assessment (DPIA) on Microsoft Office ProPlus (Office 2016 MSI and Office 365 CTR). With the permission of the Ministry, we are publishing this blog about our findings. For questions about the research, you can contact SLM Rijk (Strategic Vendor Management for Microsoft within the Ministry of Justice), accessible via the Press Office of the Ministry of Justice, +31 (0)70 370 73 45.
SLM Rijk conducts negotiations with Microsoft for approximately 300,000 digital workstations of the national government. The Enterprise version of the Office software is deployed by different governmental organisations, such as ministries, the judiciary, the police and the tax authority.
The results of this Data Protection Impact Assessment (DPIA) are alarming. Microsoft collects and stores personal data about the behaviour of individual employees on a large scale, without any public documentation. The DPIA report (in English) as published by the Ministry is available here.
Starting today, and with the help of Microsoft, SLM Rijk offers zero-exhaust settings to admins of government organisations. During the writing of this DPIA, Microsoft committed to taking a number of other important measures to lower the data protection risks.
Office 2016 and Office 365
Most government organisations in the Netherlands use versions of Office 2016 and Office 365 (or even older versions) that are installed on the computers of the government employees. The organisations store the content data locally, in their own data centres (on-premises). But this will change. SLM Rijk is conducting a pilot with data storage in the Microsoft cloud, in SharePoint and in OneDrive. There is also a test with the web-only version of Office 365, where the software is no longer installed on the end-user devices. Even in the current set-ups, however, Microsoft collects data about the individual use of the software.
Large scale and covert collection of personal data
Microsoft systematically collects data on a large scale about the individual use of Word, Excel, PowerPoint and Outlook, and does so covertly, without informing people. Microsoft does not offer any choice with regard to the amount of data, any possibility to switch off the collection, or any way to see what data are collected, because the data stream is encoded. As it does in Windows 10, Microsoft has included separate software in Office that regularly sends telemetry data to its own servers in the United States. For example, Microsoft collects information about events in Word, such as when you use the backspace key a number of times in a row, which probably means you do not know the correct spelling. It also collects the sentence before and after a word that you look up in the online spelling checker or translation service. Microsoft not only collects usage data via the built-in telemetry client, but also records and stores the individual use of Connected Services. For example, if users access a Connected Service such as the translation service through the Office software, Microsoft can store the personal data about this usage in so-called system-generated event logs.
Difference between content, diagnostic, and functional data
Microsoft provides services over the Internet. From a technical perspective, it is inevitable that you have to provide data to Microsoft, such as the header of your e-mail and your IP address in order to be able to use the services. But Microsoft should not store these transient, functional data, unless the retention is strictly necessary, for example, for security purposes.
In this DPIA report (Data Protection Impact Assessment report), the data which Microsoft collects via Office ProPlus are divided into three categories:
- Content data: the content of files and communication that you store in your own data centre or on cloud computers of Microsoft
- Functional data: the data you have to transmit over the Internet to be able to connect to Microsoft’s internet services
- Diagnostic data: the data that Microsoft stores for analysis of the usage of the services
In the report, Privacy Company uses these three categories of data in analogy with the division of communications data in ePrivacy law in Europe. This legislation distinguishes between (i) content, (ii) traffic/location data that are generated as a result of using the communication services, and (iii) data that are strictly necessary to transmit the communication, but have to be erased or anonymised immediately afterwards.
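For readers who prefer a concrete model, the three categories and the principle that transient functional data should not be stored unless strictly necessary can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not something taken from the DPIA report or from Microsoft: the names `DataCategory`, `DataItem`, `strictly_necessary` and `must_erase_after_transmission` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class DataCategory(Enum):
    CONTENT = "content"        # files and communication stored by the customer
    FUNCTIONAL = "functional"  # transient data needed to connect to the online services
    DIAGNOSTIC = "diagnostic"  # data stored for analysis of the usage of the services


@dataclass(frozen=True)
class DataItem:
    description: str
    category: DataCategory
    # Hypothetical flag: retention is strictly necessary, e.g. for security purposes.
    strictly_necessary: bool = False


def must_erase_after_transmission(item: DataItem) -> bool:
    """Return True for functional data that has no strict necessity for retention.

    The rule only covers functional data. Whether content or diagnostic
    data may be stored is a separate legal question that this sketch
    does not try to answer, so those categories always yield False.
    """
    if item.category is DataCategory.FUNCTIONAL:
        return not item.strictly_necessary
    return False


ip_address = DataItem("IP address of the user", DataCategory.FUNCTIONAL)
security_log = DataItem("e-mail header kept for security", DataCategory.FUNCTIONAL,
                        strictly_necessary=True)
telemetry = DataItem("backspace event in Word", DataCategory.DIAGNOSTIC)

for item in (ip_address, security_log, telemetry):
    print(item.description, "->", must_erase_after_transmission(item))
```

The point of the sketch is the asymmetry: functional data is erased by default and retained only as an exception, which mirrors the third category in ePrivacy law.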
Microsoft emphasises that the company does not use these categories. Microsoft uses, among others, the categories of ‘Customer Data’ and ‘Personal Data’. Microsoft only uses the term Diagnostic Data for the specific telemetry data collected via the inbuilt software client in the locally installed Office software.
23,000 to 25,000 types of events
Microsoft does not (yet) offer a possibility to inspect the contents of the diagnostic data flow. Microsoft has explained that 23,000 to 25,000 types of events are sent to Microsoft’s servers, and that 20 to 30 engineering teams work with these data. The engineers can dynamically add new events to the data stream from all computers with Office ProPlus. This collection of data is much more detailed than in Windows 10 telemetry. With telemetry set to ‘full’, Windows 10 involves 1,000 to 1,200 types of events and 10 teams of engineers. The Dutch Data Protection Authority (DPA) conducted an investigation in 2017 of the processing of telemetry data in the consumer and small business versions of Windows 10 (Home and Pro).
The Dutch DPA concluded that Microsoft violated data protection law on many counts, amongst others through the lack of transparency and purpose limitation, and the lack of a legal ground for the processing.
In response to that investigation, Microsoft made some adjustments in the spring 2018 release of the software. The Dutch DPA concluded (prior to the actual release of the software, press release in Dutch only) that the improvement plan presented by Microsoft would end all violations. The Dutch DPA did not investigate data processing via the Office software.
Microsoft as a (joint) controller and not as a data processor
Microsoft determines the purposes of the processing of the diagnostic data in the Office software, and the retention period of the data (30 days up to 18 months, or even longer if deemed necessary by Microsoft). The DPIA report shows that Microsoft processes the diagnostic data for 7 purposes, and for all other purposes Microsoft deems to be compatible with those purposes. Because Microsoft determines the purposes and the means (including the retention period), Microsoft acts as a controller, and not as a data processor.
The 7 purposes are:
- Security (identifying and mitigating security threats and risks as quickly as possible through updates to Office ProPlus Applications and remediation of connected services)
- Up to Date (delivering and installing the latest updates to the Office ProPlus Applications without disruption to the experience)
- Performing Properly (identifying and mitigating anomalies, “bugs,” and other product issues as quickly as possible through updates to the Office ProPlus Applications and remediation of connected services)
- Product development (learning to add new features)
- Product innovation (business intelligence, develop new services)
- General inferences based on long-term analysis, support machine learning
- Showing targeted recommendations on screen to the user
- Purposes Microsoft deems compatible with any of these 7 purposes.
The Office ProPlus software includes the use of a number of online services. But Microsoft also offers so-called ‘discretionary’ (voluntary) Connected Services, such as the online spelling checker and the translation service. Microsoft only considers itself to be a data controller when people use these discretionary Connected Services. In that case, Microsoft processes the personal data about the use of these services for all 12 purposes listed in its general privacy statement.
High data protection risks for data subjects
The DPIA report provides an extensive description of 8 high data protection risks for data subjects. The government organisations that use Office should, however, determine themselves what the specific risks are, based on the specific personal data they process. This DPIA report is meant to assist, not to replace this assessment.
During the writing of this DPIA report, Microsoft already made commitments to SLM Rijk to make important adjustments to lower the risks. Microsoft has developed zero-exhaust settings. Microsoft also intends to provide adequate information, include a data viewer tool for the telemetry data from Office and provide an option for administrators to determine the desired level of telemetry. Additionally, SLM Rijk and Microsoft will jointly work on the correct qualification of Microsoft as a (joint) controller or data processor.
Some residual risks can be mitigated if government organisations use the newly developed settings to minimise the processing of telemetry data. There are 6 remaining high risks for data subjects:
- The unlawful storage of sensitive/classified/special categories of data, both in metadata and in, for example, subject lines of e-mails.
- The incorrect qualification of Microsoft as a data processor, instead of as joint controller as defined in article 26 of the GDPR.
- Insufficient control over sub-processors and factual data processing.
- The lack of purpose limitation, both for the processing of historically collected diagnostic data and the possibility to dynamically add new types of events.
- The transfer of (all kinds of) diagnostic data outside of the EEA, while the current legal ground for Office ProPlus is the Privacy Shield and the validity of this agreement is the subject of proceedings before the European Court of Justice.
- The indefinite retention period of diagnostic data and the lack of a tool to delete historical diagnostic data.
What can the admins do now to lower the risks?
Admins of the Enterprise version of Office ProPlus can already take a number of specific measures to lower the privacy risks for employees and other people in the Netherlands.
- Apply the new zero-exhaust settings
- Centrally prohibit the use of Connected Services
- Centrally prohibit the option for users to send personal data to Microsoft to ‘improve Office’
- Do not use SharePoint Online / OneDrive
- Do not use the web-only version of Office 365
- Periodically delete the Active Directory account of some VIP users, and create new accounts for them, to ensure that Microsoft deletes the historical diagnostic data
- Consider using a stand-alone deployment without Microsoft account for confidential/sensitive data
- Consider conducting a pilot with alternative software, after having conducted a DPIA on that specific processing. This could be a pilot with alternative open source productivity software. This would be in line with the Dutch government policy to promote open standards and open source software.
These measures are not realistic or feasible in all cases. It is not possible for the (Enterprise) customers of Office to solve all problems. With regard to the contracts and the transfer of personal data to the USA, a European solution must be sought.