How Getvisibility can help your company comply with GDPR
Derek Coetzee

Getvisibility Focus

The Getvisibility Focus platform specialises in discovering sensitive information held in difficult-to-track unstructured text documents across company networks. Knowing where this information lives is a valuable prerequisite for any GDPR activity. This article touches on the main GDPR principles and articles and shows how Getvisibility can help you address them.

To be clear, the possibly unsorted, unknown and unlisted text documents that can proliferate in organisations are different from structured data: database information, Enterprise Resource Planning (ERP) platforms such as SAP, document management platforms such as Atlassian and cloud services such as Salesforce. These platforms expose Application Programming Interfaces (APIs) that let external software interact with them and that respond with well-defined, structured data. The vendors of these platforms are able to address GDPR constraints with new releases in a reasonably controlled way.

However, we believe that companies looking to comply with GDPR will also need to focus on the neglected repositories of data: the large numbers of text documents scattered across their networks. These text documents may be Microsoft Office documents, PDFs, log and text files, presentations, spreadsheets and many other types.

The Data Minimisation and Storage Limitation principles

Information about a data subject should be kept to the minimum needed to provide the intended service to the customer or to manage the employee. This is described in GDPR Article 5, point 1(c). Companies need to know where their customer and employee data is and be able to examine the documents to determine what fields are present. An automated tool that lists the files containing customer or employee data captured in unstructured text (with a useful PII flag attached) is very powerful. In addition, we have the capability to find (and optionally extract) a list of Named Entities that represent the “fields” in the text document. This information helps you see whether you hold too much detail, or have sensitive information in files that are not being tracked. Article 9 prohibits processing of special categories of personal data unless one of the conditions in Article 9(2) applies, so finding such information in documents allows you either to remove it or to rely on one of those conditions, such as obtaining explicit consent.
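
As a rough illustration of what a PII flag and a list of “fields” might look like, here is a minimal sketch in Python. It is not how the Getvisibility Focus platform works internally: the patterns, the function names find_pii_fields and has_pii, and the choice of simple regular expressions are all assumptions made for this example, and a real tool would need far broader coverage than email addresses and phone numbers.

```python
import re

# Hypothetical patterns for illustration only. Real PII detection needs much
# broader coverage (names, addresses, national identifiers and so on).
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def find_pii_fields(text):
    """Return a dict mapping each field type to the matches found in text."""
    found = {}
    for field, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[field] = matches
    return found


def has_pii(text):
    """Flag a document as containing PII if any pattern matches."""
    return bool(find_pii_fields(text))
```

Running find_pii_fields over the extracted text of each document would give both the flag and a first view of which kinds of field a file contains, which is the information needed to judge whether more detail is held than the service requires.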

Privacy by design and default

Organisations holding information about data subjects have responsibilities to ensure the implementation of privacy by design and default in the way they handle their data. This is described in Article 25 and includes pseudonymisation and data minimisation (described above). Being aware of the location of documents containing sensitive information is a vital first step in securing the data. Our software marks each file with a flag indicating the presence of such data and also calculates a risk score based on permissions and user visibility of the data, allowing you to quickly focus on the files that need attention and comply with GDPR.
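
To make the idea of a risk score concrete, the sketch below combines a sensitivity flag with how widely a file is visible. The FileRecord fields, the risk_score function and the scoring rule itself are hypothetical and chosen only for illustration; they are not the formula used by our software.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str
    has_sensitive_data: bool   # the flag set when sensitive data is found
    users_with_access: int     # number of accounts that can read the file
    world_readable: bool       # readable by everyone on the network


def risk_score(record, total_users):
    """Return a score between 0.0 and 1.0; higher means more urgent.

    Assumed rule: files without sensitive data score 0, world-readable
    sensitive files score 1, and everything else scores the fraction of
    users who can read the file.
    """
    if not record.has_sensitive_data:
        return 0.0
    if record.world_readable:
        return 1.0
    exposure = record.users_with_access / max(total_users, 1)
    return round(min(exposure, 1.0), 2)
```

Sorting the records by this score in descending order puts widely visible sensitive files at the top of the list, which is the kind of prioritisation described above.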

Subject access rights

Access to the data retained about people is a major part of GDPR, and Chapter III, Rights of the Data Subject, is dedicated to it. Responding to a data access request means having complete knowledge of the location of the subject's data. It is easier to be sure that all records for a subject are known about in a structured environment (such as a database or Salesforce). It is a far more difficult task for unstructured text documents that may have accumulated over time on laptops or workstations with no central registry. Our Focus release of software builds this complete unstructured document registry and maintains its integrity over time by monitoring events from sources such as Active Directory or CloudTrail. This registry is then the central source of truth for ensuring that responses to subject access requests describe every piece of data held about the subject.

Useful information about the age of a file and indication of duplicates helps to provide further confidence in the completeness of a response as described in Articles 13, 15 (especially 1(b)), 16, 17, 19 and 20.

Security of processing

Described in Article 32, the security of processing is vital when handling sensitive data. While it refers to pseudonymisation again, the article also mentions ensuring the integrity and availability of the data and its restoration in the event of a technical problem. Our software provides our existing customers with useful indexes of files. These allow them to identify duplicates (to reduce wasted backup capacity), to confirm that restores of directories are complete and contain the correct versions of files, and to identify stale data that can be archived (while retaining the information about what was in the file, who had access and how sensitive the data was). They also help to find any directories containing sensitive documents that are not correctly protected.
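
The kind of file index described above can be sketched with the Python standard library. This is a simplified illustration under stated assumptions, not the Getvisibility implementation: the function names are hypothetical, duplicates are defined here as files with identical SHA-256 content hashes, and "stale" is assumed to mean not modified within a given number of days.

```python
import hashlib
import os
import time
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Hash a file's content in chunks so large files are not loaded whole."""
    sha = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


def build_index(root):
    """Walk a directory tree and record path, content hash and modified time."""
    index = []
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            index.append({
                "path": path,
                "sha256": file_digest(path),
                "modified": os.path.getmtime(path),
            })
    return index


def find_duplicates(index):
    """Group paths that share the same content hash."""
    groups = defaultdict(list)
    for entry in index:
        groups[entry["sha256"]].append(entry["path"])
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


def find_stale(index, max_age_days, now=None):
    """Return paths not modified within the last max_age_days days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(e["path"] for e in index if e["modified"] < cutoff)
```

Comparing the hashes in an index taken before a backup with one built after a restore is one simple way to confirm that the restored directory is complete and holds the expected versions of each file.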

In summary, having a comprehensive view of unstructured documents built and maintained by the Getvisibility Focus platform gives companies significantly stronger tools to comply with GDPR and protect their customers’ or employees’ data.

To find out more please use the contact page on this site.

