Can sensitive data stop discrimination in AI-based systems?

In Computer Law & Security Review 48 (2022), M. van Bekkum and F. Z. Borgesius offer an interesting discussion of whether using sensitive data to prevent discrimination by AI-based systems should be an exception to the GDPR.


The authors examine Article 10(5) of the proposed AI Act as an exception to the GDPR's prohibition on using special category data, so that such data could be used to mitigate discrimination, both direct and indirect. They present arguments both for and against that suggestion.

First, the supporting arguments:

AI system developers can use special category data to audit their AI systems. The authors use the binary test as their example;
Auditing AI systems may earn the trust of consumers.
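
To make the audit argument concrete, here is a minimal sketch of what such a check could look like. It assumes the system produces a binary (yes or no) decision and that the auditor holds one special category attribute per person. The function names, the sample data and the choice of comparing selection rates are my own assumptions for illustration, not the method described in the paper.

```python
from collections import defaultdict


def selection_rates(decisions, groups):
    """Return the share of positive decisions (1) for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        positives[group] += decision
    return {g: positives[g] / totals[g] for g in totals}


def selection_rate_ratio(rates):
    """Lowest selection rate divided by the highest (1.0 means equal rates)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was selected in any group, so rates are equal
    return min(rates.values()) / highest


# Hypothetical audit data: 1 = positive decision, 0 = negative decision.
# "a" and "b" stand in for values of a special category attribute.
decisions = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]

rates = selection_rates(decisions, groups)
ratio = selection_rate_ratio(rates)
print(rates, ratio)
```

With this made-up data, group a receives a positive decision 80% of the time and group b 20% of the time, so the ratio is 0.25. The check cannot be run at all without the group labels, which is the core of the supporting argument. A low ratio is only a signal to investigate, not proof of discrimination in the legal sense.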

Second, the opposing arguments:

Retaining special categories of data can be considered an interference with privacy;
The data can be abused or exposed in a data breach;
The organisation concerned can misuse the exception to collect special category data for purposes other than the AI audit;
Special category data does not guarantee that an AI audit will succeed in debiasing the whole system.

Personally, I agree with the authors on both the supporting and the opposing points. Furthermore, given the nature of non-synthetic data, some bias, or even discrimination as defined by law, seems unavoidable in it. Sensitive data may help detect discrimination in AI-based systems, but it is not certain to be the solution:

The data used to generate, train and deploy the systems may itself contain discrimination. Finding where the discrimination arises may therefore require exposing special category data in all three of those datasets.
The data may also exhibit paradoxes (see Simpson’s Paradox, where a trend seen within each subgroup reverses once the subgroups are combined).
More than one measure should be applied to detect and fix the issue in question.
More importantly, bias and discrimination are different, and the criteria that define discrimination are set by law. Any use of special category data must point to the specific discrimination at issue, in addition to a solid lawful interest in using such data.
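
The Simpson's Paradox point in the list above can be illustrated with a small sketch. The departments, groups and counts below are invented purely for illustration. They are chosen so that group y has the higher acceptance rate inside every department, while group x has the higher rate once the departments are merged.

```python
# Hypothetical counts: (department, group) -> (applicants, accepted)
records = {
    ("A", "x"): (80, 48),
    ("A", "y"): (20, 13),
    ("B", "x"): (20, 2),
    ("B", "y"): (80, 12),
}


def acceptance_rate(pairs):
    """Accepted divided by applicants, summed over (applicants, accepted) pairs."""
    applicants = sum(n for n, _ in pairs)
    accepted = sum(k for _, k in pairs)
    return accepted / applicants


group_names = ("x", "y")
department_names = ("A", "B")

# Acceptance rate per group within each department.
per_department = {
    d: {g: acceptance_rate([records[(d, g)]]) for g in group_names}
    for d in department_names
}

# Acceptance rate per group with both departments pooled together.
overall = {
    g: acceptance_rate([records[(d, g)] for d in department_names])
    for g in group_names
}

print(per_department)
print(overall)
```

An audit that only looks at the pooled figures would conclude that group y is disadvantaged, while a per-department audit would conclude the opposite. This is one reason why a single measure on a single dataset is unlikely to settle whether a system discriminates.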

For the full text: https://www.sciencedirect.com/science/article/pii/S0267364922001133

For more news about AI on AstraIA Gear: https://www.astraiagear.com/category/ag-project/