A new and interesting approach to auditing LLMs


On February 17, 2023, J. Mökander, J. Schuett, H. R. Kirk and L. Floridi published their article "Auditing Large Language Models (LLMs): a three-layered approach" on arxiv.org. I found it a very interesting read: it brings a practical lens to AI auditing through its approach to LLMs.


Ethical and social risks associated with LLMs

The research addressed ethical challenges such as discrimination, information hazards, misinformation harms, malicious uses, human-computer interaction harms, and automation and environmental harms. In addition, the authors pointed out the methodological and normative challenges in managing LLMs. An audit based on multiple normative dimensions helps system providers identify risks and prevent potential harm, detect vulnerabilities or errors, and encourage procedural transparency and regularity. The article underlined that auditing is not a perfect tool that can, on its own, detect and eliminate all LLM-related ethical or regulatory risks; it has real limitations. Rather, we should see it as a useful complement to other governance methods.

Assessment of existing auditing procedures

  1. AI auditing procedures that focus only on compliance are unlikely to provide adequate guarantees; 
  2. External audit requirements help ensure the ethical, legal and technical robustness of LLMs; 
  3. Auditing procedures must include governance and technology audits to assess and mitigate LLM-related risks; 
  4. The methodological design of technology audits will have to be modified to identify and assess LLM-related risks; 
  5. Model audits are crucial for identifying and communicating LLMs' limitations, informing system redesign and mitigating downstream harm; and 
  6. LLM auditing procedures must include continuous ex-post auditing to be more effective. 

Accordingly, the authors identified four properties of LLMs that weaken existing AI auditing procedures: generativity, the emergence of abilities, lack of grounding, and lack of access.

The article proposed a three-layered approach of governance, model and application audits:

  1. Governance audits assess the technology provider's organisational procedures, accountability structures and quality management systems. Using process-oriented methods, governance audits focus on three tasks:
    1. Reviewing the adequacy of organisational governance structures; 
    2. Creating an audit trail of the LLM development process; and 
    3. Mapping roles and responsibilities within organisations designing LLMs; 
  2. Model audits assess the capabilities and limitations of LLMs with performance-oriented methods, between the initial training phase and the subsequent phases of adaptation and deployment in specific applications. These audits may appraise, for example, the following four characteristics (a minimal sketch of such a check follows this list):
    1. Performance, i.e., how well the LLM functions on various tasks; 
    2. Robustness, i.e., how well the model reacts to unexpected prompts or edge cases; 
    3. Information security, i.e., how difficult it is to extract training data from the LLM; and 
    4. Truthfulness, i.e., to what extent the LLM can distinguish between the real world and possible worlds. 
  3. Application audits assess the legal compliance and ethical soundness of the intended functions, and the impact over time, of LLMs' downstream applications, using impact-oriented methods. They contain two components: (1) functionality audits, covering legal and ethical compliance and intended uses, and (2) impact audits, covering effects on different user groups and environments.
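
The paper stays at the level of auditing principles, but to make the robustness item more concrete: a performance-oriented check might re-run the same prompts with small perturbations and measure how often the model's answer changes. The sketch below is my own illustration, not taken from the paper; `query_model` is a hypothetical placeholder for whatever API wraps the LLM under audit.

```python
import random
import string

def query_model(prompt: str) -> str:
    # Hypothetical wrapper around the LLM under audit; in a real audit
    # this would call the provider's API (an assumption, not from the paper).
    raise NotImplementedError("plug in the LLM under audit here")

def perturb(prompt: str, typo_rate: float = 0.05) -> str:
    """Inject random character-level typos to simulate noisy user input."""
    chars = list(prompt)
    for i, c in enumerate(chars):
        if c.isalpha() and random.random() < typo_rate:
            chars[i] = random.choice(string.ascii_lowercase)
    return "".join(chars)

def robustness_check(prompts: list[str], n_variants: int = 5) -> dict[str, float]:
    """For each prompt, measure how often perturbed variants keep the baseline answer."""
    scores = {}
    for prompt in prompts:
        baseline = query_model(prompt)
        stable = sum(
            query_model(perturb(prompt)) == baseline for _ in range(n_variants)
        )
        scores[prompt] = stable / n_variants  # 1.0 means fully stable
    return scores
```

Exact string comparison is of course crude for generative output; a real audit would substitute a semantic-similarity metric, but the overall shape of such a check stays the same.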

Limitations of the proposed auditing procedures

The authors also highlighted three limitations: 

  1. Lack of methods and metrics to operationalise normative concepts; 
  2. Lack of an institutional ecosystem; and 
  3. In practice, not all risks from LLMs can be addressed at the technology level. 

At present, internal and external audits are not yet compulsory for all tech providers, and they may be expensive for small and medium-sized providers. As part of the AG project, we are doing our best to provide you with suggestions and a cost-friendly approach to make your AI systems more accountable and responsible.

For more about the AG project:

https://www.astraiagear.com/category/ag-project/ and

https://www.astraiagear.com/about-ag/

For more short news, connect with us on LinkedIn

To have further discussion with me