
Chapter 5.2 (Initiated): Risks associated with the use of large language models

Background

The Ministry of Health and Care Services (HOD) has tasked the Norwegian Directorate of Health with the following: “as part of further work, the government agencies shall also assess the risks that large language models introduce.” [62]

The description of this measure is based on the recommendation concerning access to language models adapted to Norwegian conditions (5.1) in the knowledge base on the use of large language models [63].

What is the problem?

The use of large language models, especially generative language models, yields output of variable quality that is difficult to test and assess, and the technology is evolving at an extremely fast pace. This introduces risks not only to patient safety, such as misdiagnosis or incorrect treatment, and to privacy, but also in administrative and other non-clinical applications. The risks are both technical and psychological/human, such as automation bias, as well as the risk that health professionals will take longer to perform their duties. There is also a risk that the services will miss opportunities to improve patient safety, diagnostics and treatment planning if they lack the courage to adopt large language models, or wait too long to do so.

There is a lack of expertise and knowledge about the risks introduced by the various applications. General methods are available for assessing risk, but these are rarely applied to language models in the health sector. There is also no overview of the types of risks that are associated with the use of language models.

As the development of language models accelerates, there is an increasing risk of such systems being introduced without adequate quality assurance. It is therefore important to give this work high priority by introducing a framework for the safe use of language models in the health and care services.

What do we want to achieve?

There is a need to identify and/or (further) develop methods and frameworks for performing systematic risk analyses of different types of large language models across different applications within healthcare. These should build on standards and general templates for risk analysis, while remaining dynamic in the face of technological advances. A differentiated approach using several different methods and frameworks may be necessary.

The risk assessment framework should support risk analysis of the use of language models in the various areas. The methods used should be robust, balance risk against benefit, and include risk-mitigating measures that maximise the benefits realised. It may be useful to include success stories in the guidance to help identify the factors that contribute to positive outcomes. To obtain balanced assessments, multidisciplinary input should be encouraged.
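
As an illustration only, the minimal sketch below (in Python) shows how a single entry in such a risk register might be scored by combining likelihood and impact on simple ordinal scales, with mitigating measures recorded alongside. All field names, scales and the acceptance threshold are assumptions made for the example, not part of any established method or standard.

    from dataclasses import dataclass, field

    # Illustrative sketch only: field names, scales and the acceptance
    # threshold are assumptions, not an established standard.
    @dataclass
    class RiskEntry:
        application: str           # e.g. "drafting discharge summaries"
        hazard: str                # e.g. "model omits a prescribed medication"
        likelihood: int            # ordinal scale, 1 (rare) to 5 (frequent)
        impact: int                # ordinal scale, 1 (negligible) to 5 (severe)
        mitigations: list[str] = field(default_factory=list)

        @property
        def score(self) -> int:
            """Likelihood x impact product, as in many simple risk matrices."""
            return self.likelihood * self.impact

        def acceptable(self, threshold: int = 8) -> bool:
            """Whether the risk falls under an (assumed) acceptance threshold."""
            return self.score <= threshold

    entry = RiskEntry(
        application="drafting discharge summaries",
        hazard="model omits a prescribed medication",
        likelihood=3,
        impact=4,
        mitigations=["mandatory clinician review before sign-off"],
    )
    print(entry.score, entry.acceptable())  # 12 False: needs further mitigation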

What is happening within the area?

There are currently no national initiatives for systematic work on the risks associated with the use of large language models in the health and care services.

Several sets of regulations, such as health legislation, the AI Act (adopted but not yet in force), the GDPR and the medical devices regulations, provide guidance on how risk and quality assessments should or must be conducted.

How to make it happen

To gain a deeper understanding of the risks that large language models would introduce if they were to be used in health and care services, a report must be prepared that addresses the following topics at an overarching level:

  • Identifies relevant types and degrees of risk relating to the use of language models, based on international experience and research, including privacy and information security.
  • Identifies risk-mitigating and quality-enhancing measures. This also entails defining an accepted level of error/bias etc. for public services.
    • For example, consider establishing a step-by-step approach to the risks associated with the use of language models in different applications, starting with the lowest-risk step, such as administrative tasks (a sketch of one possible tiering follows this list).
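
By way of illustration, the sketch below (in Python) assigns hypothetical language-model applications to ordered risk tiers, with administrative tasks at the lowest tier, and gates piloting on the highest tier an organisation has approved so far. The tier names, example applications and gating rule are assumptions for the example only; a real framework would define criteria for placing an application in a tier rather than a fixed list.

    from enum import IntEnum

    class RiskTier(IntEnum):
        # Hypothetical tiers, ordered from lowest to highest risk.
        ADMINISTRATIVE = 1    # e.g. transcribing meeting notes
        DOCUMENTATION = 2     # e.g. drafting text reviewed by a clinician
        DECISION_SUPPORT = 3  # e.g. suggesting differential diagnoses
        AUTONOMOUS = 4        # e.g. acting without human review

    # Assumed mapping for the example; a real framework would define
    # criteria for placing an application in a tier, not a fixed list.
    TIER_OF_APPLICATION = {
        "transcribing meeting notes": RiskTier.ADMINISTRATIVE,
        "drafting patient letters": RiskTier.DOCUMENTATION,
        "summarising lab results for clinicians": RiskTier.DECISION_SUPPORT,
    }

    def may_pilot(application: str, approved_up_to: RiskTier) -> bool:
        """Step-by-step rollout: only pilot applications at or below the
        highest tier the organisation has approved so far."""
        return TIER_OF_APPLICATION[application] <= approved_up_to

    print(may_pilot("transcribing meeting notes", RiskTier.ADMINISTRATIVE))  # True
    print(may_pilot("drafting patient letters", RiskTier.ADMINISTRATIVE))    # False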

In addition, the sector should build expertise relating to risk analyses by testing, adapting and applying the method(s) to different applications and using different types of language models.

Who is responsible?

The Norwegian Directorate of Health

Who collaborates?

The specialist and R&D communities within the fields of AI and language technology (including the National Library of Norway), central administrative bodies (including the Norwegian Digitalisation Agency and the Norwegian Data Protection Authority), the health and care services and the business community provide input to the work.

Last updated: 18 February 2025