How machine learning is embedded to support clinician decision making: an analysis of FDA-approved medical devices

Main findings and implications

The way that algorithms are embedded in medical devices shapes how clinicians interact with them, with different profiles of risk and benefit. We demonstrate how the stages of automation framework19 can be applied to determine the stage of clinician decision making assisted by ML devices. Together with our level of autonomy framework, these methods can be applied to examine how ML algorithms are used in clinical practice, which may assist in addressing the dearth of human factors evaluations related to the use of ML devices in clinical practice.17 Such analyses (table 1) permit insight into how ML devices may change clinical workflows and practices, and how these changes may affect healthcare delivery.

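To make this kind of analysis concrete, below is a minimal sketch in Python of how such a classification might be recorded and tabulated. The stage names follow the four stages of human information processing from the stages of automation framework; the device entries and their labels are hypothetical placeholders, not data from this study.

    from collections import Counter
    from dataclasses import dataclass
    from enum import Enum

    class Stage(Enum):
        # Four stages of human information processing supported by automation
        INFORMATION_ACQUISITION = 1
        INFORMATION_ANALYSIS = 2
        DECISION_SELECTION = 3
        ACTION_IMPLEMENTATION = 4

    class Autonomy(Enum):
        ASSISTIVE = "assistive"                            # clinician reviews device output against the same inputs
        AUTONOMOUS_INFORMATION = "autonomous information"  # device supplies information independently of the clinician
        AUTONOMOUS_DECISION = "autonomous decision"        # device decides independently (eg, triage prioritisation)

    @dataclass(frozen=True)
    class MLDevice:
        name: str
        stage: Stage
        autonomy: Autonomy

    # Hypothetical entries; real classifications would be coded from each
    # device's FDA approval documents.
    devices = [
        MLDevice("triage-device-A", Stage.DECISION_SELECTION, Autonomy.AUTONOMOUS_DECISION),
        MLDevice("cad-device-B", Stage.INFORMATION_ANALYSIS, Autonomy.ASSISTIVE),
        MLDevice("screening-device-C", Stage.DECISION_SELECTION, Autonomy.AUTONOMOUS_DECISION),
    ]

    # Tabulate the profile of stage x autonomy combinations
    profile = Counter((d.stage.name, d.autonomy.value) for d in devices)
    for (stage, autonomy), n in sorted(profile.items()):
        print(f"{stage:25} {autonomy:25} n={n}")
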
While FDA approval of ML devices is a recent development, only six approvals in this study were via De Novo classification for new types of medical devices. Most approvals were via the PMN pathway for devices that are substantially equivalent to existing predicate devices. Some predicates could be traced to the ML device De Novos, while others were non-ML devices with similar indications that used different algorithms. As the FDA assesses all medical devices on the same basis, regardless of ML utilisation, it is unsurprising that ML medical devices largely follow in the footsteps of their non-ML forebears. Most were assistive or provided autonomous information, leaving responsibility for clinical decisions with clinicians.

We identified an interesting group of devices, primarily triage devices, which provided autonomous decisions independent of clinicians. These triage devices appeared to perform tasks intended to supplement clinician workflow rather than to automate or replace existing clinician tasks. The expected benefit is prioritising the reading of cases with suspected positive findings for time-sensitive conditions, such as stroke, thereby reducing time to intervention, which may improve prognosis. Unlike PMNs, De Novo classifications report more detail, including identified risks. The De Novo for the triage device ContaCT45 identifies risks associated with false-negatives that could lead to incorrect or delayed patient management, while false-positives may deprioritise other cases.

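As an illustration of this expected benefit, the sketch below (Python; case identifiers and flags are invented for illustration) reorders a reading worklist so that device-flagged cases are read ahead of the default first-in, first-out order. It also makes the De Novo's identified risks visible: a false-negative leaves a case in the slow lane, while a false-positive pushes other cases down the list.

    # Hypothetical reading worklist: (case_id, arrival_order, device_flagged)
    worklist = [
        ("case-001", 1, False),
        ("case-002", 2, False),
        ("case-003", 3, True),   # device flags a suspected time-sensitive finding
        ("case-004", 4, False),
    ]

    # Flagged cases jump the queue; unflagged cases keep arrival order.
    prioritised = sorted(worklist, key=lambda case: (not case[2], case[1]))

    for case_id, _, flagged in prioritised:
        print(case_id, "<- prioritised by ML triage" if flagged else "")
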
Likewise, the diabetic retinopathy screening device, IDx-DR,49 appears to supplement existing workflows by permitting screening in primary care that would otherwise be impossible. The goal is to increase screening rates for diabetic retinopathy by improving access to screening and reducing costs.71 The De Novo describes risks that false-negatives may delay detection of retinopathy requiring treatment, while false-positives may subject patients to additional and unnecessary follow-up.49 However, the device may enable far greater accessibility to regular screening.

In contrast, with assistive devices there is overlap between what the clinician and the device do. Although many of these ML devices provide decision selection, such as reporting on the presence of disease, the approved indications of all assistive devices (nearly half of the devices reviewed) emphasised that decisions are the responsibility of the clinician (box 1). Such stipulations specify how device information should be used and may stem from several sources, such as legal requirements governing who can perform which tasks (for example, diagnose or prescribe medicines) and legal liability governing who is accountable when things go wrong. However, the trustworthiness of devices cannot be inferred from the presence of such indications.

Assistive devices change how clinicians work and can introduce new risks.72 Instead of actively detecting and diagnosing disease through patient examination, diagnostic imaging or other procedures, the clinician's role is changed by the addition of the ML device as a new source of information. Crucially, indications requiring clinicians to confirm or approve ML device findings create a new task for clinicians: providing quality assurance for device results, possibly by scrutinising the same inputs as the ML device, together with consideration of additional information.

The benefit of assistive ML devices is the possibility of detecting something that might otherwise have been missed. However, there is a risk that devices might bias clinicians; that is, ML device errors may be accepted as correct by clinicians, resulting in errors that might not otherwise have occurred.9 73 Troublingly, people subject to this automation bias exhibit reduced information seeking74–76 and allocate fewer cognitive resources to processing that information,77 which in turn reduces their ability to recognise when the decision support they have received is incorrect. While improving ML device accuracy reduces opportunities for automation bias errors, high accuracy is known to increase the rate of automation bias,78 likely rendering clinicians less able to detect failures when they do occur. Of further concern is evidence showing far greater performance consequences when later stage automation fails, most evident when moving from information analysis to decision selection.79 These greater consequences could be due to reduced situational awareness as automation takes over more stages of human information processing.79

Indeed, the De Novo for QuantX,57 an assistive device which identifies features of breast cancer from MRI, describes the risk that false-negatives may lead to misdiagnosis and delayed intervention, while false-positives may lead to unnecessary procedures. The De Novo for OsteoDetect52 likewise identifies a risk of false-negatives in that ‘users may rely too heavily on the absence of (device) findings without sufficiently assessing the native image. This may result in missing fractures that may have otherwise been found.’52 False-positives, meanwhile, may result in unnecessary follow-up procedures. These describe the two types of automation bias errors which can occur when clinicians act on incorrect CDS: omission errors, where clinicians agree with CDS false-negatives and consequently fail to diagnose a disease, and commission errors, where clinicians act on CDS false-positives by ordering unnecessary follow-up procedures.9 80

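The distinction can be made concrete with a small sketch (Python; the audit records are invented for illustration) that counts the two error types from triples of device output, ground truth and clinician decision.

    # Hypothetical audit records:
    # (device_positive, truly_positive, clinician_positive)
    records = [
        (False, True, False),  # device false-negative, clinician agreed -> omission error
        (True, False, True),   # device false-positive, clinician acted  -> commission error
        (False, True, True),   # device false-negative, clinician caught it
        (True, True, True),    # true positive, handled correctly
    ]

    # Omission: clinician follows a device false-negative and misses disease.
    omission = sum(1 for dev, truth, clin in records
                   if truth and not dev and not clin)

    # Commission: clinician follows a device false-positive and orders
    # unnecessary follow-up.
    commission = sum(1 for dev, truth, clin in records
                     if dev and not truth and clin)

    print(f"omission errors: {omission}, commission errors: {commission}")
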
Other risks identified in De Novo classifications45 52 57 include device failure and the use of devices on unintended patient populations, with incompatible hardware, and for non-indicated uses. Such risks could result in devices providing inaccurate or no CDS. Controls outlined in the De Novos focused on software verification and validation, and on labelling, to mitigate risks of device and user errors, respectively.

These findings have several implications. For clinicians, use of ML devices needs to be consistent with labelling, and results need to be scrutinised according to clinicians’ expertise and experience. Scrutiny of results is especially critical with assistive devices. There needs to be awareness of the potential for ML device provided information to bias decision-making. Clinicians also need to be supported to work effectively with ML devices, with the training and resources necessary to make informed decisions about use and about how to evaluate device results.

For ML device manufacturers and implementers, the choice of how to support clinicians is important, especially the choice of which tasks to support, what information to provide and how clinicians will integrate and use those devices within their work.

For regulators, understanding the stage and extent of human information processing automated by ML devices may complement existing risk categorisation frameworks,81 82 by accounting for how the ML device contribution to decision-making modifies risk for the intended use of device provided information: to treat or diagnose, to drive clinical management or to inform clinical management.81 Regulators could also improve their reporting of the ML methods used to develop the algorithms utilised by devices. These algorithms are akin to the ‘active ingredient’ in medicines, as they are responsible for the device’s action. However, consistent with a previous study, we found that public reporting of ML methods varied considerably but was generally opaque and lacking in detail.12 Presently, the FDA only approves devices with ‘locked’ algorithms,82 but it is moving towards a framework that would permit ML devices which learn and adapt to real-world data.83 Such a framework is expected to involve precertification of vendors and submission of algorithm change protocols.82 It will be important to continually evaluate clinician-ML device interactions, which may change as regulatory frameworks evolve.

Finally, there are important questions about responsibility for ML device provided information and the extent to which clinicians should be able to rely on it. While exploration of these questions exceeds the scope of this article, models of use that require clinicians to double-check ML device results may be less helpful than devices whose output can be acted on directly. As ML devices become more common, there need to be clearly articulated guidelines on the division of labour between clinicians and ML devices, especially in terms of who is responsible for which decisions and under what circumstances. In addition to the configuration of tasks between clinicians and ML devices, how devices work and communicate with clinicians is crucial and requires further study. The ability of ML devices to explain decisions through presentation of information, such as marking suspected cancers on images or using explainable AI techniques,84 will impact how clinicians assess and make decisions based on ML device provided information.

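As a minimal sketch of the first kind of explanation (marking a suspected finding on an image), the Python snippet below uses Pillow to draw a device-suggested region of interest. The image and coordinates are synthetic stand-ins, not output from any approved device.

    from PIL import Image, ImageDraw

    # Synthetic stand-in for a greyscale scan; a real device would work
    # from clinical (eg, DICOM-derived) images.
    image = Image.new("L", (256, 256), color=32)

    # Hypothetical model-suggested region of interest (left, top, right, bottom)
    region = (90, 110, 150, 170)

    draw = ImageDraw.Draw(image)
    draw.rectangle(region, outline=255, width=2)  # mark the suspected finding
    image.save("flagged_region.png")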