Notes
Article history
The research reported in this issue of the journal was funded by the HS&DR programme or one of its preceding programmes as project number 09/2001/28. The contractual start date was in February 2011. The final report began editorial review in February 2016 and was accepted for publication in June 2016. The authors have been wholly responsible for all data collection, analysis and interpretation, and for writing up their work. The HS&DR editors and production house have tried to ensure the accuracy of the authors’ report and would like to thank the reviewers for their constructive comments on the final report document. However, they do not accept liability for damages or losses arising from material published in this report.
Declared competing interests of authors
Charles Vincent reports personal fees from Cardiff University during the conduct of the study, personal fees from the Swiss Federal Office of Health, personal fees from haélò, personal fees from Wiley-Blackwell, personal fees from Healthcare at Home and grants from The Health Foundation outside the submitted work. Jonathon Gray reports other from Counties Manukau District Health Board and other from Victoria University of Wellington, Wellington, New Zealand, outside the submitted work.
Permissions
Copyright statement
© Queen’s Printer and Controller of HMSO 2017. This work was produced by Mayor et al. under the terms of a commissioning contract issued by the Secretary of State for Health. This issue may be freely reproduced for the purposes of private research and study and extracts (or indeed, the full report) may be included in professional journals provided that suitable acknowledgement is made and the reproduction is not associated with any form of advertising. Applications for commercial reproduction should be addressed to: NIHR Journals Library, National Institute for Health Research, Evaluation, Trials and Studies Coordinating Centre, Alpha House, University of Southampton Science Park, Southampton SO16 7NS, UK.
Chapter 1 The measurement of harm in inpatient settings: an overview
Background
Since the publication of the first Institute of Medicine reports on quality and safety in health care,1,2 there has been growing awareness of the number of patients unintentionally harmed in the course of health-care management. Global estimates of incidents using a common measurement strategy vary, but rates are normally reported to be around 10% of all inpatient admissions, with a range of 8–12%.3 In the last decade considerable effort has been made to improve patient safety through programmes and campaigns,4 but questions persist as to whether or not these efforts are well directed. Are patients any safer as a result? For all the effort and resource devoted to patient safety, we still do not know how much progress has been made. The lack of surveillance of harm in the health-care system,5,6 along with the difficulties encountered in using these kinds of measures to demonstrate beneficial effects in the evaluation of patient safety and quality improvement programmes, continues to pose significant challenges to the quality improvement community internationally.7–10
Welsh context for the study
Wales is in a unique position in relation to harm measurement and mortality assessment, being supported by strong health policy and strategic leads working collaboratively across government, NHS Wales and academia. In 2010, at the start of the study, all NHS providers voluntarily signed up to a national harm prevention campaign, undertaking Global Trigger Tool (GTT) reviews every month in order to monitor progress over time. Data collection preceded the launch of the 1000 Lives Campaign11 and a methodology was developed to quantify the number of episodes of harm averted on a quarterly basis. NHS Wales made an explicit commitment to the continuation of harm measurement across Wales for a 5-year period from April 2010. The study was therefore timely and designed to sit alongside the NHS work, providing evidence of impact and contributing to the ongoing development of a health-care quality measurement strategy for NHS Wales.
During the study period, Wales developed an active Harm and Mortality Collaborative, in which senior members of health boards, the Welsh Government (WG) and Public Health Wales meet to review strategy and progress on organisational patient safety metrics every 6 months. Through this agenda, and development work undertaken by individual health boards, NHS organisations in Wales stopped undertaking GTT reviews in 2014, and a mortality review process was developed and, subsequently, mandated for every death occurring within acute care. A screening process was developed to identify, for the purpose of organisational learning, those individuals for whom problems in care may have been in some way associated with the nature and cause of death. These episodes of care were then intended to progress to a second-stage review, which was not prescribed and varied considerably in approach across organisations. This approach was endorsed by the Palmer review12 along with a suite of audit and critical incident investigations to generate holistic and triangulated assessment of quality of care through these methods.
The identified need for the study
Wales did not have a benchmarkable national rate of harm in health care and, although GTT reviews were being undertaken and there was national guidance on the GTT process, there was variation in the composition of teams and the way in which the reviews were being undertaken. Some health boards opted for a multidisciplinary approach in which physicians were involved in determining adverse events (AEs), whereas others had an approach in which the reviews were undertaken by clinical governance staff. We wanted to determine how effective the GTT was in identifying health-care-related harm by comparing it with the two-stage retrospective case note review process and to use this learning to develop an approach for ongoing harm measurement. We know that system failures and causes of harm cannot be addressed efficiently until issues are identified. Furthermore, the quantification of the nature and extent of the problem provides the basis from which solutions and intervention programmes can be developed and subsequently monitored and evaluated. The starting point for this study was to quantify national rates of harm longitudinally across NHS Wales and to develop methodologies for the reporting of trends of common AEs, thereby providing organisations with the methodological know-how and data to complement current incident reporting practice and support the adoption and embedding of innovative ways of working and revised practices and processes.
The work fitted into a tripartite arrangement comprising the NHS, academic units and the WG (Figure 1). The aim was to develop a comprehensive programme of work describing the epidemiology of harm in NHS Wales hospitals, and to develop and evaluate methods for the systematic assessment of harm collected from a wide range of sources, with the end point being a move towards the active surveillance of salient events. 13 At the national level, the priority was the accurate and holistic measurement of harm14,15 as part of the development of national safety indicators. At the local level, the aim was to promote a culture of learning both from critical incident investigation and the identification of systemic issues in order to inform safety and quality improvement programmes.
The measurement of harm
In the UK, reviews of case notes from single study sites suggest that 8–10% of patients experience an AE during inpatient management.14,16 One systematic review of international studies, published in 2008, attempted to gain an overview of data and quantification of what is widely recognised as a ‘serious problem in health care’.3 The median overall prevalence of in-hospital events was 9.2%, with a median percentage of preventability of 43.5%. More than half of these patients experienced no or minor disability, and 7.4% were associated with, but not necessarily causally implicated in, the death of a patient. Although the reporting of snapshot prevalence data has its uses, we know that the examination of AEs in health care is complex. As seen in aggregate data from studies, summated AE rates are difficult to interpret and conceal a range in severity from minor injuries caused by hospital equipment to catastrophic wrong-site surgery. Work undertaken to date has provided little detail on how the measurement and characterisation of AEs inform learning and the development of interventions, which then have the potential to have a demonstrable impact on clinical processes and practices.
In recent years, as interest in and awareness of patient safety issues in health care have grown, different methodologies have been used in widely different settings and contexts and for various purposes.17–21 Although progress on measurement and evaluation is being made within the emerging international agenda of improvement science, challenges remain. Despite comprehensive work undertaken to develop an international classification for patient safety,22 there is still evidence that key concepts and terms are not consistent across research studies and improvement efforts. Fundamental questions still exist around the prevalence and nature of AEs and, perhaps most importantly, opportunities for the effective targeting of interventions and prevention programmes.23
Reporting systems, such as the National Reporting and Learning System in the UK, have been the principal mechanism through which AEs are reported into a centralised repository that is organised at both local and national levels. Vincent et al.13 describe how these systems invite voluntary reporting of unspecific safety incidents with the aim of learning lessons and feeding the findings back into the system. Although providing useful examples of cause and effect for the purposes of both professional and organisational learning, the reporting systems are not representative of AEs in patient populations, and a number of studies give rise to concern about significant under-reporting. In one UK study, reporting systems detected only 6% of AEs found through the systematic review of inpatient records24 and similar findings have been reported within the US health-care system.4 With heightened awareness of the number of patients being harmed as a result of health care, there is consensus across health, academic and policy settings that, although these systems are a valuable component of a safety system, they are primarily warning and communication mechanisms within organisations and will never act as a measurement system for safety.13 In 2015, health-care organisations and strategists continue to grapple with identifying reliable and valid, yet pragmatic, tools to measure the quality and safety of care routinely provided.4
Methodology of adverse event measurement
Measurement describes events in terms that can be analysed statistically, and should be free from random and systematic error. 25 The measurement of errors and AEs in health care, however, is complex and AEs need to be assessed and understood in the context in which they occur. 26 James Reason’s27 ‘Swiss cheese model’ has increased our understanding that errors and AEs in health care are commonly the result of numerous latent errors in addition to an active error committed by a practitioner. 26 Despite this growing understanding and theoretical underpinning, the actual determination of an AE is still often a subjective and complex task for even the most experienced clinician.
The literature reports numerous methodologies used to quantify harm and error in health care, with each method being associated with strengths and limitations.26 They include the review of post mortems and medico-legal claims, summative data outputs from reporting systems, clinical outcomes reported through administrative data set analysis, structured review of medical records, ethnographic observation of practice and the clinical surveillance of specific events. Methods differ in a number of ways. Although some methods are oriented towards detecting the number of AEs, others address their nature and contributory factors. The scale also varies significantly: some focus on single cases or a small number of cases with particular characteristics, such as claims, whereas others attempt to randomly sample a defined population. Thomas and Petersen26 suggest that the methods can be placed along a continuum, with active clinical surveillance of specific types of AE (e.g. surgical complications) being the ideal method for assessing incidence, and methods such as case analysis and morbidity and mortality meetings being more oriented towards causes. It is clear that there is no ideal way of estimating the incidence of AEs in health care and all methods give a partial picture of the true extent of the problem.
Retrospective case note review
For 25 years, retrospective review of medical records has been the methodology of choice in large-scale studies of harm in health-care systems. Originating from the Harvard Medical Practice Study,28,29 and adopted for UK use by Woloshynowych et al.,30 the process aims to assess the nature, incidence and economic impact of AEs and to provide some information on their causes. An AE is defined as an unintended injury caused by medical management rather than the disease process that results in harm to the patient or, at the very least, additional days in hospital. The review is undertaken in two stages. In stage 1, using the review form 1 (RF1), nurses or experienced clinical governance facilitators are trained to identify case records that satisfy one or more of the 18 well-defined screening criteria shown to be associated with an increased likelihood of an AE. These criteria include unexpected death, clinical complications such as myocardial infarction (MI) and deep-vein thrombosis (DVT), unplanned transfer to a higher level of care, and readmission to hospital within a specified time frame.31 In stage 2, using the modular review form 2 (MRF2), doctors trained in the use of a standard set of questions analyse positively screened records in detail to determine whether or not they contain evidence of an AE. The basic method has been followed in all the major epidemiological studies,14,16,32–39 although there have been modifications to the review form and data capture methods.30
Previous work in a UK setting,16,24 assessing the accuracy of the two-stage review process, reported a sensitivity of 92% and a specificity of 62%. There was high inter-rater reliability between trained nurse reviewers in a subset of records (84%, κ = 0.68) and lower agreement by physicians on the presence (86%, κ = 0.64) and preventability of AEs (83%, κ = 0.44). In the largest study to date, undertaken in the Netherlands, the reliability of the assessment of screening criteria by nurses was reported to be good [82%, κ = 0.62, 95% confidence interval (CI) 0.54 to 0.69], with lower reliability reported in (1) the determination of AEs by physicians (76%, κ = 0.25, 95% CI 0.05 to 0.45) and (2) the determination of the preventability of AEs (70%, κ = 0.40, 95% CI 0.07 to 0.73).39 An in-depth analysis of the methods used to undertake case note review in a UK setting confirmed this trend; inter-rater reliability was higher when objective criterion-based tools, such as screening tools, were used than when more holistic assessment was required in order to make a clinical judgement on the quality of care provided (intraclass correlation coefficient 0.61–0.88 vs. 0.46–0.52).
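The agreement statistics quoted above pair a raw percentage agreement with Cohen's κ, which corrects that percentage for the agreement expected by chance. The following is a minimal sketch of how the two quantities relate for two reviewers judging the same set of records. The function names and the example ratings are illustrative assumptions and are not taken from any of the cited studies.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Proportion of records on which two reviewers give the same judgement."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(ratings_a)
    p_observed = percent_agreement(ratings_a, ratings_b)
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    # Chance agreement: product of each reviewer's marginal proportions,
    # summed over every category either reviewer used.
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical example: 1 = AE judged present, 0 = AE judged absent.
reviewer_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
reviewer_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
agreement = percent_agreement(reviewer_1, reviewer_2)
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this made-up example the reviewers agree on 80% of records, yet κ is about 0.58, which illustrates why a high percentage agreement can coexist with only moderate chance-corrected reliability.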
Professional determination of adverse events
The traditional methodology of AE determination involves nurse screeners examining records for criteria that are suggestive of an AE [known as explicit (criterion)-based assessment]; this is followed by an implicit (holistic) assessment, also referred to as a global assessment of care, which is undertaken by physicians. 18 Relying on physician review of whole episodes of inpatient care is time-intensive and expensive, a factor that is naturally prohibitive and has resulted in a trend in health-care organisations of using either nursing staff or trained clinical governance staff to complete retrospective case note reviews. Nursing review may well be the way that organisations can afford to invest in a longitudinal assessment of quality-of-care issues. However, it is important to ensure that organisations understand what different clinical groups are measuring and the attendant implications for the results reported through any such activity. Of interest, work undertaken in the UK indicates poor consensus between physicians and nurses when assessing the quality of overall care (intraclass correlation coefficient 0.24–0.43). 18 Detailed analysis of the clinical summaries of the episode provided by the reviewers has revealed that the professional groups provide detail on different aspects of care. Non-clinical staff reported facts from the notes, nurses provided commentary around the process of care, along with implicit judgement on the quality of care, and physicians focused on technical aspects of care, making explicit judgements on the quality of care. 18
Weingart et al. 40 also drew similar conclusions, reporting that nurse and physician reviewers often came to substantially different conclusions while examining the same episode of care. Professional groups agreed more often about complications of care than global quality-of-care assessment, but inter-rater agreement varies substantially by the nature of the complication being assessed. They concluded that ‘reviewers agreed more often about complications in surgical cases than medical cases (κ 0.59 vs. 0.36), but agreement about quality was little better than chance’. 40 Strategies have been proposed to improve the inter-rater consensus between reviewers by including multiple implicit reviewers and excluding reviewers with extreme ‘hawk and dove’ approaches to case note review, but these have little applicability to service-led quality improvement that is oriented around professional and clinical learning and, furthermore, have been found not to improve agreement in research studies. 39
The Global Trigger Tool: a pragmatic approach to case note review
The GTT is another retrospective method used to review medical records, which was developed by the Institute for Healthcare Improvement (IHI) and is used throughout the world by countries adopting the IHI improvement methodology. It uses a time-limited pragmatic approach. The methodology centres on the random selection and limited case note review of 20 inpatient records per month per organisation. Qualified nurses and clinical governance facilitators commonly undertake reviews and the tool comprises similar criteria to those included in stage 1 of the two-stage review described previously in Retrospective case note review.
The largest reported study to date, which described temporal trends of AEs in 10 hospitals in North Carolina, USA, reported 25.1 AEs per 100 admissions. This rate of harm is significantly higher than rates reported using the two-stage retrospective review process, but statistically significant changes in rates over a 5-year period were not observed despite concurrent improvement efforts in the study sites. 19 Even higher rates of detection of AEs using the GTT methodology have been confirmed in other studies, for which AE rates in hospitalised patients are consistently reported to be > 30% of all inpatient admissions. 4,21,41–43
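Rates from the two approaches are not always expressed in the same units: the GTT counts every AE identified, whereas the Harvard method generally counts one AE per patient. The sketch below illustrates, under stated assumptions, how a monthly random sample of 20 records might be drawn and how the two rate definitions diverge on the same data. The function names, record identifiers and AE counts are hypothetical and do not reproduce any study data.

```python
import random


def sample_records(record_ids, sample_size=20, seed=None):
    """Draw a simple random sample of records for one month's review."""
    ids = list(record_ids)
    if len(ids) < sample_size:
        raise ValueError("fewer records available than the requested sample")
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    return rng.sample(ids, sample_size)


def aes_per_100_admissions(ae_counts):
    """All AEs counted: total AEs divided by admissions reviewed, times 100."""
    if not ae_counts:
        raise ValueError("no admissions reviewed")
    return 100 * sum(ae_counts) / len(ae_counts)


def percent_admissions_with_ae(ae_counts):
    """At most one AE counted per admission: share of admissions with any AE."""
    if not ae_counts:
        raise ValueError("no admissions reviewed")
    return 100 * sum(1 for count in ae_counts if count > 0) / len(ae_counts)


# Hypothetical month: 500 discharges, 20 sampled, AEs found per sampled record.
monthly_sample = sample_records(range(500), sample_size=20, seed=1)
ae_counts = [0] * 15 + [1, 1, 2, 0, 1]
rate_all_aes = aes_per_100_admissions(ae_counts)
rate_any_ae = percent_admissions_with_ae(ae_counts)
```

With these invented counts the same 20 records yield 25 AEs per 100 admissions but only 20% of admissions with an AE, so part of any difference between published GTT and two-stage review rates may reflect the counting rule rather than the detection method.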
Naessens et al. 41 assessed the ability of nurse reviewers, using GTT methodology, to detect AEs in four US hospitals and reported favourable levels of detection of both triggers (κ = 0.63, 95% CI 0.58 to 0.68) and AEs (κ = 0.51, 95% CI 0.45 to 0.57). In a study evaluating the reliability of the GTT in tracking local and national AE rates, agreement by nurse reviewers was reported to be within a range of κ = 0.40–0.60. Significantly, internal teams were found to perform consistently better than external teams coming into organisations to undertake reviews of care.
A comparative analysis of the Global Trigger Tool and the Harvard method approaches
The GTT was conceived and developed to circumnavigate many of the challenges in implementing a resource-intensive multidisciplinary two-stage review process in routine clinical practice. The origins, key characteristics and measurement processes are described in Table 1, which is adapted from Unbeck et al. 42
Characteristic of interest | Harvard method | GTT method |
---|---|---|
Origin of tool | The Harvard Medical Practice Study. Medico-legal and focus on negligence in the first studies and, thereafter, quality improvement and preventability perspective | Quality improvement tool for clinical practice developed by the IHI. Track AE rate over time in a hospital or a clinic |
Definition of harm | An unintended injury or complication that results in disability at discharge, death or prolonged hospital stay and is caused by health-care management rather than the patient’s underlying disease | Unintended injury resulting from or contributed to by medical care that requires additional monitoring, treatment or hospitalisation, or that results in death |
Nature of harm | Includes both omission and commission | Includes commission, excludes omission |
Sample | Random, big samples to measure the incidence and to generalise the result | Random, small samples sufficient for the design of safety work over time |
Screening | Generally undertaken by one nurse who screens for 18 criteria in the medical record | First screening independent for one of 54 triggers by trained nurses (can be other professionals), focus on triggers, no comprehensive reading, reads just relevant parts related to found triggers; second reviewer provides consensus |
Frame for inclusion | An AE had to have occurred before or during, and be detected during and/or after, index admission | 30-day inclusion period before and after index admission |
Criterion/trigger | An indication that patient harm may have occurred. Directs the medical reviewer to relevant parts of the records by the notes. 18 criteria; some criteria/triggers are AEs by definition, for example health-care-associated infections and hospital-incurred injury | 54 triggers, mostly narrow |
Review stage 1 | Comprehensive reading: non-time-limited; identifies screening criteria indicating an AE may be present in the notes | Time limited. Finds triggers, describes the potential AE and categorises harm according to the NCC MERP’s Categorizing Medication Errors Index44,45 |
Review stage 2 | Assess the AE by using different scales according to, for example, causation, severity, preventability, timing, causes and types | One physician, who does not generally review the record but does authenticate the consensus findings of the reviewers and the severity rating, and answers questions from reviewers in review stage 1. No assessment of preventability |
Number of harm events | Generally includes only one AE per patient, that is, the most severe | All identified AEs are included |
Although the two methods are intended to measure the same construct, a few key differences emerge. First is the origin of development. The GTT aims to provide information to support patient safety efforts and monitor trends of harm events over time. In contrast, the Harvard tool, despite having origins in a medico-legal context, is now commonly used to provide point prevalence estimates of harm in health care on a national scale. The Harvard tool is labour-intensive; the GTT is time limited. Perhaps more significant differences emerge in the type of harm identified and the subsequent assessment of that harm. The Harvard tool directs the reviewer to potential harm events that include acts of both omission and commission, whereas the GTT detects only acts of commission. In the Harvard methodology a physician confirms the presence of a harm event, whereas in the GTT the harm event is confirmed by a physician who may not have read the record of the episode of care. Preventability is not assessed in the GTT, but, along with management causation, is central to the Harvard methodology.46
The GTT and RF1 do, however, overlap in some of the methodological components and have criteria/triggers in common where it would be expected that there would be minimal variation in the identification of these in records. However, rates of AEs in the Harvard method are computed on the determination of harm and preventability and, in this respect, there are significant differences between the approaches. In the absence of a structured phase 2 process in the GTT, harm determination is viewed to be more prone to systematic bias. 5,41,43 Further questions arise as to what the GTT methodology is actually measuring, as, even when events caused by omission are not quantified, rates of reported harm are threefold higher than when the Harvard method is used. It might well be that triggers are being misclassified as AEs and/or multiple triggers relating to one AE are being classified as multiple AEs.
The assessment of internal validity of the GTT is challenging. In a European context, von Plessen et al. 47 examined rates of harm using the GTT in five Danish hospitals and reported significant variation between hospitals ranging between 18% and 33%. However, in another, more regulated, smaller study comparing the identification of AEs in 350 orthopaedic admissions in Sweden, the Harvard method detected 155 AEs, compared with 137 events detected by the GTT. AEs causing harm without disability accounted for most of the observed difference, with the positive predictive value being 40.3% and 30.4%, respectively. 42 Importantly, this suggests that, within a controlled study environment, the performance of the methodologies can be reasonably similar.
Retrospective case note review in inpatient deaths
Using case note review as a way of examining the quality of care in patients who die has long been part of modern surgical practice. Following the introduction of the hospital-wide mortality ratio in the early 2000s, the scope of hospital death reviews widened as they became a mechanism for understanding the underlying causes of variations in these measures and determining if ‘excess deaths’ identified represented deaths that were attributable to avoidable health-care-related harm. In 2010, the Mid-Staffordshire Foundation Trust scandal focused attention on the issue of avoidable deaths in the NHS and prompted the WG to mandate case note review of all deaths in health boards to provide reassurance for patients and quality assurance within the Welsh NHS.48,49 The approach built on local mortality review processes, which were variable in terms of the number and clinical background of reviewers, sampling, review tools and organisation. In 2012, an element of standardisation was introduced with the adoption of a single screening tool. The majority of hospitals in England have also been developing systems for mortality review and, as in Wales, the approaches adopted have varied between sites.
Mortality review can provide a window on a hospital’s safety and quality. Patients who die as inpatients tend to be older, have more comorbidity and require more complex interventions. Such patients test the system and reveal weaknesses that, in turn, may generate harm. Evidence from retrospective case note record review studies does indicate that the types of problems in care experienced by patients who die are different from those who survive, with more diagnostic errors and problems related to clinical monitoring and relatively fewer problems related to surgical or technical procedures. 29,32,39 Mortality review is not just an efficient approach to identifying avoidable health-care-related harm, but also provides a window on the quality of the clinical care received by the patient, particularly during the dying process. With a growing elderly population, and the recent withdrawal of the Liverpool Care Pathway following some high-profile abuses,50 this area of care has been attracting increasing scrutiny in the UK. Moreover, clinicians are concerned about deaths and these concerns act as a rallying cry to engagement in quality improvement initiatives.
There are obvious limits to the scope of feedback on quality and safety that can be provided by reviewing deaths. Fewer than 1.5% of the 15 million admissions to the NHS in England and Wales each year end in death, and some specialties, such as ophthalmology or dermatology, will experience such an event only very rarely. 51 Examining deaths at the hospital level may result in uncovering relatively few heterogeneous quality and safety problems, making it challenging for staff to know where best to direct their improvement efforts. Over half the population end their natural lives in hospital,52 and judgements on whether or not a death was caused by health care and, therefore, avoidable are notoriously difficult in patients with only hours or a few days of natural life remaining. For these reasons, any comprehensive hospital-based surveillance programme requires identification of health-care-related harm across the full spectrum of patient admissions, both those who die and those who are discharged alive. The approach taken to date in Wales includes assessment of harm events across the whole inpatient population and an overview assessment of every patient death, ensuring that there are no concerns identified in the care provided in the final inpatient episode.
Summary
Over the last 25 years there has been concerted effort internationally to quantify the safety of health-care delivery. The original Harvard method, which was developed primarily as a research tool, informed the development of a more pragmatic method for routine surveillance and patient safety work. As the Harvard tool is predominantly used in academic settings, Wales adopted the GTT as a routine surveillance and patient safety tool across Welsh health boards and was gaining experience in its use. The GTT was used, with some success, as part of a measurement strategy in both the Health Foundation’s Safer Patients Initiative and the 1000 Lives Campaign.11 However, the experience of NHS Wales hospitals using the IHI GTT was similar to that reported in Danish hospitals,47 and there was significant variation in rates and experience in its use across hospital sites. As Wales was committed to ongoing harm measurement and monitoring, we set out to compare the methodological approaches in order to identify the optimal approach to quantify the nature and extent of AEs occurring across NHS Wales’s acute hospitals over time and ensure the robustness of our current approach. Furthermore, we aimed to explore the contribution of harm measurement within a broader health-care quality metric framework and, most importantly, to understand how harm data can be used to inform and evaluate improvement efforts.
Study aims and objectives
Research question
What is the nature and extent of AEs occurring in the Welsh population admitted to hospital over a 4-year period?
Study aims
This study aimed to obtain definitive data on harm in NHS Wales hospitals and to compare the performance of the GTT with the two-stage retrospective review process, using findings to develop an approach to ongoing surveillance of harm in the Welsh NHS.
The specific aims over the course of the project are as follows:
- to gain an in-depth understanding of the nature and extent of AEs occurring in the Welsh population admitted to hospital over a 4-year period comparing the use of both the retrospective two-stage process and GTT process
- to compare the scale and scope of health-care-related harm identified by the retrospective two-stage process and GTT process
- to develop a robust measurement system for harm
- to embed this harm surveillance in organisations and determine the organisational response.
Study outcome definitions
Adverse event
We defined an ‘AE’ as an ‘unintended injury or complication causing temporary or permanent disability and/or increased length of stay (LOS) and resulting from health-care management’.
Preventable adverse event
A six-point scale was used to assess the likelihood of a causal link between the care given and the injury and the likelihood that the event was preventable. The assessment of preventability was restricted to the research team and MRF2 reviewers, as it is not a component of the GTT methodology.46 A judgement was therefore made by research physicians on the likelihood that the harm event could have been prevented had care been delivered to the standard that could be expected in that particular situation. Preventability was reported in three major categories, as outlined in Box 1.
1. Virtually no evidence for preventability.
Low preventability
2. Slight to modest evidence of preventability.
3. Possibly preventable, but not very likely (less than 50–50, but close call).
High preventability
4. Probably preventable (more than 50–50, but close call).
5. Strong evidence for preventability.
6. Virtually certain evidence of preventability.
Adapted with permission from BMJ Publishing Group Limited (Case record review of adverse events: a new approach, Woloshynowych M, Neale G, Vincent C, vol. 12, pp. 411–15, 2003). 30
Severity of adverse events
The assessment of the severity of harm in the MRF2 second stage of the two-stage process is rated in both physical impairment and emotional trauma terms, and ranges from no impairment or trauma to death and severe trauma lasting more than 1 year. The scale is outlined in Table 2.
Level of physical and emotional impairment | Description of level of impairment |
---|---|
Physical impairment | |
0 | No physical impairment or disability |
1 | Minimal impairment and/or recovery in 1 month |
2 | Moderate impairment, recovery in 1–6 months |
3 | Moderate impairment, recovery in 6 months to 1 year |
4 | Permanent impairment, disability 1–50% |
5 | Permanent impairment, disability > 50% |
6 | Permanent nursing |
7 | Institutional care |
8 | Death |
8.1 | Death unrelated to AE |
8.2 | Minimal contribution from AE |
8.3 | Moderate contribution from AE |
8.4 | Death entirely due to AE |
9 | Cannot reasonably judge |
Emotional trauma | |
0 | No emotional trauma |
1 | Minimal emotional trauma and/or recovery in 1 month |
2 | Moderate trauma, recovery in 1–6 months |
3 | Moderate trauma, recovery in 6 months to 1 year |
4 | Severe trauma, recovery lasting longer than 1 year |
5 | Cannot reasonably judge |
In the GTT, severity is rated using categories E to I of the National Coordinating Council for Medication Error Reporting and Prevention (NCC MERP)’s Categorizing Medication Errors Index53 (Table 3).
Level of severity of harm event (category) | Description of severity of harm event |
---|---|
E | AE causing temporary harm and requiring intervention |
F | AE causing temporary harm and hospitalisation and/or extended stay |
G | AE causing permanent harm |
H | AE requiring life-saving intervention |
I | AE contributing to or causing patient death |
Overview of study design
The overall aim of phase 1 was to compare outcomes and rates of harm from both the retrospective Harvard and GTT methods. Using a common sample, NHS teams undertook harm assessment using the GTT and the research team used the Harvard method. Aggregate levels of harm from the two methodologies were compared using the same analytical approach. Secondary analysis involved linking generated data sets by unique study number and examining the consensus between methods and professional groups on the identification of AEs. Findings and experience from phase 1 were used to inform the national case note review process in phase 2.
The study transition from phase 1 to phase 2
A transition phase from phase 1 to 2 was not originally built into the project plan. A no-cost extension was sought, as we undertook analysis of the data and synthesis of the learning from the comparative data generated from phase 1. This development period ensured that evidence, experiential learning and the current NHS and policy environment informed the method and implementation of phase 2. Phase 2 aims and methods were subsequently agreed with the National Institute for Health Research Health Services and Delivery Research programme and operationalised 6 months after completion of phase 1 (Figure 2).
Considerations for the development of harm phase 2
There are a number of factors that influenced the way we planned and executed the second phase of the study, and these are detailed as follows.
NHS Wales’ decision to stop using the Global Trigger Tool methodology
Prior to any recommendations from the study team, NHS organisations in Wales had already made the decision to stop using the GTT, as it did not meet organisational requirements to understand harm occurring within the system and provide accessible data to inform and evaluate remedial or improvement activity. Most organisations, however, continued with the process, facilitating the completion of phase 1 aims. This NHS stance was confirmed in the interim descriptive analysis of our data, which demonstrated the variability in the identification of AEs using this tool across study sites and the paucity of information on whether or not the harm event was caused by health-care management or was preventable, thereby limiting clinical learning and prioritisation. The last Welsh health board to stop using the GTT did so in July 2014. At this point, the GTT ceased to be a viable option for NHS Wales and hence for the study going forward into phase 2.
A UK-wide focus on inpatient deaths as a system-level measure of quality
Learning from the Mid-Staffordshire Inquiry, in 2013 Wales mandated that every death occurring within the secondary care setting be assessed by a review of the inpatient episode of care. 48,49 A screening process was developed by the Harm and Mortality Collaborative (NHS led) to identify those individuals for whom there were concerns about the nature and cause of death for the purpose of organisational learning. These episodes of care were then to proceed to a second-stage review identifying contributory factors and clinical and organisational learning. This development was aligned with the future expectations of the medical examiner role, and a national steering group was set up to co-ordinate strategy, activity and monitoring of implementation.
Although the Welsh mortality review process did not set out specifically to measure AEs and was focused more on overall quality-of-care issues, a number of health boards were using GTT or RF1 screening criteria to identify AEs during the mortality review. At the time of phase 2 development, there was no peer-reviewed international literature examining AEs occurring in patients who die in the hospital setting. No study had reported on whether or not AEs in inpatient deaths have the same profile and composition as those occurring in the remaining 98% of the inpatient population. It was thus unclear if targeting deaths is an effective mechanism for identifying problems in care and prioritising quality improvement at the organisational level. Conversely, NHS leads for patient safety made the point that, in respect of assessing quality across inpatient populations, the measurement and monitoring of harm across the inpatient population may provide more opportunities for learning. The identification of inpatient AEs gives a greater level of assurance of system-level organisational safety and quality issues. A decision was made to include in our sample patients whose inpatient episodes resulted in death in order to make a direct comparison between these groups and thereby rationalise and provide an evidence-based perspective on national-level priorities for patient safety monitoring.
NHS Wales’ reluctance to implement a two-stage review process for routine harm monitoring
Every health board in Wales had committed resources to innovatively evaluate the quality of health care through the review of every death occurring across the system. The resources required to adopt an additional two-stage review as part of routine harm assessment would not be sustainable at the end of the study period, and the study team agreed with the NHS and policy leads that this approach would not be pursued. This meant that neither the GTT nor the Harvard method was a candidate tool for sustainable implementation in Wales, and further work was needed to identify a method that had robust characteristics and was fit for purpose for ongoing routine harm monitoring across NHS Wales.
Recognition that current tools may be effective in identifying adverse events caused by commission but under-report and under-characterise events caused by omission
The GTT does not include AEs caused by acts of omission, nor are these acts explicit in the RF1 criteria. As part of our interim and exploratory analysis, we undertook thematic analysis of selected clinical summaries generated in phase 1 of the study. This analysis enabled a characterisation of AEs arising from specific areas of clinical or managerial risk, such as readmission and unclassified risk, which did not fit into any screening criteria. What emerged from this analysis were AEs that were caused by acts of omission, which included delayed and missed diagnosis. Including these acts of omission in the screening criteria offered the opportunity to expand the characterisation of the risk of AEs in secondary care settings and became a key focus for phase 2 of the study.
As a result of the interim thematic analysis, the comparative data on the two most commonly used methods to measure harm in a health-care system and our experience of using both methodologies over a period of time, we were well placed to undertake an evidence-based assessment of how NHS Wales could continue to measure the safety of care delivered across health boards.
The review of phase 2 aims
In our original proposal, the focus of phase 2 was to robustly monitor AEs over time. As described, the amount of learning from phase 1 superseded what became a blunt aim and we identified a number of additional aims that were included in the revised protocol. These are detailed as follows.
To develop a robust measurement system for harm
One of the key points of learning emerging from phase 1 of the study was the need to rationalise the data collected during the case note review process. The GTT provided a paucity of data and the Harvard methodology gave data sets that were unwieldy and labour-intensive both to collect and input. At this point, these tools had been used repeatedly over studies and improvement initiatives with few structural changes made to the original versions. Some countries, for example Sweden, were beginning to supplement the GTT with components of the Harvard methodology, such as the assessment of preventability. We had as our focus efficient data generation that could be used directly by NHS organisations but was still aligned in measurement and definitional structure to robust AE determination.
Having data generated from > 4000 reviews using both methods, we set out to examine the components of both tools, the data generated from each tool and how the tools fitted in with both national and organisational priorities for safety and quality metrics. We also attempted to identify any changes that had been made to the tools, as reported in quality improvement or research reports. In addition, as the review of deaths was a key focus across UK countries, we also included a review of the PReventable Incidents Survival and Mortality (PRISM) study tool for the assessment of avoidable mortality. This assessment led to the subsequent development of the Harm2 tool, of which the main characteristics are pragmatic GTT structure and a Harvard method AE measurement structure, with an assessment of preventability and contributory factors.
In addition to implementing this tool and assessing its performance against the Harvard method, phase 2 offered the opportunity to provide an evidence-based perspective on AEs occurring in patients who die during their period of hospitalisation. Not only was this intended to provide a comprehensive view of AEs across the inpatient spectrum, but we believed that this could inform and clarify future priorities for national system-level metrics for quality and safety based on empirical evidence and distanced from the controversy surrounding the use of metrics, such as the hospital standardised mortality ratio. Although aligned in terms of building the evidence base around safety in inpatient settings using structured case note review, our work differed from that led by Hogan et al.20,56 (PRISM study) in asking different questions (what is the proportion of patients experiencing AEs? as opposed to what is the proportion of avoidable deaths?), and in being heterogeneous in terms of measurement end points (AEs vs. problems in care) and sampling procedures (longitudinal monitoring vs. random snapshot).
A number of additional components were added to the tool, such as the assessment of comorbidity at the time of inpatient management and the specific identification of issues (such as learning disability and dementia) to increase our understanding across a national health-care system on the epidemiological characteristics of AEs.
To embed this harm surveillance in organisations and determine the organisational response
The Harm2 tool was operationalised in a similar way to phase 1. After training, research nurses screened a random sample of inpatient records before making judgements on the presence of AEs and their severity and preventability. A 10% sample of records was double reviewed using the Harvard method and research physicians were used to confirm or refute the presence of AEs.
The tool was shorter and its data fields were designed to be entered easily into a spreadsheet. Improved timeliness of data entry enabled us to feed back periodically to organisations the summary findings and the details of episodes of care for which AEs were identified. As this was a new tool with novel data and information feedback, we undertook a service evaluation involving both the research team and NHS sites to explore issues of logistics, feasibility, face validity and implications for quality improvement and assurance at both the organisational and national level.
Chapter 2 Measuring harm in Wales
Aim 1
To compare outcomes and the extent of harm in a random sample of NHS Wales admissions from both the retrospective two-stage process and the GTT process.
Objectives
- To provide a baseline estimate of the number and percentage of Welsh service users who are harmed during the course of their inpatient admission.
- To examine harm events and make a judgement on their preventability.
- To examine harm events and categorise them in terms of severity and the specialty and health-care intervention resulting in the harm event.
- To examine episodes of harm in detail and characterise the service users most likely to experience a harm event during their period of hospitalisation.
Methods
The retrospective two-stage process was undertaken alongside the current infrastructure of NHS-led GTT reviews, in order to compare the rates and quality of information generated on inpatient harm at the health board and national level.
The review tools
The two-stage retrospective review process methodology originally devised in the early 1970s for the Californian Insurance Feasibility Study was refined for use in the Harvard study and subsequently used internationally. 14,16,32–39 The study groups all made sequential changes to the original form by adding or subtracting questions, but maintaining the basic format. Charles Vincent and his group made revisions to the review forms, providing a stronger focus on causation and a more tightly structured format after using the tool in a UK setting; this iteration, known as the MRF2,30,31 is the one used in the study and found in Appendix 1.
Structurally, in terms of the screening criteria, the RF1 remained the same as used in previous studies. We did, however, introduce to the RF1 a narrative of the inpatient episode to provide context to any positive criteria or conversely any episode of care determined to be a negative screen. Furthermore, because we were interested in examining the nursing cohort’s determination of AEs, we asked nurses to make their own determination of AE occurrence at the end of the screening process and to rate their confidence in their decision on a 1–5 scale (1 being not at all confident and 5 being very confident). This nurse-led process was structurally similar to the GTT process, in that a timely screening review was followed by AE determination. A copy of the amended RF1 is found in Appendix 2.
The Global Trigger Tool process
No changes were made to the current practice and protocols for undertaking GTT reviews across NHS Wales during the study period. The GTT reviews were undertaken using a structured Welsh GTT pro forma derived from the IHI process,46 with teams being advised to spend no longer than 20 minutes reviewing the discharge summary, medication charts, laboratory results for that admission, operative/theatre documentation, and nursing and medical documentation. If a trigger was identified in the notes, the reviewers made a decision on whether or not harm had occurred and rated its severity using the NCC MERP’s Categorizing Medication Errors Index. 44
Study sample
The study sample comprised all six health boards providing acute/general inpatient care within NHS Wales. Two hospital sites were chosen from each health board to represent a 50–50 split of acute/emergency care and less acute district general hospital-level care. Eligibility for study inclusion was being over the age of 18 years at admission, having a LOS of > 24 hours and not being treated in a designated mental health or obstetric facility. Two-stage retrospective case note review was conducted on the case records that were sampled for the GTT process. Welsh NHS organisations randomly selected and retrospectively reviewed 20 sets of case notes each month, post discharge and after discharge summaries and coding had been completed. Case notes for review were identified by NHS organisations by generating lists of all hospital discharges fulfilling the inclusion criteria for the month in question and using a random number generator to select 30 inpatient episodes for the period covered. Oversampling in this way was required to ensure that at least 20 sets of notes had been through coding and had been returned to the filing library.
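The monthly selection described above can be sketched as follows. This is a minimal illustration, not the procedure the NHS organisations actually ran: the function name, the `is_available` callback (standing in for "coding complete and notes returned to the filing library") and the episode identifiers are all hypothetical.

```python
import random


def select_monthly_sample(eligible_ids, is_available, n_draw=30, n_review=20, seed=None):
    """Randomly draw n_draw episodes, then keep up to n_review with retrievable notes.

    eligible_ids: identifiers of all discharges meeting the inclusion criteria.
    is_available: hypothetical callback returning True when coding is complete
                  and the notes are back in the filing library.
    """
    rng = random.Random(seed)  # a seed makes the draw reproducible for audit
    pool = list(eligible_ids)
    drawn = rng.sample(pool, min(n_draw, len(pool)))  # sampling without replacement
    reviewed = [episode for episode in drawn if is_available(episode)][:n_review]
    return drawn, reviewed


# Illustrative run: 500 made-up discharge identifiers, all notes assumed available.
eligible = [f"EP{i:04d}" for i in range(1, 501)]
drawn, reviewed = select_monthly_sample(eligible, lambda episode: True, seed=2014)
print(len(drawn), len(reviewed))
```

Drawing 30 and reviewing the first 20 that are retrievable keeps the selection random while tolerating up to 10 unavailable records in a month.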
Sample size
The sample size for phase 1 was estimated on the basis of previous work undertaken in the USA, Australia and the UK. A systematic review on AEs in health care reports a median percentage of AEs at 9.2%, with a range of 8–12%.3 The largest single UK-based retrospective review study, to date, on 1008 inpatient records reports a percentage of records with at least one AE of 8.7%.16 If the percentage of AEs in Wales is around 10%, we required a sample of 3457 records to detect a one-sided difference of 1% from the reference value, with a power of 0.97 and an alpha of 0.05. If the incidence of AEs during hospital admission is lower at 8%, a sample of 4200 admissions is necessary to detect a one-sided difference from the reference value, with a power of 0.80 and an alpha of 0.05.
The review process
In the first stage of the Harvard method, the case notes were screened for potential harm by a team of clinical research nurses (n = 24) and research assistants (n = 2) from the National Institute for Social Care and Health Research (NISCHR) Clinical Research Collaboration (CRC) infrastructure. The team was trained in the use of the RF1 tool through face-to-face meetings with members of the research team, who then supervised the first two case note review sessions in each site to troubleshoot any issues arising and to ensure that the reviewers were confident in the use of the tool. The two non-clinically trained assistants were partnered with experienced research nurses for review sessions. Three research nurse teams covered the three geographic areas of Wales: north, south-east and south-west. If an episode of care screened positive for any of the 18 criteria for an AE, the reviewers described how the criterion was met in a free-text box. At the end of the review, a summary of the episode of care was written irrespective of whether or not positive screens had been identified. There was no restriction set on the time the research nurse review teams spent examining each set of notes.
Admissions that screened positive on at least one screening criterion were examined within the research office before a decision was made on whether or not a full examination of the record needed to be conducted by a senior research physician. This was undertaken to check for accuracy in the completion of the forms, to identify if there was any potential for missed AEs and to ensure that screening criteria were met in full, thereby rationalising the use of the intensive two-stage review when there was no indication that a harm event had occurred. We therefore referred a small number of reviews for which screening criteria did not identify any potential for harm but the clinical summary stated, for example, that a patient was readmitted after a knee replacement with serous fluid oozing from the wound, or stated that no obvious criteria were present but that this was a complex episode of care that would benefit from physician review. Conversely, we did not refer cases for in-depth review when the narrative describing the positive screen indicated that a fall had occurred but under the description of the criteria was documented, ‘no injury sustained’, when the descriptor for pressure ulcer read ‘buttocks red, cavalon applied’ or ‘moisture lesion present’ but tissue damage did not develop, when adverse drug reaction mentioned rash or an isolated episode of diarrhoea but drugs were continued with no systemic ill effect, and when readmission occurred in different clinical specialties with no connection. This was undertaken in the research office and was independent of the nurse process and was part of a quality assurance process.
Referred notes from phase 1 went on to a second-stage review, in which 12 research physicians, who were experienced in record review through involvement with similar studies, used the MRF2 tool to judge the occurrence, nature, preventability and consequences of the referred event on a per diem basis. The research physician reviewed the admission in question and episodes of care either side of the admission before making a determination on whether or not the patient experienced an AE before its subsequent categorisation. The inter-rater reliability was assessed in a 10% sample of admissions, which were double reviewed at both the nursing screening and physician confirmation stages and proportional to the contribution made by each participating organisation.
Statistical analysis
The GTT is conventionally analysed using a three-step method. The aggregated data are converted from episodes of harm into a harm rate and then compared with a baseline. We used the same methodology to compare aggregate rates in the two methodologies. Secondary analysis involved linking GTT and RF1 (nurse-generated) and MRF2 (physician-generated) data by unique study number and examining the consensus between methods and professional groups on the identification of AEs.
The two-stage review data were analysed to determine the percentage and corresponding CI of patients who experienced an AE during the period of hospitalisation and the severity and level of preventability of the incident both in individual hospital sites and nationally. Univariate logistic regression assessed the relationship between patient variables and the presence of an AE in physician-determined AEs. An unweighted kappa coefficient was used to assess inter-rater reliability with corresponding 95% CIs and the percentage of records that were concordant and discordant, respectively. Missing data were assessed and, because of their infrequency, (1) were presumed to be missing at random and (2) were analysed on the available data set. Weighted means were calculated using simple techniques to calculate a weighted average of the percentage of AEs in each study site, adding weights together and dividing by determined events.
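As an illustration of the inter-rater statistic named above, the following sketch computes an unweighted Cohen's kappa and the percentage of concordant records from paired judgements. The analysis in the study was run in Stata; this Python function and its example ratings are assumptions for illustration only, and the CI calculation is omitted.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Return (kappa, observed agreement) for two raters' paired categorical ratings."""
    a, b = list(ratings_a), list(ratings_b)
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be paired and non-empty")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each rater's marginal proportions.
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in set(a) | set(b))
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1.0 - expected), observed


# Made-up double-review outcomes (1 = AE judged present, 0 = absent).
first_review = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
second_review = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
kappa, concordant = cohen_kappa(first_review, second_review)
print(f"kappa = {kappa:.2f}, concordant = {concordant:.0%}")
```

Kappa discounts the agreement that two reviewers would reach by chance alone, which is why it is reported alongside, rather than instead of, the raw percentage of concordant records.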
Time series analysis was undertaken by plotting percentages of AEs by quarter and calculating the beta coefficient with corresponding CIs. Statistical analysis was undertaken using Stata version 14 (StataCorp LP, College Station, TX, USA).
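One plausible reading of the beta coefficient is the slope of a least-squares line fitted to the quarterly percentages. The sketch below shows that calculation under this assumption; the report does not specify the model fitted in Stata, the function name is hypothetical, the quarterly values are invented for illustration and the CI is not computed.

```python
def linear_trend(values):
    """Least-squares slope and intercept of values against their index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two time points are needed")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up quarterly AE percentages, for illustration only (not study results).
quarterly_pct = [11.0, 10.5, 10.8, 10.1, 9.9, 10.2]
slope, intercept = linear_trend(quarterly_pct)
print(f"change per quarter: {slope:+.2f} percentage points")
```

A negative slope would indicate a falling percentage of AEs per quarter; whether the change is distinguishable from noise depends on the CI around the slope.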
Data from different tools (RF1, MRF2 and GTT) were linked via the VLOOKUP function in Microsoft Excel® (2016; Microsoft Corporation, Redmond, WA, USA). GTT cases were identified by their hospital number, and the study data, including the MRF2 reviews, were identified through a unique study number. An independent database, held separately from the study data, provided patient identification fields for the match between hospital and study numbers. Cases of harm from the GTT and MRF2 review were identified and matched together using this reference data and the VLOOKUP function in Microsoft Excel.
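The linkage step can be pictured as a keyed lookup through the separately held reference table. The sketch below mirrors what VLOOKUP does, using Python dictionaries; it is an illustration rather than the study's workbook, and every identifier, field name and value in it is made up.

```python
def link_reviews(gtt_by_hospital_no, mrf2_by_study_no, hospital_to_study):
    """Pair GTT and MRF2 findings for the same admission via a crosswalk table."""
    linked, unmatched = [], []
    for hospital_no, gtt_harm in gtt_by_hospital_no.items():
        study_no = hospital_to_study.get(hospital_no)  # the VLOOKUP-style step
        if study_no is None or study_no not in mrf2_by_study_no:
            unmatched.append(hospital_no)  # keep for manual checking
            continue
        linked.append(
            {"study_no": study_no, "gtt_harm": gtt_harm, "mrf2_ae": mrf2_by_study_no[study_no]}
        )
    return linked, unmatched


# Made-up identifiers and findings, for illustration only.
gtt = {"H001": True, "H002": False, "H003": True}
mrf2 = {"S10": True, "S11": False}
crosswalk = {"H001": "S10", "H002": "S11"}  # held separately from the study data
linked, unmatched = link_reviews(gtt, mrf2, crosswalk)
both_agree = sum(row["gtt_harm"] == row["mrf2_ae"] for row in linked)
print(len(linked), unmatched, both_agree)
```

Returning the unmatched hospital numbers explicitly, rather than dropping them, makes failed lookups visible, which a bare VLOOKUP only signals through `#N/A` cells.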
Ethics approval was granted from the Wales Ethics Committee 1 (integrated research application system project identification 54861). Section 251 support was granted from the National Information Governance Board [ECC-5-02(FT2)12011]. Research and development approval was granted from every participating health board.
Findings: measuring harm in Wales
Aim 1
To compare outcomes and rates of harm from both the Harvard and GTT methods.
Findings
The findings are presented here in four parts. Part 1 describes the sample and extent of AEs determined from the Harvard method. Part 2 outlines the nature and characteristics of the AEs and part 3 the efficiency of the RF1 screening criteria in converting to harm events and their use in characterising AEs in secondary care settings. Finally, part 4 compares the percentage of patients experiencing AEs identified by the two methods.
Part 1: percentage of patients with determined adverse events identified from the Harvard methodology
Sample size and quality of records
Twelve NHS hospitals across NHS Wales were invited to participate in the study, with 11 actively taking part from the study commencement. Individual samples from study sites ranged from 174 to 560 reviews, but the sample accumulated over time was substantial, with one health board (sites 1 and 2) providing a larger sample than any previous similar study in a UK setting (Table 4).
Site | Number of reviews undertaken | Number of reviews included in analysis | Second-stage review MRF2 undertaken (percentage of total records reviewed) |
---|---|---|---|
1 | 560 | 553 | 98 (17.5) |
2 | 537 | 530 | 65 (12.1) |
3 | 441 | 432 | 99 (22.4) |
4 | 394 | 375 | 53 (14.5) |
5 | 338 | 309 | 73 (21.6) |
6 | 174 | 172 | 49 (28.2) |
7 | 317 | 290 | 57 (18.0) |
8 | 341 | 330 | 55 (16.1) |
9 | 413 | 393 | 64 (15.5) |
10 | 494 | 484 | 111 (22.5) |
11 | 527 | 520 | 97 (18.4) |
Total | 4536 | 4388 | 821 |
Record exclusions
Episodes of care were excluded from the final analysis on the basis of (1) errors in the patient record retrieved for review, (2) incomplete documentation in the patient record preventing the completion of the review and (3) not meeting age and inpatient LOS inclusion criteria. The percentage of reviews excluded from analysis at the individual organisational level ranged from < 1% to 9.17%.
As part of the review process, an assessment was undertaken of the completeness of the inpatient record. The percentage of records with insufficient information to undertake a robust appraisal of the episode of care, that is, inadequate documentation, ranged from < 1% to > 6% (Figure 3).
Valid observations on key demographic characteristics were available for 4361 episodes of care (99.4%) and the mean age of the patient during the index admission under review was 64.1 years (95% CI 64.48 to 64.66 years). The mean age of our study sample is older than the reported breakdown of hospital admissions nationally, in which 43% of patients discharged from hospital nationally are over the age of 65 years, compared with 56% in our study sample. It is likely that the non-inclusion of day cases and paediatric admissions in our study sample accounts for most of this difference. All inpatient episodes were stratified by age band and, as seen in Figure 4, the full age spectrum was represented in the study sample. Representation increased sequentially until the over-85s age group, when there was a significant drop in the number of reviews undertaken. Age-related data were unavailable in 27 episodes of care.
The sample varied across participating organisations and one site (site 4) had a younger demographic profile, which was explored by the research team at the interim analysis stage (Figure 5). We identified potential problems with their sampling strategy, which was subsequently amended on request.
There was a 47% to 53% male to female gender split in the 4371 inpatient records examined, and in 17 cases (0.39%) gender was not recorded.
On completion of the RF1 screening process, 1430 (32.6%) inpatient episodes of care were identified that had at least one positive screening criterion for potential AEs and, of these, 821 inpatient episodes (18.1% of the 4536 records reviewed) were referred after review by the research team for in-depth implicit physician review. At the time of the second-stage reviews, a further 31 records were irretrievable for secondary review within hospital sites and a total of 790 MRF2 reviews were completed (96.2%). The percentage of cases with one or more positive criteria for harm ranged from 23.0% to 43.8% (Table 5).
Site | Number of reviews | Number of valid reviews | Criteria present in data set (% of total number of valid reviews) (n = 4388) |
---|---|---|---|
1 | 560 | 553 | 153 (27.7) |
2 | 537 | 530 | 122 (23.0) |
3 | 441 | 432 | 189 (43.8) |
4 | 394 | 375 | 105 (28.0) |
5 | 338 | 309 | 114 (36.9) |
6 | 174 | 172 | 66 (38.4) |
7 | 317 | 290 | 108 (37.2) |
8 | 341 | 330 | 139 (42.1) |
9 | 413 | 393 | 142 (36.1) |
10 | 494 | 484 | 151 (31.2) |
11 | 527 | 520 | 141 (27.1) |
Total | 4536 | 4388 | 1430 (32.6) |
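The screening percentages quoted above use different denominators, which the short check below makes explicit. The pairing of numerators and denominators is an inference from the reported figures and the Table 4 column heading, not something the report states as a formula; the variable names are illustrative.

```python
def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_reviewed = 4536      # all reviews undertaken (Table 4)
valid_reviews = 4388       # reviews included in the analysis
screen_positive = 1430     # at least one positive RF1 criterion
referred = 821             # referred for MRF2 physician review
mrf2_completed = 790       # second-stage reviews completed

print(pct(screen_positive, valid_reviews))   # share of valid reviews screening positive
print(pct(referred, total_reviewed))         # share of all reviews referred
print(pct(mrf2_completed, referred))         # share of referrals actually reviewed
```

Under these pairings the three values round to 32.6, 18.1 and 96.2, matching the percentages in the text.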
After MRF2 review, at least one AE was determined in 450 out of 4388 (10.3%) discrete episodes of care examined, with a corresponding 95% CI of 9.4% to 11.2% [median 9%, standard deviation (SD) 2.44%]. Nearly 10% of these episodes (n = 42) had a further AE documented. When the overall percentage of patients with AEs was weighted for the proportional contribution of individual study sites to the study, the weighted AE rate increased to 10.83% (95% CI 9.91% to 11.75%). Percentages of AEs by individual study sites ranged from 7.9% to 16.1%, with 10 of the 11 participating organisations falling within the range reported in previous two-stage retrospective review studies (Table 6).
Site | Total number of RF1s | Number of included RF1s in analysis | Total number of MRF2s requested | Total number of MRF2s undertaken | Number of injuries or complications identified through review | Percentage of AEs identified |
---|---|---|---|---|---|---|
1 | 560 | 553 | 99 | 88 | 50 | 9.0 |
2 | 537 | 530 | 65 | 63 | 42 | 7.9 |
3 | 441 | 432 | 99 | 99 | 51 | 11.8 |
4 | 394 | 375 | 53 | 52 | 30 | 8.0 |
5 | 338 | 309 | 73 | 68 | 32 | 10.4 |
6 | 174 | 172 | 49 | 47 | 15 | 8.7 |
7 | 317 | 290 | 56 | 54 | 35 | 12.0 |
8 | 341 | 330 | 55 | 52 | 29 | 8.8 |
9 | 413 | 393 | 64 | 60 | 33 | 8.4 |
10 | 494 | 484 | 111 | 108 | 78 | 16.1 |
11 | 527 | 520 | 97 | 96 | 55 | 10.5 |
Incomplete information contained in review | | | | 3 | | |
Total | 4536 | 4388 | 821 | 790 (96.2%a) | 450 | 10.3 |
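The headline rate and its interval can be checked directly from the Table 6 totals. The sketch below assumes a normal-approximation (Wald) interval; the report does not state which method was used, so this is an assumption, although it gives the same rounded figures. The function name `wald_interval` is illustrative only.

```python
import math

# Totals from Table 6: episodes with at least one AE and valid RF1 reviews.
EPISODES_WITH_AE = 450
VALID_EPISODES = 4388

# Per-site (AEs identified, valid RF1 reviews) for sites 1 to 11, from Table 6.
SITES = [(50, 553), (42, 530), (51, 432), (30, 375), (32, 309), (15, 172),
         (35, 290), (29, 330), (33, 393), (78, 484), (55, 520)]


def wald_interval(events, n, z=1.96):
    """Return (proportion, lower, upper) using the normal approximation.

    Assumption: the source does not name its CI method; Wald is used here.
    """
    p = events / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


rate, lower, upper = wald_interval(EPISODES_WITH_AE, VALID_EPISODES)
site_rates = [100 * aes / valid for aes, valid in SITES]

print(f"Crude AE rate: {100 * rate:.1f}% "
      f"(95% CI {100 * lower:.1f}% to {100 * upper:.1f}%)")
print(f"Site range: {min(site_rates):.1f}% to {max(site_rates):.1f}%")
```

The weighted rate of 10.83% is not reproduced here because the weighting scheme is not described in enough detail in this section.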
For all episodes of care, we were able to categorise the primary AE according to whether it was caused (1) by health-care management, (2) by health-care management interacting with the disease process or (3) solely by the disease process. The breakdown is shown by individual study site in Table 7.
Site | Cause of AE: health-care management | Cause of AE: health-care management interacting with the disease process | Cause of AE: solely the disease process |
---|---|---|---|
1 | 17 | 20 | 14 |
2 | 6 | 26 | 10 |
3 | 16 | 23 | 12 |
4 | 8 | 16 | 6 |
5 | 13 | 13 | 6 |
6 | 6 | 6 | 3 |
7 | 10 | 14 | 11 |
8 | 16 | 10 | 3 |
9 | 18 | 15 | 0 |
10 | 17 | 39 | 21 |
11 | 5 | 32 | 18 |
Total | 132 (29.3%a) | 214 (47.5%a) | 104 (23.1%a) |
Just over three-quarters of all determined AEs resulted from health-care management or health-care management interacting with the disease process, with the remaining events being complications or unavoidable events arising from an individual’s underlying health status. Figure 6 demonstrates that when the cases that were deemed to arise solely from the disease process are removed from the AE rate, there is a reduction in the variability of rates across NHS Wales sites.
Adverse events were also categorised according to the level of management causation on a scale of 1–6, for which 1 is virtually no evidence for management causation/system failure and 6 is virtually certain evidence for management causation. Of the 448 events classified, 195 (43%) events were graded 1 or 2 (i.e. no or low evidence of management causation), 132 (29%) were graded as moderate in terms of evidence of management causation and 121 (27%) were graded as highly likely to be caused by health-care management (Table 8).
Level of confidence by physicians irrespective of preventability of the management causation of the AE | Number of AEs classified as | Percentage of the total number of classified AEs |
---|---|---|
1. Virtually no evidence for management causation/system failure | 107 | 23.7 |
2. Slight to modest evidence for management causation | 88 | 19.6 |
3. Management causation not likely: < 50–50, but close call | 42 | 9.3 |
4. Management causation more likely than not; > 50–50, but close call | 90 | 20.0 |
5. Moderate/strong evidence for management causation | 65 | 14.4 |
6. Virtually certain evidence for management causation | 56 | 12.4 |
Not classified | 2 | 0.5 |
Total | 448 |
This judgement was made after consideration of all the clinical details of the patient’s management, irrespective of preventability, and is a measure of the level of confidence that the health-care management caused the injury. In just under half of all cases (46.8%) the record of care gave sufficient confidence that health-care management was likely to have been directly responsible for the AE. The cases for which there was no evidence for management causation (23.7%) correspond crudely to the 104 cases (see Table 7) for which the physician judged that the AE was attributable solely to the patient’s underlying disease status.
Across all study sites, 51.5% of AEs were identified as having at least some evidence of preventability when defined as events resulting from care not being delivered to a standard considered to be good clinical practice, in normal circumstances (median 55%, SD 13.30%, 95% CI 46.88% to 56.12%) (Table 9).
Site | Valid RF1s | MRF2 undertaken | Number of injuries or complications | Number of preventable AEs | Percentage of preventable AEs |
---|---|---|---|---|---|
1 | 553 | 88 | 50 | 27 | 54 |
2 | 530 | 63 | 42 | 17 | 40 |
3 | 432 | 99 | 51 | 28 | 55 |
4 | 375 | 52 | 30 | 16 | 53 |
5 | 309 | 68 | 32 | 18 | 56 |
6 | 172 | 47 | 15 | 13 | 87 |
7 | 290 | 54 | 35 | 20 | 57 |
8 | 330 | 52 | 29 | 16 | 55 |
9 | 393 | 60 | 33 | 22 | 67 |
10 | 484 | 108 | 78 | 32 | 41 |
11 | 520 | 96 | 55 | 23 | 42 |
Total | 4388 | 790 (96.2%a)b | 450 | 232 | 51.5 |
When the percentage of preventable AEs was weighted by the individual organisation’s contribution to the study sample, the mean percentage of preventability increased slightly to 53.13% (95% CI 48.52% to 57.74%). Preventability was also examined on a six-point Likert scale and in three main categories (Table 10).
Site | Percentage of AEs that were preventable | Number of preventable AEs with a score greater than or equal to 2 (slight to modest evidence) | Number of AEs with a score greater than or equal to 4 (probably preventable) |
---|---|---|---|
1 | 54 | 27 | 19 |
2 | 40 | 17 | 3 |
3 | 55 | 31 | 7 |
4 | 53 | 16 | 7 |
5 | 50 | 17 | 12 |
6 | 87 | 13 | 10 |
7 | 57 | 20 | 11 |
8 | 55 | 16 | 6 |
9 | 66 | 20 | 10 |
10 | 41 | 29 | 13 |
11 | 42 | 20 | 12 |
Total | 51.5 | 226 (50.2%a) | 110 (24.4%a) |
The majority of all AEs (n = 226, 51%) were classified as having at least ‘slight to modest evidence for preventability’, corresponding to scores of between 2 and 6 on the Likert scale (95% CI 46.88% to 56.12%). In only 24% (n = 110) of cases was the level of preventability determined to be at least ‘probably preventable; > 50–50 but close call’, corresponding to a score of between 4 and 6 on the Likert scale (95% CI 20.43% to 28.37%).
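The confidence intervals quoted for preventability can be reproduced with a normal-approximation interval, provided the rounded percentages (51.5% and 24.4%) are used as inputs with a denominator of 450 AEs. That the report computed them this way is an inference from the arithmetic, not something the text states; the helper name `wald_ci_percent` is illustrative.

```python
import math

N_AES = 450  # physician-confirmed AEs (Table 9 total)


def wald_ci_percent(percent, n, z=1.96):
    """Normal-approximation 95% CI for a percentage, returned in percent.

    Assumption: the rounded percentage is the input, which is what
    appears to match the intervals printed in the report.
    """
    p = percent / 100
    half_width = z * math.sqrt(p * (1 - p) / n)
    return 100 * (p - half_width), 100 * (p + half_width)


any_evidence = wald_ci_percent(51.5, N_AES)  # score of 2 or more
probable = wald_ci_percent(24.4, N_AES)      # score of 4 or more

print(f"Any evidence of preventability: "
      f"{any_evidence[0]:.2f}% to {any_evidence[1]:.2f}%")
print(f"Probably preventable or higher: "
      f"{probable[0]:.2f}% to {probable[1]:.2f}%")
```

Using the exact counts instead (for example 232/450) shifts the limits slightly, which is why the sketch takes the rounded percentages.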
Ten per cent of all records were double reviewed at both the RF1 and RF2 stages and the level of agreement between reviewers was assessed, which is reported in Table 11.
Site | Total number of reviews | Number of concordant pairs | Number of discordant pairs | Unweighted kappa | 95% CI | Altman scale 1991⁵⁷ |
---|---|---|---|---|---|---|
1 | 58 | 45 | 13 | 0.29 | 0 to 0.63 | Fair |
2 | 56 | 45 | 11 | 0.43 | 0.13 to 0.73 | Moderate |
3 | 49 | 33 | 16 | 0.32 | 0.05 to 0.60 | Fair |
4 | 48 | 36 | 12 | 0.37 | 0.06 to 0.68 | Fair |
5 | 32 | 23 | 9 | 0.43 | 0.13 to 0.75 | Moderate |
6 | 20 | 13 | 7 | 0.27 | 0.0 to 0.70 | Fair |
7 | 29 | 24 | 5 | 0.63 | 0.33 to 0.92 | Good |
8 | 41 | 32 | 9 | 0.56 | 0.30 to 0.81 | Moderate |
9 | 54 | 41 | 13 | 0.51 | 0.28 to 0.74 | Moderate |
10 | 53 | 43 | 10 | 0.59 | 0.48 to 0.82 | Moderate |
11 | 49 | 33 | 16 | 0.34 | 0.07 to 0.60 | Fair |
Total | 492 | 377 | 121 | 0.47 | 0.39 to 0.55 | Moderate |
The overall agreement on episodes of care having at least one positive criterion for harm after assessment was moderate (κ = 0.47, 95% CI 0.39 to 0.55, the kappa falling between 0.41 and 0.60 on the Landis Koch Benchmark scale), with 76.6% of pairs concordant, and reliability varied considerably across the participating study sites.
Eighty-seven double reviews were undertaken for implicit MRF2 inter-rater reliability assessment. Agreement between reviewers was achieved in 83% of cases (n = 72), with a kappa of 0.63 and a corresponding 95% CI of 0.46 to 0.80.
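For readers less familiar with these agreement statistics, the sketch below shows how percentage agreement, an unweighted Cohen's kappa and the Altman banding relate to one another. Only the concordant counts and totals are taken from the report; the 2 × 2 cell counts passed to `cohen_kappa` are hypothetical, because the report gives concordant and discordant totals rather than the full cross-tabulation.

```python
def percent_agreement(concordant, total):
    """Raw agreement between two reviewers, in percent."""
    return 100 * concordant / total


def cohen_kappa(both_yes, a_only, b_only, both_no):
    """Unweighted Cohen's kappa for two reviewers and a yes/no judgement."""
    n = both_yes + a_only + b_only + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_only) / n
    b_yes = (both_yes + b_only) / n
    # Agreement expected by chance from each reviewer's marginal rates.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (observed - expected) / (1 - expected)


def altman_band(kappa):
    """Strength-of-agreement label using Altman's bands."""
    if kappa <= 0.20:
        return "Poor"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Good"
    return "Very good"


# Figures taken from the text: RF1 double reviews and MRF2 double reviews.
rf1_agreement = percent_agreement(377, 492)
mrf2_agreement = percent_agreement(72, 87)

# Hypothetical 2 x 2 cells, for illustration only (not study data).
example_kappa = cohen_kappa(both_yes=20, a_only=5, b_only=10, both_no=15)

print(f"RF1 agreement {rf1_agreement:.1f}%, MRF2 agreement {mrf2_agreement:.0f}%")
print(f"Example kappa {example_kappa:.2f} ({altman_band(example_kappa)})")
print(f"Reported overall kappa 0.47 is banded as {altman_band(0.47)}")
```

Kappa discounts the agreement expected by chance, which is why 76.6% raw concordance corresponds to only moderate agreement.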
Our study sample was longitudinal in nature and rates of AEs were plotted over the phase 1 period. Figure 7 reports crude rates of AEs determined by research physicians after MRF2 review per quarter. Although an upward trend is suggested, it was not statistically significant in linear regression: the beta coefficient was 1.58 (95% CI –0.16 to 3.34; p = 0.070).
Part 2: the nature and severity of adverse events across NHS Wales
Physicians identified the principal problems in care that resulted in the occurrence of an AE. Table 12 shows the distribution of problems in care across the inpatient episodes examined and categorised when management causation was established (n = 343). Failure in clinical monitoring and management was an issue identified in one-third of all events (n = 115, 33.5%). Other significant problems arose from failure to manage and prevent infection (n = 101, 29.4%), issues directly related to a problem with an operation or procedure (n = 73, 21.2%) and problems in the prescribing, administration or monitoring of drugs and fluids (n = 64, 18.7%). When assessed at the level of the individual hospital site, variations in the relative importance of problems in care are observed, but it is evident that patients are at risk as a result of failures in clinical monitoring, infection control-related procedures and drug and fluid administration across all sites.
Problem in care category | Frequency | Percentage of total AEs |
---|---|---|
1. AE relating to diagnostic or assessment error | 47 | 13.7 |
2. AE from failure to appreciate patient’s overall condition | 18 | 5.2 |
3. AE arising from a failure in clinical monitoring and management | 115 | 33.5 |
4. AE in relation to failure to prevent/control/manage infection | 101 | 29.4 |
5. AE directly related to a problem with an operation or procedure | 73 | 21.2 |
6. AE relating to prescribing, administration or monitoring of drugs or fluids including blood | 64 | 18.7 |
7. AE relating to resuscitation | 1 | 0.3 |
If the occurrence of these common events identified through case note review is crudely extrapolated across the total number of hospital discharges in Wales per annum (760,418 discharges), they would equate to around 19,900 events relating to failures in clinical monitoring, 17,500 AEs resulting from failures in infection control, 12,650 events relating to problems with operations and procedures and 11,000 AEs relating to the administration of drugs, fluids or blood. This would represent significant excess resource utilisation and expenditure.
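The extrapolation above is a simple scaling of sample frequencies to the annual number of discharges. The sketch below assumes that the denominator is the 4388 valid episodes reviewed; the report does not spell this out, but that choice gives figures in line with those quoted. The dictionary keys are shorthand labels, not the report's category names.

```python
ANNUAL_DISCHARGES = 760_418  # hospital discharges in Wales per annum (from the text)
VALID_EPISODES = 4388        # episodes reviewed (assumed denominator)

# Frequencies from Table 12 for the four most common problems in care.
PROBLEM_COUNTS = {
    "clinical monitoring": 115,
    "infection control": 101,
    "operation or procedure": 73,
    "drugs, fluids or blood": 64,
}


def extrapolate(count, sample_size=VALID_EPISODES, population=ANNUAL_DISCHARGES):
    """Scale a sample frequency to the annual discharge population."""
    return population * count / sample_size


estimates = {name: extrapolate(count) for name, count in PROBLEM_COUNTS.items()}

for name, value in estimates.items():
    print(f"{name}: about {value:,.0f} events per year")
```

The calculation treats the sample as representative of all discharges, which the report itself describes as a crude assumption.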
Adverse events have significant clinical impact but are also costly in terms of excess days in inpatient settings with attendant excess treatment costs. Table 13 outlines the percentage of (1) patients with physician-confirmed AEs across NHS Wales who were discharged with a level of disability not present at admission; (2) patients who required an extended LOS and additional treatments and/or no extended stay but additional treatment; and (3) patients whose AEs were associated with, but not necessarily causally related to, inpatient death. Nearly three-quarters (n = 328) of our cohort of patients with AEs needed additional treatment and/or days in hospital, 72 (16%) were discharged with a significant impairment in their functional status and in 41 patients (9%) the AE may have been associated with, but not necessarily causally related to, the patient dying during the inpatient episode.
End point of AE | Frequency | Percentage of patients with AEs with a defined outcome | 95% CI |
---|---|---|---|
Disability | 72 | 16.0 | 12.6 to 19.4 |
Prolonged stay/subsequent treatment | 328 | 72.9 | 68.8 to 77.0 |
Death | 41 | 9.1 | 6.45 to 11.8 |
The severity of the injury across NHS Wales
Physicians were asked to make an additional assessment of the severity of the event, in terms of both the physical and emotional impact on the patient. This is reported in Table 14. An assessment of physical disability was made for 323 of the events reported (72%). Reviewers did not make an assessment of disability in cases in which the AE was determined to have been caused solely by the disease process. Nearly 74% of the events assessed resulted in no or minimal physical impairment, with an expected recovery within 1 month. At the other end of the spectrum, the physician determined in only 26 cases that the event was associated with the patient’s death and in six cases that it was responsible for the death of the patient.
Level of disability | Number of AEs |
---|---|
Physical disability | |
Minimal impairment and recovery expected in 1 month | 115 |
Moderate impairment, recovery in 1–6 months | 41 |
Moderate impairment, recovery in 6 months to 1 year | 6 |
Permanent impairment, disability | 11 |
Death | |
Death with minimal contribution from the AE | 5 |
Death with moderate contribution from the AE | 13 |
Death entirely attributable to the AE | 6 |
Death, but cannot judge association with the AE | 2 |
Emotional trauma | |
Minimal emotional trauma and/or recovery in 1 month | 78 |
Moderate trauma, recovery in 1–6 months | 29 |
Moderate trauma, recovery in 6 months to 1 year | 7 |
Severe trauma, recovery lasting longer than 1 year | 3 |
The impact of the injury in terms of the likely emotional trauma experienced by the patient was less comprehensively assessed, with categorisation available for 309 out of 450 AEs. The severity reported follows a similar trend, with 61% of events categorised as no or minimal trauma and < 3% of events classified as causing moderate to severe trauma, with recovery predicted to last over 6 months.
Adverse events and excess length of stay
For 45% of patients who experienced an AE, the research physician estimated the excess LOS at the hospital resulting from the management of the injury. Of the 206 patients identified with confirmed AEs and estimated excess LOS of > 24 hours in duration, one patient had a LOS of 564 days, which would have significantly skewed the data and therefore was excluded from the analysis. Excluding this patient, the range was between 1 extra day as an inpatient to 43 extra days, with a median duration of 6 days. The distribution of excess LOS by quartile is reported in Figure 8. One in four of the studied patients who experienced AEs had an additional LOS of at least 10 days.
Readmissions to manage the clinical impact of the adverse events across NHS Wales
Adverse events are also commonly associated with a subsequent readmission necessary to deal with the clinical sequelae of the event. In this cohort of patients experiencing an AE across NHS Wales, just over one-quarter of all determined AEs resulted in a readmission specifically to manage attendant clinical complications (Table 15); this equates to 2.7% of the total inpatient episodes studied (120/4388). The readmission rate ranged from 9.38% to 40.0% of patients with confirmed AEs across participating hospitals. Common readmissions in the data set include those required to drain and evacuate haematomas following surgical procedures, to receive treatment for hospital-acquired infections and to manage complications of drugs and other therapeutic treatments.
Site | Number of AEs | Number of AEs resulting in new admission | Percentage of total number of AEs |
---|---|---|---|
1 | 50 | 18 | 36.00 |
2 | 42 | 7 | 16.67 |
3 | 51 | 16 | 31.37 |
4 | 30 | 12 | 40.00 |
5 | 32 | 3 | 9.38 |
6 | 15 | 5 | 33.33 |
7 | 35 | 8 | 22.86 |
8 | 29 | 10 | 34.48 |
9 | 33 | 11 | 35.48 |
10 | 78 | 17 | 21.52 |
11 | 55 | 13 | 23.64 |
Total | 450 | 120 | 26.73 |
Overall assessment of the quality of inpatient care
At the end of the MRF2 process, physicians were asked the following question: considering all that you know about this patient’s admission, how would you rate the overall quality of care? Among the 450 episodes of care for which AEs were identified, the physician cohort still deemed the overall quality of care to be excellent or very good in 71% of cases (Table 16). This assessment of quality is a subjective one. However, an association exists between the rating of quality of care and the physician’s determination of a preventable AE. The preventability of AEs by care quality rating is shown in Figure 9, with levels of preventability increasing as the quality-of-care rating worsens. This suggests that physicians judge overall quality of care by whether or not care was delivered to accepted standards under normal circumstances, so that a favourable rating in many cases rules out deviation from common guidelines or adverse patient outcomes that could potentially have been avoided. It is, however, also recognised that AEs might occur against a backdrop of good overall quality of care.
Site | Number of AEs where estimated quality of care is rated as excellent or very good | Total number of AEs | Percentage of total number of AEs where estimated quality of care is rated as excellent or very good |
---|---|---|---|
1 | 35 | 50 | 70.0 |
2 | 36 | 42 | 85.7 |
3 | 38 | 51 | 74.5 |
4 | 24 | 30 | 80.0 |
5 | 15 | 32 | 47.3 |
6 | 5 | 15 | 33.3 |
7 | 28 | 35 | 80.0 |
8 | 15 | 29 | 51.7 |
9 | 17 | 33 | 51.5 |
10 | 67 | 78 | 85.9 |
11 | 40 | 55 | 72.7 |
Total | 320 | 450 | 71.1 |
Factors independently associated with subsequent adverse event determination
The likelihood of having an AE during inpatient management was assessed with specific interest in the patient’s age, sex and the duration of the inpatient episode (Table 17). Age is a significant factor in an individual’s risk of experiencing an AE during inpatient care, with the odds increasing by 13% per 10-year age band [odds ratio (OR) 1.13, 95% CI 1.07 to 1.19]. An exception was observed in the middle-aged groups (i.e. 45–54 years age group), in which the AE rate of 6.75% was lower than expected. The AE rate peaks in the over-85 years age group, 14.4% of whom experienced an AE during the inpatient episode (Figure 10).
Variable | OR | 95% CI | p-value |
---|---|---|---|
LOS (3-day groupings and > 14 days) | 1.37 | 1.29 to 1.46 | < 0.001* |
Age band (18–24 years, 10-year age groupings and over-85 years) | 1.12 | 1.07 to 1.19 | < 0.001* |
Sex (male) | 0.99 | 0.82 to 1.20 | 0.949 |
Admission status (emergency admission) | 0.83 | 0.65 to 1.06 | 0.128 |
The mean LOS across study sites for the 4388 valid episodes of care was 7.16 days (95% CI 7.06 to 7.42 days), with a median of 5 days. The average total length of hospital stay is 6.2 days across Welsh hospitals and 5 days per finished consultant episode. A 37% increased risk of AE determination is evident when assessed by incremental inpatient durations of 3 days (OR 1.37, 95% CI 1.29 to 1.46). The percentage of patients experiencing an AE ranged from 5.9% of those with a LOS of 2–4 days to 19.8% of those whose inpatient stay was > 2 weeks in duration (Table 18).
LOS | Frequency | Frequency of physician-confirmed AE | Percentage of total frequency |
---|---|---|---|
< 48 hours | 114 | 9 | 7.9 |
2–4 days | 1752 | 104 | 5.9 |
5–7 days | 1049 | 97 | 9.3 |
8–10 days | 523 | 79 | 15.1 |
11–14 days | 372 | 50 | 13.4 |
> 14 days | 527a | 104 | 19.8 |
Total | 4337 | 443 | 10.21 |
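The band-level counts in Table 18 are enough to sketch how an odds ratio per LOS band might be estimated. The example below fits an unadjusted logistic regression to the grouped counts by Newton-Raphson, scoring the bands 0 to 5. Both the scoring and the absence of other covariates are assumptions: the report does not describe its model in this section, so the result should be read as an illustration rather than a reproduction, although it should fall in the same region as the reported OR of 1.37.

```python
import math

# (band score, episodes, episodes with a physician-confirmed AE) from Table 18.
LOS_BANDS = [
    (0, 114, 9),     # < 48 hours
    (1, 1752, 104),  # 2-4 days
    (2, 1049, 97),   # 5-7 days
    (3, 523, 79),    # 8-10 days
    (4, 372, 50),    # 11-14 days
    (5, 527, 104),   # > 14 days
]


def score_and_information(b0, b1, bands):
    """Gradient and 2 x 2 information matrix of the binomial log-likelihood."""
    g0 = g1 = i00 = i01 = i11 = 0.0
    for x, n, events in bands:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        residual = events - n * p
        weight = n * p * (1.0 - p)
        g0 += residual
        g1 += residual * x
        i00 += weight
        i01 += weight * x
        i11 += weight * x * x
    return g0, g1, i00, i01, i11


def fit_trend(bands, iterations=25):
    """Newton-Raphson fit of logit(p) = b0 + b1 * band_score."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0, g1, i00, i01, i11 = score_and_information(b0, b1, bands)
        det = i00 * i11 - i01 * i01
        # Solve the 2 x 2 system (information matrix) * step = gradient.
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    _, _, i00, i01, i11 = score_and_information(b0, b1, bands)
    se_b1 = math.sqrt(i00 / (i00 * i11 - i01 * i01))
    return b0, b1, se_b1


b0, b1, se_b1 = fit_trend(LOS_BANDS)
odds_ratio = math.exp(b1)
ci = (math.exp(b1 - 1.96 * se_b1), math.exp(b1 + 1.96 * se_b1))

print(f"OR per LOS band: {odds_ratio:.2f} (95% CI {ci[0]:.2f} to {ci[1]:.2f})")
```

Because the fit uses grouped counts only, it cannot adjust for age or admission status, and it says nothing about the direction of the relationship between LOS and AEs.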
The interaction between LOS and AE determination is difficult to interpret and may reflect either longer length of exposure or longer inpatient duration to deal with the AE. The mean age of patients in each length of hospital stay band was assessed to determine the influence of age on duration of inpatient admission (Figure 11). Age does not appear to be a significant factor up to and including a length of hospital stay of 10 days. The mean age is significantly higher in patients whose inpatient stay was > 10 days.
Sex and admission status (i.e. elective or emergency) were not found to be statistically associated with AEs across NHS Wales, although emergency admission was associated with around 17% lower odds of experiencing an AE than elective admission (OR 0.83, 95% CI 0.65 to 1.06). This suggests an increased risk of AEs in surgical/procedure episodes of care, but the difference failed to reach statistical significance. The breakdown of our elective/emergency case mix again differs from the case mix reported nationally (41/59, compared with our 23/77 split) and, again, the exclusion of cases on the basis of being an inpatient for < 24 hours significantly influences the sample from which our reviews were undertaken.
Inpatient specialty as a risk factor for inpatient adverse events
The AE rate was also assessed crudely according to whether patients were managed within a medical or surgical inpatient area. Of the 450 AEs identified, 263 (58%) were managed within surgical specialties and the remaining 187 (42%) within medical areas. Given that only approximately one-third of our study population received surgical care, the risk of AEs is substantially higher for this patient group.
As AEs are known to occur more frequently in surgical patients than medical patients and there were significant differences in the percentage of AEs identified site by site, we examined the reviews undertaken within each individual health board, assessing the composition of surgical and medical patients within the individual samples (Table 19).
Site | AE rate in organisation (%) | Percentage of patients in sample from surgical care |
---|---|---|
1 | 9.0 | 62.0 |
2 | 8 | 21.4 |
3 | 12 | 58.8 |
4 | 8.0 | 83.3 |
5 | 10 | 43.8 |
6 | 9 | 26.7 |
7 | 12 | 62.9 |
8 | 9 | 37.9 |
9 | 8 | 60.6 |
10 | 16 | 75.6 |
11 | 12 | 69.9 |
Total | 10.3 | 58.4 |
The percentage of the study sample that was derived from surgical patients ranged from 21% to 83% across participating organisations. Excluding two sites (sites 4 and 9) where there were recognised issues with the sampling or the number of reviews that were undertaken, AE rates are higher in organisations in which the sample included a higher percentage of surgical patients.
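The relationship described above can be examined informally with a correlation coefficient computed on the site-level figures as printed in Table 19. This is an illustrative calculation only and is not an analysis reported by the authors; with so few sites, it should not be read as a formal test.

```python
import math

# (site, AE rate %, % of sample from surgical care) as printed in Table 19.
TABLE_19 = [
    (1, 9.0, 62.0), (2, 8.0, 21.4), (3, 12.0, 58.8), (4, 8.0, 83.3),
    (5, 10.0, 43.8), (6, 9.0, 26.7), (7, 12.0, 62.9), (8, 9.0, 37.9),
    (9, 8.0, 60.6), (10, 16.0, 75.6), (11, 12.0, 69.9),
]
EXCLUDED_SITES = {4, 9}  # sites with recognised sampling or review issues


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


kept = [row for row in TABLE_19 if row[0] not in EXCLUDED_SITES]

r_all = pearson_r([s for _, _, s in TABLE_19], [a for _, a, _ in TABLE_19])
r_kept = pearson_r([s for _, _, s in kept], [a for _, a, _ in kept])

print(f"All 11 sites: r = {r_all:.2f}")
print(f"Excluding sites 4 and 9: r = {r_kept:.2f}")
```

The correlation is noticeably stronger once the two sites with sampling issues are set aside, which is consistent with the pattern described in the text.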
Key characteristics of differences between AEs that were surgical in nature and those that were medical in nature were explored and are reported in Table 20. The mean age was significantly different, being 73 years on admission to medical wards and 64.3 years on admission to surgical wards (p < 0.001). Other differences are noted, but failed to reach statistical significance, and these included higher rates of readmission in surgical patients (29% vs. 23%); higher rates of disability at discharge in surgical patients (19% vs. 12%); and a higher percentage of surgical patients requiring a prolonged inpatient stay (77% vs. 67%). The association between AEs and death, although not representing any causality, was lower in surgical patients than in medical patients (8% vs. 10%). We conclude that, across NHS Wales, events occurring in surgical patients are more frequent and are associated with both a higher level of disability and greater inpatient resource utilisation. Examples of typical AEs seen in these two settings are given in Table 21.
Variable or characteristic of interest | Patients with AE managed in medicine | Patients with AE managed in surgery | Assessment of difference |
---|---|---|---|
Mean age (years) with AE (95% CI) | 73 (70.71 to 75.5) | 64.3 (62.0 to 66.7) | t = 5.036; p < 0.001* |
Median LOS (days) with AE (SD) | 5 (7.70) | 7 (7.65) | NS |
Median excess LOS (days) with AE (SD) | 6 (6.10) | 6 (2.2) | NS |
Readmission attributable to AE (%) (95% CI) | 23.2 (17.7 to 29.9) | 28.9 (23.7 to 34.7) | χ² = 1.958; p = 0.162 |
Injury caused solely by disease process (%) (95% CI) | 23.1 (17.6 to 29.8) | 22.9 (18.2 to 28.4) | χ² = 0.0008; p = 0.977 |
Preventable AE (%) (95% CI) | 52.9 (45.8 to 60.1) | 50.8 (44.7 to 56.8) | χ² = 0.2072; p = 0.649 |
Percentage with prolonged hospital stay due to AE (95% CI) | 67.4 (60.3 to 73.8) | 76.8 (71.3 to 81.5) | χ² = 6.681; p = 0.083 |
Percentage whose AE was associated with death (95% CI) | 10.1 (7.0 to 16.1) | 8.0 (5.2 to 12.0) | χ² = 4.268; p = 0.371 |
Percentage who had a degree of disability at discharge (95% CI) | 11.8 (7.8 to 17.3) | 19.0 (14.7 to 24.2) | χ² = 6.681; p = 0.083 |
Time spent to review the episode of care in minutes (95% CI) | 26.3 (24.1 to 28.5) | 26.2 (24.6 to 27.8) | NS, t = 0.08; p = 0.468 |
Example of surgical AE | Example of medical AE |
---|---|
Infected swelling of umbilicus post hernia repair 1 month previously. Rx surgical excision under LA and follow-up with community wound team | Presented with 1-month history of epigastric pain, jaundice, pale stool and dark urine. Underwent ERCP confirming bile duct blockage and stent inserted. Became unwell and transferred to HDU with pancreatitis and biliary sepsis |
Elective admission for liver biopsy. Became haemodynamically unstable 2 hours post biopsy. CT showed subcapsular haematoma and collection in pelvis – subsequently developed infection. Required ERCP, antibiotics and painkillers | Patient admitted 1-week post MI and angioplasty. Commenced on ramipril. Admitted with hypotension, swelling of lips and rash. Ramipril stopped |
Admitted for total right hip replacement. Fell on ward 2 weeks post surgery and had a periprosthetic fracture requiring revision with long stem | Admitted with left-sided pleuritic chest pain post hysterectomy 2 weeks earlier. Diagnosed with PE secondary to DVT. Patient admitted with PE. Treated with warfarin but developed warfarin-induced skin necrosis on left hip. Admitted with haematuria. History of PE and DVT and on warfarin. INR of > 10 on admission and warfarin stopped |
Laparoscopic cholecystectomy. Postoperative bleed requiring return to theatre, drain and transfusion | Admitted with cellulitis. GP had assessed for DVT, but ruled out in A&E. Given one dose of prophylactic Clexane® (Sanofi-Aventis, Guildford, UK). Collapsed on ward as a result of a PE |
Neurosurgical patient with tumour complained of abdominal and leg pain post operation, prior to discharge. Diagnosed as trapped wind and constipation. Died 2 days later at home with thromboembolic event | Acute admission following fall. Sustained wedge fracture of thoracic spine and admitted for hyponatraemia. Fell on ward, sustaining fracture of left ankle |
Admitted for elective endovascular stenting of aortic aneurysm. Severe back pain present 1 day post operation. CT angiography 3 days later revealed non-perfused left kidney and blocked renal artery. Eventual kidney infarct | Admitted after mechanical fall and laceration to leg and arm – sutured by oral and maxillofacial surgery. Subsequently had CVA on ward and grade 2–3 pressure damage noted to sacral area |
Part 3: conversion of harm criteria to determined adverse events in the Harvard method
Screening criteria for harm have been comprehensively assessed in a number of previous studies, including in terms of their sensitivity, specificity, positive predictive values and negative predictive values, and these were not fully assessed in this research. However, we monitored the rates of conversion from a nurse-identified criterion for AEs to physician confirmation of AEs to gain insight into the relationships between the two in order to assess the efficiency of the screening criteria in identifying AEs in the Welsh context (Table 22).
Criterion number | Criterion | Frequency in 4388 admissions in RF1 | Frequency in MRF2-confirmed events | Percentage conversion |
---|---|---|---|---|
1 | Admission pre index admission within 1-year period | 376 | 93 | 24.7 |
2 | Admission post index admission within 30 days | 454 | 115 | 25.3 |
3 | Hospital-incurred accident or injury | 240 | 112 | 46.7 |
4 | Drug error or reaction | 169 | 76 | 45.0 |
5 | Unplanned transfer to higher level of care, for example ITU | 47 | 20 | 42.5 |
6 | Unplanned transfer to another hospital | 23 | 5 | 21.7 |
7 | Unplanned return to surgery | 41 | 22 | 53.6 |
8 | Procedure-related organ or internal injury | 16 | 10 | 62.5 |
9 | Complications including CVA, DVT, MI | 75 | 35 | 46.7 |
10 | Neurological deficit not present on admission | 52 | 20 | 38.5 |
11 | Unexpected death | 39 | 14 | 35.9 |
12 | Inappropriate discharge/planning | 129 | 46 | 35.7 |
13 | Cardiac or respiratory arrest | 44 | 13 | 29.5 |
14 | Injury or complications of labour/termination of pregnancy | 4 | 1 | 25.0 |
15 | Hospital-acquired infection | 191 | 112 | 58.6 |
16 | Patient/carer complaint | 100 | 24 | 24.0 |
17 | Litigation documented in the medical record | 5 | 1 | 20.0 |
18 | Other not included in the criteria above | 220 | 75 | 34.1 |
Rates of conversion vary considerably across the tool, with 7 out of the 18 criteria resulting in a conversion rate of around 25%, and 3 out of 18 criteria in a conversion rate of > 50%. Although not providing the epidemiological insight offered by the full assessment of screening parameters, the conversion data are important as (1) they are more intuitive to understand for those using the tools in routine practice and (2) they confirm findings reported by organisations that some criteria are better at identifying AEs and have greater face validity than others. An assessment of these factors enhances our understanding of the origin of risk to patients and highlights areas in which screening criteria may be refined to represent specific clinical AEs more closely.
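The conversion rate for each criterion is simply the number of MRF2-confirmed events divided by the number of admissions flagged at RF1. A minimal sketch using the Table 22 counts follows; the variable names are illustrative, and small differences from the printed percentages in the last decimal place reflect rounding.

```python
# (criterion, flagged in 4388 RF1 admissions, MRF2-confirmed events), Table 22.
CRITERIA = [
    ("Admission pre index admission within 1-year period", 376, 93),
    ("Admission post index admission within 30 days", 454, 115),
    ("Hospital-incurred accident or injury", 240, 112),
    ("Drug error or reaction", 169, 76),
    ("Unplanned transfer to higher level of care", 47, 20),
    ("Unplanned transfer to another hospital", 23, 5),
    ("Unplanned return to surgery", 41, 22),
    ("Procedure-related organ or internal injury", 16, 10),
    ("Complications including CVA, DVT, MI", 75, 35),
    ("Neurological deficit not present on admission", 52, 20),
    ("Unexpected death", 39, 14),
    ("Inappropriate discharge/planning", 129, 46),
    ("Cardiac or respiratory arrest", 44, 13),
    ("Injury or complications of labour/termination of pregnancy", 4, 1),
    ("Hospital-acquired infection", 191, 112),
    ("Patient/carer complaint", 100, 24),
    ("Litigation documented in the medical record", 5, 1),
    ("Other not included in the criteria above", 220, 75),
]

# Conversion = confirmed events as a percentage of flagged admissions,
# sorted from the most to the least efficient criterion.
conversion = sorted(
    ((name, 100 * confirmed / flagged) for name, flagged, confirmed in CRITERIA),
    key=lambda row: row[1],
    reverse=True,
)

for name, rate in conversion:
    print(f"{rate:5.1f}%  {name}")
```

Sorting by conversion rate makes it easier to see which criteria most often lead to a confirmed AE, although criteria with very few flagged admissions give unstable percentages.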
Studying variation in the composition of harm events at the organisational level to inform improvement priorities
The measurement of harm occurs in two settings: academic groups undertaking robust research, testing and validating methods and measurement properties; and health-care organisations monitoring the nature and extent of harm in their local context. Both have quality assurance and improvement as philosophical end points, but to date it is difficult to see how the data generated are used by clinicians and managers to realise safer care. Figure 12 highlights how harm measurement is the first of a five-step journey of improvement leading to understanding, problem-solving, ongoing assessment and the implementation of new processes or interventions. It is a cycle that is dependent on new information being generated to assess progress and priorities for targeted intervention.
We have presented some of the epidemiological information generated from the Harvard methodology, but the full data set is akin to a root-cause analysis and, although undoubtedly useful, both data collection and extraction make it challenging within a service setting. The GTT, by contrast, is pragmatic with a limited data set and, although advantageous in terms of the ability of the service to both collect and analyse data, the outputs are restricted to rates on a small monthly sample (demonstrated in Figure 13) and counts of the number of harm events associated with common triggers. Such data are presented to executive boards periodically, but the reason for doing so is unclear and there is limited anecdotal evidence of them being understood and acted on within the quality agenda in organisations.
As part of the process of quality assuring the data generated from the RF1 reviews, we began to see patterns of harm emerging from individual organisations. For example, we recognised that some organisations seemed to have very few pressure ulcers, whereas others had them regularly within their monthly quota of notes. Other organisations, which were outside the main urban centres in Wales, had harm events linked to delayed diagnosis, which we did not see often elsewhere, and others had harm events arising from the untimely identification and management of sepsis. We tested these patterns with site leads by ranking the top five frequently occurring criteria that were linked to AEs for each organisation and presenting these as their ‘organisational signatures’ of harm.
The top five were chosen rather than the full spectrum of criteria, as the total number of events in any organisational sample is small and the aim was to highlight a few key areas for each organisation for which the AE risk is highest. This process provides a clear distinction between the managerial assurance generated from the quantification of harm and the clinical learning generated from what happened, which can be triangulated with local knowledge and data. In our experience, issues that could be addressed include those of a structural, managerial, clinical, process, educational and resource nature. The following examples begin to elucidate some of these issues.
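The ranking step behind an ‘organisational signature’ can be sketched as follows. Site-level criterion counts are not given in this section, so the example uses the pooled MRF2-confirmed counts from Table 22 as a stand-in for one organisation's data. The function name `harm_signature` is hypothetical, and expressing each criterion as a share of all criterion-linked confirmed events is an assumption about how the reported percentages were derived.

```python
# MRF2-confirmed events per screening criterion, pooled across sites (Table 22).
# Used here as a stand-in for a single organisation's counts.
CONFIRMED_EVENTS = {
    "Admission pre index admission within 1-year period": 93,
    "Admission post index admission within 30 days": 115,
    "Hospital-incurred accident or injury": 112,
    "Drug error or reaction": 76,
    "Unplanned transfer to higher level of care": 20,
    "Unplanned transfer to another hospital": 5,
    "Unplanned return to surgery": 22,
    "Procedure-related organ or internal injury": 10,
    "Complications including CVA, DVT, MI": 35,
    "Neurological deficit not present on admission": 20,
    "Unexpected death": 14,
    "Inappropriate discharge/planning": 46,
    "Cardiac or respiratory arrest": 13,
    "Injury or complications of labour/termination of pregnancy": 1,
    "Hospital-acquired infection": 112,
    "Patient/carer complaint": 24,
    "Litigation documented in the medical record": 1,
    "Other not included in the criteria above": 75,
}


def harm_signature(confirmed_by_criterion, top_n=5):
    """Return the top_n criteria with their share (%) of all linked events.

    Assumption: share is taken over all criterion-linked confirmed events.
    Ties keep the order in which the criteria were listed.
    """
    total = sum(confirmed_by_criterion.values())
    ranked = sorted(confirmed_by_criterion.items(),
                    key=lambda item: item[1], reverse=True)
    return [(name, 100 * count / total) for name, count in ranked[:top_n]]


signature = harm_signature(CONFIRMED_EVENTS)

for name, share in signature:
    print(f"{share:4.1f}%  {name}")
```

Applied to one organisation's own counts, the same function would give that organisation's top five criteria; with small samples the ordering near the cut-off can change from one period to the next.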
Figure 14 outlines the phase 1 harm signature for site 1. A mixed pattern of harm is observed, with the underlying AEs in this organisation being linked mainly to infections, readmissions, hospital-acquired injuries, medication errors, thromboembolic events and the ‘other’ events criterion in the RF1 pro forma. The percentages reported reflect the contribution of each individual criterion to the organisation’s overall risk profile. This profile is fairly representative of the spectrum of AEs commonly seen in clinical practice.
In site 3, the pattern of harm highlights issues with patient flow and discharge. Thirty-five per cent of this organisation’s AEs were associated with patients requiring repeated admissions for related problems, with the remaining arising from hospital-acquired infection, hospital-acquired injuries and medication errors and reactions (Figure 15). The percentages reported reflect the contribution of each individual criterion to the organisation’s overall risk profile.
This was not new knowledge to the organisation, which had already embarked on a significant programme of investigating patient flow and innovative modelling of process and service redesign. However, a relatively small number of reviews confirmed that this was a well-targeted priority for improvement along with a focus on the management of sepsis.
In site 5, ward-based care issues were predominant, with harm being associated with hospital-acquired accident and injury, hospital-acquired infection and problems arising from discharge planning processes (Figure 16). The percentages reported reflect the contribution of each individual criterion to the organisation’s overall risk profile. In this organisation attention has been drawn to these issues externally, and efforts are under way to monitor care standards across the organisation. In all cases, the harm signatures identified organisational issues that could be assessed and improved through concerted effort.
The representation of harm in this way offers a number of opportunities:
-
Knowing both the percentage of inpatients experiencing an AE and the proportional breakdown of AEs allows for a crude extrapolation of the magnitude of an issue in a health-care system, thereby facilitating assessment and prioritisation. In larger samples, attention may be focused on developing harm signatures for subsets of data, such as those that are most likely to improve outcomes for patients as a result of their level of preventability.
-
Areas of harm deemed to be important can be triangulated with other organisational data and an appraisal instigated of current monitoring and intervention efforts. For example, if pressure areas are described in the AE profile, efforts might be well directed at reviewing compliance with interventions such as intentional rounding or skin bundles. Similarly, with infections, compliance with hand washing, asepsis, antibiotic administration and timely response to deranged early warning scores may be appropriate areas for assessment.
-
Areas for awareness raising or specific intervention can be identified at the corporate level, with accountability for action distributed to relevant clinical teams.
-
It is conceivable that these profiles of harm will change over time and some types of AEs may completely drop out of the signature and be replaced by another area for assessment. They therefore also offer potential as an evaluative tool in quality improvement efforts.
-
Simple diagrammatic representation of harm can be used to engage and influence practice at all levels of an organisation, from the chief executive officer to ancillary staff, acknowledging every member of staff’s role in promoting patient safety.
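The ranking that underlies an organisational signature, as described earlier in this part, can be illustrated with a minimal sketch. The criterion labels and counts below are hypothetical and are not study data; the function name `harm_signature` is ours, and we assume that each percentage is expressed against all criteria linked to AEs in the organisation's sample rather than against the top five only.

```python
from collections import Counter

def harm_signature(ae_criteria, top_n=5):
    """Rank the criteria linked to AEs and return the top_n as
    (criterion, count, percentage of all AE-linked criteria)."""
    counts = Counter(ae_criteria)
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (criterion, count, 100.0 * count / total)
        for criterion, count in counts.most_common(top_n)
    ]

# Hypothetical sample: one entry per criterion linked to a confirmed AE.
ae_criteria = (
    ["readmission"] * 7
    + ["hospital-acquired infection"] * 5
    + ["hospital-incurred injury"] * 4
    + ["medication error or reaction"] * 3
    + ["thromboembolic event"] * 2
    + ["other"] * 1
)

signature = harm_signature(ae_criteria)
for criterion, count, percent in signature:
    print(f"{criterion}: {count} events ({percent:.0f}%)")
```

Restricting the output to five criteria mirrors the rationale given above: organisational samples are small, so only the few highest-risk areas are surfaced.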
Part 4: comparison of the Global Trigger Tool and the two-stage retrospective review process rates of harm across NHS Wales
The percentage of inpatient episodes determined to contain evidence of an AE using the GTT methodology across NHS Wales was lower than that reported through the two-stage retrospective review process (mean 9.0%, 95% CI 8.82% to 9.18%, median 10.9%, SD 6.36%). However, there was significant variation in the extent of harm observed at the individual organisational level (range 1.9–23%). This indicated that, in routine use in some organisations, harm events were not being picked up and, in others, more events were being identified than when the Harvard method was used on the same notes. When weighted for percentage contribution to the study sample, the weighted mean increased to 11.63% and was higher than the percentage reported using the two-stage retrospective review process (Table 23).
Site | Number of GTT reviews undertaken | Number of AEs determined | Organisational GTT rate (%) |
---|---|---|---|
1 | 680 | 33 | 4.9 |
2 | 661 | 21 | 3.2 |
3 | 463 | 19 | 4.1 |
4 | 438 | 53 | 12.1 |
5 | 288 | 17 | 5.9 |
6 | 160 | 3 | 1.9 |
7 | 99 | 23 | 23.2 |
8 | 477 | 65 | 13.6 |
9 | 560 | 61 | 10.9 |
10 | 348 | 47 | 13.5 |
11 | 659 | 88 | 13.4 |
Total | 4833 | 430 | 9.0 |
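The organisational rates in Table 23 can be recomputed from the counts of reviews and AEs. The sketch below is illustrative only: it derives the per-site rates, the pooled rate across all reviews and the median of the site rates. The variable names are ours, and the weighting scheme behind the reported weighted mean of 11.63% is not specified in the text, so it is not reproduced here.

```python
from statistics import median

# (number of GTT reviews, number of AEs) per site, copied from Table 23
gtt_counts = {
    1: (680, 33), 2: (661, 21), 3: (463, 19), 4: (438, 53),
    5: (288, 17), 6: (160, 3), 7: (99, 23), 8: (477, 65),
    9: (560, 61), 10: (348, 47), 11: (659, 88),
}

def rate_percent(events, reviews):
    """AEs per 100 reviews."""
    return 100.0 * events / reviews

site_rates = {
    site: rate_percent(aes, reviews)
    for site, (reviews, aes) in gtt_counts.items()
}

total_reviews = sum(reviews for reviews, _ in gtt_counts.values())
total_aes = sum(aes for _, aes in gtt_counts.values())
pooled_rate = rate_percent(total_aes, total_reviews)  # all reviews pooled
median_rate = median(site_rates.values())             # middle site rate

for site, rate in sorted(site_rates.items()):
    print(f"Site {site}: {rate:.1f}%")
print(f"Pooled: {pooled_rate:.1f}%  Median of sites: {median_rate:.1f}%")
```

The pooled rate comes out at roughly 9% and the median of the site rates at 10.9%. Sites with few reviews (such as site 7, with 99) influence the median and range far more than they influence the pooled figure.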
To undertake matched-pairs analysis of the two methodologies, we matched outcomes of the two different methodologies using a unique study identifier. Despite having 4546 two-stage reviews and 4833 GTT reviews, we successfully matched only 77% of available records with a range of between 31% and 93% across study sites (Table 24). Reasons for the failed matches varied across the organisations and included loss of paper records of GTT reviews before they could be entered on a database and incompatible hospital numbers. Probability matching was not pursued, as the study was sufficiently powered to detect a difference in 3504 matched reviews. There was so much variability in the reported GTT rates, together with low frequencies of harm events occurring across the GTT triggers, that plans for a sensitivity analysis of individual triggers and further analysis became redundant.
Site | Number of GTT reviews | Number of RF1 reviews | Total number of cases with complete RF1 and GTT data | Percentage match per organisation |
---|---|---|---|---|
1 | 680 | 518 | 429 | 83 |
2 | 661 | 531 | 474 | 89 |
3 | 463 | 442 | 401 | 91 |
4 | 438 | 418 | 246 | 59 |
5 | 288 | 341 | 274 | 80 |
6 | 160 | 176 | 154 | 88 |
7 | 99 | 316 | 99 | 31 |
8 | 477 | 347 | 287 | 83 |
9 | 560 | 421 | 351 | 83 |
10 | 348 | 501 | 292 | 58 |
11 | 659 | 535 | 497 | 93 |
Total | 4833 | 4546 | 3504 | 77 |
The reported difference in the occurrence of AEs using the two methodologies is 2.3% (95% CI 0.93% to 3.67%) (Table 25).
Site | Percentage of cases with an AE (MRF2) | Percentage of cases with an AE (GTT) |
---|---|---|
1 | 9 | 4 |
2 | 8 | 3 |
3 | 12 | 4 |
4 | 8 | 13 |
5 | 10 | 6 |
6 | 9 | 2 |
7 | 12 | 23 |
8 | 9 | 13 |
9 | 8 | 10 |
10 | 16 | 13 |
11 | 12 | 13 |
Total | 10.3 | 8 |
The Harvard method detected more AEs in 6 of the 11 study sites and, in four of these, the number of AEs detected was at least double the number identified in the GTT (Figure 17).
In the five organisations in which the GTT detected more AEs, the increase was smaller, with only one organisation reporting a rate that was substantially higher than the rate detected using the Harvard method (site 7). The internal validity of the GTT in this context is therefore open to question; we report that in Wales, when used longitudinally, the GTT did not perform as well as the two-stage Harvard method and AEs were under-reported in most study sites. Further diagrammatic representation of the overlap between the GTT and Harvard method AEs is found in Appendix 3.
There are two key drivers for AE monitoring: assurance and improvement. Assurance is derived from quantifying the extent of a problem and having a reference rate against which benchmarking is possible. Both the GTT and Harvard method percentages reported fall within the range of AEs previously reported and, therefore, fulfil this function. However, the clinical implications of the observed difference lie in the second driver for undertaking harm monitoring, that is, clinical and organisation learning. Health-care organisations need to understand the origins of their AEs and the opportunities for those AEs to be reduced through programmes of quality improvement. Other outcome measures that we report, namely severity and preventability, are arguably more important in any individual organisation’s quality improvement activity.
The impact of Global Trigger Tool-determined adverse events for patients and organisations
The GTT data were further explored to elucidate clinically relevant information that could potentially inform improvement priorities (Figures 18 and 19). A total of 158 out of 315 harm events identified by the GTT method resulted in temporary harm requiring intervention (50%), and a further 115 (37%) events required initial or prolonged hospitalisation. Nine per cent of all GTT reviews across the study sites were associated with more significant clinical harm, including eight events that contributed to the patient’s death.
As the GTT method yields no information on issues such as management causation, the preventability of the AE, additional treatment or the identification of contributory factors, the spectrum of identified triggers that were aligned with AEs was explored.
Figure 19 represents the most commonly occurring triggers in the GTT data set determined to be a harm event. The most frequently occurring trigger was readmission within 30 days (G4) (n = 107), followed by complications of treatment (G7) (n = 33) and falls (G2) (n = 30). These three triggers accounted for 55% of all harm events reported through the GTT method.
Furthermore, the conversion rate from triggers to established harm was very low for a high number of frequently occurring criteria. For example, in the case of lack of early-warning score, which was triggered on 424 occasions, the conversion rate to harm events was 4% (17/424). The number of events in other categories was so low that comparison with the RF1 criteria was not undertaken. Details on all GTT triggers can be found in Appendix 4.
Summary of key points arising from Chapter 2
-
In total, 4388 inpatient episodes of care were screened in phase 1 of the study and 450 were identified as containing an AE using the Harvard methodology (10.3%, 95% CI 9.4% to 11.2%). A total of 51.5% of AEs were identified as having at least some evidence of preventability (95% CI 46.88% to 56.12%).
-
Agreement between reviewers on the presence of AEs was achieved in 83% of cases (n = 72) with a kappa value of 0.63 and a corresponding 95% CI of 0.46 to 0.80.
-
Nearly three-quarters of patients with AEs needed additional treatment and/or days in hospital, 16% were discharged with a significant impairment in their functional status and, in 9% of cases, the AE may have been associated with, but not necessarily causally related to, the patient dying during the inpatient episode. The median additional inpatient LOS was 6 days.
-
In 450 episodes of care in which AEs were identified, the physician cohort still deemed the overall quality of care to be excellent or good in 71%, with poor quality-of-care ratings being more common in AEs that were deemed to be preventable.
-
Older age and longer length of hospital stay were statistically associated with physician AE determination [OR 1.12, 95% CI 1.07 to 1.20; and OR 1.37, 95% CI 1.12 to 1.20, respectively].
-
Assessment of the conversion from positive criteria to confirmed AEs suggests areas for the assessment and development of screening criteria for AEs.
-
Signatures of organisational harm may provide (1) opportunities for the prioritisation of quality improvement programmes, (2) triangulation of hypothesised harm with other organisational metrics and information, (3) increased awareness of patient safety issues and (4) a monitoring and evaluation tool.
-
The percentage of inpatient episodes determined to contain evidence of an AE using the GTT methodology across NHS Wales was lower than that reported through the two-stage retrospective review process (mean 9.0%, 95% CI 8.82% to 9.18%). However, there was significant variation in the extent of harm observed at the individual organisational level (range 1.9–23%).
Chapter 3 Development of nurse-led identifications of adverse events and the Harm2 tool
Methods
Aim 2
To use the findings of phase 1 to develop a robust measurement system for harm.
Objectives
-
To respond to stakeholder feedback on the feasibility of the two-stage retrospective case note review process.
-
To review the contents of the Harvard method.
-
To compare nurse-determined and physician-determined AE rates.
-
To measure the inter-rater reliability of nurse-determined AEs.
-
To determine trends in nurse-determined AEs over time.
-
To determine the degree of association between positive criteria and nurse-determined AEs.
-
To evaluate RF1 screening criteria to determine if these criteria capture the full spectrum of harm, especially harm attributable to omissions.
-
To assess the implications of a time-limited approach in case note review methodology.
-
To develop a new tool to identify AEs.
Background
At the end of phase 1, a decision needed to be made on the methodological approach required to meet the aims of phase 2. As no transition period was built into the project plan, a no-cost extension was agreed with the National Institute for Health Research to undertake interim analysis of phase 1 data and to consult with our stakeholders on how to operationalise phase 2. Activity undertaken during this period included the assessment of nurse determination of harm events, review of the GTT and Harvard method outputs with key stakeholders, the assessment of the efficiency of harm screening criteria, and a review of the structure and content of the common tools in practice. Towards the end of this period it became apparent that a hybrid set of tools was required to provide robust information on harm, which hospitals could use on an ongoing basis for harm surveillance and to drive quality improvement.
Stakeholder consultation and testing of phase 1 findings
We reviewed with medical, nursing and managerial colleagues the nature of data generated from both the GTT process and the Harvard method in phase 1 of the study. Most organisations at this point had already made the decision to stop undertaking GTT reviews and were articulate about the limitations of this approach. The information generated from the Harvard method was viewed as providing a reliable measure of harm in the system capable of generating information for organisational learning, which is grounded in the ‘patient story’. Stakeholders also valued the potential for outcomes to be benchmarkable, that is, whether or not they as organisations were outliers or, perhaps more significantly, had AE rates greater than the 1 in 10 figure commonly stated. Harm signatures were viewed as having an exciting potential to identify priorities for organisational intervention and a mechanism for wider engagement in patient safety monitoring across organisations. Although the Harvard method delivered on what organisations said they needed, it became apparent on the review of the tools that this was not going to be a sustainable approach in an NHS setting once the second phase of the study was complete.
This drove our methodological approach for phase 2 and our aims were to generate outcomes from the data that (1) provide high-level metrics, which are benchmarkable with the international literature and other epidemiological studies of harm; (2) inform future strategies for case note review methodology in terms of team composition; (3) improve the face validity of screening criteria by understanding the association between positive screens and AEs; and (4) engage frontline clinical teams, as well as managers and executive teams, in monitoring patient safety risk and outcomes.
This transition phase of the study therefore facilitated an early descriptive overview analysis of the RF1 and MRF2 data and a review of the clinical summaries in order to prioritise areas for further exploration and testing in the second phase of data collection. The areas of specific focus were the (1) assessment of screening criteria and their association with AEs, (2) identification of acts of omission that are not explicitly covered in current screening methodologies and (3) review of current tools in use and the subsequent identification of key components of harm measurement for phase 2.
Nurse determination of harm events
In most harm measurement methods, non-physicians undertake the screening process with physicians making a subsequent determination on the presence of an AE. We added nurse determination of AEs to the screening process in phase 1 in order to assess the rates generated and the composition of AEs identified. Having two professional groups review the same case notes provided an opportunity to study, along with AE rates, the differences in the identification of events, thereby providing a robust assessment of the optimum and the most cost-effective approach to team composition in retrospective case note review.
As the non-physicians were making an AE determination outside the logical and structured process outlined in the MRF2, their confidence in making the decision was rated on a scale of 1 to 5. This was to ensure that lack of confidence did not prevent the identification of any potential AEs. The confidence scale ranged from 1, not at all confident, to 5, very confident, and the inter-rater reliability of their judgement was assessed in a 10% sample of all records that were double reviewed for agreement.
We compared the nurse and physician rate of harm nationally and by individual hospital site, and computed rates of harm in the non-physician cohort by the level of confidence that the reviewers had in an AE being present. For both reviewer groups, rates of AEs were plotted longitudinally in order to assess trends over time.
Review of stage 1 adverse event screening criteria including thematic analysis
In order to assess the relationship between the criteria that screened positive and the subsequent determination of AEs, we quantified, for each of the 18 criteria in the screening component of the RF1, the frequency of positive criteria and the subsequent rate at which they converted to confirmed AEs. This rate of conversion was assessed for both the non-physician cohort when they made a decision on the presence of AEs at the end of the screening process and after the research physicians had reviewed the case notes and made a definitive assessment on the presence of AEs.
A further step was undertaken to explore in greater depth criteria in which actual patient risk was difficult to elucidate in terms of subsequent AE determination. For example, in the case of a readmission either side of the index episode being studied, readmission is an ambiguous criterion and could indicate a number of clinical and pathway issues. We therefore aimed to explore these cases to establish what was happening to these patients and why they were thought to have experienced an AE. Other criteria were also generic, particularly ‘any other – not covered by the above’. The assessment of these specific criteria would not only enhance our understanding of screening criteria, but also enable us to explore our hypothesis that acts of omission accounted for many of the AEs in these categories. Thematic analysis was the methodological approach chosen to examine the clinical summaries of all positive screens in criteria prioritised for exploration.
Review of the components of the stage 2 retrospective case note record review form
The two-stage retrospective review tools have now been in use for 25 years, during which time few structural changes have been made to the original versions, and the GTT has been used for around a decade. The end point for the new tool was a process that could efficiently generate data that could be used directly by NHS organisations in safety and quality work but was still aligned in measurement and definitional structure to the validated tools in use.
As well as reviewing and developing screening criteria, other components of the proposed new Harm2 tool were informed by, and derived from, documentary analysis of other tools in current practice and of commonly reported measures and outcomes of validated tools in the international literature. Outcomes included validated screening criteria, AE definitions, classifications of health-care management causation, AE preventability, AE severity and contributory factors, along with other definitions including those of ‘problems in care’ and avoidable mortality. Tools assessed included the Harvard method, various versions of the GTT, the PRISM tool and the IHI mortality review process.
Statistical and thematic analysis
The RF1 data with nurse identification of AEs were analysed to determine the percentage and corresponding CI of patients who experienced an AE during the period of hospitalisation, and the severity and level of preventability of the incident both in individual hospital sites and nationally. An unweighted kappa coefficient was used to assess inter-rater reliability with corresponding 95% CIs and the percentages of records that were concordant and discordant. Records were assessed for missing data, and as the number of missing data was low it was presumed that they were missing at random; thus, the analysis was performed using the available data set. Statistical analysis was undertaken using Stata, version 14.
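As an illustration of the unweighted kappa coefficient described above, the following sketch computes Cohen's kappa for two reviewers making a yes/no AE determination and maps the result onto the Altman (1991) bands used in this report. The cell counts are hypothetical (the study's tables report concordant and discordant pairs, not the full 2 × 2 breakdown), and the function names are ours rather than part of the Stata analysis.

```python
def cohen_kappa(both_yes, only_a, only_b, both_no):
    """Unweighted Cohen's kappa for two reviewers and a binary judgement.

    both_yes / both_no: records on which the reviewers agree.
    only_a / only_b: records flagged as an AE by one reviewer only.
    """
    n = both_yes + only_a + only_b + both_no
    if n == 0:
        raise ValueError("no double-reviewed records")
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + only_a) / n
    b_yes = (both_yes + only_b) / n
    # Agreement expected by chance, from each reviewer's marginal rates
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)

def altman_label(kappa):
    """Strength-of-agreement bands from Altman (1991)."""
    if kappa <= 0.20:
        return "Poor"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Good"
    return "Very good"

# Hypothetical double-reviewed sample of 100 records
kappa = cohen_kappa(both_yes=20, only_a=5, only_b=5, both_no=70)
print(f"kappa = {kappa:.2f} ({altman_label(kappa)})")
```

In this hypothetical sample the reviewers agree on 90% of records, yet kappa is noticeably lower because most of that agreement is on records without an AE, which chance alone would largely produce.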
Data from different tools, RF1, MRF2 and GTT, were linked via the VLOOKUP function in Microsoft Excel. GTT cases were identified by their hospital number, whereas the study data (including the MRF2 reviews) were identified through a unique study number. An independent database, held separately from the study data, provided patient identification fields for the match between hospital and study numbers. Cases of harm from the GTT and MRF2 review were identified and matched together using this reference data and the VLOOKUP function in Microsoft Excel.
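The linkage step described above is, in essence, a two-stage keyed lookup. The sketch below shows the same logic outside a spreadsheet, under stated assumptions: all identifiers, field names and records are hypothetical, and the function `link_reviews` is ours. It is not the study's workbook, only an equivalent of an exact-match lookup.

```python
def link_reviews(gtt_reviews, reference, mrf2_reviews):
    """Link GTT reviews (keyed by hospital number) to MRF2 reviews
    (keyed by study number) through a separate reference table.

    Returns the matched pairs and the hospital numbers left unmatched.
    """
    matched, unmatched = [], []
    for review in gtt_reviews:
        hospital_number = review["hospital_number"]
        # Stage 1: hospital number -> study number
        study_number = reference.get(hospital_number)
        # Stage 2: study number -> MRF2 outcome
        mrf2 = mrf2_reviews.get(study_number) if study_number is not None else None
        if mrf2 is None:
            unmatched.append(hospital_number)
            continue
        matched.append({
            "study_number": study_number,
            "gtt_ae": review["gtt_ae"],
            "mrf2_ae": mrf2["mrf2_ae"],
        })
    return matched, unmatched

# Hypothetical data (not study records)
reference = {"H001": "S101", "H002": "S102", "H003": "S103"}
gtt_reviews = [
    {"hospital_number": "H001", "gtt_ae": True},
    {"hospital_number": "H002", "gtt_ae": False},
    {"hospital_number": "H003", "gtt_ae": True},   # no MRF2 review on file
    {"hospital_number": "H999", "gtt_ae": False},  # incompatible hospital number
]
mrf2_reviews = {"S101": {"mrf2_ae": True}, "S102": {"mrf2_ae": True}}

matched, unmatched = link_reviews(gtt_reviews, reference, mrf2_reviews)
print(f"matched {len(matched)} of {len(gtt_reviews)}; unmatched: {unmatched}")
```

Keeping the reference table separate from both review data sets, as in the study, means that the analysis files themselves never need to carry patient-identifiable hospital numbers.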
The thematic analytical process incorporated the five stages of qualitative data analysis techniques: familiarisation, identification of a framework, indexing, charting and interpretation. A literature review was carried out to aid the establishment of an initial thematic framework and the RF1 clinical summaries were then reviewed in Microsoft Excel 2010 in four distinct phases. First, the process involved reading through all the patient records to assist with familiarisation. A second iteration identified and numerically coded key concepts in the data. The third phase involved the formation of protothemes and subthemes under which the coded data were placed. A final pass aimed to reaffirm categorisation and reduce any misclassification of data. The classifications generated were presented in flow charts and tables, and rates of subsequent conversion to determined AEs were presented in order to prioritise new criteria emerging, including acts of omission that could be tested in the second phase tool. The in-depth analysis of criteria and the identification of additional screening criteria provided the platform from which the other components of the new tools could be identified or developed.
Findings: development of nurse-led identifications of adverse events and the Harm2 tool
Overview
Aim 2
To use the findings of phase 1 to develop a robust measurement system for harm.
Findings are presented in two parts. Part 1 presents the findings of nurses’ determination of AEs in a supplementary step within the RF1 in phase 1 of the study. Part 2 outlines the findings from the thematic analysis and the development of the new screening criteria for AEs in phase 2 of the study.
Part 1: the nature and extent of nurse-determined adverse events across NHS Wales
As we amended the RF1 process introducing narrative and AE determination, records were kept on the time taken across organisations for the research nurse cohort to read and evaluate all documentation on the episode of care and complete the study documentation. The mean length of time to undertake this process per review was 20 minutes, but ranged from < 5 minutes to a maximum of 2 hours for complex episodes of care (Figure 20).
The percentage of patients identified as experiencing AEs at the end of the RF1 process is higher than the percentage determined by physicians. Across all NHS Wales sites, 811 (18.4%) inpatient episodes of care reviewed were found to include an AE (95% CI 17.33% to 19.63%). When AEs were stratified by the nurse’s level of confidence in his or her determination of the event, the percentage of patients identified with an AE fell, and in the categories of ‘somewhat confident’ (4) and ‘very confident’ (5), the percentage of patients affected was 10.4%, which is comparable to the physician-determined percentage of 10.3% (Table 26).
Site | Nurse-determined AEs (n/N) | Nurse-determined AEs (%) | AEs determined at confidence level ≥ 3 (%) | AEs determined at confidence level ≥ 4 (%) | Physician-confirmed (MRF2) harm (%) |
---|---|---|---|---|---|
1 | 97/533 | 18.2 | 12.9 | 6.8 | 9.0 |
2 | 67/530 | 12.6 | 9.4 | 4.9 | 7.9 |
3 | 101/432 | 23.4 | 17.6 | 10.0 | 11.8 |
4 | 56/375 | 14.9 | 10.7 | 7.2 | 8.0 |
5 | 76/309 | 24.6 | 22.7 | 17.5 | 10.4 |
6 | 46/172 | 26.7 | 20.9 | 18.6 | 8.6 |
7 | 57/290 | 19.6 | 14.1 | 6.9 | 12.0 |
8 | 51/330 | 15.4 | 11.8 | 6.4 | 8.8 |
9 | 62/393 | 15.7 | 13.0 | 7.9 | 8.4 |
10 | 108/484 | 22.3 | 20.9 | 18.8 | 16.1 |
11 | 90/520 | 17.3 | 16.7 | 14.6 | 10.5 |
Total | 811/4388 | 18.4 | 15.0 | 10.4 | 10.3 |
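The confidence interval quoted for the all-Wales nurse-determined percentage can be approximated with a normal-approximation (Wald) interval for a proportion. The report does not state which interval method was used, so the sketch below is an assumption on our part; the function name `wald_ci` is ours.

```python
from math import sqrt

def wald_ci(events, n, z=1.96):
    """Proportion with a normal-approximation (Wald) 95% interval.

    Returns (proportion, lower bound, upper bound), clipped to [0, 1].
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = events / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

# Nurse-determined AEs across all sites (Table 26 totals)
p, lower, upper = wald_ci(811, 4388)
print(f"{100 * p:.2f}% (95% CI {100 * lower:.2f}% to {100 * upper:.2f}%)")
```

With 811 AEs in 4388 episodes, the bounds come out at approximately 17.33% and 19.63%, in line with the interval reported above. For the much smaller site-level samples a Wilson or exact interval may behave better than this approximation.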
Inter-rater reliability of nurse determination of adverse events
Agreement on the occurrence of an AE was higher among the nursing cohort (87%) (κ = 0.56, 95% CI 0.46 to 0.66) than when based on the presence of screening criteria (76.6%) (κ = 0.47, 95% CI 0.39 to 0.55), but was still moderate, with significant variation in the level of agreement across participating sites (Table 27).
Site | Total number of reviews | Concordant pairs | Discordant pairs | Unweighted kappa | 95% CI | Altman scale 1991 (ref. 57) |
---|---|---|---|---|---|---|
1 | 58 | 57 | 1 | 0.85 | 0.55 to 1.0 | Very good |
2 | 56 | 51 | 5 | 0.61 | 0.29 to 0.94 | Good |
3 | 49 | 43 | 6 | 0.33 | 0.25 to 0.83 | Fair |
4 | 48 | 43 | 5 | 0.65 | 0.36 to 0.94 | Good |
5 | 32 | 26 | 6 | 0.54 | 0.20 to 0.87 | Moderate |
6 | 20 | 18 | 2 | 0.61 | 0.092 to 1 | Good |
7 | 29 | 29 | 0 | 1 | 1 to 1 | Very good |
8 | 41 | 33 | 8 | 0.56 | 0.45 to 0.66 | Moderate |
9 | 54 | 47 | 7 | 0.55 | 0.24 to 0.86 | Moderate |
10 | 53 | 40 | 13 | 0.31 | 0 to 0.64 | Fair |
11 | 49 | 40 | 9 | 0.52 | 0.23 to 0.80 | Moderate |
Duplicates | 3 | 2 | 1 | |||
Total | 492 | 429 | 63 | 0.56 | 0.46 to 0.66 | Moderate |
Conversion of screening criteria to nurse-determined adverse events
With the percentage of AEs identified by the nursing cohort almost double that determined by the physicians, the rate at which positive criteria converted to determined AEs was also higher for nurses. This is outlined for all screening criteria in Table 28.
Criteria | Frequency of criteria in 4388 admissions | Frequency of criteria in AEs | Conversion of criteria to AEs: nurses (%) | Conversion of criteria to AEs: physicians (%) |
---|---|---|---|---|
Admission within 1 year before index admission | 376 | 175 | 46.3 | 24.7 |
Admission within 30 days of index admission | 454 | 200 | 44.1 | 25.3 |
Hospital-incurred accident or injury | 240 | 190 | 79.2 | 46.7 |
Drug error or reaction | 169 | 127 | 73.7 | 45.0 |
Unplanned transfer to higher level of care, for example ITU | 47 | 36 | 75.2 | 42.5 |
Unplanned transfer to another hospital | 23 | 10 | 43.5 | 21.8 |
Unplanned return to surgery | 41 | 36 | 87.8 | 53.6 |
Procedure-related organ or internal injury | 16 | 15 | 93.8 | 62.5 |
Complications including CVA, DVT, MI | 75 | 62 | 82.7 | 46.7 |
Neurological deficit not present on admission | 52 | 37 | 71.2 | 38.5 |
Unexpected death | 39 | 36 | 92.3 | 35.9 |
Inappropriate discharge/planning | 129 | 94 | 72.9 | 35.7 |
Cardiac or respiratory arrest | 44 | 29 | 65.9 | 29.5 |
Injury or complications related to labour/termination of pregnancy | 4 | 2 | 50.0 | 25.0 |
Hospital-acquired infection | 191 | 168 | 88.0 | 58.6 |
Patient/carer complaint | 100 | 60 | 60.0 | 24.0 |
Litigation documented in the medical record | 5 | 3 | 60.0 | 20.0 |
Other not included in the criteria above | 220 | 144 | 65.5 | 34.1 |
Further analysis of the difference in the identification of AEs comprising the physician-determined percentage of 10.3% and the nurse-determined percentage of 10.4% when they are ‘confident’ or ‘very confident’ in the presence of an AE is presented in Table 29.
Criteria | Frequency of AEs in nurse assessment at confidence levels 4 and 5 | Percentage conversion confidence levels 4 and 5 | Percentage conversion in MRF2 | Difference in conversion rates (%) |
---|---|---|---|---|
Admission within 1 year before index admission | 102 | 27.1 | 24.7 | +2.4 |
Admission within 30 days of index admission | 100 | 22.0 | 25.3 | –3.3 |
Hospital-incurred accident or injury | 131 | 54.6 | 46.7 | +7.9 |
Drug error or reaction | 74 | 43.7 | 45.0 | –1.3 |
Unplanned transfer to higher level of care, for example ITU | 24 | 51.1 | 42.5 | +8.6 |
Unplanned transfer to another hospital | 5 | 21.7 | 21.7 | 0 |
Unplanned return to surgery | 23 | 56.1 | 53.6 | +2.5 |
Procedure-related organ or internal injury | 10 | 62.5 | 62.5 | 0 |
Complications including CVA, DVT, MI | 40 | 53.3 | 46.7 | +6.6 |
Neurological deficit not present on admission | 24 | 48.0 | 38.5 | +9.5 |
Unexpected death | 24 | 61.5 | 35.9 | +25.6 |
Inappropriate discharge/planning | 49 | 38.0 | 35.7 | +2.3 |
Cardiac or respiratory arrest | 17 | 38.6 | 29.5 | +9.1 |
Injury or complications related to labour/termination of pregnancy | N/A | N/A | N/A | N/A |
Hospital-acquired infection | 108 | 56.5 | 58.6 | –2.1 |
Patient/carer complaint | 44 | 44.0 | 24.0 | +20 |
Litigation documented in the medical record | 2 | 40.0 | 20.0 | +20 |
Other not included in the criteria above | 94 | 42.7 | 34.1 | +8.6 |
In eight screening criteria, the difference in the rate of conversion from positive screening criteria to determined AEs is observed to be < 5%. Moderate differences in the rate of conversion are seen in the identification of AEs arising from criteria such as neurological deficit not present on admission (9.5%) and hospital-acquired accident and injury (7.9%), both of which are higher in nurse-determined AEs. Larger differences are associated with two criteria, patient or carer complaint documented in medical record (20%) and unexpected death (25.6%), both of which are higher in nurse-determined AEs. Trends suggest a number of points:
-
Nurses may lack confidence, and in some cases knowledge, in some of the more clinically technical elements of AE detection.
-
Physicians may not identify AEs if they are not documented in the medical progress notes and may not capture the full spectrum of AEs.
-
The overall AE rate may be higher if there is a balance of disciplines represented in case note review teams.
Part 2: findings from the thematic analysis – developing screening criteria for adverse events
Screening criteria can be viewed as data that, when present, prompt further investigations into whether or not the patient received substandard care leading to less than optimal outcomes. Criteria or triggers broadly reflect harm to patients in inpatient settings. Criteria vary considerably in terms of their characteristics and/or constructs. Some criteria are, by their very definition, AEs, and these include hospital-incurred injury and hospital-acquired infection. Others describe common scenarios such as readmission within 30 days of discharge or unplanned transfer to a higher level of care, which in themselves do not fulfil AE criteria but highlight a situation in which patients might be at increased risk. The closer screening criteria are aligned with AEs, the more accurately risk can be described and the greater the face validity screening tools will have with their users. Interim analysis of RF1 data in the final stages of phase 1 described the variability in the conversion of positive criteria of harm in both nurse- and physician-confirmed AEs (Figure 21).
In 4388 valid episodes of care reviewed, a total of 2225 criteria of harm were identified. Two of the criteria, (1) injury or complications related to termination or labour and delivery (including neonatal complications) and (2) documentation or correspondence indicating litigation, either contemplated or actual, occurred infrequently (four and five times, respectively). Others, such as admissions either side of the index admission under review, occurred frequently (a total of 830 times in the data set). When the conversion of criteria to harm events was examined, a number of key issues were identified. Some criteria by their very name and nature are AEs, for example hospital-acquired infection/sepsis and hospital-incurred patient accident or injury, and the conversion rate to actual confirmed harm was high. Other criteria occurred frequently, but the ‘labelling’ of the criteria did not give any indication of either the risk to the patient or the way in which the patient was harmed and, in these cases, the conversion rate to clinical harm was relatively low. Three criteria were of specific interest because of their (1) high frequency in the data set, (2) relatively low conversion rate and (3) ambiguity around potential or actual clinical harm. The three are as follows:
-
unplanned admission within the 12 months prior to the index admission, as a result of any health-care management
-
unplanned admission to any hospital post this discharge
-
any other undesirable outcome.
In addition, although criterion 3 occurred frequently and had a high conversion rate and therefore featured commonly in organisational signatures of harm, the term ‘hospital-acquired patient accident and injury’ is nebulous, and we felt that this area warranted further exploration and characterisation of these events in the data set. The frequency of these criteria in the data set is outlined in Table 30 and all four criteria underwent thematic analysis to further categorise risk to patients.
Criterion number | Criterion | Frequency in data set | Frequency in physician-confirmed events | Frequency in nurse-determined events | Conversion in physician-confirmed events (%) | Conversion in nurse-determined events (%) |
---|---|---|---|---|---|---|
1 | Admission pre-index admission within a 1-year period | 376 | 93 | 170 | 24.7 | 45.3 |
2 | Admission post-index admission within 30 days | 454 | 115 | 193 | 25.3 | 42.6 |
3 | Hospital-incurred accident or injury | 240 | 112 | 184 | 46.7 | 77.6 |
18 | Other | 220 | 75 | 145 | 34.1 | 65.3 |
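As a minimal sketch of how the conversion figures in Table 30 can be read, the snippet below assumes that the conversion rate is the number of confirmed events divided by the number of times the criterion was positive, expressed as a percentage. Only the physician-confirmed columns are used; the names `TABLE_30` and `conversion_rate` are illustrative and do not come from the study.

```python
# Physician-confirmed columns of Table 30:
# (criterion number, frequency in data set, physician-confirmed events)
TABLE_30 = [
    (1, 376, 93),
    (2, 454, 115),
    (3, 240, 112),
    (18, 220, 75),
]


def conversion_rate(frequency, confirmed):
    """Percentage of positive screens that were confirmed as an AE.

    Assumption: conversion = confirmed events / positive criteria * 100.
    """
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    return 100.0 * confirmed / frequency


# Conversion rate per criterion, rounded to one decimal place
rates = {
    criterion: round(conversion_rate(frequency, confirmed), 1)
    for criterion, frequency, confirmed in TABLE_30
}

for criterion, rate in rates.items():
    print(f"Criterion {criterion}: {rate}% converted to physician-confirmed AEs")
```

Under this reading, the hospital-incurred accident or injury criterion converts to confirmed harm roughly twice as often as either readmission criterion.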
Thematic analysis of criteria 1 and 2, to improve understanding of the association of readmission to secondary care with adverse events
Criteria 1 and 2 were examined together as ‘hospital readmissions’ in order to elucidate points in the patient pathway and the potential reasons that may have resulted in an AE being associated with readmission. A framework was developed through a literature review and causes of harm in readmitted patients from our data set were grouped according to whether the event was attributable to system or health-care professional factors. System factors were subdivided into structural factors (such as the configuration of hospitals and type of health care provided), which may predispose to harm, and logistical factors (such as communication between the interfaces of care). Health-care professional factors were subdivided into the constituent parts of care provision including pre-assessment, diagnosis, investigations, treatment and discharge categories. An initial examination and coding of data facilitated the development of the initial structure to categorise the causes of readmissions. Readmissions were deemed to be either planned or unplanned. Planned readmissions are generally not the result of iatrogenic harm, but a predictable part of routine care. Unplanned readmissions were categorised either as ‘failures in care’ or as arising from the ‘nature of disease’ (Figure 22). The former involves causes of AEs that are amenable to intervention, whereas the latter includes issues of patient care that are difficult to circumvent as a result of natural disease states and progression.
A total of 54% of unplanned readmissions across both criteria were a result of AEs potentially amenable to intervention, compared with 42% within the planned readmissions cohort. There was a statistically significant (p < 0.005) difference between the percentage of patients in the ‘failure of care’ and ‘nature of disease’ categories who were identified as having had an AE.
Figures 23 and 24 illustrate the breakdown of these categories into their subcategories, as generated through the thematic analysis. Figure 23 illustrates the 10 subcategories relating to ‘failure of care’ events, each of which identifies areas in the patients’ management that could potentially be addressed to prevent readmission. Figure 24 shows the six subcategories derived under the ‘nature of disease’ category. Frequency is reported by each category along with the attendant conversion rate to determined harm events. Further explanation of these classifications can be found in Appendix 5.
Identifying acts of omission resulting in failures in care and hospital readmission
Many categories reported as ‘failure of care’ were described in the other parts of the RF1 and these AEs were already ‘labelled’ in other clinical criteria such as hospital-acquired infection and patient complications. The ‘premature discharge’ category, which emerged through the thematic analysis, is an example of an organisational issue, which again is covered in criterion 12 of the RF1, defined as ‘inappropriately discharging patients or inadequately planning their discharge’. However, our analysis of readmissions did uncover causes of readmission attributable to ‘diagnostic issues’, ‘ineffective symptom management’ and ‘prolonged waiting times’ which were not effectively covered in other parts of the form and together account for around 63% of the ‘failure of care’ categorised readmissions. Incorporating these areas into the screening criteria we believe covers important areas of acts of omission, which will further enhance the routine characterisation of harm in NHS organisations.
Thematic analysis of criterion 3, unintended patient accident or injury, and criterion 18, any other undesirable outcome
The same methodological process was employed earlier in phase 1 to explore criterion 3 (‘unintended patient accident or injury’) and criterion 18 (‘other undesirable outcomes’). The analysis was undertaken on 3043 inpatient episodes and, at this point of data collection, 171 episodes of care fell under criterion 3 and 133 under criterion 18. Cases were excluded if the research nurse had not determined the positive screen to contain an AE, providing a pool for thematic analysis of 136 and 100 cases, respectively. Figure 25 shows the flow chart of included cases. Case reports were included within multiple themes if more than one AE was recorded.
During thematic analysis, narratives that were classified into criteria 3 or 18 were determined as being misclassified if they fitted into another of the 16 remaining criteria more appropriately. These records were excluded from the development of the thematic framework, as they did not contribute knowledge to the development of new themes or categorisation of harm events. We believe that this misclassification may have arisen because (1) there was limited space for narratives of clinical summaries to be written in the RF1 and (2) these events were secondary and more minor events in patients who experienced more than one event during the inpatient episode.
Characterisation of adverse events in criteria 3 and 18
Figures 26 and 27 illustrate the breakdown of these categories into their subcategories, as generated through the thematic analysis. Figure 26 illustrates themes that are classified into the main cause of AEs within criterion 3. The categorisation is straightforward, with the majority of events falling into the categories of falls, pressure damage including pressure sores and injuries caused by equipment.
Twenty-five records indicated a pressure sore that developed during the inpatient stay. Clear diagnostic process meant that these cases were well defined, often with the direct keywords ‘pressure sore’. Falls in hospital were more complex, and 57 clinical summaries described a fall event. This theme was further subdivided into records that showed a history or predisposition to falls and those that did not. The majority were in patients who were in the former group, who were predisposed to falls because of a past medical history of falls, sedation or instability. Keywords included ‘fall’ and ‘slip’ or ‘found on floor’.
Sixteen patient records indicated injury attributable to equipment use. This theme is further divided into injury caused by equipment that was either invasive or non-invasive. The majority of invasive harm was a result of intravenous (i.v.) cannula (BD Venflon™, BD, Oxford, UK) use, and phlebitis and cellulitis were the adverse outcomes. This subtheme also includes complications from procedures arising from traumatic catheterisation and drain insertion. Non-invasive causes of harm include those from dressings and masks. In all cases, the causation for injury was understood and clearly stated. Seven cases were considered miscellaneous in nature, as they did not conform to other coding because of content or frequency, which reflects the kinds of injury that may occur during an inpatient episode.
Characterisation of adverse events in criterion 18: other undesirable outcomes
The framework developed through thematic analysis of clinical summaries relating to other undesirable outcomes is shown in Figure 27. Irregularities and omissions in written communication leading to delayed initiation of treatment made up the largest percentage of incidents leading to an AE. In total, 19 records indicated that the patient experienced a delay in treatment during their stay. The staff-related subtheme was subdivided further into AEs caused by staff shortages, and those caused by delays in the process of clerking, including delay in medication prescribing and response by the on-call team. Resource constraints include bed availability, drugs and procedures, which contributed to the experienced delays either directly or because of cancellations. Communication was the largest theme in this group and themes were categorised into failure in both verbal and written communication. Thirty-four records were considered more applicable to alternative criteria in the RF1 and, again, this probably indicates that these were secondary events.
Criterion 18 contained more staff-/organisational-level issues than criterion 3. The documented events were predominantly errors arising during the process of admission and inpatient stay that were caused by staff and communication failures, rather than harm caused directly by health-care intervention and management. Examples of these classifications can be found in Appendix 6.
Acts of omission featured dominantly, informing the addition of several screening criteria to the proposed new tool in order to improve the balance between acts of omission and acts of commission.
The assessment of a time-limited approach to case note review
The time taken to review inpatient episodes of care is an important feature of case note methodology, particularly when it is used routinely in clinical practice. The Harvard method is non-time limited, whereas the GTT review restricts the time taken to examine the record of care to 20 minutes. As data were collected on time to review the inpatient record as part of the RF1 process, we calculated AE percentages by time taken to comprehensively screen the inpatient record. Figure 28 demonstrates that the review time is associated with the percentage of AEs identified increasing sequentially by 10-minute periods until 60 minutes, at which point the percentage of AEs drops. A total of 72% (3150 of 4388) of reviews were completed within 20 minutes, and the percentage of AEs detected in these cases was lower than in the total sample (6.3% vs. 10.3%). Reviews taking 20–30, 30–40, 40–50 and 50–60 minutes were associated with percentages of 18%, 23%, 39% and 49%, respectively. In reviews that took in excess of 60 minutes, the percentage of inpatient episodes with an AE dropped to 23.8%.
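The banding described above can be sketched as follows. This is an illustration only: the review records are hypothetical, the function names are ours, and the treatment of boundary values (a review of exactly 20 minutes is placed in the 10–20 band) is an assumption, as the report does not state how boundaries were handled. The final line simply re-derives the reported share of reviews completed within 20 minutes.

```python
import math
from collections import defaultdict


def band_label(minutes, width=10, cap=60):
    """Return the review-time band for a review, e.g. 25 -> '20-30'.

    Assumption: bands are upper-inclusive and anything above `cap`
    minutes is pooled into a single '>60' band.
    """
    if minutes > cap:
        return f">{cap}"
    upper = max(width, math.ceil(minutes / width) * width)
    return f"{upper - width}-{upper}"


def ae_percentage_by_band(reviews):
    """Percentage of reviews containing an AE within each time band.

    `reviews` is an iterable of (minutes_taken, has_ae) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # band -> [reviews, reviews with an AE]
    for minutes, has_ae in reviews:
        tally = counts[band_label(minutes)]
        tally[0] += 1
        tally[1] += int(has_ae)
    return {band: 100.0 * with_ae / total for band, (total, with_ae) in counts.items()}


# Hypothetical review records, NOT study data
sample_reviews = [
    (12, False), (18, False), (20, True),
    (25, False), (28, True),
    (45, True),
    (75, False),
]
by_band = ae_percentage_by_band(sample_reviews)

# Consistency check on the reported figure: 3150 of 4388 reviews within 20 minutes
share_within_20 = round(100 * 3150 / 4388)
```

Grouping by band rather than applying a single 20-minute cut-off makes it possible to see where AE detection rises and where it falls away again.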
Age was explored as a potential explanatory variable in increasing the length of time taken to complete the review because of increasing complex care (Figure 29). In the time periods up to 60 minutes, there is no significant difference in mean age by length of time to review the episode of care. In the > 60 minutes group, the age difference is significant. This group, albeit small (n = 21), had a mean age of 83 years, and does probably reflect a group with comorbidity and complex episodes of care, thereby increasing the length of time to undertake the review.
This has important implications for case note review methodology. The majority of reviews can be completed within 20 minutes, but relying on a time-limited approach may mean that not all relevant information is considered or that AEs are inappropriately attributed to positive criteria or triggers. AEs may also be missed, as cases with AEs can take significantly longer than 20 minutes to assess and to reach a robust decision on whether or not an AE is present.
The development of the Harm2 tool
The thematic analysis generated acts of omission for inclusion in the amended screening criteria and this, along with the examination of different tools in practice, informed the development of the Harm2 tool, which included the following key components.
This initial section draws heavily on the current Harvard method, documenting demographic and type and source of admission before undertaking a complete assessment of the quality of documentation contained in the inpatient record. It is further supplemented by categorising the specialty at time of initial ward admission and the identification of patients with a mental health diagnosis or a learning disability, the last on request of mental health colleagues. We introduced the Charlson Comorbidity Index as a measure of comorbidity in order to assess, in a large cohort of patients, comorbid states, which might increase the likelihood of any one individual experiencing an AE during an inpatient episode. All this background information was intended to be readily and easily available, and extractable from the original clerking entry in the medical record.
Informed by the thematic analysis, the harm screening assessment in Harm2 incorporated important new categories of both acts of omission and commission. The new criteria or subdivisions are described in Table 31.
Criteria number | Criteria |
---|---|
1 | Unplanned admission within 30 days before index admission, as a result of any health-care management |
2 | Unplanned admission to any hospital within 30 days of this discharge |
3 | Hospital-incurred accident or injury |
| | Pressure ulcer |
| | Fall |
| | Equipment-related accident or injury |
| | Other not covered above |
4 | Adverse drug reaction, side effect or drug error |
5 | Unplanned transfer from general care to intensive care/high-dependency unit |
6 | Unplanned transfer to another acute care hospital |
7 | Unplanned return to theatre in this admission |
8 | Unplanned removal, injury or repair of organ or structure during surgery, invasive procedure or vaginal delivery |
9 | Other patient complications to include MI, DVT, CVA |
10 | Development of neurological deficit not present on admission |
11 | Unexpected death or referral to coroner |
12 | Inappropriate discharge home, inadequate discharge plan |
13 | Cardiac or respiratory arrest |
14 | Hospital-acquired infection |
| | Wound infection |
| | HAP |
| | Sepsis |
| | Other |
15 | Patient/family dissatisfaction with care received documented in the medical record and/or evidence of complaint lodged |
16 | Delays in or cancellation of treatment |
17 | Evidence of inappropriate decision-making regarding the treatment or intervention the patient received |
18 | Problems relating to communication |
19 | Problems relating to documentation |
20 | Missed, delayed or incorrect diagnosis |
21 | Any other undesirable outcomes not covered by any other criteria |
The ‘determination of harm events’ and ‘identification of problems in care’ follow the screening process. For the sake of simplicity, the severity of harm is categorised using the NCC MERP’s Categorizing Medication Errors Index44 (GTT), and problems in care are derived from section D of the MRF2.30,31 The form was simplified to allow multiple AEs to be entered separately, facilitating straightforward identification of patients experiencing multiple events sequentially. In cases in which harm was determined, a short narrative section asks reviewers to provide a brief summary of the inpatient episode, giving clinical context to the harm event. In the event of no harm, the review is complete at this stage.
Reviewers are given options in a tick-box grid to identify the point in the patient’s pathway at which the harm event occurred before identifying, again by ticking, any contributory factors leading to the poor outcome for the patient. Assessment of preventability is taken directly from the MRF2 and is a six-point Likert scale, with an option to provide narrative to highlight any key issues, messages or points of organisational learning relating to the event.
As a decision had been made to include a large number of deceased patients in our phase 2 sample, we added a small section that is completed only for patients whose inpatient episode resulted in death. Statements were taken from the PRISM tool20,56 in order to assess whether or not a simplified process could generate data that are benchmarkable with robust tools in identifying (1) whether or not the patient’s death was caused by a problem or problems in care, (2) whether or not problems in care contributed to the patient’s death and (3) whether or not there was any evidence that the death was preventable.
Reviewers were finally asked to consider all that they know about this patient’s admission and rate the overall quality of care received by this patient from the health-care provider.
The combination of a condensed gold standard measurement structure and a process in line with current GTT methodology seemed an appropriate way to implement a harm monitoring process for organisations. Harm2 was designed for routine use within a climate of organisational financial constraints but organisational commitment to track and monitor trends of harm in their organisations through both rates of AEs and signatures of harm (a copy of the Harm2 tool is found in Appendix 7).
Summary of key points arising from Chapter 3
-
In total, 18.4% of records reviewed were found to have an AE present when determined by nurse reviewers (95% CI 17.33% to 19.63%). When AEs were stratified by the nurse’s level of confidence in his or her determination of the event, rates of harm fell, and in the categories of ‘somewhat confident’ (4) and ‘very confident’ (5), the AE rate was 10.4%; this was comparable to the physician-determined rate of 10.3%.
-
Agreement on the determination of an AE was higher in the nursing cohort (κ = 0.56, 95% CI 0.46 to 0.66; 87% agreement).
-
Interim analysis of RF1 data in the final stages of phase 1 described the variability in the conversion of positive criteria of harm in both nurse- and physician-confirmed AEs and thematic analysis was undertaken for the categories of readmission, hospital-incurred injuries and ‘other’ not meeting any other criteria.
-
Our analysis of readmissions uncovered causes of readmission attributable to ‘diagnostic issues’, ‘ineffective symptom management’ and ‘prolonged waiting times’, which were not effectively covered in other parts of the form and together account for around 63% of the failure of care-categorised readmissions.
-
The injury and other events analysis indicated that events were predominantly errors arising during the process of admission and inpatient stay. Errors were typically due to staff and communication failures (such as staffing levels and shortages, and delays in the process of checking, including tardiness in chart authorisation and response by the on-call team) rather than directly to health-care intervention and management. Acts of omission featured dominantly, informing the addition of several screening criteria in the proposed new method.
-
Incorporating these areas into the screening criteria we believe covers important areas of acts of omission, which will further enhance the routine characterisation of harm in NHS organisations.
-
The overall AE rate may be higher if there is a balance of disciplines represented in case note review teams.
-
A time-limited approach may have significant implications for the identification of AEs as records that take longer to review are more likely to contain them.
Chapter 4 The implementation of a harm surveillance system using the Harm2 tool
Aim 3
To test the Harm2 tool to monitor trends and improvement over time.
Objectives
-
To train and supervise a team of research nurses to use the Harm2 tool to undertake medical record assessment, screening for AEs and subsequent determination of AEs in the identified inpatient episode.
-
To compare the occurrence of AEs in a random sample of discharged patients with a cohort of patients in whom the inpatient episode ended in death.
-
To compare the performance of Harm2 with the two-stage retrospective case note review process.
-
To assess the intermethod reliability (parallel form reliability) of AE identification and AE outcomes (such as severity and preventability of AEs) in a 10% sample of records that have been reviewed using the gold standard two-stage retrospective review process and the Harm2 tool.
-
To assess the inter-rater reliability (concordance among raters) of key outcomes using the Harm2 tool, by double-reviewing 10% of all case notes reviewed.
-
To determine the feasibility and acceptability of this approach among key stakeholders.
Methods
A one-stage harm screening and determination method was implemented across Welsh health boards in order to quantify the nature and extent of AEs occurring at the individual organisational level and national level across NHS Wales.
The review tool and training
The Harm2 tool, a hybrid of Harvard method AE determination and GTT one-stage process, was developed alongside a 39-page guide on how to use the tool, the definitions of the criteria and a guide to identifying an AE. An example of the instructions given to the reviewers to identify AEs can be found in Appendix 8. Reviewers (n = 24), predominantly from a nursing background (n = 22), were commissioned from the NISCHR CRC and had previous experience of screening medical records for criteria of harm. Training in the use of the Harm2 tool was both instructive and experiential; didactic teaching in the components of the tool and supervised review sessions were included. Reviewers were taught where to look for screening criteria and the technique of case note review. The focus was on the discharge summary, nursing and medical documentation, the medication/prescription chart, laboratory results for that admission, operative/theatre documentation and any other documentation in the case notes. Initial training included outlining the basic principles of AE determination and preventability. In determining judgement, reviewers were asked to consider the patient’s overall health and whether or not the AE arose as a result of an error/system failure. Face-to-face meetings or teleconferences were held on a regular basis to discuss challenging examples and document outcomes to feed back to the wider team.
Harm and preventability determination
Harm/AEs were defined as an ‘unintended injury or complication causing temporary or permanent disability and/or increased LOS resulting from health care’, and preventability was assessed on a six-point Likert scale, as reported previously in phase 1.
Injury severity was rated using categories E to I of the NCC MERP’s Categorizing Medication Errors Index,44 where E is temporary harm to the patient requiring intervention and I is patient death. The location, principal problems in care and the identification of any contributory factors were recorded, along with patient comorbidity, age, length of hospital stay, specialty treated within and admission status. Overall quality of care was assessed on a 1–5 scale, with 1 being excellent and 5 being poor.
Study sample
The Harm2 tool sampling frame (n = 5000) included 20 random discharges and 10 randomly selected admissions that ended in patient death from each study site using explicit case note selection criteria each month. In order to locate 20 sets of case notes, organisations were advised to produce a list of 30 hospital numbers randomly selected (from the list of all admissions in the specified month) using a random number generator that was provided by the research team. The 10 deceased cases were randomly selected in the same manner from the list of deceased patients in the month specified.
The study selection criteria for the 20 random admissions cases are as follows:
-
20 randomly selected patients admitted in the specified month with a LOS of > 24 hours
-
patient aged > 1 year on the date of admission.
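A minimal sketch of the monthly selection step is given below. It is not the random number generator supplied by the research team, which is not described in this report; the record fields (`hospital_number`, `los_hours`, `age_years`) and the function names are hypothetical, and applying the eligibility criteria before the draw is our assumption. A list of 30 hospital numbers is drawn so that 20 sets of case notes can be located.

```python
import random


def is_eligible(admission):
    """Selection criteria for the random discharge sample:
    LOS of more than 24 hours and age over 1 year on admission."""
    return admission["los_hours"] > 24 and admission["age_years"] > 1


def draw_hospital_numbers(admissions, n=30, seed=None):
    """Randomly draw up to `n` hospital numbers without replacement.

    Assumption: ineligible admissions are removed before the draw.
    A fixed `seed` makes the draw reproducible for audit purposes.
    """
    rng = random.Random(seed)
    pool = [a["hospital_number"] for a in admissions if is_eligible(a)]
    return rng.sample(pool, min(n, len(pool)))


# Hypothetical month of admissions, NOT study data
month_admissions = [
    {"hospital_number": f"H{i:04d}", "los_hours": 6 + 3 * i, "age_years": i % 90}
    for i in range(200)
]
selected = draw_hospital_numbers(month_admissions, n=30, seed=2014)
```

The same function could be applied to the list of deceased patients for the month by passing `n=10`.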
We did not exclude obstetric or paediatric cases as we estimated that the numbers would be small, but future examination would highlight issues with using a generic tool across the whole acute sector, which could be explored further in subsequent work.
In 9 of the 12 health boards involved in phase 1 of the study, hospitals agreed to ongoing involvement, whereas in another two cases the health boards requested that other hospital sites participate in this phase. One health board, which had only one hospital site involved in phase 1, increased its involvement to two sites in phase 2. The 50–50 split between tertiary-level services and district general hospitals remained.
The review process
Phase 2 commenced as a pilot in two organisations in April 2014, and across the 12 study sites in July 2014, on inpatient discharges randomised from January 2014.
The process was condensed to a one-stage review process that was logistically simple. Arrangements were negotiated with the health boards by the research office and a schedule of reviews was planned for the duration of the study. Three research nurse teams covered the three geographic areas of Wales (north, south-east and south-west) and individual review sessions were undertaken in teams of 2–4 reviewers. In most cases, the schedule of 30 reviews would be undertaken in 1 day per hospital per month. Once the notes were made available, the research team visited the NHS site and completed the review using the Harm2 tool. A thorough assessment was made of the completeness of the medical record, and when the nursing and medical documentation was not available the review was stopped. The review process involved the review of the episode of care and any potentially relevant admissions either side of the index admission and assessment for the presence of 21 criteria known to be sensitive to the occurrence of AEs. If screening criteria were positive, the review continued to an assessment of the nature, severity and impact of the event.
After completion of the reviews, a summary sheet was completed by the research nurse. This detailed the number of reviews undertaken, any potential AEs identified and any requests for second opinions of the inpatient episode within the organisation, ensuring that serious harm was flagged within individual organisations. The use of the Harm2 tool was monitored both by inspection of completed forms when returned to the research office and through regular liaison with researchers and regional leads.
Intermethod and inter-rater reliability of the Harm2 tool
Ten per cent of all reviewed records were assessed using the one-step Harm2 tool and the two-stage gold standard approach. Intermethod reliability assessed the degree to which key outcomes were consistent from one review tool to the next. This involved the review teams also completing the RF1 screening tool from the Harvard method after the Harm2 review had been completed. When positive criteria were identified in the RF1, the record was referred to a research physician, who completed a MRF2 assessment, confirming the presence of AEs and categorising the nature, severity, contributory factors and preventability of the event. This formed the basis of the comparative analysis of the conversion rate from criterion of harm and the determination of AEs from both the Harm2 tool and the Harvard method. The inter-rater reliability of the Harm2 tool was assessed on 10% of inpatient episodes to assess the agreement or concordance of key outcomes by different reviewers. Both forms of reliability studies were arranged on a random selection of reviews on a quarterly basis.
Statistical analysis
The presence of harm in the episode of care was the primary study outcome. The Harm2 tool review data were analysed to determine the percentage and corresponding CI of patients who experienced an AE during the period of hospitalisation, and the severity and level of preventability of the incident, both in individual hospital sites and nationally. Missing data were assessed and, because they were infrequent, were presumed to be missing at random; analyses were carried out on the available data set. A weighted average of the percentage of AEs across study sites was calculated using a simple approach, in which the weighted site values were added together and divided by the number of sites.
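The averaging step is described only briefly, so the sketch below should be read as one possible interpretation rather than the calculation actually performed. It contrasts a simple mean of site percentages with a mean weighted by the number of reviews at each site. The sample input is the first six rows of Table 34 (reviews and reviews with a positive criterion), used purely for illustration because site-level AE counts are not listed at this point in the report; all function names are ours.

```python
def site_percentage(events, reviews):
    """Percentage of reviews at a site in which the event was present."""
    return 100.0 * events / reviews


def simple_mean(percentages):
    """Unweighted mean: every site counts equally."""
    return sum(percentages) / len(percentages)


def weighted_mean(percentages, weights):
    """Mean of site percentages weighted by, for example, number of reviews."""
    if not weights or len(percentages) != len(weights):
        raise ValueError("percentages and weights must be non-empty and equal in length")
    return sum(p * w for p, w in zip(percentages, weights)) / sum(weights)


# Illustrative input: (number of reviews, reviews with a positive criterion)
# for sites 1-6 of Table 34
SAMPLE_SITES = [(287, 67), (285, 82), (292, 117), (254, 103), (236, 86), (249, 84)]

percentages = [site_percentage(positive, n) for n, positive in SAMPLE_SITES]
weights = [n for n, _ in SAMPLE_SITES]

unweighted = simple_mean(percentages)
weighted = weighted_mean(percentages, weights)
```

When the weights are the review counts, the weighted mean equals the pooled percentage across sites, which is why the two averages differ only slightly when sites contribute similar numbers of reviews.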
Univariate analysis explored the association between demographic and chronic disease states and the subsequent development of harm events. Descriptive statistics were used to describe and compare the nature and severity of AEs in the random and deceased patient cohorts. An unweighted kappa coefficient, with corresponding 95% CI, was used to assess inter-rater reliability, together with the percentages of records that were concordant and discordant. When samples were independent, z-scores and corresponding p-values were used to compare percentages. Statistical analysis was undertaken using Stata, version 14.
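The two statistics named above can be sketched in a few lines. The study analysis was run in Stata, so the Python below is a stand-in for illustration, not the code that was used; the ratings and counts are hypothetical, and the z-test shown is the standard pooled two-proportion test, which we assume is the comparison intended.

```python
import math


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa and observed agreement for two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and equal in length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    # Agreement expected by chance from each rater's marginal proportions
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    if expected == 1.0:
        return 1.0, observed  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected), observed


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z-test for two independent proportions (two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical double-reviewed records: 1 = AE determined, 0 = no AE
reviewer_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
reviewer_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa, agreement = cohen_kappa(reviewer_1, reviewer_2)

# Hypothetical counts: 60 of 100 versus 40 of 100
z, p = two_proportion_z(60, 100, 40, 100)
```

Reporting the percentage of concordant records alongside kappa is useful because kappa corrects for chance agreement and can look modest even when raw agreement is high.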
Findings: the implementation of a harm surveillance system using the Harm2 tool
Aim 3
To test the Harm2 tool to monitor trends and improvement over time.
Findings are presented in four parts. Part 1 describes the study sample and the rate, preventability and severity of AEs identified through the Harm2 tool. Part 2 re-examines the screening criteria in the Harm2 tool and their association with AE determination. Part 3 outlines factors independently associated with AE determination in the Harm2 sample before part 4 outlines the nature and characteristics of AEs identified by the Harm2 tool and differences in the random discharge and deceased patient reviews.
Part 1: the rate, preventability and severity of adverse events identified through the Harm2 tool
Study sample and demographics
Twelve NHS organisations participated in phase 2 of the study, nine of which had previously contributed data to phase 1 of the study. Two of the health boards requested that different hospital sites submit data for this phase, and the health board that did not fully participate in phase 1 committed to this phase of the study.
The active data collection was shorter in phase 2 of the study, running for a total of 16 months, between May 2014 and September 2015. The number of episodes of care reviewed during this period was 4396. As we were less dependent on the study sites pulling notes specifically for the purpose of the GTT reviews, there was less variation in the number of reviews submitted from each NHS site (Table 32). We did encounter issues with capacity to undertake the full schedule of reviews in some regions, which in all but one site (site 5) were circumvented. The identification of capacity issues within the NISCHR CRC infrastructure meant that pragmatic decisions needed to be taken in terms of reducing the schedule of reviews in deceased patients by 50% at the end of 2014. By the end of phase 2, 3352 case note reviews had been undertaken in the random sample of discharges and 1044 in deceased patients.
Site | Number of discharge reviews | Number of discharge reviews included in analysis | Number of deceased patient reviews | Number of deceased patient reviews included in analysis | Total number of reviews included in analysis |
---|---|---|---|---|---|
1 | 289 | 287 | 96 | 92 | 379 |
2 | 293 | 285 | 82 | 82 | 367 |
3 | 298 | 292 | 103 | 101 | 393 |
4 | 273 | 254 | 106 | 100 | 354 |
5 | 237 | 236 | 86 | 86 | 322 |
6 | 255 | 249 | 75 | 72 | 321 |
7 | 290 | 290 | 89 | 88 | 378 |
8 | 290 | 282 | 57 | 57 | 339 |
9 | 264 | 252 | 79 | 77 | 329 |
10 | 266 | 259 | 97 | 91 | 350 |
11 | 297 | 296 | 88 | 87 | 383 |
12 | 300 | 297 | 86 | 85 | 382 |
Total | 3352 | 3279 | 1044 | 1018 | 4297 |
As in phase 1, a number of reviews were excluded from the sample as the data were reviewed and quality checked. Ninety-nine episodes of care were excluded from the final analysis on the basis of (1) errors in the patient record retrieved for review, (2) incomplete documentation in the patient record preventing the completion of the review and (3) the set of case notes not meeting phase 2 inclusion criteria.
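The exclusion count can be reconciled against Table 32 with simple arithmetic, as in the sketch below. The figures are transcribed from the table; the variable names are ours.

```python
# Table 32 rows: (site, discharge reviews, discharge reviews included,
#                 deceased patient reviews, deceased patient reviews included)
TABLE_32 = [
    (1, 289, 287, 96, 92),
    (2, 293, 285, 82, 82),
    (3, 298, 292, 103, 101),
    (4, 273, 254, 106, 100),
    (5, 237, 236, 86, 86),
    (6, 255, 249, 75, 72),
    (7, 290, 290, 89, 88),
    (8, 290, 282, 57, 57),
    (9, 264, 252, 79, 77),
    (10, 266, 259, 97, 91),
    (11, 297, 296, 88, 87),
    (12, 300, 297, 86, 85),
]

# Reviews undertaken and reviews retained for analysis, across both cohorts
undertaken = sum(discharge + deceased for _, discharge, _, deceased, _ in TABLE_32)
included = sum(d_inc + p_inc for _, _, d_inc, _, p_inc in TABLE_32)

# Episodes of care removed during quality checking
excluded = undertaken - included

# Exclusions by cohort
excluded_discharge = sum(d - d_inc for _, d, d_inc, _, _ in TABLE_32)
excluded_deceased = sum(p - p_inc for _, _, _, p, p_inc in TABLE_32)
```

The totals agree with the text: 4396 reviews undertaken, 4297 included and 99 excluded, of which 73 were discharge reviews and 26 were deceased patient reviews.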
Demographic profile of study population
Valid observations on key demographic characteristics were available for 4365 episodes of care (99.3%) and the mean age of the patient during the index admission under review was 56.5 years in the randomly selected discharge reviews cohort (95% CI 55.67 to 57.33 years) and was significantly higher, with a mean of 78.8 years (95% CI 78.05 to 79.65 years), in the deceased patient cohort.
The random sample of inpatient episodes studied was more in line with the demographic distribution of hospitalisations across NHS Wales, where 43% of the discharges occur in the over-65 years age group. In this phase of the study, 46% of reviews in the randomly selected discharge reviews cohort and 88% of the reviews in the deceased patient cohort were in patients over the age of 65 years during the inpatient episode examined (Figure 30).
The gender breakdown in both cohorts is outlined in Table 33 and is more equal in the deceased patient cohort than in the randomly sampled discharge reviews. Gender was not reported in 50 reviews undertaken (1.2%).
Sex | Discharge reviews: frequency | Discharge reviews: per cent | Deceased patient reviews: frequency | Deceased patient reviews: per cent |
---|---|---|---|---|
Male | 1346 | 41.0 | 494 | 48.5 |
Female | 1900 | 57.9 | 507 | 49.8 |
Unknown | 33 | 1.0 | 17 | 1.7 |
Total | 3279 | | 1018 | |
Presence of positive criterion of harm in the randomly selected discharge reviews data set
Across the 12 NHS study sites, 34.25% (95% CI 30.39% to 38.11%) of all episodes of care reviewed had at least one positive criterion for harm, with a range seen across individual organisations of between 23.34% and 45.79% and a corresponding median percentage of 35.1% (SD 6.58%) (Table 34).
Site | Number of reviews | Frequency of positive criteria present | Per cent of reviews with criteria present |
---|---|---|---|
1 | 287 | 67 | 23.34 |
2 | 285 | 82 | 28.77 |
3 | 292 | 117 | 40.06 |
4 | 254 | 103 | 40.55 |
5 | 236 | 86 | 36.44 |
6 | 249 | 84 | 33.73 |
7 | 290 | 86 | 29.66 |
8 | 282 | 110 | 39.01 |
9 | 252 | 71 | 28.17 |
10 | 259 | 69 | 26.64 |
11 | 296 | 112 | 37.84 |
12 | 297 | 136 | 45.79 |
Total | 3279 | 1123 | 34.25 |
The percentage of adverse events in 3279 randomly selected discharge reviews and 1018 randomly selected deceased patient reviews across NHS Wales (2014–15) using the Harm2 tool
In the randomly selected discharge review sample, at least one AE was determined in 11.3% of all episodes of care (95% CI 10.22% to 12.40%), increasing to 11.68% (95% CI 10.58% to 12.78%) when weighted for individual sites’ proportional contribution to the study sample. Multiple AEs during a single episode of care were common. In these 371 episodes of care, a total of 497 AEs were identified, with one-quarter of patients having at least one other AE identified. This was higher than the 10% with multiple AEs detected using the Harvard method in phase 1 because we made it easier to record multiple events separately along with their key characteristics in the Harm2 tool.
Table 35 reports on the percentage of AEs identified in all of the 12 participating NHS sites, with the range of patients identified with AEs of between 8.7% and 14.86% and a median percentage of 10.72% (SD 2.06%).
Site | Discharge reviews: number of reviews | Discharge reviews: frequency of AEs | Discharge reviews: per cent of reviews with AEs | Deceased patient reviews: number of reviews | Deceased patient reviews: frequency of AEs | Deceased patient reviews: per cent of reviews with AEs |
---|---|---|---|---|---|---|
1 | 287 | 25 | 8.71 | 92 | 31 | 33.70 |
2 | 285 | 28 | 9.82 | 82 | 27 | 31.40 |
3 | 292 | 40 | 13.70 | 101 | 18 | 25.0 |
4 | 254 | 33 | 12.99 | 100 | 30 | 34.09 |
5 | 236 | 34 | 14.41 | 86 | 22 | 38.60 |
6 | 249 | 37 | 14.86 | 72 | 21 | 27.27 |
7 | 290 | 26 | 9.00 | 88 | 33 | 37.36 |
8 | 282 | 30 | 10.64 | 57 | 28 | 33.33 |
9 | 252 | 24 | 9.52 | 77 | 20 | 23.53 |
10 | 259 | 26 | 10.04 | 91 | 33 | 40.74 |
11 | 296 | 32 | 10.81 | 87 | 24 | 23.76 |
12 | 297 | 36 | 12.12 | 85 | 28 | 28.00 |
Total | 3279 | 371 | 11.31 | 1018 | 315 | 30.97 |
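The site-level summary statistics quoted for the discharge reviews in Table 35 can be checked with a few lines of Python. This is a minimal sketch, not the authors' analysis code: the variable names are illustrative, and the use of the population standard deviation is an assumption, as the report does not state which form of SD was calculated.

```python
from statistics import median, pstdev

# Per cent of discharge reviews with at least one AE, sites 1 to 12 (Table 35).
site_ae_percent = [8.71, 9.82, 13.70, 12.99, 14.41, 14.86,
                   9.00, 10.64, 9.52, 10.04, 10.81, 12.12]

# Median of the 12 site-level percentages.
site_median = median(site_ae_percent)

# Assumption: population SD (divide by n); the report does not say which was used.
site_sd = pstdev(site_ae_percent)

print(f"median = {site_median:.3f}%, SD = {site_sd:.3f}%")
```

Under this assumption the values agree closely with the reported median of 10.72% and SD of 2.06%.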
In the inpatient episodes examined that ended in patient death, at least one AE was determined in 30.1% of all episodes of care (95% CI 28.13% to 33.82%). This is almost a threefold increase in the percentage of patients experiencing an AE when compared with the randomly selected discharge reviews, indicating that AEs occur, and are detected, more frequently in this group of patients. There was also considerable variation in the percentage of patients with AEs, ranging from 23.76% in site 11 to 40.7% in site 10 (median percentage 32.37%, SD 5.63%). The reasons for such significant variation are likely to be multifactorial, reflecting issues such as the percentage of the general population who access secondary health-care services for end-of-life care and the complexity of services provided. As in the randomly selected discharge reviews, multiple AEs during a single episode of care were common in the deceased patient cohort. In the 315 episodes of care in which AEs are reported, a total of 512 AEs were identified, with 45% of patients having more than one AE.
The severity of adverse events in 3279 randomly selected discharge reviews and 1018 deceased patient reviews across NHS Wales (2014–15) using the Harm2 tool
The severity of all 497 AEs in the randomly selected discharge reviews was rated according to the NCC MERP’s Categorising Medication Errors Index and the breakdown is shown in Table 36. Ninety-six per cent of the events identified required intervention or subsequent readmission to manage the clinical sequelae of the event and < 4% of all harm events were categorised as resulting in permanent patient harm or serious clinical outcomes, including the death of the patient. Examples of the clinical context giving rise to the Harm2-determined AEs falling into each NCC MERP index category can be found in Appendix 9.
Severity | AE (n): 1 | AE (n): 2 | AE (n): 3 | AE (n): 4 | Total number | Per cent of total |
---|---|---|---|---|---|---|
E | 197 | 37 | 13 | 1 | 248 | 49.9 |
F | 161 | 51 | 17 | 1 | 230 | 46.3 |
G | 4 | 2 | 1 | 1 | 8 | 1.6 |
H | 3 | 1 | – | – | 4 | 0.8 |
I | 6 | 1 | – | – | 7 | 1.4 |
Total | 371 | 92 | 31 | 3 | 497 | |
The severity of AEs was also rated according to the NCC MERP’s Categorizing Medication Errors Index (reference 44) in the deceased patient cohort and the breakdown is shown in Table 37.
Severity | AE (n): 1 | AE (n): 2 | AE (n): 3 | AE (n): 4 | Total number | Per cent of total |
---|---|---|---|---|---|---|
E | 131 | 38 | 21 | 1 | 191 | 37.3 |
F | 49 | 29 | 10 | – | 88 | 17.2 |
G | 14 | 10 | 2 | 1 | 27 | 5.3 |
H | 8 | 8 | 1 | – | 17 | 3.3 |
I | 110 | 48 | 18 | – | 176 | 34.4 |
Unknown | 3 | 8 | 2 | – | 13 | 2.5 |
Total | 315 | 141 | 54 | 2 | 512 |
The reported levels of AE severity differed significantly between the randomly selected discharge reviews and the deceased patient reviews (Figure 31). The proportion of all events falling into the lower levels of severity, namely E (temporary harm) and F (temporary harm requiring further inpatient management), was 96.3% in the randomly selected discharge reviews, compared with just over half in the deceased patient reviews (54.5%, z-score 15.4; p < 0.0001). Perhaps more significantly, although the AE was determined to have contributed to the death of the patient in 1.4% of cases in the randomly selected discharge reviews, this rose to 34.4% in the deceased patient reviews (z-score 13.6, p < 0.0001). On further exploration, it became clear that the NCC MERP’s Categorizing Medication Errors Index is challenging to apply to many AEs that are followed soon after by patient death. This is particularly evident in cases such as the acquisition of hospital-acquired infection, and events categorised as level I would have been classified as level F if the inpatient episode had not ended in the death of the patient. Furthermore, this may reflect the process of dying even though AE criteria are met.
The clinical context giving rise to the Harm2-determined AEs in the deceased patient reviews cohort and falling into each of the NCC MERP’s Categorizing Medication Errors Index categories can be found in Appendix 10.
The preventability of adverse events in 3279 randomly selected discharge reviews and 1018 randomly selected deceased patient reviews across NHS Wales (2014–15) using the Harm2 tool
In the randomly selected discharge reviews, 296 of the 497 (59.6%) AEs were deemed to have been preventable if care had been delivered to a standard considered to be good clinical practice in normal circumstances (95% CI 55.29% to 63.91%). The level of preventability for each AE was assessed on a scale of 1–6 (Table 38), and the majority of all AEs (55.1%) were classified as having at least ‘slight to modest evidence for preventability’, corresponding to scores of between 2 and 6 on the Likert scale (95% CI 50.73% to 59.47%). In 32.7% of cases, the level of preventability was determined to be ‘preventability more likely than not; more than 50–50 but close call’, corresponding to a score of between 4 and 6 on the Likert scale (95% CI 28.58% to 36.82%).
Preventable AE | AE (n): 1 | AE (n): 2 | AE (n): 3 | AE (n): 4 | Total number | Per cent of total |
---|---|---|---|---|---|---|
Frequency | 219 | 57 | 17 | 3 | 296 | 59.6 |
Preventability score | ||||||
1 | 18 | 3 | 1 | – | 22 | 7.4 |
2 | 28 | 5 | 5 | – | 38 | 12.8 |
3 | 50 | 16 | 2 | 1 | 69 | 23.3 |
4 | 54 | 13 | 3 | – | 70 | 23.6 |
5 | 32 | 7 | 4 | 1 | 44 | 14.9 |
6 | 37 | 9 | 2 | 1 | 49 | 16.6 |
Total | 219 | 53 | 17 | 3 | 292 | |
Unknown | – | 4 | – | – | 4 | 1.4 |
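The confidence intervals quoted for the preventability percentages can be approximated with a normal-approximation (Wald) interval for a proportion. The sketch below is illustrative only: the report does not state which interval method was used, and the function name `wald_ci` is hypothetical.

```python
import math

def wald_ci(events: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation 95% CI for a proportion (assumed method)."""
    p = events / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p - half_width, p + half_width

# 296 of the 497 AEs in the discharge reviews were deemed preventable.
low, high = wald_ci(296, 497)
print(f"{296 / 497:.1%} (95% CI {low:.2%} to {high:.2%})")
```

Calculated from the raw counts, the interval lies within about 0.05 percentage points of the reported 55.29% to 63.91%; the small gap is consistent with the reported interval having been derived from the rounded percentage.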
Among the randomly selected deceased patient reviews, 316 of the 512 (61.7%) AEs were deemed to be preventable (95% CI 57.49% to 65.91%), with the majority of all AEs being classified as having at least ‘slight to modest evidence for management causation’, corresponding to scores of between 2 and 6 on the Likert scale (Table 39). This is not statistically different from the percentage determined preventable in the randomly selected discharge reviews (z-score 0.7, p = 0.4948). Notably, in 78 cases in which preventability was determined, no score was attributed to the level of preventability, reflecting the challenges faced by reviewers in assessing AEs and the attendant preventability in this cohort.
Preventable AE | AE (n): 1 | AE (n): 2 | AE (n): 3 | AE (n): 4 | Total number | Per cent of total |
---|---|---|---|---|---|---|
Frequency | 195 | 87 | 34 | – | 316 | 61.7 |
Preventability score | ||||||
1 | 8 | 6 | 1 | – | 15 | 6.3 |
2 | 30 | 10 | 1 | – | 41 | 17.2 |
3 | 64 | 18 | 12 | – | 94 | 39.5 |
4 | 21 | 14 | 4 | – | 39 | 16.4 |
5 | 15 | 5 | 9 | – | 29 | 12.2 |
6 | 13 | 7 | – | – | 20 | 8.4 |
Total | 151 | 60 | 27 | – | 238 | |
Score not attributed | | | | | 78 | 24.7 |
General quality-of-care assessment in 3279 randomly selected discharge reviews and 1018 randomly selected deceased patient reviews across NHS Wales (2014–15) using the Harm2 tool
At the end of the Harm2 process, the research nurses were asked the following question: considering all that you know about this patient’s admission, how would you rate the overall quality of care?
Among the 371 inpatient episodes in the randomly selected discharge reviews cohort in which AEs were identified, the nurse reviewers deemed the overall quality of care to be excellent or good in 78.5% of episodes (Table 40). The quality-of-care ratings in the randomly selected deceased patient reviews cohort were very similar: in the 315 episodes of care in which AEs were identified, the overall quality of care was rated as excellent or good in 72.7% of episodes. Across the total cohort, AEs that were deemed to be preventable were again associated with lower overall quality-of-care ratings.
Overall quality assessment of episode of care | Randomly selected discharge reviews: frequency | Randomly selected discharge reviews: per cent of reviews rated | Randomly selected deceased patient reviews: frequency | Randomly selected deceased patient reviews: per cent of reviews rated |
---|---|---|---|---|
Excellent | 86 | 23.2 | 63 | 20.0 |
Good | 205 | 55.3 | 166 | 52.7 |
Adequate | 45 | 12.1 | 57 | 18.1 |
Poor | 4 | 1 | 6 | 1.9 |
Very poor | – | – | 1 | 0.3 |
Not recorded | 31 | 8.4 | 22 | 7.0 |
Comparison of the percentage of inpatients identified with adverse events using the Harvard method and Harm2 tool
In the nine organisations that contributed to both phases of the study, the percentage of patients experiencing an AE in phase 2 was compared with that determined in phase 1 (Figure 32). In seven study sites, the percentage of patients that experienced an AE showed little change. A more significant change is observed in sites 4 and 5, with increased percentages in phase 2 reflecting a more robust sampling strategy.
The CIs for the percentage of patients with AEs determined using the Harvard method in phase 1 (10.3%, 95% CI 9.4% to 11.2%) and using the Harm2 tool in phase 2 (11.3%, 95% CI 10.22% to 12.40%) overlap, and the rates of AEs determined by the two methods are not statistically different (z-score 1.4, p = 0.162).
The inter-rater and intermethod reliability of the Harm2 processes
Ten per cent of all Harm2 records were double reviewed to determine the extent to which different reviewers agreed on the presence of an AE (Table 41). Among 380 episodes of care in the randomly selected discharge reviews that were double reviewed, there was agreement in 336 cases (88.4%, κ = 0.50, 95% CI 0.38 to 0.65).
Site | Total number of reviews | Number of concordant pairs | Number of discordant pairs | Unweighted kappa | 95% CI | Altman scale 1991 (reference 57) |
---|---|---|---|---|---|---|
1 | 31 | 27 | 4 | 0.52 | 0.043 to 0.99 | Moderate |
2 | 32 | 32 | 0 | 1 | – | Very good |
3 | 31 | 30 | 1 | 0.78 | 0.36 to 1.00 | Good |
4 | 29 | 26 | 3 | 0.52 | 0.015 to 0.08 | Moderate |
5 | 28 | 23 | 5 | 0.58 | 0.24 to 0.91 | Moderate |
6 | 30 | 26 | 4 | 0.42 | 0.27 to 0.95 | Moderate |
7 | 28 | 25 | 3 | 0.34 | 0.00 to 1.00 | Fair |
8 | 30 | 24 | 6 | NC | NC | Less than chance |
9 | 23 | 22 | 1 | 0.83 | 0.51 to 1.00 | Very good |
10 | 38 | 30 | 8 | 0.42 | 0.06 to 0.78 | Moderate |
11 | 29 | 23 | 6 | 0.13 | 0.00 to 0.75 | Poor |
12 | 51 | 47 | 4 | 0.67 | 0.36 to 0.98 | Good |
Not known | 1 | Excluded from analysis | ||||
Total | 380 | 335 | 45 | 0.50 | 0.38 to 0.65 | Moderate |
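As an illustration of how an unweighted kappa and its Altman (1991) descriptor are obtained for a pair of reviewers, the sketch below works from a 2 × 2 agreement table. The cell counts are hypothetical (the report gives only concordant and discordant totals per site, not the full cross-tabulation), and the function names are illustrative rather than taken from the study.

```python
def cohen_kappa(both_yes: int, first_only: int, second_only: int, both_no: int) -> float:
    """Unweighted Cohen's kappa for two reviewers judging 'AE present' (yes/no).

    Raises ZeroDivisionError when chance agreement is 1 (for example, when
    both reviewers give the same answer for every record).
    """
    n = both_yes + first_only + second_only + both_no
    observed = (both_yes + both_no) / n
    yes_first = (both_yes + first_only) / n
    yes_second = (both_yes + second_only) / n
    expected = yes_first * yes_second + (1 - yes_first) * (1 - yes_second)
    return (observed - expected) / (1 - expected)

def altman_label(kappa: float) -> str:
    """Strength-of-agreement bands from Altman (1991)."""
    if kappa <= 0.20:
        return "Poor"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Good"
    return "Very good"

# Hypothetical site with 30 double-reviewed records: 26 concordant, 4 discordant.
kappa = cohen_kappa(both_yes=4, first_only=2, second_only=2, both_no=22)
label = altman_label(kappa)
print(f"kappa = {kappa:.2f} ({label})")
```

The example shows why a high raw agreement (26 of 30 records) can still yield only a moderate kappa when AEs are relatively uncommon: most of the agreement on 'no AE' is expected by chance.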
In total, 132 episodes of care in the deceased patient reviews were double reviewed, and there was agreement on the presence of AEs in 105 (79.5%) cases (κ = 0.54, 95% CI 0.39 to 0.70) (Table 42).
Site | Total number of reviews | Number of concordant pairs | Number of discordant pairs | Unweighted kappa | 95% CI | Altman scale 1991 (reference 57) |
---|---|---|---|---|---|---|
1 | 5 | 5 | 0 | 1 | – | Very good |
2 | 9 | 8 | 1 | 0.77 | 0.34 to 1 | Good |
3 | 9 | 9 | 0 | 1 | – | Very good |
4 | 10 | 8 | 2 | 0.52 | 0 to 1 | Moderate |
5 | 11 | 6 | 5 | NC | NC | Less than chance |
6 | 20 | 14 | 6 | 0.38 | 0.21 to 0.80 | Fair |
7 | 14 | 13 | 1 | 0.81 | 0.45 to 1 | Very good |
8 | 11 | 10 | 1 | 0.81 | 0.47 to 1 | Very good |
9 | 9 | 6 | 3 | 0.30 | 0 to 0.95 | Fair |
10 | 10 | 7 | 3 | 0.40 | 0 to 0.97 | Moderate |
11 | 13 | 9 | 4 | 0.36 | 0 to 0.84 | Fair |
12 | 11 | 10 | 1 | 0.79 | 0.40 to 1 | Good |
Total | 132 | 105 | 27 | 0.54 | 0.39 to 0.70 | Moderate |
Intermethod reliability was also assessed to determine the degree to which key outcomes are consistent from one review tool to the next. In 422 Harm2 reviews, the RF1 was also completed and 77 inpatient episodes were referred for physician MRF2 review. In total, 74 reviews were completed, and agreement on the presence of AEs between the Harm2 and two-stage retrospective review process was evident in 58 (78.4%) (κ = 0.45, 95% CI 0.07 to 0.62), indicating a moderate level of agreement.
Part 2: assessment of screening criteria in the Harm2 tool
Conversion of individual criterion to harm events using the Harm2 tool
At the end of phase 2, we reassessed the conversion rate between any individual patient who screened positive for any individual screening criterion and the subsequent determination of AEs in both the randomly selected discharge reviews and the randomly selected deceased patient reviews (Table 43). As harm determination was assessed in relation to each individual criterion in the Harm2 tool, rather than by applying multiple criteria to predominantly single events as in the Harvard method, this allowed for greater precision in the assessment of conversion. Although the difference in the rate of conversion was < 10% across 15 criteria, a number of significant differences were seen. The Harm2 reviewers assessed more hospital-incurred injuries to be harm events in the cohort of patients who died in hospital. For example, in the case of pressure ulcers, the rate of conversion was 46.6% in the randomly selected discharge reviews cohort and 64.1% in the deceased patient reviews cohort. On examination of these events, there appear to be differences in the grading of tissue damage in the deceased patient reviews cohort, with more of these graded 2 and above.
Criterion number | Descriptor of criterion | AE present/criterion present, randomly selected discharge reviews | Rate (%) | AE present/criterion present, deceased patient reviews | Rate (%) | Difference in conversion (%) |
---|---|---|---|---|---|---|
1 | Unplanned admission within 30 days prior to index admission as a result of any health-care management | 59/204 | 28.9 | 27/128 | 21.1 | +7.8 |
2 | Unplanned admission to any hospital within 30 days post this discharge | 63/252 | 25.0 | N/A | ||
3 | Hospital-incurred accident or injury | 15/57 | 26.3 | 14/44 | 31.8 | –5.5 |
| Pressure ulcer | 21/45 | 46.6 | 50/78 | 64.1 | –17.5 |
| Fall | 33/90 | 36.6 | 34/82 | 41.5 | –4.9 |
| Equipment-related accident or injury | 6/13 | 46.2 | 5/8 | 62.5 | –16.3 |
| Other not covered above | 16/23 | 69.6 | 14/22 | 63.6 | +6.0 |
4 | Adverse drug reaction, side effect or drug error | 57/140 | 40.7 | 35/75 | 46.7 | –6.0 |
5 | Unplanned transfer from general care to intensive care/high-dependency unit | 4/30 | 13.3 | 8/58 | 13.8 | –0.5 |
6 | Unplanned transfer to another acute care hospital | 3/23 | 13.0 | 3/15 | 20.0 | –7.0 |
7 | Unplanned return to theatre in this admission | 13/33 | 39.4 | 5/11 | 45.5 | –6.1 |
8 | Unplanned removal, injury or repair of organ or structure during surgery, invasive procedure or vaginal delivery | 14/39 | 35.9 | 4/7 | 57.1 | –21.2 |
9 | Other patient complications to include MI, DVT, CVA | 15/30 | 50.0 | 16/55 | 29.1 | +20.9 |
10 | Development of neurological deficit not present on admission | 5/14 | 35.7 | 13/53 | 24.5 | +11.2 |
11 | Unexpected death or referral to coroner | 5/18 | 27.8 | 43/149 | 28.9 | –1.1 |
12 | Inappropriate discharge home, inadequate discharge plan | 39/83 | 47.0 | 8/12 | 66.7 | –19.7 |
13 | Cardiac or respiratory arrest | 1/13 | 7.7 | 18/96 | 18.8 | –11.1 |
14 | Hospital-acquired infection | 19/52 | 36.5 | 21/73 | 28.7 | +7.8 |
| Wound infection | 29/35 | 82.6 | 12/23 | 52.2 | +30.4 |
| HAP | 29/43 | 67.4 | 104/159 | 65.4 | +2.0 |
| Sepsis | 18/29 | 62.1 | 65/113 | 57.5 | +4.6 |
| Other | 27/61 | 44.3 | 22/49 | 44.9 | –0.6 |
15 | Patient/family dissatisfaction with care received documented in the medical record and or evidence of complaint lodged | 12/106 | 11.3 | 14/70 | 20.0 | –8.7 |
16 | Delays in or cancellation of treatment | 35/132 | 26.5 | 32/65 | 49.2 | –22.7 |
17 | Evidence of inappropriate decision-making regarding the treatment or intervention the patient received | 30/41 | 73.2 | 24/67 | 35.8 | +37.4 |
18 | Problems relating to communication | 25/56 | 44.6 | 19/42 | 45.2 | –0.6 |
19 | Problems relating to documentation | 22/149 | 14.8 | 20/96 | 20.8 | –6.0 |
20 | Missed, delayed or incorrect diagnosis | 28/42 | 66.7 | 20/28 | 71.4 | –4.7 |
21 | Any other undesirable outcomes not covered by any other criteria | 25/74 | 33.8 | 19/32 | 59.4 | –25.6 |
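The conversion rates in Table 43 are the number of reviews in which an AE was determined divided by the number of reviews in which the criterion was present, and the final column is the discharge rate minus the deceased rate. A minimal sketch of that arithmetic for criterion 1 follows; the function name is illustrative.

```python
def conversion_rate(ae_present: int, criterion_present: int) -> float:
    """Per cent of criterion-positive reviews in which an AE was determined."""
    return 100 * ae_present / criterion_present

# Criterion 1: unplanned admission within 30 days prior to index admission.
discharge_rate = conversion_rate(59, 204)   # randomly selected discharge reviews
deceased_rate = conversion_rate(27, 128)    # deceased patient reviews

# A positive difference means the criterion converted to an AE more often
# in the discharge reviews than in the deceased patient reviews.
difference = discharge_rate - deceased_rate
print(f"{discharge_rate:.1f}% vs {deceased_rate:.1f}%, difference {difference:+.1f}")
```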
Significant differences also arise in other criteria. Although the number of surgical events was lower in the deceased patient reviews cohort, the reviewers considered almost half of surgical wound infections in this cohort not to be AEs, whereas the rate of conversion was 82.6% in the randomly selected discharge reviews. In the case of clinical decision-making and evidence of inappropriate treatment and interventions received by the patient, the conversion rate to AEs in the randomly selected discharge reviews cohort was almost double that in the deceased patient reviews cohort (73.2% and 35.8%, respectively).
This analysis points to a number of factors: (1) the context of the individual patient is important in any assessment of AEs; (2) the types of harm events seen in the randomly selected discharge reviews and the deceased patient reviews appear to be different; and (3) some of the newly introduced acts of omission had high rates of conversion to determined AEs.
Part 3: factors independently associated with subsequent adverse event determination in the Harm2 sample
Assessment of inpatient risk of adverse events in the randomly selected discharge reviews
The likelihood of having a Harm2-determined AE was assessed in this NHS Wales cohort of patients, with specific interest in the patient’s age, sex, underlying comorbidity and the duration of the inpatient episode. Increasing age, longer length of hospital stay, being an emergency admission and higher Charlson Comorbidity Index scores were all independently associated with AE determination in univariate analysis. Hospital site and the presence of a mental health diagnosis were not found to be statistically associated with AEs in this cohort of patients (Table 44).
Variable | Univariate OR | 95% CI | p-value |
---|---|---|---|
Hospital site | 0.997 | 0.97 to 1.03 | 0.841 |
LOS band (3-day durations until 14 days, > 14 days) | 1.47 | 1.37 to 1.58 | < 0.001* |
Age band (< 18 years, 18–25 years, 10-year age groups until over 85 years) | 1.12 | 1.07 to 1.74 | < 0.001* |
Admission status (elective) | 0.75 | 0.57 to 0.98 | 0.035* |
Charlson Comorbidity Index score (cumulative scoring) | 1.10 | 1.03 to 1.17 | 0.004* |
Mental health diagnosis present on admission | 1.14 | 0.79 to 1.64 | 0.495 |
The role of age and underlying chronic disease status in the likelihood of being exposed to a harm event in secondary care was explored further by examining the percentage of AE rates within each age band, and in the major classifications of chronic disease constituting the Charlson Comorbidity Index.
Age is a significant factor in an individual’s risk of experiencing an AE during inpatient care, with the risk increasing with age (Table 45). The likelihood of experiencing an AE peaks in the over-85 years age group, 16.09% of whom experienced an AE during the inpatient episode (Figure 33).
Age band (years) | Number of valid reviews | Frequency of AE | Percentage of valid reviews |
---|---|---|---|
< 18 | 225 | 14 | 6.22 |
18–24 | 190 | 15 | 7.89 |
25–34 | 325 | 28 | 8.62 |
35–44 | 290 | 28 | 9.66 |
45–54 | 348 | 32 | 9.20 |
55–64 | 407 | 54 | 13.27 |
65–74 | 552 | 70 | 12.68 |
75–84 | 539 | 66 | 12.24 |
≥ 85 | 379 | 61 | 16.09 |
Unknown | 24 | 3 | – |
Total | 3279 | 371 |
In assessing chronic disease, only three of the chronic disease variables explored were statistically significantly associated with AE determination (Table 46). These were peripheral vascular disease (PVD) (OR 2.52, p < 0.001), dementia (OR 2.27, p < 0.001) and hemiplegia (OR 2.27, p = 0.043). Therefore, although physiological status, such as immunosuppression, seemed to have no association with the risk of experiencing an AE as an inpatient, having a chronic disease that affected levels of physical and cognitive function did appear to predispose patients to increased risk.
Chronic disease | Frequency | Per cent of total | OR | 95% CI | p-value |
---|---|---|---|---|---|
MI | 219 | 6.68 | 1.21 | 0.81 to 1.82 | 0.352 |
Chronic heart failure | 170 | 5.18 | 1.51 | 0.99 to 2.32 | 0.055 |
PVD | 93 | 2.84 | 2.52 | 1.54 to 4.11 | < 0.001* |
Coronary vascular disease | 259 | 7.90 | 1.34 | 0.93 to 1.93 | 0.117 |
Dementia | 177 | 5.40 | 2.27 | 1.56 to 3.31 | < 0.001* |
Hemiplegia | 36 | 1.10 | 2.27 | 1.03 to 5.01 | 0.043* |
Chronic lung disease | 373 | 11.38 | 1.14 | 0.82 to 1.59 | 0.405 |
Connective tissue disease | 79 | 2.41 | 1.28 | 0.69 to 2.44 | 0.460 |
Diabetes | 361 | 11.01 | 1.10 | 0.79 to 1.54 | 0.579 |
Gastric ulcer | 27 | 0.82 | 1.79 | 0.67 to 4.76 | 0.242 |
Chronic liver disease | 13 | 0.40 | 0.65 | 0.08 to 5.03 | 0.681 |
Moderate/severe liver disease | 39 | 1.19 | 0.89 | 0.32 to 2.53 | 0.834 |
Moderate/severe chronic kidney disease | 124 | 3.78 | 1.35 | 0.81 to 2.25 | 0.253 |
Lymphoma | 10 | 0.30 | 1 | – | – |
Leukaemia | 8 | 0.24 | 0.97 | 0.14 to 9.12 | 0.916 |
Non-metastatic carcinoma | 100 | 3.05 | 1.12 | 0.81 to 1.46 | 0.589 |
Metastatic tumour | 62 | 1.24 | 0.997 | 0.45 to 2.21 | 0.995 |
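For a binary exposure such as the presence of a chronic disease, a univariate OR and its 95% CI can be obtained from a 2 × 2 table. The sketch below uses the log-odds (Woolf) method with hypothetical counts, because the report does not give the number of AEs within each disease group; the authors may equally have used logistic regression, and the function name is illustrative.

```python
import math

def odds_ratio_ci(exposed_ae: int, exposed_no_ae: int,
                  unexposed_ae: int, unexposed_no_ae: int,
                  z: float = 1.96) -> tuple[float, float, float]:
    """Odds ratio with a Woolf (log-odds) 95% CI from a 2 x 2 table."""
    odds_ratio = (exposed_ae * unexposed_no_ae) / (exposed_no_ae * unexposed_ae)
    se_log_or = math.sqrt(1 / exposed_ae + 1 / exposed_no_ae
                          + 1 / unexposed_ae + 1 / unexposed_no_ae)
    low = math.exp(math.log(odds_ratio) - z * se_log_or)
    high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, low, high

# Hypothetical counts, not study data: 30 of 100 patients with the condition
# had an AE, compared with 300 of 2900 patients without it.
odds_ratio, low, high = odds_ratio_ci(30, 70, 300, 2600)
print(f"OR {odds_ratio:.2f} (95% CI {low:.2f} to {high:.2f})")
```

An interval that excludes 1, as in this hypothetical case, corresponds to the rows marked with an asterisk in Table 46.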
This analysis was rerun to include only individuals over the age of 55 years at the time of their hospitalisation. A confirmed diagnosis of dementia or PVD remained significant, but hemiplegia, although associated with a similar level of risk, failed to reach statistical significance, probably as a result of its lower level of occurrence in the cohort (Table 47).
Chronic disease | Univariate OR | 95% CI | p-value |
---|---|---|---|
PVD | 2.03 | 1.19 to 3.46 | 0.0091* |
Dementia | 1.95 | 1.32 to 2.88 | 0.001* |
Hemiplegia | 2.11 | 0.94 to 4.73 | 0.070 |
Furthermore, in patients over the age of 55 years, the effects of age band and Charlson Comorbidity Index score were attenuated and no longer statistically significant as risk factors for experiencing an AE during an inpatient episode, but increased LOS and being an elective admission remained associated with increased risk (Table 48).
Variable | Univariate OR | 95% CI | p-value |
---|---|---|---|
LOS band (3-day durations until 14 days, > 14 days) | 1.42 | 1.30 to 1.55 | < 0.001* |
Age band (< 18 years, 18–25 years, 10-year age groups until over 85 years) | 1.07 | 0.94 to 1.2 | 0.330 |
Admission status (emergency) | 0.70 | 0.50 to 0.99 | < 0.001* |
Charlson Comorbidity Index score (cumulative scoring) | 1.04 | 0.97 to 1.1 | 0.211 |
Mental health diagnosis present on admission | 1.18 | 0.72 to 1.94 | 0.509 |
Factors independently associated with subsequent adverse event determination in the randomly selected deceased patient reviews
In the cohort of patients in whom death was the end point of the period of hospitalisation, increased length of hospital stay was also significant, with a 33% increase in the odds of an AE per LOS band (OR 1.33, 95% CI 1.22 to 1.44), and being an emergency rather than an elective admission was associated with a twofold increase in the odds of an AE (OR 2.10, 95% CI 1.00 to 4.41). Rather than reflecting actual risk or harm associated with AEs, this probably reflects the nature and reason for admission, with a significantly lower percentage of this group being admitted for elective surgery and a significantly higher percentage being admitted for emergency clinical management and end-of-life care. Table 49 reports that neither increasing Charlson Comorbidity Index score nor increasing age was associated with an increased risk of AEs in this cohort.
Variable | Univariate OR | 95% CI | p-value |
---|---|---|---|
LOS band (3-day durations until 14 days, > 14 days) | 1.33 | 1.22 to 1.44 | < 0.001* |
Age band (< 18 years, 18–25 years, 10-year age groups until over 85 years) | 1.06 | 0.95 to 1.18 | 0.297 |
Admission status (emergency) | 2.10 | 1.00 to 4.41 | 0.049* |
Charlson Comorbidity Index score (cumulative scoring) | 1.01 | 0.96 to 1.06 | 0.810 |
Figure 34 shows the AE rate by age band and shows very little variation in the age-specific occurrence of AEs. Table 50 shows that in univariate analysis none of the chronic disease variables explored was statistically associated with AE determination.
Chronic disease | Frequency | Per cent of total | Univariate OR | 95% CI | p-value |
---|---|---|---|---|---|
MI | 144 | 14.16 | 1.22 | 0.84 to 1.78 | 0.292 |
Chronic heart failure | 186 | 18.29 | 1.02 | 0.72 to 1.44 | 0.906 |
PVD | 58 | 5.70 | 1.04 | 0.59 to 1.84 | 0.898 |
Coronary vascular disease | 173 | 17.01 | 1.02 | 0.72 to 1.46 | 0.912 |
Dementia | 175 | 17.26 | 0.82 | 0.57 to 1.18 | 0.280 |
Hemiplegia | 30 | 2.95 | 0.91 | 0.61 to 1.38 | 0.662 |
Chronic lung disease | 233 | 22.91 | 0.89 | 0.64 to 1.2 | 0.459 |
Connective tissue disease | 24 | 2.36 | 0.89 | 0.49 to 2.73 | 0.742 |
Diabetes | 179 | 17.60 | 1.19 | 0.85 to 1.69 | 0.300 |
Gastric ulcer | 18 | 1.77 | 0.88 | 0.31 to 2.50 | 0.815 |
Chronic liver disease | 7 | 0.69 | 1.73 | 0.39 to 7.79 | 0.473 |
Moderate/severe liver disease | 45 | 4.42 | 1.42 | 0.77 to 2.64 | 0.265 |
Moderate/severe chronic kidney disease | 127 | 12.49 | 1.07 | 0.71 to 1.59 | 0.751 |
Lymphoma | 14 | 1.38 | 1.74 | 0.60 to 5.06 | 0.309 |
Leukaemia | 17 | 1.67 | 0.70 | 0.23 to 2.18 | 0.543 |
Non-metastatic carcinoma | 54 | 5.31 | 1.50 | 0.85 to 2.64 | 0.160 |
Metastatic tumour | 139 | 13.67 | 0.85 | 0.57 to 1.26 | 0.416 |
Part 4: the nature and characteristics of adverse events identified by the Harm2 tool and differences in the randomly selected discharge reviews and randomly selected deceased patient reviews
We decided on a purposive sample that facilitated the direct comparison of patients who died during the inpatient episode and a random sample of patients from across the inpatient spectrum. The aim was to elucidate key areas of difference within these groups. Part 1 demonstrates that the extent and associated severity of AEs were significantly higher in the deceased patient cohort. Part 2 indicates that positive screening criteria in the deceased patient cohort did not convert to AEs as frequently as they did in the randomly selected discharge reviews. Part 3 confirms theoretically driven concepts, such as the fact that this is an older group, who are less likely to undergo surgical management and more likely to receive end-of-life care. In part 4, the differences in the nature and characteristics of the AEs are explored.
Origin of adverse events
As part of the AE assessment, the origin was determined for each event identified and is graphically displayed in Figure 35.
The origin and timing of the AEs vary considerably between groups. Figure 35 shows that, in the randomly selected discharge reviews, more AEs arise in the period before inpatient management. For example, in general practice AEs arise as a result of failures in drug monitoring, whereas early in admission they are more usually a result of assessment and diagnosis problems, and during surgery and procedures they are a result of complications such as bleeding and infection. In the deceased patient reviews cohort, slightly more events arose in the intensive therapy unit (ITU) and general ward environment. It was more of a challenge to classify the origin and timing of the AE in the deceased patient reviews cohort, and in one in five cases it could not be determined. Examples of AEs arising in these different settings are outlined in Appendix 11.
Breakdown and analysis of problems in care in the randomly selected discharge reviews and randomly selected deceased patient reviews cohorts
A comprehensive breakdown of problems in care was undertaken for both cohorts, and the overview breakdown is shown in Figure 36. Significant differences in the composition of AEs in the groups are apparent. Of particular note is the number of hospital-acquired infections identified in the samples. Overall, infection control issues were associated with 35% of AEs in the deceased patient reviews and with 18.4% of AEs in the randomly selected discharge reviews. One surgical wound infection and 108 cases of hospital-acquired pneumonia (HAP) were identified in the deceased patient sample, compared with 34 wound infections and 34 hospital-acquired respiratory tract infections in the randomly selected discharge reviews, again reflecting the different reasons for and type of admission between the groups. Other differences are evident in diagnostic and assessment errors and AEs relating to the prescribing, administration or monitoring of drugs/fluids or blood (Table 51). Categories in which significant differences were not detected include failure to appreciate the patient’s overall condition and failure in clinical monitoring – areas more aligned to general ward care in which differences between the two groups would not be expected.
Problem in care category | Deceased patient reviews, n | Deceased patient reviews, % | Discharge reviews, n | Discharge reviews, % | z-score | p-value |
---|---|---|---|---|---|---|
1. AE relating to diagnostic or assessment error | 43 | 8.2 | 57 | 12.2 | –2.097 | 0.036* |
2. AE from failure to appreciate patient’s overall condition | 26 | 5.0 | 14 | 3.0 | +1.562 | 0.1183 |
3. AE arising from a failure in clinical monitoring | 135 | 25.7 | 130 | 27.8 | –0.754 | 0.4508 |
4. AE in relation to failure to prevent/control/manage infection | 184 | 35.0 | 86 | 18.4 | +5.875 | < 0.0002* |
5. AE directly related to a problem with an operation or other invasive procedure | 11 | 2.09 | 48 | 10.3 | –5.439 | < 0.0002* |
6. AE relating to prescribing, administration or monitoring of drugs/fluid/blood | 57 | 10.6 | 80 | 17.1 | –2.859 | 0.0042* |
7. AE relating to resuscitation | 16 | 3.0 | 1 | 0.2 | – | |
8. Any other problem not fitting categories above | 53 | 10.1 | 51 | 10.9 | –0.424 | 0.6716 |
Total | 525 | | 467 | | | |
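The report does not state which test produced the z-scores and p-values in the table. As a minimal sketch, and assuming a pooled two-proportion z-test with a two-sided p-value, the following Python function appears to reproduce the reported figures for categories 1 and 4 from the counts and cohort totals alone. The function name and its arguments are illustrative and are not taken from the study.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for a difference between two proportions.

    x1/n1: events and total in cohort 1 (here, deceased patient reviews)
    x2/n2: events and total in cohort 2 (here, discharge reviews)
    Assumption: this is the test behind the table; the report does not say.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Category 1: diagnostic or assessment error (43 of 525 vs. 57 of 467).
z, p = two_proportion_z_test(43, 525, 57, 467)
print(f"z = {z:.3f}, p = {p:.3f}")
```

A negative z-score here means that the category makes up a smaller share of AEs in the deceased patient reviews than in the discharge reviews, which is consistent with the signs shown in the table.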
Summary of findings in Chapter 4
- Using the Harm2 tool, at least one AE was determined in 11.3% (95% CI 10.22% to 12.40%) of all episodes of care in the randomly selected discharge reviews and 59.6% (95% CI 55.29% to 63.91%) of AEs were deemed to be preventable if care were to have been delivered to a standard considered to be good clinical practice in normal circumstances.
- At least one AE was determined in 30.1% (95% CI 28.13% to 33.82%) of all inpatient episodes of care that resulted in death, of which 61.7% (95% CI 57.49% to 65.91%) were deemed to be preventable, with the majority of all AEs being classified as having at least ‘slight to modest evidence for management causation’, corresponding to scores between 2 and 6 on the Likert scale.
- The AE rate in phase 2 was compared with the rate determined in phase 1 in the nine organisations that contributed to both phases of the study, and in seven of these organisations there was little difference between the AE rate reported using the Harvard method and that obtained using the Harm2 method.
- At the end of the Harm2 process, the nursing cohort deemed the overall quality of care to be excellent or good in 78.5% of inpatient episodes examined in the randomly selected discharge reviews and in 72.7% of inpatient episodes examined in the deceased patient reviews cohort.
- Inter-rater reliability testing of the Harm2 tool showed agreement in 336 out of 380 cases (88.4%) (κ = 0.50; 95% CI 0.38 to 0.65) in the random sample and in 105 of 132 cases (79.5%) (κ = 0.54; 95% CI 0.39 to 0.70) in the deceased patient reviews.
- Assessment of intermethod reliability was undertaken and agreement on the presence of AEs between the Harm2 and two-stage retrospective review process was evident in 58 out of 74 reviews, or 78.4% (κ = 0.45, 95% CI 0.07 to 0.62), indicating a moderate level of agreement.
- Of the 21 screening criteria, only three were found not to be associated with AE determination in the randomly selected discharge reviews and nine of the overall 21 criteria did not reach a statistically significant level of association in the deceased patient reviews cohort.
- Age is a significant factor in an individual’s risk of experiencing an AE during inpatient care across the general inpatient population, with risk increasing with age. The AE rate peaks in the over-85 years age group, in which 16.09% of all patients experienced an AE during the inpatient episode.
- Three of the chronic disease variables explored were statistically associated with AE determination in the randomly selected discharge reviews. These were PVD (with an associated 2.5-fold increased risk) and hemiplegia and dementia (both with a 2.27-fold increased risk) and corresponding p-values of 0.043 and < 0.001, respectively. Hemiplegia was also associated with significant risk, but failed to reach statistical significance in the over-55 years age group, probably as a result of its low frequency within the data set.
- Adverse events occurring in the deceased patient cohort are different from those in the general inpatient population and are less likely to arise from diagnostic, assessment and medication errors and surgical procedures and are more likely to arise from hospital-acquired infections.
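The inter-rater and intermethod results in the summary above are reported as percentage agreement together with Cohen’s κ, which corrects observed agreement for the agreement expected by chance. The report gives only the totals (e.g. 336 of 380 cases in agreement), not the underlying 2 × 2 cross-tabulation, so the cell counts in the following Python sketch are hypothetical. They were chosen only so that the totals match and must not be read as study data. The function name is likewise illustrative.

```python
def cohens_kappa(both_yes, a_only, b_only, both_no):
    """Observed agreement and Cohen's kappa for two raters and a yes/no rating.

    both_yes: cases in which both reviewers recorded an AE
    a_only:   cases in which only reviewer A recorded an AE
    b_only:   cases in which only reviewer B recorded an AE
    both_no:  cases in which neither reviewer recorded an AE
    """
    n = both_yes + a_only + b_only + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_only) / n
    b_yes = (both_yes + b_only) / n
    # Agreement expected by chance if the two reviewers rated independently.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# HYPOTHETICAL cell counts: only the totals (336 agreements in 380 cases)
# come from the report; the split between cells is assumed for illustration.
observed, kappa = cohens_kappa(both_yes=29, a_only=22, b_only=22, both_no=307)
print(f"agreement = {observed:.1%}, kappa = {kappa:.2f}")
```

The sketch illustrates why a high raw agreement can sit alongside a moderate κ: when most episodes contain no AE, two reviewers would agree on ‘no AE’ much of the time by chance alone.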
Chapter 5 Evaluation of the operational elements of the Harm2 process and impact of the harm study across NHS Wales
Aim
To embed the Harm2 tool in organisations and determine the organisational response.
Objectives
- To implement a retrospective case note review study across the 12 study sites, reviewing a random sample of 20 discharges per organisation per month for a period of 16 months (July 2014–October 2015).
- To develop mechanisms in study sites to flag the identification of AEs in a timely manner for the purposes of increased organisational ownership of AE data and information.
- To develop and test data management and analysis tools for routine use in study sites to monitor the rate and organisational profile of AEs.
- To map and document the case note review process across the NHS sites and the flow and use of data through a service evaluation and impact assessment.
Background
The review infrastructure was developed from that used in phase 1 of the study. In addition, the research team invited one senior member of the governance or leadership team from each health board to sit on the study management group. This ensured that organisations had the opportunity to keep updated on study progress and any operational issues associated with the study. Any problems with, for example, capacity or note pulling were flagged to organisations early so that the research team could liaise with NISCHR CRC or the health boards to resolve any issues. It was in this forum that data feedback tools and mechanisms were discussed and introduced.
Development of data and feedback tools
Phase 2 streamlined data entry and feedback mechanisms so that organisations had access to the data as soon as possible after the reviews had been completed. We explored options for electronic data capture, but access to reliable Wi-Fi and information technology equipment was variable, particularly in the case of external reviewers coming into organisations to undertake reviews. The team also worked collaboratively with the Welsh National Harm and Mortality Collaborative, with time protected within that agenda to explore accumulating data and innovative methods for presentation of feedback data using tools such as the signatures of harm.
We developed feedback for organisations in the format of an organisational report. This was sent out as data were accumulating and outlined the number of reviews undertaken nationally, the number undertaken within that particular study site and the AE rates nationally and locally. In addition to rates, organisations were provided with a graph representing all the criteria that were found to be present and the subsequent conversion to AEs, facilitating identification of risk and data from which signatures of harm could be generated. A list of all study numbers was provided with details on which criteria were present in the harm identified and the attendant level of severity. Reports and explanatory materials were sent into organisations via the organisational leads who sat on the study management group in most cases, along with the identifiers so that they could re-pull the notes and review the episode of care within their own governance structures if they felt that this was warranted. An example of an organisational report is shown in Appendix 12.
During phase 2 we also implemented an additional feedback process that was not included as part of the formal research process but arose from organisational request. We asked the review teams to leave a copy of the summary of the review session, with identifiers of the numbers of the notes where criteria were present and AEs detailed, with the clinical governance teams at the end of the review session. This was undertaken to highlight any serious concerns so that prompt follow-up could be actioned. In a small number of cases the research office called senior members of the medical team in individual organisations to raise concerns about specific cases.
The close working relationships developed between the research team and organisations during the study period were invaluable in informing our approach. As a final step in this work, we wanted to formalise some of this learning and insight, so we undertook a service evaluation and impact assessment. The evaluative process aimed to examine and learn from the operational elements of the study, the perceptions of our key stakeholders and the implications for the health boards that participated in the study. This element was conducted by AJL, an experienced evaluative nurse researcher with a background in researching health-care safety, who had been involved in the original proposal but had not been involved in the collection or analysis of the quantitative harm data. The reporting follows the Consolidated Criteria for Reporting Qualitative Studies (COREQ) guidance. 59
Evaluation questions
- What have been the challenges experienced by those undertaking the routine note and mortality audits?
- What are the implications of the electronic record for future audit studies?
- How are harm data being used and disseminated within the organisations?
- What is the added value of the harm data?
- What, if any, changes have been effected as a result of the study?
Methods
There were two main groups of participants: a sample of the RF1 reviewers who had undertaken the notes audit and key contacts from the health boards. The names of these key contacts had been provided by the health boards at the commencement of the study and were normally a medical director and a named leader of the quality assurance team. As the study progressed, some members of this group changed and the list was expanded in cases in which the medical directors delegated the role to an assistant. We also interviewed an adviser to the WG.
The entire population of reviewers was approached by AJL, to whom they had been introduced during the last year of the study. Members of the health boards were initially approached by e-mail by one of the core research team (EB), who introduced AJL. Two of those approached immediately declined to participate, one citing a change of role in the organisation. In some cases, invitees asked relevant others (e.g. audit managers) to participate in the interviews and dates and times were duly arranged. Reminder e-mails were sent out in the days before the interview, again attaching the participant information and consent form. Two interviews were conducted remotely by Skype™ (Microsoft Corporation, Redmond, WA, USA) and telephone, respectively; the remainder were conducted face to face in a place chosen by the interviewee(s). Two were set up by the health board as group interviews: one with three people and one with two. In one case a respondent had to pull out shortly before the interview and another dialled into a videoconferencing facility call to a meeting room in their own board room. One respondent who had not seen the data at the time of interview (they had not been forwarded by the medical director) kindly agreed to a follow-up interview; a copy of the output sheets was sent and the follow-up was conducted by telephone.
Table 52 shows the total number of interviewees by health board, although it should be noted that some RF1 reviewers carried out their work in as many as three organisations. Because of the small number of health boards involved, we have deliberately anonymised both role titles and the precise names of committees, and quotations from respondents are attributed only by number, and are preceded by reviewer, health board or WG. For the same reason, we have adopted the use of the grammatically inaccurate but gender-neutral pronoun ‘they’ rather than ‘he’, ‘she’ or the clumsier ‘s/he’.
Role of respondents | Health board A | Health board B | Health board C | Health board D | Health board E | Health board F | WG |
---|---|---|---|---|---|---|---|
Medical director | – | 1 | – | – | – | 1 | – |
Assistant medical director | 1 | – | 1 | 1 | 1 | 1 | – |
Quality/governance/improvement/audit manager | – | 1 | – | – | 1 | 2 | – |
Senior nurse manager | – | – | 1 | – | – | – | – |
Policy adviser | – | – | – | – | – | – | 1 |
Reviewers | 1 | 6 | 3 | 4 | 6 | 4 | – |
The semistructured interviews, conducted by AJL between September and November 2015, were based on a guide, although we remained alert to tangential issues raised by interviewees. The reviewers’ interview guide was piloted with EB, who had undertaken some reviews personally. The interview guide for the health boards was reviewed after the first interview and the order of questions was slightly altered. Interviews were arranged to last for 45 minutes, but the actual time available was confirmed at the start of each interview and the questions were tailored accordingly. In practice, interview duration ranged from 25 to 80 minutes. Interviews were digitally recorded and subsequently transcribed and anonymised by the interviewer.
Analysis
This was approved as and intended to be an evaluation of the Welsh Harm Study, rather than qualitative research, but the approach used was a constructivist one, with themes expected to emerge from the data rather than imposed a priori. The first stage of data analysis involved familiarisation with the data, aided in this case by the fact that all interviews were conducted and transcribed by AJL personally. Once a conceptual framework emerged, all transcripts were analysed thematically using a Microsoft Excel spreadsheet, using one row per respondent, and adding columns for themes and subthemes as necessary. Validity was enhanced by sharing the transcripts with SM, who independently analysed a small sample, and discussing the single inconsistency. Synthesis was carried out by considering all of the responses within a thematic column, noting areas of agreement and of exception and selecting exemplar quotations. Emerging findings were also discussed with EB to determine whether or not they were consistent with the experience of her close working with reviewers and health boards during the life of the study. The qualitative element and its aims and objectives were outlined to research and development departments within participating health boards and deemed to be service evaluation.
Findings: evaluation of the operational elements of the Harm2 process and impact of the harm study across NHS Wales
Aim
To determine the organisational response.
The reviewers’ perspective
The 14 respondents who were the source of these data were all CRC research infrastructure staff (hereinafter referred to as the reviewers), 12 of whom came from a nursing or midwifery background and two of whom came from related health-care backgrounds. Their time with CRC varied from 1 to 12 years, with a mean of 4.5 years. Only three had not been involved in the project since the beginning of phase 1. Taken together, the respondents had undertaken data collection in all 12 hospitals in six health boards.
The data collection process
It was reported that the appropriate files had almost always been identified before their arrival in the health boards, which was a result of close working between the review teams and the audit departments. In only one health board was there a problem, represented by the quotation from reviewer 7. Occasionally, the patients whose notes had been identified did not meet the criterion of having had a stay longer than 24 hours, as a result of the coding system in this site. By liaising with the records department, this coding issue was quickly resolved by changing the criterion to a 36-hour minimum at this specific site. There were some concerns about the degree to which the notes were in fact randomly selected, and a comment by health board audit staff recorded in the quotation below indicates some degree of occasional selection bias, again reported and addressed. One group of reviewers noticed a surprisingly consistent selection of cases across the hospital such that they could predict in advance how many patients from each specialty they would receive, while others were sometimes surprised at the number of patients who had been admitted for the same procedure. There were no mental health notes per se in the sample as these are held in the mental health hospitals, rather than on the index acute hospital sites, although people with long psychiatric histories did feature in the general sample when admitted to acute wards:
The validity behind the randomisation of some of those has not always been felt to be uniform across the health boards sometimes. Sometimes the person pulling the notes has said ‘don’t worry – we haven’t picked the thick ones for you . . .’ We recognised that internally and challenged it and fed it back to the study team so that was tackled.
Reviewer 5
Medical records at [name] hospital in particular went through a very bad phase and had to actually because of shortage of staff stop issuing notes for anything other than clinics and admissions or anyone who wanted any records had to go and pull their own so you can imagine what chaos that caused . . . as we’ve gone on that problem has been resolved but that was a reflection of where the organisation was and the department was staffing-wise. I suppose that when they’re in the position of not even being able to send notes to clinics then research is not their priority.
Reviewer 7
Whenever possible, the reviewers undertook the reviews in groups or in pairs. Notes were reviewed under variable conditions, but all health boards tried to find suitable accommodation within space constraints.
Quality of the case notes
With the exception of one health board, the condition of the case notes was said to be poor, with those of deceased patients being significantly worse than those of the randomly selected admissions. In many cases, unsecured packs of loose papers, which might on occasion relate to several admissions, were placed into the flap of the case note cover. In such cases it was difficult to identify the sequence of the paperwork (which might moreover be undated) and this made it difficult for reviewers to state with confidence that a risk assessment or a do not attempt resuscitation (DNAR) form had not been completed, as it was entirely possible that the relevant paperwork had fallen out of the notes:
In the notes for deceased patients all the papers inside would be loose. There would be no order necessarily and in some of the health boards we felt the papers had just been thrown in. They were not filed. We saw that in quite a few deceased notes and we all spoke about it because we didn’t think it was particularly great. We wouldn’t know if anything was missing . . . we could be saying there was no DNAR form or no Waterlow score but it could’ve been there and could’ve slipped out.
Reviewer 2
Pieces of paper misfiled or missing or the dates not on them or just crumpled up. I don’t know if that’s just storage or because people have been in hospital for a long time so a lot of people are handling them. I think there’s a lot of cases where they tried to save money so they stopped putting dividers in and that makes things particularly difficult because people are in a hurry and they just shove it in the first place they can and if you’re looking at a couple of inches of notes you’re just going to put it in the front.
Reviewer 12
On occasion, papers such as prescription sheets lacked patient identifiers, and sometimes material relating to other patients was found in the file being reviewed. In some cases, file covers were ripped and held together with perished elastic bands that broke as reviewers removed them. On occasion it was simply not possible to identify the relevant admission from a thick volume, or indeed several thick volumes, of case notes and these had to be abandoned after 30 minutes of searching, although this was said to be rare.
There was variation in how the information in the case notes was presented. In one health board and in specific areas of others, a unified or collaborative notes system was adopted, with doctors, nurses and other health professionals writing in the same case notes. Collaborative notes allowed reviewers to follow a patient’s pathway and identify both problems and their resolution. When medical and nursing notes were held independently, tracing the journey of a patient, and checking whether or not a medical prescription had translated into nursing or other action, required much flicking backwards and forwards from section to section. Moreover, detailed records written by such people as physiotherapists might be held in their own department and did not find their way into the case notes, the only record being ‘seen by physiotherapist/nutritionist, etc.’ A quotation by reviewer 8 offers a picture of the difficulties presented by reviewing independently held medical and nursing notes, while a quotation by reviewer 12 illustrates the improvement brought about by unified or collaborative notes:
It’s all in sections and although you would start to look at the nursing records and tick off that you’ve seen those maybe when you then get to the clinical notes something might make you think I need to go back and look at that in the nursing notes and then you’re going backwards and forwards.
Reviewer 8
It was much easier when you could follow how the care was progressing and what the care package was and how they were thinking and the communication between the teams. It’s much more effective that way than having the nursing notes and the doctors’ notes and they don’t always marry and you’re trying to follow the care process and I think it must be quite difficult for clinicians as well because they’re time starved and they’re trying to find out what’s been done and what hasn’t been done.
Reviewer 12
The use of abbreviations was a challenge to all, including experienced nurses who had worked in many specialties. One group of reviewers created a list of abbreviations to which they added as they worked, but this was not exhaustive. One experienced nurse reviewer commented:
I think some people were just making them up.
Reviewer 4
Nursing notes were often found to be verbose and uninformative, with each entry, in the experience of one team, on each shift on each day beginning with ‘I introduced myself to the patient’. It was said that the quality of nursing notes tended to improve and become more business-like when collaborative notes were in use.
Not all data, however, were available in the paper notes. Discharge letters and laboratory results might be found on the clinical system accessed via the Welsh Clinical Portal, although two research teams were given no access to health board computer systems and were reliant on printed copies being inserted into the notes. Reviewers explained how they knew where to find which data:
We just worked it out as we went along. Because we work in all the health boards anyway we’ve got access to the clinical systems so if I was looking for some notes and it didn’t quite seem to add up or there was an admission missing or there was something – I might log on and see if there was any documentation uploaded that wasn’t in the notes.
Reviewer 1
The room we were using to review the notes we didn’t have access to the electronic system so we only ever used the written case notes.
Reviewer 8
Digitised notes
It was clear that, in discussing digitisation, there were two quite different strands. The first related to live electronic data (mainly operation notes, discharge letters, radiology data, laboratory results and, in some health boards, entire stays in intensive care) recorded onto the Welsh Clinical Portal; the second related to paper notes that had been scanned page by page, a process that resulted in a file consisting of images of written notes. The former could be tabulated and graphed, whereas the latter could be neither searched nor manipulated.
The one advantage of both electronic entry and digitisation was that, provided the reviewers were given access to the electronic system, there was the option of undertaking the notes review in the research offices, although, if being carried out in the hospital, access to a computer was necessary. Once accessed, the live data were easily found and were always clear, being typed rather than handwritten. However, the scanned material was more difficult to handle. In some cases, there was a ‘back scan’, a scan of all notes up to a given date, so reviewers had to work out where to find the data they were looking for. In some cases, the scanned notes were organised by date of admission and contained separate folders for the medical and nursing notes. In other cases, however, images of handwritten nursing and medical notes and prescription sheets were entered without any clear rationale as to what went where, and reviewers were often faced with 200–300 pages of scanned material including pages scanned in upside down, with no option to rotate the image. The problem was exacerbated by the system automatically logging reviewers out after 20 minutes, leaving them to log back in and start again. There was no ability to carry out a word search in scanned material:
Digitisation can be viewed as a predominantly positive thing . . . but the way it has happened is that the notes have been scanned in as an image which you can’t search within. If you wanted to have a look at certain documentation they are not always ordered within the right subheading so you can’t go to nursing notes and see all the nursing notes – you are looking at all sorts of information and 140 pages . . . Sometimes there seemed to be no rhyme nor reason to the way it was scanned in . . . it used to time out after 20 minutes and you’d literally have to sign in again – it was not user friendly.
Reviewer 5
The time taken to do a review varied from 15 minutes to 90 minutes depending on the LOS, the complexity of the case and the state of the notes. In the case of digitised records, the estimate was that they took 50% longer. One health board had invited the reviewers to give advice on their planned approach to digitisation and they were happy to do this.
The Harm2 criteria
All who had been involved in phase 1 of the study found the Harm2 tool to be a great improvement, finding the revised harm criteria clear, especially in conjunction with the accompanying guide, and reviewers grew in confidence as the study proceeded. The research teams valued the opportunity to discuss cases among themselves (which is why they carried out the reviews in groups) as well as the telephone meetings with research staff, which allowed them to air issues about which they were unsure. Sometimes, however, the patients and their situations were less clear, so that a judgement of harm was difficult:
They were clear. They were well explained and everywhere I went I took my Harm tool with me and we encouraged everyone to do the same.
Reviewer 6
It was really helpful and I think critical that we did our reviews in twos or threes sometimes so very often there would be a lot of discussion between us although the person doing the review would make the final decision. It was useful to bounce things around the team.
Reviewer 8
During the training of the nurses and in the monthly calls with the research team that took place throughout the life of the project, the criterion ‘unexpected death’ caused a great deal of discussion. The patient might not have been admitted explicitly for end-of-life care, but might have had serious conditions and comorbidities and it might have become clear during the course of their hospital stay that they were unlikely to recover:
There was quite a lot of discussion about that one because it says unexpected death, i.e. not an expected outcome of the disease during hospitalisation so I suppose that clarifies it a bit but people did struggle with that one.
Reviewer 7
Clinical reviewers from a non-nursing background struggled more than the nurses, being unused to the usual care pathways and the common medications. However, even experienced nurses struggled when confronted by a complex specialty of which they had no experience and with whose pathways they were unfamiliar. The following quotation is from a clinical research officer and, for reasons of anonymity, is unattributed:
Some criteria like errors in medication unless someone had written there was an error in medication, I was not in a position to comment. I know some of the nurses used to debate whether certain prescriptions were part of a normal clinical pathways but this would just highlight my own feelings of inadequacy being on the study because I could never comment on something like that.
In response to the question ‘Has there ever been an occasion on which you were unable to record an event which you felt was relevant?’, the answer was overwhelmingly ‘no’ because of the existence of the ‘other’ category. Within the context of the interview, it was difficult for people to recall how they had used this category, although one thought that they had cited evidence of lack of communication.
The review process was inevitably more difficult and took longer in respect of deceased patients. The admissions tended to be lengthier, their conditions were more complex and in most cases the outcome seemed fairly inevitable. There also seemed to be more harm criteria evident among the deceased patients. The more complex the case, the greater the temptation for the reviewers to qualify their findings and give more explanation:
We knew when we were coming to a set of notes that we could potentially find harm more so in the deceased notes . . . I think often they were more complex. I generally identified more issues with communication and documentation. I used to find that’s where the biggest gaps in care were. The documentation was quite poor at times. I suppose they had more going on.
Reviewer 2
A huge number of the cases at (name of hospital) were hospital-acquired pneumonias (HAPs) or community-acquired pneumonias which then came into hospital and got a second infection so I found that difficult to get my head round because I was thinking well that patient is immunocompromised anyway and it’s not really a harm, but eventually I started recording everything that was an HAI [hospital-acquired infection] then wrote a little note that said I don’t really think this was a harm event.
Reviewer 1
Several respondents spoke of the difficulty of reaching judgements on occasions when long-term mental health patients were admitted to a general hospital, in one case having taken poisons. This patient was said to have had a history of such admissions and, although there was no identifiable harm in the care received during the index admission itself, reviewers felt that discharge home on the following day was inappropriate. They did not know whether community mental health teams had been notified, but assumed that they would ‘learn the information from somewhere’:
Basically someone who’d taken overdoses and kept presenting at the poisons unit and being discharged and eventually a week later or so they successfully overdosed and died.
Reviewer 1
Respondents were then asked about the converse situation: whether or not there had ever been an occasion when they recorded a harm event that they felt was unjustified. Although some respondents replied in the negative, a few pointed out that in contrast to the clarity of the criteria, the patients and situations could be more ‘messy’:
Somebody falling in hospital. There had been a falls assessment and it was on record that they needed observation to do everything yet the patient continued to do something that they shouldn’t have and fell and broke something.
Reviewer 5
Sometimes it’s difficult because you’d say that’s harm because the drug hadn’t been given for an hour but then from nurses’ documentation you’d see they were very short staffed. So you’d say is it actually harm that the patient hadn’t had the drug for 2 hours but they had it and you can understand it – maybe a cardiac arrest or something.
Reviewer 9
One respondent told the story of a woman who had had a severe reaction to an antibiotic to which she was allergic. Her skin sloughed off and she did go on to die. However, the patient was in the intensive care unit, other strategies had failed to resolve the infection and there was a documented discussion with the patient detailing the risks before the drug was given so it was a calculated risk accepted by the patient, although in the end the patient was undeniably harmed. On another occasion, a patient had a routine tonsillectomy and was warned on admission that bleeding might occur during the following 48 hours and was given advice on what to do if it did. When the bleeding did occur, the patient followed the advice given, and was readmitted and dealt with successfully. Reviewers queried whether or not this qualified as harm in view of the surgical consent, the advice and the successful resolution. A third criterion that was questioned related to perineal tears and whether or not these qualify as harm. This was taken to a meeting and it was decided that grade 1 and 2 tears were acceptable, but grades 3 and 4 were not. When unsure, respondents reported that they tended to record everything and then put mitigating factors in the clinical summary.
We asked whether or not reviewers had any concerns about the loss of contextualisation when using the criteria and most felt that they did not because they were able to include the contextualisation in the free-text box of the RF1 and the clinical summary box of the harm tool. However, the small size of the box meant that they could not always say all that they wanted to say in defence of the care given:
Sometimes you’d want to write more so I’d end up writing in the box if I couldn’t find exactly what I was looking for . . . I’d usually say it was harm or potential harm and then write and explain myself in more detail.
Reviewer 9
With all my reviews I tried to give a back story – perhaps more information that was required so I was on the other end of the spectrum, really . . . It’s all a balancing act.
Reviewer 5
When asked if any other changes should be made to the criteria, one respondent queried whether anyone on antidepressants fell into the ‘mental illness’ category on the cover page or whether this should refer to people with severe and enduring mental health problems. The tool was said to be easier to use than the original RF1, although two people felt that there was room for streamlining it to make it clearer:
It had quite an ambiguous front cover criterion that various staff viewed differently. You had like learning difficulties, dementia or mental health diagnosis and if someone came in on antidepressants some people would query whether that counted . . . I thought overall the phase 2 tool was a vast improvement on phase 1. It always gave us room to explain ourselves so my clinical summaries were quite detailed.
Reviewer 5
The list of criteria was very much black or white and a lot of the situations were grey. And it probably would be better if instead of having antibiotics wrong/delayed/omitted, anticoagulants wrong/delayed/omitted it just said medications wrong/delayed and we added more to that. I remember having one drug (and I can’t remember what it was) but it didn’t fit into any of these criteria.
Reviewer 4
Grading preventability was deemed to be difficult. If a patient fell, reviewers pointed out, it was sometimes because they had chosen, against advice, to walk alone. Despite actively looking for harm in the notes, reviewers frequently mentioned that, overall, they found the quality of care to be reassuringly good. Palliative care was mentioned as a particular exemplar in two health boards:
We did review some palliative care notes as well and the level of care there seems to be a step above the care on general medical wards. A lot more now medical teams are having discussions with families about patient care and especially about patients who are dying and being quite honest and up-front and discussing patient care with families. That was something that came across to me as a positive thing. Looking at the palliative care notes I think we have a lot to learn. They look at the individual as a whole . . . pain, nausea and emotionally how they were feeling.
Reviewer 11
Three reviewers reported coming across nurses attributing lapses, such as late medications or a failure to contact relatives as early as they might have, to a shortage of staff. Staffing apart, reviewers were unable to record other hospital/trust factors. One expressed the view that issues such as ‘inadequate senior leadership’ were unlikely to find their way into the notes of individual patients.
A final word about the process related to the intense concentration needed to undertake such reviews and the emotional toll of undertaking mortality reviews, which has to be taken into consideration when planning a study such as this. Most were happy to have been involved in this study for only 1 or 2 days a month. At one point, one respondent had been asked to undertake these reviews as a full-time job. They declined. Another was eager to be assured that the study would not be repeated. A third went into detail:
Deceased patients on the whole have a longer length of stay and are more complex and also if you’re sitting there doing 10 of them it is quite emotionally draining . . . certainly . . . where they had a bigger team they put a sort of supervision session on for them. Because you sit there reading about death after death and of course you know at the beginning of the journey how it’s going to end and it can be quite harrowing.
Reviewer 7
Key learning for health boards
We invited reviewers to state what, on the basis of their reviews, would be their key messages to the health boards in which they had worked. Several reviewers were of the opinion that, although the level of care was generally good and the best interests of the patient were obviously central, the poor condition of the notes would raise concern should the family or a coroner ask to see them. In addition to the unsatisfactory management of the hard copies set out above, respondents cited poor handwriting, undated entries and prescription sheets without identifiers. It was pointed out that the condition of the notes constitutes a legal threat and also militates against good reviews:
I appreciate we are in stringent budgets of time, money resources but I am sure there is published work out there that accessing legible notes aids the clinical outcomes and treatment at all levels from all members of staff involved in patient care and that somehow ring fencing sums to employ a team of medical records staff . . . so there is time to refile dodgy notes.
Reviewer 5
There is an issue that comes up with the deceased patients which is this patient is deceased we can just shove everything in the notes – they’re not going to be looked at it again and really these are the ones that need to be in the best shape if the notes were going to be needed for any kind of legal case.
Reviewer 6
The members of one team of reviewers in particular were strong advocates for unified notes and wished to encourage more departments to adopt them. From the point of view of following a patient journey the unified record makes it easy for mortality reviewers, coroners (and, in the future, medical examiners) to check that nurses’ concerns are picked up by doctors and that medical prescriptions are enacted by nurses. Whatever the system in place, it was pointed out that recording care takes time and notes suffer when staffing is poor:
I would definitely say that collaborative notes would help a lot and you’d have to look at the conditions under which people were working as well. Do they have time to write in the notes and document their concerns? Do they have time to call social services? There were instances where people had done something but it wasn’t recorded in the notes but it was written later saying I ‘as requested I did this but was unable to document’.
Reviewer 12
Reviewers were of the view that the number of risk assessments (e.g. falls, thromboprophylaxis, pressure damage, nutrition) that were supposed to be carried out on admission was unrealistic:
My feeling is there are so many risk assessments they are never all going to be completed. We are talking 10, 20 different ones.
Reviewer 1
As indicated above, concern was expressed by several reviewers about the care received by people with mental illness on admission to general wards. It should be stressed that no reviewer was given the notes of people in mental health facilities but only records of their admission(s) to acute hospitals:
It seemed that these people were just being pushed around. The care that they had in the community was fine, there was no problem with that, but you could see them coming in previously with the same problems and had been discharged with the same problems and nobody seemed to know where to go. It was difficult because the actual care that the patient had seemed okay – the problem was that care around it. No one seemed to be taking responsibility for it.
Reviewer 4
The perspective of government and health board recipients of the data outputs
Thirteen interviews with health board and WG personnel were carried out in total: two with medical directors, five with assistant medical directors, one with a senior nurse manager, four with quality/improvement/governance/clinical effectiveness managers and one with a policy adviser to the WG. When asked about their reasons for agreeing to participate in the Welsh Harm study at its outset, respondents pointed to the fact that they had been engaged in the assessment of harm for many years. All health boards in Wales had participated in the 1000 Lives Campaign,11 one feature of which was the use of the GTT to assess triggers for harm. In addition, health boards have participated in a mandatory review of all in-hospital deaths since 2010.
Respondents reported that one of the motivators for participating in the Harm study was that they had wished for a more meaningful tool than the GTT. Their experience of this instrument had ultimately given rise to frustration, the results having formed a flat-line graph that had reflected neither deterioration nor improvement and that had ultimately failed to trigger change. There was an expectation of learning from the review process itself and a hope that the Harm2 tool would ultimately be a more sensitive and meaningful tool that health boards might use:
We were getting really frustrated with it because we weren’t finding anything new from the GTT and from [audit manager] point of view it was taking much more time to get everything sorted. We were getting difficulty getting people to participate. They were dropping off. Their interest was waning so we were looking to kind of invigorate that or have something in its place.
Health board 8
I would have imagined that there would be research publications teaching or helping us to understand the best ways to measure harm within the organisation. The second thing is we were learning from participation – i.e. looking at the data collection sheets and stuff and discussing the applicability live.
Health board 7
In addition to the wish to continue to learn about harm in their own organisation, respondents were motivated to join the study by their conviction that the Welsh NHS was a worthwhile research population that lent itself to a systematic study of harm. They also saw the study as having a good provenance and recognised the potential for well-trained staff, with externally funded time, to undertake careful reviews that would yield robust data and offer added value to the organisation. The promise of harm signatures and a comparison with all-Wales data was attractive in pointing to issues that needed to be addressed:
I think Wales as a community has an excellent opportunity to provide within a reasonable cohort – i.e. a population of 3 million – information that we can make sense of in terms of trying to get some understanding of what harm is occurring to patients in a systematic way.
Health board 3
When [SM] came to me and said we are doing this with Cardiff University I thought well we have the experience of doing this and I think it would be a really valuable thing to do because we really do need to get some mechanisms around measuring harm in a meaningful way and then the next thing for us I suppose is how do we convert that knowledge into the changing our behaviour and we can see an effect on that measurement.
Health board 2
Because of the aforementioned universal experience with the GTT, no health board reported encountering any problem in drawing a random sample for the study. The actual tracing of the notes did, however, present more difficulties, especially in the case of the deceased. Typically, notes are sent for coding at the close of a patient episode and are then entered into the mortality review processes before being sent to the file library. As, however, the study involved a sample drawn from patients admitted several months before, the notes might be anywhere on this journey. To allow for those that could not be traced, the randomisation process had to identify a larger than needed sample:
Particularly the notes of dead people are not properly filed back because by and large we are the only people who call on them and [name and name] who do all the notes collecting for us walked a very long way to retrieve some of these notes which were hidden in nooks and crannies across the organisation. So for patients who are deceased – their notes also go to coding but they get stuck in the same nooks and crannies and when they get back to the file library they are often just put in a massive big box and [name and name] have to sort through boxes of notes looking for names or numbers as opposed to having a numerical order on a shelf and knowing exactly where to go.
Health board 5
There are always challenges in finding notes but the advice received was you obviously target a larger sample to accommodate the required sample and that’s what we’ve done.
Health board 9
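The oversampling that respondents describe can be illustrated with simple arithmetic. The sketch below is not the study's randomisation procedure: the function name, the required sample of 50 and the 80% trace rate are hypothetical values chosen only to show how a larger random draw compensates for notes that cannot be found.

```python
import math
import random


def draw_oversample(record_ids, required, expected_trace_rate, seed=0):
    """Randomly draw enough records that, on average, `required` can be traced.

    `expected_trace_rate` is an assumed proportion (0-1) of sampled notes
    that can actually be located; it is illustrative, not a study figure.
    """
    if not 0 < expected_trace_rate <= 1:
        raise ValueError("expected_trace_rate must be in (0, 1]")
    # Inflate the target so that losses still leave the required number.
    target = math.ceil(required / expected_trace_rate)
    # Never ask for more records than exist.
    target = min(target, len(record_ids))
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    return rng.sample(record_ids, target)


# Hypothetical example: 50 reviews needed, 80% of notes expected to be traced.
admissions = list(range(1000))
sample = draw_oversample(admissions, required=50, expected_trace_rate=0.8)
```

With these assumed values the draw contains 63 records, so that losing around one in five sets of notes would still leave about 50 for review.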
As noted in the preceding section, some of the data were not to be found in the paper notes at all, but formed part of the electronic health record. While medical respondents, in particular, were enthusiastic about the ability to manipulate the live data during the mortality review process, the digitisation of paper notes, by contrast, detracted from it for reasons spelled out in detail by the RF1 reviewers. The quotation from a senior source at health board 4 confirms their experience:
(With) the Welsh portal you can graph and tabulate the whole thing through the entire admission so you can see what the sodium trend was and you can see when it started to go off and how does that correlate with the [modified early warning] score and did anyone escalate it . . . The beauty of digitisation is you can start to slice it in different directions so you can say I want to know the broad chronology or I wonder what the [electrocardiograms] are like. I want to cross tabulate the rise in troponin with some other biological or haematological variant.
Health board 1
Ours are not electronic records – they are scanned digital images and the audit team finds them an absolute nightmare because you can pick up a set of physical case notes and are guided by the colour coding but with digital records you have nothing. You cannot search for words – you have to go through the photographs one page at a time . . . digitisation is not going to make life easier for researchers and clinical audit staff.
Health board 4
Such is the difficulty of working with scanned notes that in another health board deceased patients whose notes had been digitised were said to be de facto ‘at the back of the queue’ for review. The explanation for this was that, given a backlog, the paper notes formed a more compelling and visible workload and were significantly easier to review.
Mortality data
After it was revealed early in the first interview that neither of the interviewees had received any of the outputs (the third, a designated recipient, had had to pull out of the interview at short notice), each subsequent interview was begun by confirming that the interviewees had seen the outputs. All had seen at least one set of data, but only five remembered seeing copies of the most recent outputs containing figures for 12 months. The discussion on the mortality data elicited an unusually consistent view, which was unsurprising given that it was a topic on which there had been much discussion across Wales. It proved impossible to separate out respondents’ views on the Harm2 data from views around mortality reviewing more generally. The requirement to review each death, it was explained by a WG adviser, was to determine the standard of care by seeking to assess whether or not death was expected and inevitable, and to reassure relatives that everything that could have been done was done. If this was not the case, opportunities for learning should be maximised:
When we started looking at mortality we want to get a better view around, not just was the death avoidable but also was it a good death or a bad death because a lot of the things that come across my desk when we get a lot of correspondence from aggrieved relatives and things is not necessarily that the patient died but actually the care until their deaths left many questions unanswered for relatives – lack of communication, poor care, etc. So what is it we need to learn about making end-of-life care better as well and could we have done something that might have changed the outcome or at least been such a hasty outcome so it’s more trying to make sure that we have a process that reassures the families, gave us assurance around the quality of care but also was a mechanism for driving improvement . . . So the whole thing is what are we trying to do better in terms of our care for those at the end of life.
WG
The mortality review process, it was hoped, would stimulate conversations with families about whether or not they had any questions or concerns about the death of their relative.
Although the detail varied from one organisation to another, all health boards followed the recommended two-stage process, albeit not always on all deaths. In three health boards, the first review, using the seven core questions of the Universal Mortality Review (UMR; which some had amplified), was carried out by the junior doctor signing the death certificate, but in others the review was carried out by senior medical personnel. If a cause for concern was identified (and health boards quoted the conversion rate from first- to second-stage review as having a range of 10–40%), the notes were then passed to an independent consultant reviewer (when junior staff had carried out the first review) or, in the case of a senior first-stage review, were reviewed in greater depth. In either case, concerns identified in the stage 2 review were communicated to the consultant in whose care the patient had died. No health board had any statistical data arising from stage 2 [other than incidents entered into Datix® (Datix, London, UK)] but the concerns identified by these senior doctors were gathered in thematic form. Although all respondents expressed the view that the reviews should in an ideal world be multidisciplinary, given the numbers involved (almost 300 a month in some large health boards), the logistical and workload implications were seen to be too great. Thus, medical review was the norm in all but one health board and, unless unified or collaborative notes were in use, harm linked to suboptimal nursing care would be unlikely to be picked up:
Nursing harms I think will be very poorly picked up in the case notes and the only times they get written down by doctors are if the outcome is something that involves a doctor doing something or, if I was being cynical, to make it quite clear where the responsibility for something lay. ‘Patient didn’t get their 10 o’clock medication.’ It isn’t normally put in there in a kind of supportive and engaging way it’s a don’t blame me for the fact that patient didn’t get their antibiotic.
Health board 6
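The scale of the workload respondents refer to can be made concrete with a rough calculation. The sketch below combines two figures quoted in the text (almost 300 deaths a month in some large health boards, and a first- to second-stage conversion rate of 10–40%). Applying the full range of conversion rates to a single board of that size is an assumption made only for illustration, as is the function name.

```python
def stage2_reviews_per_month(deaths_per_month, conversion_rate):
    """Estimate monthly second-stage reviews from first-stage volume.

    `conversion_rate` is the proportion of first-stage reviews that
    raise a cause for concern and proceed to stage 2.
    """
    return round(deaths_per_month * conversion_rate)


# Figures quoted by respondents; pairing them is an illustrative assumption.
deaths = 300
low = stage2_reviews_per_month(deaths, 0.10)
high = stage2_reviews_per_month(deaths, 0.40)
```

On these assumptions a large health board would face roughly 30 to 120 in-depth stage 2 reviews each month, in addition to a first-stage review of every death.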
There was a strong body of opinion that if detecting and eradicating harm was the key objective, then mortality review, whether carried out by the RF1 reviewers or by the health boards themselves, was not the most appropriate process. The reasons for this view, that the deceased are not an appropriate group in which to study harm, lie, first, in the admission to acute hospitals of patients whose deaths were either expected at the point of admission, or which at some point became inevitable, and, second, in the manifestations of the dying process itself. In relation to the first point, respondents reported that many admissions to acute hospitals were inappropriate inasmuch as it was clear that the person was terminally ill and would not benefit from medical intervention. These admissions were often the consequence of a dearth of alternative provision in the community and might come from residential homes, as emergency or general practitioner (GP)-mediated admissions, from nursing homes and even from the health boards’ own community hospitals:
You could argue that that patient shouldn’t have been in hospital full stop . . . they should be at home having a near normal quality of life with good palliative care.
Health board 3
We were finding patients in the community hospital were being blue-lighted into the acute hospital in the middle of the night and were dying within 3 hours of admission so it would have been better if they have just stayed where they were so that led to feedback to the doctors there about anticipatory care plans.
Health board 4
As to the nature of the dying process itself, it was argued by almost all respondents that some of the harm criteria are in fact a natural consequence of underlying pathological processes and the associated increase in frailty and immobility. Two triggers were frequently cited in this argument. One related to falls, which were often temporally, rather than causally, associated with impending death; the other was pneumonia, which might even, in complex cases, be a fall-back diagnosis unsupported by microbiological assay. The quotation from health board 10 comes from two respondents involved in one group interview:
I think if you look at falls and you compare fall rates between those who are in retrospect clearly dying during their last admission and those who are not, I think that regardless of your falls strategy your falls rate will be higher in that population because falls I suspect are as often as not the marker of impending mortality. The same will also be true of the other harm events particularly what people call hospital-acquired pneumonia, which I think is a dreadful term but the acquisition of pneumonia is often an agonal event. It is a terminal mode of dying.
Health board 1
An awful lot of the infections in the mortality group of patients are part of the natural process of dying . . . hospital-acquired pneumonia. It’s called the old man’s friend. People die of something don’t they? If they’ve been in hospital a long time that tends to be what goes down on the death certificate whether it’s true or not.
Health board 10
They are not confirmed by the lab.
Health board 8
It’s a catch-all diagnosis.
Health board 10
Overall, the hugely time-consuming mortality review process was picking up little besides the fact that very frail and elderly people, with complex patterns of morbidity, were dying, arguably inappropriately, in hospital:
I think if you’re going to detect harm and then make changes to prevent things happening I don’t think you should pick the deceased as your group. I think you should review all of your deaths but for a slightly different reason. I’m not saying that you shouldn’t do something when you detect harm in that review but I don’t think they are the right group as a sample to be saying you are going to do a harm study on. We need to be looking at admissions for the living to get our harm signature.
Health board 1
I’m involved in the mortality review thing in Wales and it seems fairly considered that it’s a pointless thing to do . . . because we’re dealing with a large number of expected deaths in a subselected group that doesn’t reflect the population that you’re concerned about. For example if you’re taking frail elderly deaths – I was involved in the RAMI [risk-adjusted mortality index] stuff personally and all we saw was it’s a proportion of people who die in which location but the total number of people dying aren’t changing. Crude mortality in England and Wales – if you add in hospitals and nursing homes it’s exactly the same crude mortality it’s just there are more nursing homes in England per head of population.
Health board 7
Harm2 data
As was the case with the mortality reviews, health boards also had their own data on harm. At the national level, the Datix incidents are aggregated onto the National Reporting and Learning System and, in addition, serious incidents and the results of their root cause analyses are reported directly into the WG. Most respondents pointed out that the incidents recorded on Datix suffered from the disadvantage of requiring staff to decide, and have the time, to make the report.
In addition, respondents made reference to complaints and to patient satisfaction surveys as sources of harm data. However, a number of respondents either thought, or knew, because they checked patient numbers against their own systems, that the Harm2 instrument had picked up issues in patients unidentified by health board systems:
It is picking out harms where defects in care had been determined to have contributed to a death. I cannot say with confidence that other mechanisms would so dispassionately come to such a conclusion. Without looking at it my prejudice is that the patients you have identified in this probably have not been identified through other mechanisms. That has been new information for us.
Health board 2
One respondent, however, while acknowledging this, also stated that there were some cases in which harm identified by the health board using UMR had been missed in the Harm2 study. When asked if there were missing criteria in the Harm2 tool, the respondent stated:
No, I think it is a reflection of manually reviewing case notes. Case notes are disorganised, unstructured collections of information and people will see things in them that other people don’t. Things like legibility to the way things are filed to the way people interpret what people have written and it is the nature of the paper case note, I think.
Health board 6
We invited respondents to give their views on the added value of the output data derived from both mortality and harm reviews. For the vast majority of respondents, the key value of the outputs lay in their ability to validate or triangulate existing internal processes. If the harm signature peaked in places already known to present problems, such as infections and communication, then health boards could be assured that their own processes were picking up the appropriate issues. The same was true at government level:
We learnt that handover was difficult – we certainly learned that sepsis was an issue . . . We do have the outputs from the stage 1 reviews and those are showing very similar patterns to what are being described in the information that you are coming up with. We are learning that sepsis in terms of chest infections, hospital-acquired pneumonia is being found here to be the major source in those patients that die as a major contributing factor to possible avoidable death.
Health board 3
It’s given us that reassurance around the 1 in 10 or whatever we’re not an outlier compared to other international studies.
WG
Such reassurance, at both the country and health board level, was not a small matter. As indicated, mortality reviewing in Wales was being carried out differently by the various health boards and there had been much lively discussion of the pros and cons of the processes adopted. One respondent stated:
As you know we’ve had various iterations of mortality review going on in Wales . . . One of the challenges has been because the mortality reviews have been adopted very differently across Wales, although there is now an endeavour to standardise it, we have very good rates of stage 1 UMR rates, pretty close to 100% (it took a while) and we do that by getting the junior doctors to do the stage 1. There is a proposal floating about and in some health boards they’ve adopted a system where they have a very senior doctor doing the stage 1 review but their rates of completion are quite low . . . What I find helpful is that these reports have triangulated quite well with what we’re learning from our own process and its given the board some assurance that the process we’re using for identifying potential harm is reasonably robust.
Health board 6
The above quotation also nicely demonstrates the second strength of the Harm2 study, which was that it was applying a consistent methodology across all health boards in Wales such that data could be compared, hospital by hospital, and aggregated with confidence.
The third added value of the Harm2 study lay in its independence. The point was less that the reviewers were free from any suspicion of organisational pressure than that they were not blinded to problems that local reviewers might accept as normal:
I think the other governance mechanisms have a greater risk of suffering from a cultural normalisation. You know the windows are always broken so why would I complain about the windows being broken. Nothing ever happens. What this has the potential to do is it’s relatively protected from that. It is picking out harms where defects in care had been determined to have contributed to a death. I cannot say with confidence that other mechanisms would so dispassionately come to such a conclusion.
Health board 2
A fourth value was its external funding, which meant that the RF1 reviewers could devote more time to the process than those who had to balance their reviewing work with patient care. Such was the work involved in the admittedly sophisticated Harm2 review process, it was argued, that it put the tool out of reach of clinicians for routine use for every inpatient death:
There is no way on God’s earth we could get our staff to do it. They are quite comprehensive. They are really good for a research study by research nurses when it’s all they do but we could never use that tool as a general tool . . . it is just not practical to use something as complex and detailed as that.
Health board 8
A fifth advantage of dedicated time for review was the thorough nature of the process. Whereas the RF1 reviewers had the time to track the patient journey through both medical and nursing notes, health board reviews were largely a review of medical care only, unless something alerted stage 2 reviewers to delve further:
The overarching mortality review thing is a narrative doctor type approach to was there any grand problem in care and its very doctor focused as opposed to multidisciplinary. Some places are doing multidisciplinary but generally it’s a kind of doctor focused thing as a prelude to the medical examiner’s role.
Health board 7
Not only do doctors think that pressure ulcers are nothing to do with them, they certainly don’t think it’s their job to describe them in a case note. So I have to say that the most common and significant harms that I know are not reported by doctors. I think the UK rate reported through [Hospital Episode Statistics] is 0.1% whereas we know it’s 16% or something.
Health board 6
Finally, and by far the most useful addition to the information already held by health boards, was the fact that the Harm2 study used the same tool to assess the rate and types of harm in the living and in those who died, allowing comparison of the two rates. Given the significant amount of time that health boards were investing in the mortality review process, it was important to know whether these findings could be extrapolated to the 98% of patients who were discharged from hospital or whether the harm profile in the dying was quite different:
We’ve very much focused on the death review off the back of Palmer12 but if the characteristics of harm are the same then it’s potentially going to be extrapolatable (sic) back into the wider community because we’ve got 1.5–1.8% of hospital episodes end in death in the acute sector. So one of the problems I have is . . . if we’re focusing on the deaths what are we not focusing on? If the characteristics are the same then the learning will be more widely spreadable if it’s the same pattern whereas if we understand that there is a different characteristic of harm associated with death, then we can focus on that characteristic and that would potentially have the biggest yield.
Health board 11
To me it is some sort of reassurance that our mortality reviews are not missing harm in the patients that live. Because we do need to look at everybody. In a way this preoccupation with those that died I can see why we do it but we need to be looking at those that live as well.
Health board 4
In addition to the plaudits, there were also some criticisms of the study. One related to the overall length of the study and the lack of timeliness of the feedback. Part of this related to respondents receiving output data neither directly from the research team nor from recipients within their own organisations, despite their having titles that indicated responsibility for quality and patient safety. The other related to the different priorities of universities and health boards, and the latter’s perception of the former’s propensity to hold on to data for too long:
The biggest criticism I have always had of researchers generally is their propensity to hold onto things. It is a real danger. What they eventually try to do is create a product that they can own and sell and in this type of environment I think we should use the study to prove what we need to know to tell us what we need to do and then as organisations in health care we need to get on and do it. Not for it to remain in the domain within the university. Let’s get it into organisations in parallel with mortality reviewing.
Health board 1
One respondent pointed out that the screening tool included triggers that would seldom, if ever, be reported as critical incidents. The respondent below cited criteria by numbers and, thus, the quotation is reproduced in full with the addition of the explanatory wording:
Just yesterday I was asked by [name] to say which of these were harm incidents and which should result in incident reports as we were kind of trying to cross-reference with Datix and that’s quite an interesting question because number 1 [unplanned admission within 30 days] and 2 [unplanned admission within 30 days post discharge] never get reported. Number 3 [hospital-incurred accident or injury] nearly always. Number 4 [adverse drug reaction, side effect or drug error] sometimes. Number 5 [unplanned transfer to intensive care/high-dependency unit] never. Number 6 [unplanned transfer to another acute care hospital] never. Number seven [unplanned return to theatre] never. Number eight [unplanned removal, injury or repair of organ or structure during surgery] very rarely. Number nine [other patient complications to include MI, DVT, CVA (cerebrovascular accident)] never really, only if the DVT arises as a lack of prophylaxis. Neurological deficit never. Unexpected death never reported except by psychiatrists. Inadequate discharge planning – there have been one of two of those over the years. Cardiac arrest hardly ever. I could go on. So what we consider critical incidents is nothing like that list.
Health board 10
The respondent quoted went on to point out that when undertaking mortality reviews, unlike the Harm2 reviewers, clinicians would only ever identify actual, rather than potential, harm. Omitting medication for several days was viewed as harm only if the patient suffered as a result. Other interviewees, while acknowledging the value of alerts to potential harm, nevertheless struggled to interpret them:
You have to have the error and the consequence together before people treat it seriously. We should be on the one in six drug errors, I agree, but the word harm means the patient’s been harmed, not that the drug has been delayed. We get this argument all the time with our legal department.
Health board 10
I talked about the triggers in the Global Trigger Tool and the difference between the criteria and whether it’s possible harm or an actual harm. When do you really need to worry about this because the possible harm is a possible harm? So really getting your head around ‘is that something I really need to react to?’ is also something that I struggle with more than I expected to.
Health board 4
Two reasons for the focus on actual serious harm were given. One was the size of the health boards, which were likely to see hundreds of incidents a year across multiple sites. The other related to the modest staffing of quality and safety teams, who had to focus their efforts on the most important issues.
Management of the Harm2 outputs
What happened to the data on arrival varied from one organisation to another. In three health boards, senior medical and governance staff responded instantly to data suggesting that patients had been harmed in their care. In two organisations, respondents immediately located and reviewed the notes so that they could answer the detailed questions that would inevitably be raised by the data. A third health board compared the harm data in the outputs with information on the clinical incident recording system, Datix, and also determined whether or not these cases had triggered a second-stage review within their own mortality review process.
I just went through the G, H and I categories I thought we have a lot of Is here and so I had those notes called so I could scrutinise them and draw some conclusions to discuss with the medical director.
Health board 5
What we’d agreed that we would do was marry up the information of the deceased cohorts with our mortality review process and certainly in all instances check incidents on Datix.
Health board 9
Most of the comments from those who had carried out this checking review related to the point made above – that harm in the dying is difficult to tease out from catastrophically compromised physiology:
One man died of uraemia but again had metastatic bladder cancer. He was 98 [years], had carcinoma with multiple metastases, so this was an expected death. Yes, there was a delay in care but that was caused by risk management. The researchers do not contextualise these issues they just identify the criteria so from a research perspective they are right removing bias just looking for the criteria but the delay in care was due to risk management but this is a man who was 98 [years] who became ill very quickly and died very quickly and we would not have wanted to handle it any other way. Family involved and that sort of thing.
Health board 5
So that’s the bit I’m struggling with trying to quantify how much the hospital-acquired pneumonia is just a failure to recognise the dying process and an inappropriate escalation of care and how much of it is something we could potentially turn around. The trigger is the trigger but it’s understanding what that’s telling us about the context. You could argue that inappropriate treatment is harm in that situation.
Health board 11
In two health boards the data outputs as sent had been taken to committees, one to a mortality review committee and one to an outcomes group. A third health board presented the data to the Quality and Safety Committee and a fourth made reference to it within a more general report on mortality review. Others had felt that the data, as presented, were unsuitable for presentation to a group that did not have in-depth knowledge of the study, being difficult to grasp quickly and interpret meaningfully.
In the appendix most of the definitions are covered but I think it probably needs to be elaborated for the uninformed reader. Given this has the potential to end up at a board and as such a public document, it probably requires more of an easy read English explanation.
Health board 2
I think my only comment with things like that would be I know a fair bit about the study and so do people around this table but if you put down that report to somebody completely cold they’d look at it and just go, ‘what?’ There’s a lot of explanation they don’t understand what the criteria mean . . . even our Medical Director looked at it and said you want to talk to me about this? . . . and he’s a really bright bloke but it’s not intuitive, this, for people to understand. If you’ve been doing this sort of research and you live and breathe it then its fine isn’t it, but it’s not something you could present to a clinical group who didn’t have in depth knowledge of the project as data like that. I don’t think.
Health board 8
In three health boards, the data had been sent to clinical directors, but one medical recipient, who had discussed it with colleagues, supported the view that it was too complex for them to assimilate sufficiently quickly:
I think it’s brilliant work. I just wish it was more digestible to my average colleague. I kind of sat there for half an hour looking at one sheet of paper going mmm mmm whereas some of my colleagues would say oh it’s just another thing and it goes in the bin. You have to be understandable in 5 seconds what you’re trying to say.
Health board 7
What recipients really craved was unambiguous scientific data that would help focus their improvement efforts proactively rather than, as was happening at present, following one directive after another:
In an ideal world you use semiscientific data to direct your resource to your biggest problem which would be a nice world to live in because the world we live in at the moment is sort of fire fighting so 1 minute it’s dementia then the next minute it’s medicines error then it’s C. Diff [Clostridium difficile] whichever hits the panic button and people rush around doing those things trying to improve safety in those areas.
Health board 10
A number of respondents, from the WG to health boards, had been excited by the prospect of harm signatures for each health board that would point to areas for improvement. For reasons of confidentiality, of course, health boards were sent only their own detailed data together with the all-Wales figures for overall harm, and one criticism of the outputs was that they failed to give benchmarking data that might alert organisations to the fact that they were outliers in respect of the presence of a specific harm criterion. As an example, in the case of hospital-acquired infection, it would be useful to know whether or not an individual health board or hospital was significantly above the Welsh average. This was a problem the Harm2 study shared with other harm tools, such as Datix incident recording, which delivered numbers of incidents that had occurred but left health boards wondering how to turn this into useful information that would lead to meaningful improvement:
When [name] first talked about the results in the first part of the study she spoke about harm signatures which is a great concept but it is most helpful and meaningful if you compare it to something else. If you look at your own signatures you don’t know how different you are to anyone else. I mean we all have health care associated infections and its most helpful knowing what are the things where we are significantly different so I think doing this just on its own just looking at your signatures yes you always look at your peaks you always want to know what those things are but if you know you’re very different here that’s what really helps.
Health board 4
The Datix system, which has got thousands and thousands of cases in it, suffers from the same problem which is if you ask it what themes are critical incidents it will come out with a graph and it’ll say medication errors, falls, miscommunication la de dah de dah and that’s the same every month so what do you do when you print off that report? You think OK nothing much has changed . . . that’s a problem of databases universally isn’t it. You collect lots of data but how do you turn it into useful information?
Health board 10
Reference was made above to the fact that independent review, while in many ways positive, had the negative consequence of divorcing learning opportunities from clinical practice. In two health boards the view was expressed that the centralisation of their own stage 2 review process was also guilty of this. In these organisations, the stage 2 reviewing had recently been devolved to sites or directorates to maximise the learning for clinicians, although it was recognised that persuading doctors to take on this onerous task was not easy. An additional benefit of devolving review lower down the organisation was that, especially in tertiary centres, it is difficult for non-specialists to make judgements about the quality of (medical) care:
What we then moved to doing was identifying the mortality needs for the specialties and feed the notes to them. So we’ve taken some time to get to that but we’ve got that. There are a lot of difficulties around support for that process. Very person dependent because we’re doing it manually but we’ve got it up and running. The other thing that is a confounder is the different cultures on the three sites but we’ve now got it down to a common process that has some outputs.
Health board 2
In a huge centre as we are there are so many tertiary specialties it would be very difficult for anyone who did not have an in-depth knowledge of cardiac surgery to be able to critique that work. What we do need to do, and we’re gradually moving there, is for individuals to be really enabled to be honest and open about the care in which they are participating and to have that enquiry as a matter of course. Have we provided the best care that we could and if not what do we learn from it?
Health board 5
Ultimately, respondents were looking forward to the establishment of a medical examiner who would review all deaths in Wales. In addition, as indicated, respondents looked forward to the further development of live and manipulable data on the Welsh Clinical Portal that would make case review simpler and more meaningful.
Although the statistical data provided by the Harm2 study were novel, their value in the early months was tempered by the relatively small sample size at health board level. It was only after the accumulated data for 1 year were distributed that respondents felt the numbers were such as to warrant serious discussion:
I think in terms of the numbers, I’m not saying individual patients aren’t of value but for the quality steering group to really make any sense of it they don’t want ‘this month we’ve had three cases.’ They are not going to be able to strategically steer on that basis . . . This [output data] is very helicopter – It indicates what questions I would want to ask, like why is there a difference between the hospitals? But then when I look at the absolute numbers they may be within confidence intervals so I’d want somebody to tell me whether it was outside 0.05 p-values or whether and even then it could be still by chance. What are the outliers here? Between our two hospitals – you know the numbers are very small.
Health board 1
To conclude, at the level of individual boards (as opposed to country level) the study had just begun to produce useful data as it came to an end:
And please do more of it, because this sort of data to me is gold dust – it needs to get into the culture of everything we’re doing and I’d really hope for harm studies to expand and maybe get to be a routine part of the way that we’re working. It’s been really useful from my personal perspective.
Health board 7
Summary of key findings
The reviewers found few problems in using the Harm2 tool, finding it clear and unambiguous, although the judgement as regards preventability did present challenges that were relieved by discussion with colleagues. The greatest difficulties, however, arose from the condition of the notes themselves which, particularly in the case of patients who had died, were found to be in some disorder. The form of digitisation found in one of the health boards, in which pages had been scanned onto the system as images, exacerbated the problems and made it difficult to uncover the narrative of the patient’s stay and whether, for example, actions prescribed by doctors had been carried out by nursing staff. The Harm2 data were well received by those senior staff in health boards to whom they were sent, but were neither shared nor discussed to any great or consistent extent beyond this group and it was thus not possible to trace any clinical changes to practice as a result of the work. Rather, recipients used the data to assure themselves that serious AEs had been picked up by their own systems and/or by retrieving the notes to determine whether the index event had changed the patient outcome. In the vast majority of cases, where AEs had been identified in the mortality sample, it was claimed that multiple pathology in frail individuals had made the outcome inevitable. These respondents questioned the usefulness of mortality review in detecting harm in the wider population. Although the health boards had no harm data that directly compared with those arising from the randomly selected discharges review sample, respondents were reassured to find that the themes coincided with those featuring in their own Datix systems and complaints records.
Chapter 6 Summary of findings, discussion and conclusions
Overall study findings
Adverse events are common in Wales and around half could potentially be prevented if care was delivered to an expected standard in normal circumstances. The same proportion of AEs results in excess bed-days in inpatient settings to manage the clinical complications. In Wales, where AEs resulted in prolonged hospital stays of > 24 hours in duration, the median excess LOS was 6 days. In assessing the episodes of care in which at least one AE was identified, the overall quality of care was still rated as being excellent or good in 71% of cases.
Measuring harm in Wales: phase 1, comparison of the Harvard and Global Trigger Tool methodologies
In phase 1, the overall percentage of patients identified as having experienced an AE using the two-stage retrospective review process was 10.3% (95% CI of 9.4% to 11.2%), with moderate inter-rater reliability at the screening stage (κ = 0.47) and good inter-rater reliability at the AE determination stages (κ = 0.63). The percentage of AEs from individual study sites ranged from 7.9% to 16.1% with an attendant reported preventability of 40% to 87% (mean 51.5%, 95% CI 46.88% to 56.12%).
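As an illustration of how an interval of this kind can be derived, the sketch below applies a normal-approximation (Wald) interval to the mean preventability figure. The report does not state which interval method was used, and the denominator of 450 AEs is an assumption taken from the 309/450 figure quoted for the psychological impact assessment; the function name `wald_ci` is hypothetical.

```python
import math


def wald_ci(proportion, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Assumes a simple random sample of n independent observations;
    z = 1.96 gives an approximate 95% interval.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / n)
    return proportion - z * standard_error, proportion + z * standard_error


# Assumed inputs: 51.5% preventability among an assumed 450 AEs.
low, high = wald_ci(0.515, 450)
print(f"{low:.2%} to {high:.2%}")  # prints "46.88% to 56.12%"
```

Under these assumptions the result agrees with the reported interval of 46.88% to 56.12%, which suggests, but does not confirm, that a normal approximation was used.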
In this Welsh cohort of patients, length of hospital stay and being an elective admission (and, therefore, more likely to undergo invasive surgical procedures) were independently statistically associated (p < 0.001) with experiencing an AE. The case mix between medical and surgical patients appears to influence the rates significantly, with events occurring more frequently in surgical admissions. Thus, the percentage of patients affected was higher in sites whose sample contained more surgical patients. We did not observe any seasonal trend in the determination of AEs across NHS Wales, although a small increase in AEs was detected over the first 2-year period. It would be difficult to establish if this was an actual change or related to the growing experience of the reviewers in detection over time.
The breakdown of problems in care underlying AEs revealed that failure in clinical monitoring (33.5%), failure in infection control (29.4%), incidents directly arising from surgical and other invasive procedures (21.2%) and incidents arising from the administration of drugs and fluid (18.7%) were key contributors. Of these, 15.7% resulted in a level of disability not present on admission, 74.0% resulted in an extended LOS and 8.9% were associated with, but not necessarily causally related to, the inpatient death. An assessment of the psychological impact of the AEs identified in Wales was completed in only two-thirds of cases by the research physicians (309/450) and 36% of patients were found to have experienced at least a moderate degree of emotional trauma.
Comparison of the Global Trigger Tool and the two-stage retrospective review process rates of harm across NHS Wales
We set out to compare two approaches to harm measurement: the two-stage retrospective review process and a derivative of the IHI GTT. We conclude that, after comparing the extent and nature of AEs identified by the two methodologies in matched-pairs analysis, the gold standard approach offered advantages over the GTT in terms of the consistency of the approach, the frequency of AEs identified and the categorisation of AEs from the spectrum of triggers identified. We detected a difference in overall rates of 2.3 percentage points, with the reported percentage of patients experiencing an AE being 10.3% and 8.0% in the two-stage retrospective review and GTT processes, respectively.
The two-stage process was project managed across all sites, with the research team ensuring that protocols were adhered to and problems addressed at an early stage. The GTT review, however, was an independent NHS-led process and, although a national protocol for the use of the tool was available and training provided to all organisations, there was considerable variation evident across NHS Wales in the way the tool was being used, the composition of the review teams and the way in which the data generated were entered and used. When the tool is not implemented robustly and as recommended, errors in identification and therefore measurement may go undetected. When we examined the 153 AEs identified through the GTT but not the two-stage process, we estimated that the research team may have missed < 10% of these, with the majority of the remaining AEs resulting from the inappropriate attribution of multiple positive triggers to a single AE in an episode of care. The professional composition of the review teams also influenced the determined rate. Strong multidisciplinary teams generated rates of harm comparable to, or higher than, the research team. Teams led by non-clinically trained personnel typically missed a significant percentage of AEs in inpatient records.
There are two primary purposes of investing in infrastructure to support case note review. The first is to provide a level of assurance, at the corporate level, that the problems that patients commonly experience as a result of care are not significantly higher than expected or commonly reported. The second is to examine the outcomes of patients who experience AEs to learn from what happened and to raise awareness, or inform improvement activity or remedial action. In terms of providing assurance, the data provided by the Harvard method are comparable with previous studies. The GTT data can also be benchmarked, but the findings in the Welsh context suggest considerable under-reporting and significant variation in rates of harm reported across organisations, providing few data for assurance purposes. The actual difference identified, of 2.3 percentage points, is not significant, as the internal validity of the tool in this context is in question. Similarly, the GTT was limited in its scope to reliably identify recurring AEs where improvement priorities could be generated, whereas the Harvard method provided a classification of main problems in care, highlighting actual clinical harm and key areas across an organisation in which the delivery of processes of care may be substandard.
Testing and understanding risk associated with screening criteria
Using a trigger list of criteria, which are indicative of risk of health-care-related harm, has been found to be an efficient way of sifting out a selected group of cases for review. Experience with the GTT indicated that reviewers tended to identify one or more of the same four or five triggers across cases, leaving the other triggers virtually redundant.
Analysis of patterns of use of the criteria aligned with harm events indicated that a number of these, such as transfer to another hospital, were ambiguous in that they could reflect both problems in care and also good-quality care, if that transfer was part of good management. In-depth thematic analysis of four of these criteria found that they were poor at picking up omissions in care or specific subcategories of harm, and new categories of screening criteria were identified. Importantly, the new subcategories generated include acts of omission, such as missed and delayed diagnosis or treatment, as well as breaking down injury into categories such as falls, pressure ulcers and injury resulting from the use of equipment. This subcategorisation, we believe, will make it more likely that harm risk and AE data will be used by a variety of clinical and professional groups within organisations, thereby enhancing the opportunities for organisational learning.
Furthermore, as we empirically identified acts of omission that were associated with the risk of developing AEs in practice, we directly addressed criticisms of pragmatic tools that they do not identify acts of omission and are sensitive only in picking up acts of commission. The new screening criteria were included in the development of the Harm2 tool.
The development of the Harm2 tool
The Harm2 approach was developed in order to circumvent issues arising from using both the GTT and two-stage retrospective review process in routine adverse event surveillance in Wales. In the meantime, NHS Wales made a decision to phase out the use of GTT from 2013 onwards. There were also concerns related to the suitability of the two-stage retrospective review process as a tool for ongoing harm surveillance. Information generated by this process is both quantitatively and qualitatively rich, but the method is labour intensive in both physician and nursing time. Furthermore, the administration of the review and subsequent data entry takes as long as the review of the inpatient episode of care, and the data sets generated are unwieldy for routine use. The key time investment in case note review methodology is the time taken to read and review the full episode of care. Our study showed that this can vary from 10 minutes, in the case of a straightforward surgical procedure, to > 1 hour in a complex episode of care. In the two-stage process this reading and review process is undertaken twice (once by the nurse who screens the notes and once by the physician reviewer) for each episode of care assessed, an approach judged not to be viable in a routine surveillance programme.
A key consideration in the development of the Harm2 tool was the need for a one-stage process that would provide a robust measurement of AEs and their preventability, together with adequate detail of this harm in the form of a narrative account to allow its contextualisation. The tool was designed to provide hospitals with key information that would drive change in clinical processes and practice through the identification of common problems in care. The Harm2 tool is a condensed version of the traditional gold standard two-stage approach. As in the GTT, cases are initially screened to detect those that are at higher risk of health-care-related harm, with the reviewer going on to identify if a harm actually occurred in the high-risk cases. Unlike the GTT, the reviewer reviews the entire case record and also provides a narrative account of any AEs found.
Measuring harm in Wales: phase 2, the Harm2 tool
In phase 2, the percentage of patients identified as experiencing an AE was 11.3% (95% CI 10.22% to 12.40%) in the sample of randomly selected discharge reviews and 30.1% in the randomly selected deceased patient reviews (95% CI 28.13% to 33.82%), with 59.6% (95% CI 55.29% to 63.91%) and 61.7% (95% CI 57.49% to 65.91%), respectively, being identified as preventable. The inter-rater reliability of the Harm2 tool was moderate in both the randomly selected discharge reviews (κ = 0.50) and the randomly selected deceased patient reviews (κ = 0.54) cohorts. Comparison of a 10% sample of nurse-led reviews with two-stage physician reviews also indicated moderate reliability (κ = 0.45). Comparing the breakdown of problems in care categories in randomly selected discharge reviews and randomly selected deceased patient reviews, problems in assessment, investigation and diagnosis (p = 0.036), problems in the administration of medications including i.v. drugs and bloods (p = 0.0042) and problems arising from surgery and other invasive procedures were less common in the randomly selected deceased patient reviews cohort than in the randomly selected discharge reviews. In addition, infection control problems were significantly more common (p < 0.0002) in patients who had died.
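For readers less familiar with the κ statistic, the following minimal sketch shows how Cohen's kappa is computed from a cross-tabulation of two reviewers' judgements. The counts are hypothetical and chosen only for illustration; they are not study data, and the function name `cohens_kappa` is an assumption rather than anything used by the research team.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] is the number of cases placed in category i by
    reviewer A and in category j by reviewer B.
    """
    size = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of cases on the diagonal.
    observed = sum(table[i][i] for i in range(size)) / n
    # Chance agreement: product of the reviewers' marginal proportions.
    row_props = [sum(table[i]) / n for i in range(size)]
    col_props = [sum(table[i][j] for i in range(size)) / n for j in range(size)]
    expected = sum(r * c for r, c in zip(row_props, col_props))
    return (observed - expected) / (1 - expected)


# Hypothetical counts: rows are reviewer A (AE, no AE), columns reviewer B.
example = [[20, 10],
           [10, 60]]
print(f"{cohens_kappa(example):.2f}")  # prints "0.52"
```

In this made-up example the reviewers agree on 80% of cases, yet κ is only about 0.52, because 58% agreement would be expected by chance alone given how often each reviewer records an AE.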
Identification of patient-level factors predisposing individuals to risk
Attempting to predict which patients are most likely to experience AEs in inpatient care is both attractive and risky, as we, and others, have consistently demonstrated that AEs occur across the age and inpatient spectrum. 14,16,32–39 Risk factors have been proposed but, to date, little work has been undertaken to test theoretical assumptions. 60 For example, physiological reserves have been proposed as an alternative hypothesis to age per se. 60–62 Work undertaken to date has been dependent on hospital administrative data sets to generate variables to test in regression models with varying outcomes. 35,63
Elective admission and length of hospital stay were independently statistically associated with experiencing an AE in phase 1 of the study. However, longer length of hospital stay, associated with a nearly 40% increased risk of incurring an AE, is a difficult variable to interpret as a risk factor for AEs. The increased LOS may arise from longer duration of exposure or may just reflect the fact that the LOS was longer because the patient was injured and required further inpatient management. Similarly, being an elective admission and, therefore, more likely to undergo surgical intervention was associated with a 17% increased risk of AE determination.
In phase 2 we prospectively collected Charlson Comorbidity Index chronic disease data from all patient records included in both samples. Dementia, PVD or hemiplegia was associated with at least a twofold increased risk of experiencing an AE. Also significant were age and Charlson Comorbidity Index scores, with ORs of 1.12 and 1.10, respectively. Further examination of the nature of the events in patients with dementia, PVD and hemiplegia indicates that the majority of events, in all likelihood, result from underlying physical and cognitive impairment and comprise falls and pressure ulcers, with a smaller number of other events arising from drug administration or reactions or hospital-acquired infection. The crude AE rates in these groups were 22% in patients with dementia, 24% in patients with PVD and 22% in patients with hemiplegia. The current assessment of risk in these patients, compounded by failings in providing a safe inpatient environment, appears to predispose these patients to ward-based AEs. 64
In the sample of deaths, age was not a significant factor, with the mean age of inpatient deaths being 78.9 years, and no chronic disease category was associated with increased risk in patients who died during the inpatient episode. Furthermore, the risk associated with elective admission, rather than being increased, was twofold lower than the risk associated with emergency admission, reflecting the lower level of surgical-related AEs in this cohort.
Difference in the composition of adverse events in the sample of randomly selected discharge reviews and deceased patient reviews using the Harm2 tool
In phase 2 of the study, we explored differences in the extent and composition of AEs between a random sample of all hospital discharges and a random sample of patients whose hospitalisation ended in death. Significant differences arise in the comparative analysis of the breakdown of problem in care categories in these independent cohorts. A number of striking differences are evident, particularly in problems in care relating to infection control and those arising from surgical or other invasive procedures. The areas in which significant differences were not detected were failure in clinical monitoring and failure to respond to changes in the patients’ conditions. These problem in care categories include aspects of care such as venous thromboembolism prophylaxis, falls and pressure sore risk management and monitoring, the monitoring of vital and neurological observations and the monitoring of food and fluid intake and output, in which differences would not be expected.
Innovative methods highlighting organisational risk and patient experience
Organisational signatures of harm
As part of the quality assurance process of checking the RF1s as they came into the research office, we became familiar with the types of AEs identified across individual NHS hospital sites. Distinct patterns emerged that indicated to us that AEs might be characterised at the individual hospital level and could be used to highlight areas in which improvement may best be targeted for both patient and organisational benefit. We termed this notion ‘organisational signatures of harm’. Signatures of harm are generated from the top criteria screening positive and then subsequently converting to determined AEs. These are then simply illustrated in a pie chart or a Pareto chart. This process highlights to organisations the origins of harm to patients and can be strikingly different across NHS sites. Harm signatures had immediate resonance with study sites, as they were recognisable issues to some individuals in leadership roles within organisations, and this in itself we believe may heighten accountability and responsibility for addressing the problems they highlight.
Signatures of harm have the potential to inform and subsequently monitor AEs in organisations over time. It is difficult to demonstrate quantifiable changes in AE rates over time and, therefore, the beneficial impact of programmes of quality improvement and patient safety. Signatures of harm may be a simple methodological approach whereby quality improvement can be assessed by changes in the composition of AEs as a result of improvement efforts, which can subsequently be tracked over time.
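As an illustration only, the tabulation behind a Pareto-style signature of harm can be sketched as follows. The function name, the category labels and the counts are hypothetical and are not taken from the study data; the sketch assumes that each confirmed AE has already been assigned to a single category.

```python
from collections import Counter

def harm_signature(ae_categories):
    """Return Pareto rows (category, count, percent, cumulative percent), largest first."""
    counts = Counter(ae_categories)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    # Sort by descending count, then by name so that ties are ordered deterministically.
    for category, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        cumulative += count
        rows.append((category, count, 100 * count / total, 100 * cumulative / total))
    return rows

# Hypothetical confirmed AEs at two sites (illustrative labels and counts only).
site_a = ["infection"] * 5 + ["fall"] * 3 + ["pressure ulcer"] * 2
site_b = ["surgical"] * 6 + ["medication"] * 3 + ["fall"] * 1

for name, site in (("Site A", site_a), ("Site B", site_b)):
    print(name)
    for category, count, pct, cum_pct in harm_signature(site):
        print(f"  {category:<15} {count:>3} {pct:5.1f}% {cum_pct:6.1f}%")
```

Comparing the ordered rows for two sites, or for one site across two periods, is one simple way in which a change in the composition of AEs could be followed even when the overall rate is stable.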
Similarly, right at the start of the study we introduced a narrative on each record reviewed. Despite not describing the events in the way an incident report would, we believe that this form of narrative brings something of the patient story, and, therefore, context to the impact of harm events on patients, families and organisations. Pressure ulcers, for instance, are an area in which there is universal recognition that improvements can be made. 65,66 If captured through case note review, as they are generally noted in only the nursing records of care, they are generally classified as minimal impairment and no psychological injury. Narrative brings context to specific events, such as the man in his thirties who acquired grade 3 pressure ulcers on his heels after extensive surgery for oral cancer. His main complaint at follow-up was the pain and immobility he was experiencing from this tissue damage. The use of both data showing the extent of a problem and narrative that shows its impact can play a role in highlighting the impact that common AEs have on patients and their families.
From ward to board, narrative in a useable format has the potential to become a mechanism through which a wide range of patient experiences can raise the profile and relevance of patient safety issues in clinical practice and contribute to improvement priorities.
Study strengths and limitations
Study strengths
We report on AE rates in a UK setting generated from a sample eight times larger than the samples from which AE rates have been determined previously. 14,16 The sample was derived from every health board providing acute services across NHS Wales and was longitudinal in nature, with a duration spanning 4 years. The rate of AEs reported is in line with previously reported AE rates and attendant preventability and severity in smaller UK settings, and lies at the mid-point of the rates reported in epidemiological studies internationally. 14,16,32–39
The infrastructure supporting the review teams came from both the academic study team and the clinical research infrastructure of the National Institute of Social Care and Health Research. Training and support were also longitudinal, with regular face-to-face meetings with regional leads and teleconference meetings with review teams in which progress against objectives was assessed, problems described and difficult cases discussed, promoting learning and ongoing professional development. The expertise within the review team grew over the study period and the inter-rater reliability we report is consistent with other studies of this nature. 14,16,32–39 The collaboration the study team had with NHS colleagues was instrumental in informing the development of the Harm2 tool, and input was also provided from a wide range of stakeholders.
Study limitations
The methodological limitations of using a retrospective case note methodology to identify AEs in health care are well rehearsed. 17,56,67 Information bias, moderate reliability and hindsight bias are commonly cited issues and would have influenced this study in the same ways as previous studies. This study differed from most other epidemiological studies on AEs undertaken to date in a number of ways. All previous studies using the two-stage methodology examined prevalence in snapshot random sample surveys. The Dutch group has undertaken significant longitudinal assessment but in three distinct prevalence surveys39,68,69 rather than a quota of records reviewed monthly over a period of several years. Our sample in phase 1 was longitudinal over a 2-year period and was drawn from all health boards providing acute inpatient services in NHS Wales.
We were, however, reliant on a sampling strategy of inpatient notes that underpinned the measurement of harm as part of a National Patient Safety Campaign. The method identified and reviewed a randomly selected set of 20 inpatient notes out of a pool of 30 identified records on a monthly basis and the research team reviewed the same notes. The notes were unavailable to the study team if a GTT review team was not convened or because of resource constraints in medical records departments. The number of GTT reviews therefore undertaken by an individual health board was dependent on commitment to the methodology and study, in addition to its capacity to bring together senior health-care professionals to undertake the reviews. The range in the individual site sample size was considerable (from 174 to 560) and this, along with the robustness of each organisation’s randomisation process, was influential in determining the rate of AEs detected site by site. One study site provided < 200 reviews, with the remaining 10 providing > 300 and six sites providing > 400 reviews. Two sites were identified as having problems with their randomisation, which was described by the review teams and amended on request. The robustness of the randomisation process has implications for the study’s generalisability. This, however, was acknowledged from the outset of the study as we followed the quality improvement methodology and sampling strategy already adopted by NHS Wales.
The highest AE rate at an individual study site was double that of the lowest rate (16.1% vs. 7.9%). We cannot rule out that this variability was, in part, a result of differences in sample size along with differences in sampling strategy. An additional bias was introduced because the sampling strategy did not take into account the size of the hospital and the nature of the services provided. We did not identify any discernible patterns in the rates of harm by level of care provided in district general hospitals or tertiary-level provision. These findings have significance for operationalising a system of AE surveillance, but do not have significance in relation to the overall rates of AEs that we report.
We encouraged the RF1 screening process to be more than a tick-box exercise by encouraging the use of free-text boxes aligned with each of the 18 criteria to highlight why a particular criterion had screened positive. We rationalised MRF2 referral, as it is logistically difficult to pull sets of notes twice for review as well as being labour and resource intensive for physicians. Similarly, we did not assess a percentage of all negative screens. We cannot rule out that this process resulted in an underestimate of the rate of harm across NHS Wales, as physicians may have identified AEs that nurses did not. We do, however, suggest that it is unlikely that significant harm is underestimated, given the experience of the reviewers and of the research team, who made an individual assessment of all review forms.
Limitations of the service evaluation element of the study relate to its restriction to the key contacts in health boards. However, in view of the lack of wider circulation of the harm data, it is clear that a larger group of interviewees would have yielded no further information. We did, however, approach the entire known population of reviewers, and interviewed as many as were willing and available within the timescale.
Challenges in current case note review methodology
The reviewers’ experiences suggest that maintenance of patient records is, with a few exceptions, poor, which makes case note review difficult and puts health boards at risk of reputational damage. Both reviewers and health board audit staff found digitised notes involving non-searchable images difficult to use, which raises questions about their efficient use by clinical staff if and when a patient is readmitted, or admitted to another facility.
Reviewers stressed that collaborative notes made the retrospective assessment of interprofessional communication easier for reviewers, which suggests that they are also likely to aid such communication in real time.
In undertaking the research, reviewers found the Harm2 document easy to use on the whole. The main difficulties experienced related to judgements of preventability and severity of harm.
The assessment of preventability
The level of preventability was categorised according to the level of evidence supporting the reviewer’s judgement, on a scale originally derived from the legal definition of negligence. 70 The scale ranges from virtually no evidence for preventability to virtually certain evidence of preventability, and just under one-quarter of all AEs identified were at least ‘probably preventable’, defined in the scale as ‘more than 50–50 but close call’. As assessment is restricted to information documented in the record of care, it is subject to both information and hindsight bias and it may be that, rather than an assessment of evidence, the level of reported preventability reflects the level of certainty of the reviewers. 16 Despite these limitations, our assessment of preventability lies at the mid-point of rates of preventability reported previously. 14,16,32–39
Reviewers found the assessment of preventability challenging, and this may reflect the limited empirical evidence on the validity and reliability of the available definitions of preventable harm. Nabhan et al., 71 in a systematic review, conclude that no single definition is supported by high-quality evidence, and that synthesis of the literature suggests that preventable harm can be best defined by three criteria: (1) the incidence can be reduced by detecting and intervening in or preventing a causal event or chain of events; (2) the causal event or chain of events by its nature can be detected before the harm takes place; and (3) there is evidence that an intervention is efficacious in reducing or eliminating the harm by virtue of eliminating the offending cause or disrupting a harmful chain of events. 71 The definitions of preventable harm in two-stage retrospective review studies are generic and mostly aligned to statements about ‘best practice’ and there is a case for reiterating the conclusions of Sari et al. 16 that ‘further research is needed to assess the effect of using summaries or the full medical record to assess preventability and these other factors on the degree of agreement in assessing preventability’.
The severity of harm rating system
A further challenge in deriving a full assessment of the impact of AEs in health care through case note review is seen in the assessment of emotional or psychological harm. Although physical impact is well described, the psychological impact on patients is poorly described in the literature, with more written in this respect about the second victims of severe AEs, namely the health-care professionals responsible for providing or overseeing care. 72–74 An assessment of the psychological impact of the AEs identified in Wales was completed by the research physicians in only two-thirds of cases (298/450), and a similar proportion of these patients were expected to have experienced minimal or no emotional trauma (183/298). This is a finding that does not conform to the reported physical and organisational outcomes of AEs and reiterates the need for both researchers and health-care organisations to understand, from the patient’s perspective, the wider and societal impact of harm in health care.
The NCC MERP’s Categorizing Medication Errors Index44 was used to classify the severity of the AE and, in the second phase of the study, the percentages of patients who experienced disability and of patients whose death was associated with, but not necessarily causally related to, the AE were significantly lower than expected and lower than those reported in the first phase of the study. This is surprising given that other end points (such as preventability and the breakdown of problems of care) are in line with both phase 1 findings and the wider body of literature. Furthermore, issues become apparent with this classification when identifying problems in care in patients who die in hospital. When an AE occurs in a terminal episode of inpatient care, there are limited options for describing the impact on the patient, with classifications such as temporary harm, prolonged hospitalisation and permanent harm having no relevance. Although this does not pose issues across all ‘problem in care’ categories, it is challenging if the patient dies within a short period of time of acquiring, for example, HAP. In the deceased patient cohort, we identified 525 events in 315 patients and > 100 of these were HAP. As a result, with no classification system that considers issues of preventability, in 34.4% of cases the non-physician reviewers classified the AE as leading to death. On review, it is evident that in specific situations it is not possible for the reviewers to differentiate between NCC MERP’s Categorizing Medication Errors Index44 classifications in the event of death, and this was not a component of the training delivered. This also conforms to the abovementioned reports from health boards that there is high-level noise from the dying process that could be misattributed to failures in care and AEs.
We therefore recommend that the NCC MERP’s Categorizing Medication Errors Index44 is not used in case note review to attribute patient outcomes, such as death, to AEs, because of the risk of misclassification, particularly when preventability is not assessed.
Comparison with previous findings
The interpretation of headline findings in relation to other studies undertaken internationally is not straightforward. Different studies emphasise different end points in their reporting and these include the level of preventability36 and confidence in management causation. 39 The percentage of AEs detected across NHS Wales is 7.7% (95% CI 6.91% to 8.49%) when events classified as being caused by health-care management are reported and therefore complications are excluded, and 5.29% (95% CI 4.67% to 5.99%) when AEs that are deemed to be preventable are reported.
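For readers who wish to reproduce interval estimates of this kind, the following is a minimal sketch of a Wilson score interval for a proportion. It is an assumption that this method is suitable here: the report does not state in this section which interval method was used, and the counts in the example are hypothetical rather than the study's numerators and denominators.

```python
import math

def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = events / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Hypothetical counts: 53 patients with a preventable AE among 1000 reviewed records.
events, n = 53, 1000
low, high = wilson_interval(events, n)
print(f"{100 * events / n:.2f}% (95% CI {100 * low:.2f}% to {100 * high:.2f}%)")
```

For proportions well below 50%, the Wilson interval extends further above the point estimate than below it, unlike the simpler symmetric normal-approximation interval.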
At the individual hospital site level in phase 2, there was less variability in the organisational percentages of AEs, probably because of improved randomisation and, while these percentages are higher than those reported in phase 1 of the study using the two-stage retrospective review process, they are lower than the rate and levels of preventability in a random sample of 28 Swedish hospitals36 reporting a physician-determined AE rate of 12.3% and attendant preventability of 70%. All end points we report are consistent with other studies and this gives us confidence that the gold standard approach, which was amended in phase 1 and condensed in phase 2, generated valid and reliable information providing a platform for improvement work in Wales.
The ongoing measurement of AEs is important and a consistent message across AE studies is clear: AEs have a considerable impact on patients, families and health-care providers. 14,16,32–39 In the largest study undertaken to date across the Dutch health-care system, Zegers et al. 39 report that 12.8% of all AEs result in permanent disability or are associated with death. Excess LOS arising from the management of AEs is significant and our Welsh estimate of 6 days is in line with other European studies reporting 6 days in both Sweden36 and Spain. 38 Furthermore, readmission is common and we found a readmission rate of approximately 2.5 readmissions for every 100 patients managed within a secondary care setting.
The comparison of the Global Trigger Tool and two-stage retrospective review process
The harm rate of 9% in Wales using the GTT is considerably lower than any of the US-based studies examining harm in secondary care settings4,21,41,75 and it is very unlikely that this reflects a true difference in AE rates between the US and UK health systems. Some GTT studies do not exclude AEs present on admission and it remains unclear what the GTT is actually measuring. Although the GTT undoubtedly identifies significant problems in processes and clinical care in order to prioritise improvement efforts,21,75 it may also detect signal noise, which does not translate into actual AEs. This may be compounded by the limited time available to examine the notes, which prevents these triggers from being contextualised and may lead to overestimation of harm rates. In the Welsh context there was significant variability in the GTT rates reported across Welsh health boards and less variation when the Harvard method and Harm2 tool were used.
Assessing risk for adverse events in inpatient settings
Previous studies report around a twofold increased risk of AEs in adults over the age of 65 years37,76 and we confirm that increasing risk is concomitant with increasing age. The assessment of comorbid states on the risk of being injured in health-care management is, however, more complex. In a US setting, Naessens et al. 77 examined the effect of illness severity and comorbidity by using patient safety indicators and reported AEs through incident-reporting systems. They assessed association with chronic diseases generated from diagnoses coded for Charlson Comorbidity Index chronic disease classifications, concluding that at admission underlying disease severity increased the likelihood of all types of AEs and increased incremental costs and LOS. The Dutch group also examined the role of chronic disease and comorbidity by supplementing their original data set from their first National Prevalence Study, with comorbidity generated from diagnosis data from the national hospital administrative database, enabling the calculation of Charlson Comorbidity Index scores. All variables were modelled using multiple regression and they concluded that it was not possible using inpatient data available on admission and collected during the course of the two-stage process to develop a satisfactory predictive model for AEs in older patients. The variables that remained significant in the model were age (OR 1.04), elective admission (OR 1.65) and management within the surgery department63 (OR 1.53). Similar findings using similar methodology were generated from the Spanish study of AEs, the conclusion drawn being that the true risk of AEs depends on the number of exposures to potentially iatrogenic actions rather than age or the presence of comorbidities. 76 We would concur with previous national studies that comorbidity in itself does not play a significant role in increasing an individual’s risk of experiencing an AE. 
There are interactions, however, that remain to be explored, and increased risk is evident in patients who have heightened vulnerability as a result of physical or cognitive decline. Because we relied on the Charlson Comorbidity Index as a measure of comorbidity, we cannot rule out that any associations we report may have been underestimated.
Comparison of live discharges and mortality admissions
The Dutch study recently retrospectively compared the rates and distribution of AEs between these two groups of hospitalised patients and reported an AE rate that was twofold higher in the deceased patient cohort. Aligned with our findings, AEs were more common in patients who were emergency admissions and were less frequently associated with surgical procedures than in the live discharged patients. 67 We report a threefold higher rate of AEs in our deceased cohort and propose that this may be a result of the composition of the AEs analysed. We report on all AEs arising from inpatient management, whereas the Dutch study removed all AEs from their data set when management causation was not rated as 4–6 on the management causation scale. This means that a significant number of events, such as HAP, would not have been included in rates reported along with harm associated with known complications of surgery.
The composition of problems in care in the discharges sample and the deceased sample is significantly different and we concur with the Dutch experience that the yield of AEs is greater in deceased patients but that the distribution of events across the problem in care categories varies. This is a result, in part, of the lower percentage of surgical procedures undertaken in this group but, in our experience, the relative difference in the importance of infections, specifically the predominance of HAP in the deceased cohort, is also very significant. Baines et al. 67 concluded that mortality review is an efficient way of studying AEs, but it is important to be aware that some events are underexposed and others are overexposed, and caution should be applied if data generated are being used to inform programmes of quality improvement.
The assessment of the Harm2 tool and the practicalities of case note review
The Harm2 tool was developed from existing tools; its key properties, along with an assessment of its performance in routine use across NHS Wales, are outlined in Table 53. In summary, the Harm2 tool is a condensed version of the Harvard method that can be operationalised pragmatically. It has a clear remit in filling a currently unmet need in clinical practice for a harm surveillance tool that is easy to use, has face validity with clinicians and stakeholders, and can be used at the organisational and national level for both assurance and organisational and professional learning purposes. The measurement outcomes are aligned with international findings using the Harvard method and we report an acceptable level of inter-rater reliability. Subject to a few minor changes to the tool, such as the replacement of the NCC MERP’s Categorizing Medication Errors Index44 with the classification of severity from the MRF2 and the removal of components such as the Charlson Comorbidity Index (which are of ongoing value only in a large national study such as this), the tool is ready for application in clinical practice. The Harm2 tool has a non-time-limited approach to case note review as this facilitates deeper understanding of the circumstances surrounding the AE and robust assessment of its presence, preventability and impact both on the patient and on the organisation.
Characteristic assessed | Key points arising from assessment |
---|---|
Aim | The aim of Harm2 was to develop a tool to measure harm in health care that was aligned in definitions and epidemiological end points to previous studies, thus facilitating benchmarking both in the monitoring of trends within an organisation over time and with other studies reporting AE rates nationally |
Principles of design | A review of existing measurement tools for AEs was undertaken and triangulated with experiential knowledge of how these tools actually work in practice, along with an assessment of data outputs with their relative strengths and limitations |
Item selection and instrument development | All core elements of the tool were taken from one of three tools: the GTT, the Harvard method (elements of both had previously been psychometrically tested) and the PRISM tool. Other additions to Harm2 in phase 2, including the Charlson Comorbidity Index and a measure of avoidable mortality, were included purely for research purposes. The tool was constructed so that it followed a logical flow of demographic and clinical information, identification of the AE, assessment of the preventability and severity of the AE, the identification of contributory factors, points for organisational learning and assessment of overall quality of care |
Criterion validity | We compared the outcomes of the Harm2 tool with phase 1 outcomes. The percentage of service users with AEs identified was 10.3% (95% CI 9.4% to 11.2%) using the Harvard method and 11.3% (95% CI 10.22% to 12.40%) using the Harm2 tool. There is no gold standard measure of AEs that we can correlate our findings with. Few studies have assessed criterion-related validity and, when they do, they propose implicit professional judgement, such as overall assessment of quality of care. Lower overall assessments of quality of care were associated with preventable AEs in both the Harvard and Harm2 data sets |
Construct validity | AEs identified using the Harm2 tool were associated with a number of factors and patient variables that conform to theoretical and construct expectations. AEs increase with age and are associated with increased LOS and readmissions, are different in medical and surgical specialties and are more common in patients who die in inpatient settings |
External validity | The perceived benefit by stakeholders of the measurement structure in Harm2 was its alignment with previous UK studies such as those undertaken by Sari et al.16 and Vincent et al.14 in England and other national studies such as the Dutch and Spanish AEs studies. The common end points and measurement structure used in Harm2 will make outcomes reported familiar to the quality improvement community internationally |
Reliability | The inter-rater reliability of the Harm2 tool was 0.50 (95% CI 0.38 to 0.65), with agreement being achieved in 88.4% of cases, and the intermethod reliability was 0.45 (95% CI 0.07 to 0.62), with agreement reached in 78.4% of cases. The Harm2 tool in this setting performed as well as the GTT and Harvard method previously reported in a wide variety of settings |
Errors of measurement | The percentage of AEs detected is very dependent on the robustness of the sampling strategy, which influences outcomes. The NCC MERP’s Categorizing Medication Errors Index44 did not provide the granularity required to fully assess the impact of AEs on patient populations |
Instrument data preparation | Data entry is timely and simple and datasheets generated are easily manipulated and analysed. Codes were applied to data fields for entry into a Microsoft Excel spreadsheet |
Availability | Feedback from organisational leads indicates that the feedback mechanisms need to be honed but the notion of organisational signatures was universally welcomed. One area for future development includes the further in-depth study of preventable AEs at individual study sites, improving the face validity of the information being reported on. There is scope to further develop summary data from Harm2, including the separate reporting of AEs that were preventable in nature and resulted in significant clinical impact to the patient |
Interpretation of data collected | Both organisational percentages of harm and the flagging of case notes for further investigations were well received. Furthermore, organisations like to see their percentage of harm events in relation to a national average as some kind of assurance mechanism |
Clinical usefulness | Data on AEs that are directly related to clinical care will increase the likelihood of continued measurement and monitoring of patient outcomes, which can lead to the prioritisation of quality improvement targets and professional development activity both organisationally and nationally |
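The Reliability row of the table reports a reliability coefficient of 0.50 alongside raw agreement of 88.4%. The sketch below shows how a chance-corrected coefficient can sit well below raw agreement when AEs are relatively uncommon. It assumes that the coefficient is Cohen's kappa for a yes/no AE judgement by two reviewers, which this section does not state, and the cell counts are hypothetical.

```python
def cohen_kappa(both_yes, a_only, b_only, both_no):
    """Cohen's kappa and raw agreement for two raters making a yes/no judgement."""
    n = both_yes + a_only + b_only + both_no
    if n == 0:
        raise ValueError("no paired reviews")
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_only) / n  # proportion rated 'AE' by reviewer A
    b_yes = (both_yes + b_only) / n  # proportion rated 'AE' by reviewer B
    # Agreement expected by chance if the two reviewers rated independently.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected), observed

# Hypothetical double-reviewed records: 10 AE/AE, 5 + 5 discordant, 80 no AE/no AE.
kappa, agreement = cohen_kappa(10, 5, 5, 80)
print(f"agreement = {agreement:.1%}, kappa = {kappa:.2f}")
```

Because most records contain no AE, two reviewers would agree on many records by chance alone, which is why raw agreement tends to look more favourable than the chance-corrected figure.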
The Harm2 review teams and training
The Harm2 tool is pragmatic and the non-physician reviewers found it straightforward to use in this research setting. Our reviewers were nurses conversant with research methodology and experienced in data retrieval from inpatient records. This had obvious advantages within this study setting but with training on the principles of AE identification and the assessment of preventability and categorisation, it is a process that could be undertaken effectively by experienced nurses from a clinical practice setting. In our study, we found it possible for four nurses to undertake a sample of 20 reviews in one afternoon or morning, which has realistic resource implications. In addition, the Harm2 form takes between 5 and 15 minutes to input, and 1 month’s worth of data can therefore be entered into the organisational data repository in 5 hours. Harm2 is less resource-intensive than GTT reviews as per the IHI recommendations, which require two independent reviewers per case note review.
Table 54 outlines an indicative estimate of the cost to run a harm-monitoring programme using the Harm2 tool in an individual organisation. The time allocation in the example has been doubled from what our experienced reviewers were able to complete within their allocated schedule and allows for training time and discussion in the early phases of implementation.
Resource required | Cost per month (£) | Cost per annum (£) |
---|---|---|
Two band 7 nurses costed at top of scale to complete 20 reviews and all relevant documentation over a 2-day period | 465 | 5586 |
Administrative support costed at the top of band 2 for pulling and filing notes for 2 days a month | 102 | 1214 |
Optional support in the preparation of notes for review and data entry | 102 | 1214 |
Total nurse and administration time | 669 | 8014 |
The basic staffing resources required without on-costs for a routine surveillance programme at an organisational level will run at around £8000 if completely nurse led but could rise to £20,000 per annum if on-costs and consultant time to confirm complex AEs and respond to any clinical queries are included. This equates to around 0.5 whole-time equivalent of a band 7 nurse per organisation, per year. The NHS is well resourced with patient safety advocates and experts and, whether used as continuous surveillance or periodic monitoring, the Harm2 process is one that NHS organisations can prioritise, adopt and integrate into their patient safety and quality activity.
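The totals in Table 54 and the data entry estimate given earlier can be checked with a few lines of arithmetic. The sketch below uses the figures as printed; the variable and function names are illustrative. Note that the printed annual figures are not exactly 12 times the printed monthly figures (for example, 465 × 12 = 5580, not 5586), presumably because the monthly figures are rounded, so the annual column is summed directly rather than derived.

```python
# Rows copied from Table 54: (resource, cost per month in GBP, cost per annum in GBP).
TABLE_54 = [
    ("Two band 7 nurses, 20 reviews over 2 days", 465, 5586),
    ("Administrative support, band 2, 2 days a month", 102, 1214),
    ("Optional support for note preparation and data entry", 102, 1214),
]

def totals(rows):
    """Sum the monthly and annual cost columns as printed."""
    monthly = sum(month for _, month, _ in rows)
    annual = sum(year for _, _, year in rows)
    return monthly, annual

monthly_total, annual_total = totals(TABLE_54)
print(monthly_total, annual_total)  # 669 8014

# Data entry: 20 forms a month at 5 to 15 minutes per form (figures from the text).
entry_hours = (20 * 5 / 60, 20 * 15 / 60)
print(f"{entry_hours[0]:.1f} to {entry_hours[1]:.1f} hours per month")
```

The 5 hours quoted in the text for entering 1 month's data therefore corresponds to the upper end of the 5 to 15 minute range per form.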
The training methods we employed included both didactic teaching and experiential learning and this is an approach that worked well in our context. Novice reviewers gained from being paired with experienced reviewers. Once teams were undertaking reviews in practice, peer supervision was an invaluable process. Expert reviewers are not ‘taught’, but learn through experience and the experience of their colleagues. Our phase 1 and 2 reviewer cohort benefited from a collegial approach and the bringing together of the regional teams on a regular basis promoted learning and development. We had a reasonably stable reviewer workforce and most found it a challenging, interesting and enjoyable experience over the 4 years of active data collection. If this work is undertaken routinely within an organisation, we strongly advocate stability within the case note review teams, as experience will improve reliability and reveal patterns and trends over time; these will inform local learning, which in turn will drive improvement. A team approach within organisations is recommended as it is more sustainable and reliable given the intense, reflective and emotionally draining nature of the task.
The importance of a robust sampling strategy
Other important points of learning arise from our data. First, understanding the context and sample from which AEs are identified and subsequently quantified is essential. Reviewing notes from convenience samples in medical records departments will significantly influence both the breakdown of problems in care and the number of AEs detected. Our sampling strategy of 20 records per month was adequate for surveillance purposes but, if new to the practice, organisations may wish to frontload more reviews early on, to generate an initial signature of harm within their organisations. It is possible to begin to observe trends and patterns and develop an initial harm signature within 200 reviews.
Second, the sampling strategy we employed, taken from the IHI GTT methodology, is one that meets the needs of organisational learning and monitoring but may lack the generalisability found in other sampling methods. Smaller samples generated from individual hospitals can appear to be outliers in terms of their AE rate and caution is needed when interpreting reported rates in any assessment of organisational quality or safety. Hospital comparison may be possible if (1) the samples are truly random and (2) the surveillance of AEs is undertaken longitudinally and the sample size increases over time. Attention, however, may be more appropriately directed to the organisation’s response to AEs rather than crude AE rates. This shifts the focus to the information generated from the process to highlight areas of clinical or management concern that can be addressed through local programmes of quality improvement with subsequent follow-up and monitoring of challenging or problem areas.
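The sampling points above can be illustrated with a minimal Python sketch. It assumes a simple random sample without replacement from a monthly list of eligible records; the record identifiers, the size of the sampling frame, the seed and the function names are all hypothetical and do not describe the procedure used in the study.

```python
import math
import random


def draw_monthly_sample(record_ids, sample_size=20, seed=None):
    """Simple random sample, without replacement, from one month's eligible records."""
    if len(record_ids) < sample_size:
        raise ValueError("fewer eligible records than the requested sample size")
    rng = random.Random(seed)  # a seed makes the draw reproducible for audit
    return rng.sample(record_ids, sample_size)


def months_to_reach(target_reviews, reviews_per_month=20):
    """Whole months needed to accumulate a target number of reviews."""
    return math.ceil(target_reviews / reviews_per_month)


# Hypothetical sampling frame of 500 eligible records for one month.
eligible = [f"REC-{i:04d}" for i in range(1, 501)]
sample = draw_monthly_sample(eligible, seed=2016)
print(len(sample), len(set(sample)))  # 20 distinct records

print(months_to_reach(200))                        # at 20 reviews per month
print(months_to_reach(200, reviews_per_month=40))  # frontloaded at 40 per month
```

At 20 reviews per month, the 200 reviews suggested for an initial harm signature take 10 months to accumulate; doubling the monthly sample early on halves that time.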
The development of data outputs
The organisational reports that we developed had a mixed reception in practice but the concept of pie charts showing graphically where harm is occurring within organisations is attractive and can be used regularly in committees and boards to promote discussion and assessment. Reasons for the different perceptions of data outputs varied. As the data were cumulative, the small numbers at the outset discouraged recipients from sharing the outputs and this seemed to become a pattern that did not change as the numbers grew, which may indicate that regular incremental feedback has less impact than final results.
For some, the information we provided did not offer the granularity necessary to focus on AEs that (1) were flagged as preventable and (2) had the greatest impact on patients. Furthermore, because this was an external process, organisations had to re-pull the notes to generate any learning or triangulate with other patient safety data, factors that clearly depended on organisational resource and commitment to follow-up. If such a methodology were undertaken within the service, these issues would not arise, as points of learning or concern would be fed directly into the clinical governance structures at the time the reviews were being undertaken. Similarly, data would be organised and summarised to meet the ongoing needs of the organisation. Signatures of harm accompanied by examples of narrative describing the episode of care could be an effective mechanism in raising awareness of the problems encountered by patients within individual organisations. Similarly, reviewers often find examples of excellent care and identify when patient pathways or teams are working well and this could be a mechanism through which best practice can be identified, celebrated and spread at the organisational level.
The importance of preventability
We consider the robust assessment of preventability to be important, and the clinicians and managers within Welsh health boards find it even more important. If potential targets for improvement cannot be identified and interventions proposed with potential short- to medium-term impact, there is little to be gained in undertaking routine labour-intensive review of inpatient episodes of care. Aranaz-Andrés et al. 38 report that the measures of frequency, severity, impact and preventability are the most efficient strategies to inform improvement, in order to improve the safety of care for patients. To engage clinical teams and managers in the process of quality improvement, it is important that AEs identified have face validity in order to identify opportunities to mitigate impact in patients. While we recommend that all AEs are recorded as every patient injury or complication is important, further organisational breakdown of those with high levels of preventability will ensure the identification of areas in which targeted action and active monitoring may lead to short-term gains in patient safety work.
The record of inpatient care
The condition and organisation of case notes were found to be almost universally poor across NHS Wales. One organisation had already addressed this internally prior to the study and had significantly better records of care than the other health boards in Wales. Organisations are transitioning to digital records, which also posed challenges to external reviewers when these electronic files could not be manipulated and searched, or when timely information was not to hand, to review and assess care. The implications for health boards are significant and if external reviewers cannot undertake comprehensive assessment of care there is a risk of attendant clinical harm to patients. Although work on the quality and organisation of medical records needs to be a national-level priority, in the shorter term case note review processes can be facilitated by an administrative assistant who organises the notes prior to the review session. Administrative staff can, in addition, populate fields of the Harm2 form ahead of the review session so that the reviewer starts completing the form at the stage of harm screening and determination, thereby increasing the efficiency of the process.
Implications for practice
Despite growing awareness of patient safety issues across NHS health-care systems and internationally, there is little evidence of a commitment to the monitoring of patient safety outcomes. The NHS is heavily dependent on voluntary reporting of unspecified patient safety incidents into a system that is characteristically flawed by design, by under-reporting and by its inability to generate data to drive patient safety. 78 We have demonstrated that organisations can introduce into their patient safety measures a monitoring tool that will, over time, provide a level of organisational assurance, giving summary estimates of current risk to specific inpatient populations. This monitoring, whether undertaken monthly or periodically, is both inexpensive and efficient. As an example, the cost of treating a pressure ulcer varies from £1214 (category 1) to £14,108 (category 4)79 and the median cost of managing a surgical wound infection in an English hospital is £5239. 80 A very modest reduction in these AEs will make such monitoring cost neutral.
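The cost-neutrality argument above can be made concrete with a short Python sketch. It combines the treatment costs quoted in this paragraph with the programme costs estimated earlier in this chapter (£8014 per annum if nurse led, up to £20,000 per annum with on-costs and consultant time). It rests on a simplifying assumption that is ours rather than the cited studies': each avoided AE saves its full quoted treatment cost. The names used are illustrative.

```python
import math

# Annual programme cost in GBP, from the estimates earlier in this chapter.
PROGRAMME_COST = {
    "nurse led, without on-costs": 8014,
    "with on-costs and consultant time": 20_000,
}

# Treatment cost per event in GBP, as quoted in the paragraph above.
COST_PER_EVENT = {
    "pressure ulcer, category 1": 1214,
    "pressure ulcer, category 4": 14_108,
    "surgical wound infection (median)": 5239,
}


def events_to_break_even(programme_cost, cost_per_event):
    """Smallest whole number of avoided events whose cost covers the programme."""
    return math.ceil(programme_cost / cost_per_event)


break_even = {
    (scenario, event): events_to_break_even(cost, unit_cost)
    for scenario, cost in PROGRAMME_COST.items()
    for event, unit_cost in COST_PER_EVENT.items()
}

for (scenario, event), n in break_even.items():
    print(f"{scenario}: {n} x {event}")
```

Under this assumption, avoiding two surgical wound infections a year would cover the nurse-led programme, and four would cover the upper estimate.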
The Harm2 tool is not complex and with the research elements such as the Charlson Comorbidity Index removed, it is a logical and easy tool to use. Reviewers in training reported that the determination of AEs was easier when they were guided through the form to consider issues such as where the event took place, the level of preventability, and the contributory factors and points of learning for the organisation or health-care providers. What is very clear is that perceptions of both complexity and resources seem to be related to both organisations’ and individuals’ commitment to the patient safety agenda. Some organisations used the study harm to investigate their own processes and practices and were open to the role of academia in effectively measuring harm in the system. Others noted that there is ‘no way on God’s earth we could get our staff to do it’, highlighting that no improvement activity is without its challenges in implementation.
The implications for patient safety are clear; knowing both the percentage of inpatients experiencing an AE and the proportional breakdown of the attendant identified ‘problems in care’ allows for crude extrapolation of how big an issue is within a health-care system. This facilitates assessment of problems, prioritisation of interventions and assessment of progress. Areas of risk deemed to be important can be triangulated with other organisational data and an appraisal instigated of current monitoring and intervention efforts. Harm2 offers an organisational response to common issues in practice that are identified corporately, but when named, accountability for action both in terms of their assessment and any required intervention can be distributed to relevant clinical areas and teams. In the longer term, it is feasible that raising awareness of current organisational priorities for patient safety activity through the use of harm signatures and effective organisational communication could improve reporting and increase the learning generated from specified incidents, such as harm caused by hospital-acquired infections, inpatient falls or failures in thromboprophylaxis.
Looking ahead: monitoring benefit and harm for patients and populations
Many countries, and Wales in particular, have made enormous progress in the last 20 years in developing measures of clinical outcome and routinely monitoring those outcomes. In contrast, the measurement of harm has been sporadic, inconsistent and has not achieved the same level of rigour. We have relied instead on incident reporting systems, which, despite being valuable as safety warning systems, have conclusively been shown to be completely inadequate as a measure of harm. The studies described here have shown the potential value of ongoing systematic and rigorous evaluation of the harms of health care alongside the many benefits. As one participant commented:
And please do more of it, because this sort of data to me is gold dust – it needs to get into the culture of everything we’re doing and I’d really hope for harm studies to expand and maybe get to be a routine part of the way that we are working. It’s been really useful from my personal perspective.
We believe that Wales, and indeed other countries, now need to consider how harm can be routinely monitored within health-care systems. We appreciate that we cannot necessarily continue to carry out record review on the same scale and with the same intensity as in the current studies. We do, however, believe that an ongoing assessment of harm within health-care systems is absolutely essential. In Wales, it would be relatively straightforward to devise a means of continuing the current approach by sampling across the country on a regular basis. This would have huge benefits for both local and national learning, for the prioritisation of safety programmes and for monitoring change over time. Wales would become, if this were instituted, an immediate international leader in patient safety. Over time this approach could evolve into a system in which both benefits and harm were monitored within electronic medical records.
Harm in health care reflects an incomplete view of patient safety issues, and future studies would benefit from setting measures of harm alongside measures of the beneficial effects of health care, first at the level of populations and then, more ambitiously, for individual patients. Ultimately, the aspiration should be to mirror our experience as patients and be able to reflect for any one individual the overall balance of benefits and harms of health care and the accompanying experience for patients and families. 10
Finally, this work would not have been possible without the input of many individuals in the WG, and the many hospitals across Wales, which are committed to patient safety and the principles we promote to identify risk in order to mitigate it. The operationalisation of a research study is, however, very different from the operationalisation of an improvement programme. Although significant development has been made on the tools and methods of AE surveillance, significant work remains on embedding processes in practice and ensuring that the data and narrative generated meet the service needs in individual journeys of improvement.
Conclusions
Some health boards in Wales have been monitoring patient safety outcomes for the last decade using a variety of tools. Working in collaboration with NHS Wales, we report percentages of AEs using the gold standard Harvard methodology, which conforms to previous studies in a UK setting and national studies undertaken in Sweden, Holland and Spain. We confirm the 1 in 10 finding of AEs in inpatient populations, half of which are potentially amenable to intervention. If the occurrence of these common events identified through case note review is crudely extrapolated across the total number of hospital discharges in Wales per annum, they would equate to around 19,900 events relating to failures in clinical monitoring, 17,500 AEs resulting from failures in infection control, 12,650 events relating to problems with operations and procedures, and 11,000 AEs relating to the administration of drugs, fluids or blood. Given the impact of these common problems on patients, health-care professionals and service providers, there is much to be gained from reducing their current levels.
There are groups of patients in whom the risk of AEs is greater than in the general inpatient population, and these include older patients with physical and cognitive impairment as a result of pre-existing diagnoses, such as dementia or PVD. AEs in these groups probably result from underlying physical and cognitive impairment and comprise falls and pressure ulcers, with a smaller number of other events arising from drug administration or reactions or hospital-acquired infection. The current assessment of risk in these patients, compounded with failings in providing a safe inpatient environment, appears to predispose these patients to ward-based AEs. Our data suggest that these groups could be targeted in organisational quality improvement efforts to improve the safety of the ward environment and thereby reduce the high levels of AEs in vulnerable groups.
Both the Harvard method and the IHI GTT use multidisciplinary teams in which nurses screen episodes of care for AEs before physicians confirm the presence of an AE. We have demonstrated that with training, which in most cases did not exceed 2 days in duration, nurses can determine the presence of AEs and generate findings that conform to those of other methods in which this element of the case note review process is undertaken by physicians.
Our data demonstrate consensus between professional groups on the determination of harm events in around 80% of reviews undertaken. Differences are noted, however, and a number of AEs that clearly were events were not deemed to be significant by the physician reviewers, and vice versa. This has important implications for the composition of review teams and we suggest that, although experienced nurses with administrative support can undertake the majority of the review process, the presence of a physician in a review team will bring clinical expertise and a more holistic assessment of the health-care management process and clinical aspects of care. AE determination involving a single professional group will lead to underdetection of the spectrum of AEs commonly seen in clinical practice.
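Raw agreement of around 80% and chance-corrected reliability are different quantities, and the following Python sketch illustrates the distinction. Cohen's kappa is used here only as one common chance-corrected statistic; it is an assumption for illustration, not a statement of the reliability measure used in this study. The 2 × 2 counts are hypothetical and chosen so that raw agreement is 80%.

```python
def agreement_and_kappa(both_ae, nurse_only, physician_only, neither):
    """Return (observed agreement, Cohen's kappa) for two reviewers' AE judgements."""
    n = both_ae + nurse_only + physician_only + neither
    observed = (both_ae + neither) / n
    nurse_yes = (both_ae + nurse_only) / n
    physician_yes = (both_ae + physician_only) / n
    # Agreement expected by chance if the two reviewers judged independently.
    expected = nurse_yes * physician_yes + (1 - nurse_yes) * (1 - physician_yes)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for 100 paired reviews (not study data).
observed, kappa = agreement_and_kappa(
    both_ae=15, nurse_only=10, physician_only=10, neither=65
)
print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

With these hypothetical counts, 80% raw agreement corresponds to a kappa of about 0.47, because much of the agreement on 'no AE' would be expected by chance when AEs are relatively uncommon.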
There were problems with both the GTT and Harvard methods of case note review when implemented across Wales. The GTT under-reported AEs and the Harvard method is labour intensive in both review and data management time. We developed a nurse-led hybrid tool, the Harm2 tool, which combines the pragmatic process of the GTT with the measurement structure of the Harvard methodology, and implemented it in six Welsh health boards. The reported extent and nature of harm conform to phase 1 findings when the Harvard methodology was used. The Harm2 tool performed with the same levels of moderate reliability as the GTT and Harvard method in the determination of AEs/problems in care. Harm2 can be further streamlined without changing measurement structure for routine implementation in clinical practice. A pragmatic approach with a robust measurement structure offers the potential for routine monitoring of AEs in NHS organisations, providing organisational assurance, opportunities for learning and the identification of priorities for improvement.
We compared the rate and composition of AEs in two independent samples: a random sample of hospital discharges and a random sample of inpatient deaths. The rate and composition of AEs in these groups are significantly different and the composition of the sample in case note review examination will influence the findings generated. Surgical events occur infrequently during final episodes of care, but infections are common. We therefore report that AEs occur three times more frequently in patients who die in hospital but they are different in nature from the AEs that occur in a random sample of inpatient discharges and, furthermore, commonly occurring events, such as HAP, may be misclassified as AEs during the dying process. Careful consideration must be given to desired measurement end points and what the data generated are intended to inform before samples are drawn for case review studies. AEs occurring in patients who die in hospital are not representative of AEs occurring in 98% of the inpatient population who are discharged from acute care. We report that AEs arising from surgical and other invasive procedures are significantly under-represented and this group may not be an appropriate inpatient population to inform quality improvement priorities.
A harm-monitoring or surveillance programme can be realised in NHS settings with modest resources and will provide reliable estimates of the extent and nature of AEs either periodically or longitudinally. In addition, from data generated from the screening component of the Harm2 process, we have developed simple visual methods in the form of signatures of harm to profile the origin of harm events at the organisational level. Harm signatures, when tested with organisations, had face validity and confirmed organisational priorities for improvement. Furthermore, they can be used to prompt assessment, intervention and ongoing monitoring over time, enhancing the breadth of information generated through the quantification of AEs found.
Reviewing the record of care generates a rich source of information on how clinical, managerial and organisational processes may impact on patient outcomes. Case notes are, however, an imperfect source of data, and the difficulties faced by reviewers in navigating and finding information in them highlight the difficulties that may be encountered in navigating them in clinical practice. The written record of care needs to be an improvement priority across NHS Wales.
Future work
The concept of risk signatures is developmental but promising. Public Health Wales intends to provide support to Welsh health boards in order to understand patient safety issues at the organisational level through risk signatures, and an in-depth evaluation of this approach is warranted. Formal testing of their alignment with other patient safety data, such as incident reports and current improvement efforts, will provide a level of validation and an assessment of their utility in quantifying ongoing risk or progress made over time.
Our experience of three harm identification and measurement tools in Wales suggests that the determination of preventable harm is challenging for reviewers and may reflect the current definitions in use, which are not theoretically underpinned and do not consider the causal mechanisms through which AEs occur. This limits the direct clinical applicability of the quantification of AEs in health care and work is needed to explore the issue of preventable harm aiming to provide guidance and a more theoretically underpinned classification system. Drilling beneath headline rates to understand the how and the why of commonly occurring AEs could significantly improve our understanding of how to use harm and mortality data to prioritise where efforts are best targeted to realise quantifiable improvements in patient and organisational outcomes.
Future large-scale studies should attempt to specify in advance at least a large percentage of specific types of AEs, which should enable more precise tracking of both specific types of harm and the overall level of AEs. This will never be a complete solution, as there will always be problems that are rare or elude precise definition and that will require a generic ‘other’ category.
There are areas of clinical practice that have not benefited from the range of quality improvement initiatives and patient safety work seen in the general inpatient population. These include child health, maternity services and mental health, yet much of the risk described through the Harm2 tool is pertinent to these population groups. Small-scale studies are needed to test the utility of this approach within these specific patient groups and to ensure inclusion, rather than exclusion, as is currently the case, from patient safety monitoring.
Iatrogenic injury is challenging for health-care organisations, staff, patients and their families and carers, and there is a paucity of information on how organisations respond to the emotional and psychological impact in its aftermath. With calls for increased openness and candour and national policy on ‘Putting Things Right’ for patients, there is a need to assess how these systems currently operate and what support patients and their extended families articulate as being needed, mapped against what they were offered during the period following a preventable AE.
Acknowledgements
Special thanks are given to the management and research staff of NISCHR CRC for their commitment and diligence in undertaking the screening RF1 and Harm2 reviews across NHS Wales and the research physicians undertaking the MRF2 reviews. They provided the reliable infrastructure through which this study was possible. We are similarly thankful to health boards across NHS Wales who hosted the research teams and facilitated the retrieval of records and the NHS Wales Harm and Mortality Collaborative for providing fertile ground for this study to happen in Wales. Our local collaborators in research sites were Dr Graham Shortland, Mr Kamal Assad, Dr Bruce Ferguson, Dr Phil Kloer, Dr Brian Tehan and Dr Grant Robinson. Other key members of health boards included Dr Jason Shannon, Dr Dave Hope and Dr Steve Edwards, and we are additionally grateful for the support given by Dr Chris Jones, Medical Director of NHS Wales.
Sincere thanks go to Professor Stephen Palmer. The steering group and management group members also made a significant contribution to the study; we would especially like to thank Mr Stuart Stevenson, Dr Alan Willson, Dr Gareth Parry, Dr Jason Shannon, Mrs Joy Whitlock, Mrs Kate Hooton, Mrs Ann Biffin and Dr Luke Cowie.
Finally, we would like to thank all the administrative staff who organised and entered over 8000 reviews into databases over the last 5 years. Our sincere thanks go to Dawn Cassley, Robert Reader, David Robson, Rhian Marks, Phillip Combstock, Dafydd Rees and Mary McClutchen.
Contributions of authors
Dr Sharon Mayor designed both phases of the study, led on the developmental work around signatures and the Harm2 tool, performed the analysis on both phases of the study and wrote the report.
Mrs Elizabeth Baines project managed both phases across NHS Wales, designed training and manuals, undertook interim analysis, and contributed to report writing and organisational feedback.
Professor Charles Vincent contributed to study design, advised on methodology, sat on the steering group and contributed to the report.
Professor Annette Lankshear undertook the evaluation, wrote up phase 2 and contributed to the report.
Professor Adrian Edwards, co-applicant, was a steering and operational group member for phase 2 of the study and reviewed the report.
Professor Mansel Aylward advised on NHS issues and the operationalisation of the study across NHS Wales.
Dr Helen Hogan, steering group member, reviewed early analysis and contributed to and revised the report.
Professor Paul Harper advised on statistical methods for phase 1 of the study.
Mrs Jan Davies, steering group member, facilitated links with strategic development in patient safety and reviewed the report.
Dr Ameet Mamtora undertook thematic analysis of screening criteria in part fulfilment of his Bachelor of Science in Clinical Epidemiology.
Dr Emily Brockbank undertook thematic analysis of screening criteria in part fulfilment of her Bachelor of Science in Clinical Epidemiology.
Professor Jonathon Gray developed the concept of comparing the GTT and two-stage retrospective review process, was a steering group member and reviewed the report.
Data sharing statement
All data, with the exception of any data fields that would identify a particular study site, can be obtained by contacting the corresponding author.
Disclaimers
This report presents independent research funded by the National Institute for Health Research (NIHR). The views and opinions expressed by authors in this publication are those of the authors and do not necessarily reflect those of the NHS, the NIHR, NETSCC, the HS&DR programme or the Department of Health. If there are verbatim quotations included in this publication the views and opinions expressed by the interviewees are those of the interviewees and do not necessarily reflect those of the authors, those of the NHS, the NIHR, NETSCC, the HS&DR programme or the Department of Health.
References
- To Err Is Human. Washington, DC: National Academies Press; 1999.
- Crossing the Quality Chasm: A New Health System for the 21st Century. Washington, DC: National Academies Press; 2001.
- de Vries EN, Ramrattan MA, Smorenburg SM, Gouma DJ, Boermeester MA. The incidence and nature of in-hospital adverse events: a systematic review. Qual Saf Health Care 2008;17:216-23. http://dx.doi.org/10.1136/qshc.2007.023622.
- Classen DC, Resar R, Griffin F, Federico F, Frankel T, Kimmel N, et al. ‘Global trigger tool’ shows that adverse events in hospitals may be ten times greater than previously measured. Health Aff 2011;30:581-9. http://dx.doi.org/10.1377/hlthaff.2011.0190.
- Pronovost PJ, Wachter RM. Progress in patient safety: a glass fuller than it seems. Am J Med Qual 2014;29:165-9. http://dx.doi.org/10.1177/1062860613495554.
- Dixon-Woods M, McNicol S, Martin G. Ten challenges in improving quality in healthcare: lessons from The Health Foundation’s programme evaluations and relevant literature. BMJ Qual Saf 2012;21:876-84. http://dx.doi.org/10.1136/bmjqs-2011-000760.
- Morgan L, New S, Robertson E, Collins G, Rivero-Arias O, Catchpole K, et al. Effectiveness of facilitated introduction of a standard operating procedure into routine processes in the operating theatre: a controlled interrupted time series. BMJ Qual Saf 2015;24:120-7. http://dx.doi.org/10.1136/bmjqs-2014-003158.
- Wears RL. Improvement and evaluation. BMJ Qual Saf 2015;24:92-4. http://dx.doi.org/10.1136/bmjqs-2014-003889.
- Shojania KG, Marang-van de Mheen PJ. Temporal trends in patient safety in the Netherlands: reductions in preventable adverse events or the end of adverse events as a useful metric? BMJ Qual Saf 2015;24:541-4. http://dx.doi.org/10.1136/bmjqs-2015-004461.
- Vincent C, Amalberti R. Safety in healthcare is a moving target. BMJ Qual Saf 2015;24:539-40. http://dx.doi.org/10.1136/bmjqs-2015-004403.
- 1000 Lives Campaign Improves Patient Safety Across Wales. n.d.
- Palmer S. A Report to the Welsh Government Minister for Health and Social Services to Provide an Independent Review of the Risk Adjusted Mortality Data for Welsh Hospitals, Considering to What Extent These Measures Provide Valid Information, Focusing Initially on the Six Hospitals With a Welsh Risk Adjusted Mortality Index (RAMI) Score of Above 100 in the Data Published on Friday 21 March 2014 n.d. http://gov.wales/docs/dhss/publications/140716dataen.pdf (accessed 14 May 2016).
- Vincent C, Aylin P, Franklin BD, Holmes A, Iskander S, Jacklin A, et al. Is health care getting safer? BMJ 2008;337. http://dx.doi.org/10.1136/bmj.a2426.
- Vincent C, Neale G, Woloshynowych M. Adverse events in British hospitals: preliminary retrospective record review. BMJ 2001;322:517-19. https://doi.org/10.1136/bmj.322.7285.517.
- Vincent C. The Measurement and Monitoring of Safety. London: The Health Foundation; 2013.
- Sari AB, Sheldon TA, Cracknell A, Turnbull A, Dobson Y, Grant C, et al. Extent, nature and consequences of adverse events: results of a retrospective case note review in a large NHS hospital. Qual Saf Health Care 2007;16:434-9. http://dx.doi.org/10.1136/qshc.2006.021154.
- Zegers M, de Bruijne MC, Wagner C, Groenewegen PP, Waaijman R, van der Wal G. Design of a retrospective patient record study on the occurrence of adverse events among patients in Dutch hospitals. BMC Health Serv Res 2007;7. http://dx.doi.org/10.1186/1472-6963-7-27.
- Hutchinson A, Coster JE, Cooper KL, McIntosh A, Walters SJ, Bath PA, et al. Assessing quality of care from hospital case notes: comparison of reliability of two methods. Qual Saf Health Care 2010;19. http://dx.doi.org/10.1136/qshc.2007.023911.
- Landrigan CP, Parry GJ, Bones CB, Hackbarth AD, Goldmann DA, Sharek PJ. Temporal trends in rates of patient harm resulting from medical care. N Engl J Med 2010;363:2124-34. http://dx.doi.org/10.1056/NEJMsa1004404.
- Hogan H, Healey F, Neale G, Thomson R, Vincent C, Black N. To what extent are inpatient deaths preventable? The author’s reply. BMJ Qual Saf 2013;22:607-8. http://dx.doi.org/10.1136/bmjqs-2013-001857.
- Kennerly DA, Kudyakov R, da Graca B, Saldaña M, Compton J, Nicewander D, et al. Characterization of adverse events detected in a large health care delivery system using an enhanced global trigger tool over a five-year interval. Health Serv Res 2014;49:1407-25. http://dx.doi.org/10.1111/1475-6773.12163.
- Runciman W, Hibbert P, Thomson R, Van Der Schaaf T, Sherman H, Lewalle P. Towards an international classification for patient safety: key concepts and terms. Int J Qual Health Care 2009;21:18-26. http://dx.doi.org/10.1093/intqhc/mzn057.
- Zhan C, Miller M. Excess length of stay, charges, and mortality attributable to medical injuries during hospitalization. JAMA 2003;290:1868-74. https://doi.org/10.1001/jama.290.14.1868.
- Sari AB, Sheldon TA, Cracknell A, Turnbull A. Sensitivity of routine system for reporting patient safety incidents in an NHS hospital: retrospective patient case note review. BMJ 2007;334. http://dx.doi.org/10.1136/bmj.39031.507153.AE.
- Hulley S, Cummings S, Browner W, Grady D, Newman T, Hulley S, et al. Designing Clinical Research: An Epidemiological Approach. Philadelphia, PA: Lippincott; 2001.
- Thomas EJ, Petersen LA. Measuring errors and adverse events in health care. J Gen Intern Med 2003;18:61-7. https://doi.org/10.1046/j.1525-1497.2003.20147.x.
- Reason J. Human error: models and management. BMJ 2000;320:768-70. http://dx.doi.org/10.1136/bmj.320.7237.768.
- Brennan T, Leape L, Laird N, Hebert L, Lacalio A, Lawthers A, et al. Incidence of adverse events and negligence in hospitalized patients: results of the Harvard Medical Practice Study. N Engl J Med 1991;324:370-6. https://doi.org/10.1056/NEJM199102073240604.
- Leape LL, Bates DW, Cullen DJ, Cooper J, Demonaco HJ, Gallivan T, et al. Systems analysis of adverse drug events. ADE Prevention Study Group. JAMA 1995;274:35-43. https://doi.org/10.1001/jama.1995.03530010049034.
- Woloshynowych M, Neale G, Vincent C. Case record review of adverse events: a new approach. Qual Saf Health Care 2003;12:411-15. https://doi.org/10.1136/qhc.12.6.411.
- Neale G, Woloshynowych M. Retrospective case record review: a blunt instrument that needs sharpening. Qual Saf Health Care 2003;12:2-3. https://doi.org/10.1136/qhc.12.1.2.
- Wilson RM, Runciman WB, Gibberd RW, Harrison BT, Newby L, Hamilton JD. The Quality in Australian Health Care study. Med J Aust 1995;163:458-71.
- Thomas EJ, Studdert DM, Burstin HR, Orav EJ, Zeena T, Williams EJ, et al. Incidence and types of adverse events and negligent care in Utah and Colorado. Med Care 2000;38:261-71. https://doi.org/10.1097/00005650-200003000-00003.
- Davis P, Lay-Yee R, Briant R, Ali W, Scott A, Schug S. Adverse events in New Zealand public hospitals occurrence and impact. N Z Med J 2002;115.
- Baker GR, Norton PG, Flintoft V, Blais R, Brown A, Cox J, et al. The Canadian Adverse Events Study: the incidence of adverse events among hospital patients in Canada. CMAJ 2004;170:1678-86. https://doi.org/10.1503/cmaj.1040498.
- Soop M, Fryksmark U, Köster M, Haglund B. The incidence of adverse events in Swedish hospitals: a retrospective medical record review study. Int J Qual Health Care 2009;21:285-91. http://dx.doi.org/10.1093/intqhc/mzp025.
- Aranaz-Andrés J, Aibar-Remón C, Vitaller-Burillo J, Ruiz-Lopez P, Limon-Ramirez R, Terol-Garcia E. ENEAS work group: incidence of adverse events related to health care in Spain: results of the Spanish National Study of Adverse Events. J Epidemiol Community Health 2008;62:1022-9. https://doi.org/10.1136/jech.2007.065227.
- Aranaz-Andrés JM, Aibar-Remón C, Vitaller-Burillo J, Requena-Puche J, Terol-García E, Kelley E, et al. ENEAS work group. Impact and preventability of adverse events in Spanish public hospitals: results of the Spanish National Study of Adverse Events (ENEAS). Int J Qual Health Care 2009;21:408-14. http://dx.doi.org/10.1093/intqhc/mzp047.
- Zegers M, de Bruijne MC, Wagner C, Hoonhout LH, Waaijman R, Smits M, et al. Adverse events and potentially preventable deaths in Dutch hospitals: results of a retrospective patient record review study. Qual Saf Health Care 2009;18:297-302. http://dx.doi.org/10.1136/qshc.2007.025924.
- Weingart SN, Davis RB, Palmer RH, Cahalane M, Hamel MB, Mukamal K, et al. Discrepancies between explicit and implicit review: physician and nurse assessments of complications and quality. Health Serv Res 2002;37:483-98. https://doi.org/10.1111/1475-6773.033.
- Naessens JM, O’Byrne TJ, Johnson MG, Vansuch MB, McGlone CM, Huddleston JM. Measuring hospital adverse events: assessing inter-rater reliability and trigger performance of the Global Trigger Tool. Int J Qual Health Care 2010;22:266-74. http://dx.doi.org/10.1093/intqhc/mzq026.
- Unbeck M, Schildmeijer K, Henriksson P, Jürgensen U, Muren O, Nilsson L, et al. Is detection of adverse events affected by record review methodology? An evaluation of the “Harvard Medical Practice Study” method and the “Global Trigger Tool”. Patient Saf Surg 2013;7. http://dx.doi.org/10.1186/1754-9493-7-10.
- Mattsson TO, Knudsen JL, Lauritsen J, Brixen K, Herrstedt J. Assessment of the global trigger tool to measure, monitor and evaluate patient safety in cancer patients: reliability concerns are raised. BMJ Qual Saf 2013;22:571-9. http://dx.doi.org/10.1136/bmjqs-2012-001219.
- National Coordinating Council for Medication Error Reporting and Prevention n.d. www.nccmerp.org (accessed 23 January 2017).
- Hartwig SC, Denger SD, Schneider PJ. Severity-indexed, incident report-based medication error-reporting program. Am J Hosp Pharm 1991;48:2611-16.
- Griffin FA, Resar RK. IHI Innovation Series White Paper. IHI Global Trigger Tool for Measuring Adverse Events (Second Edition). Cambridge, MA: IHI; 2009.
- von Plessen C, Kodal AM, Anhøj J. Experiences with global trigger tool reviews in five Danish hospitals: an implementation study. BMJ Open 2012;2. http://dx.doi.org/10.1136/bmjopen-2012-001324.
- Delivering Safe Care, Compassionate Care: Learning for Wales From the Report of the Mid Staffordshire NHS Foundation Trust Public Inquiry. Cardiff: Welsh Government; 2013.
- Francis R. Independent Inquiry into Care Provided by Mid Staffordshire NHS Foundation Trust. January 2005-March 2009. London: The Stationery Office; 2010.
- More Care, Less Pathway: A Review of the Liverpool Care Pathway. London: Department of Health/Her Majesty’s Stationery Office; 2013.
- Hospital Episode Statistics: Admitted Patient Care 2013–14. Leeds: Health and Social Care Information Centre; 2013.
- What Do We Know Now that We Didn’t Know a Year Ago? New Intelligence on End of Life Care in England. Leeds: NHS National End of Life Care Programme; 2012.
- NCC MERP Taxonomy of Medication Errors n.d. www.nccmerp.org/sites/default/files/taxonomy2001-07-31.pdf (accessed February 2017).
- Improving Healthcare White Paper Series Number 10. Providing Assurance, Driving Improvement. Learning from Mortality and Harm Reviews in NHS Wales. Cardiff: 1000 Lives Plus; 2013.
- Achieving Excellence: The Quality Delivery Plan for the NHS in Wales 2012–2016. Cardiff: Welsh Government; 2012.
- Hogan H, Healey F, Neale G, Thomson R, Vincent C, Black N. Preventable deaths due to problems in care in English acute hospitals: a retrospective case record review study. BMJ Qual Saf 2012;21:737-45. http://dx.doi.org/10.1136/bmjqs-2011-001159.
- Altman DG. Practical Statistics for Medical Research. London: Chapman & Hall; 1991.
- The Research Cycle: Measuring Harm. Geneva: World Health Organization; n.d.
- Tong A, Sainsbury P, Craig J. Consolidated criteria for reporting qualitative research (COREQ): a 32-item checklist for interviews and focus groups. Int J Qual Health Care 2007;19:349-57. http://dx.doi.org/10.1093/intqhc/mzm042.
- Thornlow DK. Increased risk for patient safety incidents in hospitalized older adults. Medsurg Nurs 2009;18:287-91.
- Rothschild JM, Bates DW, Leape LL. Preventable medical injuries in older patients. Arch Intern Med 2000;160:2717-28. https://doi.org/10.1001/archinte.160.18.2717.
- Gray CL, Gardner C. Adverse drug events in the elderly: an ongoing problem. J Manag Care Pharm 2009;15:568-71. https://doi.org/10.18553/jmcp.2009.15.7.568.
- Van De Steeg L, Langelaan M, Wagner C. Can preventable adverse events be predicted among hospitalized older patients? The development and validation of a predictive model. Int J Qual Health Care 2014;26:547-52. http://dx.doi.org/10.1093/intqhc/mzu063.
- Boaden A. Fix Dementia Care: Hospitals. London: Alzheimer’s Society; 2016.
- Black J, Edsbert L, Baharestani M, Langemo D, Goldberg M, McNichol L, et al. Pressure ulcers: avoidable or unavoidable? Results of the national pressure ulcer advisory panel consensus conference. Ostomy Wound Manage 2011;57:24-37.
- Guy H, Downie F, McIntyre L, Peters J. Pressure ulcer prevention: making a difference across a health authority? Br J Nurs 2013;22. http://dx.doi.org/10.12968/bjon.2013.22.Sup12.S4.
- Baines RJ, Langelaan M, de Bruijne MC, Wagner C. Is researching adverse events in hospital deaths a good way to describe patient safety in hospitals: a retrospective patient record review study. BMJ Open 2015;5. http://dx.doi.org/10.1136/bmjopen-2014-007380.
- Baines RJ, Langelaan M, de Bruijne MC, Asscheman H, Spreeuwenberg P, van de Steeg L, et al. Changes in adverse event rates in hospitals over time: a longitudinal retrospective patient record review study. BMJ Qual Saf 2013;22:290-8. http://dx.doi.org/10.1136/bmjqs-2012-001126.
- Baines R, Langelaan M, de Bruijne M, Spreeuwenberg P, Wagner C. How effective are patient safety initiatives? A retrospective patient record review study of changes to patient safety over time. BMJ Qual Saf 2015;24:561-71. http://dx.doi.org/10.1136/bmjqs-2014-003702.
- Mills DH. Medical insurance feasibility study. A technical summary. West J Med 1978;128:360-5.
- Nabhan M, Elraiyah T, Brown DR, Dilling J, LeBlanc A, Montori VM, et al. What is preventable harm in healthcare? A systematic review of definitions. BMC Health Serv Res 2012;12. http://dx.doi.org/10.1186/1472-6963-12-128.
- Clancy CM. Alleviating “second victim” syndrome: how we should handle patient harm. J Nurs Care Qual 2012;27:1-5. http://dx.doi.org/10.1097/NCQ.0b013e3182366b53.
- Grissinger M. Too many abandon the “second victims” of medical errors. P T 2014;39:591-2.
- Wu AW. Medical error: the second victim. The doctor who makes the mistake needs help too. BMJ 2000;320:726-7. https://doi.org/10.1136/bmj.320.7237.726.
- Good VS, Saldaña M, Gilder R, Nicewander D, Kennerly DA. Large-scale deployment of the Global Trigger Tool across a large hospital system: refinements for the characterisation of adverse events to support patient safety learning opportunities. BMJ Qual Saf 2011;20:25-30. http://dx.doi.org/10.1136/bmjqs.2008.029181.
- Aranaz-Andrés JM, Limón R, Mira JJ, Aibar C, Gea MT, Agra Y. ENEAS Working Group. What makes hospitalized patients more vulnerable and increases their risk of experiencing an adverse event? Int J Qual Health Care 2011;23:705-12. http://dx.doi.org/10.1093/intqhc/mzr059.
- Naessens JM, Campbell CR, Shah N, Berg B, Lefante JJ, Williams AR, et al. Effect of illness severity and comorbidity on patient safety and adverse events. Am J Med Qual 2012;27:48-57. http://dx.doi.org/10.1177/1062860611413456.
- Mayer E, Flott K, Callahan R, Darzi A. National Reporting and Learning System Research and Development. London: Imperial College London; 2016.
- Dealey C, Posnett J, Walker A. The cost of pressure ulcers in the United Kingdom. J Wound Care 2012;21:261-2. http://dx.doi.org/10.12968/jowc.2012.21.6.261.
- Jenks PJ, Laurent M, McQuarry S, Watkins R. Clinical and economic burden of surgical site infection (SSI) and predicted financial consequences of elimination of SSI from an English hospital. J Hosp Infect 2014;86:24-33. http://dx.doi.org/10.1016/j.jhin.2013.09.012.
- 1000 Lives Campaign. Global Trigger Tool How to Guide n.d. http://www.wales.nhs.uk/documents/Global%20Trigger%20Tool-%20How%20to%20Guide.pdf (accessed February 2017).
Appendix 1 Modular review form: doctor assessment of harm from the Harvard method
Reproduced from Case record review of adverse events: a new approach, Woloshynowych M, Neale G, Vincent C, vol. 12, pp. 411–15, 2003, with permission from BMJ Publishing Group Ltd. 30
Appendix 2 Review form 1
Appendix 3 Overlap in the identification of adverse events using different tools and methods
In 3504 matched GTT and two-stage retrospective review cases, the nursing cohort identified 617 AEs, of which 341 were subsequently confirmed as AEs in the physician review. In the same records, the GTT identified 294 events through the NHS-led process, of which 153 were not identified through the two-stage retrospective review process (Figure 37). There is no direct overlap between MRF2- and GTT-identified AEs, as MRF2 assessment is not an independent process from the RF1 process.
On review of the 153 AEs picked up by the GTT and not the Harvard method, we determined that around 15 AEs could have been missed by the Harvard method, but the remainder of the events were accident and emergency department misclassifications mainly arising from multiple triggers being classified as multiple AEs.
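The counts reported in this appendix can be cross-checked with simple arithmetic. The sketch below is illustrative only: the variable names are hypothetical, the figure of 15 is the approximate value quoted in the text, and it assumes that every GTT event was either also found by the two-stage review or found by the GTT alone.

```python
# Counts quoted in the text for the 3504 matched records.
nurse_identified_aes = 617     # AEs flagged at the nursing review stage
physician_confirmed_aes = 341  # of those, confirmed at physician review
gtt_events = 294               # events identified by the GTT
gtt_only_events = 153          # GTT events not found by the two-stage review
possibly_missed = 15           # "around 15" in the text: approximate, not exact

# Assumption: GTT events are either shared with the review or GTT-only.
gtt_events_also_in_review = gtt_events - gtt_only_events

# Share of nurse-flagged AEs that physicians went on to confirm.
confirmation_rate = physician_confirmed_aes / nurse_identified_aes

# GTT-only events attributed to misclassification (approximate).
gtt_only_misclassified = gtt_only_events - possibly_missed

print(f"GTT events also found by two-stage review: {gtt_events_also_in_review}")
print(f"Physician confirmation rate: {confirmation_rate:.1%}")
print(f"GTT-only events attributed to misclassification: ~{gtt_only_misclassified}")
```

Because "around 15" is an estimate, the last figure should be read as approximate.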
Appendix 4 Global Trigger Tool triggers
Reproduced with permission from Global Trigger Tool How to Guide. 81
Appendix 5 Failures in care and their association with readmissions across NHS Wales
Theme | Definition of theme | Example from data |
---|---|---|
Failure of care | ||
Diagnosis | A diagnosis that was not made on the index admission therefore resulting in repeated readmissions with the same or similar symptoms | Pneumothorax missed during two previous admissions |
| | An incorrect diagnosis made on the index admission which resulted in a readmission with the same or similar symptoms | Patient admitted via fracture clinic . . . diagnosed with arthritis as per X-ray . . . readmitted and diagnosis changed to a fracture |
Ineffective symptom management | The management of symptoms that was not performed adequately on a previous admission therefore resulting in a readmission | Admitted with acute exacerbation of COPD . . . frequent admissions pre and post this episode . . . ‘there appears to be no cohesive management in place’ |
Drug toxicity | Symptoms that arise because of the patient falling outside the advised drug therapeutic index therefore requiring a readmission | 69-year-old patient with AF on warfarin admitted on two occasions with epistaxis |
Communication | Breakdown in communication between health professionals which contributed to a patient readmission | Patient transferred between hospitals . . . ‘no documents regarding this admission in the medical records’ |
Falls outside the hospital | A patient known to be susceptible to falls who is readmitted after a further fall | Patient admitted on three occasions suffering several falls at home, resulting in dislocations and fractures after undergoing a total hip replacement |
Post operative | Complications that occur following surgery which result in a readmission | Patient underwent a laparotomy during index admission . . . 4 weeks later they were readmitted with breathlessness, right calf swelling and hypotension . . . ECHO revealed a massive pulmonary embolism |
Prolonged waiting times | A readmission while waiting for a procedure or outpatient appointment | Numerous admissions in the past 12 months for recurrent vomiting while awaiting input from another hospital appointment |
Social care inadequacy | Social care package that is inadequate for the patient resulting in a readmission | Patient admitted with basal pneumonia and confusion . . . discharge with care package . . . readmitted 2 months later with deterioration in confusion |
Hospital-acquired infection | A readmission because of an infection that presented post discharge from the index admission | Patient admitted with general deterioration . . . treated for hospital-acquired infection as a result of his recent discharge the day before |
Premature discharge | Discharging the patient when their symptoms had not been managed adequately therefore resulting in a readmission with the same or similar symptoms | Patient had been admitted at 00:45 with diarrhoea and vomiting and was discharged by 10:00 . . . subsequently readmitted later the same day with recurring symptoms |
Theme | Definition of theme | Example from data |
---|---|---|
Nature of disease | ||
Routine disease management | A readmission that occurs because of the natural management of a disease | Patient diagnosed with breast cancer required further surgery owing to spread of cancer to axillary lymph nodes |
Unrelated to index admission | A readmission that consisted of different symptoms compared with the index admission | Patient was admitted with a severe sore throat and diagnosed with quinsy . . . readmitted following a fractured ankle |
Hospital transfer | An unplanned readmission as a result of the transfer of a patient to another hospital | Patient admitted with chest pain and transferred to another hospital for an angiography |
Patient factor | Factors relating to the patient declining the advised medical treatments or the non-attendance of patients to booked clinic appointments | Patient admitted with a fracture of the left tibia but patient absconded . . . subsequently readmitted owing to worsening pain but self-discharged against medical advice |
Procedural issue | An issue relating to a procedure that requires the patient to be readmitted at a later date | Patient discharged after first per oesophageal gastric tube failed to insert . . . readmitted 6 days later for a second attempt under radiological guidance |
Planned readmission | Patients who have been readmitted as planned in advance | Patient admitted from outpatient department for gastric outlet obstruction and was readmitted as planned for a blood transfusion |
Appendix 6 Summary of the derived themes under criterion 18, other undesirable outcomes with an associated definition and example
Theme | Definition of theme | Example from data |
---|---|---|
| | Issues caused by staff shortages, and those caused by delays in the process of clerking, including delay in medication prescribing and response by the on-call team | 30-year-old female. Laparoscopic cholecystectomy. Operation cancelled in anaesthetic room because of unavailable equipment |
| | Resource constraints include bed availability, drugs and procedures, which contributed to the experienced delays either directly or because of cancellations | 59-year-old male. Aspiration of infected knee. Post-operative blood transfusion commenced late because of shift staff shortage |
| | | 80-year-old male. Admitted due to TIA. Inappropriate referral to dermatology leading to treatment delay |
| | Issues related to the decision-making process of health-care professionals involved with the patient journey. The majority of reports were inappropriate decisions made relating to the treatment or intervention the patient received | 30-year-old male. Admitted for i.v. antibiotics. History of i.v. drug abuse, clean for 5 months in rehabilitation. Patient prescribed Oramorph® (Boehringer Ingelheim, Bracknell, UK) over methadone, the drug of choice for patients in rehabilitation |
| | | 21-year-old female. Admitted with pain and diagnosed with appendicitis. Discharged and treated conservatively with elective appendectomy booked. Patient readmitted three times before surgery was performed |
Communication | ||
| | Ineffectual communication between patient and professional, giving rise to the potential for the behaviour of staff to impact on psychological well-being, as well as the health-care process | 19-year-old female. Admitted with anaphylaxis. Patient had been previously given Epipen® (Mylan, Maidenhead) but stated they had not been told how to use it properly |
| | | 55-year-old female. Fractured ankle from fall, required surgical fixation. Patient’s first language is Polish, causing issues with preoperative understanding but no formal interpretation was sought |
| | Problems in the documented word that included insufficient records, discrepancies between notes, and notes that were missing | 85-year-old male. Admitted after 2 days of vomiting, history of bowel cancer. Cardiac arrest on third day of admission, unable to resuscitate. No medical notes written during admission until arrest. Lack of observations on chart and missing blood test results. No documented referral to acute care team |
| | Problem in the organisation and facilitation of further assessment or treatment | 35-year-old male. Seen in clinic after acute admission of abdominal pain. Prescribed steroids and urgent follow-up requested. Patient never received urgent colonoscopy and remained on steroids and not monitored for 11 months. Was then admitted with acute pancreatitis |
| | Issues arising as a result of insufficient investigations being carried out or by submitting patients to unnecessary treatments or interventions | 47-year-old male. Admitted with sore throat. Diagnosed with lower respiratory tract infection. Patient became septic during second week of admission and neck abscess found. Correct diagnosis took 9 days to be identified |
| | | 60-year-old female. Admitted with sepsis, pain, diarrhoea and vomiting. Had been discharged recently with treatment for atypical flu. Further investigations found pelvic abscess. Correct diagnosis missed during first admission |
Appendix 7 Harm2 tool
Appendix 8 Example of the description used for screening criteria in the Harm2 manual
Definition: an infection is considered to be hospital acquired once the patient has been in hospital for 72 hours or more. The evidence of infection may be clinical (local or systemic evidence) or combined with a positive microbiological culture.
Exceptions: infections acquired prior to admission to hospital unrelated to health-care management, for example a patient admitted with a chest infection with no previous health-care intervention.
Discussion and data retrieval: for patients admitted with infections check for evidence of prior health-care management and where infection may have been acquired; for example, a patient transferred from one hospital to another for ongoing management of a wound infection acquired during hospitalisation should be recorded as YES.
For infections manifesting themselves 72 hours post admission, check for evidence of an invasive procedure. Note any breaches in asepsis or any other potential infection risks a patient may be exposed to. Note if antibiotic prophylaxis is used in prosthetic surgery and look for delays in diagnosis and treatment of infections. The reviewer needs to look at the microbiology reports and the corresponding dates in the progress notes to identify both clinical and laboratory evidence of infection. What to look for as confirmation of infections:
- Urine microbiological reports: look at the white cell count (WCC) for clues of contamination versus infection. A WCC of > 100 suggests infection. Epithelial cells > 100 suggest a high probability of contamination. Positive nitrates suggest possible infection. Therefore, the following are all good indications of infection:
  - WCC > 100
  - epithelial cells < 10
  - nitrates positive.
- More than three organisms cultured generally indicates a contaminated specimen.
- Wound infections will have:
  - clinical evidence (i.e. redness, discharge, etc.)
  - systemic evidence (i.e. fever of > 38 °C for more than 2 days following a procedure and an increased WCC with a positive wound culture reported as microbiological evidence).
Other infections including blood, chest, etc. will appear with similar signs and symptoms and microbiological confirmation. The reviewer should describe the event to include the date, the nature of the infection and the treatment.
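The screening rules described above (the 72-hour definition, the prior health-care exception and the urine report rules of thumb) can be sketched in code. This is a minimal illustration, not part of the Harm2 manual: the function and parameter names are hypothetical, the thresholds are copied from the text without units because the manual does not state them, and the term "nitrates" is kept as written in the source.

```python
from datetime import datetime, timedelta

# Threshold taken from the definition: in hospital for 72 hours or more.
HOSPITAL_ACQUIRED_AFTER = timedelta(hours=72)


def infection_criterion_met(admitted_at, evidence_at, linked_to_prior_health_care=False):
    """Return True if the infection would be recorded as YES (hypothetical helper)."""
    # An infection tied to earlier health-care management counts even if it
    # was already present on admission (e.g. a transferred wound infection).
    if linked_to_prior_health_care:
        return True
    return evidence_at - admitted_at >= HOSPITAL_ACQUIRED_AFTER


def urine_report_flags(wcc, epithelial_cells, nitrates_positive, organisms_cultured):
    """List which rule-of-thumb indicators a urine report satisfies."""
    infection = []
    if wcc > 100:
        infection.append("WCC > 100")
    if epithelial_cells < 10:
        infection.append("epithelial cells < 10")
    if nitrates_positive:
        infection.append("nitrates positive")

    contamination = []
    if epithelial_cells > 100:
        contamination.append("epithelial cells > 100")
    if organisms_cultured > 3:
        contamination.append("more than three organisms cultured")

    return {"infection": infection, "contamination": contamination}


# Illustrative values only; these are not taken from any reviewed record.
admitted = datetime(2011, 3, 1, 9, 0)
print(infection_criterion_met(admitted, admitted + timedelta(hours=96)))
print(urine_report_flags(wcc=150, epithelial_cells=5,
                         nitrates_positive=True, organisms_cultured=1))
```

The flags are returned as lists rather than a single verdict because the manual asks the reviewer to weigh clinical and laboratory evidence together; the sketch only records which indicators are present.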
Examples: wound infection and elective admission for cardiac surgery. Postoperatively the patient’s sternal wound developed an infection requiring return to theatre for debridement and resuturing.
Hospital-acquired pneumonia: elective admission with mutilating rheumatoid arthritis. Treated with antibiotics and epoprostenol infusion. Wounds redressed. Reviewed by registrar prior to discharge and noted to be hypotensive 94/50. Stat gelofusion given i.v. Last entry in nursing notes 23 March 2011. Doctor’s notes confirm patient discharged same day. Patient readmitted 26 March 2011; diagnosed with HAP.
Clostridium difficile: patient on chemotherapy. Hospital admission with fever and diagnosed with neutropenic sepsis. Also complained of diarrhoea and diagnosed with Clostridium difficile from inpatient stool sample. Treated with antibiotic therapy and chemotherapy treatment suspended temporarily.
Sepsis: admitted for Hartmann’s procedure for cancer. A week later became septic (pelvic collection and pneumonia) and had a cardiac arrest. Went to ITU for 3 days. Monitoring in atrial fibrillation (AF) – treated with amiodarone. Also chronic cardiac failure exacerbation. Treated with i.v. antibiotics. Poor glycaemic control. Transferred to community hospital.
Other: developed norovirus while on the ward. Ward was closed due to norovirus infection.
Appendix 9 Examples of Harm2 adverse events falling into the NCC MERP’s Categorizing Medication Errors Index in the randomly selected discharge reviews
Severity code | Summative overview of episode of care |
---|---|
E | Patient admitted for a suspected DVT 2 weeks after discharge for total left hip replacement. DVT confirmed and warfarin started. Referred to OT physiotherapy and INR clinic prior to discharge. There was a delay in DVT confirmation, 4 days, because no one was available to perform Doppler. Previous admission for THR appeared uneventful and patient discharged on rivaroxaban, DVT was noted as a risk on consent form |
| | Elective admission for right total knee replacement. Uneventful procedure, but patient experiencing nausea, vomiting and poor pain control postoperatively. Treated for dehydration and suspected HAP following an episode of tachypnoea, pyrexia and raised heart rate on postoperative day 2. Continued to feel unwell and diagnosed with withdrawal from gabapentin on postoperative day 3 (drug had been omitted since surgery and patient usually takes high dose). Gabapentin re-introduced prior to discharge |
F | 82-year-old female admitted with sudden-onset dysphasia. Known to have AF previously. In 2012 was started on bisoprolol for rate control. Documented in notes previously that ‘may need warfarin’; however, on discharge says ‘no follow-up’, so limited information on why patient not anticoagulated and noted on most recent admission. SHO discussed with GP whether patient had been on warfarin in the past but no record of this either. Diagnosed with PAF and prescribed rivaroxaban on discharge |
| | Acute admission owing to 6-week history of vomiting and nausea. Bloods revealed acute kidney injury and hypernatraemia |
G | Patient admitted with exacerbation of COPD and pneumonia. GP had commenced antibiotics, but patient was yet to start them. When in A&E it was decided to commence patient on urgent BIPAP. Taken to ITU and intubated, NG feed commenced and i.v. antibiotics given. Once extubated patient did well, repeat chest radiography showed an improvement in consolidation. Seen by respiratory nurse and commenced on Tosca transcutaneous monitor (Radiometer, Bronshoj, Denmark). Discharged home once improved with repeat radiographs and follow-up. Lung cancer diagnosis made within weeks of discharge |
| | This 94-year-old, who was normally very fit (with no significant previous medical history) was admitted generally weak and unwell and aching. Mobility poor and completely lethargic. She had not responded to a 7-day care course of steroids (prednisolone). Suffered a fall (slip) backwards on day 3 of admission at the bedside and suffered a head wound. Incident report form completed and family informed of fall. Deteriorated from this point and transferred to rehabilitation. No ECG or CT scan of chest, abdomen and pelvis had been ordered despite earlier reference to these being awaited. Subsequently had a vasovagal episode while mobilising, no actual LOC. Also complaining of mild supra pubic discomfort. CT ordered and mention of DNCPR decision in notes. Four days later, CT completed, evidence of retroperitoneal lymphoma, with gas in abdomen from unidentified site. Not for surgical intervention – deterioration over next few days and passed away peacefully |
H | This patient was admitted from A&E with a history of recently being discharged from ** where he had a right hip replacement. Following this he had a cardiac arrest due to increased potassium levels and also needed temporary pacing wires because of ventricular standstill. He also had a history of weight loss and indigestion pain and since going home has been experiencing coughing and increasing shortness of breath and chest pain. On admission, ECG left bundle branch block referred to cardiology and sent to CCU. Echocardiography showed pleural effusion secondary to chest infection. Patient put on i.v. antibiotics. Pericardial effusion not drained initially because of infection. Patient in and out of AF during admission and treated appropriately. Temperature kept increasing therefore blood cultures were taken and another set of antibiotics started. CT also showed pericardial effusion. Chest drain inserted later as patient increased shortness of breath drained and then removed then transferred to ** ward then home 4 days later |
| | 86-year-old lady with dementia and recurrent UTIs admitted with fractured femur having fallen at home. She underwent a left hemiarthroplasty in theatre. Recovery was slow postoperatively, although the surgery was successful she was found to be unresponsive on the commode 5 days post operation. The arrest team was called and she made full recovery. This was thought to be because of dehydration and low HB for which she was already receiving i.v. fluids and blood transfusion. She fell in the ward while trying to mobilise without assistance. She developed a rash secondary to her antibiotic, which then cleared when antibiotic was discontinued. It was noted that her mental capacity had deteriorated considerably and she was eventually discharged to a nursing home once medically fit, but was deemed as lacking capacity |
I | Admitted from home via A&E h/o collapse and severe pain. Analgesia, fluids and i.v. antibiotics administered. CT undertaken. Catheterised. ECG. Transferred to theatre – ruptured right iliac aneurysm and small AAA. Transferred to ITU. Sedated and ventilated. Day 4/7 faecal matter aspirated from mouth. NG replaced remains on dialysis. Left leg ischaemic. Hypotensive. Returned to theatre on day 9 of the hospitalisation as a result of a nick/suture during AAA repair procedure. Gross faecal peritonitis. Remained hypotensive. RIP |
| | Patient elective admission for gastrojejunostomy because of duodenal lesion, background of chronic pancreatitis due to alcoholism. Found not malignant but chronic inflammation patient suddenly deteriorated on admission because of arterial embolus, likely cause was poor management of CVP line. Booked for elective surgery but unable to have as ITU full. Readmitted for surgery, but patient died when waiting for surgery |
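Severity codes E to I in the table correspond to the harm categories of the NCC MERP index. For readers who want to tally reviews by code, a minimal sketch follows. The category wording is paraphrased from the NCC MERP index rather than taken from this report, and the function name and sample list are hypothetical.

```python
from collections import Counter

# Harm categories E to I of the NCC MERP index (paraphrased wording).
NCC_MERP_HARM_CATEGORIES = {
    "E": "temporary harm requiring intervention",
    "F": "temporary harm requiring initial or prolonged hospitalisation",
    "G": "permanent harm",
    "H": "intervention required to sustain life",
    "I": "death",
}


def tally_severity(codes):
    """Count reviews per severity code, rejecting codes outside E to I."""
    unknown = sorted(set(codes) - set(NCC_MERP_HARM_CATEGORIES))
    if unknown:
        raise ValueError(f"Unknown severity code(s): {unknown}")
    counts = Counter(codes)
    # Report every category, including those with no reviews.
    return {code: counts.get(code, 0) for code in NCC_MERP_HARM_CATEGORIES}


# Hypothetical input: two example episodes per code, as in the table layout.
sample_codes = ["E", "E", "F", "F", "G", "G", "H", "H", "I", "I"]
for code, n in tally_severity(sample_codes).items():
    print(f"{code} ({NCC_MERP_HARM_CATEGORIES[code]}): {n}")
```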
Appendix 10 Examples of Harm2 adverse events falling into the NCC MERP’s Categorizing Medication Errors Index in the randomly selected deceased patient reviews
Severity code | Summative overview of episode of care |
---|---|
E | Patient admitted to frail elderly care unit from A&E. Short history of confusion and out of character behaviour with agitation and delusional ideas. Wife, unable to cope or manage him at home. Seen by GP previous day and prescribed trimethoprim for possible UTI. Patient previously well and independent prior to admission. Documented to have high alcohol intake and history of DVTs, AF and MI. On warfarin. Nursing notes only in this record. Patient nursed 1 : 1 and 2 : 1 because of aggression and unpredictable behaviour. Incident recorded where patient complained of being restrained by his carers, which caused bruising. This incident was investigated pending medical assessment of bruising – outcome not recorded. Patient had CVA during day 2 of admission. Treated conservatively – residual left-sided weakness and slurred speech. Patient had witnessed cardiac arrest on ninth day of inpatient stay and died following failed attempt at CPR |
| | Acute admission recent discharge from hospital with history of TIAs. Since previous admission, son had been living with mother in flat. This admission found unresponsive and ongoing UTI and urosepsis diagnosed but was on trimethoprim on admission and so not hospital acquired. Decreased mobility. Discharge planning was evident throughout but patient’s condition deteriorated – became weaker and decrease in appetite, confused, decrease in health. DNR form completed. Passed away. Pressure sore on heel noted – hospital acquired |
F | Readmission from home (inpatient due to chest infection 7 days previously). Found by neighbours who raised alarm and patient found to be unresponsive. Consciousness level decreased and patient developed dense hemiplegia while under observation in A&E department. Diagnosed with CVA but chest infection not ruled out as possible cause for symptoms. Treated with i.v. antibiotics but failed to improve from respiratory and neurological point of view. Supportive care provided up until patient’s death. Decision made for no active treatment (ventilation or CPR in event of patient’s demise) with family members. Drug error with insulin infusion occurred during patient’s stay, which was not reported to one consultant or the coroner until death certificate issued but coroner was satisfied this did not directly contribute to death |
94-year-old lady with dementia and learning difficulties admitted from care home with dehydration and urosepsis. Pressure sore prevalent on admission – patient seen by tissue viability team. Patient treated with i.v. antibiotics – cannula tissued many times and patient developed cellulitis. Patient treated with sliding scale to control blood sugar and i.v. fluids. Patient became oedematous +++ before i.v. fluids stopped. Missed i.v. drug administration due to ward pressures/demands. Patient S/B dental service and radiograph requested – but not performed? Why? Minimal oral diet and fluid intake throughout admission – no evidence of dietitian input. Patient’s condition deteriorated during admission. Patient failed to respond to i.v. antibiotics. Patient commenced on all-Wales integrated care priorities for the last days of life | |
G | Patient with numerous comorbidities attending hospital for routine dialysis appointment. Sustained a fall in a car park, found by staff with fractured hip and required transfer to A&E. Hip arthroplasty performed that night under care of orthopaedics. Following successful surgery patient had an episode of coffee ground vomit. Refused OGD investigations in spite of warning of seriousness. Remained an inpatient for several weeks, attended for dialysis appointments but generally non-compliant with nursing and medical care including pressure area care and mobility. Developed pressure sores in spite of appropriate assessment and intervention by tissue viability service attempts. Required wound debridement and continued to deteriorate and developed wound infections and chest infections. Treated with antibiotics and continued to refuse nursing care, nutritional support and medication. Discussion with family and decision made not to escalate care. Died on ward |
This review involved the death of a 52-year-old woman from documented cervical cancer and metastases of bone, liver and lung. The care she received during the final inpatient episode was excellent; however, her initial diagnosis was delayed by 5 months because her referral did not reach the obs and gynae team here at the HB. It is important to state the consequences of this delayed diagnosis, as she was symptomatic on presentation to the GP | |
H | Complicated case of an 87-year-old with RA and HF being admitted from a nursing home with a fractured NOF. Initially due to poor underlying health status was poor anaesthetic risk and treated conservatively. Became septic from chest and prescribed Tazocin® (Pfizer, New York City, NY, USA), acknowledging C. diff risk. Became C. diff positive and treated with metronidazole. Condition improved and hemiarthroplasty performed – risk considered, P.POSSUM risk score calculated and mortality was 40–46% and morbidity was 92–3%. The family was aware. Complicated postoperative period with two further episodes of sepsis and sub-acute obstruction. Notes are chaotic and difficult story to follow, but mortality seemed to be an inevitable end point for this gentleman |
Male with history of myelodysplasia. Had received first cycle of chemotherapy and discharged. Follow-up arranged at local chemotherapy day unit 3/7 later. Acute admission via A&E following 999 call in the early hours. Complaining of sudden temperature of 38 °C, PR bleeding and urinary frequency. Neutrophil count, 0.0. Diagnosed with neutropenic sepsis, suspected diverticulitis and urinary retention. Platelets low. Neutropenic sepsis regime initiated – i.v. fluids and i.v. antibiotics. Platelet transfusion and patient catheterised. Admitted to medical/oncology ward. Patient very unwell – hypotensive, acidosis – transferred to HDU later that day. Remains very unwell, exhausted and decision to intubate, also NG tube passed. Medical team receiving advice from local haematologist and also haematology at XXXX. Clinically continues to deteriorate – prognosis poor. Family aware of situation. Iatrogenic incident involving syringe pump failure to deliver inotropic agent and patient arrested, patient chest radiography performed and no rib fracture following chest compressions. Blood pressure became very variable – unstable requiring intensive support but continues to deteriorate. Discussed with family – DNAR and d/w family again regarding prognosis and possibility of withdrawing treatment. Blood pressure low, 50/33 mmHg, heart rate 50 b.p.m. Reviewed by clinicians. Decision made to withdraw treatment. Noradrenaline and adrenaline discontinued. Patient appeared comfortable and died soon after | |
I | This patient was admitted acutely with back pain following a fall. She had been assessed in ** immediately following the fall 5 days previously, radiograph showed no bony injury. Her mobility had been decreasing following the first week of chemotherapy for Ca brain. She was admitted for pain control. E. coli UTI diagnosed after 5 days as inpatient – treated with antibiotics. Became rapidly unwell and pneumonia diagnosed following chest radiography. Patient’s relatives voiced concerns over care, regarding pain relief and patient’s constipation not being addressed. Patient passed away following day |
Recent excision of BCC × 3 to leg and SCC to arm. Two days post discharge from surgery banged leg and wounds bled uncontrollably. Admitted and bleeding controlled. Wounds found to be infected. Infection managed – reviewed by tissue viability and surgical team. Became anaemic and transfused some blood. Patient initially responded well. Chose to remain in hospital as unable to cope at home. Discharge to community hospital planned. Patient suddenly became unwell. HAP – deteriorated and died within 2 days | |
Appendix 11 Examples of adverse events identified through the Harm2 tool at different points in the process of care
Origin of AE | Summative overview of episode of care |
---|---|
Pre admission | 58-year-old lady admitted with painful NOF. Mechanical fall 9/7 prior to admission had visited A&E following fall. O/E full ROM, no radiography. Diagnosed with soft tissue injury and discharged. On admission, radiograph showed undisplaced fracture NOF. Listed for fixation. Uneventful procedure and good recovery and mobilisation prior to discharge |
Patient attended A&E with increasing abdominal pain and vomiting over past 2 days at home. Visited by a GP the day before this admission and told he had gastroenteritis. O/A kept NBM and seen by surgeons ?appendicitis. Taken to theatre diagnosed with ruptured appendix. Patient recovery slow but found to have infection in tip of drain. Discharged home on oral antibiotics on the advice of microbiologist for 6 days | |
Early in admission | The patient was admitted with a 4-day history of RIF pain associated with nausea, pyrexia and anorexia. Clinically presented as appendicitis and underwent a laparoscopic appendectomy, good postoperative recovery and discharged home with DN follow-up. Histology shows non-specific reactive changes (normal). Subsequent admissions over 2 months with similar symptoms and diagnosed with ovarian cyst on CT scan. Followed up by obs and gynae |
Patient admitted due to generally unwell with less mobility and cough. Seen by GP and antibiotics prescribed. Oral antibiotics changed and continued on admission. Patient experienced witnessed collapse due to missed epilepsy medication not written on prescription chart | |
During procedure | Admitted for elective radical nephrectomy for renal tumour. In theatre following removal of an adhesion a 7-cm defect was noted between the diaphragm and pleura. Chest drain inserted and defect was closed. Postoperatively the patient experienced pain at the chest drain site, but this was managed well by the pain team. Discharged home postoperative day 6 with outpatient follow-up |
Admitted for elective thyroidectomy with cervical lymph node resection. Returned to theatre approximately 90 minutes after operation because of wound swelling and respiratory distress. Returned to theatre, wound re-opened, fresh clot and blood found, clot removed and bleeding controlled. Nursed on ITU post procedure then transferred to the ward. Discharged home with outpatient follow-up. Magnesium and calcium low post surgery – replacement therapy given. Found to have Clostridium difficile – commenced antibiotics. Discharged home with outpatient follow-up | |
Admitted as an elective admission and underwent elective laparoscopic cholecystectomy. No complications during procedure, discharged home. Readmitted 3 days later with abdominal pain and vomiting. CT showed haematomas in RUQ and pelvis also low HB. Returned to theatre for laparoscopy and washout 700-ml clot removal. Also treated with antibiotics and blood transfusion. Developed large blisters around drain site – scarring occurred. In the first operation notes – comment ‘a little oozy tranexamic acid given’ but nothing documented on anaesthetic record or treatment sheet to indicate administration. Developed an ileus and chest infection post second operation. Discharged with OPA follow-up | |
ITU/HDU | Lady admitted to ITU following collision with a golf buggy. Reason for admission listed as poly-trauma. Injuries included left 3–12 posterior rib fractures, left 3–7 anterior rib fractures, bilateral pelvic rami fractures and right sacral fracture, 3 × lumbar fracture right-sided spinal femoral fracture, haemodynamically unstable due to bleeding, required massive transfusion. Extubated but deteriorated despite NIV. CXR showed bilateral basal changes. Haemophilus in NBL. i.v. Tazocin given re-intubated and commenced on meropenem for Pseudomonas HAP. E. coli isolated sensitive to Tazocin. Tracheotomy and progress impaired by severe pain through admission |
Obese patient. Attended for elective gall bladder removal. Surgery uneventful, post op patient complaining of pain. PCA set up, anaesthetist attended × 2 as patient still in pain. On return to the ward patient difficult to rouse, naloxone given – diagnosis documented as ?too much opioid. Abdomen tender. Patient noted to not have passed urine for > 15 hours. Catheter passed, following day patient developed distended abdomen and cardiac SVT. Treated with cardiac drugs, USS showed paralytic ileus. Commenced treatment. Improvement noted from this point. Discharged home 4 days later | |
General ward | Patient admitted with acute abdominal pain. Had been discharged the day previously for the same condition. On first admission had a urine dipstick positive for infection and constipation, but was only treated for constipation. On second admission was then treated for urosepsis. The urine sample on first visit was not followed up. There was no bladder scan to check for residual urine present |
Admitted with confusion and recent falls ?UTI – treated with oral antibiotics. Fell in hospital fractured right inferior pubic rami – managed conservatively. CT head > NAD. Reviewed by old age team – diagnosed probable dementia. Patient fell again – no injury sustained – risk assessments completed. SW, physiotherapy and OT input – patient discharged home with private carers following best interests meeting with family input | |
Female admitted with abdominal pain constant pain associated with hot/chills and rigors and nausea. Likely appendicitis for exploratory lap following unremarkable USS. Planned surgery cancelled as too late to go to theatre. Pyrexial CRP high 340, WCC 23, neutrophils 20.2 for i.v. antibiotics pre surgery and laparoscopic appendectomy; finding acute appendicitis – gangrenous appendix removed. For i.v. antibiotics post operation. Nursing notes show that gentamicin delayed because of prescribing problem with SHO. Venflon tissued 13:30 and SHO not available to recannulate until 16:15, so antibiotics delayed. Venflon tissued again the following day 02:00, no antibiotics overnight as venflon not replaced. Continued temperature spikes and abdominal pain. US pelvis showed collection of fluid in pelvis post operation. Patient encouraged to mobilise and increased analgesia. Patient mobile and well. Apyrexial on discharge with antibiotics | |
Discharge | Admitted to acute elderly unit from nursing home care with increased confusion, agitation and dehydration. Attributed to high opiate dose. Patient also in acute renal failure therefore opiates withdrawn, fluids and U+Es monitored and patient discharged back to nursing home once normalising. Patient readmitted on the same day with some problems but extreme pain also because of withdrawal of her long-term opiate analgesia. Remained an inpatient for 46 more days prior to her death in November |
Lady admitted for a laparoscopic cholecystectomy. Discharged home following day. There is documentation in the clinical notes stating that wound felt hot on day of discharge. Unable to locate any documentation relating to wound prior to discharge home. Patient readmitted with wound infection which burst and patient taken to theatre for exploration of infected wound. Required i.v. antibiotics and packing of wound. Antibiotics changed to oral once patient was systemically well and discharged home | |
Admitted for refashioning of a stoma to reduce prolapse. Discharged home after 2 days but then readmitted the following day for a further refashioning. Stoma was not functioning prior to first discharge. Patient discharged back to nursing home once the stoma became patent. Note: ‘blackened areas’ to stoma upon readmission | |
Patient was admitted with a rugby injury – a fracture to left ankle. Patient underwent surgery – no complications (ORIF) – radiograph reviewed by team – satisfactory – patient discharged home with follow-up appointment and equipment. Radiograph re-reviewed by senior member who felt the postoperative result was unsatisfactory, the patient was recalled for a redo fixation. Second operation was successful | |
Appendix 12 Example of organisational report
Glossary
- Adverse event
- An injury to a patient related to medical management, in contrast to complications of disease. Medical management includes all aspects of care, including diagnosis and treatment, failure to diagnose or treat, and the systems and equipment used to deliver care. Adverse events may be preventable or non-preventable.
- Health board
- An NHS body responsible for planning, securing and delivering health-care services in its area.
- Improvement science
- An emerging concept that focuses on exploring how to undertake quality improvement well.
- Inpatient episode
- A complete episode of care irrespective of changes in consultant management.
- Inpatient management
- Care of patients through the secondary care health infrastructure.
- Intentional rounding
- A structured approach whereby nurses conduct checks on patients at set times to assess and manage their fundamental care needs. Concerns about poor standards of basic nursing care have refocused attention on the need to ensure that fundamental aspects of care are delivered reliably.
- Modified early warning score
- A systematic recording of vital signs with clear escalation criteria.
- Problem in care
- The classification of adverse events in the Harvard method.
- Quality improvement
- A systematic approach that uses specific techniques to improve quality within a health-care setting.
- Randomly selected deceased discharges
- A randomly selected sample of patients from the total pool of patients who died in any given hospital in a given month.
- Randomly selected discharges
- A randomly selected sample of patients from the total pool of all discharges in any given hospital in a given month.
- Risk-adjusted mortality index
- A death rate that takes patient risk factors into account.
- Skin bundle
- Skin bundle requires documented nursing intervention at least every 2 hours in the following areas to reduce the likelihood of damage: monitoring the patient’s Surface – ensuring the patient is on the right mattress or cushion; Keep moving – encouraging self-movement and repositioning; Incontinence – meeting the patient’s toileting or continence needs; and Nutrition – keeping the patient well hydrated and meeting their nutritional needs.
List of abbreviations
- AE
- adverse event
- AF
- atrial fibrillation
- CI
- confidence interval
- CRC
- Clinical Research Collaboration
- DNAR
- do not attempt resuscitation
- DVT
- deep-vein thrombosis
- GP
- general practitioner
- GTT
- Global Trigger Tool
- HAP
- hospital-acquired pneumonia
- IHI
- Institute for Healthcare Improvement
- ITU
- intensive therapy unit
- i.v.
- intravenous
- LOS
- length of stay
- MI
- myocardial infarction
- MRF2
- modular review form 2
- NCC MERP
- National Coordinating Council for Medication Error Reporting and Prevention
- NISCHR
- National Institute for Social Care and Health Research
- OR
- odds ratio
- PRISM
- PReventable Incidents Survival and Mortality
- PVD
- peripheral vascular disease
- RF1
- review form 1
- SD
- standard deviation
- UMR
- Universal Mortality Review
- WCC
- white cell count
- WG
- Welsh Government