Introduction

An electronic health record (EHR) system transition is a massive undertaking for any healthcare organization. Technological change of this magnitude disrupts operations, consumes vast resources, and poses tremendous risk to patient safety.1,2,3,4,5,6 Even without major roadblocks, experienced clinicians may take years to return to pre-transition efficiency.7 If EHR features fail to support end-user needs, costly redesigns and repeated updates can extend the transition process.7 Unfortunately, stories of EHR implementation missteps abound.8,9 EHR transitions have increased as more organizations retire “homegrown” legacy systems or upgrade commercially available products.5,10,11,12,13,14,15 Switching EHR systems may pose unique risks and implementation concerns, but standards for ensuring safety during this process have not been established.1,10,16

EHR usability (i.e., efficiency, effectiveness, and satisfaction)17 has a profound effect on clinician burnout and patient safety.1,18,19 User-centered design (UCD), a human factors approach to system design that includes iterative end-user testing, is the gold standard for optimizing safety and usability.20 Despite certification requirements enacted to improve usability, there is considerable variability in the quality and extent of usability evaluation conducted by EHR vendors.21 A systematic review of EHR implementations found that projects often failed to incorporate human factors methods that would inform implementation decisions regarding user interface and training adjustments.1

Medication safety is a central concern during transitions as EHR systems are known contributors to medication errors.19,22 One organization reported a fivefold increase in medication safety reports in the 3 months following transition from a legacy EHR to a commercial system.3 Medication administration is a high-occurrence, time-consuming nursing task with frequent errors.23 While barcode medication administration (BCMA) technology has reduced errors,24,25 it has not eliminated them.26,27 An analysis of sentinel events found the EHR’s medication administration record (MAR) used for BCMA specifically contributed to errors.22 In addition, BCMA workarounds, such as bypassing the medication or patient armband scans, are well documented.28,29,30 These adaptations, whether problematic or pragmatic in nature, occur when the workflows envisioned by system developers clash with actual nursing practice.31

The goal of this study was to identify potential BCMA problems with nurses’ transition from a legacy EHR system to a commercially available EHR product configured for local use. We employed simulation-based comparative usability testing of BCMA tasks to assess progress during the transition. We collected quantitative performance and qualitative perception data to (1) establish baseline performance data for both BCMA systems, (2) pinpoint potential risks for safety-critical tasks, (3) identify focus areas for superuser training and go-live support, and (4) offer evidence-based recommendations for configuration improvements.

Methods

Our simulation-based comparative evaluation included three rounds of data collection: (1) legacy system baseline performance (R1); (2) preliminary end-user performance in the new system prior to go-live (R2); and (3) follow-up evaluation 4 months post-implementation in the new system (R3). To ensure system stability across all three rounds and meaningful comparison of task performance, participants completed all tasks in a consistent training/testing version of the respective systems. Semi-structured interviews were conducted immediately after each session to capture nurses’ qualitative comments and identify themes related to system usability. This research was part of a larger study of EHR system usability approved by the organization’s Institutional Review Board in accordance with Human Research Protection Program guidelines.

Participants

We recruited a convenience sample of 15 registered nurses at the study site to ensure participants represented varying levels of nursing experience and different inpatient care areas. Table 1 provides detailed participant demographics. Among the 15 individual participants, 12 completed the legacy system evaluation (R1), 14 completed the new system evaluation pre-implementation (R2), and 9 completed the new system evaluation post-implementation (R3). Eleven of the nurses completed both R1 and R2, and seven completed all three rounds. All nine nurses in R3 had completed at least one of the prior evaluation rounds. All participants reported using the legacy BCMA system, and about half (n = 7) had used another BCMA system. Among those who completed the R2 pre-implementation evaluation, most (n = 11) had no prior experience with the new EHR.

Table 1 Participant Demographics

Procedure

Individual performance-based usability testing sessions were conducted according to industry best practices20,32 in the usability laboratory of the Center for Research and Innovation in Systems Safety (CRISS) on the VUMC campus approximately 4–6 months prior to go-live (R1 and R2) and 4 months after implementation (R3). Participants were scheduled for 2-h sessions for R1 and R3 and a 3-h session for R2. Because R2 was conducted before all staff had completed formal house-wide training, only R2 participants received 1 h of training at the start of their session. Standardized R2 training included four interactive modules from the EHR vendor that demonstrated how to (1) navigate the workspace; (2) administer medications; (3) document scheduled, overdue, missed, and held medications; and (4) document IV lines/fluids and medication drips. Participants then completed self-guided hands-on practice with patient armband and medication barcodes. Because all participants were experienced legacy system users and had completed formal training in the new system by R3, no additional training was provided for those sessions.

A research nurse with usability testing experience facilitated all study sessions. Participants interacted with both systems on the same desktop computer in the laboratory with a comparable handheld barcode scanner. Test patients were created in the training environment (R1 and R2) or test environment (R3) of each system to match task scenarios. Patient armband and medication barcodes were provided on laminated cards standardized across all sessions. Morae® usability testing software digitally recorded the participant’s interactions, including all mouse movement and clicks, typing, and screens displayed during the tasks. A web-camera recorded each participant’s facial expressions and a microphone recorded audible system alerts (e.g., scanner beeps) and verbal comments.

We created realistic scenarios for common and problem-prone administration tasks based on the medication orders and barcodes used in nursing orientation legacy training (Appendix 1). We used the same standardized tasks with both systems. Differences in system capabilities required a few minor modifications to task details in the new system. Presentation order for the scenarios was varied to reduce potential order effects on performance. The tasks tested were as follows: (1) Hold/Administer (hold two and administer three medications while addressing alerts); (2) IV Fluids (switch existing fluids to a new order at a higher rate); (3) PRN Pain (administer medication and document pain assessment/score); (4) Insulin (administer complex insulin doses); (5) Downtime/PRN (document previously administered medications and administer PRN); and (6) Message (send a message to pharmacy to adjust insulin schedules). We incorporated more challenging tasks into the scenarios based on existing training foci for known administration difficulties as well as feedback from the implementation team. For example, the Hold/Administer task required the nurse to adjust the administered dose in response to a partial package dose alert, and the Insulin task required interpretation of a complex sliding scale based on a blood glucose value.
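To give a concrete sense of the interpretation the Insulin task demanded, the sketch below encodes a sliding-scale lookup as a range table. This is a minimal illustration: the glucose thresholds and doses are hypothetical and do not reproduce the study’s actual medication orders.

```python
# Hypothetical sliding-scale insulin (SSI) order, illustrating the range
# interpretation the Insulin task required. Thresholds and doses are
# invented for illustration, not taken from the study scenarios.
SLIDING_SCALE = [
    (150, 199, 2),  # glucose 150-199 mg/dL -> 2 units
    (200, 249, 4),  # glucose 200-249 mg/dL -> 4 units
    (250, 299, 6),  # glucose 250-299 mg/dL -> 6 units
]

def sliding_scale_dose(glucose_mg_dl: int) -> int:
    """Return insulin units for a blood glucose value; 0 if below the scale."""
    for low, high, units in SLIDING_SCALE:
        if low <= glucose_mg_dl <= high:
            return units
    if glucose_mg_dl > SLIDING_SCALE[-1][1]:
        raise ValueError("Glucose above scale: contact provider")
    return 0

print(sliding_scale_dose(215))  # -> 4 units
```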

After all tasks, participants completed the System Usability Scale (SUS), a validated instrument to measure perceived usability and satisfaction.33,34 Scores range from 0 (worst) to 100 (best). At the end of each session, we conducted brief semi-structured interviews to explore participants’ perceptions and experience using the BCMA systems beyond the specific tasks evaluated. Questions elicited positive and negative aspects of BCMA system use, usability issues encountered in practice, and perceptions of how the transition process affected workflow.
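For readers unfamiliar with the instrument, SUS is scored from ten alternating positively and negatively worded 5-point items. The sketch below implements the standard Brooke scoring procedure with a made-up response set, not a participant’s data.

```python
def sus_score(responses):
    """Score a System Usability Scale questionnaire (standard Brooke scoring).

    responses: ten item ratings from 1 (strongly disagree) to 5 (strongly
    agree). Odd-numbered items contribute (rating - 1); even-numbered items
    contribute (5 - rating). The 0-40 sum is scaled by 2.5 to give 0-100.
    """
    assert len(responses) == 10
    return 2.5 * sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # 0-based index: even = odd item
        for i, r in enumerate(responses)
    )

# Made-up response set for illustration only.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # -> 75.0
```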

We analyzed key performance metrics for each session recording: task completion rates, safety-related errors (i.e., any error with the potential to impact patient safety if it occurred in real-world use), other use errors (i.e., an interaction difficulty or error not expected to impact patient safety), and task completion times. Task completion success criteria were based on critical task actions and the “six rights” of medication administration (right patient, medication, dose, route, time, and documentation). A task was categorized as a failure if the user finalized the task with the wrong information entered for one or more of these criteria, abandoned the task prior to completion, or required facilitator assistance to complete the task. Omissions of secondary task details (e.g., failing to scan a barcode) were counted as errors, not failures. Only times for successfully completed tasks were included in the task time analysis. To establish an additional benchmark and facilitate comparisons given minor system differences, we used keystroke-level modeling35 to estimate the time it would take an expert to complete each task (details provided in Appendix 2).
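As background on this benchmark, keystroke-level modeling estimates expert task time by summing published operator times over the sequence of physical and mental actions a task requires. The sketch below uses the commonly cited operator estimates from Card, Moran, and Newell; the operator sequence shown is illustrative and is not one of the study’s task models from Appendix 2.

```python
# Standard KLM operator time estimates in seconds (Card, Moran & Newell).
OPERATORS = {
    "K": 0.28,  # keystroke (average skilled typist)
    "P": 1.10,  # point with mouse to a target
    "B": 0.10,  # mouse button press or release
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def klm_estimate(sequence):
    """Sum operator times for an expert action sequence, e.g. 'MPBB'."""
    return sum(OPERATORS[op] for op in sequence)

# Illustrative sequence (not from the study): mentally prepare, point and
# click, home to keyboard, type a 3-character dose, point and click to confirm.
print(round(klm_estimate("MPBB" + "H" + "KKK" + "MPBB"), 2))  # -> 6.54 s
```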

Results

Task Performance

Table 2 provides detailed performance data by task and by round: task success and failure rates, safety-related errors, use errors, and task completion times. Figure 1 displays success rates by task and round. In most cases, nurses performed better on the post-implementation tasks compared to pre-implementation and legacy performance. Except for Downtime/PRN and Message tasks, success rates were maintained or improved post-implementation compared to baseline. Failure rates for the Downtime/PRN task markedly increased in R3, with successful completions dropping to a third of their R2 level. To confirm that participant dropout did not alter our conclusions, we performed a secondary analysis to compare the success rates for the seven nurses who completed all three rounds to the full sample and found performance was similar between groups (see Appendix 3).
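In principle, this dropout check amounts to recomputing per-task success rates restricted to the completer subgroup and comparing them to the full sample. A minimal sketch follows, using fabricated example records rather than the study data; the participant identifiers are hypothetical.

```python
from collections import defaultdict

# Fabricated records: (participant_id, round, task, success) -- not study data.
records = [
    ("n01", "R3", "Insulin", True),
    ("n02", "R3", "Insulin", False),
    ("n01", "R3", "IV Fluids", True),
    ("n03", "R3", "IV Fluids", True),
]
completers = {"n01", "n03"}  # hypothetical nurses who completed all rounds

def success_rates(rows):
    """Return the proportion of successful attempts per task."""
    totals = defaultdict(lambda: [0, 0])  # task -> [successes, attempts]
    for _, _, task, success in rows:
        totals[task][0] += success
        totals[task][1] += 1
    return {t: s / n for t, (s, n) in totals.items()}

print(success_rates(records))                                     # full sample
print(success_rates([r for r in records if r[0] in completers]))  # completers
```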

Table 2 Task Performance
Figure 1 BCMA task success rates.

The IV Fluids and Insulin tasks had the highest combined average error rates across rounds. Overall error rates generally decreased in R3; specifically, the seven nurses who completed all three rounds demonstrated a 50% reduction in safety-related errors during the post-implementation tasks compared to pre-implementation and a 53% reduction compared to legacy system performance. Table 3 summarizes the reasons participants failed to successfully complete tasks and the types of safety-related errors identified. Errors spanned all “six rights” for administration except route. Documentation errors (e.g., duplicate documentation of downtime administrations) and process-related errors (e.g., missed scans) occurred most frequently. Notably, a dosing error related to the display of sliding scale insulin (SSI) instructions occurred at least once in every round. Legacy system limitations prevented line breaks in the free-text administration instructions field, forcing critical information like SSI guidance to be displayed in a single line of text. As a result, multiple nurses misinterpreted the series of colons, semicolons, and equal signs used to separate the blood glucose ranges and corresponding insulin units.
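As a simple illustration of how such flat instructions could be restructured for display, the sketch below splits a single-line SSI string on its delimiters and renders one glucose range per line. The string format is assumed for illustration and is not the legacy system’s actual field content or the vendor’s remediation.

```python
# Hypothetical single-line SSI instruction, mimicking the legacy free-text
# field that could not contain line breaks (format assumed for illustration).
flat = "SSI: 150-199=2 units; 200-249=4 units; 250-299=6 units"

# Split the delimiter-separated ranges and render one per line, approximating
# the multi-line display the legacy field could not provide.
_, _, body = flat.partition(":")
for entry in body.split(";"):
    rng, _, dose = entry.strip().partition("=")
    print(f"Glucose {rng.strip()} mg/dL -> {dose.strip()}")
```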

Table 3 Task Failure Reasons and Types of Safety-Related Errors

Inherent system differences limited comparability of task times in two tasks: PRN Pain (i.e., legacy system included full pain assessment documentation within the MAR; new system only included the pain score field) and IV Fluids (new system integrated infusion start/stop time documentation within the MAR and orders prefilled the flow rate; legacy system lacked both capabilities). For the remaining tasks, average completion times increased in R2 when nurses encountered the new system but decreased in R3 to levels at or below R1. Notably, participants who successfully completed tasks did so faster in every task post-implementation compared to legacy system performance. Among the seven nurses who completed all three rounds, average task times decreased by approximately 38% compared to pre-implementation performance and 22% compared to legacy performance.

Satisfaction Scores

Participants’ perceived ease of use and satisfaction with the systems were indicated by mean SUS scores of 69.6 (SD ± 17.4, range 27.5–87.5) in R1, 53.9 (SD ± 17.9, range 27.5–90) in R2, and 63.6 (SD ± 19.6, range 30–85) in R3. Figure 2 displays the SUS score distribution across rounds, highlighting the change in scores for nurses who completed both R1 and R2, or R2 and R3. While about half of the 11 nurses who completed both R1 and R2 rated the two systems comparably, the decline in R2 SUS scores was largely due to a subset of nurses (n = 5) who rated the legacy system relatively high but the new system substantially lower pre-go-live.

Figure 2 SUS scores across evaluation rounds. The figure displays the SUS score distribution for all available data at each time point, with the left panel comparing legacy system scores to the new system pre-go-live and the right panel comparing new system scores pre- and post-go-live. Closed points denote participants for whom paired data are available, and the black connecting lines denote each individual’s change in SUS score. The boxes along the vertical lines denote the median (bold lines) and interquartile ranges (outer box). Wilcoxon signed-rank tests comparing the paired data give p = 0.068 for the left panel and p = 0.399 for the right panel.
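The paired comparisons reported in the caption correspond to Wilcoxon signed-rank tests on each nurse’s two SUS scores. A minimal sketch follows, using illustrative score pairs rather than the study data.

```python
from scipy.stats import wilcoxon

# Illustrative paired SUS scores (not the study data): one entry per nurse
# who completed both rounds being compared.
r1_scores = [72.5, 85.0, 67.5, 80.0, 55.0, 77.5, 70.0]
r2_scores = [60.0, 52.5, 65.0, 47.5, 57.5, 50.0, 72.5]

# Two-sided Wilcoxon signed-rank test on the paired differences.
stat, p_value = wilcoxon(r1_scores, r2_scores)
print(f"W = {stat}, p = {p_value:.3f}")
```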

Interviews

Overall, most of the participants reported being excited about the new system, albeit nervous about the learning curve and how the change would impact workflows. Nurses consistently identified two factors that increased their confidence in a successful transition to the new system: (1) it is a proven system used nationally by many organizations, and (2) word-of-mouth reports from other nurses with prior experience in the new system were positive.

Thematic analysis of the qualitative data uncovered three major usability themes expressed by nurses: (1) increased documentation burden; (2) excessive alerts/prompts; and (3) inefficient flowsheet design. Participants also identified several areas of dissatisfaction with BCMA processes in the new system that fell outside the scope of this evaluation (e.g., heparin co-sign processes, medication dispensing machine display issues, problematic blood administration documentation workflows). A detailed list of participants’ comments is provided in Appendix 4.

Informing Stakeholders

Following R2 and R3, stakeholders (nurse executives, informatics leaders) received a detailed review of the findings, along with training emphases and targeted interface enhancement suggestions to address some of these priority areas. Table 4 shows examples of the actionable recommendations, informed by our quantitative and qualitative findings, provided to transition leadership. Because modifications to a user interface within a complex system can result in unintended consequences, we recommended additional post-implementation user testing to validate the effectiveness of the recommendations in a real-world use setting.

Table 4 Example Interface Design, Workflow, and Training Recommendations

The recommendations attempted to account for available local configuration options and limitations in our ability to modify the vendor’s system. They were not intended to disrupt the transition process, but rather to minimize short-term risk through improved awareness and training and to suggest interface improvement opportunities during optimization. In addition, we highlighted areas where nurses struggled or expressed confusion during the evaluation, as these represent areas where extra reinforcement during training and early implementation may be helpful. Such reinforcement need not alter existing training curricula; it could be disseminated effectively in supplemental forms (e.g., training tips distributed to superusers or additional practice scenarios for self-study in the system “playground”).

Discussion

Findings suggest that in less than 6 months post-go-live, these nurses had adapted to the new BCMA system and improved in efficiency and effectiveness on the specific tasks evaluated. This study confirmed known legacy system problems (IV fluids, SSI), identified new system problems (downtime, messaging), and provided quantified performance data (error rates, time) against which to benchmark future system and workflow changes. Task time data also helped set expectations for the workload increases of the initial go-live learning curve.

The study team made formal recommendations for system, workflow, and training changes; however, measuring the extent to which they were adopted and their impact on real-world performance was outside the scope of this study. Go-live is just one point in time for a dynamic EHR ecosystem that is continuously evolving. This usability study not only provided transition insights into the new system but also revealed where the legacy system was not performing as expected. While it is important to prepare for transitions, it is prudent to use the insights gained and the study benchmarks to monitor and tune systems affected by modifications, updates, and user experiences going forward. Post-implementation medication error reporting studies3 are needed to help connect usability problems to medication errors.

Limitations

This study has limitations. It was conducted at a single academic medical center with extensive experience in clinical informatics and usability evaluations.36,37,38,39 We were only able to test a subset of BCMA tasks performed by nurses. Future research could expand the number of user groups, study sites, and range of tasks evaluated.

Implications/Future Directions

Application of usability testing methods to the EHR transition process can yield important evidence-based insights to inform end-user safety during this period of immense change. EHR vendors should incorporate usability evaluation of high-risk tasks into their customer implementation roadmaps. Adding formal usability requirements to EHR Requests for Proposals (RFPs) could improve vendors’ current suboptimal approach to usability21 and shift industry expectations from a focus on user acceptance testing and whether the system is “working as designed by vendor” to high-quality user-centered design and “exceeding client requirements.” This study also identified serious usability issues in the legacy EHR while collecting baseline data; even after substantial experience using a system, design flaws in a well-established EHR can continue to create significant potential for errors and inefficiencies. Incorporating multiple human factors methods, including usability testing, heuristic analysis, and risk analysis, into implementation protocols for regular system updates may be necessary to ensure safe use beyond the transition period. Healthcare organizations should freely share their evaluation findings, including all safety and usability issues uncovered, whether addressed through local configuration changes or requiring vendor intervention, to help improve the collective usability of commercial EHR products and establish evaluation standards that benefit all users.