Augmented Terminal Feedback Influences Cognitive Ability of Surgical Trainees

Introduction: Formative assessments in the form of Global assessment (GAS) and procedural based assessment (PBA) are the current methods used for feedback in British laparoscopic surgical training. Video error signature feedback (VESF) has been proposed as an alternative approach to enhance motor skills in laparoscopic training by influencing the trainee's cognitive approach. Methods: Twenty laparoscopic novices were randomized into Current standard feedback (CSF) and VESF groups. Both groups tied laparoscopic double square knots in four sequential stages. The standard human reliability assessment (HRA) method was used to assess unedited video recordings for errors. Expert trainers assessed proficiency gain with a validated scoring system. The assessment was the same for both groups. Unedited video recordings of the VESF group were annotated for errors at each stage and provided as feedback through a video hosting website. The CSF group received an assessment sheet as their feedback, comparable to current practice. Error numbers, time execution and proficiency gain were the outcomes. Inter-rater reliability among trainers for error detection was established. Results: A total of 6490 movements were studied, with 1613 errors detected. The VESF group committed significantly fewer errors than the CSF group [602/1613 (37%) vs 1011/1613 (63%), p<0.01]. The VESF group gained proficiency earlier. Time execution was similar. Inter-rater reliability for error detection was high (0.96). Discussion: VESF affects the cognitive framework of a laparoscopic task in the trainee’s mind, ultimately reducing errors. This work demonstrated the practical application of video error signature feedback by applying it to a simple laparoscopic task and analysing how novices learned that task.


Introduction
A constructive learning approach is fundamental to developing the capability and self-confidence that surgical trainees need to perform surgical tasks in the operating theatre. Improvement in this motor-dependent skill relies on the feedback provided by the trainer through an ideal assessment process. It takes a trainee many years of training to master a surgical skill through this systematic method. In the current surgical training environment, skills are learned under the supervision of a preceptor, and 'formative assessment' of technical competence is considered sufficient for skill acquisition. Interpreting 'assessment' as a 'feedback process' has been the practice throughout the modern era of surgical education. This traditional approach of 'assessment based learning' is favoured because it allows candidates to be accredited and also helps a training programme demonstrate institutional success. In contrast to this practice, cognitive psychologists have recommended 'feedback' as a fundamental approach to motor skill acquisition [1]. Laparoscopic surgery exposed this disparity in current practice, as it provided a platform for more objective skill learning through video recordings of the task.

Error Science
The study of 'error science' offered a way to interpret assessments and convert them into feedback. Instead of concentrating on levels of competence, if the task is broken down into stages and a generic map of potential errors is drawn, then 'quantifying errors' provides an objective way of checking whether a trainee has improved (by enacting fewer errors). The concept of learning from errors is well established in surgical training [2]. Efforts to understand errors in surgical tasks have led to the incorporation of 'error assessment tools' into the safety principles of laparoscopy. When these efforts were brought together on the basis of error science, 'technical errors' were the category highlighted in the resulting assessment scales. Ergonomic principles of Human Reliability Assessment (HRA) have been applied in the past to analyse video recordings of surgical procedures and have proven to be a valid concept for systematic 'error identification and assessment' [3][4][5].

Learning Process
Assessment and feedback are two different but interrelated processes. Each serves the educational demands of skill training in its own way, and neither can replace the other. In the literature, feedback refers to specific information that trainees receive about their performance and that is intended to improve future performance. Feedback is a foundation of effective training for motor skill acquisition and is considered one of the most important variables, aside from practice [6]. There are two general types of feedback: intrinsic and extrinsic (augmented). Intrinsic feedback is the physical feel of the movement as it is being performed. It is what performers feel as they execute a skill or performance. This feel is sensory information that comes from sources outside the body (exteroception) or from inside the body (proprioception) [7]. Artificial feedback that supplements the intrinsic feedback is called extrinsic or augmented feedback [8]. It is movement-related information about a task, such as the direction of the error that a movement has produced. This information is generated as a result of the movement; hence it is not available before the movement is executed. There are two types of augmented feedback: concurrent feedback, which is provided to a performer during task performance, and terminal feedback, which is provided after completion of the task.

Translating Feedback into learning
The potential of terminal feedback as a learning tool in simulation-based surgical training is significant, and it results in better learning than concurrent feedback. The drawback of terminal feedback in clinical settings is that errors cannot be allowed to progress, for reasons of patient safety [9]. Mentally translating feedback into performance is a trainee-dependent practice. Two types of augmented feedback have been used in the context of motor learning: knowledge of results and knowledge of performance. Knowledge of results (KR) is defined as post-response augmented information about the success of the performance. Knowledge of performance (KP) is defined as extrinsic, post-response kinetic information referring to aspects of the movement pattern [10]. Motor skill acquisition is achieved in three stages of practice: cognitive, associative and autonomous. Developing a skill to the autonomous stage requires a constant process of formative assessment and feedback [11].

Video Feedback
Video feedback of a performer's movements is a common method in sports and rehabilitation sciences. Performers can observe their overall movement pattern to gain an enhanced perspective of the spatio-temporal aspects of the action and of the coordination pattern. One critical aspect of video feedback is directing the performer's attention to the specific aspects of the movement that require modification or correction, which improves focus. Failure to direct a performer's attention to specific points in the video loses the beneficial effects of this method and can make KP ineffective for learning [12].

Video Error Signature Feedback
This study aimed to establish a new method of creating augmented terminal feedback [9] after laparoscopic task assessment. It involved the introduction of error signatures on an unedited video recording to create 'video error signature feedback' (VESF). It incorporated the methodology of human reliability assessment (HRA) to study the impact of errors on laparoscopic task performance, compared with the current gold standard, through a randomised controlled trial. The concept is intended for laparoscopic task assessment and augmented terminal feedback provision at 'Trainer to Trainee level'. This study also hypothesised that VESF is a reliable augmented terminal feedback system for assessing enacted errors in a surgical task when compared with current standards.

Study Participants
Twenty medical students and junior doctors who were laparoscopic novices were invited to take part in this study. Any previous laparoscopic suturing training was an exclusion criterion. Candidates were randomised with the online 'Research Randomizer' software (maintained by the Social Psychology Network of Wesleyan University, Middletown, Connecticut, USA) into two groups, i.e., Video error signature feedback (VESF) and Current standard feedback (CSF), named after the feedback they received. The assessment process was the same for both groups.
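The allocation step can be illustrated with a short sketch. This is not the Research Randomizer procedure used in the study; it is a minimal illustration, assuming simple shuffling into two equal groups, and the function name, participant identifiers and seed are hypothetical.

```python
import random


def randomise_two_groups(participant_ids, seed=None):
    """Shuffle participants and split them into two equal-sized groups.

    Illustrative only: the study used an online randomisation service,
    not this function.
    """
    ids = list(participant_ids)
    if len(ids) % 2 != 0:
        raise ValueError("an even number of participants is needed for equal groups")
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"VESF": sorted(ids[:half]), "CSF": sorted(ids[half:])}


# Hypothetical identifiers 1..20 for the twenty candidates.
groups = randomise_two_groups(range(1, 21), seed=2024)
for name, members in groups.items():
    print(name, members)
```

Fixing the seed is a design choice that lets an allocation be audited later; an unseeded call gives a fresh allocation each time.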

Selection of task
The laparoscopic double square knot was chosen as the task for this study (Figure 1). It is a commonly used specialist laparoscopic task and was chosen because of:
- Measurability when divided into subtasks and further subdivided into steps.
- Reproducibility.
- The possibility of calculating errors at task and subtask level, and hence of studying proficiency gain and time execution.
Participants tied the laparoscopic double square knot in an endo-trainer box on a neoprene sheet with Vicryl.
Each candidate tied the knot four times in each stage. The last task of each stage was treated as the final task and recorded for analysis. Stages were one week apart. Candidates received feedback within twenty-four hours of completing a stage and were advised to study the feedback to create a 'mental understanding' of the task and subtasks. The same error assessment was produced for each candidate and studied by a team of assessors who were blinded to the study groups. The only difference between the groups was the kind of feedback they received and the medium through which it was provided. It was established that a novice candidate can commit sixteen different errors during the double square knot task. These errors were standardised to keep inter-rater reliability high.

Setup
Induction setup: Each candidate was allowed 30 minutes to perform two basic tasks in the endotrainer box (Figure 2). A digital camera was connected to a laptop computer. The monitor was placed in front of the endotrainer to mimic the two-dimensional (2-D) laparoscopic working environment (Figure 3). The digital camera was controlled through Kinovea software. The first induction task involved moving small plastic cylinders between two small containers using both the right- and left-handed instruments, to teach depth perception. The second induction task involved bridging three rubber bands between three needles using both instruments, to develop bimanual dexterity.
Study setup: The laparoscopic setup was prepared and kept ready for task performance after induction (Figure 4). Right and left macro needle holders (Karl Storz GmbH & Co., Tuttlingen, Germany) were hand-specific because of a thumb-dependent jaw opening mechanism. The needle holders gained access through appropriate 5 mm peripheral ports. Training in the use of these needle holders was provided during the induction. Polyglactin 910 (Vicryl) is a synthetic absorbable suture that is widely used in laparoscopic surgery. Its tensile strength and memory also make it a suture of choice for laparoscopic knot formation in laboratory settings. In the study setting, the suture was passed through the suturing base (the neoprene sheet) and the needle was removed before the start of the task (Figures 5 and 6).

Outcome Measures
- The number of errors in the task was the primary outcome.
- Errors in subtasks (n=4), proficiency stage and time execution during task performance were the secondary outcomes.

Blinding
Candidates from both groups were blinded to their group allocation until completion of stage 1 to minimise selection bias. Candidates were not known to each other and were never called in to perform the task at the same time. Candidates were also blinded to the details of the assessment process to reduce attention bias (they were never made aware of their level of competence). Assessors 1 and 2 were blinded to group allocation throughout the study. Assessor 3 was blinded to the candidates and their allocation to groups. Assessor 4 was blinded to the assessment and feedback process.

Assessment Process
The laparoscopic double square knot task was studied in detail and divided into four subtasks and twenty-six individual steps. These subtasks were integrated with the Juster scale [13] (Table 1) to develop the proposed assessment method, the Error assessment sheet (EAS) (Table 1), which is similar to the generic task zones of Global assessment (GAS) in the Laparoscopic colorectal training program (current gold standard) [13].
Each EAS was marked with the participant's randomisation number. All assessors were experienced laparoscopic trainers with more than 20 years' experience in teaching laparoscopic technical skills. Assessor 1 (A1) and Assessor 2 (A2) supervised participants during task performance. They marked each candidate's dependence on the trainer (as per the Juster scale), recorded the video of the task and submitted it to Assessor 3 (A3) for 'error analysis' and the development of specific feedback. A3 never interacted with candidates. A3 studied the videos, noted errors at subtask and step level on the EAS and submitted it to Assessor 4 (A4) for data analysis. After submission of the EAS to A4, A1 was asked to mark all videos

Feedback Process
A3 created the specific feedback, i.e., CSF and VESF. For the CSF group, the same EAS was used. Only 'feedback comments' were added at task and subtask levels, with tips to improve performance. There was no mention of specific mistakes. Candidates were instructed to concentrate on the tips to improve their skill. This was done to analyse improvements in task performance led by 'pure feedback'. Candidates were left to guess their level of competence. For the VESF group, introducing signatures on the task video offered the potential to engage both the auditory and visual senses. 'Error markers' were introduced in different shapes, colours, sizes and angles to identify, highlight and explain the enacted error on the 'running' video (Figure 7). VESF was created in two stages. The first stage involved introducing error markers on the unedited video with video analysis software, i.e., Kinovea Motion Tuner (version 0.8.15) by Rickard Anderson, USA. In the second stage, voice tags were added with video editing software, i.e., AVS Video Editor by Online Media Technologies Ltd, London, United Kingdom.

Feedback Provision
A3 uploaded the VESF to a video sharing website, YouTube (Google, San Bruno, California, USA). A video management account was created on the YouTube site. After each video was uploaded, the 'unlisted' setting was selected, so that only the recipient of the video link could view the video. The link to the video was sent to each candidate in the VESF group at a previously registered email address. CSF was provided as an email attachment to each candidate in the CSF group. Both types of feedback were provided within 24 hours of task performance. Candidates were instructed to review their feedback multiple times before attending the next stage of task performance one week later.

Statistical Analysis
The Statistical Package for the Social Sciences software (version 17.0.0, SPSS, Chicago, IL, USA) and Excel (Microsoft Excel, Microsoft Corporation, Redmond, Washington, USA) were used. For construct validity, a comparative test (Wilcoxon signed-rank test) was used for the error analysis and the results were presented as box plots. Cronbach's alpha determined a model of internal consistency (based on the average inter-item correlation), and intraclass correlation coefficients were used to compute inter-rater reliability estimates of task errors between the two trainers (A1 and A2). Based on a previous similar study [13], the power calculation suggested that 20 candidates should enable the detection of a 20% difference in error numbers between the two groups with 80% power at p<0.05.
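The internal consistency measure named above can be sketched in a few lines. This is an illustration of the standard Cronbach's alpha formula, alpha = k/(k-1) × (1 − sum of item variances / variance of the totals), treating each rater as an item. It is not the SPSS procedure used in the study, and the error counts below are invented for illustration and are not study data.

```python
from statistics import variance


def cronbach_alpha(ratings_by_rater):
    """Cronbach's alpha with each rater treated as one 'item'.

    ratings_by_rater: list of equal-length lists, one list per rater,
    where position i in every list refers to the same recorded task.
    """
    k = len(ratings_by_rater)
    if k < 2:
        raise ValueError("at least two raters are required")
    n = len(ratings_by_rater[0])
    if any(len(r) != n for r in ratings_by_rater):
        raise ValueError("every rater must score the same tasks")
    item_variances = [variance(r) for r in ratings_by_rater]
    totals = [sum(scores) for scores in zip(*ratings_by_rater)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("alpha is undefined when the totals do not vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical error counts from two raters on six videos (not study data).
rater_a1 = [12, 9, 7, 5, 10, 8]
rater_a2 = [11, 9, 8, 5, 10, 7]
alpha = cronbach_alpha([rater_a1, rater_a2])
print(round(alpha, 3))
```

Values close to 1 indicate that the raters' counts rise and fall together across videos; the intraclass correlation reported in the study is a related but distinct statistic and is not computed here.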

Results
In this RCT, 6490 individual movements were studied. A total of 1613 separate errors were recorded during the observation of 80 tasks, 320 subtasks and 2080 steps across all stages. In stage one, the groups were comparable in the number of enacted errors. A highly significant difference was noted in the later stages. Overall, 1011/1613 (63%) errors were noted in the CSF group, whereas the VESF group committed 602/1613 errors (37%) (Figure 8).
Proficiency stage was based on the Juster scale assessment per task. A score of 5 per subtask and 20 per task was considered appropriate for 'proficiency gain' status (Figure 9). There was no significant difference in time execution between the groups before or after stage two.
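The reported counts can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the text (20 candidates, 4 stages, 4 subtasks and 26 steps per task, and the error totals per group). The proficiency helper is an assumption about how the stated threshold would be applied (every subtask scoring at least 5 and the task total at least 20), and its name is hypothetical.

```python
CANDIDATES = 20
STAGES = 4               # one recorded task per candidate per stage
SUBTASKS_PER_TASK = 4
STEPS_PER_TASK = 26

tasks = CANDIDATES * STAGES             # 20 * 4
subtasks = tasks * SUBTASKS_PER_TASK    # 80 * 4
steps = tasks * STEPS_PER_TASK          # 80 * 26

errors = {"CSF": 1011, "VESF": 602}
total_errors = sum(errors.values())
# Share of all recorded errors per group, rounded to whole percent.
shares = {group: round(100 * n / total_errors) for group, n in errors.items()}


def has_gained_proficiency(subtask_scores, per_subtask=5, per_task=20):
    """Assumed rule: each subtask meets its threshold and so does the total."""
    return (all(score >= per_subtask for score in subtask_scores)
            and sum(subtask_scores) >= per_task)


print(tasks, subtasks, steps)   # 80 320 2080
print(total_errors, shares)     # 1613 {'CSF': 63, 'VESF': 37}
```

The products reproduce the 80 tasks, 320 subtasks and 2080 steps reported, and the two group totals sum to the 1613 errors recorded.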

Discussion
Current progress in the literature on error analysis and its impact on technical safety during laparoscopic procedures is serious, intellectually coherent and occasionally inspiring. While it has a departure point (patient safety) and a destination (assessment and feedback tools), the route is somewhat unclear; however, there is a constructive effort in the literature to establish the strengths of different methods of assessment and feedback for the technical and cognitive skills of laparoscopic trainees. Culturally, assessment serves a trainee's need to understand their level of achievement, whereas feedback serves the need to understand errors in the task. To master a laparoscopic skill, a surgical trainee relies on many years of hard work. Shortened surgical training time and ever-changing technology prolong the time needed to achieve proficiency in speciality-specific laparoscopic procedures. Psychologists consider error identification vital to motor skill development. Based on the results of this trial, it is not unreasonable to suggest that VESF helped to reduce a candidate's unconscious tendency to cause errors by enhancing the development of a cognitive framework of the laparoscopic task. This also suggests that if a trainer concentrates on the cognitive understanding of a surgical trainee, along with highlighting 'tips' for improvement rather than mistakes, performance improves faster. Current training methods focus mainly on assessment rather than on a feedback system, contrary to the literature, which favours a strong augmented feedback system (concurrent or terminal) for its impact on motor skill performance. VESF provided an opportunity for augmented terminal feedback, and its impact was instrumental in reducing errors when a group of novice candidates performed a complex laparoscopic task.
In current gold standard practice, the proficiency stage (Juster scale) of a task is measured as an objective method of assessment, but doubts about its subjectivity are far from resolved, and they can seriously challenge the validity of any study that uses this method of assessment. In operating theatre settings, it might be possible to record the voice or actions of a trainer to establish the inter-rater reliability of the proficiency stage and so overcome this issue of subjectivity. Although proficiency gain in a particular surgical task is the desired destination for any trainee, attention to 'error rate' is required to overcome specific difficult stages of any operation. This study strongly supported the impact of augmented terminal feedback on error rate. It translated HRA into in vitro laparoscopic settings for error analysis. Strict error assessment with a feedback system had a strong impact on skill-based, rule-based and knowledge-based errors. Each step in laparoscopic double knot tying is related to the next step. An error in performing one step of a subtask may result not only in failure to complete that subtask but also in a hidden (latent) error. Both groups were assessed in the same way, by viewing their performance video and noting their errors and time on the EAS (proficiency stage was noted during the performance). Because of this strict methodology, the only difference between the groups was the type of feedback and the method of providing it. The EAS was made as similar to the current gold standard (GAS) as possible. There was also notable improvement in the CSF group. It was difficult to ascertain whether this improvement resulted from the assessment process or from the reflection of memory (knowledge of results) [1], which also plays a role as feedback.
A conscious effort to establish different aspects of validity showed VESF to be a strong method of assessment, and the inter-rater reliability (0.96) overcame the issue of subjectivity in error detection. The use of video hosting websites for feedback provision has not previously been reported in the literature. YouTube® as a method of video feedback provision is feasible and highly recommended because of its safety, privacy, ease of use and availability. This study provided a unique opportunity to establish the compatibility of already available resources with surgical training. An assessing authority could create a dedicated YouTube channel to assess procedural videos and create video feedback. Trainees could keep track of their videos, learn from past errors and judge their own improvement (objective learning).

Limitations
Assessing an unedited video of a task is a time-consuming exercise, and assessing the trainee's dependence on the trainer from the video is virtually impossible. The average time taken to create a VESF (38 min) was greater than the time taken to create a CSF (15 min). This was due to the lack of any software that could analyse the video, insert error markers and add audio tags with ease at the same time. Two different software packages were used consecutively to overcome this problem, which could be resolved by developing purpose-built software for VESF. In addition, assigning a level of competence to a trainee from an unedited video is a subjective exercise for a blinded assessor; hence a different set of assessors was used, which is not practical in the operating theatre. This introduces a subjective element into surgical training, which could be reduced if different trainers provided training for a similar procedure over a period of time.

Conclusions
This study verified the practical application of augmented terminal feedback by applying it to a simple laparoscopic task and analysing how novices learned that task. VESF affects the cognitive framework of a laparoscopic task in the trainee's mind, ultimately reducing errors. There is a need for purpose-built software that could complement current laparoscopic video processing equipment to develop video error signature feedback.