Performance evaluation

Each algorithm will be evaluated by its accuracy in classifying the images into one of the four histology classes (gastric metaplasia, Barrett's esophagus, neoplasia). To obtain the most representative score, a leave-one-out cross-validation must be performed at the site level on the training set; we call this Leave-One-Site-Out Cross-Validation (LOSOCV). The algorithm is therefore evaluated on each site in turn, with the images of all other sites used for training. This rules out any bias caused by images from the same imaging site appearing in both the training and the test set.
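
The LOSOCV procedure described above can be sketched as follows. This is a minimal illustration, not the official evaluation code: the function names `leave_one_site_out_splits` and `losocv_accuracy`, the `fit_predict` callable, and the choice to report one accuracy per held-out site are all assumptions made for the example.

```python
from collections import defaultdict


def leave_one_site_out_splits(sites):
    """Yield (held_out_site, train_indices, test_indices), one fold per site.

    `sites` holds one site identifier per image (hypothetical input format).
    """
    by_site = defaultdict(list)
    for idx, site in enumerate(sites):
        by_site[site].append(idx)

    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        # Train on every image that does NOT come from the held-out site.
        train_idx = sorted(
            i for site, idxs in by_site.items() if site != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def losocv_accuracy(samples, labels, sites, fit_predict):
    """Return a dict mapping each held-out site to its classification accuracy.

    `fit_predict(train_x, train_y, test_x)` is a hypothetical callable that
    trains a fresh model and returns one predicted label per test sample.
    """
    per_site = {}
    for site, train_idx, test_idx in leave_one_site_out_splits(sites):
        predicted = fit_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in test_idx],
        )
        correct = sum(p == labels[i] for p, i in zip(predicted, test_idx))
        per_site[site] = correct / len(test_idx)
    return per_site
```

Because each fold trains a fresh model on images from the other sites only, no site contributes to both the training and the test side of the same fold. How the per-site accuracies are combined into a single score is not specified here, so the sketch leaves them separate.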