Medical Image Classification
Welcome to our site!
This website presents our class project for CAP 5610 - Introduction to Machine Learning.

Here are the links to our paper and presentation. Please feel free to view them and contact us if necessary.


Abstract
The demand for automatic annotation of medical images is growing faster than ever. Manual description and annotation of each image is time-consuming, expensive, and impractical. This calls for the development of automatic image annotation algorithms that can perform the task reliably. In this paper, we present a hierarchical multi-label classification (HMC) system for the automatic annotation of medical images. To find the most effective classification, a contextual hierarchy (CH) loss [5] is presented in accordance with the problem hierarchy. The problem of automatically annotating medical images can be solved efficiently using a greedy algorithm (GLabel) [5] on tree-structured label hierarchies. The experimental results on the ImageCLEF 2007 [1] annotation data set show the strength and promise of the presented methods.
Background and Introduction
Medical images play a central role in patient diagnosis, therapy, surgical planning, medical reference, and medical training. With the advent of digital imaging modalities, as well as images digitized from conventional devices, collections of medical images are increasingly being held in digital form. As these collections grow, annotating medical images manually becomes increasingly expensive. Consequently, automatic medical image annotation becomes very important [2].
    This paper describes the medical annotation task using the ImageCLEF 2007 dataset [1]. The objective of this task is to provide the Image Retrieval in Medical Applications (IRMA) code [3] for each image in a given set of previously unseen medical (radiological) images. A set of 12,076 classified training images is provided and may be used in any way to train a classifier. The results of the classification step can be used for multilingual image annotation as well as for DICOM standard header correction. According to the IRMA code [3], a total of 197 classes are defined. The IRMA coding system consists of four axes with three to four positions, each in {0,…,9, a,…,z}, where “0” denotes “unspecified” and marks the end of a path along an axis. This allows a short and unambiguous notation (IRMA: TTTT-DDD-AAA-BBB), where T, D, A, and B denote a coding or sub-coding digit of the respective axis. Figure 1 gives two examples of unambiguous image classification using the IRMA code. The image on the left is coded: 1123 (x-ray, projection radiography, analog, high energy) – 211 (sagittal, left lateral decubitus, inspiration) – 520 (chest, lung) – 3a0 (respiratory system, lung). The image on the right is coded: 1220 (x-ray, fluoroscopy, analog) – 127 (coronal, AP, supine) – 722 (abdomen, upper abdomen, middle) – 430 (gastrointestinal system, stomach).
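To make the code structure concrete, the following minimal Python sketch splits an IRMA string of the form TTTT-DDD-AAA-BBB into its four axes and expands each axis into its path of hierarchical prefixes, stopping at the first "0". The function names `parse_irma` and `axis_path` are hypothetical helpers introduced here for illustration and are not part of the project's code. The sketch also assumes that every position after the first "0" on an axis is unspecified.

```python
from typing import Dict, List

# Axis letters and lengths follow the notation TTTT-DDD-AAA-BBB.
AXES = ("T", "D", "A", "B")
AXIS_LENGTHS = (4, 3, 3, 3)
ALPHABET = set("0123456789abcdefghijklmnopqrstuvwxyz")


def parse_irma(code: str) -> Dict[str, str]:
    """Split an IRMA code string into its four axis codes (hypothetical helper)."""
    parts = code.strip().lower().split("-")
    if len(parts) != len(AXES):
        raise ValueError(f"expected {len(AXES)} axes, got {len(parts)}")
    axes = {}
    for axis, length, part in zip(AXES, AXIS_LENGTHS, parts):
        if len(part) != length or not set(part) <= ALPHABET:
            raise ValueError(f"invalid code {part!r} for axis {axis}")
        axes[axis] = part
    return axes


def axis_path(axis_code: str) -> List[str]:
    """Return the prefixes of one axis code, from root to most specific.

    Assumption: "0" means "unspecified" and ends the path, so it and
    anything after it are ignored.
    """
    path = []
    for i, ch in enumerate(axis_code):
        if ch == "0":
            break
        path.append(axis_code[: i + 1])
    return path


if __name__ == "__main__":
    # The left-hand example from Figure 1.
    for axis, axis_code in parse_irma("1123-211-520-3a0").items():
        print(axis, axis_path(axis_code))
```

Each prefix in a path corresponds to one node of the tree for that axis, which is the kind of tree-structured label hierarchy that a hierarchical classifier operates on.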