Interstitial Lung Disease (ILD) encompasses a wide array of diseases that share some common radiologic characteristics. When diagnosing such diseases, radiologists can be affected by heavy workload and fatigue, which decreases diagnostic accuracy. Automatic segmentation is the first step in implementing a Computer Aided Diagnosis (CAD) system that will help radiologists improve diagnostic accuracy while reducing manual interpretation. The proposed automatic segmentation uses an initial thresholding and morphology-based segmentation coupled with feedback that detects large deviations and applies a corrective segmentation. This feedback is analogous to a control system: it allows detection of abnormal or severe lung disease and provides feedback to an online segmentation, improving the overall performance of the system. The feedback system encompasses a texture paradigm. We studied 48 male and 48 female patients, consisting of 15 normal and 81 abnormal patients. A senior radiologist chose the five levels needed for ILD diagnosis. The segmentation results were displayed by comparing the automated and ground truth boundaries (courtesy of ImgTracer™ 1.0, AtheroPoint™ LLC, Roseville, CA, USA). For the left lung, segmentation performance was 96.52% for Jaccard Index, 98.21% for Dice Similarity, 0.61 mm for Polyline Distance Metric (PDM), -1.15% for Relative Area Error and 4.09% for Area Overlap Error. For the right lung, segmentation performance was 97.24% for Jaccard Index, 98.58% for Dice Similarity, 0.61 mm for PDM, -0.03% for Relative Area Error and 3.53% for Area Overlap Error. The segmentation has an overall similarity of 98.4%. The proposed segmentation is an accurate and fully automated system.
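As an illustration of the overlap measures reported above, the following minimal sketch computes the Jaccard Index, Dice Similarity and a signed Relative Area Error for two binary masks. It is not the study's implementation: the function name `overlap_metrics`, the nested-list mask representation and the sign convention for the area error (automated minus ground truth, divided by ground truth) are assumptions made for this example.

```python
def overlap_metrics(auto_mask, truth_mask):
    """Return (jaccard, dice, relative_area_error) for two binary masks.

    Masks are nested lists of 0/1 values with identical shape
    (an assumed representation, chosen to keep the sketch dependency-free).
    """
    # Collect the (row, column) coordinates of foreground pixels in each mask.
    auto = {(r, c) for r, row in enumerate(auto_mask)
            for c, v in enumerate(row) if v}
    truth = {(r, c) for r, row in enumerate(truth_mask)
             for c, v in enumerate(row) if v}
    if not truth:
        raise ValueError("ground truth mask is empty")

    intersection = len(auto & truth)
    union = len(auto | truth)

    jaccard = intersection / union
    dice = 2 * intersection / (len(auto) + len(truth))
    # Assumed convention: negative values mean the automated region is smaller.
    relative_area_error = (len(auto) - len(truth)) / len(truth)
    return jaccard, dice, relative_area_error


if __name__ == "__main__":
    automated = [[1, 1, 0],
                 [1, 1, 0]]
    ground_truth = [[1, 1, 1],
                    [1, 1, 1]]
    j, d, rae = overlap_metrics(automated, ground_truth)
    print(f"Jaccard={j:.3f} Dice={d:.3f} RelativeAreaError={rae:.3f}")
```

For a single pair of masks the two overlap measures are related by Dice = 2J / (1 + J), so Dice is never smaller than Jaccard, which matches the ordering of the figures reported for both lungs.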
Human interaction has become almost mandatory for an automated medical system wishing to be accepted by clinical regulatory agencies such as the Food and Drug Administration. Since this interaction causes variability in the gathered data, the inter-observer and intra-observer variability must be analyzed in order to validate the accuracy of the system. This study focuses on the variability between different observers who interact with an automated lung delineation system that relies on human interaction in the form of delineation of the lung borders. The database consists of High Resolution Computed Tomography (HRCT) images of 15 normal and 81 diseased patients, taken retrospectively at five levels per patient. Three observers independently delineated the lung borders manually, using software called ImgTracer™ (AtheroPoint™, Roseville, CA, USA), in all five levels of the 3-D lung volume. Observer-1 is a less experienced novice tracer, a resident in radiology working under the guidance of a radiologist, whereas Observer-2 and Observer-3 are lung image scientists trained by a lung radiologist and a biomedical imaging scientist and experts. The inter-observer variability can be shown by comparing each observer's tracings to the automated delineation and also by comparing the observers' manual tracings with one another. The normality of the tracings was tested using the D'Agostino-Pearson test, and all observers' tracings showed a P-value higher than 0.05, consistent with normality. The analysis of variance (ANOVA) test between the three observers and the automated system showed a P-value higher than 0.89 and 0.81 for the right lung (RL) and left lung (LL), respectively. The performance of the automated system was evaluated using the Dice Similarity Coefficient (DSC), Jaccard Index (JI) and Hausdorff Distance (HD) measures.
Although Observer-1 has less experience than Observer-2 and Observer-3, the Observer Deterioration Factor (ODF) shows that Observer-1 differs by less than 10% from the other two, which is within the acceptable range as per our analysis. To compare observers, this study used regression plots, Bland-Altman plots, and two-tailed t-test, Mann-Whitney and Chi-squared tests, which showed the following P-values for RL and LL: (i) Observer-1 and Observer-3: 0.55, 0.48, 0.29 for RL and 0.55, 0.59, 0.29 for LL; (ii) Observer-1 and Observer-2: 0.57, 0.50, 0.29 for RL and 0.54, 0.59, 0.29 for LL; (iii) Observer-2 and Observer-3: 0.98, 0.99, 0.29 for RL and 0.99, 0.99, 0.29 for LL. Further, the correlation coefficient (CC) and R-squared coefficients computed between observers came out to be 0.9 for RL and LL. All three observers, however, capture the feature that diseased lungs are smaller than normal lungs in terms of area.
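The Bland-Altman comparison between two observers can be sketched as follows. This is only a minimal illustration under stated assumptions, not the analysis pipeline used in the study: the function name `bland_altman`, the example lung areas and the conventional 1.96 standard deviation limits of agreement are all choices made for this example.

```python
from statistics import mean, stdev


def bland_altman(measure_a, measure_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences (a - b); the limits of
    agreement are bias -/+ 1.96 sample standard deviations of the differences.
    """
    if len(measure_a) != len(measure_b):
        raise ValueError("paired measurements must have the same length")
    if len(measure_a) < 2:
        raise ValueError("at least two pairs are required")

    differences = [a - b for a, b in zip(measure_a, measure_b)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread


if __name__ == "__main__":
    # Hypothetical lung areas traced by two observers on the same slices.
    observer_1 = [10.0, 12.0, 14.0, 16.0]
    observer_2 = [9.0, 12.0, 15.0, 14.0]
    bias, lower, upper = bland_altman(observer_1, observer_2)
    print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias close to zero with narrow limits of agreement would indicate that two observers trace similar areas, which is the kind of agreement the P-values above are intended to support.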
Lung disease risk stratification is important for both diagnosis and treatment planning, particularly in biopsies and radiation therapy. Manual lung disease risk stratification is challenging because of: (a) large lung data sizes, (b) inter- and intra-observer variability of the lung delineation and (c) lack of feature amalgamation in the machine learning (ML) paradigm. This paper presents a two-stage cascaded CADx system consisting of: (a) a semi-automated lung delineation subsystem (LDS) for lung region extraction in CT slices, followed by (b) morphology-based lung tissue characterization, thereby addressing the above shortcomings. The LDS primarily uses entropy-based region extraction, while the ML-based lung characterization is mainly based on an amalgamation of directional transforms such as Riesz and Gabor along with texture-based features comprising 100 greyscale features, evaluated using the K-fold cross-validation protocol (K = 2, 3, 5 and 10). The lung database consisted of 96 patients: 15 normal and 81 diseased. We use five High Resolution Computed Tomography (HRCT) levels representing different anatomical landmarks where disease is commonly seen. We demonstrate an amalgamated ML stratification accuracy of 99.53%, an increase of 2% over the conventional non-amalgamation ML system that uses Riesz-based features alone, embedded with feature selection based on feature strength. The robustness of the system was determined based on its reliability and stability, which showed a reliability index of 0.99 and a deviation in risk stratification accuracies of less than 5%. Our CADx system shows 10% better performance when compared against the mean of five other prominent studies available in the current literature covering over one decade.
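The K-fold cross-validation protocol mentioned above can be sketched as an index-splitting routine. This is a minimal sketch under assumptions, not the study's code: the function name `kfold_indices`, the fixed shuffling seed and the choice to split at the patient level (96 samples) are introduced here purely for illustration.

```python
import random


def kfold_indices(n_samples, k, seed=0):
    """Return a list of k (train_indices, test_indices) pairs.

    Indices are shuffled once with a fixed seed, then dealt into k folds;
    each fold serves as the test set exactly once.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Deal the shuffled indices round-robin so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]

    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    # The four protocols named in the text, applied to 96 hypothetical samples.
    for k in (2, 3, 5, 10):
        sizes = [len(test) for _, test in kfold_indices(96, k)]
        print(f"K={k}: test fold sizes {sizes}")
```

Smaller K leaves less data for training in each round (half of the samples when K = 2), so reporting accuracy across K = 2, 3, 5 and 10 shows how sensitive the classifier is to the amount of training data.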