Identification of potential genomic biomarkers for Sjögren’s syndrome using data pooling of gene expression microarrays

Abstract

Sjögren’s syndrome (SS) is an autoimmune disease characterized by lymphocytic infiltration and destruction of the salivary and lacrimal glands. Diagnosis of SS can be challenging because no specific test for the disease exists. The purpose of this study was to examine the accuracy of gene expression profiles for the diagnosis of SS. We identified 9 publicly available datasets that included gene expression data from saliva and salivary gland biopsy samples of 52 patients with SS and 51 controls. From these, we compiled and pooled data from three datasets comprising 37 samples from SS patients and 29 samples from healthy controls, which were designated the “training set.” We then cross-listed the results against a group of independent gene expression datasets from patients with SS to identify a consensus list of differentially expressed genes. We performed Linear Discriminant Analysis (LDA) to quantify how accurately the discriminating genes predict SS, both in the “training set” and in an independent group of datasets designated the “test set.” We identified 55 genes as potential classifier genes to differentiate SS from healthy controls. LDA with leave-one-out cross-validation identified 19 genes (EPSTI1, IFI44, IFI44L, IFIT1, IFIT2, IFIT3, MX1, OAS1, SAMD9L, PSMB9, STAT1, HERC5, EVI2B, CD53, SELL, HLA-DQA1, PTPRC, B2M, and TAP2) with the highest classification accuracy (95.7%). Moreover, we validated these results by applying the same gene expression profile as a discriminatory test in the “test set,” which included data from salivary gland samples of 15 patients with SS and 22 controls, and obtained 94.6% accuracy. We propose that the gene expression profile in saliva or salivary glands could represent a promising, simple, and reproducible diagnostic biomarker for SS.
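To illustrate the classification step described above, the following is a minimal sketch of two-class LDA evaluated by leave-one-out cross-validation. It is not the authors' pipeline: the function names (`fit_lda`, `predict`, `loocv_accuracy`), the equal-prior threshold, and the synthetic "expression" values are all assumptions made for illustration, and no data from the study are used.

```python
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_covariance(groups, means):
    """Pooled within-class covariance matrix over all classes."""
    d = len(means[0])
    cov = [[0.0] * d for _ in range(d)]
    for rows, mu in zip(groups, means):
        for r in rows:
            diff = [x - m for x, m in zip(r, mu)]
            for i in range(d):
                for j in range(d):
                    cov[i][j] += diff[i] * diff[j]
    dof = sum(len(g) for g in groups) - len(groups)
    return [[v / dof for v in row] for row in cov]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_lda(X, y):
    """Fit two-class LDA; returns (weights, threshold).

    Assumes labels 0 (control) and 1 (case) and equal class priors,
    so the threshold sits at the midpoint of the two class means.
    """
    g0 = [x for x, label in zip(X, y) if label == 0]
    g1 = [x for x, label in zip(X, y) if label == 1]
    mu0, mu1 = mean_vector(g0), mean_vector(g1)
    cov = pooled_covariance([g0, g1], [mu0, mu1])
    w = solve(cov, [a - b for a, b in zip(mu1, mu0)])
    midpoint = [(a + b) / 2 for a, b in zip(mu0, mu1)]
    return w, sum(wi * mi for wi, mi in zip(w, midpoint))


def predict(model, x):
    """Return 1 if the discriminant score exceeds the threshold, else 0."""
    w, threshold = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > threshold else 0


def loocv_accuracy(X, y):
    """Hold out each sample once, refit on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(X)):
        model = fit_lda(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += predict(model, X[i]) == y[i]
    return correct / len(X)


# Synthetic stand-in for expression data: 3 hypothetical "genes",
# 20 controls centred at 0 and 20 cases centred at 3 (arbitrary units).
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(20)]
X += [[rng.gauss(3.0, 1.0) for _ in range(3)] for _ in range(20)]
y = [0] * 20 + [1] * 20

accuracy = loocv_accuracy(X, y)
print(f"LOOCV accuracy on synthetic data: {accuracy:.3f}")
```

Because each held-out sample is excluded from the fit that classifies it, the reported accuracy is less optimistic than accuracy measured on the fitting data itself. It still does not replace validation on an independent test set such as the one used in the study.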
