Date of Award

Summer 8-15-2017

Author's School

Graduate School of Arts and Sciences

Author's Department

Biology & Biomedical Sciences (Computational & Systems Biology)

Degree Name

Doctor of Philosophy (PhD)

Degree Type



The epidermis covers the surface of the skin and functions as a protective barrier for the entire body. As the epidermis is necessary for protecting organisms from environmental threats such as water loss, ultraviolet radiation and microbial infection, it is found in nearly all vertebrate species. Cells in the epidermis called keratinocytes proliferate in the basal layer of the epidermis and migrate upwards to the suprabasal spinous and granular layers and finally to the stratum corneum. As this terminal differentiation process occurs, genes in the Epidermal Differentiation Complex (EDC) locus are expressed. The human EDC consists of four gene families: S100, SPRR, LCE and FLG-like, which have resulted from gene duplications during vertebrate evolution. Furthermore, the EDC was identified as being under the strongest degree of positive selection in the human lineage compared to our closest living relative, the chimpanzee, as evidenced by an abundance of non-synonymous nucleotide substitutions. However, the mechanisms underlying the positive selection of the human EDC remain unknown. I hypothesize that the EDC and the human skin barrier continue to evolve in primates and in ancestral human populations. Here, I examine the sequence of genes and patterns of allele frequencies in the EDC of primates and humans to more accurately determine the molecular and evolutionary mechanisms underlying positive selection in the EDC. Using a diverse panel of EDC gene homologs from 14 mammals, I identify biologically significant non-synonymous single nucleotide polymorphisms (SNPs) in filaggrin, SPRR4, LELP1, and S100A2. By contrast, I identify recent positive selection in SPRR4 in primates. I observe positive selection on specific sites in SPRR4, LELP1, filaggrin, and repetin across 14 mammals.
In addition to continuous evolution of SPRR4, I discover site-specific positive selection in S100A11, KPRP, SPRR1A, S100A7L2, and S100A3 in primates and in filaggrin, filaggrin2, and S100A8 in great apes. Very recent human positive selection was identified at the filaggrin2 L41 site, which was present in Neanderthals. Given this finding, I hypothesized ongoing positive selection in the EDC in modern human populations for skin barrier adaptation. Composite of multiple signals (CMS) statistic scores for each EDC SNP were extracted and analyzed to identify positively selected SNPs in the CEU, YRI and JPT/CHB populations. Three coding SNPs (rs2229496, rs7535306 and rs7545520) in the involucrin (IVL) gene and one non-coding SNP (rs4845490) upstream of the SMCP gene exhibited the highest CMS scores, with derived allele frequencies (0.95) specific to the CEU population. Direct and positive correlations were observed between the allele frequencies for the three IVL and the SMCP CEU SNPs and increases in northern latitude, revealing the positive selective pressure for these alleles in northern geographical locations. The IVL CEU SNPs are associated with a relatively longer haplotype in the CEU, which includes an epidermal-specific enhancer (923) and was not observed in the YRI and JPT/CHB populations. Moreover, rs4845327 in the 923 enhancer is in linkage disequilibrium with the IVL CEU haplotype and is a skin-specific eQTL. The CEU major allele for rs4845327 was also associated with increased enhancer activity in reporter assays as well as an allele-specific increase in IVL expression. Together, my findings reveal a recent selective sweep in modern human populations for an IVL-defined haplotype harboring an enhancer associated with increased allele-specific IVL expression for epidermal barrier function.


Language

English (en)

Chair and Committee

Cristina de Guzman Strong

Committee Members

Donald Conrad, Justin Fay, Nancy Saccone, Gary Stormo


Permanent URL:

Available for download on Wednesday, December 15, 2117