Mammary Gland evo-devo

Mammary glands are a defining feature of mammals and have been key to our success for the last 60 million years. The need to care for and feed the next generation places enormous selective pressure on body plan development, both to match the number of mammae to the number of offspring and to position the mammary glands for a feeding strategy appropriate to the environment. Despite the inherent interest in mammary glands, we know very little about how either of these plans is implemented in the genome.

To answer these questions, I use the African multimammate rat (Mastomys coucha), which has doubled its litter size (8-14) and number of mammary glands (16-24) in the 8 million years since it diverged from the house mouse. I've created a de novo genome assembly and annotation for M. coucha to find candidate enhancers controlling this extreme gain of mammary glands. This genome sequence is available, along with that of a sister species, M. natalensis, an important host of Lassa virus in Africa.

Exaptation of Repetitive Elements

Mammalian genomes are littered with the remains of repetitive element invasions. Over half of the human and mouse genomes are derived from these sequences, including DNA next to genes, where they can influence regulation. The repetitive nature of these elements has made them difficult to incorporate into experiments that rely on short, high-throughput sequencing reads, because short reads often cannot be placed uniquely among near-identical copies. By focusing on the older, more diverse, and mappable repetitive elements, I have discovered a set that is enriched for tissue-specific histone activity in the mouse. This process of changing function, from genomic invader to gene regulator, is called exaptation.

The questions we can answer with this set of active repetitive elements include: How much of tissue-specific regulation is due to exaptation? Does exaptation work through the creation or co-option of functional elements in repeats? Are older repetitive elements more likely to be exapted?

Enhancer variation

Natural variation and diversity are the fuel of natural selection. While the effects of variation in protein-coding regions are understood in enough depth to predict their effect on protein function, these easily understood regions typically make up no more than 2% of vertebrate genomes. The regulatory elements controlling the expression of genes occupy 2-3x more of the genome than the genes themselves, but despite their critical role, predicting the effect of variation in regulatory elements is extremely difficult. However, by using a set of natural variants that control critical patterning during development, we can restrict the problem to variants that have easily measured effects.

I used a set of highly inbred strains of Drosophila melanogaster with high-quality genotypes and measured the chromatin accessibility landscape during early embryogenesis, along with gene expression by RNA-seq. This combination of tightly staged embryos with high-resolution measurements of genetic variation, chromatin variation, and expression variation enabled me to predict variant effects in the binding sites of key developmental regulators such as the pioneer transcription factor Zelda. All genome data are available here: