Data Science Institute
Center for Computational Molecular Biology

Hosted bi-weekly on Wednesdays at 4pm at the Data Science Institute, Room 302. See below for seminar dates and speakers. 

Fall 2024 Seminar Schedule

Wednesday, September 11: Transforming Cancer Informatics at Brown

Center for Clinical Cancer Informatics and Data Science (CCIDS) Speakers

Jeremy L. Warner, MD, MS, FAMIA, FASCO is the founding Director of CCIDS, Associate Director of Data Science for the Legorreta Cancer Center, and Professor of Medicine and Professor of Biostatistics at Brown University.

Ece Uzun, MS, PhD, FAMIA is the founding Director of Clinical Bioinformatics at Lifespan, Associate Director of CCIDS, Associate Professor of Pathology and Laboratory Medicine at Brown University and Editor-in-Chief of JMIR Bioinformatics and Biotechnology. 

Sanjay Mishra, MS, PhD is a Research Associate of Medicine at Brown University, a Research Program Manager at the Lifespan Cancer Institute, and the coordinator of CCIDS and the COVID-19 and Cancer Consortium (CCC19).

Abstract: The Brown Center for Clinical Cancer Informatics and Data Science (CCIDS) aims to 1) lead and collaborate in the development of standards for clinical cancer informatics and genomics; 2) demonstrate and improve best practices in interoperability of the local data ecosystem that aligns with clinician and bioinformatician workflows; 3) develop best-in-class natural language processing (NLP) solutions for cancer phenotyping at scale; 4) cultivate machine-learning-based models to predict cancer status using patient data; 5) identify and promote rapid translational innovations; and 6) train the next generation of clinicians and basic researchers in the domain of clinical cancer informatics through collaboration with stakeholders across Lifespan and Brown University and with national organizations. In this seminar, we will introduce CCIDS and share our current research activities, seminar series, and future plans.

 

Wednesday, September 25: Performance and safety of AI and machine learning algorithms and LLMs for clinical diagnosis by patients

Hamish Fraser MBChB, MSc, FACMI, FIASHI; Associate Professor of Medical Science, Brown Center for Biomedical Informatics, Warren Alpert Medical School, and Associate Professor of Health Services, Policy and Practice, School of Public Health, Brown University 

Hosted by Ritambhara Singh, John E. Savage Assistant Professor of Computer Science and Data Science.

Abstract: Over 350,000 health-related apps are available in the US, mostly in use outside of formal health systems. One widely used application area is diagnosis and triage of symptoms by patients, with apps termed Symptom Checkers. These apps have the potential to help people recognize symptoms of potentially serious diseases and seek care in a timely manner, or to reassure people with minor conditions. However, there is very limited evidence regarding the accuracy, safety, and usability of such apps. In my lab we have been evaluating a symptom checker in use by actual patients attending urgent or emergency care in Rhode Island, and comparing its diagnostic and triage accuracy to that of the physicians who actually see the patients. Recently we have extended this work to study the diagnostic and triage accuracy of ChatGPT and GPT-4 for these cases and also for diagnosis of TIA or stroke. I will describe our findings to date, the potential benefits and risks of these diagnostic tools, and how ChatGPT compares to established algorithms. Finally, I will discuss how some of these findings can generalize to low-income countries like Kenya.

 

Wednesday, October 9: Inferring gene regulatory network dynamics that control cell fate in single cells

Zoom: https://brown.zoom.us/j/95516847080

Adam MacLean, PhD; Assistant Professor of Quantitative and Computational Biology, University of Southern California

Hosted by Ritambhara Singh, John E. Savage Assistant Professor of Computer Science and Data Science.

Abstract: Cells make decisions to enable multicellular life. Cell fate decision-making underlies development and homeostasis, and goes awry as we age. Despite great promise, we have yet to harness the high-resolution information on cell states and fates that single-cell genomics data offer to understand cell fate decisions in development and aging. Nor do we know how these fate decisions are controlled by gene regulatory networks. I will describe our recently developed methods for gene regulatory network inference using single-cell multiomics to infer dynamic cell state transitions. We have also constructed models of cell fate decisions in stem cells to discover how early-life events – mutational, transcriptional, and epigenetic – shape and change stem cell function as we age in a manner that could be harnessed to ameliorate diseases of aging.

 

Wednesday, October 23: Multimodal characterization of cell states at single-cell resolution

Zoom: https://brown.zoom.us/j/97248840588

Galip Gürkan Yardimci; Assistant Professor, CEDAR, OHSU Knight Cancer Institute

Hosted by Ritambhara Singh, John E. Savage Assistant Professor of Computer Science and Data Science.

Abstract: The advent of single-cell genomics has enhanced our ability to study heterogeneous cell populations (1) to track the course of temporal processes, such as cellular differentiation, (2) to identify novel and rare cell states, and (3) to characterize the heterogeneity of complex tissues, such as tumors. In this talk, I will present two methods to study such cell populations using multimodal single-cell omics assays. Our first algorithm, EpiConfig, is an interpretable multimodal topic model that learns an unsupervised clustering of single cells while modeling cross-modality relationships. We applied EpiConfig to a collection of sc-RNA+ATAC-seq assays that jointly measure the transcriptome and chromatin accessibility of single cells from healthy and cancerous cell populations. EpiConfig is as accurate as widely used sc-multiomics clustering methods; it learns sets of RNA, ATAC, and cross-modality features, called topics, that correspond to specific cell types and states. We developed a Shiny app for interpretation of these topics to obtain biological insights into different cell states; we show that cross-modality features reflect 3D genome interactions. Our second method, RIDDLER, can identify copy number variation (CNV) events in single-cell datasets. CNV is a widely studied structural variation seen in the genomes of cancerous and other dysfunctional cells. CNVs can have direct and indirect effects on gene dosage, and may drive cancer progression and other disorders. RIDDLER is a single-cell resolution CNV detection algorithm based on outlier-aware generalized linear modeling. We demonstrate the effectiveness of our algorithm on cancer cell line models, where it achieves better agreement with sc-WGS-derived CNVs than competing methods. RIDDLER is able to accurately reconstruct the clonal heterogeneity of the cell population, in accordance with sc-WGS-derived clones, and can be applied to sc-ATAC-seq, sc-WGS, and sc-methylation datasets. Furthermore, we show that RIDDLER can enable CNV-similarity-based integration of different modalities.

 

Wednesday, November 6: Deep statistical modeling of nanopore sequencing translocation times reveals latent non-B DNA structures

Zoom: https://brown.zoom.us/j/97944260330

Derek Aguiar, PhD; Associate Professor in the Department of Computer Science and Engineering, University of Connecticut 

Hosted by Roberta De Vito, Thomas J. and Alice M. Tisch Assistant Professor of Biostatistics and Data Science.

Abstract: Non-canonical (or non-B) DNA are genomic regions whose three-dimensional conformation deviates from the canonical double helix. Non-B DNA plays an important role in basic cellular processes and is associated with genomic instability, gene regulation, and oncogenesis. Experimental methods are low-throughput and can detect only a limited set of non-B DNA structures, while computational methods rely on non-B DNA base motifs, which are necessary but insufficient indicators of non-B structures. I will present our group’s work on the first computational method to predict non-B DNA structures from Oxford Nanopore sequencing. We formalized non-B structure prediction as a novelty detection problem and developed an autoencoder that optimizes goodness-of-fit tests, which enables the computation of P-values that indicate non-B structures. Based on whole genome nanopore sequencing of NA12878, we showed that there exist significant differences between the timing of DNA translocation for non-B DNA bases compared with B-DNA. Experimental validations suggest that reliable detection of non-B DNA from nanopore sequencing is achievable.

Bio: I am an assistant professor in the Computer Science and Engineering Department at the University of Connecticut. I graduated from the University of Rhode Island with B.S. degrees in Computer Engineering and Computer Science, received my Ph.D. in Computer Science from Brown University, advised by Professor Sorin Istrail, and completed my postdoctoral work at Princeton University with Professor Barbara Engelhardt. My research aims to develop probabilistic machine learning models, combinatorial algorithms, and scalable inference methods to better understand high-dimensional data, particularly genomics and genetics data applied to complex disease.

 

Wednesday, November 20: Your inner Neanderthal: A deep dive into the Neanderthal and Denisova DNA in modern human genomes

Zoom: https://brown.zoom.us/j/98217558612

Laurits Skov, Ph.D.; Assistant Professor, Section for Evolutionary Genomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark.

Hosted by Emilia Huerta-Sanchez, Associate Professor of Ecology, Evolution, and Organismal Biology. 

Abstract: 100,000 years ago there were at least 7 different human groups roaming the earth – among them were anatomically modern humans and archaic groups such as the Neanderthals and Denisovans. Today only humans remain.

However, our ancestors encountered Neanderthals and Denisovans and these groups contributed DNA to our genomes – in a sense they live on in us. With the sequencing of millions of genomes over the last decade, we now have the ability to fully reconstruct and study the genomes of Neanderthals, Denisovans and potentially other extinct human groups.

In this presentation I will discuss the evolutionary history of Neanderthal and Denisovan DNA found in the genomes of >30,000 people living today and address the following questions:

How often did our ancestors encounter extinct human groups and when and where did these encounters take place? What role does archaic human DNA play in the lives of present-day people?

Which pieces of archaic DNA were beneficial and which were detrimental to modern human groups? And are there some pieces of DNA that are unique to modern humans, i.e., regions that never crossed over from archaic groups to modern humans?

 

Wednesday, December 4: Title & Abstract TBA.

Zoom: https://brown.zoom.us/j/94273168511

Kevin Lin, Assistant Professor, Biostatistics, University of Washington.

Hosted by Zhijin Wu, Professor of Biostatistics, Brown.