Education

  • Ph.D. in Statistics, University of Chicago, 2020 (expected)
  • B.A. in Mathematics and Economics (Music Minor), Cornell University, 2013

Awards

  • Davis Wallace Award for Applied Statistics, 2018
  • Scholarship for Summer Institutes at the University of Washington in Seattle, 2017

Research

Software

  • HIPPO (Heterogeneity Inspired Pre-Processing Tool), available on Bioconductor: HIPPO
  • Multivariate Missing Bayesian Variable Selection, available on CRAN: MMVBVS
  • Differential Network Analysis: diffNet
  • Clustering Noisy Single Cell Data: SCNoisyClustering
  • Change Point Analysis for Copy Number Variation: CopyNumberCellShift

Talks

  • University of Chicago Consulting Seminar
    • Application of Neural Networks to Predicting Winter Wheat Yield
  • University of Chicago, Medical School, Section of Genetic Medicine
    • Autoencoders with Parametric Noise Models for De-Noising scRNA-seq Data

Other Projects

  • Noisy Data Clustering, advised by Mengjie Chen
    • Designed a similarity learning algorithm to cluster noisy, zero-inflated data and applied it to single-cell sequencing data
  • Copy Number Variation Change Point Analysis, advised by Mengjie Chen
    • Devised an alternating descent algorithm combining the group fused lasso with a mixed effects model to detect copy number alterations in cancer cells
  • Ancestry-eGenes, advised by Dan Nicolae
    • Identified genes that are differentially expressed by local ancestry and genotype across 44 human tissues in African Americans and European Americans; found an enrichment in the immunity-related region
  • Microbiome Data Analysis, consulting team leader
    • Advised on a manuscript revision for a medical journal to ensure sound analysis of microbiome data, in particular modeling within- and between-group variance while accounting for complex batch effects
  • Effects of Maternal Language Use in Children’s Brain Development, consulting team leader
    • Led a consulting team that provided statistical analysis of complex correlated data for a psychologist’s postdoctoral project
  • Computing Variance across Large Data Sets, advised by Lars Vilhuber
    • Manually implemented map-reduce for 3.3 TB of census data containing several million time series, avoiding memory issues and ensuring numerical stability

Work Experience

  • Zurich North America, 2019
    • Data Scientist Summer Intern
    • Built a neural network model to predict winter wheat yields for Multi-Peril Crop Insurance, reducing RMSE by 10% compared to the existing GBM model
    • Conducted extensive research to evaluate how well the current data explain the variability of crop yields
  • Hanwha Life Insurance, 2013 - 2014
    • Actuarial Associate
    • Participated in the product design team
    • Revised contracts to comply with amendments to national tax systems
  • LnB Prep, 2014 - 2015
    • Instructor
    • Taught SAT, ACT, and TOEFL
    • Developed strategy textbooks for the new SAT

Skills

  • Statistical Modeling and Inference
  • Coding Languages
    • R, Rcpp, Python, Matlab, Java, bash, Julia (basic)
  • Other Technical Skills
    • Git, TeX, Linux/Unix, HTML, MS Office
  • Languages
    • English, Korean

Coursework

  • Theoretical Statistics
    • Distribution Theory (STAT 304)
    • Mathematical Statistics 1 (STAT 301)
    • Mathematical Statistics 2: Bayesian Analysis and Principles (STAT 302)
    • Nonparametric Inference (STAT 374)
    • Multiple Testing, Modern Inference, and Replicability (STAT 308)
  • Applied Statistics
    • Applied Linear Statistical Methods (STAT 343)
    • Design and Analysis of Experiments (STAT 345)
    • Generalized Linear Models (STAT 347)
    • Computational Biology: Models and Inference (STAT 354)
    • Statistical Genetics (STAT 355)
  • Computational Statistics
    • Mathematical Computation I: Matrix Computation (STAT 309)
    • Mathematical Computation II: Nonlinear Optimization (STAT 310)
    • Machine Learning (STAT 377)
    • Machine Learning and Cancer (CMSC 337)
    • Machine Learning and Large-Scale Data Analysis (STAT 376)

Teaching

I served as a course assistant for Stat 331, Stat 226, and Stat 200, where I was responsible for grading and answering students’ questions. For Stat 234, I served as the lead TA, in charge of the overall organization of the course.

  • Stat 331 Sample Survey (Autumn 2016, Autumn 2017, Autumn 2018)
  • Stat 234 Statistical Models and Methods I (Spring 2016, Spring 2017)
  • Stat 226 Analysis of Categorical Data (Winter 2017, Winter 2019)
  • Stat 200 Elementary Statistics (Winter 2015)

Other Activities

  • Cornell University Chorus, 2011 - 2013
  • Rockefeller Chapel Choir, 2015 - Present