I am a PhD student in the Department of Computer Science at the University of Pittsburgh.

My research interests are in machine learning, particularly deep learning. I am currently developing new methods based on recurrent neural networks and low-dimensional embeddings for predicting discrete events in time series data.
My research advisor is Dr. Milos Hauskrecht.


  • Our paper on periodicity-based event time-series prediction was accepted to the FLAIRS 2020 conference
  • Our paper on recent context-based LSTM for clinical event time series prediction won the Best Paper Award at Artificial Intelligence in Medicine (AIME) 2019 (Paper)


  • Facebook, Seattle, WA, May 2019 - Jul 2019
    Intern / Pages Sciences team (Project: content embedding, personalized page recommendation)
  • Facebook, Seattle, WA, May 2018 - Aug 2018
    Intern / Pages Sciences team (Project: sequence models for page recommendation)
  • Clinical Translational Science Institute, University of Pittsburgh, Pittsburgh, PA, May 2016 - Aug 2016
    Data Science and Analysis Intern
  • Daeyang Luke Hospital & Baobab Health Trust, Malawi, Feb 2013 - Jun 2013
    Software Engineer
  • Tetherless World Constellation, Rensselaer Polytechnic Institute, Troy, NY, May 2012 - Aug 2012
    Research Intern




  • TA for Introduction to Machine Learning (CS 1675) 2019 Spring
  • TA for Programming Languages for Web Applications (CS 1520) 2019 Fall, 2018 Spring, 2017 Fall, 2016 Spring & Fall, 2015 Spring
  • TA for Introduction to Systems Software (CS 449) 2018 Fall
  • TA for Algorithm Implementation (CS 1501) 2018 Fall
  • TA for Introduction to Programming with Python (CS 8) 2015 Fall


  • Discrete Event Prediction and Modeling in Time Series Data
    - Developing time series models that represent and learn the behavior of complex event time series in electronic health records.
    - Built an end-to-end data pipeline that extracts features from raw data sources, trains models on GPUs, tunes hyperparameters, runs evaluation, and visualizes results.
    - The models and the pipeline are built with PyTorch, Python, and Bash.
  • Clinical Knowledge Modeling using Medical Textbooks
    - Developed a machine learning model that learns to quantify the similarity of clinical concepts such as diseases, medications, and lab tests from various knowledge sources, including medical textbooks, websites, and knowledge graphs.
    - A Skip-gram embedding method was used. The aim of the project is to investigate whether embedding-based distance measures can guide feature selection for classifying electronic health record data.
    - The online text scraping and parsing scripts were built with the Beautiful Soup library in Python.
    - Using the Biomedical Annotating API (http://bioportal.bioontology.org), free text in textbooks and websites was mapped to ontological concepts. Word embeddings were trained with the Gensim library.
  • Modeling Patient Mortality from Unstructured Text Data
    - Developed a machine learning model that predicts patient mortality from clinical notes. Latent Dirichlet Allocation and sparse group regularization were used for feature learning, and an SVM was used for classification.

Previous Projects

  • Electronic Medical Record System Project in Africa
    Patient registration and billing module, part of the EMR system, for Daeyang Luke Hospital in Malawi, East Africa

  • Global Health Explorer
    Semantic web tool that can be used to conduct public health surveillance using Twitter

  • Data Cube Browser
    Data exploration tool written in JavaScript with D3 and jQuery that makes it simple to explore data expressed as RDF Data Cubes.
  • Empowering community health workers (CHWs) by connecting mobile health (mHealth) and electronic medical record (EMR) systems
    Independent Research, Advisor: Daiyon Joh
  • Toward Next Generation of Global Disaster Response and Coordination System (GDRCS)
    Class Project - X Informatics


Jeongmin Lee
Computer Science Department
University of Pittsburgh
210 S Bouquet St. Pittsburgh, PA 15260