Yanna Shen

 

I am a Ph.D. student in the Intelligent Systems Program at the University of Pittsburgh, advised by Dr. Gregory Cooper. I work as a Graduate Student Researcher at the RODS Laboratory.

My research interests include machine learning, graphical models, Bayesian statistics, and the application of these methods to real-world problems.

Suite M-183 VALE, 200 Meyran Ave

University of Pittsburgh

Pittsburgh, PA 15260

 

Phone: (412)648-6710

E-mail: shenyn@cs.pitt.edu

 

Detecting anomalous events in data has important applications in domains such as disease outbreak detection, fraud detection, and intrusion detection. In a typical scenario, a monitoring system examines a sequence of data to determine whether any recent activity deviates from baseline behavior. Many anomaly-detection algorithms, such as the cumulative sum (CUSUM) method, use frequentist statistical techniques. My current work focuses on Bayesian modeling for anomaly detection.
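
For readers unfamiliar with the cumulative sum method mentioned above, the following is a minimal illustrative sketch of a one-sided CUSUM detector. It is not the implementation used in any of the cited work; the function name, the parameters `baseline_mean`, `slack`, and `threshold`, and the daily counts are all hypothetical and chosen only for illustration.

```python
def cusum_alarms(series, baseline_mean, slack, threshold):
    """One-sided (upper) CUSUM.

    Accumulates how far each observation exceeds baseline_mean + slack and
    returns the indices at which the accumulated statistic exceeds threshold.
    All parameter names and values are illustrative assumptions.
    """
    alarms = []
    s = 0.0
    for t, x in enumerate(series):
        # The statistic never drops below zero, so quiet periods do not
        # build up "credit" that would hide a later increase.
        s = max(0.0, s + (x - baseline_mean - slack))
        if s > threshold:
            alarms.append(t)
            s = 0.0  # restart accumulation after raising an alarm
    return alarms


# Hypothetical daily counts with a rise starting around day 4.
counts = [10, 11, 9, 10, 12, 15, 16, 18, 17, 10]
alarms = cusum_alarms(counts, baseline_mean=10.0, slack=1.0, threshold=8.0)
print(alarms)
```

A larger `threshold` lowers the false alarm rate at the cost of later detection, which is the same trade-off between accuracy and timeliness that outbreak-detection evaluation has to measure.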

►The goal of automated disease-outbreak detection systems is to detect disease outbreaks early while producing few false positives.

 I am now working on Bayesian modeling of unknown causes of events in the context of disease-outbreak detection. I developed a Bayesian hybrid detection algorithm that models and detects both known diseases (e.g., influenza and anthrax) by using informative prior probabilities and unknown diseases (e.g., a new, highly contagious respiratory virus that has never been seen before) by using relatively non-informative prior probabilities [3,7].

 I developed a general Bayesian univariate anomaly-detection algorithm that runs in linear time [9]. I intend to develop a multivariate version of this algorithm for monitoring multiple features of some event.

 I developed an efficient spatial Bayesian outbreak-detection algorithm that performs complete Bayesian model averaging over all possible spatial hypotheses, which we call SBMA [2]. I intend to develop a multivariate version of SBMA and apply SBMA to a wide variety of disease-outbreak scenarios.

►The objective when evaluating a disease-outbreak detection algorithm is to measure its accuracy (sensitivity and false alarm rate) and its time to detection. However, little research has been done on estimating how well automated disease-outbreak detection systems augment traditional outbreak detection carried out by clinicians. Ideally, detection algorithms should augment clinician detection rather than duplicate it.

 I developed a general mathematical framework for evaluating joint clinician-machine detection of anomalies [1,4].
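
One simple way to picture joint detection is to treat an outbreak as detected as soon as either the clinicians or the automated system detects it, so the joint detection time is the minimum of the two. The sketch below computes the expected value of that minimum. It is a toy illustration only, not the framework of [1,4]: it assumes the two detection times are independent, and the function name and both probability distributions are hypothetical.

```python
def expected_joint_detection_time(p_clinician, p_machine):
    """Expected value of min(T_clinician, T_machine).

    p_clinician[d] and p_machine[d] give the probability that detection
    occurs on day d (d = 0, 1, 2, ...). Independence of the two detection
    times is an assumption made for this illustration.
    """
    total = 0.0
    for day_c, prob_c in enumerate(p_clinician):
        for day_m, prob_m in enumerate(p_machine):
            total += prob_c * prob_m * min(day_c, day_m)
    return total


# Hypothetical detection-day distributions (each sums to 1).
p_clinician = [0.0, 0.0, 0.2, 0.3, 0.5]
p_machine = [0.0, 0.25, 0.25, 0.25, 0.25]

clinician_alone = sum(day * p for day, p in enumerate(p_clinician))
joint = expected_joint_detection_time(p_clinician, p_machine)
print(clinician_alone - joint)  # expected days gained by adding the machine
```

The difference between the clinician-only expectation and the joint expectation is one way to express how much an automated system adds beyond clinician detection, rather than how well it performs in isolation.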

 

In the summer of 2008, I was a research intern at Intel Research Pittsburgh, where I worked on Bayesian dynamic joint modeling of multiple perspectives in spatial-temporal inference problems [8].

 

1. Shen, Y., C. Adamou, J. N. Dowling, and G.F. Cooper, Estimating the joint disease outbreak-detection time when an automated biosurveillance system is augmenting traditional clinical case finding. Journal of Biomedical Informatics, 2008. 41(2): 224-231. [PDF]

2. Shen, Y., W.-K. Wong, J. Levander and G.F. Cooper, An outbreak detection algorithm that efficiently performs complete Bayesian model averaging over all possible spatial distributions of disease. Advances in Disease Surveillance 2007; 4:113. [PDF]

3. Shen, Y. and G.F. Cooper, A Bayesian biosurveillance method that models unknown outbreak diseases. In: Proceedings of Intelligence and Security Informatics: BioSurveillance 2007: 209-215. [PDF]

4. Shen, Y., W.-K. Wong, and G.F. Cooper, Estimating the expected warning time of outbreak-detection algorithms. Advances in Disease Surveillance 2006; 1:65. [PDF]

5. Lu, X., Q. Li, Z. Huang, Y. Shen, and T. Yao, Towards Chinese-English sentence alignment based on statistical method. Journal of Mini-Micro Systems, 2004. 25(6): 990-992.

6. Zhang, L., X. Lu, Y. Shen and T. Yao, A statistical approach to extract Chinese chunk candidates from large corpora. In: Proceedings of International Conference on Computer Processing on Oriental Languages, 2003.

Papers Submitted or in Preparation

7. Shen, Y. and G.F. Cooper, Bayesian modeling of unknown diseases for biosurveillance. Submitted to Artificial Intelligence in Medicine.

8. Dash, D., Y. Shen, and M. Phillipose, AMP: Automating Multi-viewpoint Perception. Submitted to the NIPS 2008 Workshop: Learning from Multiple Sources.

9. Shen, Y. and G.F. Cooper, A Bayesian univariate anomaly-detection algorithm.

10. Shen, Y. and G.F. Cooper, An efficient Bayesian model averaging algorithm for spatial disease-outbreak detection.

 

 

 

An outbreak detection algorithm that efficiently performs complete Bayesian model averaging over all possible spatial distributions of disease. 2007 International Society for Disease Surveillance Annual Conference, Indianapolis, Indiana USA, October 2007. [PPT]

A Bayesian biosurveillance method that models unknown outbreak diseases. NSF Workshop on BioSurveillance Systems and Case Studies (BioSurveillance 2007), New Brunswick, New Jersey USA, May 2007. [PPT]

Estimating the expected warning time of outbreak-detection algorithms. 2005 Syndromic Surveillance Conference, Seattle, Washington USA, September 2005. [PPT]

 

2003 Fall
ISSP 2020  Topics in Intelligent Systems
CS 2710 Foundations of Artificial Intelligence
CS 2731 Introduction to Natural Language Processing
BIOST 2041 Introduction to Statistical Methods 1

2004 Spring
CS 2150  Design & Analysis of Algorithms
ISSP 2030  Advanced Topics in Intelligent Systems
ISSP 2170  Machine Learning
MUSIC 0121  Basic Musicianship: Class Piano

2004 Fall
ISSP 2070  Probabilistic Methods for Computer-Based Decision Support
STAT 1631  Intermediate Probability

2005 Spring
BIOST 2042 Introduction to Statistical Methods 2
STAT 1632 Intermediate Mathematical Statistics

2005 Fall
CS 3710 Probabilistic Graphical Models (Advanced Topics in Artificial Intelligence)
CMU
36-705 Intermediate Statistics

2006 Spring
CMU
10-702 Statistical Machine Learning

2006 Fall
CMU
10-708 Probabilistic Graphical Models

 


Last update: Oct 2008