Contact Information
Research Interests
- Bayesian nonparametric and semiparametric methods
- Mass spectrometry data preprocessing methods
- Genomic and proteomic data analysis
- Bayesian statistical designs for clinical trials
Projects
Mass spectrometry data preprocessing
Mass spectrometry (MS) technology makes it possible to study
biological samples at the protein level. The data generated by
this technology hold invaluable information that can lead to disease
diagnosis and treatment (Paweletz et al., 2000; Adam et al., 2003;
Schaub et al., 2004). However, MS data pose tremendous challenges
because the raw spectra reflect not only protein information but also
"junk" information (often referred to as noise). The noise arises from
several sources of variation. For instance, signal intensity varies
substantially even among replicate spectra. Small shifts in the
mass/charge (m/z) locations of peaks that represent the same protein
are another source of variation. In addition, each individual spectrum
has its own, quite different, decreasing baseline. MS data
preprocessing aims to control these variations and recover the true
signal for further statistical analysis. Existing preprocessing methods
pursue this goal from different perspectives and fall into two major
groups: the functional data analysis approach (Morris and Carroll, 2004;
Billheimer, 2004) and the wavelet-based feature extraction approach
(Baggerly et al., 2003; Qu et al., 2003; Chen, Hong and Shyr, 2004;
Coombes et al., 2004; Jeffrey et al., 2004). My current research
focuses on the feature extraction approach.
The feature extraction approach assumes that all the useful
information in an MS intensity plot lies in the peaks and that each
identified peak ideally corresponds to an individual protein. The
major goal of this approach is to identify, quantify, and match
peaks across spectra. The feature extraction approach can therefore
be viewed as a peak detection process, which generally consists of
the following steps: (1) data calibration; (2) denoising (smoothing);
(3) baseline correction; (4) normalization; and (5) peak detection and
alignment (binning). Each step is crucial for producing an accurate
peak list that contains the important features of the MS data for
further statistical analysis. My current research focus is to improve
the existing calibration methods as well as the peak alignment methods.
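To make steps (2) through (5) concrete, here is a minimal Python sketch of such a pipeline on a single synthetic spectrum. It is only an illustration under stated assumptions, not any of the published methods cited above: the moving-average smoother, the local-minimum baseline estimate, the total-intensity normalization, the relative-height peak threshold, and the tolerance-based binning are simple stand-ins chosen for brevity. All function names and parameter values are hypothetical. Calibration (step 1) is instrument-specific and is omitted.

```python
from math import exp

def moving_average(y, half_width=2):
    """Step 2 (denoising): average each point with its neighbours.
    The window is clipped at both ends of the spectrum."""
    n = len(y)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out

def subtract_baseline(y, half_width=10):
    """Step 3 (baseline correction): take the local minimum in a sliding
    window as a crude baseline estimate and subtract it."""
    n = len(y)
    base = [min(y[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]
    return [v - b for v, b in zip(y, base)]

def normalize(y):
    """Step 4 (normalization): scale so the intensities sum to 1."""
    total = sum(y)
    return [v / total for v in y] if total > 0 else list(y)

def detect_peaks(mz, y, min_rel_height=0.25):
    """Step 5a (peak detection): keep local maxima whose height is at
    least min_rel_height times the tallest intensity (assumed rule)."""
    threshold = min_rel_height * max(y)
    return [(mz[i], y[i]) for i in range(1, len(y) - 1)
            if y[i] > y[i - 1] and y[i] >= y[i + 1] and y[i] >= threshold]

def align_peaks(peak_lists, tol=2.0):
    """Step 5b (alignment/binning): pool peak locations from several
    spectra; sorted locations closer than tol share a bin. Returns the
    mean m/z of each bin."""
    bins = []
    for m in sorted(m for peaks in peak_lists for m, _ in peaks):
        if bins and m - bins[-1][-1] <= tol:
            bins[-1].append(m)
        else:
            bins.append([m])
    return [sum(b) / len(b) for b in bins]

def preprocess(mz, intensity):
    """Run steps 2 to 5a on one spectrum."""
    processed = normalize(subtract_baseline(moving_average(intensity)))
    return processed, detect_peaks(mz, processed)

# Synthetic spectrum (made-up numbers): a decaying baseline plus two
# Gaussian peaks centred at m/z 1050 and 1150.
mz = [1000 + i for i in range(201)]
intensity = [50 * exp(-(m - 1000) / 100)
             + 30 * exp(-0.5 * ((m - 1050) / 2) ** 2)
             + 20 * exp(-0.5 * ((m - 1150) / 2) ** 2)
             for m in mz]
processed, peaks = preprocess(mz, intensity)
```

The order of the steps matters in this sketch: smoothing comes before baseline subtraction so that the local-minimum estimate is not driven by single noisy points, and normalization comes after baseline subtraction so that the decaying baseline does not dominate the total intensity.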
Bayesian design for phase II clinical trials
The research goals of phase II clinical trials make Bayesian
methodology appealing, since it fits the decision-theoretic setting
naturally. However, the need to specify utility functions and the
relatively complicated computational procedures may discourage many
from adopting this approach. Recently, a very user-friendly Bayesian
two-stage design, the single threshold design (STD), was proposed by
Tan and Machin (2002). Building on their design framework, we proposed
a modification that reduces the stage II sample size when a very
promising outcome is observed in stage I. To remain consistent with
the STD and as user-friendly as possible, this design is a slight
extension of the STD that keeps its original structure to the greatest
extent possible.
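As a rough illustration of the kind of calculation a threshold-based two-stage design relies on, the Python sketch below computes the posterior probability that the true response rate exceeds a target and compares it with a threshold after stage I. This is an assumption-laden sketch, not the exact rule of Tan and Machin (2002) or of the modification described above. The uniform Beta(1, 1) prior, the function names, and every number in the example are hypothetical.

```python
from math import comb

def posterior_prob_exceeds(responses, n, target):
    """P(p > target | data) under an assumed uniform Beta(1, 1) prior.

    The posterior is Beta(responses + 1, n - responses + 1). With integer
    parameters, its upper tail equals a binomial sum:
    P(p > target) = P(Binomial(n + 1, target) <= responses).
    """
    if not 0 <= responses <= n:
        raise ValueError("responses must lie between 0 and n")
    if not 0 < target < 1:
        raise ValueError("target must lie strictly between 0 and 1")
    return sum(comb(n + 1, k) * target ** k * (1 - target) ** (n + 1 - k)
               for k in range(responses + 1))

def continue_to_stage_two(responses, n1, target, threshold):
    """Hypothetical stage I rule: continue only if the posterior
    probability of exceeding the target reaches the threshold."""
    return posterior_prob_exceeds(responses, n1, target) >= threshold

# Made-up example: 15 stage I patients, target response rate 0.30,
# threshold 0.60.
promising = continue_to_stage_two(10, 15, 0.30, 0.60)
unpromising = continue_to_stage_two(1, 15, 0.30, 0.60)
```

In a sketch like this, a stage II sample size reduction could be expressed as a second, higher threshold on the same posterior probability, but the actual criterion used in the modified design is not specified here.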
Topic revision: r13 - 29 Jan 2009 - 16:59:42 - ChunLi