One study, in the context of breast cancer, includes:

400-800 patients
whole-genome high-density polymorphism scans (about 1 million polymorphisms each)
gene expression (about 55,000 transcripts in tumors)
clinical and histopathological data

These datasets are well suited to studying high dimensionality and the "large p, small n" problems (far more measured variables than samples) inherent to biology in the post-genomic era.
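
To make the scale concrete, the sketch below uses the figures quoted above (about 1 million polymorphisms, about 55,000 transcripts, 400-800 patients) to compute the ratio of variables to samples and the per-test significance threshold under a Bonferroni correction. The 0.05 family-wise error rate, the choice of Bonferroni, and all function names are illustrative assumptions, not a description of the analysis actually used in the study.

```python
# Illustrative arithmetic only: the counts come from the text above,
# while ALPHA and the Bonferroni correction are assumptions.

N_POLYMORPHISMS = 1_000_000   # "about 1 million polymorphisms each"
N_TRANSCRIPTS = 55_000        # "about 55,000 transcripts in tumors"
N_PATIENTS_LOW = 400          # lower end of "400-800 patients"
N_PATIENTS_HIGH = 800         # upper end of "400-800 patients"
ALPHA = 0.05                  # assumed family-wise error rate


def features_per_sample(n_features: int, n_samples: int) -> float:
    """Ratio p/n: how many measured variables there are per patient."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return n_features / n_samples


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error
    rate at or below alpha when n_tests hypotheses are tested."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Expected number of false positives if every null hypothesis is
    true and each test is run at the uncorrected level alpha."""
    return alpha * n_tests


if __name__ == "__main__":
    for label, p in (("polymorphisms", N_POLYMORPHISMS),
                     ("transcripts", N_TRANSCRIPTS)):
        print(label)
        print("  p/n at n=400:", features_per_sample(p, N_PATIENTS_LOW))
        print("  p/n at n=800:", features_per_sample(p, N_PATIENTS_HIGH))
        print("  Bonferroni per-test threshold:",
              bonferroni_threshold(ALPHA, p))
        print("  expected false positives, uncorrected:",
              expected_false_positives(ALPHA, p))
```

With roughly a million tests, an uncorrected 0.05 threshold would be expected to flag about 50,000 markers by chance alone if no marker were truly associated, which illustrates why dimensionality has to be handled explicitly in data of this kind.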

Because the sample cohort is population-based rather than family-based, it is well suited to association mapping of genes and loci for complex diseases and traits, and may also be of interest to population geneticists.

We are also interested in data mining methods that integrate datasets generated on diverse genomic platforms, enabling comprehensive analysis and insight into the higher-order complexity of health and disease states.

We also have collaborative projects on prostate and brain cancers that address diverse phenotypes and outcome prediction.