# Statistical Machine Learning Ph.D. Program

## Program Contacts

**Department of Mathematical and Statistical Sciences**

Professor Edit Gombay

*Office*: CAB 425

*Email*: gombay@math.ualberta.ca

*Phone*: (780) 492-2337

*Fax*: (780) 492-6826

**Department of Computing Science**

Professor Csaba Szepesvári

*Office*: ATH 311

*Email*: szepesva@cs.ualberta.ca

*Phone*: (780) 492-8581

*Fax*: (780) 492-1071

## Overview

The Master of Science and Doctor of Philosophy degrees in Statistical Machine Learning may be taken jointly in the Department of Computing Science and in the Department of Mathematical and Statistical Sciences. The program emphasizes the theoretical aspects of the design and analysis of machine learning algorithms using tools of statistics and computer science.

Students can apply either to the Department of Computing Science or to the Department of Mathematical and Statistical Sciences to participate in this program. The department to which the student applies becomes the student's host department, grants the degree, and administers the program.

## Why take Statistical Machine Learning?

**If you are a Computing Science student interested in machine learning, why should you take the SML program? What are the benefits? And what are the pitfalls?**

90 percent of machine learning is based on statistical ideas. Statistical ideas and statistical thinking constitute the core of the subject. If you really want to understand topics such as *overfitting*, *cross validation and its uses*, *the limits of learnability*, *adaptive methods*, or why *LASSO* is a good idea (if at all), then the SML program can help you speak this language.

The SML program gives you the opportunity to build strong foundations in probability theory and statistics. These days, the boundary between machine learning and statistics is less clear than ever before. Statisticians publish in machine learning journals and at machine learning conferences, and vice versa. After all, both fields seek better ways to build models that, in turn, produce better predictions. In fact, the demand for rigorous analysis of algorithms is greater than ever, and for good reason: a solid understanding of algorithms is the foundation that keeps the tower of results built on top of it from collapsing. Empirical evidence is important, but it can never tell the whole story.

**What if you are a Statistics student? Why should you care?**

Machine learning is a vibrant and rapidly expanding part of statistics. As new models appear, so do new opportunities. Machine learning researchers like non-standard models and situations, which creates many wonderful research opportunities. Also, because the subject is new, it may be easier to gain recognition from the community (though in truth, you'll still have to work hard at it!).

**What are the job prospects like? Will this program enhance your chances of employment?**

These days, employers looking for machine learning researchers are aware that machine learning and statistics are tightly interwoven. Having an SML degree certifying that you speak both languages is to your advantage. You also have *double the options* when you apply for jobs: you can look into jobs that require a computing science/machine learning background as well as those that require a statistics/probability theory background. If you decide to stay in academia, you can consider one of the many openings in statistics. Or, your specialization in machine learning may lead you to work as a researcher for Google, Yahoo, Amazon or Netflix.

**Who should not take the SML program?**

If you are a computing science student who is bored by theory, math, and probability, do not take the program. If you are afraid of hard work, this program is not for you! In fact, the load for this program is slightly higher than average.

## Entrance Requirements

The entrance requirement for the PhD program in Statistical Machine Learning is, normally, an MSc degree in Computing Science or in Mathematical and Statistical Sciences, or equivalent.

## Course Requirements

In addition to the examinations called for by the general regulations in the host department, the student must successfully complete an entrance year which includes at least two full terms of course work. The program of a full-time student in each of these terms shall normally include at least three courses from the list of approved courses (graduate or senior undergraduate, Computing Science or Mathematical and Statistical Sciences).

Computing Science PhD students participating in this program need to take:

- Four courses from the approved list of courses
  - At least two out of four from Mathematical and Statistical Sciences
  - At least two out of four from Computing Science
  - Extra courses (beyond the required load) taken during your MSc reduce the number of courses you need to take during your PhD. For example, if you already took two extra Math/Stats courses during your MSc, you do not need to take any Math/Stats courses during your PhD.
- One course on Teaching and Research Methods (CMPUT 603)

### Entrance Year Course Requirements

Not all of the courses listed here are offered every year. If you are in doubt, please contact the program coordinators.

Students must select two of the following core Statistics courses:

- STAT 571: Probability and Measure
- STAT 566: Methods of Statistical Inference (or STAT 664, Advanced Statistical Inference)
- STAT 665: Asymptotic Methods in Statistical Inference

Similarly, students must select another two of the following core Computing courses:

- Numerical Optimization: Theory and Algorithms
- Machine Learning
- Probabilistic Graphical Models
- Reinforcement Learning in Artificial Intelligence

### Approved Courses

For completing their course requirements, in addition to the courses listed above, students can also select courses from the following.

**Mathematical and Statistical Sciences courses**

- STAT 512: Techniques of Mathematics for Statistics
- STAT 575: Multivariate Analysis
- STAT 580: Stochastic Processes
- STAT 671: Probability Theory I
- STAT 672: Probability Theory II
- STAT 503: Directed Study III
- STAT 679: Time Series Analysis
- STAT 578: Regression Analysis

The list of offered courses varies from year to year. See the graduate course directory for this year's list of approved courses.

## How to Apply

To apply, follow the departmental application process. In the departmental application system (GAPS), specify "*Statistical Machine Learning*" as your program.