NCSU : ST701 : Fall 2017

## ST701 – Statistical Theory I   Fall 2017

Updated 12/10/2017

### Instructor data

Name:   Ryan Martin
Office:   5238 SAS Hall
Phone:   919-515-1920
Email:   rgmarti3 AT ncsu.edu (best way to contact me)

### Course data

Course syllabus:   PDF file
Teaching Assistant: Mr. Rahul Ghosal (email: rghosal AT ncsu.edu)
Weekly meetings:
• Lecture:   TTh 8:30–9:45am in SAS Hall 1108
• Lab:   T 10:15–11:05am in SAS Hall 1108.
Office hours:
• with instructor, Thursday 11am–12pm, or by appointment.
• with TA, in SAS Hall 1101, Monday 3–5pm.
Textbook:   Casella & Berger Statistical Inference, 2nd Edition.  [link]
Software:   R is free to download at http://cran.r-project.org/
Corequisites:   MA405 (linear algebra) and MA425/511 (real analysis)

### Announcements, etc

Please check this section occasionally for updates to the schedule or other information.

12/10: Have a nice holiday break!

12/10: The exams have been graded and there was a very wide performance range—mean about 78 and standard deviation about 20. My overall assessment is that the exam was a little bit too hard and a little bit too long; I say a "little bit" because there were about 7 students whose unadjusted scores were over 100! Anyway, I decided to adjust the scores slightly by adding 7 points to each raw score, and a summary of the (adjusted) scores is here.

12/05: My solutions to the final exam are here. It may take me a few days to finish the grading, but you probably have other exams to focus on for the time being. When the grading is done and scores are posted on moodle, I'll add one more announcement here.

11/30: The TA will hold his regular office hours on Monday before the final exam.

11/27: I hope you had a nice holiday weekend. As promised, solutions for the two sets of review problems are here and here, and solutions for the two exams from last fall are here and here. The lab session on Nov 28th will go over some of these problems. It is likely that there are some mistakes in my prepared solutions, though these should only possibly be in terms of the numerical answers, not the approach taken. So if you can follow my approach but get a different numerical answer than I did, then it is safe for you to assume that I made a mistake.

11/21: Here are some review materials to help you prepare for the upcoming final exam. Keep in mind that, last fall, the course had two midterms and a final exam, whereas this time there is only one midterm and a final. So certain aspects of both Exams 2 and 3 from last fall are relevant to our exam this time. In particular, two sets of review problems are here and here, and the two exams from last fall are here and here. Solutions to all of these will be posted soon.

10/19: Scores for Exam 1 are posted on the course moodle page. A summary of the distribution of scores is here.

10/17: I will try to have the exams graded and ready to return to you by Thursday class, but that might not happen; if not this Thursday, then I'll have the graded exams for you by next Tuesday. In the meantime, here are my solutions to the exam if you want to take a look.

10/12: At students' request, I pushed the exam start time back 30 minutes. So, the exam will be Tuesday, October 17th, 9–11am.

10/11: Solutions to the review problems are here and solutions to the old exam are here.

10/08: For lab this Tuesday (October 10th), the TA will do some review activities, e.g., answer students' questions about previous homework problems, discuss some of the exam review problems, etc.

10/06: Here is some information about our upcoming exam, on Tuesday, October 17th.

• The exam will cover all the material in Chapters 1–3, excluding Sections 2.4, 3.4–3.6; Section 3.6 will be part of the material for Exam 2.
• For the exam, you may use a one-page (front and back, regular size paper) sheet of handwritten notes. Feel free to write whatever you want on this paper; the only condition is that it be handwritten, not printed out.
• Here are some practice problems; some are fairly easy and some are less so. This is just to give you something to think about, so there is no need to write out solutions to all of them.
• Here is the exam I gave last fall, which was based on a slightly smaller chunk of material than what we have covered so far. Note that this exam was "too easy" and that I will try to make the exam harder than this one; but this is still useful as it gives you some idea about my style of questions.

09/26: There are now some (rough) notes for the lab sessions posted below.

09/20: Just a reminder that October 5th is fall break so we will not have class that day.

08/23: The first homework assignment has been posted, see below. Also, I have activated the Moodle page for this course, where I'll be posting your grades, so please check to make sure you can see it.

08/23: I've posted my office hours: Thursday 11am–12pm.

08/17: Grades for this course will be posted on "moodle", which can be accessed from the WolfWare website here. Please check your grades occasionally to be sure that your scores have been recorded correctly.

08/17: Here are a few important dates—I will remind you of these things in class when the time is near.

• There will be no class on Thursday, October 5th, for Fall Break or on Thursday, November 23rd, for Thanksgiving.
• The midterm exam is tentatively scheduled for Tuesday, October 17th, and the final exam will be held on Tuesday, December 5th from 8–11am, based on the schedule set by the university.

08/17: Welcome to ST701!

### Course outline, notes, and supplements

1. Probability Theory, Chapter 1 in Casella & Berger.

• Rough daily log of the material covered:
• 08/17: Introduction, syllabus, and parts of Section 1.1.
• 08/22: Sigma-algebras, Kolmogorov's axioms, and Theorem 1.2.6 in Section 1.2.
• 08/24: Properties of probability, classical model, and combinatorics in Section 1.2.
• 08/29: More combinatorics in Section 1.2.
• 08/31: Conditional probability, multiplication and total probability rules, Bayes formula in Section 1.3.
• 09/05: Independent events and random variables, in Sections 1.3 and 1.4.
• 09/07: More random variables, CDFs, in Sections 1.4 and 1.5.
• 09/12: More random variables, CDFs, PMFs/PDFs, in Sections 1.5 and 1.6.
• A classical problem about the potentially counter-intuitive nature of conditional probability is the Monty Hall problem. This is equivalent to the warden/prisoner problem in Example 1.3.4 in the text. (Sorry I butchered the explanation of these things in the class...)
• Our introduction to the mathematical formulation of probability focused on the "intuition" of Kolmogorov's axioms. The only one that lacked some intuition was countable additivity, but from a non-technical point of view, there shouldn't be too much objection to countable additivity if we allow for finite additivity with an arbitrarily large number of terms; you saw in homework that to make the formal jump from finite additivity to countable additivity, a notion of continuity is needed. But aside from intuition, is there anything else to support Kolmogorov's axioms? An alternative motivation for the axioms (up to finite additivity) can be given based on de Finetti's betting and coherence argument which, roughly, goes as follows: if the prices you set on a set of possible bets don't satisfy the properties of a finitely additive probability, then there exists a strategy that your opponent can take to make you a sure loser. There are some aspects of this formulation that are not entirely realistic, but this historically has been a strong motivation for describing uncertainties with probability. A very readable discussion of this idea is given in Kadane's book (see pages 1–5). This is actually a very good book, with a very different style of presentation compared to the "classical" probability texts.
• ...
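
A quick simulation can make the Monty Hall answer less mysterious. The following is a minimal sketch in Python (the same thing is easy to do in R), assuming the standard rules: the host always opens a door that hides no car and is not the contestant's pick, choosing at random when two such doors are available. The function names and the seed are illustrative choices, not anything from the text.

```python
import random


def monty_hall_trial(switch, rng):
    """Play one game; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumed host behavior: open a door that is neither the contestant's
    # pick nor the car, at random when two such doors are available.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the single remaining unopened door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, n_trials=100_000, seed=701):
    """Estimate the win probability for the stay or switch strategy."""
    rng = random.Random(seed)  # arbitrary seed, only for reproducibility
    wins = sum(monty_hall_trial(switch, rng) for _ in range(n_trials))
    return wins / n_trials


stay = win_rate(switch=False)
swap = win_rate(switch=True)
print(f"stay: {stay:.3f}, switch: {swap:.3f}")
```

The estimated win rates should land near 1/3 for staying and 2/3 for switching, which is what the conditional probability calculation gives.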

2. Transformations and Expectations, Chapter 2 in Casella & Berger.

• Rough daily log of the material covered:
• 09/14: Transformations of random variables in Section 2.1.
• 09/19: More transformations and expected value in Sections 2.1 and 2.2.
• 09/21: More expected value in Section 2.2.
• 09/26: Moments, variance, and moment generating functions in Section 2.3.
• 09/28: More moment generating functions in Section 2.3.
• I did not give a clear explanation of the issue about existence of expected values in class (on 09/20); thanks to several students for pointing this out to me. For the continuous case, potential issues can arise if and only if the integrand, g(x) f_X(x), takes both positive and negative values. In that case, the expected value fails to exist only when the integral of the positive part and the integral of the negative part are both infinite. If one is finite and the other infinite, or if both are finite, then the expectation exists. The condition, E|g(X)| finite, that I stated in class (which I copied from the book...) does not fully capture what's really going on here. Sorry for the confusion.
• We will not cover the material in Section 2.4 of the book (and it will not be on the exam). Those topics are important but there is really no context for them in ST701; you will see this stuff again in ST702.
• ...
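
To make the remark about existence of expected values concrete, here is a short worked sketch with g(x) = x. The two densities (standard Cauchy, and a Pareto-type density on [1, ∞)) are illustrative choices, not examples taken from class.

```latex
% Split g into positive and negative parts:
%   g^{+} = \max(g, 0), \qquad g^{-} = \max(-g, 0).
\[
  E\,g(X) \;=\; \int g^{+}(x)\, f_X(x)\,dx \;-\; \int g^{-}(x)\, f_X(x)\,dx ,
\]
% which is well defined unless both integrals are infinite.

% (a) Standard Cauchy, f_X(x) = 1 / \{\pi (1 + x^2)\}, with g(x) = x:
\[
  \int_0^\infty \frac{x}{\pi(1+x^2)}\,dx
  \;=\; \lim_{t \to \infty} \frac{\log(1+t^2)}{2\pi}
  \;=\; \infty ,
\]
% and by symmetry the negative part also integrates to \infty,
% so E(X) has the form \infty - \infty and does not exist.

% (b) f_X(x) = x^{-2} for x \ge 1, with g(x) = x:
\[
  \int_1^\infty x \cdot x^{-2}\,dx \;=\; \infty ,
  \qquad
  \int g^{-}(x)\, f_X(x)\,dx \;=\; 0 ,
\]
% so E(X) = +\infty exists (as an extended real number),
% even though E|X| is not finite.
```

Case (b) is the situation that the condition "E|g(X)| finite" fails to capture: that condition is sufficient for a finite expectation, but the expectation can still be well defined without it.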

3. Common Families of Distributions, Chapter 3 in Casella & Berger.

• Rough daily log of the material covered:
• 09/28: Normal distribution in Section 3.3.
• 10/03: Normal and other continuous distributions in Section 3.3.
• 10/05: More continuous distributions in Section 3.3.
• 10/10: A bit about exponential families and location–scale models in Sections 3.4 and 3.5.
• 10/12: Probability inequalities from Section 3.6.
• Location-scale families of distributions are important, the normal being one of the most common. The key insight about the location-scale families is that, by definition, they are closed under linear transformations, i.e., if X has a distribution in the family, then so does aX + b, for any suitable a and b. Though linear transformations are nice, there is really nothing special here about the transformations being linear. All that really matters for this idea to work is that a composition of two linear transformations is another linear transformation. The natural generalization of this is to consider a group of transformations with function composition as the binary operation. With a "base distribution" and a group of transformations, one can construct a group family of distributions. Some details about this can be found (in several places) in Chapter 1 of my notes; these notes were prepared for a different class at a different university, but you might find them to be a helpful resource for some things.
• My notes also have considerable discussion about exponential families. In particular, I give a proof of the claim (skipped in class) that we can interchange differentiation and integration when working with exponential families; see Theorem 2.1 on page 31.
• ...
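
The closure property behind location-scale families can be written out in a few lines. This is a sketch, assuming a > 0 and writing T_{a,b} for the map x ↦ ax + b; the notation T_{a,b} is introduced here for illustration and is not from the textbook.

```latex
% Transformations T_{a,b}(x) = a x + b with a > 0 and b real.
\[
  (T_{a_2,b_2} \circ T_{a_1,b_1})(x)
  \;=\; a_2 (a_1 x + b_1) + b_2
  \;=\; (a_2 a_1)\,x + (a_2 b_1 + b_2)
  \;=\; T_{a_2 a_1,\; a_2 b_1 + b_2}(x) .
\]
% Identity and inverse, so the collection is a group under composition:
\[
  T_{1,0}(x) = x ,
  \qquad
  T_{a,b}^{-1} \;=\; T_{1/a,\; -b/a} .
\]
% If Z has density f and X = T_{a,b}(Z) = aZ + b, then
\[
  f_X(x) \;=\; \frac{1}{a}\, f\!\left( \frac{x - b}{a} \right) ,
\]
% which is again a member of the location-scale family generated by f.
```

Since a_2 a_1 > 0 whenever a_1, a_2 > 0, the composition stays inside the same collection of maps, which is exactly the property that the group-family construction generalizes.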

4. Multiple Random Variables, Chapter 4 in Casella & Berger.

• Rough daily log of the material covered:
• 10/19: Joint distributions in Section 4.1.
• 10/24: Marginal distributions, conditional distributions, and independence in Section 4.2.
• 10/26: More independence, and bivariate transformations, Sections 4.2–4.3.
• 10/31: More transformations, in Section 4.3.
• 11/02: More transformations and a bit about mixture models, in Sections 4.3–4.4.
• 11/07: More about mixtures, then covariance/correlation, in Sections 4.4–4.5.
• 11/09: More on correlation, bivariate normal, and Jensen's inequality, in Sections 4.5 and 4.7.
• Please read Section 4.6 on multivariate distributions (i.e., joint distributions for more than two random variables) on your own. This is conceptually no different than the bivariate stuff we've been talking about, so you don't need to be "taught" anything new here; it's just that the calculations are more tedious. We will go over some of this later when we need it for sampling distributions.

5. Properties of a Random Sample, Chapter 5 in Casella & Berger.

• Rough daily log of the material covered:
• 11/14: Sampling distributions in Sections 5.1 and 5.2.
• 11/16: Sampling from a normal population in Section 5.3.
• 11/21: Order statistics and start of random variable convergence in Sections 5.4 and 5.5.
• Chapter 1 in my notes has a bit about sampling distribution properties, even some convergence concepts.
• ...

### Labs

The lab sessions for ST701 are run by the TA, Mr. Rahul Ghosal, and will consist of working out solutions to some extra problems, from the textbook or elsewhere. Rough notes from these lab sessions will be posted here shortly after the lab date. Notes from the first few labs are in one file; a separate file will be posted for each lab starting on 09/26/2017.

Labs up to and including 09/19/2017. Notes are here.

Lab on 09/26/2017. Notes are here.

Lab on 10/03/2017. Notes are here.

Lab on 10/10/2017. Go over some review materials for the exam, e.g., questions in the review packet.

Lab on 10/24/2017. Go over Stein's Lemma (Lemma 3.6.5) with its application in Example 3.6.6, and Problem 4.4.

Lab on 10/31/2017. Problems 4.14, 4.26, and 4.27; maybe 4.31.

Lab on 11/07/2017. Joint distributions of more than two random variables, in Section 4.6. Notes are here.

Lab on 11/14/2017. Definition and properties of the multinomial distribution, pages 180–182 in the text.

Lab on 11/21/2017. Distribution of order statistics and examples, Section 5.4.

Lab on 11/28/2017. Go over some of the review and old exam problems.

### Homework

Homework will be collected at the beginning of class on the day it's due. The assigned problems will usually be taken from the textbook, but I may occasionally include some problems of my own. You are welcome to discuss the homework with your classmates, but each student must submit their own independent write-up of the solutions. Copying the work of others (which includes your classmates, people who post materials on the web, etc) is not acceptable. Solutions will be posted here shortly after the due date.

Homework 1 — Due Tuesday 09/05/2017.
From Chapter 1 of Casella & Berger: Exercises 1.1, 1.4, 1.5, 1.6, 1.13, 1.23; Bonus: 1.12. Solutions

Homework 2 — Due Tuesday 09/19/2017.
From Chapter 1 of Casella & Berger: Exercises 1.24, 1.34, 1.38, 1.40 [part corresponding to (c) only], 1.46, 1.51, 1.53. Solutions

Homework 3 — Due Tuesday 10/03/2017.
From Chapter 2 of Casella & Berger: Exercises 2.9, 2.11, 2.17, 2.24, 2.25, 2.26, 2.33. Solutions

Homework 4 — Due Friday 10/13/2017. (In my mailbox by 3pm)
From Chapter 3 of Casella & Berger: Exercises 3.15 [use PMF in Eq (3.2.10)], 3.19, 3.24(a), 3.26(b), 3.38. Solutions

Homework 5 — Due Thursday 11/02/2017.
From Chapters 3–4 of Casella & Berger: Exercises 3.46, 4.1, 4.5, 4.10, 4.15, 4.17, 4.20. Solutions

Homework 6 — Due Tuesday 11/21/2017.
From Chapter 4 of Casella & Berger: Exercises 4.24, 4.34(a), 4.45, 4.54, 4.55, 4.59. Solutions