{"id":270,"date":"2023-06-13T13:09:05","date_gmt":"2023-06-13T20:09:05","guid":{"rendered":"https:\/\/live-usc-dornsife.pantheonsite.io\/larry-goldstein\/?page_id=270"},"modified":"2023-10-27T14:15:26","modified_gmt":"2023-10-27T21:15:26","slug":"math-308-statistical-inference-and-data-analysis","status":"publish","type":"page","link":"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/","title":{"rendered":"Math 308 &#8211; Statistical Inference and Data Analysis"},"content":{"rendered":"\n\n  \n    \n\n\n\n\n\n\n<div\n  class=\"cc--component-container cc--rich-text \"\n\n  \n  \n  \n  \n  \n  \n  >\n  <div class=\"c--component c--rich-text\"\n    \n      >\n\n    \n      \n<div class=\"f--field f--wysiwyg\">\n\n    \n  <p><big><strong>Course Content<\/strong>: Efficiency, information inequality, simulation, bootstrap, hypothesis testing,\u00a0p-values, likelihood ratio, nonparametrics, descriptive statistics, experimental design, regression, multiple linear regression and analysis of variance, categorical data, chi-squared tests, Bayesian statistics.<\/big><big><\/p>\n<p><strong>Instructor<\/strong>:\u00a0<a href=\"https:\/\/dornsife.usc.edu\/larry-goldstein\/\">Larry Goldstein<\/a>, larry at usc dot edu, KAP 406D, 213 740 2405. 
Office Hours:\u00a0<\/big><big>MW 10:45-11:45<\/big><br \/>\n<big><br \/>\n<strong>Main Text<\/strong>:\u00a0<\/big><big><a href=\"http:\/\/www.amazon.com\/Mathematical-Statistics-Resampling-Laura-Chihara\/dp\/1118029852\/ref=sr_1_1?ie=UTF8&amp;qid=1383604183&amp;sr=8-1&amp;keywords=chihara+statistics\">Mathematical Statistics and Resampling with R<\/a>, Chihara and Hesterberg,\u00a0<a href=\"https:\/\/sites.google.com\/site\/chiharahesterberg\/\">Textbook Supplements<\/a>, including datasets<\/big><big><\/p>\n<p><strong>R Resources<\/strong>:<a href=\"http:\/\/cran.r-project.org\/doc\/manuals\/R-intro.html\">\u00a0Introduction to R<\/a>, Another\u00a0<a href=\"http:\/\/cran.r-project.org\/doc\/contrib\/Lam-IntroductionToR_LHL.pdf\">Introduction to R<\/a>,\u00a0\u00a0<a href=\"http:\/\/www.r-project.org\/\">R homepage<\/a>,\u00a0\u00a0<a href=\"http:\/\/www.jeremymiles.co.uk\/regressionbook\/extras\/appendix2\/R\/\">Introductory example<\/a>,\u00a0\u00a0<a href=\"http:\/\/www.r-tutor.com\/\">R-Tutorial<\/a>, an\u00a0<a href=\"http:\/\/research.stowers-institute.org\/efg\/R\/\">R Graphics Gallery<\/a><\/p>\n<p><strong>Teaching Assistant<\/strong>: Michael Hankin, mhankin at usc dot edu. Office Hours: Tuesday 3-4, Wednesday 1-2, Thursday 3-4,\u00a0<a href=\"https:\/\/dornsife.usc.edu\/mathcenter\/\">Math Center<\/a><\/big><\/p>\n\n\n\n<\/div>\n\n\n  <\/div><\/div>\n\n\n\n\n  \n    \n\n\n\n\n\n\n<div\n  class=\"cc--component-container cc--rich-text \"\n\n  \n  \n  \n  \n  \n  \n  >\n  <div class=\"c--component c--rich-text\"\n    \n      >\n\n    \n      \n<div class=\"f--field f--wysiwyg\">\n\n    \n  <h5><strong>Exams and Grading Policy<\/strong><\/h5>\n<ul>\n<li>Homework: 15%<\/li>\n<\/ul>\n<ul>\n<li>Midterm 1: 20%: Wednesday, February 19th.<\/li>\n<\/ul>\n<p><big>\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0\u00a0<\/big><big>Scores: 23 26 33 34 34 35 39 44 46 54 58 69<\/big><big><br \/>\n<\/big><\/p>\n<ul>\n<li>Midterm 2: 20%, Wednesday, April 2nd. 
Includes material in Chapter 4, Chapter 5 up to Section 5.6 excluding Bootstrap percentile intervals, Sections 6.3.1 and 6.3.3, and Sections 7.1.1, 7.1.2, and 7.3.<\/li>\n<\/ul>\n<p><big><\/big><big>\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 Scores: 37,42,47,49,52,71,76,87,98,99,105<\/big><\/p>\n\n\n\n<\/div>\n\n\n  <\/div><\/div>\n\n\n\n\n  \n    \n\n\n\n\n\n\n<div\n  class=\"cc--component-container cc--rich-text \"\n\n  \n  \n  \n  \n  \n  \n  >\n  <div class=\"c--component c--rich-text\"\n    \n      >\n\n    \n      \n<div class=\"f--field f--wysiwyg\">\n\n    \n  <p><strong>Course Project:<\/strong>\u00a020% Analysis of a data set using the techniques learned in class. You should prepare a report, or writeup, for distribution to the class that explains the data, what inferences were drawn from it, and what statistical techniques were applied. Presentations should last 25 minutes. If interesting coding or new R issues arose during your work, you may also consider discussing these with the class. 15% credit for your own presentation, 5% for participation with questions, comments, or suggestions on the presentations of others. April 21st, 23rd, 28th and 30th.<\/p>\n<p><strong>Final Exam:<\/strong>\u00a025% Monday, May 12th, 2-4PM. Exam will be comprehensive over all material during the semester, with emphasis on what was covered since the second midterm, including Sections 8.2.1, 8.2.2, 9.1-9.4.<\/p>\n<p>&#8220;<a href=\"https:\/\/www.youtube.com\/watch?v=3T7jMcstxY0\">The Greatest Ever Infographic<\/a>&#8221;<\/p>\n\n\n\n<\/div>\n\n\n  <\/div><\/div>\n\n\n\n\n  \n    \n\n\n\n\n\n\n<div\n  class=\"cc--component-container cc--rich-text \"\n\n  \n  \n  \n  \n  \n  \n  >\n  <div class=\"c--component c--rich-text\"\n    \n      >\n\n    \n      \n<div class=\"f--field f--wysiwyg\">\n\n    \n  <h5><strong>Assignments<\/strong><\/h5>\n<p>1. 
Listen to Radio Lab,\u00a0<a href=\"http:\/\/www.radiolab.org\/story\/91684-stochasticity\/\">Stochasticity<\/a>\u00a0parts A: A Very Lucky Wind, and B: Seeking Patterns<\/p>\n<p>2. Write a simulation in R to estimate the probability of obtaining seven heads in a row in 100 tosses of a fair coin. It may be simpler to break the task down into two pieces, as follows:<\/p>\n<p>a) Write a function that takes in a vector of 100 0&#8217;s and 1&#8217;s and returns either TRUE or FALSE depending on whether or not it contains a run of seven heads.<br \/>\nHint: check to see if each flip is the beginning of a 7 flip run.<br \/>\nb) Create 1000 100-flip-samples and use your function to count the number that contain such runs.<\/p>\n<p>3. Exercises 1.11: 2, 3, 5, 6.<\/p>\n<p>4. Exercises 2.8: 2,4,6,8,13,14,15,17<\/p>\n<p>5. Test the null hypothesis that the Salk Vaccine is ineffective, using the Hypergeometric distribution, and compare the exact p-value obtained there to the one computed using both the Binomial and the Normal Approximation. Recall that both treatment and control groups were of size 200,000, and that the treatment group had 56 cases, while the control group had 141.<\/p>\n<p>6. Perform a permutation test for the data\u00a0<a href=\"https:\/\/sites.google.com\/site\/chiharahesterberg\/\">NCBirths2004<\/a>\u00a0to test the null hypothesis that tobacco use by the mother does not affect the birth weight of newborns.<\/p>\n<p>7. Exercises 3.9: 4,8,11,13,17,19,22,25,29<\/p>\n<p>8. Exercises 4.4: 1,2,6,9,10,15,18,20,22,25,27,28<\/p>\n<p>9. Find EZ^n for Z ~ N(0,1) for n=0,1,2,3,4,5 and 6.<\/p>\n<p>10. Find the moment generating function of the chi squared distribution on k degrees of freedom, and use it to calculate the mean and variance.<\/p>\n<p>11. The World Series in baseball is awarded to the first team to win four games. Hence the series can be 4,5,6 or 7 games long. 
Over the 50-year period starting in 1952, the number of times the series lasted each of those numbers of games was 8,8,10 and 24, respectively. Test the hypothesis H_0 that the games of the World Series are independent, with each team having an equal chance of winning.<\/p>\n<p>12. Find the expected number of contiguous subsequences of the form 01111111 in 100 tosses of a fair coin. Assuming that the distribution of the number of occurrences of this subsequence is approximately Poisson, find an approximation to the probability that the 100 toss sequence contains at least one subsequence of this type. Compare the result obtained this way to the estimate computed by simulation in Problem 2, above.<\/p>\n<p>13. The unbiased estimate of variance, scaled by n-1, is typically preferred to the variance estimate scaled by the sample size n. Use the bootstrap to estimate the bias of these two variance estimates for a small sample of independent normal variables.<\/p>\n<p>14. Find the distribution and density function of the second largest observation from a sample of n independent and identically distributed random variables with density function f. What is the expected value of this variable when the density is uniform over [0,1]?<\/p>\n<p>15. Exercises 5.10: 5,6,8,9,10,11,12,14<\/p>\n<p>S. Find an observational study reported in a `reliable source&#8217; (e.g. LA Times, CNN News, etc.) where you can name an overlooked confounding effect that would partially, or fully, negate the\u00a0conclusion drawn.<\/p>\n<p>16. Exercises 6.4: 1,2,4,5,10,12,14,16,25,27,34,36<\/p>\n<p>17. Exercises 7.6: 1,2,7,8,10,12,17,20,24,25,31,34,37<\/p>\n<p>18. Exercises 8.5: 4,6,11,14,16,17,18,25,36,37<\/p>\n<p>19. Exercises 9.7: 7,9,10,11,17,18,21<\/p>\n<p>20. Find the power function for the one-sided hypothesis test of H_0:\u00a0\u03bc = \u03bc_0 vs H_1:\u00a0\u03bc &gt; \u03bc_0 at significance level \u03b1 when observing n i.i.d. normal variables with unknown mean \u03bc and variance \u03c3^2=1. 
Plot the power function for \u03b1 = 0.05, \u03bc_0=0 and n=100. How large a sample is needed in order to have power 0.80 to detect that \u03bc=1?<\/p>\n<p>21. Find the least squares estimate of, and a confidence interval for, \u03b2 in the linear model y<sub>i<\/sub>=\u00a0\u03b2x<sub>i<\/sub>+ \u03b5<sub>i<\/sub>,\u00a0i=1,2,...,n, when the errors are i.i.d. normal variables.<\/p>\n<p>22. Run a linear regression analysis on the Pearson father-son height\u00a0<a href=\"https:\/\/uscdornsife.usc.edu\/wp-assets\/221\/pearson.txt\">data<\/a>\u00a0set. Form a scatter plot for the data, estimate all the parameters of the model, and test the hypothesis that there is no association between father and son height, that is, test the hypothesis that \u03b2 equals zero against the alternative that it is non-zero.<\/p>\n<p>F. \u00a0Write a problem for possible use in the final exam. Though direct variations of already assigned problems are one possibility, higher credit will be given for problems that fairly test a student&#8217;s ability to understand, use, manipulate and extend the concepts taught in the course.<\/p>\n<p><strong>Due Dates:<\/strong><\/p>\n<p>1,3\u00a0\u00a0\u00a0 \u00a0 \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 Jan 30th<br \/>\n2,4,5,6,7\u00a0 \u00a0 \u00a0 \u00a0 \u00a0Feb 18th<br \/>\n8,9,10,11\u00a0 \u00a0 \u00a0 \u00a0 Mar 6th<br \/>\n12,13,14,15\u00a0 \u00a0 Mar 25th<br \/>\nS.\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0Mar 26th<br \/>\n16,17\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0Apr\u00a0 10th<br \/>\n18,19,20\u00a0 \u00a0 \u00a0 \u00a0 Apr 22nd<br \/>\n21,22\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 May 1st<br \/>\nF \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 May 2nd<\/p>\n\n\n\n<\/div>\n\n\n  <\/div><\/div>\n\n\n\n\n  \n    \n\n\n\n\n\n\n<div\n  class=\"cc--component-container cc--rich-text \"\n\n  \n  \n  \n  \n  \n  \n  >\n  <div class=\"c--component c--rich-text\"\n    \n      >\n\n 
   \n      \n<div class=\"f--field f--wysiwyg\">\n\n    \n  <h5><strong>Data Links of Interest<\/strong><\/h5>\n<ul>\n<li><a href=\"http:\/\/wonder.cdc.gov\/\">CDC On Line Data Bases<\/a><\/li>\n<li><a href=\"http:\/\/www.data.gov\/\">www.data.gov<\/a><\/li>\n<li><a href=\"http:\/\/sda.berkeley.edu\/\">http:\/\/sda.berkeley.edu\/<\/a><\/li>\n<li><a href=\"http:\/\/archive.ics.uci.edu\/ml\/datasets.html\">http:\/\/archive.ics.uci.edu\/ml\/datasets.html<\/a><\/li>\n<li><a href=\"http:\/\/www.bigdata-startups.com\/public-data\/\">http:\/\/www.bigdata-startups.com\/public-data\/<\/a><\/li>\n<li><a href=\"http:\/\/www.statsci.org\/datasets.html\">http:\/\/www.statsci.org\/datasets.html<\/a><\/li>\n<li><a href=\"http:\/\/www.pro-football-reference.com\/play-index\/play_finder.cgi\">http:\/\/www.pro-football-reference.com\/play-index\/play_finder.cgi<\/a><\/li>\n<li><a href=\"https:\/\/www.kaggle.com\/\">https:\/\/www.kaggle.com\/<\/a><\/li>\n<li><a href=\"http:\/\/ww2.coastal.edu\/kingw\/statistics\/R-tutorials\/logistic.html\">logistic regression tutorial<\/a><\/li>\n<\/ul>\n\n\n\n<\/div>\n\n\n  <\/div><\/div>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":370,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"footnotes":""},"class_list":["post-270","page","type-page","status-publish","hentry"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.1.1 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Math 308 - Statistical Inference and Data Analysis - Larry Goldstein<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" 
content=\"article\" \/>\n<meta property=\"og:title\" content=\"Math 308 - Statistical Inference and Data Analysis - Larry Goldstein\" \/>\n<meta property=\"og:url\" content=\"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/\" \/>\n<meta property=\"og:site_name\" content=\"Larry Goldstein\" \/>\n<meta property=\"article:modified_time\" content=\"2023-10-27T21:15:26+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/\",\"url\":\"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/\",\"name\":\"Math 308 - Statistical Inference and Data Analysis - Larry Goldstein\",\"isPartOf\":{\"@id\":\"https:\/\/dornsife.usc.edu\/larry-goldstein\/#website\"},\"datePublished\":\"2023-06-13T20:09:05+00:00\",\"dateModified\":\"2023-10-27T21:15:26+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/dornsife.usc.edu\/larry-goldstein\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Math 308 &#8211; Statistical Inference and Data Analysis\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/dornsife.usc.edu\/larry-goldstein\/#website\",\"url\":\"https:\/\/dornsife.usc.edu\/larry-goldstein\/\",\"name\":\"Larry 
Goldstein\",\"description\":\"USC Dornsife Larry Goldstein\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/dornsife.usc.edu\/larry-goldstein\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Math 308 - Statistical Inference and Data Analysis - Larry Goldstein","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/","og_locale":"en_US","og_type":"article","og_title":"Math 308 - Statistical Inference and Data Analysis - Larry Goldstein","og_url":"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/","og_site_name":"Larry Goldstein","article_modified_time":"2023-10-27T21:15:26+00:00","twitter_card":"summary_large_image","schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/","url":"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/","name":"Math 308 - Statistical Inference and Data Analysis - Larry 
Goldstein","isPartOf":{"@id":"https:\/\/dornsife.usc.edu\/larry-goldstein\/#website"},"datePublished":"2023-06-13T20:09:05+00:00","dateModified":"2023-10-27T21:15:26+00:00","breadcrumb":{"@id":"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/dornsife.usc.edu\/larry-goldstein\/math-308-statistical-inference-and-data-analysis\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/dornsife.usc.edu\/larry-goldstein\/"},{"@type":"ListItem","position":2,"name":"Math 308 &#8211; Statistical Inference and Data Analysis"}]},{"@type":"WebSite","@id":"https:\/\/dornsife.usc.edu\/larry-goldstein\/#website","url":"https:\/\/dornsife.usc.edu\/larry-goldstein\/","name":"Larry Goldstein","description":"USC Dornsife Larry 
Goldstein","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/dornsife.usc.edu\/larry-goldstein\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/dornsife.usc.edu\/larry-goldstein\/wp-json\/wp\/v2\/pages\/270","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dornsife.usc.edu\/larry-goldstein\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/dornsife.usc.edu\/larry-goldstein\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/dornsife.usc.edu\/larry-goldstein\/wp-json\/wp\/v2\/users\/370"}],"replies":[{"embeddable":true,"href":"https:\/\/dornsife.usc.edu\/larry-goldstein\/wp-json\/wp\/v2\/comments?post=270"}],"version-history":[{"count":4,"href":"https:\/\/dornsife.usc.edu\/larry-goldstein\/wp-json\/wp\/v2\/pages\/270\/revisions"}],"predecessor-version":[{"id":746,"href":"https:\/\/dornsife.usc.edu\/larry-goldstein\/wp-json\/wp\/v2\/pages\/270\/revisions\/746"}],"wp:attachment":[{"href":"https:\/\/dornsife.usc.edu\/larry-goldstein\/wp-json\/wp\/v2\/media?parent=270"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}