{\rtf1\ansi\deff0{\fonttbl{\f0\fswiss\fprq2\fcharset0 Arial;}} {\colortbl ;\red0\green0\blue0;} {\*\generator Msftedit 5.41.15.1515;}\viewkind4\uc1\pard\cf1\lang1033\f0\fs18 22S:166 Computing in Statistics Name:\par Fall 2008\par Midterm 2\par \par I. At the beginning of the semester, we used the "wilcox.test" function in R to test whether the \par centers of two population distributions were equal. Although the Wilcoxon test is a nonparametric \par test, it has a very strong assumption: that the shapes of the two population distributions are the same. \par Let's see what happens to the size of the test if that assumption is violated. We'll do a single test first, \par and then a simulation study.\par \par a. Write a line of R code to draw a random sample of size 8 from a normal distribution with \par mean 2 and variance 1 and assign it to a vector called "mynorms." Paste the line of code here.\par \par b. Write a line of R code to draw a random sample of size 10 from an exponential distribution with \par mean 2 and assign it to a vector called "myexps." Paste the line of code here.\par \par c. Write a line of code to apply "wilcox.test" to these two samples to test the null hypothesis that the\par centers of the population distributions are the same. Paste the line of code here.\par \par d. Write R code to display the p-value of the test you carried out in part c. Paste the code and output \par here.\par \par e. Now carry out a simulation study to estimate the actual size of the Wilcoxon test with significance \par level 0.05 if one of the populations is Normal(2,1) and the other is Exponential with mean 2. Use \par S >= 1000 replicate datasets and the same two sample sizes as in parts a. and b. \par above. Paste all of your code, and the output showing the estimated size, here.\par \par f. 
Based on your results in part e., would you say that the Wilcoxon test has the correct size, \par is conservative, or is anticonservative, when the populations have different shapes? Type a \par one-sentence answer here.\par \par II. A politician wishes to create a database to store information about the states in the United States and\par their Representatives in Congress. Different states have different numbers of Representatives. One \par person can be a Representative for only one state at a time. Here are the variables the politician wishes \par to store.\par \par state name\par state abbreviation\par capital city\par population at time of last census\par representative name\par representative political party\par year representative was elected\par representative's district\par \par Define one or more relational tables in which the politician can store these data in third normal form.\par Identify all primary and foreign keys. Type your answer here.\par \par III. The Spearman correlation coefficient is a nonparametric estimate of the association between two\par quantitative variables that may not have a bivariate normal distribution. Given two paired samples,\par x and y, the following R code obtains the sample Spearman correlation coefficient:\par \par cor.test(x, y, method = "spearman")$estimate\par \par a. Read the BaP dataset from the course web page into an R data frame, and compute the Spearman\par correlation coefficient. Paste R code and output here.\par \par b. Use the jackknife to calculate a bias-corrected estimate of the Spearman correlation coefficient and\par its standard error, based on these data. Paste your R code and output here.\par \par \par \par }