# Sampling Distribution Mean and Standard Error Questions

### Question Description

Describe the shape of the sampling distribution of the sample mean and find its mean and standard error.

https://miamioh.instructure.com/media_objects_iframe/m-5oyyxPQDS9dSYKQtjG9VcYZVfNFFJ3iR?type=video?type=video

https://miamioh.instructure.com/media_objects_iframe/m-5EANnVAz9DeTyFBp4jYxmxC1A5EFb6Sn?type=video?type=video

#### Sampling Distributions and Large Sample Estimation

Department of Statistics MiamiOH.edu/cas @CASMiamiOH

#### What We Will Cover

- Sampling Distributions
- The Central Limit Theorem
- Sample Statistics and Their Distributions
  - Sample Mean, $\bar{x}$
  - Sample Proportion, $\hat{p}$
- Large Sample Estimation

#### Sampling Distributions

- Numerical descriptive measures calculated from the sample are called statistics.
  - Example: sample mean ($\bar{x}$) and sample proportion ($\hat{p}$).
- Statistics are random variables because they vary from sample to sample.
- The probability distributions for statistics are called sampling distributions.
- In repeated sampling, they tell us what values of the statistics can occur and how often each value occurs.

#### Sampling Distributions, Continued

Sampling distributions for statistics can be

- approximated with simulation techniques
- derived using mathematical theorems
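The idea of approximating a sampling distribution by simulation can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the exponential population with mean 2, the sample size of 40, the 5,000 repetitions, and the helper name `simulate_sample_means` are all assumptions chosen for the example.

```python
import random
import statistics

random.seed(1)  # fixed seed so repeated runs give the same draws


def simulate_sample_means(draw, n, reps):
    """Return `reps` sample means, each computed from `n` independent calls to `draw()`."""
    return [statistics.fmean([draw() for _ in range(n)]) for _ in range(reps)]


# Hypothetical skewed population: exponential with mean 2 (its standard deviation is also 2)
POPULATION_MEAN = 2.0


def draw_one():
    return random.expovariate(1 / POPULATION_MEAN)


means = simulate_sample_means(draw_one, n=40, reps=5000)

# Summaries of the simulated sampling distribution of the sample mean
center = statistics.fmean(means)   # where the simulated sample means are centered
spread = statistics.stdev(means)   # how much they vary from sample to sample

print(f"center of simulated means: {center:.3f}")
print(f"spread of simulated means: {spread:.3f}")
```

A histogram of `means` would show the shape of the approximated sampling distribution; the two summaries describe its center and spread.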

The Central Limit Theorem (CLT) is one such theorem.

Central Limit Theorem (CLT): If random samples of $n$ observations are drawn from a population with any underlying distribution that has a finite mean $\mu$ and standard deviation $\sigma$, then, when $n$ is large, the sampling distribution of the sample mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. Notice that the approximation becomes more accurate as $n$ becomes large.

#### Importance of CLT

- The Central Limit Theorem also implies that the sum of $n$ measurements is approximately normal with mean $n\mu$ and standard deviation $\sigma\sqrt{n}$:

  $$E[n\bar{x}] = n\mu \quad\text{and}\quad \mathrm{Var}[n\bar{x}] = n^2\,\frac{\sigma^2}{n} = n\sigma^2 \;\Rightarrow\; \text{Std. Dev} = \sqrt{n\sigma^2} = \sigma\sqrt{n}$$

- Many statistics that are used for statistical inference are sums or averages of sample measurements.

- When $n$ is large, these statistics will have approximately normal distributions.
- This will allow us to describe their behavior and evaluate the reliability of our inferences.

#### Relationship of CLT and the Sample Size

- If the sampled population is normal, then the sampling distribution of $\bar{x}$ will also be normal, no matter what the sample size is.
- When the sampled population is approximately symmetric, the sampling distribution of $\bar{x}$ becomes approximately normal even for relatively small values of $n$.
- When the sampled population is skewed, the sample size must be at least 30 before the sampling distribution of $\bar{x}$ becomes approximately normal.

#### Sampling Distribution of the Sample Mean

- A random sample of size $n$ is selected from a population with mean $\mu$ and standard deviation $\sigma$.
- The sampling distribution of the sample mean $\bar{x}$ will have mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
- If the original population is normal, the sampling distribution will be normal for any sample size.
- If the original population is non-normal, the sampling distribution will be approximately normal when $n$ is large.

The standard deviation of $\bar{x}$ is sometimes called the Standard Error (SE).

#### Probabilities for the Sample Mean

- If the sampling distribution of $\bar{x}$ is normal or approximately normal, standardize or rescale the interval of interest in terms of

  $$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

- Find the appropriate area using a Z-score table.

Example: A random sample of size $n = 16$ is drawn from a normal distribution with $\mu = 10$ and $\sigma = 8$.

$$P(\bar{x} > 12) = P\left(Z > \frac{12 - 10}{8/\sqrt{16}}\right) = P(Z > 1) = 1 - 0.8413 = 0.1587$$

#### Applying CLT with Sample Mean

Example: The duration of Alzheimer’s disease from the time symptoms first appear until death ranges from 3 to 20 years; the average is 8 years with a standard deviation of 4 years.

The administrator of a large medical center randomly selects the medical records of 30 deceased Alzheimer’s patients from the medical center’s database and records the duration of the disease. Find the approximate probabilities for these events:

1. The average duration is less than 7 years.
2. The average duration exceeds 7 years.
3. The average duration lies within 1 year of the population mean $\mu = 8$.

Since the administrator has selected a random sample of 30 files from the database, we can use the CLT to draw conclusions about the population.

#### Applying CLT with Sample Mean, Example Continued

Sampling distribution of $\bar{x}$: approximately normal with mean $\mu = 8$ and standard deviation $\sigma/\sqrt{n} = 4/\sqrt{30} = 0.73$. This is ensured by the CLT with a sample size of $n = 30$.
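The three requested probabilities can be cross-checked numerically from the sampling distribution just described. The sketch below is an illustration added here, not part of the slides; it uses Python's standard-library `statistics.NormalDist`, and the helper name `sample_mean_dist` is an assumption. Because it does not round $z$ to two decimals, its results can differ from Z-table lookups in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def sample_mean_dist(mu, sigma, n):
    """Normal model for the sample mean: mean mu, standard error sigma / sqrt(n)."""
    return NormalDist(mu, sigma / sqrt(n))


# Alzheimer's example: mu = 8, sigma = 4, n = 30 (values stated in the example)
xbar_dist = sample_mean_dist(8, 4, 30)
se = xbar_dist.stdev

p_less_7 = xbar_dist.cdf(7)                        # event 1: average below 7 years
p_more_7 = 1 - p_less_7                            # event 2: complement rule
p_within_1 = xbar_dist.cdf(9) - xbar_dist.cdf(7)   # event 3: between 7 and 9 years

# Earlier example: n = 16 from a normal population with mu = 10, sigma = 8
p_example = 1 - sample_mean_dist(10, 8, 16).cdf(12)

print(f"SE = {se:.4f}")
print(f"P(xbar < 7) = {p_less_7:.4f}")
print(f"P(xbar > 7) = {p_more_7:.4f}")
print(f"P(7 < xbar < 9) = {p_within_1:.4f}")
print(f"P(xbar > 12), n = 16 example = {p_example:.4f}")
```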

1. The average duration is less than 7 years.

   $$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{7 - 8}{0.73} = -1.37$$

   Using the Z-score table, $P(\bar{x} < 7) = P(Z < -1.37) = 0.0853$.

2. The average duration exceeds 7 years. Using the complement rule,

   $$P(\bar{x} > 7) = 1 - P(\bar{x} \le 7) = 1 - 0.0853 = 0.9147$$

3. The average duration lies within 1 year of the population mean $\mu = 8$.

   $$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{9 - 8}{0.73} = 1.37$$

   Using the Z-score table and the results from 1,

   $$P(7 < \bar{x} < 9) = P(-1.37 < Z < 1.37) = 0.9147 - 0.0853 = 0.8294$$

#### CLT and the Binomial Random Variable

- The Central Limit Theorem can be used to conclude that the binomial random variable $X$ is approximately normal when $n$ is large, with mean $np$ and standard deviation $\sqrt{npq}$.
- The sample proportion, $\hat{p} = x/n$, is simply a rescaling of the binomial random variable $X$, dividing it by $n$.
- From the Central Limit Theorem, the sampling distribution of $\hat{p}$ will also be approximately normal, with a rescaled mean and standard deviation.

Remember to check the assumptions of the normal approximation of the binomial distribution: $np > 5$ and $nq > 5$.

#### Sampling Distribution of the Sample Proportion

- A random sample of size $n$ is selected from a population that follows the binomial distribution with parameter $p$.

- The sampling distribution of the sample proportion, $\hat{p} = x/n$, will have mean $p$ and standard deviation $\sqrt{pq/n}$.
- If $n$ is large, and $p$ is not too close to zero or one, the sampling distribution of $\hat{p}$ will be approximately normal.

The standard deviation of $\hat{p}$ is sometimes called the Standard Error (SE).

#### Probabilities for the Sample Proportion

- If the sampling distribution of $\hat{p}$ is normal or approximately normal, standardize or rescale the interval of interest in terms of

  $$z = \frac{\hat{p} - p}{\sqrt{pq/n}}$$

- Find the appropriate area using a Z-score table.

Example: The soda bottler claims that only 5% of the soda cans are underfilled. A quality control technician randomly samples 200 cans. What is the probability that more than 10% of the cans are underfilled?

We have $S$: proportion of underfilled cans, with $n = 200$, $p = 0.05$, and $q = 0.95$. Since $np = 10 > 5$ and $nq = 190 > 5$, the assumptions are met.

$$P(\hat{p} > 0.1) = P\left(Z > \frac{0.1 - 0.05}{\sqrt{0.05(0.95)/200}}\right) = P(Z > 3.24) = 1 - 0.9994 = 0.0006$$

This would be very unusual, if indeed $p = 0.05$!

#### Applying CLT with Sample Proportion

Example: A random sample of 500 parents were surveyed about the importance of sports for boys and girls. Of the parents interviewed, 60% agreed that boys and girls should have equal opportunities to participate in sports. Suppose someone claims the true proportion of parents in the population is actually equal to 55%.

Since the 500 parents were randomly selected and $np = 500 \cdot 0.55 = 275 \gg 5$ and $nq = 500 \cdot 0.45 = 225 \gg 5$, we can use the CLT to draw conclusions about the population.
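The standardization for a sample proportion can be sketched as a small helper and applied both to the soda-can example and to the parent survey (observed $\hat{p} = 0.6$, claimed $p = 0.55$). This is an added illustration under stated assumptions: the function name `upper_tail_for_proportion` is hypothetical, and the assumption check simply encodes the rule $np > 5$ and $nq > 5$ given in the slides.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_for_proportion(p_hat, p, n):
    """Return (z, P(Z > z)) for a sample proportion under the normal approximation."""
    q = 1 - p
    # Rule of thumb from the slides for using the normal approximation
    if n * p <= 5 or n * q <= 5:
        raise ValueError("normal approximation not justified: need np > 5 and nq > 5")
    se = sqrt(p * q / n)          # standard error of p-hat
    z = (p_hat - p) / se          # standardized value
    return z, 1 - NormalDist().cdf(z)


# Soda cans: claimed p = 0.05, n = 200, interest in p-hat > 0.10
z_soda, tail_soda = upper_tail_for_proportion(0.10, 0.05, 200)

# Parent survey: claimed p = 0.55, n = 500, observed p-hat = 0.60
z_parents, tail_parents = upper_tail_for_proportion(0.60, 0.55, 500)

print(f"soda:    z = {z_soda:.2f}, upper tail = {tail_soda:.4f}")
print(f"parents: z = {z_parents:.2f}, upper tail = {tail_parents:.4f}")
```

Raising an error when the rule of thumb fails is one possible design choice; returning a warning instead would work equally well for exploratory use.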

What is the probability of observing a sample proportion as large as or larger than the observed value $\hat{p} = 0.6$?

$$z = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{0.6 - 0.55}{0.0222} = 2.25$$

Using the Z-score table, we find $P(\hat{p} > 0.6) \approx P(Z > 2.25) = 1 - 0.9878 = 0.0122$. Notice that an observation of 60% is a pretty rare event if the true proportion is 55%.

#### Large Sample Estimation: Introduction

- Populations are described by their probability distributions and parameters.
  - For quantitative populations, the location and shape are described by $\mu$ and $\sigma$.
  - For binomial populations, the location and shape are determined by $p$.
- If the values of parameters are unknown, we make inferences about them using sample information.

#### Large Sample Estimation: Types of Inferences

- Estimation:
  - Estimating or predicting the value of the parameter.

  - What is/are the most likely values of $\mu$ or $p$?
- Hypothesis Testing:
  - Making decisions about the value of a parameter based on some preconceived idea.
  - Did the sample come from a population with $\mu = 5$? Or similarly, a population with $p = 0.2$?

#### Large Sample Estimation: Inferences

Using Estimation

- Example: A consumer wants to estimate the average price of similar homes in their city before putting their house on the market.
- Estimation: They estimate $\mu$, the average home price, by using the sample mean.

Using Hypothesis Testing

- Example: A manufacturer wants to know if a new type of steel is more resistant to high temperatures than the old type.
- Hypothesis Test: Is the new average resistance, $\mu_{\text{new}}$, equal to the old average resistance, $\mu_{\text{old}}$?

#### Large Sample Estimation: Types of Inferences, Continued

Whether you are estimating parameters or testing a hypothesis, statistical methods are important because they provide:

- Methods for making inferences
- A numerical measure of the goodness or reliability of that inference

#### Large Sample Estimation: Estimators versus Estimation

- An estimator is a rule, usually a formula, that tells you how to calculate the estimate based on the sample.
  - Point Estimation: A single number is calculated to estimate the parameter.
  - Interval Estimation: Two numbers are calculated to create an interval within which the parameter is expected to lie.

#### Large Sample Estimation: Properties of Point Estimators

- Since an estimator is calculated from sample values, it varies from sample to sample according to its sampling distribution.
- An estimator is unbiased if the mean of its sampling distribution equals the parameter of interest.
- Of all the unbiased estimators, the preferred estimators are those whose sampling distributions have the smallest spread or variability.
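The two properties above, unbiasedness and small spread, can be illustrated with a short simulation comparing two estimators of a population mean: the sample mean and the sample median. This is an added sketch, not from the slides; the normal population with mean 50 and standard deviation 10, the sample size of 25, and the 4,000 repetitions are all hypothetical choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population and simulation design (assumptions for illustration)
MU, SIGMA, N, REPS = 50.0, 10.0, 25, 4000

means, medians = [], []
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    means.append(statistics.fmean(sample))      # estimator 1: sample mean
    medians.append(statistics.median(sample))   # estimator 2: sample median

# Centers of the two simulated sampling distributions (both should sit near MU)
center_of_means = statistics.fmean(means)
center_of_medians = statistics.fmean(medians)

# Spreads of the two simulated sampling distributions
spread_of_means = statistics.stdev(means)
spread_of_medians = statistics.stdev(medians)

print(f"sample mean:   center {center_of_means:.2f}, spread {spread_of_means:.2f}")
print(f"sample median: center {center_of_medians:.2f}, spread {spread_of_medians:.2f}")
```

For a symmetric population like this one, both estimators are centered near the parameter, so the comparison comes down to which sampling distribution has the smaller spread.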

#### Large Sample Estimation: Measuring the Goodness of an Estimator

- The distance between an estimate and the true value of the parameter is the error of estimation, like the distance between the arrow and the bullseye.
- In this section, the sample sizes are large, so that our unbiased estimators will have approximately normal distributions.
  - Recall: The Central Limit Theorem (CLT)

#### Large Sample Estimation: The Margin of Error

- For an unbiased estimator with a normal sampling distribution, 95% of all point estimates lie within 1.96 standard deviations of the parameter of interest.
- Margin of Error: The maximum error of estimation, calculated as $1.96 \cdot \text{SE}$ of the estimator.

#### Large Sample Estimation: Estimating Means and Proportions

- For a quantitative population:

  - Point estimator of the population mean $\mu$: $\bar{x}$
  - Margin of error ($n \ge 30$): $\pm 1.96\,\dfrac{s}{\sqrt{n}}$
- For a binomial population:
  - Point estimator of the population proportion $p$: $\hat{p} = \dfrac{x}{n}$
  - Margin of error ($n \ge 30$): $\pm 1.96\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$

#### Large Sample Estimation: Interval Estimation

- Create an interval $(a, b)$ so that you are fairly sure that the parameter lies between these two values.
- “Fairly sure” means with high probability, as measured by the confidence coefficient. Usually, $1 - \alpha = 0.90$, $0.95$, $0.98$, or $0.99$.
- Suppose $1 - \alpha = 0.95$ and that the estimator has a normal distribution: $\text{Estimator} \pm 1.96 \cdot \text{SE}$

#### Large Sample Estimation: Interval Estimation, Continued

- Since we don’t know the value of the parameter, consider the interval $\text{Estimator} \pm 1.96 \cdot \text{SE}$.
- Only if the estimate falls in the tail areas will the interval fail to enclose the parameter. This only happens 5% of the time.

#### Large Sample Estimation: Different Confidence Levels

- To change to a general confidence level, $1 - \alpha$, pick a value of $Z$ that puts area $1 - \alpha$ in the center of the $Z$ distribution.

| Tail Area ($\alpha/2$) | $Z_{\alpha/2}$ |
|---|---|
| 0.05 | 1.645 |
| 0.025 | 1.96 |
| 0.01 | 2.33 |
| 0.005 | 2.58 |

$100(1 - \alpha)\%$ confidence interval: $\text{Estimator} \pm Z_{\alpha/2} \cdot \text{SE}$.

#### Large Sample Estimation: Confidence Interval for Means and Proportions

- For a quantitative population, the confidence interval for a population mean $\mu$ is $\bar{x} \pm Z_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$.
- For a binomial population, the confidence interval for a population proportion $p$ is $\hat{p} \pm Z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$.

#### Large Sample Estimation: Estimating the Difference Between Two Means

- Sometimes we are interested in comparing the means of two populations.

- We define our random samples as follows:
  - Sample 1: a random sample of size $n_1$ drawn from population 1 with mean $\mu_1$ and variance $\sigma_1^2$
  - Sample 2: a random sample of size $n_2$ drawn from population 2 with mean $\mu_2$ and variance $\sigma_2^2$
- We compare the two averages by making inferences about $\mu_1 - \mu_2$, the difference in the two population averages.
  - If the two populations are the same, then $\mu_1 - \mu_2 = 0$.
  - The best estimate of $\mu_1 - \mu_2$ is the difference in the two sample means, $\bar{x}_1 - \bar{x}_2$.

#### Large Sample Estimation: The Sampling Distribution of $\bar{x}_1 - \bar{x}_2$

- The mean of $\bar{x}_1 - \bar{x}_2$ is $\mu_1 - \mu_2$, the difference in the population means.
- The standard deviation of $\bar{x}_1 - \bar{x}_2$ is $\text{SE} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.
- If the sample sizes are large, the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is approximately normal, and SE can be estimated as $\text{SE} = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.

#### Large Sample Estimation: Estimating $\mu_1 - \mu_2$

- For large samples, point estimates and their margin of error as well as confidence intervals are based on the standard normal ($Z$) distribution.
  - Point estimate for $\mu_1 - \mu_2$: $\bar{x}_1 - \bar{x}_2$
  - Margin of error: $\pm 1.96\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
  - Confidence interval for $\mu_1 - \mu_2$: $(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

If the confidence interval contains the value $\mu_1 - \mu_2 = 0$, then it is possible that $\mu_1 = \mu_2$, and you would not want to conclude that there is a difference in averages between the two populations.

#### Large Sample Estimation: $\bar{x}_1 - \bar{x}_2$ Example

The wearing qualities of two types of automobile tires were compared by road-testing samples of $n_1 = n_2 = 100$ tires for each type and recording the number of miles until wearout, defined as a specific amount of tire wear. (Results given in table.) Estimate $\mu_1 - \mu_2$, the difference in mean miles to wearout, using a 99% confidence interval. Is there a difference in the average wearing quality for the two types of tires?

| Tire 1 | Tire 2 |
|---|---|
| $\bar{x}_1 = 26{,}400$ miles | $\bar{x}_2 = 25{,}100$ miles |
| $s_1^2 = 1{,}440{,}000$ | $s_2^2 = 1{,}960{,}000$ |

Point estimate of $\mu_1 - \mu_2$: $\bar{x}_1 - \bar{x}_2 = 26{,}400 - 25{,}100 = 1{,}300$ miles.

Standard error of $\bar{x}_1 - \bar{x}_2$: $\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = \sqrt{\dfrac{1{,}440{,}000}{100} + \dfrac{1{,}960{,}000}{100}} = 184.4$ miles.

For a 99% confidence interval we have $1{,}300 \pm 2.58(184.4)$, that is, $824.2 < \mu_1 - \mu_2 < 1{,}775.8$.
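The interval for the tire example can be reproduced with a short sketch. This is an added illustration: the helper name `z_interval` is hypothetical, and $z = 2.58$ is the table value for a 99% confidence level used in these slides rather than a more precise quantile.

```python
from math import sqrt


def z_interval(estimate, se, z):
    """Large-sample interval: estimate plus or minus z times the standard error."""
    margin = z * se
    return estimate - margin, estimate + margin


# Tire data from the example
xbar1, xbar2 = 26_400, 25_100          # sample means (miles)
var1, var2 = 1_440_000, 1_960_000      # sample variances s1^2 and s2^2
n1 = n2 = 100                          # sample sizes

diff = xbar1 - xbar2                   # point estimate of mu1 - mu2
se_diff = sqrt(var1 / n1 + var2 / n2)  # estimated standard error of the difference
lower, upper = z_interval(diff, se_diff, z=2.58)

print(f"point estimate: {diff} miles")
print(f"standard error: {se_diff:.1f} miles")
print(f"99% interval:   ({lower:.1f}, {upper:.1f})")
```

The same helper applies to any estimator in this section, since each interval has the form estimator plus or minus $Z_{\alpha/2} \cdot \text{SE}$.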

The difference in the average miles to wearout for the two types of tires is estimated to lie between the lower confidence limit of 824.2 and the upper confidence limit of 1,775.8.

#### Large Sample Estimation: Estimating the Difference Between Two Proportions

- Sometimes we are interested in comparing the proportion of “successes” in two binomial populations.
- We define our random samples as follows:
  - Sample 1: a random sample of size $n_1$ drawn from binomial population 1 with parameter $p_1$
  - Sample 2: a random sample of size $n_2$ drawn from binomial population 2 with parameter $p_2$
- We compare the two proportions by making inferences about $p_1 - p_2$, the difference in the two population proportions.

  - If the two populations are the same, then $p_1 - p_2 = 0$.
  - The best estimate of $p_1 - p_2$ is the difference in the two sample proportions, $\hat{p}_1 - \hat{p}_2 = x_1/n_1 - x_2/n_2$.

#### Large Sample Estimation: The Sampling Distribution of $\hat{p}_1 - \hat{p}_2$

- The mean of the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is $p_1 - p_2$, the difference in the population proportions.
- The standard deviation of $\hat{p}_1 - \hat{p}_2$ is $\text{SE} = \sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$.
- If the sample sizes are large, the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately normal, and SE can be estimated as $\text{SE} = \sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$.

#### Large Sample Estimation: Estimating $p_1 - p_2$

- For large samples, point estimates and their margin of error as well as confidence intervals are based on the standard normal ($Z$) distribution.
  - Point estimator for $p_1 - p_2$: $\hat{p}_1 - \hat{p}_2$
  - Margin of error: $\pm 1.96\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$
  - Confidence interval for $p_1 - p_2$: $(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$

If the confidence interval contains the value $p_1 - p_2 = 0$, then it is possible that $p_1 = p_2$, and you would not want to conclude that there is a difference in proportions between the two populations.

#### Large Sample Estimation: $\hat{p}_1 - \hat{p}_2$ Example

| | Developing Section | Rest of City |
|---|---|---|
| Sample size, $n$ | 50 | 100 |
| Favoring | 38 | 65 |
| $\hat{p}$ | 0.76 | 0.65 |

A bond proposal for school construction is on the ballot at the next city election. Money from this bond issue will be used to build schools in a rapidly developing section of the city, and the remainder will be used to renovate and update school buildings in the rest of the city. Data from the random sample of residents is given above.

1. Estimate the difference in the true proportions favoring the bond proposal with a 99% confidence interval.
2. If both samples were pooled into one sample of size $n = 150$, with 103 in favor of the proposal, provide a point estimate of the proportion of city residents who will vote for the bond proposal. What is the margin of error?

#### Large Sample Estimation: $\hat{p}_1 - \hat{p}_2$ Example, Continued

1. Estimate the difference in the true proportions favoring the bond proposal with a 99% confidence interval.

   Point estimate of $p_1 - p_2$: $0.76 - 0.65 = 0.11$

   Standard error of $\hat{p}_1 - \hat{p}_2$: $\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}} =$ Co.
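The source text breaks off partway through this computation. As a hedged sketch only, and not the original solution, the formulas stated earlier for $\hat{p}_1 - \hat{p}_2$ and for a single proportion can be applied to the table data. The variable names are hypothetical, and $z = 2.58$ (99% interval) and $1.96$ (margin of error) are the table values used elsewhere in these slides.

```python
from math import sqrt

# Data from the table: developing section (sample 1) and rest of city (sample 2)
x1, n1 = 38, 50
x2, n2 = 65, 100

p1_hat, p2_hat = x1 / n1, x2 / n2
diff = p1_hat - p2_hat                          # point estimate of p1 - p2

# Estimated standard error of the difference, using the formula given above
se_diff = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

# Part 1: 99% interval, estimate plus or minus 2.58 * SE
lower = diff - 2.58 * se_diff
upper = diff + 2.58 * se_diff

# Part 2: pooled sample of 150 residents with 103 in favor
n_pooled, x_pooled = 150, 103
p_pooled = x_pooled / n_pooled                  # point estimate of the city-wide proportion
moe_pooled = 1.96 * sqrt(p_pooled * (1 - p_pooled) / n_pooled)  # margin of error

print(f"difference: {diff:.2f}, SE: {se_diff:.4f}")
print(f"99% interval: ({lower:.3f}, {upper:.3f})")
print(f"pooled estimate: {p_pooled:.3f}, margin of error: {moe_pooled:.3f}")
```

Whether the resulting interval contains zero can then be read against the rule stated earlier for concluding that two proportions differ.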
