The Food service Industry Survey Design Analysis

Description

Design and analyse a survey. Data of the survey can be found online. The survey should not be very hard.

Link for database

https://www.kaggle.com/banaveenkumar/vancouver-restaurent-dataset

Explanation & Answer length: 4 pages

Group Project

Introduction

Discrepancies in wages are often a source of dissatisfaction in the workplace. If an employee in one city does the same job as an employee in another city, it is often argued that the two should earn the same wage. In 2014, schoolteachers in British Columbia held one of the longest strikes in recent history, to the point that final exams were cancelled and schools around the province started the 2014-2015 academic year late. The strike revolved not only around the usual wage dispute but also around the unevenness of average teacher salaries across the province's school districts.

The teachers' union demanded that teachers earn approximately the same salary irrespective of the district, city or school they teach in, especially among those with similar qualifications and educational background. For this project we therefore use different sampling methods to estimate the average salary of teachers in BC from a sample of the data set. As we are unable to account for the annual salaries of all teachers in BC, we sample across the districts of BC and use the samples to estimate the average annual salary of a schoolteacher in British Columbia. The dataset contains the average teacher salary for each of the 60 school districts in BC, Canada. The salaries are those of Category Type 4 teachers (Bachelor's degree) as of September 1, 2014. The dataset also includes the number of schools per district.

The scope of this project is to use different sampling methods and compare which of them comes closest to the actual average annual salary of Category Type 4 teachers across BC.

Methods

For this project, both random sampling with replacement and simple random sampling (SRS) without replacement are used. Sampling with replacement means that n independent draws are taken from the population. Each simulation was run b = 10,000 times to obtain the simulated distribution of each estimator. Under both designs, a sample of n = 20 is taken from the population of N = 60 districts. A total of 5 estimation methods were used to estimate the average annual salary of Category Type 4 teachers in BC.

1. Simple Random Sampling Without Replacement (SRS)

Sample values are not independent of one another, because a district that has already been drawn cannot be drawn again.

2. Unequal Probability Sampling
a. PPS (Sampling With Replacement)
b. Hansen-Hurwitz Estimator (With Replacement)

Using the additional information on the number of schools per district, an unequal probability design is used because districts differ in their number of schools. A sample is selected with replacement, with probability proportional to the number of schools per district, so a unit may be selected on more than one draw. The estimators use a unit's value as many times as it is selected.

3. Stratified Sampling

The stratified design uses SRS within each stratum, with selections made independently between strata. For this project, we decided to use three strata. They were obtained by sorting the average salaries in ascending order and then computing the salary range for each stratum. We excluded the maximum and minimum values since they were outliers. To obtain the interval width per stratum, we subtracted the second smallest average salary from the second largest and divided the difference by three.
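The interval rule just described (drop the two extremes, then split the remaining range into three equal widths) can be sketched in R as follows. This is a minimal illustration and not the authors' code: the helper name `stratum_breaks` and the toy salary vector are assumptions made up for the example.

```r
# Hypothetical helper: equal-width stratum boundaries computed after
# setting aside the single smallest and single largest value.
stratum_breaks <- function(y, k = 3) {
  ys <- sort(y)
  lo <- ys[2]                  # second smallest value
  hi <- ys[length(ys) - 1]     # second largest value
  width <- (hi - lo) / k       # common width of each stratum
  lo + width * (0:k)           # k + 1 boundary points
}

# Toy salaries (invented for illustration, not the BC data)
toy <- c(40000, 50000, 52000, 53000, 55000, 56000, 70000)
breaks <- stratum_breaks(toy)
print(breaks)

# Assign every value to a stratum using the two interior boundaries;
# the excluded extremes fall into the first and last stratum.
stratum <- findInterval(toy, breaks[2:3]) + 1
print(table(stratum))
```

With the second smallest and second largest salaries listed in Appendix A (53156.0 and 57726.2), the same rule gives a width of (57726.2 - 53156.0) / 3 = 1523.4 and interior boundaries near 54679 and 56203, which is consistent with the stratum intervals the report lists.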
The stratum intervals were (47919.8~54626.89), (54914.3~55614.8) and (56249.4~62412). We determined the sample size for each stratum by the optimal allocation method, weighting each stratum by its size times its variance as computed in Appendix A, which gave n1 = 11, n2 = 1 and n3 = 8.

4. Sampling Using Ratio Estimation

Using the number of schools in each district as an auxiliary variable, ratio estimation is used to estimate the average annual salary of Category Type 4 teachers across BC. The auxiliary information is used both in the estimation and in the design.

Results

Comparing Results

Method/Results   SRS        PPS           HH            Stratified    Ratio
Bias             6.904821   -374.246866   -70.082635    1021.050994   1547.610035
MSE              124554.8   237228.8      142803731.5   1063405.4     88270419.2
Sample Mean      54338.33   53957.17      54261.34      55352.47      55879.03

Based on the results above, SRS has the smallest bias and ratio estimation the largest, relative to the actual average salary of Category Type 4 teachers in BC (54331.42). The mean squared error values again favour SRS, which has the lowest MSE, whereas the HH estimator has the highest. SRS also produces the estimate closest to the actual average annual salary, followed by the HH estimator. Comparing the two unequal probability methods, PPS has the lower MSE; however, the HH estimator comes closer on average to the actual average annual salary. The sample mean under the PPS design is not unbiased for the population mean, which is why the Hansen-Hurwitz estimator is used for comparison under this unequal probability design. The larger MSE values can also be explained by looking at the histograms (see Appendix B) of the simulated estimates for each estimation method.
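One way to read these figures is through the identity MSE = bias^2 + variance, which separates systematic error from sampling spread. The sketch below applies it to the reported values. The variable names are illustrative, and the implied standard deviations are derived here rather than taken from the original output, so they hold only to the extent that the simulation computed MSE around the true mean in the usual way.

```r
# Reported simulation summaries (copied from the results above)
method <- c("SRS", "PPS", "HH", "Stratified", "Ratio")
bias <- c(6.904821, -374.246866, -70.082635, 1021.050994, 1547.610035)
mse  <- c(124554.8, 237228.8, 142803731.5, 1063405.4, 88270419.2)

# MSE = bias^2 + variance, so the variance is what remains
variance <- mse - bias^2
implied_sd <- sqrt(variance)

# Share of each MSE that comes from squared bias
bias_share <- bias^2 / mse

summary_table <- data.frame(method, bias, mse,
                            implied_sd = round(implied_sd, 1),
                            bias_share = round(bias_share, 3))
print(summary_table)
```

Under this reading, the MSE of the HH and ratio estimators is almost entirely variance, which matches the wide histogram range reported for HH, while nearly all of the stratified estimator's MSE comes from its bias.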
The range of average salary in the histogram for HH (20000~100000) explains its large MSE, whereas the SRS method has the smallest range of estimated salaries (53000~55000). In terms of distribution, all the histograms are approximately normal, with those for the SRS and PPS methods closest to a symmetric bell curve. The histogram for the ratio estimation method was not included because of the small values generated for the range, although its distribution also appeared to be normal.

Discussion

In terms of what the results mean for the real situation, although SRS produces the estimate closest to the actual average annual salary, the study is more complicated because the province of British Columbia is divided into 3 regions and we conducted the analysis by region. With the province split into 3 regions, it is quite possible that regions with a population of higher economic status (urban areas) have higher-paid schoolteachers than rural areas. Furthermore, some teachers in urban areas may earn a higher salary because they teach in private schools, whereas most schools in rural areas are public, which could produce a major difference in the average annual salaries of teachers. Relating the results to the teachers' strike of 2014, the difference in average annual salary across the province may be explained by the rural/urban difference in economic status. However, compared with the average annual salary of a person living in British Columbia, the average Category Type 4 teacher earns considerably more.
Statistics Canada estimated that, as of December 2014, the average salary in the province of British Columbia was $46,900. Comparing this value with the sample average annual salaries produced here, ranging from approximately $53,000 to $56,000, may indicate that the strike was less about earning enough to survive and more about class sizes or the average number of students taught, which might also serve as a better auxiliary variable for estimating the salary of a schoolteacher in BC.

Appendix A: R Codes & Results

> #****************************************************
> # STAT 410 Project
> # April 13, 2015
> # Miguel Luis Valdez, Li Yong Xian Adrian De Leon, Sim Hong Jiang
> #****************************************************
> ### Data Set Caveat:
> # The dataset contains the teacher's average salaries for the 60 school districts in British Columbia, Canada.
> # Data was obtained using the average salaries of Category type 4 teachers (Bachelor's Degree) as of
> # September 1,2014. The dataset also includes the number of schools per district.
> #****************************************************
> # Read in data
> school.data school.data
   District Avg.Salary Number.of.Schools Stratum
1         5   54563.90                22       1
2         6   47919.80                20       1
3        19   53162.00                 4       1
4        22   54412.30                27       1
5        23   53162.00                44       1
6        27   53162.00                30       1
7        33   53156.00                31       1
8        34   53162.00                46       1
9        35   53162.00                44       1
10       36   53162.00               123       1
11       37   53162.00                31       1
12       38   53162.00                53       1
13       40   53199.80                13       1
14       41   53162.00                49       1
15       42   53162.00                31       1
16       43   53162.00                67       1
17       44   53315.00                38       1
18       45   53162.00                17       1
19       46   53162.00                14       1
20       47   53853.67                 9       1
21       48   54412.30                16       1
22       51   54415.90                11       1
23       53   53162.00                10       1
24       54   53637.80                 9       1
25       57   53162.00                55       1
26       62   54434.00                27       1
27       63   53472.80                18       1
28       64   53162.00                10       1
29       67   53162.00                20       1
30       68   54626.89                48       1
31       69   53162.00                17       1
32       70   53248.80                15       1
33       71   53162.00                23       1
34       72   54412.30                24       1
35       74   54508.10                15       1
36       75   53162.00                24       1
37       78   53162.00                10       1
38       79   53162.00                20       1
39       82   53702.60                23       1
40       83   53162.00                32       1
41       93   54086.00                38       1
42        8   54916.10                23       2
43       20   55032.30                12       2
44       28   54963.50                18       2
45       39   54916.50               119       2
46       58   55614.80                13       2
47       61   54914.30                56       2
48       84   55055.50                 5       2
49       85   55081.00                11       2
50       91   55137.50                23       2
51       10   56249.40                 5       3
52       49   57726.20                 4       3
53       50   57524.40                 8       3
54       52   56575.20                 9       3
55       59   56988.63                23       3
56       60   56448.80                21       3
57       73   57007.63                51       3
58       81   57534.20                 5       3
59       87   62412.00                 4       3
60       92   57683.30                 4       3
>
> school.data$District
> # Relationship of average salary and number of schools per district
> plot(school.data$Number.of.Schools,school.data$Avg.Salary)
>
> # Defining values
> y x N b
> # Population Parameters
> var mu
>
> ## Methodology
> ####### SRS without replacement #######
> ybar_SRS # For n = 20
> n for (k in 1:b){
+ s par(mfrow=c(2,2))
> hist(ybar_SRS)
> bias_SRS mse_SRS mean_SRS
> bias_SRS
[1] 6.904821
> mse_SRS
[1] 124554.8
> mean_SRS
[1] 54338.33
>
> ####### Unequal Probability Sampling #######
> # PPS (Sampling with Replacement)
> p p
 [1]  22  20   4  27  44  30  31  46  44 123  31  53  13  49  31  67  38  17  14   9  16
[22]  11  10   9  55  27  18  10  20  48  17  15  23  24  15  24  10  20  23  32  38  23
[43]  12  18 119  13  56   5  11  23   5   4   8   9  23  21  51   5   4   4
> sum(p)
[1] 1592
> p p
 [1] 0.013819095 0.012562814 0.002512563 0.016959799 0.027638191 0.018844221 0.019472362
 [8] 0.028894472 0.027638191 0.077261307 0.019472362 0.033291457 0.008165829 0.030778894
[15] 0.019472362 0.042085427 0.023869347 0.010678392 0.008793970 0.005653266 0.010050251
[22] 0.006909548 0.006281407 0.005653266 0.034547739 0.016959799 0.011306533 0.006281407
[29] 0.012562814 0.030150754 0.010678392 0.009422111 0.014447236 0.015075377 0.009422111
[36] 0.015075377 0.006281407 0.012562814 0.014447236 0.020100503 0.023869347 0.014447236
[43] 0.007537688 0.011306533 0.074748744 0.008165829 0.035175879 0.003140704 0.006909548
[50] 0.014447236 0.003140704 0.002512563 0.005025126 0.005653266 0.014447236 0.013190955
[57] 0.032035176 0.003140704 0.002512563 0.002512563
> sum(p)
[1] 1
>
> # Select a sample with probability proportional to number of schools per district,
> # with replacement.
> n s s
 [1]  1  8 17 18 12 31  9 17 39 16  7 17 25 43 29 44  8 38 34 14
> y[s]
 [1] 54563.9 53162.0 53315.0 53162.0 53162.0 53162.0 53162.0 53315.0 53702.6 53162.0
[11] 53156.0 53315.0 53162.0 55032.3 53162.0 54963.5 53162.0 53162.0 54412.3 53162.0
> mean(y[s])
[1] 53527.88
>
> # Simulate 10000 times
> ybar_pps for (k in 1:b) {
+ s mean_pps bias_pps mse_pps
> bias_pps
[1] -374.2469
> mse_pps
[1] 237228.8
> mean_pps
[1] 53957.17
>
> # Sample mean is not unbiased for the population mean
> # with the unequal probability design
> # Hence, use Hansen-Hurwitz Estimator:
> hh hh for (k in 1:b){
+ s hist(hh)
> mean_hh bias_hh mse_hh
> bias_hh
[1] -70.08263
> mse_hh
[1] 142803731
> mean_hh
[1] 54261.34
>
>
> ####### Stratified Sampling #######
> # Stratified into 3 stratum
> # First order average salary in ascending order
> # then grouped into three stratum by stratification principles
> # exclude the max and min (outliers), then use (second largest – second smallest)/3.
> # Stratum separations: 47919.8~54626.89, 54914.3~55614.8,56249.4~62412
>
> st table(st)
st
 1  2  3
41  9 10
>
> tapply(y,st,mean)
       1        2        3
53368.39 55070.17 57614.98
> tapply(y,st,var)
         1          2          3
1037930.91   48243.47 3123364.87
>
> y1 y2 y3
> # Population statistic
> var1 = var(y1)
> var2 = var(y2)
> var3 = var(y3)
>
> n1 n2 n3 N1 N2 N3
> # Sample size
> n = 20
> # Optimal allocation:
> n*N1*var1 / (N1*var1 +N2*var2 + N3*var3)
[1] 11.46684
> # n1=11
> n*N2*var2 / (N1*var1 +N2*var2 + N3*var3)
[1] 0.1169964
> # n2=1
> n*N3*var3 / (N1*var1 +N2*var2 + N3*var3)
[1] 8.416164
> # n3=8
>
> # Stratified sampling uses SRS for each stratum
> must b = 10000
> for (k in 1:b){
+ s1
> ####### Ratio estimation #######
> # Reading in data
> mux
> # Simulation
> muhatr r for(k in 1:10000){
+ s bias_re
[1] 1547.61
> mse_re
[1] 88270419
> mean_re
[1] 55879.03
>
> # Results
> Bias MSE Mean
> Bias
[1]    6.904821 -374.246866  -70.082635 1021.050994 1547.610035
> MSE
[1]    124554.8    237228.8 142803731.5   1063405.4  88270419.2
> Mean
[1] 54338.33 53957.17 54261.34 55352.47 55879.03
> mu
[1] 54331.42

Appendix B – Plots & Histogram

Project

The class project will be to design and analyse a survey. Groups may collect their own survey data or analyse existing data. The project will be worth 25% of the overall grade. A 5-10 page report on your project will be due at midnight on April 18. The assessment will include quality of the writing and logic of the arguments. You may work in groups of up to three people. Groups of more than 3 people should break into subgroups working on separate but related projects. For example, if you have 4 people who want to work together, break yourselves into two groups of two to work on related topics that synergize with each other.

A note on project ideas

Many people prefer to design their own survey and go through the entire process of conducting a study. (Then they retain control.) The survey doesn't have to have people as units.
For example, you can do a survey of tins of pineapple chunks purchased at your local grocery store to check the volume of fruit (by fluid displacement). Some people have used Google Maps to survey features in neighborhoods (e.g. parked cars). Others have done surveys of flowers such as crocuses blooming in the front yards of residential neighborhoods (near where they live, so a chance to exercise outside). Other people have taken a more sedentary approach and used websites such as rew.ca to sample house prices in neighborhoods. Other possibilities include Craigslist for sampling car prices, indeed.com for jobs, trovit.ca for apartments. As you can see there are lots of possibilities. That being said, you may also use existing data collected for another purpose, but you will then need to take more care to understand and acknowledge any selection bias or other limitations for your own study when writing up your report.

Hi everyone,

I enjoyed the lightning talks a great deal! The project proposals were diverse, interesting and original! Although the project proposals were quite varied, it seems like they can be grouped into two types. One type is projects that are using data from an existing survey in a new way (e.g. different research questions). The other type is projects from people who are designing their own survey and collecting their own data. I had a few general comments for the reports associated with each type of project.

If you are using someone else's survey data, please be careful to thoroughly describe the design for their survey and why you think it is suited to your own (different) research questions. Show that you understand how they did their survey and that you understand its limitations for your research questions. Make it clear that your research questions differ from those in the original survey. Also:

1. Specify the target population of the original survey.
2. Describe the sampling frame of the original survey.
3. Identify the sampling unit and the observation unit of the original survey.
4. State the potential sources of bias for your research questions.
5. Be up front about any limitations of the design of the original survey for reaching conclusions about your research questions.

If you've designed your own survey (and collected your own data), please also thoroughly describe your design and why you think it works for your research questions. In particular, mention:

1. What is your target population?
2. What is your sampling frame? (If applicable, how did you get it? Sometimes this is challenging.)
3. What is your sampling unit? Your observation unit?
4. What are potential sources of bias for your research questions?
5. Be up front about any potential limitations of your study design for reaching conclusions about your research questions.

Thanks and looking forward to your reports.

Hi everyone,

A couple of people have asked after class about the report formatting for projects. I'd suggest setting up the report so that the title page has your project title, a list of all the authors/contributors (i.e. your group), and a short abstract summarizing the findings of your study. Following the title page, you can set it up as a research paper with sections such as:

• Background and Introduction (including study questions and briefly why you're interested in them)
• Methods (including your survey design)
• Results (i.e. summaries of your data, results of your analyses)
• Discussion and Conclusion (including any anticipated shortcomings in your study design and their impact on your conclusions about the questions of interest)
• References/Bibliography
• Appendix (if necessary)

No study is ever perfect and there are bound to be limitations that you identify along the way. In the Discussion section, be sure to show that you recognize these limitations and have thought through their potential impact on your study conclusions. The main text of the report (i.e. Backgroun…
