# Relationship Between Preparation Time And Student Marks In A Statistics Class

## Survey and Sampling Methods

Many Holmes Institute instructors believe that students need to spend at least 2 hours studying outside of class for every hour of lecture. They believe that the number of hours students spend preparing for the exam significantly affects their marks. In contrast, a few lecturers believe that the number of preparation hours does not materially affect students’ marks and that other factors should be considered. To study the relationship between the preparation time spent by each student (in hours) for the exam and the reported mark, a sample of 100 students was selected randomly from a large statistics class. The data are stored in the file named “ASSIGNMENTDATA” on the course website. Answer the 9 questions below:

Cross-sectional survey: in this type of survey the researcher collects data from the respondents at a single point in time.

Simple random sampling could be used. This method gives every participant an equal chance of being included in the study and so reduces the chance of selection bias.

1. On the basis of given data, determine the dependent and independent variables we should use, and why? Also, identify the data type(s) for each variable.

The dependent variable is the student’s mark and the independent variable is the number of hours the student spends preparing for the exam. Preparation time is believed to influence the mark, so it is the independent variable, while the mark, which responds to it, is the dependent variable.

• Non-response from some of the participants. Some participants might not be willing to respond for their own reasons.
• High cost of collecting data; one challenge would be in regard to the cost if the participants are widely spread apart.

Using 8 classes and intervals of 20 – 30, 30 – 40, etc. for both of the variables selected in question 3, develop a distribution table including class intervals, frequency, relative frequency and cumulative relative frequency for each variable. Then, draw a frequency histogram, a relative frequency histogram and a cumulative relative frequency histogram for each variable. Also, comment on the shape of the frequency histogram for each variable and provide reason(s) for your comment.

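| Class Interval | Frequency | Relative Frequency | Cumulative Relative Frequency |
|---|---|---|---|
| 20-30 | 1 | 0.01 | 0.01 |
| 30-40 | 8 | 0.08 | 0.09 |
| 40-50 | 16 | 0.16 | 0.25 |
| 50-60 | 20 | 0.20 | 0.45 |
| 60-70 | 20 | 0.20 | 0.65 |
| 70-80 | 17 | 0.17 | 0.82 |
| 80-90 | 12 | 0.12 | 0.94 |
| 90-100 | 6 | 0.06 | 1.00 |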

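| Class Interval | Frequency | Relative Frequency | Cumulative Relative Frequency |
|---|---|---|---|
| 20-30 | 1 | 0.01 | 0.01 |
| 30-40 | 5 | 0.05 | 0.06 |
| 40-50 | 10 | 0.10 | 0.16 |
| 50-60 | 17 | 0.17 | 0.33 |
| 60-70 | 21 | 0.21 | 0.54 |
| 70-80 | 22 | 0.22 | 0.76 |
| 80-90 | 14 | 0.14 | 0.90 |
| 90-100 | 10 | 0.10 | 1.00 |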
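
The relative and cumulative relative frequencies in a table like these follow mechanically from the class frequencies. The sketch below recomputes them for the first distribution table above; the function name `distribution_table` is only illustrative, and the counts are copied from that table rather than read from the ASSIGNMENTDATA file.

```python
# Class intervals and frequencies copied from the first distribution table above.
intervals = ["20-30", "30-40", "40-50", "50-60", "60-70", "70-80", "80-90", "90-100"]
frequencies = [1, 8, 16, 20, 20, 17, 12, 6]


def distribution_table(freqs):
    """Return (relative, cumulative relative) frequency lists for class counts."""
    total = sum(freqs)
    relative = [f / total for f in freqs]
    cumulative = []
    running = 0
    for f in freqs:
        running += f  # accumulate integer counts, divide once per class
        cumulative.append(running / total)
    return relative, cumulative


relative, cumulative = distribution_table(frequencies)
for label, f, rel, cum in zip(intervals, frequencies, relative, cumulative):
    print(f"{label:>7} {f:>4} {rel:>6.2f} {cum:>6.2f}")
```

The same function can be applied to the second table by swapping in its frequency column.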

In the next three figures, we present the frequency histogram, the relative frequency histogram and the cumulative relative frequency histogram for the preparation time. The histograms help to visualize the distribution of the data.

Figure 1: Frequency Histogram for the preparation time

Figure 2: Relative Frequency Histogram for the preparation time

Figure 3: Cumulative Relative Frequency Histogram for the preparation time

The histogram (both frequency and relative frequency) of the preparation time shows that the distribution is left skewed (has longer tail to the left).

The next three figures present the frequency histogram, the relative frequency histogram and the cumulative relative frequency histogram for the student marks.

## Descriptive Statistics and Analysis

Figure 4: Frequency Histogram for the student marks

Figure 5: Relative Frequency Histogram for the student marks

Figure 6: Cumulative Relative Frequency Histogram for the student marks

The histogram for the student’s marks shows that the distribution is skewed to the left (longer tail to the left).

Draw and use an appropriate scatter plot to investigate the relationship between the two variables. Also, briefly explain the selection of each variable on the X and Y axes and the reason? Finally, draw the fitting line for the plotted observations.

Figure 7: A scatter plot of student’s marks against preparation time (number of hours)

As can be seen from the above plot, the X-axis is the preparation time while the Y-axis is the student’s marks. Preparation time is the independent variable, so it was placed on the X-axis; the student’s mark is the dependent variable, so it was placed on the Y-axis.

The above scatter plot shows evidence of a positive linear relationship between the two variables (preparation time and student marks). This means that students who spend more hours preparing for the exam tend to obtain higher marks in that exam, and students who spend fewer hours tend to obtain lower marks.

1. Present the equation of the estimated fitting line (regression) in your answer to Question f. Then, estimate the effect of an increase in the independent variable by one unit on the dependent variable.

The estimated regression line is approximately Mark = 28.984 + 0.583 × Preparation time. Here 28.984 is the intercept, and the slope of about 0.583 follows from the reported summary statistics (r × s_y / s_x = 0.5466 × 17.41 / 16.32). A unit (one-hour) increase in the independent variable (preparation time) is therefore associated with an increase of about 0.58 in the dependent variable (student’s marks), and a unit decrease with a decrease of about 0.58.
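
As a consistency check, the slope and intercept of a simple least-squares line can be recovered from summary statistics alone: the slope equals r × s_y / s_x and the intercept equals ȳ − slope × x̄. The sketch below applies these identities to the rounded values reported in Tables 3 and 4 of this document, so the results are approximate; `predict_mark` is a hypothetical helper name.

```python
# Summary statistics as reported in Tables 3 and 4 of this document (rounded).
r = 0.546556                 # correlation between preparation time and mark
mean_x, sd_x = 63.04, 16.32  # preparation time (hours)
mean_y, sd_y = 65.74, 17.41  # mark

# Least-squares identities for simple linear regression.
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x


def predict_mark(hours):
    """Predicted mark for a given preparation time, using the fitted line."""
    return intercept + slope * hours


print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"predicted mark at the mean preparation time: {predict_mark(mean_x):.2f}")
```

By construction the fitted line passes through the point of means, so the prediction at 63.04 hours returns the mean mark.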

1. Prepare a numerical summary report about the data on the two variables by including the mean, median, range, variance, standard deviation, smallest and largest values, quartiles, interquartile range and the 30th percentile for each variable.

Table 3: Descriptive (summary) statistics for the preparation time and student marks

| | PREPARATION TIME | MARK |
|---|---|---|
| Mean | 63.04 | 65.74 |
| Median | 64 | 68 |
| Standard Deviation | 16.32 | 17.41 |
| Sample Variance | 266.36 | 303.12 |
| Range | 65 | 75 |
| Minimum | 25 | 25 |
| Maximum | 90 | 100 |
| 1st Quartile | 51 | 54 |
| 3rd Quartile | 76.25 | 78 |
| Interquartile range | 25.25 | 24 |
| 30th percentile | 54 | 58 |

Table 3 above presents the descriptive statistics for both the preparation time and the student marks. As can be seen, the average preparation time for the 100 sampled students was 63.04 hours, with a median of 64 hours. The shortest preparation time was 25 hours and the longest was 90 hours. The standard deviation was 16.32 hours, roughly a quarter of the mean, implying that the data are not widely spread out around the mean.

## Scatter Plot and Regression Analysis

On the other hand, the average student mark was 65.74, with the highest score being 100 and the lowest score being 25. The median mark was 68. Again, the standard deviation shows that the student marks are not widely spread out from the mean (SD = 17.41).
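
Several entries in Table 3 are linked by simple arithmetic: the range is maximum minus minimum, the interquartile range is Q3 minus Q1, and the variance is the square of the standard deviation. The sketch below recomputes these from the reported values and adds the coefficient of variation (SD divided by mean) as one possible way to quantify how spread out the data are; the coefficient of variation is not part of the original table, and `derived_measures` is a hypothetical name.

```python
# Values copied from Table 3 (rounded as reported).
summary = {
    "PREPARATION TIME": {"mean": 63.04, "sd": 16.32, "min": 25, "max": 90,
                         "q1": 51, "q3": 76.25},
    "MARK": {"mean": 65.74, "sd": 17.41, "min": 25, "max": 100,
             "q1": 54, "q3": 78},
}


def derived_measures(stats):
    """Recompute measures that follow directly from other summary values."""
    return {
        "range": stats["max"] - stats["min"],
        "iqr": stats["q3"] - stats["q1"],
        "variance": stats["sd"] ** 2,     # approximate: the SD is rounded
        "cv": stats["sd"] / stats["mean"],  # coefficient of variation
    }


for name, stats in summary.items():
    d = derived_measures(stats)
    print(f"{name}: range={d['range']}, IQR={d['iqr']}, "
          f"variance={d['variance']:.2f}, CV={d['cv']:.1%}")
```

Because the standard deviations are rounded to two decimals, the recomputed variances differ slightly from the reported 266.36 and 303.12.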

Compute a numerical measurement which measures the strength and direction of the linear relationship between the two variables. Also, interpret this value.

Table 4: Correlation coefficient table

| | PREPARATION TIME | MARK |
|---|---|---|
| PREPARATION TIME | 1 | |
| MARK | 0.546556 | 1 |

As can be seen from the above table, there is a moderate positive linear relationship between the two variables (preparation time and student’s marks): the correlation coefficient is 0.5466. Because the coefficient is positive, students who spend more hours preparing for the exam tend to obtain higher marks, and students who spend fewer hours tend to obtain lower marks. Correlation alone does not establish that the extra hours cause the higher marks.
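
For reference, the Pearson correlation coefficient is the sum of cross-deviations divided by the square root of the product of the two sums of squared deviations. The sketch below implements that definition directly; the `hours` and `marks` lists are hypothetical toy values used only to exercise the function, not the assignment data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy values, NOT the ASSIGNMENTDATA sample.
hours = [30, 45, 50, 60, 75, 90]
marks = [40, 55, 48, 70, 68, 85]
print(f"r for the toy data = {pearson_r(hours, marks):.4f}")

# Squaring the reported coefficient gives the share of variance in marks
# that a simple linear fit on preparation time would account for.
print(f"r squared for the reported r = {0.546556 ** 2:.4f}")
```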

To determine whether or not the height of sons is related to father’s height (x1) and mother’s height (x2), data were gathered and part of the multiple regression excel output is shown below. Fill the table and answer the following questions.

The missing values in the table have been filled in red colour.

SUMMARY OUTPUT

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.5169 |
| R Square | 0.2672 |
| Adjusted R Square | 0.2635 |
| Standard Error | 8.0683 |
| Observations | 400 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 2 | 9421.58 | 4710.79 | 72.366 | 0.0000 |
| Residual | 397 | 25843.41 | 65.097 | | |
| Total | 399 | 35264.98 | | | |

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | 93.8993 | 8.0072 | 11.7269 | 0.0000 |
| X1 | 0.4849 | 0.0412 | 11.7772 | 0.0000 |
| X2 | -0.0229 | 0.0395 | -0.5811 | 0.5615 |

1. What is the standard error of estimate? What does this statistic tell you?

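The standard error of the estimate is 8.0683. This statistic measures the typical distance between the observed heights and the heights predicted by the regression line, so smaller values indicate more accurate predictions. Whether 8.0683 counts as small depends on the scale of the sons’ heights, which the output does not report; taken on its own, it says that predictions of the son’s height based on the father’s height (x1) and the mother’s height (x2) are typically off by about 8 units.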

1. What is the coefficient of determination? What does this statistic tell you?

The coefficient of determination is 0.2672; this statistic tells us that 26.72% of the variation in the dependent variable (height of son) is explained by the two independent variables (father’s height (x1) and mother’s height (x2)).

1. What is the adjusted coefficient of determination for degrees of freedom? What do this statistic and the one referred to in part (b) tell you about how well the model fits the data?

The adjusted coefficient of determination is 0.2635. It adjusts the coefficient of determination for the degrees of freedom, penalizing independent variables that add little explanatory power. Both statistics describe the proportion of variation in the dependent variable that is explained by the independent variables, and the larger they are, the better the model fits the data. Here the two values are close (0.2672 and 0.2635), and both indicate that the model explains only about 26% to 27% of the variation in the sons’ heights.
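
The three fit statistics discussed above can all be recomputed from the ANOVA sums of squares. The sketch below does so using the values in the summary output, with n = 400 observations and k = 2 independent variables; the variable names are illustrative.

```python
import math

# Sums of squares and sample size taken from the summary output above.
ss_regression = 9421.58
ss_residual = 25843.41
ss_total = 35264.98
n = 400  # observations
k = 2    # independent variables (x1, x2)

df_residual = n - k - 1  # 397

# Standard error of the estimate: square root of the residual mean square.
standard_error = math.sqrt(ss_residual / df_residual)

# Coefficient of determination: share of total variation explained.
r_squared = ss_regression / ss_total

# Adjusted R squared: corrects R squared for the degrees of freedom used.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

print(f"standard error     = {standard_error:.4f}")
print(f"R squared          = {r_squared:.4f}")
print(f"adjusted R squared = {adjusted_r_squared:.4f}")
```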

1. Test the overall utility of the model. What does the test result tell you?

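As can be seen from the ANOVA table, the overall model is statistically significant at the 5% level of significance [F(2, 397) = 72.366, p < 0.001]. This tells us that at least one of the two independent variables is linearly related to the son’s height.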
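
The F statistic is the ratio of the regression mean square to the residual mean square. When the numerator has exactly 2 degrees of freedom, as here, the upper-tail probability of the F distribution has a closed form, (1 + 2F/df2)^(−df2/2), so the p-value and the 5% critical value can be sketched without a statistics library. The helper names below are hypothetical, and the shortcut does not apply for other numerator degrees of freedom.

```python
# ANOVA values taken from the summary output above.
ss_regression, df_regression = 9421.58, 2
ss_residual, df_residual = 25843.41, 397

f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)


def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with (2, df2) degrees of freedom.

    Valid ONLY when the numerator degrees of freedom equal 2.
    """
    return (1 + 2 * f / df2) ** (-df2 / 2)


def f_critical_df1_2(alpha, df2):
    """Critical value with upper-tail area alpha, for (2, df2) df only."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


p_value = f_upper_tail_df1_2(f_stat, df_residual)
critical = f_critical_df1_2(0.05, df_residual)

print(f"F = {f_stat:.3f}, 5% critical value = {critical:.3f}")
print(f"p-value = {p_value:.3e}")
print("reject H0 (model has no utility):", f_stat > critical)
```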

The coefficient of father’s height (x1) is 0.4849; this means that, holding the mother’s height constant, a unit increase in the father’s height is associated with an increase of 0.4849 in the height of the son.

The coefficient of mother’s height (x2) is -0.0229; this means that, holding the father’s height constant, a unit increase in the mother’s height is associated with a decrease of 0.0229 in the height of the son.

The intercept is 93.8993; this implies that when both the father’s height and the mother’s height are zero, we would expect the height of the son to be 93.8993.

1. Do these data allow the statistics practitioner to infer that the heights of the sons and the fathers are linearly related?

Yes, the data allow the statistics practitioner to infer that the heights of the sons and the fathers are linearly related. This is because the father’s height (x1) was found to be statistically significant in the model (p = 0.0000).

1. Do these data allow the statistics practitioner to infer that the heights of the sons and the mothers are linearly related?

No, the data do not allow the statistics practitioner to infer that the heights of the sons and the mothers are linearly related. This is because the mother’s height (x2) was found to be statistically insignificant in the model (p = 0.5615).
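
Each t statistic in the coefficient table is the coefficient divided by its standard error, and a coefficient is significant at the 5% level when the absolute t value exceeds the two-tailed critical value. The sketch below uses 1.97 as an approximate critical value for 397 degrees of freedom, which is an assumption rather than a figure from the output; because the published coefficients are rounded, the recomputed t values differ slightly from the reported ones.

```python
# (coefficient, standard error) pairs taken from the summary output above.
coefficients = {
    "Intercept": (93.8993, 8.0072),
    "X1": (0.4849, 0.0412),   # father's height
    "X2": (-0.0229, 0.0395),  # mother's height
}

# Approximate two-tailed 5% critical t value for 397 df (an assumption).
T_CRITICAL = 1.97

t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}
significant = {name: abs(t) > T_CRITICAL for name, t in t_stats.items()}

for name in coefficients:
    print(f"{name:>9}: t = {t_stats[name]:8.4f}, "
          f"significant at 5%: {significant[name]}")
```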
