This walkthrough provides a masterfully clear distinction between mean and individual variability, a conceptual nuance that remains the ultimate gatekeeper for high-scoring students. It is an indispensable resource for navigating the most cognitively demanding portion of the AP Statistics exam.
2026 AP Statistics FRQ #6 Explained | Step-by-Step Free Response Solution
What's up, my stats friends? In this video, we're going to talk about question number six on the 2026 AP Stats exam. Question number six is the final investigative task that's ever going to be on the AP Stats exam, because for 2027 and beyond there's no longer an investigative task. It typically is a pretty difficult question, and this year it definitely was. It dealt with a scatter plot showing the number of hits several baseball teams got and the number of runs each team scored.
So, on the surface, it's actually a pretty easy question. And a lot of the parts were fairly easy. But some questions dove into some pretty difficult topics that, well, you might not have been ready for. But regardless of how easy or difficult it was, let's talk about the solutions to it right now. To win a game, a baseball team needs to score runs. To score runs, players need to get on base. The primary way players get on base is by hitting the ball. Figure one shows the relationship between the number of hits and the number of runs for 30 randomly selected professional baseball teams. So we see on the x-axis our explanatory variable, the number of hits. On the y-axis is the number of runs for 30 baseball teams. All right. Part A says: consider the scatter plot in figure one. Describe the relationship between the number of hits and the number of runs. Well, hopefully you remember how to do this.
You have to mention the direction (positive or negative), the shape (linear or curved, or maybe a parabola), and the strength (weak, moderately weak, very strong).
And then you have to give some context as to what's going on. So you could do all this in a pretty short paragraph. I wrote: the relationship between the number of hits and number of runs for these baseball teams (there's my context) is positive (it's definitely going up), linear, and fairly strong. We can see there's a tendency that the more hits a team gets, the more runs they will score. So to be honest, that was a pretty easy question that I hope everybody got right. Now for the second part.
Based on the data from the sample of 30 teams, an equation of the least squares regression line modeling the predicted number of runs from the number of hits is as follows: predicted runs = -372.2 + 0.823 × (number of hits). And what they want us to do is predict the number of runs a team will get if they have a goal of 1,250 hits. Well, this is also pretty easy. All you have to do is plug 1,250 into the regression equation: -372.2 + 0.823(1,250) = 656.55 runs. So, the predicted number of runs for a team that has 1,250 hits is 656.55 runs. So, you should be able to get one point just because you were able to do those two parts fairly easily. Now, part B. Each team has a total salary equal to the sum of the salaries for all players on the team. The median total salary for the sample of 30 teams is 160 million.
In figure two, points with dots represent teams with a total salary greater than the median, and points with squares represent teams with a total salary less than the median. Now, they did single out one particular point for us, and they marked it with a circle and the letter A. So, basically, the black dots are teams with a total salary over the median, teams that spend more money, and the squares are teams with a total salary under the median, teams that spend less money. All right, part (i) says: compare the team represented by the circle that is labeled A with the other teams that have the same total salary classification. Okay. Well, the circled one is a team with a total salary less than the median. The first thing I notice is that that particular team has more hits and more runs than all the other teams that have a salary less than the median.
So, that's exactly what I wrote: the circled team had more hits and more runs scored than any other team with a total salary less than the median salary of all the teams. Or another way you could say that is: amongst the teams with salaries less than the median, the circled team has the most runs and the most hits. Again, another pretty easy question. I know it was kind of long to read off, but hopefully it wasn't too bad.
All right. Now, part (ii) says: for each salary classification, consider the linear relationship between the number of hits and the number of runs. For teams with a salary greater than the median, is the strength of the linear relationship stronger than, weaker than, or similar to the strength of the linear relationship for teams below the median? So, you're looking at the relationship. They're both positive.
They're both fairly linear, but I think that the black dots are a little bit more linear. They follow the line a little more tightly, which means the relationship is a little bit stronger than for the squares. The squares are, you know, still going up, but you see this kind of section of squares right here that are a little bit different, a little bit lower, and don't really fit the trend as well. So, I believe that the black dots, the teams that have a greater-than-median salary, have a little bit of a stronger relationship. So, I wrote: for teams with a salary greater than the median, the strength of the linear relationship is stronger than for teams with a salary below the median. Now, I also have to be honest. If you said that the relationship is similar, I really believe that you should get the point.
Now, I can't promise that. I don't know what the full scoring rubric is yet. They haven't released it, but honestly, it's not like it's crazy different. I don't think that one of them is way more scattered than the other. But, you know, if you just try to single out those black dots, I think that they do form a little bit of a stronger relationship than the square ones. All right. Next is part C.
This is where things get a little bit tough, but if we go slow, it's not too bad. Different types of intervals can be used when working with linear regression models. One type of interval is a confidence interval for the slope of the least squares regression line, which you've actually done before in unit 9.
But they're actually not even going to ask about that one. They're going to ask about different intervals. The other types of intervals in this context could be a confidence interval for the mean number of runs scored for all teams with a specific number of hits, and a prediction interval that predicts the number of runs for a single team with a specific number of hits. All three intervals follow the same basic formula. You have your point estimate.
You're going to go up and you're going to go down, plus and minus, by t*, a critical value, times your standard error. Then they even remind you that t*, the critical value, comes from a t distribution with n − 2 degrees of freedom. The same critical value is used to find all three of these different types of intervals. And of course that leads us right into the first question. What is the 95% confidence level critical value that could be used for any of these different intervals that they described?
So basically they just want us to find t* for 95% confidence based on 30 teams.
We could quickly do that in Desmos, or you could even use one of your t tables if you like using those. But in Desmos we're going to click that little plus sign, select inference, select probability distribution, and from the dropdown menu select t distribution. The first thing you would do is type in 28 degrees of freedom: 30 teams minus two, which is what they told us to do. 28 degrees of freedom.
Then you want to go ahead and make sure you select inner, then select bounds, and select the middle 95%. So basically we're figuring out which t-scores bound that middle 95%.
You can also do this on a NumWorks. You can also do this on your TI-84 using invT. But we get 2.048.
Now, it does say to give it to two decimal places. So that'd be 2.05.
So at the end of the day, all you had to have for this one was a t* of 2.05.
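If you don't have Desmos or a calculator handy, the same critical value can be reproduced numerically. The following is a minimal sketch using only Python's standard library; the helper names (`t_pdf`, `central_area`, `t_critical`) are made up for illustration, and the approach (Simpson's rule plus bisection) is just one simple way to invert the t distribution, not what Desmos or the TI-84 does internally.

```python
import math

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def central_area(x, df, steps=2000):
    """Area under the t density between -x and x (Simpson's rule on [0, x], doubled)."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3

def t_critical(confidence, df):
    """Find t* so that the middle `confidence` of the distribution lies in (-t*, t*)."""
    lo, hi = 0.0, 50.0
    for _ in range(60):  # bisection: halve the bracket each pass
        mid = (lo + hi) / 2
        if central_area(mid, df) < confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n_teams = 30
df = n_teams - 2               # n - 2 degrees of freedom, as the question states
t_star = t_critical(0.95, df)
print(round(t_star, 3), round(t_star, 2))
```

With 28 degrees of freedom this lands on the same 2.048 from the video, which rounds to 2.05.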
All right. Now for part (ii): the standard error for the confidence interval for the mean number of runs is 17.48. Assuming all conditions are met, calculate a 95% confidence interval for the mean number of runs for all teams with 1,250 hits. So, we're going to do exactly what it said back in the setup. We take our point estimate, go up and down by the t* that we just found, 2.05, times the standard error, which they literally gave us. All we have to do is plug it all in. So here it is.
Our point estimate is 656.55. Remember, this was the predicted number of runs for teams that have 1,250 hits, which we calculated back in part A. We add and subtract our critical value, t* = 2.05, multiplied by the standard error that they gave us for the mean number of runs, 17.48. Multiplying those together, we get a margin of error of 35.834.
Add it, subtract it, and we get an interval of about 621 to 692 runs. So I'm 95% confident that the mean number of runs for all teams with 1,250 hits is somewhere between 621 and 692.
Okay. Now in part (iii) they say the standard error for the prediction interval for the number of runs (not the mean number of runs, just the number of runs) is 56.78. Assuming the conditions for inference are met, calculate the 95% prediction interval for the number of runs for a single team with 1,250 hits. So once again we're going to use that same formula. We start off with our point estimate: our predicted value for 1,250 hits was 656.55.
We go up and down by our t*, 2.05, times the standard error for the number of runs for a single team, 56.78. Multiplying, we get a margin of error of 116.40.
Add it, subtract it, and we get about 540 to about 773. So, I'm 95% confident that the total number of runs for a single team that has 1,250 hits is somewhere between 540 and 773 runs. So, this interval is for the total number of runs we predict for one team that gets 1,250 hits.
The previous interval was for the mean.
If we took all the teams that had 1,250 hits and got their mean number of runs, what would that be? So, again, it's a little bit different in what we're talking about there. Make sure that makes sense.
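Both intervals come from the same recipe, point estimate ± t* × standard error, so the arithmetic is easy to check in a few lines. Here's a small sketch using the numbers quoted in the video; the function and variable names are just illustrative choices.

```python
# Values quoted in the video for the 2026 FRQ #6 regression.
INTERCEPT = -372.2
SLOPE = 0.823
T_STAR = 2.05       # 95% critical value for df = 28, rounded to two decimals
SE_MEAN = 17.48     # standard error for the mean number of runs
SE_SINGLE = 56.78   # standard error for a single team's number of runs

def predicted_runs(hits):
    """Plug a number of hits into the least squares regression line."""
    return INTERCEPT + SLOPE * hits

def interval(point_estimate, critical_value, standard_error):
    """Point estimate plus and minus critical value times standard error."""
    margin = critical_value * standard_error
    return point_estimate - margin, point_estimate + margin

y_hat = predicted_runs(1250)
ci_low, ci_high = interval(y_hat, T_STAR, SE_MEAN)     # part C (ii)
pi_low, pi_high = interval(y_hat, T_STAR, SE_SINGLE)   # part C (iii)

print(f"predicted runs: {y_hat:.2f}")
print(f"95% CI for the mean: about {round(ci_low)} to {round(ci_high)}")
print(f"95% PI for one team: about {round(pi_low)} to {round(pi_high)}")
```

The only thing that changes between the two intervals is the standard error, which is why they share a center of 656.55 but have different widths.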
And that's exactly what part D wants to now bring up. Would a distribution of sample means be expected to have more variability or less variability than a distribution of individual observations?
Now, this question actually has nothing to do with baseball or hits or runs.
They're just asking: does a distribution of sample means, which is a sampling distribution, have more or less variability than a distribution of individual values? So if we look at the distribution of individual observations in a population and compare that to a sampling distribution of sample means (you know, taking samples from that population and looking at the mean of each sample), is there going to be less variability? And of course, you guys should know this. A distribution of sample means would be expected to have less variability than a distribution of individual observations. Sorry, I think there's a little bit of a typo in what I wrote there. Listen, I'm not an English student. It should say "less" right there, real big. Now, the biggest reason why: remember, the standard deviation of a sampling distribution takes the standard deviation of your population and divides it by the square root of your sample size. So if you have a bigger sample, you're going to vary less. So the mean of a sample (for any sample size bigger than 1) will vary less around the population mean than an individual observation will. Samples vary less.
Pretty simple point there. All right.
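A quick simulation makes the "sample means vary less" idea concrete. This is only a sketch: the population mean, standard deviation, and number of trials below are made-up values chosen for illustration, not numbers from the exam.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: normal with mean 650 and standard deviation 55.
POP_MEAN, POP_SD = 650, 55
SAMPLE_SIZE = 30
TRIALS = 2000

# Distribution of individual observations.
individuals = [random.gauss(POP_MEAN, POP_SD) for _ in range(TRIALS)]

# Distribution of sample means: each value is the mean of one sample of 30.
sample_means = [
    statistics.mean([random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)])
    for _ in range(TRIALS)
]

sd_individuals = statistics.stdev(individuals)
sd_means = statistics.stdev(sample_means)
theory_sd_means = POP_SD / SAMPLE_SIZE ** 0.5  # sigma / sqrt(n)

print(f"SD of individual observations: {sd_individuals:.1f}")
print(f"SD of sample means:            {sd_means:.1f}")
print(f"sigma / sqrt(n):               {theory_sd_means:.1f}")
```

The spread of the sample means should come out close to sigma divided by the square root of n, and well below the spread of the individual values.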
And finally they bring up this kind of weird thing. It's a little bit complicated, and I know a lot of kids are probably just done at this point, but here's what they explain, because the answer is not too bad. The standard error used in a confidence interval to estimate the mean number of runs for all teams is this ugly formula right here. And the standard error used in the prediction interval that predicts the number of runs for a single team is right here.
Now remember, earlier they gave us these standard errors. One was about 17 something, the other was like 56 something. They gave us those standard errors; we did not have to use those formulas. But what they now say is: in both standard error formulas, s is the standard deviation of the residuals, n is the sample size, and x̄ is the sample mean number of hits. Based on the answer from part D that we just gave (the one about the mean of a sample varying less than a single observation) and using the standard error formulas shown here, explain why the prediction interval calculated in part C (iii) is wider than the interval calculated in part C (ii). Okay, a lot of words here. Listen, all they're trying to do is say, "Hey, go back and notice that this interval for the total number of runs for a single team was wider than this interval for the mean number of runs for all the teams." Well, that's because the mean is always going to vary less. So, no wonder the interval for the mean is a smaller interval, which means it's narrower, or more precise. In an interval for the number of runs for a single team, well, a single team could vary a lot more because it's a single team. We're not looking at the mean of a sample, and that's why it's wider, being a little bit, you know, less predictable. All right, so that's all I said down here in my answer. I said the interval in part C (ii) is for the mean number of runs and the interval in part C (iii) is for the total number of runs for a single team. The mean, as discussed in part D, will vary less. So that is why the interval for the mean number of runs is narrower.
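The two formulas aren't reproduced in this transcript. Assuming they are the standard textbook forms for simple linear regression (which use the same s, n, and x̄ described above), the standard error for the mean response at x* is s·sqrt(1/n + (x* − x̄)²/Σ(xᵢ − x̄)²), and the standard error for a single prediction is s·sqrt(1 + 1/n + (x* − x̄)²/Σ(xᵢ − x̄)²). The only difference is the extra 1 under the square root, which is the variability of one individual team around the line. The sketch below encodes both under that assumption; the inputs in the example call are hypothetical, since the video does not give s, x̄, or the sum of squares.

```python
import math

def se_mean_response(s, n, x_star, x_bar, sxx):
    """Standard error for the mean response at x_star (textbook form, assumed)."""
    return s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / sxx)

def se_prediction(s, n, x_star, x_bar, sxx):
    """Standard error for one new observation at x_star: note the extra 1."""
    return s * math.sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / sxx)

# Hypothetical inputs, chosen only to show the relationship.
s, n, x_star, x_bar, sxx = 54.0, 30, 1250, 1300, 250_000
se_m = se_mean_response(s, n, x_star, x_bar, sxx)
se_p = se_prediction(s, n, x_star, x_bar, sxx)
print(se_m < se_p)  # the prediction standard error is always the larger one

# Under these forms, se_p**2 == s**2 + se_m**2, so the two standard errors
# quoted in the question (17.48 and 56.78) imply a residual standard deviation:
implied_s = math.sqrt(56.78 ** 2 - 17.48 ** 2)
print(round(implied_s, 1))
```

Because the extra term is s² itself, the prediction standard error has to be bigger than the one for the mean no matter what the data look like, and with the same t* that forces the prediction interval to be wider.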
Furthermore, look at the standard errors that they gave us: the standard error for the mean number of runs was only 17.48, and the standard error for the total number of runs for a single team was 56.78. And when you have a smaller standard error, you're going to have less variability. They're literally telling us that there's less variability when you're working with means as opposed to an individual team. That question was difficult. Like, I feel like I needed a two-hour nap after doing it. Hopefully, it all made sense to you. I know part D definitely got a little bit weird, but part A and part B were actually not that difficult. And if you follow the directions for part C, you could get those intervals just by plugging in numbers. All right, guys.
That's it for question number six from the 2026 AP Stats exam.