
S2 Cheat Sheet: Usual Types of Questions Tips What Can Go Ugly

1. This chapter discusses the binomial distribution and provides tips for answering questions involving calculating binomial probabilities. Key points include using the binomial probability formula and edge cases, writing out the distribution, and checking assumptions. Common errors involve misreading terms like "at least" and incorrectly setting up inequalities. 2. Tips are provided for finding unknown p or n values from context. Questions may also involve using tables or counting failures for probabilities greater than 0.5. Care is needed with strict vs non-strict inequalities and switching between successes and failures. 3. Sneaky questions can incorporate aspects of the geometric distribution or require "double inequalities" comparing binomial probabilities.

Uploaded by

Gulnar Javad
Copyright
© © All Rights Reserved

S2 Cheat Sheet

www.drfrostmaths.com

Each chapter is set out under three headings: Usual types of questions, Tips, What can go ugly.

1 – Binomial Distribution

Usual types of questions:
• Finding the probability of a certain number of successes.
• Finding the probability of some range of successes, e.g. $P(X \le k)$, $P(X > k)$, $P(a \le X < b)$.
• Be able to list the assumptions made in order to model a scenario using a Binomial Distribution.
• Calculating mean and variance (where $X \sim B(n, p)$).
• Sneaky Geometric distribution questions (see Tips).
• Calculating an unknown value of $p$ from context.
• Calculating an unknown value of $n$ from context.
• Solve problems in which tables have to be used, but the probability of success is $p > 0.5$ (i.e. not in the table), by instead counting the number of failures.
• Solving "double inequalities", e.g. "smallest value of $k$ such that $P(X \ge k) < \alpha$" for some given threshold $\alpha$.

Tips:
• $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$
• Remember the two edge cases where you don't need to use the full formula:
a. 0 successes: $P(X = 0) = (1-p)^n$
b. $n$ successes: $P(X = n) = p^n$
c. Thus "at least 1 success": $P(X \ge 1) = 1 - (1-p)^n$
• For worded questions, always start your working by writing out your distribution, e.g. $X \sim B(n, p)$ with the values of $n$ and $p$ filled in.
• Sometimes you require the probability of a range when the cumulative table can't be used, because the value of $p$ is not a nice number. This involves subtracting the opposite cases from 1:
e.g. $P(X \ge 2) = 1 - P(X = 0) - P(X = 1)$
• Remember your table requires $\le$, so if you have $P(X < k)$, use $P(X \le k - 1)$. See "What can go ugly" regarding problems of 'flipping' your inequality.
• Assumptions of a Binomial Distribution:
a. Fixed number of trials.
b. Probability of success constant.
c. Each trial is independent (ensure you put in context!)
d. Each trial has two outcomes ('success' and 'failure')
• When $p$ is unknown: "An unfair coin with probability of heads $p$ is tossed 20 times. The probability of seeing no heads is 0.1. Determine $p$".
Since this is an 'edge case': $P(X = 0) = (1-p)^{20} = 0.1$. Thus:
$1 - p = \sqrt[20]{0.1}$
$p = 1 - \sqrt[20]{0.1} \approx 0.109$
• When $n$ is unknown: "I play a game for which the probability of winning is 0.7. What is the smallest number of times I must play such that the probability of winning every game is less than 0.01?"
Again an edge case so: $P(X = n) = 0.7^n < 0.01$
$n \log 0.7 < \log 0.01$
$n > \frac{\log 0.01}{\log 0.7} = 12.9$
Thus at least 13 games required. Notice that the direction of the inequality reversed because we divided by a negative number ($\log 0.7$).
• Sometimes you'll get a part of a question which requires some non-Binomial probabilistic calculation, particularly involving some number of failures before a success is obtained, e.g. "Bob keeps firing arrows at a target until he gets a Bullseye. The probability he gets a Bullseye is 0.4. What's the probability he hits the Bullseye on the 4th shot?":
$0.6^3 \times 0.4 = 0.0864$
This is related to something called the Geometric Distribution, which isn't formally covered in the syllabus.
• When switching from the number of successes to the number of failures (so that the probability is no more than 0.5 for the purposes of using tables), flip the inequality (but preserve $<$ vs $\le$) and if the number of successes was $k$, use $n - k$ for the number of failures:
$P(X \le k) = P(Y \ge n - k)$
$P(X > k) = P(Y < n - k)$
e.g. "In Joe's café 70% of customers buy a cup of tea. In a random sample of 20 customers find the probability that more than 15 buy a cup of tea."
$X \sim B(20, 0.7)$
$Y \sim B(20, 0.3)$
$P(X > 15) = P(Y < 5)$
$= P(Y \le 4)$
$= 0.2375$
• "Given $X \sim B(n, p)$ with $p > 0.5$, find the smallest value of $k$ such that $P(X \ge k) < \alpha$."
We can only use the table if the probability of success is no more than 0.5. This question requires a great deal of care, particularly with the effect of switching from $X$ to $Y$ and getting $<$ vs $\le$ right!
$Y = n - X \sim B(n, 1 - p)$
$P(X \ge k) = P(Y \le n - k)$
So we need $P(Y \le n - k) < \alpha$. Read off from the table the largest value $m$ with $P(Y \le m) < \alpha$; then $k = n - m$.

What can go ugly:
• Misreading terms like "at least" or "more than". Make sure you get $<$ vs $\le$ right.
• Incorrectly 'flipping' a probability for $>$ and $\ge$, i.e. being one off when you look up a value in the cumulative Binomial table:
$P(X > k) = 1 - P(X \le k)$
$P(X \ge k) = 1 - P(X \le k - 1)$
Just think logically about what the opposite of "more than 1" is and so on.
• Similarly, incorrect switching from the number of successes to the number of failures, usually by forgetting to replace the value with $n$ minus it, or not preserving the strictness/non-strictness of the inequality.
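The Binomial tips above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `binom_pmf` and `binom_cdf` are illustrative and not part of the original sheet. It confirms that counting failures in the Joe's café example gives the same answer as counting successes, and finds the smallest number of games in the "win every game" example by direct search.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p), using the Binomial probability formula."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p): the value a cumulative table would give."""
    return sum(binom_pmf(i, n, p) for i in range(x + 1))


# Joe's cafe: X ~ B(20, 0.7), we want P(X > 15).
# Directly: P(X > 15) = 1 - P(X <= 15).
direct = 1 - binom_cdf(15, 20, 0.7)
# Via failures: Y ~ B(20, 0.3), P(X > 15) = P(Y < 5) = P(Y <= 4).
via_failures = binom_cdf(4, 20, 0.3)

# Smallest n such that 0.7^n < 0.01 (probability of winning every game).
n = 1
while 0.7**n >= 0.01:
    n += 1

print(round(direct, 4), round(via_failures, 4), n)
```

Both routes agree with the table value for $P(Y \le 4)$, and the search agrees with the logarithm method.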
2 – Poisson Distribution

Usual types of questions:
• Be able to state the conditions under which a Poisson distribution may be used.
• As with the Binomial Distribution, find the probability of a particular number of events happening within a given time frame, or within some range using tables.
• State the mean and variance of a Poisson distribution.
• Be able to scale a Poisson distribution to a different time period.
• Feed the probability obtained from a Poisson distribution into a Binomial Distribution, and make subsequent calculations.
• Approximate a Binomial Distribution using a Poisson Distribution, and be able to state the conditions under which we can do so.
• Again, forming and solving 'double inequalities': e.g. "smallest value of $k$ such that $P(X > k) < \alpha$" for some given threshold $\alpha$.

Tips:
• $P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$
While this is in the formula booklet, the easy way to remember it is that reading left to right and then down, the $\lambda$ repeats consecutively, as does the $x$.
• $E(X) = \lambda$ and $Var(X) = \lambda$. The fact these are the same sometimes provides a justification for why a Poisson distribution would be suitable to model certain data. See 'Wordy Questions' page.
• As with the Binomial Distribution, ensure you state the distribution for any wordy question, e.g. $X \sim Po(\lambda)$ with the value of $\lambda$ filled in.
• Conditions required for Poisson:
a. Events occur independently (e.g. volcano eruptions might not be modelled using Poisson because a volcano is less likely to erupt immediately after a previous eruption, thus eruptions are not independent).
b. Events occur singly in time.
c. A fixed rate at which events occur.
• A very common occurrence is that you will need to scale $\lambda$ to another time period. e.g. "A printer jams on average 0.3 times an hour. Find the probability that over a 5 hour period the printer jams at least 4 times."
Just scaling the 0.3 times an hour to 1.5 times every 5 hours:
$X \sim Po(1.5)$
$P(X \ge 4) = 1 - P(X \le 3) = 1 - 0.9344 = 0.0656$
• Another common question is to feed the value calculated from a Poisson question into a Binomial Distribution. e.g. "Defects occur in planks of wood with rate 0.5 per 10cm. If Bob buys 6 planks each of length 100cm, find the probability that fewer than 2 of the planks contain at most 3 defects."
First find the probability a plank of 100cm contains at most 3 defects:
$X \sim Po(5)$, $P(X \le 3) = 0.2650$
Then feed into a Binomial Distribution:
Let $Y$ be the number of planks with at most 3 defects. $Y \sim B(6, 0.2650)$
$P(Y < 2) = P(Y = 0) + P(Y = 1) = 0.7350^6 + 6(0.2650)(0.7350)^5 \approx 0.499$
Notice that we couldn't use tables for the Binomial here because of the non-nice $p$ value.
• As with the Binomial Distribution, we can get nasty 'double inequality' questions: "While a popcorn bag is in the microwave, an average of 5 pops can be heard per second. What's the minimum number of pops heard such that there's less than a 10% chance of hearing more than this number of pops?"
$X \sim Po(5)$
$P(X > k) < 0.1$
$P(X \le k) > 0.9$
$k = 8$ (since $P(X \le 7) = 0.8666$ and $P(X \le 8) = 0.9319$)
Note we had to take care with $>$ vs $\ge$ and $<$ vs $\le$.
• For Binomial to Poisson approximations, see notes on Normal Approximations.

What can go ugly:
• Since these questions are very similar to Binomial questions (except we use a different table and evaluate the probabilities differently), the same potential problems can arise, e.g. not correctly flipping an inequality from $P(X \ge k)$ to $1 - P(X \le k - 1)$ and so on.
• Not realising a Poisson question is a Poisson question! Remember that any mention of 'rate' implies Poisson rather than Binomial.
• Also not realising when both a Poisson Distribution and a Binomial Distribution have to be used within the same question!
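The three Poisson examples above (scaling the rate, feeding a Poisson probability into a Binomial, and the popcorn double inequality) can be reproduced with a short sketch. It uses only the standard library; `poisson_cdf` is an illustrative helper name, and the search loop is simply one way to mimic scanning down the cumulative table.

```python
from math import comb, exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)


def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Po(lam): the value a cumulative table would give."""
    return sum(poisson_pmf(i, lam) for i in range(x + 1))


# Printer: 0.3 jams per hour scaled to a 5 hour period, P(X >= 4).
p_jams = 1 - poisson_cdf(3, 0.3 * 5)

# Planks: 0.5 defects per 10cm scaled to 100cm, P(at most 3 defects).
p_plank = poisson_cdf(3, 0.5 * 10)
# Y ~ B(6, p_plank): P(Y < 2) = P(Y = 0) + P(Y = 1).
p_fewer_than_2 = sum(
    comb(6, y) * p_plank**y * (1 - p_plank) ** (6 - y) for y in range(2)
)

# Popcorn: smallest k with P(X > k) < 0.1 where X ~ Po(5).
k = 0
while 1 - poisson_cdf(k, 5) >= 0.1:
    k += 1

print(round(p_jams, 4), round(p_plank, 4), round(p_fewer_than_2, 3), k)
```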
3 – Continuous Random Variables

Usual types of questions:
• Find the probability of some range of values given a probability density function or cumulative distribution function, e.g. $P(X < a)$, $P(X > a)$, $P(a < X < b)$.
• Appreciate that $P(X = k) = 0$ if $X$ is a continuous random variable.
• State the probability density function given a graph.
• Comment on the skew of a distribution.
• Calculate $E(X)$, $Var(X)$, the median/quartiles and the mode of a probability density function.
• Convert from $f(x)$ to $F(x)$ and vice versa, potentially involving multiple ranges.
• Be able to calculate $E(X^2)$ (surprisingly common!)
• Be able to calculate the median and quartiles.
• Be able to calculate the mode, either by finding the turning point, or by inspection of the graph.

Tips:
• Remember that $f(x)$ is the probability density function for continuous variables, and that this is not a 'probability' as such: we only get a probability when we integrate $f(x)$ over some range.
• Relatedly, if $X$ is a continuous variable, then $P(X = k) = 0$ because the probability of a specific value is infinitely small (e.g. no one has an 'exact' height of 1.5m).
• Think of $F(x)$ as "the running total of the probability up to $x$", i.e. $F(x) = P(X \le x)$.
• To find a probability over a range:
$P(a \le X \le b) = \int_a^b f(x)\,dx$
$P(X \ge a) = \int_a^{\infty} f(x)\,dx$
For the latter, if you know the probability is 0 after some value (this will always be the case in exams), we can use that value instead of $\infty$.
• If $F(x)$ is known, then you could calculate say $P(a \le X \le b)$ using $F(b) - F(a)$.
• When asked to find the value of some constant used in a p.d.f., use the fact that $\int_{-\infty}^{\infty} f(x)\,dx = 1$, i.e. the area under the whole probability function is 1.
• However, if the cumulative distribution function is given, then there is no need to integrate, just use $F(b) = 1$ where $b$ is the highest possible value (since $P(X \le b) = 1$).
• Remember that it doesn't matter for continuous variables if you use $<$ or $\le$ (but it does matter for discrete variables!).
• $E(X) = \int x f(x)\,dx$
• $Var(X) = E(X^2) - E(X)^2$, where $E(X^2) = \int x^2 f(x)\,dx$
• To go from $f(x)$ to $F(x)$, integrate. e.g. suppose the density is $g(x)$ on a single range ending at 2:
$f(x) = \begin{cases} g(x) & a \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$
Then in the range $a \le x \le 2$:
$F(x) = \int_a^x g(t)\,dt = [G(t)]_a^x = G(x) - G(a)$, where $G$ is an antiderivative of $g$.
Therefore the full cumulative distribution function is:
$F(x) = \begin{cases} 0 & x < a \\ G(x) - G(a) & a \le x \le 2 \\ 1 & x > 2 \end{cases}$
Note the extra '1' row required since the running total of the probability by the time you get to 2 is 1. Note that the use of $t$ was to avoid a clash with the $x$ used as the upper limit of the integral. But the mark scheme permits $\int_a^x g(x)\,dx$, so do it this way if you find it less confusing.
When $f(x)$ has multiple rows, see the note under "What can go ugly" about ensuring you add the running total up to that range.
• To go from $F(x)$ to $f(x)$ just differentiate. Don't forget that the '1' row disappears.
• To find the median or quartiles: use $F(Q_1) = 0.25$, $F(m) = 0.5$, $F(Q_3) = 0.75$. Either $F(x)$ will already be given, or you will have to determine it from $f(x)$ first.
Sometimes you have to determine which range the quartile/median occurs in first by evaluating the borderline values (although this is not necessary if you only have one range).
e.g. If $F(x)$ is defined on two ranges that meet at $x = 1$ and we wished to find the median $m$, then it might be in the first range or in the second range. However, if $F(1) = 0.25$, i.e. the running total of the probability up to 1 is 0.25, the median wouldn't have yet occurred, and thus it's in the second range. Then solve $F(m) = 0.5$ using the expression for $F$ in that range, and so on.
• The mode is the value of $x$ such that $f(x)$ is at its maximum. The mode can be calculated in two different ways (and usually you can only use one of the two):
a. For curved graphs, finding the turning points using $f'(x) = 0$.
b. Using the graph. e.g. if we can see from the sketch of $f(x)$ that the probability density is greatest when $x = 0$, the mode is 0.
• If asked to find the probability density function of a given graph, ensure you don't just give the equation of the line you see: you need to use the full curly brace construction covering all values. If the graph shows the line $y = g(x)$ between $x = a$ and $x = b$, then $f(x) = g(x)$ is not enough, we need to write:
$f(x) = \begin{cases} g(x) & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$
• When asked to find the skew of a probability density function, if two of the mode, median and mean have been found, compare in the usual S1 way (remembering if positive skew that mode < median < mean). If you only have the graph look at the shape: a 'positive tail' means positive skew.

What can go ugly:
• A common error is doing $P(X > k) = 1 - P(X \le k - 1)$ for continuous variables. This is true for discrete variables, but the opposite of "more than 10m tall" is not "under 9m tall"! The correct statement is $P(X > k) = 1 - P(X \le k)$. Note that the $<$ vs $\le$ does not matter since $X$ is continuous.
• Not paying attention to whether the question gives the cumulative distribution function $F(x)$ or the probability density function $f(x)$. This will completely change the approach to use for answering a question! If you're finding the median/quartiles and you're already given $F(x)$, don't integrate!
• When finding the cumulative distribution function, forgetting the rows for the two 'ends' of the ranges. You should have one more row in $F(x)$ than $f(x)$. The same applies when going from $F(x)$ to $f(x)$: you should have one less row.
• When finding the cumulative distribution function, then in the example under Tips, you might be tempted to go straight from $g(x)$ to $G(x)$ without properly evaluating $\int_a^x g(t)\,dt$. This will result in a missing constant.
• When finding the cumulative distribution function from $f(x)$ where there are multiple ranges, forgetting to add on the running total up to the start of the range being considered. e.g. If you had ranges $0 \le x < 1$ and $1 \le x \le b$, then when finding $F(x)$ in the latter range, $F(x) = F(1) + \int_1^x f(t)\,dt$. This is because you want the area up to 1 and then the area between 1 and $x$. i.e. Don't forget the $F(1)$!
• When finding the mode, accidentally giving the probability density of the mode as the answer rather than the mode itself (e.g. Jan 2011 Q5d: answer is 0 not 4).
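The "running total" idea for a density with multiple ranges, and the check of which range the median lies in, can be illustrated numerically. The sketch below uses a hypothetical two-range density chosen for illustration only (it is not taken from the sheet): $f(x) = x/2$ on $0 \le x < 1$ and $f(x) = 0.75$ on $1 \le x \le 2$, so that $F(1) = 0.25$. The integration helper and the bisection search are assumptions of this sketch, standing in for the algebra you would do by hand.

```python
def f(x):
    """Hypothetical two-range probability density (illustration only)."""
    if 0 <= x < 1:
        return x / 2
    if 1 <= x <= 2:
        return 0.75
    return 0.0


def integrate(g, a, b, steps=2000):
    """Midpoint-rule approximation to the integral of g from a to b."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h


# Running total of probability up to the boundary between the two ranges.
F_at_1 = integrate(f, 0, 1)


def F(x):
    """Cumulative distribution function, including both 'end' rows."""
    if x < 0:
        return 0.0
    if x < 1:
        return integrate(f, 0, x)
    if x <= 2:
        # Don't forget the F(1): area up to 1, then area between 1 and x.
        return F_at_1 + integrate(f, 1, x)
    return 1.0


# F(1) = 0.25 < 0.5, so the median must lie in the second range.
lo, hi = 1.0, 2.0
for _ in range(50):  # bisection on F(m) = 0.5
    mid = (lo + hi) / 2
    if F(mid) < 0.5:
        lo = mid
    else:
        hi = mid
median = (lo + hi) / 2

print(round(F_at_1, 4), round(median, 4))
```

By hand, the second range gives $F(x) = 0.25 + 0.75(x - 1)$, so $F(m) = 0.5$ yields $m = \frac{4}{3}$, which the search reproduces.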
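4 – Continuous Uniform Distribution

Usual types of questions:
• Find the probability of a range for a continuous uniform distribution, e.g. $P(X < c)$ or $P(c < X < d)$.
• Sketch the probability function of a continuous uniform distribution.
• Find the mean or variance of a continuous uniform distribution.
• Find the $a$ and $b$ of $U[a, b]$ when the mean and variance of the distribution are given.
• Be able to calculate $E(X^2)$.
• The probability calculated from a uniform distribution may be fed into a Binomial distribution.

Tips:
• If $X \sim U[a, b]$ then:
$E(X) = \frac{a + b}{2}$
$Var(X) = \frac{(b - a)^2}{12}$
I remember the variance as "a twelfth of the squared difference".
• Suppose $X \sim U[a, b]$ and the values of $E(X)$ and $Var(X)$ are given, where $a$ and $b$ are unknown. Then set $\frac{a + b}{2}$ equal to the given mean and $\frac{(b - a)^2}{12}$ equal to the given variance. We can then solve these simultaneously.
• The key is just remembering the area of the rectangle (when you sketch the p.d.f.) is 1. Therefore if $X \sim U[a, a + 2]$ then since the width is 2, the height is clearly $\frac{1}{2}$. We would specify the probability density function as:
$f(x) = \begin{cases} \frac{1}{2} & a \le x \le a + 2 \\ 0 & \text{otherwise} \end{cases}$
• For the example above, we could then find the probability of any range by just considering the rectangular area involved: width of the range multiplied by $\frac{1}{2}$.
• $E(X^2) = \int x^2 f(x)\,dx$
For the above example, $E(X^2) = \int_a^{a+2} \frac{1}{2} x^2\,dx = \left[\frac{1}{6} x^3\right]_a^{a+2}$
• "I pick 10 real numbers randomly from 12 to 17. Find the probability that at least 5 of these numbers are greater than 15.5."
If $X$ is the number picked each time, $X \sim U[12, 17]$ and $P(X > 15.5) = \frac{1.5}{5} = 0.3$. Then if $Y$ is the number of times a number greater than 15.5 was picked, $Y \sim B(10, 0.3)$, and calculate $P(Y \ge 5)$.
• Don't forget your rules of coding from S1:
$E(aX + b) = aE(X) + b$
$Var(aX + b) = a^2 Var(X)$

What can go ugly:
• Suppose that $X \sim U[a, 5]$ and you are asked for the probability of a range that extends beyond 5. You might be tempted to use the full width of that range, but the probability above a value of 5 is 0, so only the part of the range up to 5 counts. i.e. we 'truncate' any part of the range which is outside the range of the uniform distribution.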
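The uniform-into-Binomial example and the 'truncate the range' warning can be sketched together. The helper `uniform_prob` below is a hypothetical name introduced for illustration; it clips the requested range to $[a, b]$ before taking the rectangular area.

```python
from math import comb, inf


def uniform_prob(lo, hi, a, b):
    """P(lo < X < hi) for X ~ U[a, b], truncating any part outside [a, b]."""
    width = min(hi, b) - max(lo, a)
    return max(width, 0.0) / (b - a)


def binom_cdf(x, n, p):
    """P(Y <= x) for Y ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x + 1))


# X ~ U[12, 17]: probability a picked number is greater than 15.5.
p_greater = uniform_prob(15.5, inf, 12, 17)

# Y ~ B(10, p_greater): probability at least 5 of the 10 numbers exceed 15.5.
p_at_least_5 = 1 - binom_cdf(4, 10, p_greater)

# Mean and variance of U[12, 17] from the standard formulas.
mean = (12 + 17) / 2
variance = (17 - 12) ** 2 / 12

print(round(p_greater, 4), round(p_at_least_5, 4), mean, round(variance, 4))
```

Asking for a range that runs past the upper limit, such as `uniform_prob(16, 20, 12, 17)`, only counts the part from 16 to 17.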
5 – Normal Approximations

Usual types of questions:
• Be able to approximate a Binomial Distribution using a Normal Distribution.
• Be able to approximate a Poisson Distribution using a Normal Distribution.
• Be able to approximate a Binomial Distribution using a Poisson Distribution.
• Be able to give the conditions under which such approximations can be made.
• Justify why we need continuity corrections.

Tips:
[Diagram: which approximation can be used between the Binomial, Poisson and Normal distributions]
• This diagram may seem like quite a lot to memorise, but all you need to memorise for carrying out the majority of approximations is this: If you have a Binomial Distribution, is $np < 10$? If yes use Poisson, else use Normal. In terms of converting between the distributions, the mean and variance of the Normal/Poisson approximation is just the mean and variance of the original distribution.
• If asked why a continuity correction is needed (and suppose the original distribution is Poisson), say: "Poisson is discrete, but Normal is continuous".
• For continuity corrections, we want to go from a discrete variable $X$ to a continuous version of it, $Y$. You will never get a continuity correction wrong if you carry out these two simple steps:
a. Make sure your inequality uses $\le$ or $\ge$ instead of $<$ or $>$. i.e. Ensure the inequality is non-strict.
b. 'Extend' your range by 0.5 at each end. i.e. If you visualise your inequality as a line on the number line, it should be 0.5 longer at each end.
Examples:
$P(X \le k) \approx P(Y \le k + 0.5)$
$P(X < k) = P(X \le k - 1) \approx P(Y \le k - 0.5)$
$P(X \ge k) \approx P(Y \ge k - 0.5)$
$P(X > k) = P(X \ge k + 1) \approx P(Y \ge k + 0.5)$
$P(a \le X \le b) \approx P(a - 0.5 \le Y \le b + 0.5)$
• I prefer to do the continuity correction immediately, i.e. before you either reverse the direction of the inequality or standardise. e.g.
$P(X > k) = P(X \ge k + 1) \approx P(Y \ge k + 0.5) = P\left(Z \ge \frac{k + 0.5 - \mu}{\sigma}\right) = 1 - \Phi\left(\frac{k + 0.5 - \mu}{\sigma}\right)$
• The number of marks effectively tells you what approximation you are using: If at least 6 marks, it's a Normal Approximation (because of the many steps of converting the distribution, standardising and continuity corrections), otherwise it's Binomial to Poisson.
• Example Normal Approximation: "The number of houses sold by an estate agent follows a Poisson distribution, with a mean of 2 per week. The estate agent will receive a bonus if he sells more than 25 houses in the next 10 weeks. Use a suitable approximation to estimate the probability that the estate agent receives a bonus."
a. Note first that you might be tempted to scale the 25 houses in 10 weeks to 5 houses in 2 weeks and stick with the original $\lambda$. The catastrophic flaw in doing this is that the continuity correction affects the range differently depending on whether you're using the original or scaled value.
If not scaling: $P(X > 25) = P(X \ge 26) \approx P(Y \ge 25.5)$
If scaling: $P(X > 5) = P(X \ge 6) \approx P(Y \ge 5.5)$
In the latter incorrect case the 0.5 has a greater effect on the smaller value of 6 compared with the larger value of 26, so the probability will be too high.
b. Step 1: Determine what approximation to use.
In this example we have a Poisson Distribution, which always goes to Normal. If it were Binomial, you'd first determine if $np < 10$.
c. Step 2: Identify the original distribution.
As discussed, we scale $\lambda$ (rather than the 25), so: $X \sim Po(20)$
d. Step 3: Write the approximation, potentially with reference to a new continuous variable $Y$ which is the continuous version of $X$, i.e.: $Y \sim N(20, 20)$
As discussed, use the mean and variance of the original distribution.
e. Step 4: If necessary, carry out a continuity correction to get a probability in terms of $Y$:
$P(X > 25) = P(X \ge 26) \approx P(Y \ge 25.5)$
f. Step 5: Use your S1 knowledge and find the probability by first standardising. Don't forget that you're dividing by the standard deviation, not the variance:
$P(Y \ge 25.5) = P\left(Z \ge \frac{25.5 - 20}{\sqrt{20}}\right) = P(Z \ge 1.23)$
$= 1 - \Phi(1.23) = 1 - 0.8907 = 0.1093$

What can go ugly:
• If asked to give the conditions under which a Binomial Distribution can be approximated using a Poisson Distribution, do NOT say $np < 10$. This condition is a rule of thumb only: the actual condition is "$n$ is large, $p$ is small" (from which $np < 10$ stems).
• All manner of things can go wrong with continuity corrections. This might be forgetting to convert to $\le$ or $\ge$ first (i.e. incorrectly going from $P(X < k)$ to $P(Y \le k + 0.5)$), or making your range 0.5 smaller rather than 0.5 larger, e.g. incorrectly going from $P(X \ge k)$ to $P(Y \ge k + 0.5)$.
• See the note under Tips about the perils of scaling the value instead of $\lambda$ in the case of the Poisson Distribution.
• In the formula for $z$, accidentally dividing by the variance rather than the standard deviation.
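The estate agent example can be checked by comparing the continuity-corrected Normal approximation with the exact Poisson tail. This is a minimal sketch: `normal_cdf` is an illustrative helper built on the standard library's `math.erf`, and the small difference from the table-based answer comes from not rounding $z$ to two decimal places.

```python
from math import erf, exp, factorial, sqrt


def normal_cdf(x, mu, sigma):
    """P(Y <= x) for Y ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(x + 1))


# Scale lambda (2 per week) to the 10 week period: X ~ Po(20).
lam = 2 * 10

# Approximate with Y ~ N(20, 20); sigma is the square root of the variance.
mu, sigma = lam, sqrt(lam)

# P(X > 25) = P(X >= 26), continuity corrected to P(Y >= 25.5).
approx = 1 - normal_cdf(25.5, mu, sigma)

# Exact Poisson tail for comparison: P(X > 25) = 1 - P(X <= 25).
exact = 1 - poisson_cdf(25, lam)

print(round(approx, 4), round(exact, 4))
```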
6 – Populations and Samples

Usual types of questions:
• Be able to define key terms such as 'sampling distribution', 'sampling frame', 'population', 'sample', 'statistic'.
• Be able to describe what the population is or sampling frame is given the context, and identify reasons for differences between the two.
• Be able to list possible samples.
• Be able to calculate the sampling distribution for a variety of statistics, such as median (very common!), range, maximum and mode.
• Be able to identify when the sampling distribution is a Binomial Distribution or otherwise, and specify this distribution.

Tips:
• Key definitions:
a. Statistic: "A random variable (1) which is some function of the sample and not dependent on any population parameters (1)" - I think the 'random variable' bit is a bit pernicious (as does Wikipedia), but c'est la vie! If 1 mark, the second part is important.
b. 'Population': The collection of all items.
c. 'Sample': Some subset of the population which is intended to be representative of the population.
d. 'Census': When the entire population is sampled.
e. 'Sampling unit': Individual member or element of the population or sampling frame.
f. 'Sampling frame': A list of all sampling units or all the population.
g. Sampling distribution: All possible samples are chosen from a population (1); the values of a statistic and the associated probabilities is a sampling distribution (1).
• It's important you get your head around what the sampling distribution actually is: It gives the distribution over possible values of the statistic as we take different samples. So if for example the statistic was the 'range' of the sample, then this range is likely to vary as we take different samples. As these ranges vary across samples, they form a distribution.
• The sampling frame is the list of things in the population that are available for sampling, e.g. "The ID numbers", "The list of car registration numbers". The mark scheme seems to particularly like it when you refer to some identifying property of the things in the sampling frame.
The sampling frame may be different from the population, because some things in the population may not be available for sampling. e.g. If sampling people who've visited a medical practice, "some people may have left the area but hadn't deregistered".
• When listing outcomes, it helps to be systematic in listing them so you don't miss any. Note that different orderings count as distinct possibilities.
e.g. "You have a large collection of 1p, 2p and 5p coins, and take 3 coins. Find all samples in which the maximum is 5." We may want to first list the possibilities where 5 appears once, 5 appears twice, and so on:
(5,1,1), (1,5,1), (1,1,5), (5,1,2), (5,2,1), (1,5,2), (2,5,1), (1,2,5), (2,1,5), (5,2,2), (2,5,2), (2,2,5), (5,5,1), (5,1,5), (1,5,5), (5,5,2), (5,2,5), (2,5,5), (5,5,5)
• e.g. "A bag contains a large number of 1p and 2p coins, of which 40% are 1p and 60% are 2p. A sample of 2 coins is taken. Find the sampling distribution of the sample maximum."
When finding the sampling distribution, it may help to have a table as follows to organise your working, such that the outcomes for each possible value of the statistic are grouped:

Possibilities | Statistic (Maximum) | Probability
(1,1) | 1 | $0.4^2 = 0.16$
(1,2), (2,1), (2,2) | 2 | $1 - 0.16 = 0.84$

Notice that we didn't need to do any complicated calculation for the last probability, because it was just 1 minus the others! Had we had to calculate it fully, then $(0.4 \times 0.6) + (0.6 \times 0.4) + 0.6^2 = 0.84$.
• On the rare occasion you get a question asking for the sampling distribution, where you don't actually have to do any calculation, but just have to consider what well-known distribution you get as the sample varies:
"A factory produces components. Each component has a unique identity number and it is assumed that 2% of the components are faulty. On a particular day, a quality control manager wishes to take a random sample of 50 components. A statistic $X$ represents the number of faulty components in the sample. Specify the sampling distribution of $X$."
We know a sampling distribution is the possible values of the statistic as we take different samples of 50 components. If the statistic is the count of faulty components, we can see this count varies Binomially between 0 and 50. Thus $X \sim B(50, 0.02)$

What can go ugly:
• When finding the sampling distribution, forgetting that different orderings are different possibilities, e.g. (1,1,2) and (1,2,1) should both be considered.
• Even though you can subtract from 1 to find the last probability in your sampling distribution, if you have time, you might want to calculate it 'the long way' to check your answer, as probabilities should obviously all add up to 1.
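Systematic listing of samples is exactly what `itertools.product` does, so the two coin examples above can be checked by brute force. This is a sketch using the standard library only; the variable names are illustrative. Every ordering is generated separately, which mirrors the warning that (1,2) and (2,1) are distinct possibilities.

```python
from collections import defaultdict
from itertools import product

# Bag of coins: 40% are 1p and 60% are 2p; take a sample of 2 coins.
coin_probs = {1: 0.4, 2: 0.6}

# Group every ordered sample by the value of the statistic (the maximum).
sampling_dist = defaultdict(float)
for sample in product(coin_probs, repeat=2):
    prob = 1.0
    for coin in sample:
        prob *= coin_probs[coin]
    sampling_dist[max(sample)] += prob

# Samples of 3 coins from 1p, 2p and 5p in which the maximum is 5.
samples_max_5 = [s for s in product((1, 2, 5), repeat=3) if max(s) == 5]

print(dict(sampling_dist), len(samples_max_5))
```

The probabilities in the resulting distribution add up to 1, which is the check suggested under "What can go ugly".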
7 – Hypothesis Testing

Usual types of questions:
• Be able to define key terms such as 'critical region', 'hypothesis test', 'significance level of a test'.
• Determine a critical region (one or two ranges depending on whether one or two tailed).
• Understand when we need the probability to be strictly within the significance level, and when to be as close to it as possible either side.
• Be able to calculate the 'actual level of significance'.
• Carry out a hypothesis test involving Binomial and Poisson distributions.
• Carry out a hypothesis test where a normal approximation is required.

Tips:
• Key terms whose definitions you need to remember:
a. Critical Region: The range of values such that the null hypothesis is rejected.
b. Hypothesis Test: a procedure to examine a value of a population parameter proposed by the null hypothesis.
c. Significance level: "the probability of rejecting $H_0$ if $H_0$ is true", or "the probability of incorrectly rejecting $H_0$".
• The "or more extreme" thing often confuses people: is "more extreme" below the value or above it? We can always tell this by what side of the mean we're on. If the mean is 7 and we're interested in seeing 10 hits to a website "or more extreme", then since 10 is above the mean of 7, clearly we want "10 or above" to get the tail. Ensure "10 or above" is $P(X \ge 10)$, not $P(X > 10)$.
• Identifying the critical values from a table:
Firstly, note if the test is one-tailed ($H_1$ involving $<$ or $>$) or two-tailed (involving $\ne$), as you need to halve the significance level each side in the latter case.
At the left tail, find the closest value under the significance level (or half it) in the table, or if you are explicitly told to use the closest value to the significance level, do that. This is the lower end of the critical region.
However, at the right tail, first find the closest value above one minus the significance level (or again if explicitly told, the closest value either side), but then go one above it. This is the critical value.

Example: For a two-tailed test with significance level 5%, what are the critical values?
Looking at the tables under the null hypothesis distribution, at the left end read off the largest value whose cumulative probability is below 0.025. At the right end, if the closest value with probability above 0.975 is 7, then going one above gives a critical value of 8.

If we had been asked for the 'closest value' to 0.025 and 0.975, then we might get different critical values instead.
It is vitally important you specify the probability of being in each tail to evidence that you have used the table, e.g. by writing out $P(X \le c_1)$ and $P(X \ge c_2)$ with their values.
• For the critical region, don't forget to provide a lower limit or upper limit in the case of the Binomial Distribution, as the outcomes are finite.
For the previous example: $0 \le X \le c_1$ or $8 \le X \le n$. Mark schemes usually condone the lack of the outer limits, but don't take any chances.
• The actual level of significance is the actual probability of being in the critical region. You should have already written out the probabilities of being in each part of the critical region, so it's then just a case of adding the two probabilities.
• The mark scheme for a hypothesis test without a normal approximation is as follows:
a. Specifying $H_0$ and $H_1$ (1 mark)
b. Specifying the distribution for $X$ under the null hypothesis, e.g. $X \sim B(n, p)$ or $X \sim Po(\lambda)$ (1 mark)
The $p$ or $\lambda$ will be your population parameter under the null hypothesis.
c. 2 marks for either: Determining the probability of the observed value or more extreme (e.g. $P(X \ge 10) = 1 - P(X \le 9)$) or determining the critical region.
d. Using your probability to state whether $H_0$ is rejected or not, ensuring you directly compare your probability with the significance level.
e.g. "0.0723 > 0.05, so not significant ($H_0$ is not rejected)" (1 mark)
e. Put this conclusion in context. "Bob is not justified in his claim that the rate of flamingo attacks has increased."
• The mark scheme for a hypothesis test with a normal approximation is usually broken down as the following:
a. Specifying $H_0$ and $H_1$ (1 mark)
b. Possibly a mark for specifying your distribution of $X$, i.e. $X \sim B(n, p)$ or $X \sim Po(\lambda)$
c. Specifying the distribution of $Y$ for your normal approximation, $Y \sim N(\mu, \sigma^2)$ (1 mark)
d. Doing the continuity correction, e.g. $P(X \ge k) \approx P(Y \ge k - 0.5)$ (1 mark)
e. Standardising to get $P(Z \ge z)$. Note at this stage $>$ vs $\ge$ does not matter as the variable is continuous. (1 mark)
f. As above, 2 marks for your two-part conclusion.
• Note that it is possible for the actual level of significance to be greater than the level of significance, if you were asked to find the closest value to 0.025/0.975, etc rather than those strictly below/above these values.

What can go ugly:
• Forgetting to halve the significance level for two-tailed tests.
• Being one off the critical value (particularly at the right tail). See the notes under Tips. Pay careful attention to whether it says "use the closest value", or not.
• Getting your continuity correction wrong (see notes for Chapter 5).
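The table-reading procedure for critical values can be mimicked in code. The sketch below assumes a hypothetical null distribution $X \sim B(20, 0.3)$ and a two-tailed 5% test; these values are chosen for illustration and are not the example from the sheet. The function name `two_tailed_critical_values` is likewise an assumption. It uses the "strictly within the significance level" rule, not the "closest value" rule.

```python
from math import comb


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p); gives 0 when x is negative."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x + 1))


def two_tailed_critical_values(n, p, alpha):
    """Return (c1, c2): reject H0 when X <= c1 or X >= c2.

    Each tail must have probability strictly below alpha / 2.
    Either value is None if no such tail exists.
    """
    half = alpha / 2
    c1 = None
    for c in range(n + 1):
        if binom_cdf(c, n, p) < half:
            c1 = c  # keep the largest value still under the half level
        else:
            break
    c2 = None
    for c in range(n + 1):
        # P(X >= c) = 1 - P(X <= c - 1): the 'go one above' step.
        if 1 - binom_cdf(c - 1, n, p) < half:
            c2 = c
            break
    return c1, c2


n, p, alpha = 20, 0.3, 0.05  # hypothetical null distribution and level
c1, c2 = two_tailed_critical_values(n, p, alpha)

# Actual level of significance: add the probabilities of the two tails.
lower_tail = binom_cdf(c1, n, p)
upper_tail = 1 - binom_cdf(c2 - 1, n, p)
actual_significance = lower_tail + upper_tail

print(c1, c2, round(lower_tail, 4), round(upper_tail, 4))
```

For this hypothetical distribution the critical region is $0 \le X \le 1$ or $11 \le X \le 20$, and the actual level of significance is below the nominal 5%.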
Wordy/interpretation questions:

• Definitions (see the respective chapter notes above):
Statistic, population, census, sampling frame, sampling unit, sampling distribution, critical region, hypothesis test, significance level.
• Modelling assumptions:
• "List the assumptions made when modelling as a Binomial distribution". See Chapter 1.
• "State two conditions under which a Poisson distribution is a suitable model to use in statistical work". See Chapter 2.
• (Surprisingly common!) "Describe the skewness of $X$, giving a reason for your answer." The mode and mean had previously been calculated in this question so "Positive skew as mean > mode". 1 mark for skew type, 1 for reason.
• Approximation related:
• "Write down the conditions under which the Poisson distribution can be used as an approximation to the Binomial distribution."
$n$ is large, $p$ is small
• "Write down the two conditions needed to approximate the Binomial distribution by the Poisson distribution."
$n$ is large, $p$ is small (NOT $np < 10$)
• "Write down which of the approximations used in part (a) is the most accurate estimate of the probability [one was Poisson, one Normal]. You must give a reason for your answer."
Normal approximation (1) because $n$ is large and $p$ is close to half (1).
• "State the probability of incorrectly rejecting H0 using this critical region."
Add the probabilities of your critical region(s).
• [Given that $X$ is a continuous variable] "Determine $P(X = k)$". Answer is 0.
• "Identify a sampling frame."
See notes on Chapter 6. Remember the mark scheme likes reference to an 'identifier', e.g. "list of unique identification numbers of the cookers."
• [In context of cookers being tested] "Give one reason, other than to save time and cost, why a sample is taken rather than a census."
There would be no cookers left to sell (i.e. the idea that testing something in a sample destroys it/makes it unsellable).
• "Identify the sampling units". (Using above example) A cooker
• "A researcher took a sample of 100 voters from a certain town and asked them who they would vote for in an election. The proportion who said they would vote for Dr Smith was 35%. State the population and the statistic in this case. What do you understand by the sampling distribution of this statistic."
Population is the residents of the town. Statistic is percentage/proportion who vote for Dr Smith. Sampling distribution here is number of people who voted for Dr Smith in all possible samples of 100.
• [List of possible statistics given] "State, giving a reason which of the following is not a statistic based on this sample."
The one involving $\mu$ and $\sigma$, because these are population parameters, whereas a statistic must be based on the sample only.
• "Suggest a suitable model to describe the number of vehicles passing the fixed point in a 15 s interval." $X \sim Po(\lambda)$ (with $\lambda$ depending on what the average rate in the question is, which often you have to scale)
• [Using distribution pictured] "state, giving your reason, whether E(X) < 3, E(X) = 3 or E(X) > 3."
$E(X) < 3$ because the graph tails to the left (i.e. negative skew).
• "Find, to 2 decimal places, the value of k so that $P(\mu - k\sigma < X < \mu + k\sigma) = 0.5$." This means within $k$ standard deviations either side of the mean, so by definition, we want $P(-k < Z < k) = 0.5$. Since this is the middle 50%, the upper end is the upper quartile so find $k$ when $P(Z < k) = 0.75$.
• "Comment on this finding in the light of your critical region found in part (a)." 11 is in the critical region (1 mark) therefore there is evidence of a change/increase in the proportion/number of customers buying single tins (1 mark).
• [Mean and variance of some data was calculated] "Explain how the answers from part (c) support the choice of a Poisson distribution as a model."
For a Poisson model, Mean = Variance; for these data the mean and variance are approximately equal.