
BA Module 02 - 2.1 + 2.2

Amazon uses sampling to continuously monitor inventory accuracy across its warehouses. Taking a complete inventory of all warehouses would be too time consuming. Instead, Amazon randomly samples inventory items to estimate defect rates. This provides a lower-cost way to frequently check inventory accuracy compared to traditional annual inventories that require closing warehouses. The random sampling allows Amazon to draw useful conclusions about full inventory levels and quality from a subset of items.

Uploaded by

ScarfaceXXX

2.1.1 Sampling at Amazon

As we've seen, descriptive statistics and graphical representations of data often provide a great deal of insight into patterns or relationships in a data set. Often we wish to analyze a large set of people or objects. We'll call this set our population of interest, and the people or objects in the set the members of the population. Due to time or resource constraints, it is often not practicable to analyze all of the members of a population. Fortunately, analyzing a sample that is a representative subset of the population often helps us draw useful conclusions about the full population. Before we discuss how to do this, let's learn about how Amazon uses sampling to answer an important managerial question.
We decided long ago that our company's mission is to be Earth's most customer-centric company. We are obsessed with the customer experience. So when we have an opportunity to improve the customer experience through analytics, we'll usually focus on the thing that is likely to have the highest positive impact on customer experience and the broadest possible impact, because we're a global company. So we look for things like low prices, huge selection, improvements in the delivery experience, and convenience that are likely to apply for a long period of time everywhere in the world.

When we ship items to customers, they come from a warehouse where we store inventory. The way that we process inventory is that we receive a truck that has books, consumer electronics items, toys, kitchen, sports, clothes, shoes. The truck comes in. We receive the items, which basically means that we open up a carton and take out the items and make sure that they're in good shape. And then we stow them onto a shelf, waiting on the customer orders that will eventually come to ship to customers.

The places for errors in this process include misidentifying the item at receive. So we think we have black shoes, and someone's made a mistake and identified them as blue shoes. We could place the item into the wrong bin. We could pull the wrong item from the shelf, and then there are a couple of other smaller ways that we might make mistakes.

We're trying to minimize the defects to customers, meaning minimize the chance that a customer would receive the wrong item or receive a delay because the last item that we have is in the wrong place. And we're trying to reduce our costs to deal with those kinds of defects at the same time, so improve quality and lower costs. The best way to do that is to have as few defects as possible in our inventory.
Years ago, way before Amazon.com, retail learned that inventory accuracy matters in stores and in warehouses, and retailers got accustomed to annual inventory counts. Often, stores would close for a day or two days, or sometimes a week. Warehouses would do the same, and humans would go out into the warehouse and count every item, make sure that they knew what was where in the warehouse, and then you would reopen and start selling again.

That's a very expensive process, because you actually have to close your operation for as long as the count takes. And you also don't have the benefit of knowing whether you're perfect in your inventory throughout the rest of the year. You basically have one sample. It's a complete sample, but it's one sample, once a year, and then you hope that your processes are good enough the rest of the year.

What we've learned to do is to sample our inventory continuously, sample the accuracy of our inventory continuously, to make sure that we have as accurate an inventory as we can afford to have.

The idea behind sampling is it might not be possible for you to learn the true value of a statistic of interest in the population. We have many warehouses that house that inventory. Going through all that would be very, very time consuming. And the idea behind sampling in that situation would be you would at random pick a subset of the items in inventory, and ask whether they had those defects. So it's a lower-cost way to learn the rate at which the statistic of interest occurs in the population.

2.2.1 Samples vs. Populations



Before we take a sample, we need to have a very clear understanding of the problem we wish to address. Based on that understanding, we then do two things. First, we select the correct target population to sample. Second, we determine the question we wish to ask about the members of that sample. Only then are we ready to take our sample, ask the question, and compile and analyze the results. Fortunately, if we ask the right question about a sample from the correct population, taking care to sample it randomly, we should be able to draw useful inferences about the entire population based on studying only a small portion of that population.
Imagine we're planning the annual conference for our industry association. We've picked a date in October, and want to estimate the number of people who will attend. We don't have the time or resources to ask all 20,000 members. So which members should we survey? We could easily survey the 10 other people organizing the conference, but would that really be a representative sample? Everyone who's organizing the conference is very likely to attend, so our results would probably be 10 out of 10 in favor of attending. Clearly this would be a very poor predictor of the general membership's attendance.

To obtain a representative sample, we need to choose people at random from the full population. So if we decide to survey, say, 100 people, we need to choose them randomly from the full list of 20,000 members. This means that each person in the association needs to have the same chance of being selected for the survey: in this case, 100 out of 20,000, or 1 out of 200.

We survey 100 randomly chosen industry association members, asking each if they will attend the conference. 25 people, or 25%, say that they will attend. Based on this outcome we feel quite confident reporting to the other organizers that between 17% and 33% of the members will attend the October conference.
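The transcript does not say how the 17% to 33% range was obtained. One plausible reading, offered here only as an assumption, is a 95% normal-approximation interval around the sample proportion. The sketch below works through that arithmetic; the function name `proportion_interval` and the multiplier `z = 1.96` are illustrative choices, not something stated in the course.

```python
import math

def proportion_interval(successes, n, z=1.96):
    """Normal-approximation interval for a sample proportion.

    z = 1.96 corresponds to roughly 95% confidence (an assumption here;
    the source does not state the confidence level it used).
    """
    p_hat = successes / n                                # sample proportion
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)  # spread of p_hat
    margin = z * standard_error
    return p_hat - margin, p_hat + margin

# 25 of the 100 surveyed members said they would attend.
low, high = proportion_interval(25, 100)
print(f"{low:.1%} to {high:.1%}")  # 16.5% to 33.5%
```

Rounded to whole percentages, this matches the 17% to 33% quoted to the organizers.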

Below is a summary of the steps in the sampling process. Remember, we only start this
process after we have clearly established the problem we wish to solve and the question
we will ask of the members of the sample.
In the previous module, we learned about descriptive statistics. The numerical properties
of a population are called parameters and those of a sample are called statistics. A
statistic is an estimate of a true value of a parameter. If a sample is sufficiently
large and is representative of the population, the sample statistics should be
reasonably good estimates of the population parameters. 

To differentiate between population and sample measures, we use the Greek alphabet
for population parameters, and the Latin alphabet for sample statistics. The symbols for
the mean and standard deviation are summarized in the table below.

Measure               Population parameter    Sample statistic
Mean                  μ                       x̄
Standard deviation    σ                       s

Click on the button below to generate a random sample of 30 points from the population.
In this case we are given the population mean and standard deviation, but generally we
will not have that information. Indeed, we take a sample precisely because we do NOT
have complete information about a population. Take as many random samples as you
would like, making sure to notice if the sample statistics vary and how accurately they
represent the population parameters.

What happens to the sample mean and standard deviation as you take new samples of
equal size?

• The sample mean and standard deviation remain exactly the same.
  Since each sample is randomly selected, the mean and standard deviation vary from
  one sample to the next.
• The sample mean and standard deviation vary but remain fairly close to the
  population mean and standard deviation. (CORRECT)
  Since each sample is randomly selected, the mean and standard deviation vary from
  one sample to the next. However, since the sample size is fairly large, each sample’s
  mean and standard deviation are fairly close to the population mean and standard
  deviation. We’ll learn more about how to select a good sample later.
• The sample mean and standard deviation vary substantially from one sample to the
  next.
  Since each sample is randomly selected, the mean and standard deviation may vary
  from one sample to the next. However, since the sample size is fairly large, each
  sample’s mean and standard deviation are fairly close to the population mean and
  standard deviation.
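The repeated-sampling exercise above can be imitated with a short simulation. This is a rough sketch rather than the course's own tool: the population here is made up (10,000 values drawn around a mean of 50 with a standard deviation of 10), and the seed is fixed only so that a run is repeatable.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population; in practice we would not know mu and sigma.
population = [random.gauss(50, 10) for _ in range(10_000)]
mu = statistics.mean(population)       # population mean
sigma = statistics.pstdev(population)  # population standard deviation

def draw_sample_stats(pop, n=30):
    """Draw n members at random (without replacement); return (x_bar, s)."""
    sample = random.sample(pop, n)
    return statistics.mean(sample), statistics.stdev(sample)

# Take five samples of equal size and compare them with the population.
sample_stats = [draw_sample_stats(population) for _ in range(5)]

print(f"population: mean={mu:.2f}, sd={sigma:.2f}")
for x_bar, s in sample_stats:
    print(f"sample:     mean={x_bar:.2f}, sd={s:.2f}")
```

Each run of `draw_sample_stats` gives slightly different values, yet they stay in the neighborhood of the population figures, which is the behavior the correct answer describes.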

In some cases, selecting a random sample is quite straightforward. If we have a list of all
members of a population in a database, we can use a computer to assign a random
number to each member and draw a sample from the list. This process gives each
member (that is, each element of the population) an equal likelihood of being
selected, which guards against selection bias and makes the sample representative on average.

Suppose we have the phone numbers of 20,000 people, and we want to survey a
random sample of 100 of them. We will do this using Excel’s RAND function. RAND
assigns a random identification (ID) number between 0 and 1 to each data point—in this
case, to each phone number. We use these random ID numbers to sort the data,
creating a list of the phone numbers in “random” order. We then call the first 100
numbers on the list.

• RAND takes no arguments: we simply type the formula with empty parentheses,
=RAND().
• We can use the RAND function to generate random numbers between any two
specified values. For example, if we wanted to generate random numbers
between 0 and 10 we would multiply the function by 10 and enter =RAND()*10. If
we wanted numbers between 5 and 15, we would enter =5+RAND()*10.
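The same shift-and-scale idea carries over to other tools. As an illustration outside Excel, here is a minimal Python sketch; `rand_between` is a hypothetical helper name, and `random.random()` plays the role of RAND by returning a value from 0 up to (but not including) 1.

```python
import random

def rand_between(low, high):
    """Analogue of the spreadsheet formula =low+RAND()*(high-low)."""
    return low + random.random() * (high - low)

zero_to_ten = rand_between(0, 10)      # like =RAND()*10
five_to_fifteen = rand_between(5, 15)  # like =5+RAND()*10
print(zero_to_ten, five_to_fifteen)
```

Python's standard library also offers `random.uniform(low, high)`, which does much the same job in one call.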

Spreadsheet: Randomly Sorting Data


Let's practice using the RAND function. To ease navigation, we’ll use a population of
only 25 phone numbers.

Step 1

Before we generate random ID numbers, type “Random ID” in cell A1 to label column A.

Step 2

In cell A2, enter the function =RAND() to generate a random ID number between 0 and 1.

Step 3

Copy and paste the function from cell A2 into cells A3:A26 so that all 25 phone numbers
are assigned a random ID number. You can use auto-fill instead of copying and pasting.

Step 4

Now we need to sort the phone numbers. Highlight the data in column A and column
B, excluding the labels, and select Sort Ascending from the Data menu.

• Note that the RAND function generates a new random number for each phone
number every time the spreadsheet is recalculated, and sorting triggers a
recalculation. Therefore, even though the phone numbers actually were sorted,
the (new) random numbers will not appear in order. The sorting was based on the
previously assigned random numbers.
• After sorting, the 25 phone numbers on the list are in random order. If we wanted
to draw a random sample of 10 phone numbers, we would start at the top of the
list and choose the first 10 numbers.
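The spreadsheet steps above (assign a random ID, sort by it, take the top of the list) can be sketched in a few lines of Python. This is only an illustration: the phone numbers are made-up placeholders, and `sample_by_random_sort` is a hypothetical helper name.

```python
import random

def sample_by_random_sort(items, k):
    """Tag each item with a random ID, sort by the ID, keep the first k."""
    # Pair every item with a random ID between 0 and 1, like =RAND() in column A.
    tagged = [(random.random(), item) for item in items]
    # Sort ascending by the random ID, like Sort Ascending on column A.
    tagged.sort(key=lambda pair: pair[0])
    # The first k rows of the shuffled list form the sample.
    return [item for _, item in tagged[:k]]

# Placeholder population of 25 phone numbers (not real data).
phone_numbers = [f"555-01{i:02d}" for i in range(25)]
chosen = sample_by_random_sort(phone_numbers, 10)
print(chosen)
```

Unlike the spreadsheet, the IDs here are generated once and never recalculated, so the sorted order and the IDs stay consistent. Python's built-in `random.sample(phone_numbers, 10)` draws an equivalent sample directly.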

If our population of interest is not listed in an easily accessible database, the task of
selecting a sample at random becomes more difficult. In such cases, we have to be
extremely careful not to introduce bias into our selection process.

Amazon’s inventory sampling process is more complex than selecting from a list of
phone numbers. Let’s see how Amazon’s managers ensure that their samples are
randomly selected.


We have a team of auditors that are dedicated to sampling our inventory and they sample continuously. It's a relatively small team in each of the warehouses. We randomize the places that they go to inspect the inventory. So they don't decide where to go to check a shelf; they have a software tool that directs them to the shelf, and then they go and check to see if what the computer believes is on the shelf actually exists on the shelf, and we're doing this all the time.

The logic behind the sampling scheme is to randomize across all of the different types of storage that we have in the warehouse. That ensures that we cover all of the warehouse, or all the types of storage that we have, several times during the year. And types of storage might be, we might have small items stored in shelves that make for easy picking manually. We might have clothes stored in shelves that allow for easy stacking of shirts or folded jeans. For larger items like TVs, we might have them stored on pallets, just wooden boxes on the floor, and use mechanical equipment because it's too heavy for a single person to lift. We check all of these types of locations during our cycle counts.
When we're deciding on the right sample size, we use statistics to figure out the smallest sample at the right frequency to ensure statistical significance of the results. It can vary from one location to the next, depending on the velocity of the items coming in and out of the warehouse. So the more opportunities we have to create defects, the more likely we are to need to sample. If you have a warehouse where there is no movement in and no movement out, and nobody ever goes into the shelves, you just stored the items once and leave them, we probably don't need to sample that warehouse, because the probability that there is a defect, after you know that it's correct at the beginning, is about zero. On the other hand, if you are removing the items and replacing them with new ones every day, 365 days a year, you're probably creating defects and you need to sample more frequently.

Suppose a college has asked you to conduct a survey to determine the percentage of
8:00 AM classrooms that were full on a given morning. The college has three classroom
buildings, each containing two lecture halls. Each lecture hall has a capacity of 100
students. You randomly choose one of three buildings, and stand outside the entrance
when classes let out. You ask the first 60 students leaving the building how full their
class was. However, you soon realize that this sample is not random because you
went to only one of the buildings and the classes at that building may not be
representative of all 8:00 AM classes. Moreover, since the students you surveyed were
the first to exit the building, it’s also quite possible that they all came from the same
class! 

Realizing that your survey approach would not produce a random and representative
sample, you gather some friends to help sample. You place one surveyor outside each
building. You each randomly select 20 students leaving the buildings that morning and
tally the results: 5 people decline to participate, 35 tell you that their class was full, and
20 tell you that their class was not full. Is your sample now representative of all classes
that morning?

• Yes
  See correct answer for explanation.
• No (CORRECT)
  This question is a bit tricky. This sample still may not be representative of all classes
  because there is a bias in the approach. When you sample students leaving each of the
  buildings, you will, on average, select more people from full classes, simply because
  there were more people in those classes. Imagine that of the 6 classes that took place
  that morning, 4 were full (each having 100 students) and 2 had only 40 students each. In
  this case, most of the students, 400 of the total 480, were in full classes. Your sample
  would include more students from the full classes and therefore is not representative of
  all classes that took place that morning.
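The arithmetic behind this bias can be checked directly. The sketch below uses the class sizes from the explanation above (4 classes of 100 and 2 classes of 40) and compares the share of classes that were full with the share of students who sat in a full class; the variable names are illustrative.

```python
from fractions import Fraction

capacity = 100
class_sizes = [100, 100, 100, 100, 40, 40]  # the 6 classes from the example

full_classes = [size for size in class_sizes if size == capacity]

# What the college wants to know: the share of classes that were full.
share_of_classes_full = Fraction(len(full_classes), len(class_sizes))

# What a survey of exiting students tends toward: the share of students
# who were in a full class.
share_of_students_in_full = Fraction(sum(full_classes), sum(class_sizes))

print(f"classes that were full:   {float(share_of_classes_full):.0%}")      # 67%
print(f"students in full classes: {float(share_of_students_in_full):.0%}")  # 83%
```

Sampling students rather than classes therefore pulls the estimate toward 83% even though only 67% of the classes were full.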

Based on what we have learned, how can we ensure that we choose a sample of
students that is representative of all 8:00 AM classes that take place on a given
morning?
I would ask 10 students at random from each class by observing their exit time and
selecting them at different points in time while they exit (not just the first 10 to exit so that
we don’t get them all from the same class). I would also place 5 other friends to stand at
2 other school buildings to do the same survey. I would also gather information asking
them which class they were from.

I would enlist help from my friends. I and 5 other friends (total 6 surveyors) would go
to the exits of each of the classrooms. Each would survey 10 students asking them
which class they were from and question whether their class was full. I would also make
sure that we selected the students at random from throughout the duration of their exit
and not just the first 10 to exit.
