
C207 Data-Driven Decision Making Study Guide

Module 1: The Case for Quantitative Analysis


Analytics
Analytics – extensive use of data, statistical and quantitative analysis, explanatory and predictive
models, and fact-based management to drive decisions and add value.
• Descriptive statistics are used to inform/explain
• Inferential statistics are used to predict/trend

Big Data
Refers to both structured and unstructured data in such large volumes that it's difficult to process using
traditional database and software techniques.

Data Mining
Process of discovering patterns in large data sets. Data mining is performed on big data to decipher
patterns from these large databases.

Davenport-Kim 3-stage model

• Framing the Problem


o Problem Recognition
1. Identifying Stakeholders
2. Focusing on decisions
3. Identifying the kind of story
4. Determining the scope of the problem
5. Getting specific about what data to analyze
o Review of Previous Findings

• Solving the Problem
o Modeling Step
o Data Collection Step
o Data Analysis Step
• Communicating Results

Data Management
Refers to cleaning and organizing a data set that has been collected
• Available
• Accurate
• Complete
• Relevant
• Timely

4 Levels of Measurement
• Continuous Data – a data point can lie at any point within a range of data (age)
o Interval Data – all objects are an equal interval apart; has no natural zero (time of day)
o Ratio Data – has a unique zero point (age, Kelvin scale, income, stock price, inventory)

• Discrete Data – can only take on whole values and has clear boundaries (number of cars)
o Nominal Data – called categorical data, used to label subjects in a study (males/females)
o Ordinal Data – places data objects into an order according to some quality (degrees)

Reliability and Validity of Data


• Random Error – will not repeat over time; minimized by a larger sample size
• Systematic Error – repeats itself; a constant measurement error, often tied to measurement instruments
• Omission Error – when relevant data is not included in the study or action has not been taken
• Outlier – observation points (numbers) that are distant from other observations
• Measurement Bias
o Sample is not representative of the population
o Sample tested is not sufficiently random
• Information Bias
o Response Bias – Respondent says what they believe the questioner wants to hear
o Conscious Bias – Surveyor is actively seeking a certain response

Skewness (Bias) – is a measure of the degree to which data leans toward one side.

Research Design
• Observational Studies – when it’s impractical or impossible to control the conditions of the study
o Cohort Study
o Case Control Study
• Experimental Studies – variable measurements and subjects are under the researcher’s control
o Experimental units – subjects or objects under observation
o Treatments – the procedures applied to each subject
o Responses – the effects of the experimental treatments

Experimental Studies: Explanatory Variable


Also known as the independent or predictor variable, it explains variations in the response variable; in
an experimental study, it is manipulated by the researcher

Experimental Studies: Response Variable


Also known as the dependent or outcome variable, its value is predicted, or its variation is explained, by the explanatory variable; in an experimental study, this is the outcome of the study.

• Blind Study – participants are not told


• Double Blind Study – data gatherers and participants are not told
• Triple Blind Study – data analyzer, data gatherers, and participants are not told

Experimental Design
• Qualitative Research – exploratory research, data not characterized by numbers
• Quantitative Research – uses numerical data and measurements

Module 2: Statistics as a Managerial Tool


The Misuse of Statistics
• Not a truly representative sample
• Response bias
• Conscious bias
• Missing data and refusals
• Small sample sizes
• Association and causality
• Training and test data
• Unfounded assumptions
• Faulty operationalization
• Lack of blinding

Probability
The chance of an event occurring at some time in the future

Independent Events – first result does not have any impact on the second one
Complementary Events – the only possible outcomes of that event (flipping a coin – heads or tails)
Conditional Probability – probability of an event occurring, given that another event has already occurred

Probability of an Intersection (independent events): P(A∩B) = P(A) x P(B)

Probability of a Union: P(A∪B) = P(A) + P(B) – P(A∩B)

Probability of Mutually Exclusive Events: P(A∪B) = P(A) + P(B)
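A minimal Python sketch of these three rules, using hypothetical probabilities (the function names and numbers are illustrative, not from the course material):

```python
# Probability rules for two hypothetical events A and B.
def p_intersection(p_a, p_b):
    """P(A and B) for independent events: P(A) x P(B)."""
    return p_a * p_b

def p_union(p_a, p_b, p_both):
    """P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

p_a, p_b = 0.5, 0.4
both = p_intersection(p_a, p_b)     # 0.2, assuming A and B are independent
print(p_union(p_a, p_b, both))      # 0.7
print(p_union(p_a, p_b, 0.0))       # 0.9 -- mutually exclusive case
```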

Permutations – when the order does matter
Permutation (where repetition is allowed): m^n = m x m x ... x m (n times)
Permutation (with no repetition): mPn = m!/(m - n)! = m x (m - 1) x (m - 2) x ... x (m - n + 1)

Combination – when the order does NOT matter: mCn = m!/((m - n)! x n!)
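Python's standard library implements these formulas directly; a minimal sketch with hypothetical values m = 5 and n = 3 (math.perm and math.comb require Python 3.8+):

```python
import math

m, n = 5, 3                # hypothetical: 5 items taken 3 at a time
print(m ** n)              # 125 -- permutations with repetition: m^n
print(math.perm(m, n))     # 60  -- permutations without repetition: m!/(m-n)!
print(math.comb(m, n))     # 10  -- combinations: m!/((m-n)! x n!)
```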

Bayes’ Theorem
Describes the probability of an event based on prior knowledge of conditions that might be related to the event: P(A|B) = P(B|A) x P(A) / P(B).
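A minimal sketch of the formula with hypothetical disease-screening numbers (not from the course material):

```python
def bayes(p_b_given_a, p_a, p_b):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Hypothetical: 1% of people have a disease, the test catches 95% of cases,
# and 6% of all tests come back positive.
print(bayes(0.95, 0.01, 0.06))   # ~0.158 -- P(disease | positive test)
```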

Measures of Central Tendency


• Mode – is the value or values in the data set that occur most frequently
• Median – is the point at which an equal number of scores fall above and below
• Mean – is the average value of a data set: the sum of all values divided by the number of values
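All three measures are available in Python's statistics module; a minimal sketch with a hypothetical data set:

```python
import statistics

scores = [70, 75, 75, 80, 90]      # hypothetical data set
print(statistics.mode(scores))     # 75 -- occurs most frequently
print(statistics.median(scores))   # 75 -- middle value when sorted
print(statistics.mean(scores))     # 78 -- sum of values / number of values
```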

Variance and Standard Deviation


• Standard Deviation – is a measure of how spread out numbers are.
• Normal Distribution – when data tends to occur around a central value with no bias right or left.
• Variance – the average squared deviation of values from the mean; used to measure differences between groups.
Z-score – the number of standard deviations a data point lies from its mean: z = (x - μ) / σ; used to evaluate a single data point against the distribution

Empirical Rule – applies to a normal, bell-shaped curve which is symmetrical about the mean: approximately 68% of values fall within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3.
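A minimal sketch tying these ideas together (hypothetical data; pstdev and pvariance treat the list as the full population):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical population
mu = statistics.mean(data)          # 5.0
sigma = statistics.pstdev(data)     # 2.0 -- population standard deviation
print(statistics.pvariance(data))   # 4.0 -- average squared deviation from the mean

# Z-score: how many standard deviations a point lies from the mean.
x = 9
print((x - mu) / sigma)             # 2.0 -- per the empirical rule, ~95% of a
                                    # normal distribution lies within 2 sigma
```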

Graphic Displays
Range – the span of values in which a data point can exist, from minimum to maximum (maximum - minimum).

Percentiles – a value below which a given percentage of the population falls.

Inter-quartile Range – measures the difference between the third quartile and the first quartile.

Boxplot – is a standardized way of displaying the distribution of data based on a 5-number summary
(“minimum”, first quartile (Q1), median, third quartile (Q3), and “maximum”).
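A minimal sketch of the quartile calculations behind a boxplot (hypothetical data; statistics.quantiles requires Python 3.8+):

```python
import statistics

values = [1, 3, 5, 7, 9, 11, 13, 15]            # hypothetical data set
q1, q2, q3 = statistics.quantiles(values, n=4)  # quartile cut points
print(q1, q2, q3)                               # 3.5 8.0 12.5
print(q3 - q1)                                  # 9.0 -- inter-quartile range
```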

Histogram – a graph that displays continuous (non-discrete) data; used to compare the frequency of numerical data

Bar Chart – is a graph that displays discrete data (compare different categories of data)

Scatter Diagram (bivariate chart) – shows relationships between two variables for determining how
closely they are related.

Line Graph (bivariate chart) – shows relationship between two or more variables by using connected
data points.

Module 3: Quantitative Statistical Tools


Statistical Analysis vs Decision Analysis

Linear Programming
Mathematical technique used to find a maximum or minimum of linear equations containing several
variables; a technique for minimizing total cost or maximizing profit subject to constraints (a small example is sketched after the questions below).

• Question: What is the "product mix" to minimize cost?


• Question: What is the "product mix" to maximize profit?
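A minimal sketch of a product-mix problem using scipy.optimize.linprog (assuming SciPy is installed; the constraints and profit figures are hypothetical). linprog minimizes, so the objective is negated to maximize profit:

```python
from scipy.optimize import linprog

# Hypothetical product mix: maximize profit 3x + 5y subject to
#   x + 2y <= 14   (limited labor hours)
#   3x - y >= 0    (rewritten below as -3x + y <= 0)
#   x - y  <= 2
result = linprog(c=[-3, -5],                       # negated profit per unit
                 A_ub=[[1, 2], [-3, 1], [1, -1]],
                 b_ub=[14, 0, 2],
                 bounds=[(0, None), (0, None)])    # cannot produce negative units
print(result.x)     # [6. 4.] -- the optimal product mix
print(-result.fun)  # 38.0    -- the maximum profit
```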

Crossover Analysis
When there are two or more plans or options to consider, crossover analysis allows a decision maker to
identify the crossover point, which represents the point at which they are indifferent between the options (see the sketch below).

Break-even Analysis
Tells how many units of a product must be sold to cover the fixed and variable costs of production.
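A minimal sketch of both calculations with hypothetical costs and prices:

```python
# Break-even: units where revenue covers fixed plus variable costs.
fixed_cost = 10_000            # hypothetical fixed cost
price, variable_cost = 25, 15  # hypothetical per-unit figures
print(fixed_cost / (price - variable_cost))   # 1000.0 units to break even

# Crossover: units where two plans cost the same.
# Plan A: $5,000 fixed + $20/unit; Plan B: $8,000 fixed + $14/unit.
print((8_000 - 5_000) / (20 - 14))            # 500.0 units -- indifference point
```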

Hypothesis Tests
t-Test
Tests null hypothesis about one or two means; most often, it tests the hypothesis that two means are
equal, or that the difference between them is zero.
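A minimal sketch of a two-sample t-test with scipy.stats (assuming SciPy is installed; the samples are hypothetical):

```python
from scipy import stats

# H0: the two group means are equal.
group_a = [82, 85, 88, 90, 79, 84]   # hypothetical sample
group_b = [75, 80, 78, 74, 81, 77]   # hypothetical sample

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(p_value < 0.05)   # True for these data -> reject the null hypothesis
```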

Chi-squared Test
Performs hypothesis testing on two categorical variables from a single population.

ANOVA Test
Used to compare multiple (three or more) samples with a single test.

P-value
If the p-value is less than 0.05, we reject the null hypothesis.

R-square
o Measures the goodness of fit in a regression analysis; it ranges in value from 0 to 1.
o Value close to 1 indicates that estimation error is small, and data closely aligns to regression line
o Value close to 0 indicates that data does not align as closely to the estimated regression line

Correlation Coefficient
Measures the strength of a linear relationship.

Forecasting, Regression Analysis, and Quantitative Techniques
Forecasting Techniques

Other Quantitative Techniques

Regression Analysis
Linear Regression
A technique using a single independent variable to predict a single dependent variable.
Dependent variable is the variable whose value depends on the other variables in the equation.
Independent variables are variables presumed to influence the dependent variable.
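A minimal sketch using scipy.stats.linregress (assuming SciPy is installed; the data are hypothetical). Its r-value also previews the Correlation notes below:

```python
from scipy import stats

spend = [1, 2, 3, 4, 5]          # hypothetical independent variable (ad spend)
sales = [52, 61, 68, 74, 85]     # hypothetical dependent variable

fit = stats.linregress(spend, sales)
print(fit.slope, fit.intercept)  # fitted line: sales ~ slope * spend + intercept
print(fit.rvalue)                # correlation coefficient, between -1 and 1
print(fit.rvalue ** 2)           # R-squared -- goodness of fit, between 0 and 1
```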

Correlation
o The strength of a linear relationship can be measured with the correlation coefficient.
o Correlation coefficient, a number between -1 and 1, is only useful for measuring linear relationships.
o Correlation coefficient that is close to 0 indicates a weak linear relationship
o Correlation coefficient closer to -1 or 1 represents a strong linear relationship.
o Correlation coefficient equal to exactly -1 or 1 would be considered perfectly linear.

Multiple Regression
o A technique using more than one independent variable to predict a single dependent variable.
o Multicollinearity describes a linear relationship between independent variables.
o Autocorrelation describes the correlation of a variable with itself at a given time lag; its presence is a concern in regression.

Challenges with Regression Analysis


o Multiple Independent Variables
o Non-Linear Relationships
o Outliers

Time Series Analysis


A simple regression using time as the independent variable.

Data Patterns (in Time Series Analysis)

Cluster Analysis
Also known as segmentation, is the process of arranging terms or values based on different variables
into "natural" groups. Most often with cluster analysis, these terms or values are survey responses from
people.

Decision Analysis
Process of weighing all outcomes of a decision to determine the best course of action.

3 Contexts for making decisions:


o Certainty – each action has known outcomes
o Risk – calculation of the action’s worth based on probabilities.
o Decision Tree – shows a number of options (paths) and possible consequences for each

Simulations
Simulation is an attempt to emulate a real process or system through an imitative model. This allows
consideration of problems that may not lend themselves to direct experimentation and helps managers
make decisions. Common simulation tools include what-if analysis and Monte Carlo simulation.

What-if analysis
A form of simulation analysis that involves selecting different values for the probabilistic inputs in a
model and then computing the possible outputs.

Monte Carlo Simulation


A problem-solving technique used to approximate the probability of certain outcomes by running
multiple trial runs, or simulations, using random variables. It lets us model situations that present
uncertainty and run them thousands of times on a computer.
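A minimal Monte Carlo sketch (the use case is hypothetical): approximating the probability that two dice total 7 or more by running many random trials:

```python
import random

trials = 100_000
hits = sum(random.randint(1, 6) + random.randint(1, 6) >= 7
           for _ in range(trials))
print(hits / trials)   # ~0.583 -- the exact answer is 21/36
```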

Module 4: Quality Management Basics


Quality Management Principles
Plan-Do-Check-Act Cycle
PDCA cycle – is a four-step method for testing hypotheses and solving problems.
o Plan – where you identify a problem and develop plans to solve the problem.
o Do – where you run an experiment to see if your plans will work on a small scale.
o Check – where you analyze the results of your experiment and decide if they can be improved.
o Act – where you enact the change on a larger scale, making it a part of normal operations.
Quality Control vs. Quality Assurance

SIPOC (Supplier-Input-Process-Output-Customer)
SIPOC Benefits
o Helps define the boundaries of your operations by providing a high-level view of complete process.
o Helps understand how process elements fit together.
o It ensures that you take a broad view of work instead of focusing only on the internal work.
o Takes into account the quality of the work and materials that suppliers provide to the process.
o Checks how the outputs of the process are perceived and used by customers.
o Stops you from optimizing work to satisfy only the internal process stakeholders.

Statistics, Metrics, and Quality


Statistical Process Control (SPC)
• Relies on metrics to illustrate results and to analyze the root cause of any deviations from plans.
• Prevents mistakes from being incorporated into an entire batch of a product
• Provides an objective way to:
o compare performance to a standard to see if corrective action is needed
o expose trends in performance data
o forecast performance
o show whether improvement practices are effective
o make informed decisions

Sampling
Involves choosing one or several outputs generated from a process as representatives of the entire group.

Attribute Data
Collected to show whether a result meets a requirement; answers a yes/no question or a pass/fail test.

Variable Data
Tests how well a result meets a requirement; results can be rated on a scale between 0 and infinity.

Common Cause Variations


Accepted as part of the normal process because they fall within the amounts that users will tolerate.

Special Cause Variations


Something unusual or unexpected has occurred to affect the process or system

Control Limits
The upper control limit and lower control limit are equidistant from the mean, each 3 standard deviations away.
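A minimal sketch of the control-limit calculation with hypothetical process measurements:

```python
import statistics

measurements = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 10.0, 9.7]  # hypothetical
mean = statistics.mean(measurements)
sigma = statistics.pstdev(measurements)

ucl = mean + 3 * sigma   # upper control limit
lcl = mean - 3 * sigma   # lower control limit
print(lcl, ucl)          # points outside these limits signal special cause variation
```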

Ishikawa's 7 Basic Tools of Quality (8 mentioned in the course material)


Used to solve 90%–95% of the quality-related problems organizations see.

1. Run Chart – simple way to illustrate performance measurements over a period of time.

2. Control Chart
o Modified run chart—it shows the performance of a process over time, but it also includes limits or
constraints that a process should not exceed.
o Especially helpful in distinguishing special cause from common cause variation.

3. Cause-and-Effect Diagram
o Often called a fishbone diagram
o Helps project participants systematically uncover sources of problems
o Creates a hierarchy of the primary and underlying factors that cause an event or problem

4. Flowchart
o Graphic representation of the steps that make up a process.
o Documents a process as it currently exists and compares it to one that shows an ideal condition.

5. Check Sheet
o Structured form or table used to count how many times an event or problem happened.
o Ensures that everyone collecting data is compiling and recording it in a similar way.

6. Scatter Diagram
o Data are displayed as a collection of points, each having the value of one variable determining the
position on the horizontal axis and the value of the other variable determining the position on the
vertical axis.

7. Histogram

8. Pareto Chart
o Bar chart that sorts data into categories, then orders them from the most to the least significant factors.
o Based on the 80/20 rule – 80% of problems are the result of a small number (about 20%) of causes.

Quality Management Programs


Lean Practices
o Lean practices focus on eliminating anything that does not add value for customers.
o Lean views work from the customer's perspective
o Waste is removed from all activities in process, so entire stream of activities is enhanced/optimized.
o Classifies every activity into 3 types:
o Value Add – activities that a customer would be willing to pay for
o Non-Value-add but Essential – things that need to be done and that don’t bring any value
o Waste – actions that bring no value to the article and are unnecessary

Six Sigma
o Statistical concept that places 6 standard deviations between the mean and the nearest specification limit.
o Processes working at six-sigma level are 99.9997% defect-free (only 3.4 defects per million outputs).

Design for Six Sigma


Rather than waiting to correct inefficiencies in existing processes, teams incorporate Six Sigma practices
into work as they design the processes they'll use in upcoming activities.

DMAIC Framework
o Six Sigma employs a five-step framework to analyze an existing process and to incorporate changes.
o Define – Measure – Analyze – Improve – Control

ISO Certification

International Organization for Standardization (ISO) established a certification program that guarantees
that an organization is dedicated to quality concepts and is continually working to ensure that it is
producing the highest level of quality possible.

ISO Principles of Quality


1. Customer focus - The primary focus of quality management is to meet customer requirements and to
strive to exceed customer expectations.

2. Leadership - Leaders at all levels establish unity of purpose and direction and create conditions in
which people are engaged in achieving the organization’s quality objectives.

3. Engagement of people - Competent, empowered and engaged people at all levels throughout the
organization are essential to enhance its capability to create and deliver value.

4. Process approach - Consistent and predictable results are achieved more effectively/efficiently when
activities are understood and managed as interrelated processes that function as a coherent system.

5. System Approach to Management - Identifying, understanding and managing interrelated processes


as a system contributes to the organization’s effectiveness and efficiency in achieving its objectives.

6. Continual Improvement - Continual improvement of the organization’s overall performance should


be a permanent objective of the organization.

7. Factual approach to decision making - Effective decisions are based on the analysis of data and
information.

8. Mutually Beneficial Supplier Relationship - An organization and its suppliers are interdependent and
a mutually beneficial relationship enhances the ability of both to create value.

Seven New Tools for Improvement

1. Affinity Diagram
o Groups items based on relationships, which are then analyzed.
o Used when confronted with many facts or ideas in apparent chaos.

2. Interrelationship Digraph
o Displays all the interrelated cause-and-effect relationships and factors involved in a complex
problem and describes desired outcomes.

3. Tree Diagram
o Hierarchical tool that breaks a topic down into its components.
o Breaks down broad categories into finer and finer levels of detail.

4. Prioritization Matrix
o Prioritizes multiple options, based on how well these options satisfy preselected criteria.
o Prioritizes items in terms of weighted criteria.
o Popular applications: Return on Investment (ROI) or Cost/Benefit Analysis

5. Matrix Diagram
o Table or chart that shows the strength of the relationships between items or sets of items

6. Network Diagram
o A scheduling diagram that shows the relationships between project activities
o Helps in determining the critical path (longest sequence of tasks).

7. Process Decision Program Chart


o Similar to a Tree Diagram, but the intent of PDPC is more defined.
o Illustrates the corrective and preventive actions that can be taken to mitigate risks.

Module 5: Real World Data-Driven Decisions


Results-based Management (RBM)
• Management strategy that uses results as the central measure of performance
o Translates goals into results
o Clearly defined accountability for results
o Requires monitoring and self-assessment
• RBM Requires:
o Partnerships and Inclusiveness
o Shared expectations
o Transparency, simplicity, and flexibility
• Takes a life-cycle approach in which the processes are continuous and cyclical.
Performance Indicators – virtually anything that can be tracked and quantified, such as:
o Financial performance
o Customer satisfaction
o Quality of programs or services
o Employee retention
o Safety statistics
o Energy consumption

Business Improvement Analytics
Index Numbers – are a common analytic for business improvement.
Index = (Price / Base Period Price) x 100
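A minimal sketch of the formula with hypothetical prices:

```python
base_price = 2.50      # hypothetical price in the base period
current_price = 3.10   # hypothetical price today

index = (current_price / base_price) * 100
print(index)           # 124.0 -> prices are up 24% relative to the base of 100
```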

Consumer Price Index (CPI)


o Based on a "basket" of assorted consumer goods and services purchased by a typical household.
o This number gives everyone an idea of how the economy, as a whole, has changed over time.

Simple Index Number


o Shows the change in price or quantity of a single good or service over time.
o Price or quantity relative to a base period of 100
o This can be determined in three steps.

Simple Composite Index


o Compares the prices or quantities of a number of goods or services over time.
o Used to improve business performance
o Gathers data from many sources without weighting any data more significantly than any other data.

Weighted Composite Index


o More weight gets applied to certain goods or services based on quantity sizes or prices.
o Gives an understanding that is more proportionate to actual changes over time.
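A minimal sketch of one common weighting scheme (base-period quantities, a Laspeyres-style index; the basket and numbers are hypothetical):

```python
# item: (base price, base quantity purchased)
base = {"bread": (2.00, 50), "milk": (3.00, 30)}
current_prices = {"bread": 2.40, "milk": 3.30}

weighted_current = sum(current_prices[item] * qty
                       for item, (_, qty) in base.items())
weighted_base = sum(price * qty for price, qty in base.values())
print(weighted_current / weighted_base * 100)   # ~115.3 -- weighted composite index
```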

Healthcare Analytics
Epidemiology
o Studies incidence, distribution, and possible control of diseases and other factors relating to health.
o Rate is the measure of an event occurring over a period of time.
o Proportion – ratio of a group to the whole
o Prevalence counts all of the existing cases of a disease
o Incidence only counts new cases.
o Cumulative Incidence – measures the number of new cases that arise in a period of time.
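A minimal sketch of these measures for a hypothetical town tracked over one year (a simplification; prevalence is often measured at a single point in time):

```python
population = 10_000
existing_cases = 200   # cases already present at the start of the year
new_cases = 50         # cases that arose during the year

print((existing_cases + new_cases) / population)  # 0.025 -- prevalence (all cases)
print(new_cases / population)                     # 0.005 -- cumulative incidence
```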

Education Analytics
Helps education leaders gain a better understanding of student progress, the effectiveness of individual
questions, and the construction of tests.

Key Statistical Tools


o Percentiles
o Standard Scores (z-scores)

Test Construction
o Norm-referenced tests – compare an individual to others, e.g., standard score (Z-score)
o Criterion-referenced tests – compare an individual to defined standards, e.g., exam cut-score

True Score Model


o Observed Score is the score that is actually achieved by an individual on a test.
o True Score is the average score an individual would achieve if they were to take the test an infinite number of times.
o True Score Theory states that, in a test without systematic error, the observed score is the true score
plus the random error.

Item Response Theory (IRT)


o Also known as latent trait theory, is a model of designing, analyzing, and scoring tests.
o Does not assume that each question is of equal difficulty.
o Focuses on what a correct answer to each question implies on different scales.
o Looks at each individual question and tries to determine the meaning of a correct answer.

Public Sector Analytics


Used in government to understand past performance and deliver public services at a lower cost.
o Cost-benefit Analysis – Attempt to measure the benefit to the general welfare of the public.
o Benchmarking – Anticipated cost of a new transit system relative to the actual cost of similar transit
systems in other cities.
o Payback Period – Example: installing solar panels on municipal buildings

Non-Profit Sector Analytics


o Cost-effectiveness of initiatives
o Benchmarking

Module 6: Improving Organizational Performance

Key Performance Indicators (KPIs)


o Performance measurement that organizations use to quantify their level of success.
o KPIs often follow "SMART" criteria (Specific, Measurable, Achievable, Relevant, Time-bound).

KPI Performance Dashboard


o Displays key performance indicators using visual representations such as charts and graphs.
o Can reveal trends over time.

Advantages and Disadvantages of KPIs

Balanced Scorecard
Measures an organization's performance on a balanced mix of financial and non-financial measures.

Financial – Customer – Internal Business Processes – Innovation and Learning

Advantages and Disadvantages of Balanced Scorecards

Net Promoter Score
Quantifies how strong an organization's customer relations are.

Advantages and Disadvantages of NPS

Performance Assessment and Strategy
Performance assessment can and should be linked to a company's strategy.

