Unit 3-5 15 Marks
Unit 3
1. Implement data analysis across multiple data sources using multiple connections.
2. For any healthcare dataset, perform extraction and transformation, and then visualize
the output using Tableau.
Unit 4
1. You are working with customer data that includes customer IDs, names, ages,
purchase amounts, and satisfaction ratings (on a scale of 1 to 5). The dataset is
large and contains missing values, duplicates, and improper data types. Explain the
process you would follow in R to:
• Create appropriate variables for each data field and assign data.
• Use vectors and factors to handle the satisfaction ratings.
• Store the data in a list for later manipulation.
• Convert the variables to appropriate data classes for analysis (e.g., satisfaction
ratings as factors).
• Clean the dataset by removing duplicates and handling missing data.
• Write a function that takes the cleaned data and returns a summary report
including the average purchase amount and the average satisfaction rating.
6. Summary Function
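summary_report <- function(data) {
  # Mean purchase amount, ignoring any remaining missing values
  avg_purchase <- mean(data$purchase_amount, na.rm = TRUE)
  # Convert factor labels (not internal level codes) to numbers before averaging
  ratings <- as.numeric(as.character(data$satisfaction_rating))
  avg_satisfaction <- mean(ratings, na.rm = TRUE)
  list(average_purchase_amount = avg_purchase,
       average_satisfaction_rating = avg_satisfaction)
}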
This streamlined version covers all the essential steps to process the customer data efficiently.
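The steps the question lists before the summary function (creating variables, using vectors and factors, storing the data in a list, converting classes, and cleaning) can be sketched as follows. This is only a minimal illustration: the sample values, the object names (customer_list, customers, cleaned), and the choice to impute missing purchase amounts with the mean and then drop rows that are still incomplete are all assumptions, not part of the original answer.

```r
# Hypothetical sample values; the real dataset is not shown in the source.
customer_id <- c(101, 102, 103, 103, 104, 105)
name <- c("Asha", "Ben", "Chen", "Chen", "Dina", "Eli")
age <- c("34", "45", "29", "29", NA, "52")   # stored as text: improper data type
purchase_amount <- c(120.5, NA, 89.9, 89.9, 240.0, 60.0)
satisfaction <- c(4, 5, 3, 3, NA, 2)

# Satisfaction ratings as an ordered factor on the 1-5 scale
satisfaction_rating <- factor(satisfaction, levels = 1:5, ordered = TRUE)

# Store the fields in a list for later manipulation
customer_list <- list(
  customer_id = customer_id,
  name = name,
  age = age,
  purchase_amount = purchase_amount,
  satisfaction_rating = satisfaction_rating
)

# Convert the list to a data frame and fix the data classes
customers <- as.data.frame(customer_list, stringsAsFactors = FALSE)
customers$customer_id <- as.integer(customers$customer_id)
customers$age <- as.integer(customers$age)

# Remove duplicate rows
customers <- customers[!duplicated(customers), ]

# Handle missing data (assumed policy): impute purchase amounts with the mean,
# then drop any rows that still contain missing values
missing_purchase <- is.na(customers$purchase_amount)
customers$purchase_amount[missing_purchase] <-
  mean(customers$purchase_amount, na.rm = TRUE)
cleaned <- customers[complete.cases(customers), ]

print(cleaned)
```

Whether to impute or drop missing values depends on the analysis; mean imputation is shown here only because it keeps more rows in a small example.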
Here’s how to develop a decision support system in R for categorizing customers based on their
purchase frequency and total spending:
Implementation Steps
1. Create Sample Customer Data: Define a data frame with customer IDs, number of
purchases, and total spending.
2. Categorize Customers: Use a for loop with if-else statements to classify customers
into "Low Value," "Medium Value," or "High Value."
3. Store Results: Save results in a list and summarize the categories.
Sample Code
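# Create sample customer data
customer_data <- data.frame(
  customer_id = 1:10,
  number_of_purchases = c(12, 7, 15, 4, 10, 3, 9, 8, 5, 11),
  total_spending = c(1500, 700, 2000, 300, 900, 200, 600, 800, 550, 1200)
)
# Initialize a list to store results
customer_categories <- list()
# Iterate through each customer and categorize
for (i in seq_len(nrow(customer_data))) {
  purchases <- customer_data$number_of_purchases[i]
  spending <- customer_data$total_spending[i]
  # ASSUMPTION: the original cutoffs are not shown; these thresholds are illustrative only
  if (purchases >= 10 && spending >= 1000) {
    category <- "High Value"
  } else if (purchases >= 5 && spending >= 500) {
    category <- "Medium Value"
  } else {
    category <- "Low Value"
  }
  customer_categories[[i]] <- category
}
# Summarize how many customers fall into each category
table(unlist(customer_categories))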
Category summary (X, Y, and Z are placeholders for the resulting counts):
Category        Count
High Value      X
Medium Value    Y
Low Value       Z
Extension Ideas for Complex Decision-Making
To enhance the decision support system for more complex decision-making, consider the
following extensions:
1. Incorporating Customer Feedback:
   • Gather feedback ratings from customers and include this as a factor in the
     categorization process. For instance, customers with high spending but low
     satisfaction ratings might be categorized differently.
2. Utilizing Demographic Data:
   • Include demographic information (e.g., age, location) to refine customer
     categorization and personalize marketing strategies.
3. Seasonal Trends Analysis:
   • Implement seasonal analysis to adjust categories based on purchase behavior
     changes during holidays or sales periods.
4. Lifetime Value Prediction:
   • Use predictive modeling to estimate customer lifetime value based on historical
     data, allowing for proactive engagement with high-potential customers.
5. Machine Learning Integration:
   • Develop machine learning models to continuously learn and adjust
     categorization based on evolving customer behaviors and preferences.
6. Real-Time Analytics:
   • Incorporate real-time data processing to adapt customer categorizations
     dynamically as new data comes in.
These extensions can make the decision support system more robust, enabling the company to
better understand and respond to customer needs, ultimately enhancing customer satisfaction
and loyalty.
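As one illustration of the first extension (incorporating customer feedback), the sketch below flags customers who spend heavily but report low satisfaction. The data values, the satisfaction column, the segment labels, and both cutoffs (spending of at least 1000, satisfaction of 2 or lower) are hypothetical and chosen only to show the idea.

```r
# Hypothetical data: spending figures plus an assumed satisfaction rating (1-5)
customers <- data.frame(
  customer_id = 1:5,
  total_spending = c(1500, 700, 2000, 300, 1200),
  satisfaction = c(5, 4, 2, 3, 1)
)

# Assumed cutoffs, for illustration only
high_spend <- customers$total_spending >= 1000
low_satisfaction <- customers$satisfaction <= 2

# High spenders with low satisfaction get their own segment
customers$segment <- ifelse(
  high_spend & low_satisfaction, "High Value (At Risk)",
  ifelse(high_spend, "High Value", "Standard")
)

# Count customers per segment
print(table(customers$segment))
```

Using vectorized ifelse() calls instead of a loop keeps the rule in one place, which makes it easier to add further conditions such as demographic filters later.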
Unit 5
1. Power BI is designed to handle various types of data sources and large-scale
reporting needs. Critically analyze the components of Power BI architecture (Power
BI Desktop, Power BI Service, Gateways, Dataflow, etc.). How do these components
work together to enable real-time reporting and data visualization in a cloud-based
environment? Discuss potential challenges in scaling Power BI for large enterprises
and how these can be addressed through the architecture.
Power BI is a business analytics tool that provides interactive visualizations and
business intelligence capabilities with an interface simple enough for end users to create their
own reports and dashboards. Its architecture comprises several key components that work together
to facilitate data reporting and visualization, especially in a cloud-based environment. Here's a
critical analysis of the components and their interplay, along with challenges and solutions
related to scaling Power BI for large enterprises.
Real-Time Reporting
These components work together to enable real-time reporting:
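• Data Collection: Power BI Desktop connects to data sources and is used to build the
data model and reports, which are then published to the Power BI Service.
• Real-Time Updates: Gateways connect the Power BI Service to on-premises data sources,
so datasets can be refreshed on a schedule or queried live (for example, through
DirectQuery) to keep insights current.
• Collaboration: Teams share and work on reports collectively in the Power BI Service,
promoting a data-driven culture.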
In a global enterprise with diverse departments and user roles, Power BI’s sharing and
collaboration features are essential for managing access and distribution of reports and
dashboards effectively. Here's how these features, particularly through the use of apps,
workspaces, and roles in Power BI Service, can be leveraged, along with strategies to address
governance, data security, and report versioning challenges.
Conclusion
By effectively utilizing Power BI’s features and implementing strategies for governance, security,
and collaboration, enterprises can manage access and distribution of reports while
safeguarding sensitive data and ensuring report integrity across teams.