
CSE602 - Data Warehousing & Data Mining

This 5-credit postgraduate course introduces students to the concepts of data warehousing and data mining. The course is divided into 5 modules that cover data warehousing components and design, online analytical processing, and data mining approaches and techniques, including classification, clustering, and association rules. The course also covers advanced concepts such as opinion mining, web mining, and text mining. Students learn to apply techniques such as decision trees, neural networks, and association rules to datasets using Weka. Assessment includes a theory exam, assignments, lab work, and a practical exam.

Uploaded by

Arun Mohan

Course Title: DATA WAREHOUSING AND DATA MINING
Course Level: PG
Course Code: CSE 602
Credit Units: 5

L T P/S SW/FW TOTAL CREDIT UNITS
3 4 - 5

Course Objectives:

To introduce the concepts of organizing a data warehouse and the data mining techniques used to derive useful information from large
volumes of data. With the rapid growth of data, it has become necessary to explore and mine that data to uncover hidden, useful
information. This course exposes students to the process of extracting patterns and useful information from large data sets by
combining methods from data mining, statistics, and artificial intelligence with database management. It also gives students practice
in data analysis using data mining tools. The course covers some advanced topics in data mining as well, such as opinion mining and
web mining.
Pre-requisites: NIL

Course Contents/Syllabus:

Weightage (%)
Module I: Data Warehousing 20
• Data warehousing, characteristics and components of a data warehouse,
• ETL process,
• Data marts,
• Data warehouse logical design: star schemas, snowflake schemas, fact tables, dimensions, other schemas,
• Materialized views,
• Data warehouse physical design: hardware and I/O considerations,
• Parallelism, indexes.
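
As an illustration of the logical design topics in Module I, the following is a minimal sketch of a star schema built with Python's standard `sqlite3` module. The table names (`fact_sales`, `dim_product`, `dim_date`), the columns, and the sample rows are hypothetical and chosen only for this example; they are not part of the syllabus.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory star schema and return the open connection.

    All table names, columns, and rows are hypothetical sample data.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension tables hold descriptive attributes.
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT,
            category   TEXT
        );
        CREATE TABLE dim_date (
            date_id INTEGER PRIMARY KEY,
            year    INTEGER,
            quarter INTEGER
        );
        -- The fact table holds measures plus one foreign key per dimension.
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            date_id    INTEGER REFERENCES dim_date(date_id),
            units      INTEGER,
            revenue    REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Laptop", "Electronics"),
         (2, "Phone", "Electronics"),
         (3, "Desk", "Furniture")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(1, 2001, 1), (2, 2001, 2)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 1, 2, 2000.0), (2, 1, 5, 1500.0),
         (3, 2, 1, 300.0), (1, 2, 1, 1000.0)],
    )
    conn.commit()
    return conn


def revenue_by_category(conn):
    """Aggregate the revenue measure along the product dimension."""
    rows = conn.execute(
        """
        SELECT p.category, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON f.product_id = p.product_id
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(revenue_by_category(build_star_schema()))
```

A snowflake variant of the same sketch would normalize `category` out of `dim_product` into its own table and reference it by key, at the cost of one extra join per query.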

Module II: Online Analytical Processing 20


• OLTP and OLAP systems,
• Multidimensional Modeling,
• OLAP Tools, web OLAP,
• Decision support system,
• Developing a data warehouse: architectural strategies and organizational issues,
• Design considerations,
• Tools for data warehousing,
• Developing Financial Projections—How to Forecast Expenses and Revenue

Module III: Data Mining 18


• Data mining approaches and methods:
• Objectives of data mining, the technical context for data mining,
• Data preprocessing, concept description,
• Research trends in data warehousing and data mining,
• Machine learning,
• Decision support and computer technology.

Module IV: Data Mining Techniques and Algorithms 22


• Process of data mining,
• Data mining techniques: classification and prediction,
• Decision trees,
• Neural Networks,
• Bayesian Classification,
• Association rules, Apriori, FP Tree,
• Clustering Techniques & algorithms,
• Automatic Cluster Detection,
• Mining complex types of data.
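
To make the association rule topic in Module IV concrete, here is a minimal sketch of Apriori-style frequent itemset mining in plain Python. It uses an absolute support count as the threshold; the function name `apriori` and the sample baskets in the usage note are assumptions made for illustration, and the sketch stops at frequent itemsets without generating rules.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every frequent itemset.

    `min_support` is an absolute count (number of transactions).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: scan the transactions once per level.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


if __name__ == "__main__":
    baskets = [
        {"bread", "milk"},
        {"bread", "diapers", "beer", "eggs"},
        {"milk", "diapers", "beer", "cola"},
        {"bread", "milk", "diapers", "beer"},
        {"bread", "milk", "diapers", "cola"},
    ]
    for itemset, count in sorted(apriori(baskets, 3).items(),
                                 key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), count)
```

The prune step is what separates Apriori from brute-force enumeration: a candidate is counted only if all of its smaller subsets were already found frequent. FP-growth avoids candidate generation altogether by compressing the transactions into a prefix tree.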

Module V: Advanced Concepts in Data Mining 20


• Introduction to Opinion Mining,
• Web Mining,
• Mining unstructured data,
• Link analysis,
• Text Mining and Information retrieval,
• Rough Set theory,
• Mining sequence data,
• Introduction to Genetic Algorithm.

Student Learning Outcomes:


• By the end of this course, students will be able to design and develop a data warehouse.
• Students will be able to analyze and evaluate a data warehouse using a multidimensional model and various OLAP techniques.
• Students will be able to display a comprehensive understanding of different data mining tasks and the algorithms most appropriate for
addressing them.
• Students will be able to evaluate models/algorithms with respect to their accuracy.
• Students will be able to demonstrate the capacity to perform a self-directed piece of practical work that requires the application of
data mining techniques.
• Students will be able to analyze and critique the results of a data mining exercise.
• Students will be able to conceptualize a data mining solution to a practical problem.

Pedagogy for Course Delivery:


1. Classroom teaching using White board and Presentations.
2. Assignments and Tutorials for continuous assessment.
Lab
Based on the course lab credits, students are required to perform the following assignments and practicals using Weka:

Data Mining Lab & Assignment

1. Data Preprocessing Using Weka: Explore, observe, and understand the purpose of each button under the Preprocess panel after
loading the ARFF file you prepared in this lab. Also, try to interpret what you observe using a different ARFF file, weather.arff,
provided with Weka.

2. Demonstrate and analyze the results of the following data mining techniques using Weka on the data sets provided with Weka:

a) Classification (e.g., BayesNet, KNN, C4.5 Decision Tree, Neural Networks, SVM),
b) Regression (e.g., Linear Regression, Isotonic Regression, SVM for Regression),
c) Clustering (e.g., Simple K-means, Expectation Maximization (EM)),
d) Association rules (e.g., Apriori Algorithm, Predictive Accuracy, Confirmation Guided),
e) Feature Selection (e.g., Cfs Subset Evaluation, Information Gain, Chi-squared Statistic), and
f) Visualization (e.g., View different two-dimensional plots of the data).

3. Write a program to develop Snowflake Schema.

4. Write a program to implement BFS and DFS with respect to 2-D modeling.

5. Write a program to compare the Apriori and FP-growth algorithms.

6. Write a program to implement the K-means algorithm.

7. Write a program to implement the PAM K-medoids algorithm.

8. Write a program to implement AGNES hierarchical clustering.

9. Compare the results of K-means, K-medoids, and hierarchical clustering.
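
For orientation on lab item 6, the following is a minimal sketch of the standard K-means loop (Lloyd's algorithm) in plain Python. The function name `kmeans`, its parameters, and the sample points are assumptions made for illustration; it is a starting point under those assumptions, not a reference solution for the lab.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100, seed=0):
    """Cluster `points` (a list of numeric tuples) into `k` groups.

    Returns (centroids, clusters). Initial centroids are sampled from the
    data with a fixed seed so that runs are repeatable.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)

        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]

        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    return centroids, clusters


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
            (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
    centers, groups = kmeans(data, k=2)
    print(centers)
    print(groups)
```

PAM K-medoids (item 7) follows the same assign-and-update pattern but restricts each cluster center to an actual data point, which makes it less sensitive to outliers and gives a natural basis for the comparison in item 9.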

Assessment/ Examination Scheme:

Theory L/T (%)    Lab/Practical/Studio (%)    Total
60                40                          100

Theory Assessment (L&T):

                  Continuous Assessment/Internal Assessment            End Term Examination
Components        Attendance   Class Test   Assignment   Case Study
Weightage (%)     5            10           8            7             70

Lab Assessment (L&T):

                  Continuous Assessment/Internal Assessment (40)              End Term Examination (60)
Components        Attendance   Performance   Lab Record   Presentation/Viva   Practical (30)   Viva (30)
Weightage (%)     5            15            10           10                  60
Text & References:
Text:
1 “Mastering Data Mining: The Art and Science of Customer Relationship Management”, by Berry and Linoff, John Wiley and Sons, 2001.
2 “Data Warehousing: Concepts, Techniques, Products and Applications”, by C.S.R. Prabhu, Prentice Hall of India, 2001.

References:
1 “Data Mining: Concepts and Techniques”, by J. Han and M. Kamber, Academic Press, Morgan Kaufmann Publishers, 2001.
2 “Data Mining”, by Pieter Adriaans and Dolf Zantinge, Addison Wesley, 2000.
3 “Data Mining with Microsoft SQL Server”, by Seidman, Prentice Hall of India, 2001.
