Chapter 1 - Introduction to Data Mining - Slide

Introduction to Data Mining

1. What is data mining?
2. Motivating Challenges
3. The Origins of Data Mining
4. Data Mining Tasks

Data Mining
Prepared by Phan Huy Tam – Finance & Banking Dept - UEL

Data mining is the process of automatically discovering useful information in large data repositories.

Data mining techniques are deployed to scour large data sets in order to find novel and useful patterns that
might otherwise remain unknown.


Knowledge Discovery in Databases (KDD)


Data mining is an integral part of knowledge discovery in databases (KDD), which is the overall process of
converting raw data into useful information.

The KDD process:

Input Data -> Data Preprocessing -> Data Mining -> Postprocessing -> Information

Data preprocessing: feature selection, dimensionality reduction, normalization, data subsetting
Postprocessing: filtering patterns, visualization, pattern interpretation
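The three stages above can be sketched in a few lines of Python. This is only an illustrative toy, not part of the original slides: the function names (`preprocess`, `mine_pairs`, `postprocess`, `kdd`), the shopping-basket data, and the `min_count` threshold are all assumptions, and a simple cleaning step stands in for the preprocessing operations listed on the slide.

```python
from collections import Counter
from itertools import combinations


def preprocess(records):
    """Preprocessing: drop empty entries and normalize item names."""
    cleaned = []
    for rec in records:
        items = {item.strip().lower() for item in rec if item and item.strip()}
        if items:  # skip records that contain no usable items
            cleaned.append(items)
    return cleaned


def mine_pairs(transactions):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts


def postprocess(pair_counts, min_count):
    """Postprocessing: keep frequent patterns only, most frequent first."""
    kept = [(pair, n) for pair, n in pair_counts.items() if n >= min_count]
    return sorted(kept, key=lambda entry: (-entry[1], entry[0]))


def kdd(records, min_count=2):
    """Input data -> preprocessing -> data mining -> postprocessing -> information."""
    return postprocess(mine_pairs(preprocess(records)), min_count)


# Hypothetical raw input: messy shopping baskets.
raw = [
    ["Bread", "milk "],
    ["bread", "Milk", "eggs"],
    ["", None],
    ["milk", "eggs"],
]
patterns = kdd(raw)
print(patterns)
```

In this toy run the only patterns that survive postprocessing are the item pairs that appear together in at least two baskets.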


Motivating Challenges

1. Scalability
Massive data sets: out-of-core algorithms may be necessary when processing data sets that cannot fit
into main memory (parallel and distributed algorithms).

2. High Dimensionality
Traditional data analysis techniques that were developed for low-dimensional data often do not
work well for high-dimensional data due to issues such as the curse of dimensionality.

3. Heterogeneous and Complex Data
Non-traditional types of data include web and social media data containing text, hyperlinks, images,
audio, and videos, DNA sequences, climate data…

4. Data Ownership and Distribution
Data is geographically distributed among resources belonging to multiple entities (distributed data
mining techniques).

5. Non-traditional Analysis
Traditional analysis is extremely labor-intensive (trial and error); the goal is to automate the process
of hypothesis generation and evaluation.
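As an illustration of the scalability challenge, the sketch below computes a mean and standard deviation in a single pass over data that arrives in chunks, so the full data set never has to sit in main memory. It uses Welford's online algorithm, which is a swapped-in example and not something the slides name; `read_chunks` is a hypothetical stand-in for reading a large file piece by piece, and the sample values are made up.

```python
import math


def read_chunks(values, chunk_size):
    """Hypothetical stand-in for reading a large file chunk by chunk."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def streaming_mean_std(chunks):
    """Welford's online algorithm: one pass over the data, constant memory."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for chunk in chunks:
        for x in chunk:
            n += 1
            delta = x - mean
            mean += delta / n
            m2 += delta * (x - mean)
    if n == 0:
        raise ValueError("no data")
    std = math.sqrt(m2 / n)  # population standard deviation
    return n, mean, std


# Made-up sample; in practice the chunks would come from disk or a network.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n, mean, std = streaming_mean_std(read_chunks(data, chunk_size=3))
print(n, mean, std)
```

Only one chunk and three running values are held at any time, which is the essential property of an out-of-core computation.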


The Origins of Data Mining

Data mining can be traced back to the late 1980s.

The challenges and opportunities in applying computational techniques to extract actionable knowledge from large
databases fueled the tremendous growth of this field.

Data mining draws on ideas from many areas: statistics, sampling, estimation, hypothesis testing,
artificial intelligence, machine learning, pattern recognition, search algorithms, modeling techniques,
optimization, evolutionary computing, information theory, signal processing, visualization,
information retrieval, and big data.

Prepared by Phan Huy Tam – Finance & Banking Dept - UEL
Email: [email protected]
Phone: 0798109293
