R Programming for Data Science

Last Updated : 27 Dec, 2024

R is an open-source programming language used for statistical computing and data analysis. It is an important tool for data science and is widely used by statisticians and data scientists.

  • R includes powerful tools for creating aesthetic and insightful visualizations.
  • Facilitates data extraction, transformation, and loading, with interfaces for SQL, spreadsheets, and more.
  • Provides essential packages for cleaning and transforming data.
  • Enables the application of ML algorithms to predict future events.
  • Supports analysis of unstructured data through NoSQL database interfaces.

Syntax and Variables in R

In R, we use the <- operator to assign values to variables, though = is also commonly used. You can add comments to your code with the # symbol; everything after # on a line is ignored by R. Commenting your code is good practice because it makes the code easier to understand later.

R
x <- 5    # Assigns the value 5 to x
y <- 3    # Assigns the value 3 to y
sum_result <- x + y
product_result <- x * y

print(paste('Sum of x and y: ', sum_result))
print(paste('Product of x and y: ', product_result))

Output
[1] "Sum of x and y:  8"
[1] "Product of x and y:  15"

Data Types and Structure in R

In R, data is stored in various structures, such as vectors, matrices, lists, and data frames. Let’s break each one down.

1. Vectors: Vectors are like simple arrays that hold multiple values of the same type. You can create a vector using the c() function:

R
# Creating Vector in R 
vector <- c(1, 2, 3, 4, 5)  
print(vector)

Output
[1] 1 2 3 4 5

2. Matrices: Matrices are two-dimensional arrays where each element has the same data type. You create a matrix using the matrix() function:

R
# Creating Matrix in R 
matrix_data <- matrix(1:9, nrow = 3, ncol = 3) 
print(matrix_data)

Output
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

3. Lists: Lists can contain elements of different types, including numbers, strings, vectors, and even other lists. Lists are created using the list() function:

R
# Creating list in R 
list_data <- list("Red", 20, TRUE, 1:5)
print(list_data)

Output
[[1]]
[1] "Red"

[[2]]
[1] 20

[[3]]
[1] TRUE

[[4]]
[1] 1 2 3 4 5

4. Data Frames: Data frames are the most commonly used data structure in R. They’re like tables, where each column can contain different data types. Use data.frame() to create one:

R
# Creating DataFrame in R 
data_frame <- data.frame(Name = c("Alice", "Bob"), Age = c(24, 28))
print(data_frame)

Output
   Name Age
1 Alice  24
2   Bob  28
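
Once these structures exist, the next step is usually to pull values back out of them. The following is a minimal sketch of indexing each structure; the short variable names (v, m, l, df) are chosen here for illustration and are not part of the examples above.

```r
# Illustrative objects mirroring the four structures described above
v  <- c(1, 2, 3, 4, 5)
m  <- matrix(1:9, nrow = 3, ncol = 3)
l  <- list("Red", 20, TRUE, 1:5)
df <- data.frame(Name = c("Alice", "Bob"), Age = c(24, 28))

second_value <- v[2]      # vectors use 1-based indexing
cell         <- m[2, 3]   # row 2, column 3 (the matrix is filled column by column)
fourth_item  <- l[[4]]    # [[ ]] extracts the element itself, not a sub-list
ages         <- df$Age    # $ extracts one column as a vector

print(second_value)
print(cell)
print(fourth_item)
print(ages)

# str() gives a compact overview of any object's structure
str(df)
```

Note that single brackets on a list, such as l[4], return a list of length one rather than the element it holds.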

These foundational concepts are a great starting point for your journey into data science. To dive deeper, consider exploring the following tutorial: R Programming Tutorial

R has several libraries that are widely used in data science, for tasks ranging from data manipulation and statistical modeling to visualization and machine learning. The key libraries include:

  • dplyr
  • tidyr
  • ggplot2
  • xgboost
  • shiny
  • data.table

Data Manipulation with R Programming

R Libraries are effective for data manipulation, enabling analysts to clean, transform, and summarize datasets efficiently.

Using dplyr for Data Manipulation

The dplyr package provides a set of functions that make it easy to manipulate data frames in a clean and readable manner. Some of the key functions in dplyr include:

  • filter(): Filters rows based on conditions.
  • select(): Selects specific columns.
  • mutate(): Adds or modifies columns.
  • arrange(): Orders rows by specified columns.
  • summarize(): Summarizes data by applying functions (e.g., mean, sum).

Let's perform data manipulation with some of the above functions on a sample dataset:

R
install.packages("dplyr")
library(dplyr)

data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  Age = c(24, 28, 35, 40, 22),
  Salary = c(50000, 60000, 70000, 80000, 45000)
)

# Filters rows based on conditions
filtered_data <- filter(data, Age > 25)
print("Filtered Data (Age > 25):")
print(filtered_data)

# Selects specific columns
selected_data <- select(data, Name, Salary)
print("Selected Data (Name and Salary columns):")
print(selected_data)

Output:

[1] "Filtered Data (Age > 25):"
     Name Age Salary
1     Bob  28  60000
2 Charlie  35  70000
3   David  40  80000

[1] "Selected Data (Name and Salary columns):"
     Name Salary
1   Alice  50000
2     Bob  60000
3 Charlie  70000
4   David  80000
5     Eve  45000
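
The remaining functions from the list, arrange() and summarize(), can be sketched in the same way. The example below assumes dplyr is already installed and rebuilds the sample data under the illustrative name employees; it also chains several steps with the %>% pipe that dplyr provides.

```r
library(dplyr)

# Same sample data as in the example above, under an illustrative name
employees <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  Age = c(24, 28, 35, 40, 22),
  Salary = c(50000, 60000, 70000, 80000, 45000)
)

# Order rows from highest to lowest salary
sorted_data <- arrange(employees, desc(Salary))

# Collapse the data frame to a single row of summary statistics
summary_data <- summarize(employees, Mean_Age = mean(Age), Total_Salary = sum(Salary))

# Chain several steps: each result is passed on as the first argument of the next call
result <- employees %>%
  filter(Age > 25) %>%
  select(Name, Salary) %>%
  arrange(Salary)

print(sorted_data)
print(summary_data)
print(result)
```

The piped version reads top to bottom in the order the operations are applied, which is why it is a common style in dplyr code.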

Data Cleaning and Transformation

Data cleaning involves correcting or removing errors and transforming data into a usable format. Key transformations include:

  • rename(): to rename columns
  • as.character(): to change the data type
  • mutate(): to create derived variables

Now, we will be using the previous dataset to perform data transformation:

R
# Renaming columns
data_renamed <- rename(data, Employee_Name = Name, Employee_Age = Age)
print("Renamed Data (Name to Employee_Name, Age to Employee_Age):")
print(data_renamed)

Output

[1] "Renamed Data (Name to Employee_Name, Age to Employee_Age):"
  Employee_Name Employee_Age Salary
1         Alice           24  50000
2           Bob           28  60000
3       Charlie           35  70000
4         David           40  80000
5           Eve           22  45000
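
The other two transformations in the list, mutate() and as.character(), can be sketched as follows. This assumes dplyr is installed; the data frame name employees and the column names Monthly_Salary and Age_Label are hypothetical and chosen only for this illustration.

```r
library(dplyr)

employees <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  Age = c(24, 28, 35, 40, 22),
  Salary = c(50000, 60000, 70000, 80000, 45000)
)

# Add a derived numeric column and a character version of Age
# (Monthly_Salary and Age_Label are hypothetical column names)
transformed <- mutate(
  employees,
  Monthly_Salary = Salary / 12,
  Age_Label = as.character(Age)
)

print(transformed)
print(class(transformed$Age_Label))
```

The original columns are left unchanged; mutate() returns a new data frame with the extra columns appended.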

Handling Missing Values

Dealing with missing values is an essential part of data preparation. R provides several functions to identify, handle, and replace missing values in datasets. Key functions include:

  • is.na(): To identify missing values in the data.
  • na.omit(): To remove rows with missing values.
  • ifelse(): To replace missing values with a specific value or calculated result.
  • tidyr::fill(): To fill missing values using the previous or next non-missing value in the column.
R
data_missing <- data.frame(
  Name = c("Alice", "Bob", "Charlie", NA, "Eve"),
  Age = c(24, 28, 35, NA, 22),
  Salary = c(50000, NA, 70000, 80000, 45000)
)

# Identifying missing values
missing_data <- is.na(data_missing)
print("Identifying Missing Values:")
print(missing_data)

# Fill missing values 
install.packages("tidyr")
library(tidyr)
data_filled <- fill(data_missing, Age, .direction = "down")
print("Data After Filling Missing Values in Age (Downward Direction):")
print(data_filled)

Output:

[1] "Identifying Missing Values:"
      Name   Age Salary
[1,] FALSE FALSE  FALSE
[2,] FALSE FALSE   TRUE
[3,] FALSE FALSE  FALSE
[4,]  TRUE  TRUE  FALSE
[5,] FALSE FALSE  FALSE

[1] "Data After Filling Missing Values in Age (Downward Direction):"
     Name Age Salary
1   Alice  24  50000
2     Bob  28     NA
3 Charlie  35  70000
4    <NA>  35  80000
5     Eve  22  45000
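
The example above covers is.na() and fill(). As a complementary sketch using only base R, na.omit() and ifelse() from the list can be applied to the same data; replacing a missing salary with the column mean is just one possible imputation choice, assumed here for illustration.

```r
data_missing <- data.frame(
  Name = c("Alice", "Bob", "Charlie", NA, "Eve"),
  Age = c(24, 28, 35, NA, 22),
  Salary = c(50000, NA, 70000, 80000, 45000)
)

# Count missing values per column
na_counts <- colSums(is.na(data_missing))
print(na_counts)

# Drop every row that contains at least one NA
complete_rows <- na.omit(data_missing)
print(complete_rows)

# Replace missing salaries with the mean of the observed salaries
mean_salary <- mean(data_missing$Salary, na.rm = TRUE)
data_imputed <- data_missing
data_imputed$Salary <- ifelse(is.na(data_imputed$Salary), mean_salary, data_imputed$Salary)
print(data_imputed)
```

Dropping rows is simple but discards information, while imputation keeps every row at the cost of introducing estimated values.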

Statistical Analysis in R

R provides tools for performing both descriptive and inferential statistical analysis, making it a preferred choice for statisticians and data scientists.

Descriptive Statistics

Descriptive statistics provide a summary of the data's key characteristics using measures like mean, median, variance, and standard deviation.

  • mean(): Calculates the average of a dataset.
  • median(): Identifies the middle value in a dataset.
  • sd(): Computes the standard deviation.
  • summary(): Provides a summary of key descriptive statistics.
R
# Define a vector with numeric values
vector <- c(10, 20, 30, 40, 50)

# Calculate the mean of the vector
mean_value <- mean(vector)
# Calculate the median of the vector
median_value <- median(vector) 
# Calculate the sum of the vector
total_sum <- sum(vector)

# Output the results
print(paste("Mean:", mean_value))
print(paste("Median:", median_value))
print(paste("Sum:", total_sum))

Output
[1] "Mean: 30"
[1] "Median: 30"
[1] "Sum: 150"
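
The example above does not exercise sd() or summary() from the list. A short base R sketch on the same numbers, stored under the illustrative name values:

```r
values <- c(10, 20, 30, 40, 50)

# Sample variance and standard deviation (both use the n - 1 denominator)
variance_value <- var(values)
sd_value <- sd(values)

# Minimum, quartiles, median, mean and maximum in one call
stats_summary <- summary(values)

print(paste("Variance:", variance_value))
print(paste("Standard deviation:", round(sd_value, 3)))
print(stats_summary)
```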

Inferential Statistics

Inferential statistics allow you to make predictions or generalizations about a population based on sample data.

1. Hypothesis Testing

Hypothesis Testing evaluates assumptions (hypotheses) about population parameters. In R, common hypothesis tests include:

  • t.test(): Performs t-tests to compare means between two groups.
  • aov(): Conducts Analysis of Variance (ANOVA) to compare means among three or more groups
  • chisq.test(): Performs Chi-Square tests for independence or goodness of fit.
  • wilcox.test(): A non-parametric test that compares two independent samples (Wilcoxon rank-sum test).
  • ks.test(): The Kolmogorov-Smirnov test compares two distributions to see if they are the same.
  • fisher.test(): Fisher's exact test is used for small sample sizes in contingency tables.
R
# T-test to compare means between two groups
group1 <- c(1, 2, 3, 4, 5)
group2 <- c(6, 7, 8, 9, 10)
t_test_result <- t.test(group1, group2)
print("T-test Result:")
print(t_test_result)

# Chi-Square test for independence
data_chisq <- matrix(c(10, 20, 20, 40), nrow = 2, byrow = TRUE)
chisq_result <- chisq.test(data_chisq)
print("Chi-Square Test Result:")
print(chisq_result)

Output:

[1] "T-test Result:"

Welch Two Sample t-test

data: group1 and group2
t = -5, df = 8, p-value = 0.001053
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-7.306004 -2.693996
sample estimates:
mean of x mean of y
3 8


[1] "Chi-Square Test Result:"

Pearson's Chi-squared test

data: data_chisq
X-squared = 0, df = 1, p-value = 1
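
Two more tests from the list, aov() and wilcox.test(), follow the same pattern. The sketch below uses small made-up groups (labelled A, B and C) purely for illustration.

```r
# One-way ANOVA: compare the means of three illustrative groups
scores <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
group <- factor(rep(c("A", "B", "C"), each = 5))
anova_model <- aov(scores ~ group)
anova_table <- summary(anova_model)
print(anova_table)

# The p-value for the group effect is stored in the first row of the table
anova_p <- anova_table[[1]][["Pr(>F)"]][1]

# Wilcoxon rank-sum test: a non-parametric comparison of two independent samples
wilcox_result <- wilcox.test(c(1, 2, 3, 4, 5), c(6, 7, 8, 9, 10))
print(wilcox_result)
```

In both cases a small p-value indicates that the observed difference between groups would be unlikely if the groups came from the same distribution.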

2. Correlation and Regression Analysis

These techniques explore relationships between variables:

  1. Correlation Analysis: Measures the strength and direction of a linear relationship between two variables using cor().
  2. Regression Analysis: Models the relationship between a response and one or more predictors using lm() (linear regression).
R
# Correlation Analysis using cor(): Measure the strength and direction of a linear relationship
x <- c(1, 2, 3, 4, 5)
y <- c(5, 4, 3, 2, 1)
correlation_result <- cor(x, y)
print("Correlation Between x and y:")
print(correlation_result)

Output:

[1] "Correlation Between x and y:"
[1] -1
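
Regression analysis with lm() can be sketched in the same spirit. The data below (hours studied and exam score) are made up for illustration and are not part of the correlation example.

```r
# Hypothetical data: hours studied and the resulting exam score
study <- data.frame(
  hours = c(1, 2, 3, 4, 5),
  score = c(52, 55, 61, 64, 70)
)

# Fit score as a linear function of hours
model <- lm(score ~ hours, data = study)
coefs <- coef(model)
print(coefs)

# Predict the score for a new value of the predictor
predicted_score <- predict(model, newdata = data.frame(hours = 6))
print(predicted_score)
```

The slope coefficient estimates how much the response changes for a one-unit increase in the predictor; summary(model) additionally reports standard errors and R-squared.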

Machine Learning with R

Machine learning in R enables analysts to build predictive models, perform classification, and uncover patterns in data.

Supervised Learning

1. Linear Regression: Linear regression predicts a continuous numeric outcome from one or more predictors. In R, a linear model is fitted with lm().

R
# Sample Dataset 
set.seed(123)
train_data <- data.frame(
  predictor1 = rnorm(100, mean = 50, sd = 10),
  predictor2 = rnorm(100, mean = 30, sd = 5),
  target = rnorm(100, mean = 100, sd = 15)
)

model_lr <- lm(target ~ predictor1 + predictor2, data = train_data)
pred_lr <- predict(model_lr, newdata = train_data)
head(pred_lr)
mse <- mean((train_data$target - pred_lr)^2)
mse

Output:

197.509197666493

2. Logistic Regression: Logistic regression is used for binary classification tasks where the outcome variable is categorical (e.g., 0 or 1). In R, it is performed using the glm() function with family = binomial.

R
set.seed(123)
train_data_logistic <- data.frame(
  predictor1 = rnorm(100, mean = 50, sd = 10),
  predictor2 = rnorm(100, mean = 30, sd = 5),
  target = sample(0:1, 100, replace = TRUE)
)

# Fit Logistic Regression model
model_logistic <- glm(target ~ predictor1 + predictor2, family = binomial, data = train_data_logistic)
pred_logistic <- predict(model_logistic, newdata = train_data_logistic, type = "response")
pred_logistic_class <- ifelse(pred_logistic > 0.5, 1, 0)  # Convert probabilities to binary predictions

accuracy_logistic <- mean(pred_logistic_class == train_data_logistic$target)
accuracy_logistic

Output:

0.63

3. Decision Trees: Decision trees are used for both classification and regression tasks. In this example, we perform classification using rpart() function:

R
install.packages("rpart")
library(rpart)

set.seed(123)
train_data_tree <- data.frame(
  predictor1 = rnorm(100, mean = 50, sd = 10),
  predictor2 = rnorm(100, mean = 30, sd = 5),
  target = sample(0:1, 100, replace = TRUE)
)

# Fit Decision Tree model
model_tree <- rpart(target ~ predictor1 + predictor2, data = train_data_tree, method = "class")
pred_tree <- predict(model_tree, newdata = train_data_tree, type = "class")
accuracy_tree <- mean(pred_tree == train_data_tree$target)
accuracy_tree

Output:

0.72

4. Random Forest: Random forest is an ensemble learning technique that combines many decision trees to perform classification or regression. In R, it is available through the randomForest() function.

R
install.packages("randomForest")
library(randomForest)

set.seed(123)
train_data_rf <- data.frame(
  predictor1 = rnorm(100, mean = 50, sd = 10),
  predictor2 = rnorm(100, mean = 30, sd = 5),
  target = sample(0:1, 100, replace = TRUE)  
)

train_data_rf$target <- factor(train_data_rf$target, levels = c(0, 1))

# Random Forest model
model_rf <- randomForest(target ~ predictor1 + predictor2, data = train_data_rf)
pred_rf <- predict(model_rf, newdata = train_data_rf)
accuracy_rf <- mean(pred_rf == train_data_rf$target)
print(paste("Random Forest Accuracy: ", accuracy_rf))

Output:

Random Forest Accuracy: 1
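
All of the accuracies above are computed on the same data the models were trained on, so they tend to overstate how well a model generalizes (a perfect score on randomly generated labels is a typical symptom). A minimal base R sketch of a hold-out evaluation follows; the simulated data, the 70/30 split and the helper function accuracy() are assumptions made for this illustration.

```r
set.seed(123)
n <- 200
sim <- data.frame(
  predictor1 = rnorm(n, mean = 50, sd = 10),
  predictor2 = rnorm(n, mean = 30, sd = 5)
)
# Hypothetical target whose probability of being 1 increases with predictor1
sim$target <- rbinom(n, size = 1, prob = plogis((sim$predictor1 - 50) / 5))

# Randomly assign 70% of the rows to training and the rest to testing
train_idx <- sample(seq_len(n), size = round(0.7 * n))
train_set <- sim[train_idx, ]
test_set <- sim[-train_idx, ]

model <- glm(target ~ predictor1 + predictor2, family = binomial, data = train_set)

# Helper (illustrative): share of correct 0/1 predictions at a 0.5 threshold
accuracy <- function(fit, newdata) {
  prob <- predict(fit, newdata = newdata, type = "response")
  mean(ifelse(prob > 0.5, 1, 0) == newdata$target)
}

train_accuracy <- accuracy(model, train_set)
test_accuracy <- accuracy(model, test_set)
print(paste("Training accuracy:", round(train_accuracy, 3)))
print(paste("Test accuracy:", round(test_accuracy, 3)))
```

The test accuracy is the figure to report, since it is measured on rows the model never saw during fitting.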

Unsupervised Learning

Unsupervised learning involves learning patterns in data without labeled outputs. Common techniques include clustering and dimensionality reduction.

1. K-means Clustering: K-means partitions the data into K clusters based on the distance between data points. In R, the kmeans() function is used to perform clustering.

R
set.seed(123)
data <- data.frame(
  predictor1 = rnorm(100, mean = 50, sd = 10),
  predictor2 = rnorm(100, mean = 30, sd = 5),
  target = sample(0:1, 100, replace = TRUE)  
)

# Perform K-means clustering
model_kmeans <- kmeans(data[, -3], centers = 3)  
cluster_centers <- model_kmeans$centers  
cluster_assignments <- model_kmeans$cluster 
withinss <- model_kmeans$tot.withinss 

print("Cluster Centers:")
print(cluster_centers)

print("Cluster Assignments:")
print(cluster_assignments)

print("Total Within-Cluster Sum of Squares:")
print(withinss)

Output:

[1] "Cluster Centers:"
  predictor1 predictor2
1   62.48318   27.73121
2   51.24186   30.80630
3   41.05266   29.10471

[1] "Cluster Assignments:"
[1] 3 2 1 2 2 1 2 3 3 2 1 2 2 2 3 1 2 3 1 3 3 2 3 3 3 3 1 2 3 1 2 2 1 1 1 2 1
[38] 2 2 3 3 2 3 1 1 3 3 3 2 2 2 2 2 1 2 1 3 2 2 2 2 3 3 3 3 2 2 2 1 1 3 3 1 3
[75] 3 1 2 3 2 2 2 2 3 1 2 2 1 2 2 1 1 2 2 3 1 3 1 1 2 3

[1] "Total Within-Cluster Sum of Squares:"
[1] 3809.048
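
The number of clusters (3 above) has to be chosen by the analyst. One common heuristic, sketched here as an assumption rather than as part of the original example, is the elbow method: run kmeans() for several values of K on standardized data and look for the point where the total within-cluster sum of squares stops dropping sharply.

```r
set.seed(123)
pts <- data.frame(
  predictor1 = rnorm(100, mean = 50, sd = 10),
  predictor2 = rnorm(100, mean = 30, sd = 5)
)

# Standardize so both predictors contribute equally to the distances
scaled_pts <- scale(pts)

# Total within-cluster sum of squares for K = 1..6;
# nstart = 10 reruns each fit from 10 random starts and keeps the best
k_values <- 1:6
wss <- sapply(k_values, function(k) {
  kmeans(scaled_pts, centers = k, nstart = 10)$tot.withinss
})

print(data.frame(k = k_values, wss = round(wss, 2)))
```

Because this simulated data has no real cluster structure, the curve is not expected to show a clear elbow; on real data a visible bend suggests a reasonable K.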

2. Principal Component Analysis (PCA): PCA transforms the data into a new coordinate system whose axes (the principal components) point in the directions of maximum variance. In R, PCA is performed using the prcomp() function.

R
set.seed(123)
data_pca <- data.frame(
  predictor1 = rnorm(100, mean = 50, sd = 10),
  predictor2 = rnorm(100, mean = 30, sd = 5),
  predictor3 = rnorm(100, mean = 60, sd = 15)
)

# Perform PCA
pca_result <- prcomp(data_pca, center = TRUE, scale. = TRUE)
summary(pca_result)  

Output

Importance of components:
                          PC1    PC2    PC3
Standard deviation     1.0726 0.9900 0.9324
Proportion of Variance 0.3835 0.3267 0.2898
Cumulative Proportion  0.3835 0.7102 1.0000
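
Beyond summary(), the object returned by prcomp() holds the pieces needed for dimensionality reduction. The sketch below refits the same PCA and extracts the loadings and scores; keeping the first two components is an arbitrary choice made for illustration.

```r
set.seed(123)
data_pca <- data.frame(
  predictor1 = rnorm(100, mean = 50, sd = 10),
  predictor2 = rnorm(100, mean = 30, sd = 5),
  predictor3 = rnorm(100, mean = 60, sd = 15)
)
pca_result <- prcomp(data_pca, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component
variance_explained <- pca_result$sdev^2 / sum(pca_result$sdev^2)
print(round(variance_explained, 4))

# Loadings: how each original variable contributes to each component
loadings <- pca_result$rotation
print(loadings)

# Scores: the observations expressed in the new coordinate system
scores <- pca_result$x

# Keep only the first two components as a reduced representation
reduced <- scores[, 1:2]
print(head(reduced))
```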

Model Evaluation

After building a model, it’s essential to evaluate its performance. We can evaluate models using the following metrics:

1. Classification Evaluation Metrics

  • Precision
  • Recall
  • F1 Score
  • AUC-ROC
  • Confusion Matrix
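
Most of these classification metrics can be computed in base R from a confusion matrix. The sketch below uses short made-up vectors of actual and predicted labels; AUC-ROC is left out because it requires predicted probabilities rather than class labels.

```r
# Hypothetical true labels and model predictions (1 = positive class)
actual    <- c(1, 0, 1, 1, 0, 1, 0, 0, 1, 0)
predicted <- c(1, 0, 0, 1, 0, 1, 1, 0, 1, 0)

# Confusion matrix: rows are actual classes, columns are predicted classes
confusion <- table(Actual = actual, Predicted = predicted)
print(confusion)

tp <- confusion["1", "1"]  # true positives
fp <- confusion["0", "1"]  # false positives
fn <- confusion["1", "0"]  # false negatives

precision <- tp / (tp + fp)  # share of predicted positives that are correct
recall    <- tp / (tp + fn)  # share of actual positives that are found
f1_score  <- 2 * precision * recall / (precision + recall)

print(paste("Precision:", precision))
print(paste("Recall:", recall))
print(paste("F1 Score:", f1_score))
```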

2. Regression Evaluation Metrics

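  • Cross-validation (a resampling procedure for estimating the metrics below on unseen data, rather than a metric itself)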
  • RMSE (Root Mean Squared Error)
  • Mean Absolute Error (MAE)
  • R-squared
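
The regression metrics follow directly from their definitions. A minimal base R sketch on made-up actual and predicted values:

```r
# Hypothetical observed values and model predictions
actual    <- c(3, 5, 7, 9)
predicted <- c(2.5, 5.5, 7, 10)

errors <- actual - predicted

rmse <- sqrt(mean(errors^2))   # Root Mean Squared Error
mae  <- mean(abs(errors))      # Mean Absolute Error

# R-squared: 1 minus the ratio of residual to total sum of squares
ss_res <- sum(errors^2)
ss_tot <- sum((actual - mean(actual))^2)
r_squared <- 1 - ss_res / ss_tot

print(paste("RMSE:", round(rmse, 4)))
print(paste("MAE:", mae))
print(paste("R-squared:", r_squared))
```

RMSE penalizes large errors more heavily than MAE because the errors are squared before averaging.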

Time Series Analysis in R

R provides multiple functions for creating, manipulating and analyzing time series data.

ts() function in R

The ts() function is used to convert a numeric vector into a time series object, where you can specify the start date and the frequency of the data (e.g., monthly, quarterly).
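
As a minimal sketch, the following builds a monthly series from made-up sales figures; the numbers and the variable names are hypothetical.

```r
# Hypothetical monthly sales for one year
monthly_sales <- c(120, 135, 150, 145, 160, 175, 190, 185, 170, 165, 180, 200)

# start = c(2023, 1) means January 2023; frequency = 12 means monthly data
sales_ts <- ts(monthly_sales, start = c(2023, 1), frequency = 12)
print(sales_ts)

# window() extracts a sub-period, here April to June 2023
second_quarter <- window(sales_ts, start = c(2023, 4), end = c(2023, 6))
print(second_quarter)
```

For quarterly data the same call would use frequency = 4, with start given as c(year, quarter).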

Decomposition of Time Series

In R, the decompose() function is used for decomposing time series into trend, seasonal, and residual components.

For more advanced decomposition, you can use STL (Seasonal and Trend decomposition using Loess), which is more robust for irregular seasonality. It is implemented using stl() function.
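
Both approaches can be sketched on AirPassengers, a monthly dataset that ships with R. The choice of a multiplicative model for decompose() and of a log transform with s.window = "periodic" for stl() are assumptions made for this illustration.

```r
# AirPassengers: monthly airline passenger counts, 1949 to 1960 (built-in dataset)
data("AirPassengers")

# Classical decomposition; multiplicative because the seasonal swings grow with the level
classical <- decompose(AirPassengers, type = "multiplicative")
print(round(classical$figure, 3))  # the 12 estimated seasonal factors

# STL works additively, so the series is log-transformed first
stl_fit <- stl(log(AirPassengers), s.window = "periodic")
print(head(stl_fit$time.series))   # columns: seasonal, trend, remainder
```

Calling plot(classical) or plot(stl_fit) draws the components in stacked panels, which is usually the quickest way to inspect a decomposition.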

Time Series Forecasting using R

  • ARIMA Model: The auto.arima() function from the forecast package can automatically select the best ARIMA model for the given time series data based on criteria like AIC (Akaike Information Criterion).
  • SARIMA Model: The auto.arima() function in R can also be used to fit a SARIMA model by automatically selecting the seasonal components.
  • Exponential Smoothing (ETS): Another popular forecasting technique is Exponential Smoothing, which is available in R through the ets() function from the forecast package.
  • Prophet: For handling seasonality and holidays, Facebook's Prophet model can be used. The function used to perform forecasting is prophet(). It is particularly useful for forecasting time series data with strong seasonal effects and missing data.
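
The functions above come from add-on packages (forecast and prophet). As a dependency-free sketch of the same idea, the example below swaps in base R's arima() with manually chosen orders instead of auto.arima(); the seasonal ARIMA(0,1,1)(0,1,1) specification on the log of AirPassengers is an assumption made here, not a model selected by the text.

```r
data("AirPassengers")

# Seasonal ARIMA fitted to the log series with manually specified orders
# (auto.arima() from the forecast package would search for these automatically)
fit <- arima(
  log(AirPassengers),
  order = c(0, 1, 1),
  seasonal = list(order = c(0, 1, 1), period = 12)
)
print(fit)

# Forecast the next 12 months and convert back from the log scale
fc <- predict(fit, n.ahead = 12)
forecast_passengers <- exp(fc$pred)
print(round(forecast_passengers))
```

predict() also returns fc$se, the standard errors on the log scale, which can be used to build approximate prediction intervals.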

Difference Between R Programming and Python Programming

| Feature | R | Python |
|---|---|---|
| Introduction | R is a language and environment designed for statistical programming, computing, and graphics. | Python is a general-purpose programming language used for data analysis and scientific computing. |
| Objective | Focuses on statistical analysis and data visualization. | Supports a wide range of applications, including GUI development, web development, and embedded systems. |
| Workability | Offers numerous easy-to-use packages for statistical tasks. | Excels in matrix computation, optimization, and general-purpose tasks. |
| Integrated Development Environment (IDE) | Popular IDEs include RStudio, RKWard, and R Commander. | Common IDEs are Spyder, Eclipse+PyDev, Atom, and more. |
| Libraries and Packages | Includes packages like ggplot2 for visualization and caret for machine learning. | Features libraries like Pandas, NumPy, and SciPy for data manipulation and analysis. |
| Scope | Primarily used for complex statistical analysis and data science projects. | Offers a streamlined approach for data science, along with versatility in other domains. |

R is ideal for statistical computing and visualization, while Python provides a more versatile platform for diverse applications, including data science.

Top Companies Using R for Data Science

  • Google: Utilizes R for analytical operations, including the Google Flu Trends project, which analyzes flu-related search trends.
  • Facebook: Leverages R for social network analytics, gaining user insights and analyzing user relationships.
  • IBM: A major investor in R, IBM uses it for developing analytical solutions, including in IBM Watson.
  • Uber: Employs R’s Shiny package for interactive web applications and embedding dynamic visual graphics.

AmiyaRanjanRout
Improve
Article Tags :
  • Data Science
  • R Language
  • Write From Home
  • AI-ML-DS
  • data-science
  • AI-ML-DS With R

Similar Reads

  • Data Science Tutorial
    Data Science is a field that combines statistics, machine learning and data visualization to extract meaningful insights from vast amounts of raw data and make informed decisions, helping businesses and industries to optimize their operations and predict future trends.This Data Science tutorial offe
    3 min read
  • Fundamental of Data Science

    • What is Data Science?
      Data science is the study of data that helps us derive useful insight for business decision making. Data Science is all about using tools, techniques, and creativity to uncover insights hidden within data. It combines math, computer science, and domain expertise to tackle real-world challenges in a
      8 min read

    • What Are the Roles and Responsibilities of a Data Scientist?
      In the world of data space, the era of Big Data emerged when organizations are dealing with petabytes and exabytes of data. It became very tough for industries for the storage of data until 2010. Now when the popular frameworks like Hadoop and others solved the problem of storage, the focus is on pr
      5 min read

    • Top 10 Data Science Job Profiles
      Data Science refers to the study of data to extract the most useful insights for the business or the organization. It is the topmost highly demanding field world of technology. Day by day the increasing demand of data enthusiasts is making data science a popular field. Data science is a type of appr
      8 min read

    • Applications of Data Science
      Data Science is the deep study of a large quantity of data, which involves extracting some meaning from the raw, structured, and unstructured data. Extracting meaningful data from large amounts usesalgorithms processing of data and this processing can be done using statistical techniques and algorit
      6 min read

    • Data Science vs Data Analytics
      In this article, we will discuss the differences between the two most demanded fields in Artificial intelligence that is data science, and data analytics.What is Data Science Data Science is a field that deals with extracting meaningful information and insights by applying various algorithms preproc
      3 min read

    • Data Science Vs Machine Learning : Key Differences
      In the 21st Century, two terms "Data Science" and "Machine Learning" are some of the most searched terms in the technology world. From 1st-year Computer Science students to big Organizations like Netflix, Amazon, etc are running behind these two techniques. Both fields have grown exponentially due t
      5 min read

    • Difference Between Data Science and Business Intelligence
      While they have different uses, business intelligence (BI) and data science are both essential for making data-driven decisions. Data science is the study of finding patterns and forecasts through sophisticated analytics, machine learning, and algorithms. In contrast, the main function of business i
      4 min read

    • Data Science Fundamentals
      In the world of data space, the era of Big Data emerged when organizations began dealing with petabytes and exabytes of data. It became very tough for industries the store data until 2010. Now, the popular frameworks like Hadoop and others have solved the problem of storage, the focus is on processi
      15+ min read

    • Data Science Lifecycle
      Data Science Lifecycle revolves around the use of machine learning and different analytical strategies to produce insights and predictions from information in order to acquire a commercial enterprise objective. The complete method includes a number of steps like data cleaning, preparation, modelling
      6 min read

    • Math for Data Science
      Data Science is a large field that requires vast knowledge and being at a beginner's level, that's a fair question to ask "How much maths is required to become a Data Scientist?" or "How much do you need to know in Data Science?". The point is when you'll be working on solving real-life problems, yo
      5 min read

    Programming Language for Data Science

    • Python for Data Science - Learn the Uses of Python in Data Science
      In this Python for Data Science guide, we'll explore the exciting world of Python and its wide-ranging applications in data science. We will also explore a variety of data science techniques used in data science using the Python programming language. We all know that data Science is applied to gathe
      6 min read

    • R Programming for Data Science
      R is an open-source programming language used statistical software and data analysis tools. It is an important tool for Data Science. It is highly popular and is the first choice of many statisticians and data scientists.R includes powerful tools for creating aesthetic and insightful visualizations.
      13 min read

    • SQL for Data Science
      Mastering SQL (Structured Query Language) has become a fundamental skill for anyone pursuing a career in data science. As data plays an increasingly central role in business and technology, SQL has emerged as the most essential tool for managing and analyzing large datasets. Data scientists rely on
      7 min read

    Complete Data Science Program

    • Data Science Tutorial
      Data Science is a field that combines statistics, machine learning and data visualization to extract meaningful insights from vast amounts of raw data and make informed decisions, helping businesses and industries to optimize their operations and predict future trends.This Data Science tutorial offe
      3 min read

    • Learn Data Science Tutorial With Python
      Data Science has become one of the fastest-growing fields in recent years, helping organizations to make informed decisions, solve problems and understand human behavior. As the volume of data grows so does the demand for skilled data scientists. The most common languages used for data science are P
      3 min read

    Data Analysis tutorial

    • Data Analysis (Analytics) Tutorial
      Data Analytics is a process of examining, cleaning, transforming and interpreting data to discover useful information, draw conclusions and support decision-making. It helps businesses and organizations understand their data better, identify patterns, solve problems and improve overall performance.
      4 min read

    • Data Analysis with Python
      In this article, we will discuss how to do data analysis with Python. We will discuss all sorts of data analysis i.e. analyzing numerical data with NumPy, Tabular data with Pandas, data visualization Matplotlib, and Exploratory data analysis.Data Analysis With Python Data Analysis is the technique o
      15+ min read

    • Data analysis using R
      Data Analysis is a subset of data analytics, it is a process where the objective has to be made clear, collect the relevant data, preprocess the data, perform analysis(understand the data, explore insights), and then visualize it. The last step visualization is important to make people understand wh
      9 min read

    • Top 80+ Data Analyst Interview Questions and Answers
      Data is information, often in the form of numbers, text, or multimedia, that is collected and stored for analysis. It can come from various sources, such as business transactions, social media, or scientific experiments. In the context of a data analyst, their role involves extracting meaningful ins
      15+ min read

    Data Vizualazation Tutotrial

    • Python - Data visualization tutorial
      Data visualization is a crucial aspect of data analysis, helping to transform analyzed data into meaningful insights through graphical representations. This comprehensive tutorial will guide you through the fundamentals of data visualization using Python. We'll explore various libraries, including M
      7 min read

    • Data Visualization with Python
      In today's world, a lot of data is being generated on a daily basis. And sometimes to analyze this data for certain trends, patterns may become difficult if the data is in its raw format. To overcome this data visualization comes into play. Data visualization provides a good, organized pictorial rep
      14 min read

    • Data Visualization in R
      Data visualization is the technique used to deliver insights in data using visual cues such as graphs, charts, maps, and many others. This is useful as it helps in intuitive and easy understanding of the large quantities of data and thereby make better decisions regarding it.Data Visualization in R
      6 min read

    Machine Learning Tutorial

    • Machine Learning Tutorial
      Machine learning is a branch of Artificial Intelligence that focuses on developing models and algorithms that let computers learn from data without being explicitly programmed for every task. In simple words, ML teaches the systems to think and understand like humans by learning from the data.It can
      5 min read

    • Maths for Machine Learning
      Mathematics is the foundation of machine learning. Math concepts plays a crucial role in understanding how models learn from data and optimizing their performance. Before diving into machine learning algorithms, it's important to familiarize yourself with foundational topics, like Statistics, Probab
      5 min read

    • 100+ Machine Learning Projects with Source Code [2025]
      This article provides over 100 Machine Learning projects and ideas to provide hands-on experience for both beginners and professionals. Whether you're a student enhancing your resume or a professional advancing your career these projects offer practical insights into the world of Machine Learning an
      5 min read

    • Top 50+ Machine Learning Interview Questions and Answers
      Machine Learning involves the development of algorithms and statistical models that enable computers to improve their performance in tasks through experience. Machine Learning is one of the booming careers in the present-day scenario.If you are preparing for machine learning interview, this intervie
      15+ min read

    • Machine Learning with R
      Machine Learning as the name suggests is the field of study that allows computers to learn and take decisions on their own i.e. without being explicitly programmed. These decisions are based on the available data that is available through experiences or instructions. It gives the computer that makes
      2 min read

    Deep Learning & NLP Tutorial

    • Deep Learning Tutorial
      Deep Learning tutorial covers the basics and more advanced topics, making it perfect for beginners and those with experience. Whether you're just starting or looking to expand your knowledge, this guide makes it easy to learn about the different technologies of Deep Learning.Deep Learning is a branc
      5 min read

    • 5 Deep Learning Project Ideas for Beginners
      Well, irrespective of our age or domain or background knowledge some things succeed in fascinating us in a way such that we're so motivated to do something related to it. Artificial Intelligence is one such thing that needs nothing more than just a definition to attract anyone and everyone. To be pr
      6 min read

    • Deep Learning Interview Questions
      Deep learning is a part of machine learning that is based on the artificial neural network with multiple layers to learn from and make predictions on data. An artificial neural network is based on the structure and working of the Biological neuron which is found in the brain. Deep Learning Interview
      15+ min read

    • Natural Language Processing (NLP) Tutorial
      Natural Language Processing (NLP) is the branch of Artificial Intelligence (AI) that gives the ability to machine understand and process human languages. Human languages can be in the form of text or audio format.Applications of NLPThe applications of Natural Language Processing are as follows:Voice
      5 min read

    • Top 50 NLP Interview Questions and Answers 2024 Updated
      Natural Language Processing (NLP) is a key area in artificial intelligence that enables computers to understand, interpret, and respond to human language. It powers technologies like chatbots, voice assistants, translation services, and sentiment analysis, transforming how we interact with machines.
      15+ min read

    Computer Vision Tutorial

    • Computer Vision Tutorial
      Computer Vision is a branch of Artificial Intelligence (AI) that enables computers to interpret and extract information from images and videos, similar to human perception. It involves developing algorithms to process visual data and derive meaningful insights. Why Learn Computer Vision? High Demand i
      8 min read

    • 40+ Top Computer Vision Projects [2025 Updated]
      Computer Vision is a branch of Artificial Intelligence (AI) that helps computers understand and interpret the context of images and videos. It is used in domains like security cameras, photo editing, self-driving cars, and robots to recognize objects and navigate the real world using machine learning. This ar
      4 min read

  • Why Data Science Jobs Are in High Demand
    Jobs can help you pursue dreams that were put on hold. This is why many aspirants who fail to reach milestones in their businesses on the first attempt prefer to apply for a job they can pursue. In the same context, you need to know that Data Science jobs are trending in this pandemic era t
    6 min read
@GeeksforGeeks, Sanchhaya Education Private Limited, All rights reserved