BCE Report
Big Data
By
59 Steve Correia
60 Latesh Billava
62 Nathen Vaz
63 Nathen Carneiro
64 Justin Madhri
September 2022
Abstract
Big data is a new driver of world economic and societal change. The world’s data
collection is reaching a tipping point for major technological changes that can bring new ways
of making decisions and of managing our health, cities, finances and education. While data
complexities are increasing, including data’s volume, variety, velocity and veracity, the real
impact hinges on our ability to uncover the ‘value’ in the data through Big Data Analytics
technologies. Big Data Analytics poses a grand challenge for the design of highly scalable
algorithms and systems that integrate the data and uncover large hidden values from datasets
that are diverse, complex, and of massive scale. Potential breakthroughs include new
algorithms, methodologies, systems and applications in Big Data Analytics that discover useful
and hidden knowledge from Big Data efficiently and effectively. Big Data Analytics is
relevant to Hong Kong as it moves towards a digital economy and society, and Hong Kong is
already among the best in the world in Big Data Analytics. Big data analytics must also be a
team effort cutting across academic institutions, government, society and industry, driven by
researchers from multiple disciplines including computer science and engineering, health,
data science, and social and policy areas.
Acknowledgement
A project is always a coordinated, guided and scheduled team effort aimed at realizing a common
goal. We are grateful to all those people who have helped and guided us through this project and
made this experience worthwhile. We wish to sincerely thank our Director, Brother Shantilal
Kujur, our Principal, Dr. Sincy George, and our HOD of Information Technology, Dr. Prachi
Raut, for giving us this opportunity to prepare a project in the Third Year of Information
Technology. We are highly indebted to our institute, St. Francis Institute of Technology, and the
Department of Information Technology for providing us with this learning opportunity and the
resources required to accomplish our task so far. We are truly grateful to our mentor, Ms. Eden
Fernandes, who persistently guided us in the betterment of our project, report and presentation.
This work would not have been possible without her insights and intellectual suggestions, which
have helped us achieve so much. We also take this opportunity to thank all the teaching and
non-teaching staff for their endearing support and cooperation.
Table of Contents
1 Introduction
2 Problem Definition
3 Architecture
4 Components of Big Data
5 Technology
6 Applications
7 Conclusion
8 References
Chapter 1: Introduction
Big data is a broad term for data sets so large or complex that traditional data
processing applications are inadequate. Challenges include analysis, capture, data curation,
search, sharing, storage, transfer, visualization, and information privacy. The term often refers
simply to the use of predictive analytics or certain other advanced methods to extract
value from data, and seldom to a particular size of data set. Accuracy in big data may lead to
more confident decision making, and better decisions can mean greater operational efficiency,
cost reductions and reduced risk.
Data sets grow in size in part because they are increasingly being gathered by cheap
and numerous information-sensing mobile devices, aerial sensors (remote sensing), software logs,
cameras, microphones, radio-frequency identification (RFID) readers, and wireless sensor
networks. The world's technological per-capita capacity to store information has roughly
doubled every 40 months since the 1980s; as of 2012, 2.5 exabytes (2.5×10^18 bytes) of
data were created every day. The challenge for large enterprises is determining who should
own big data initiatives that straddle the entire organization.
Work with big data is necessarily uncommon; most analysis is of "PC size" data, on a
desktop PC or notebook that can handle the available data set.
Relational database management systems and desktop statistics and visualization
packages often have difficulty handling big data. The work instead requires "massively parallel
software running on tens, hundreds, or even thousands of servers". What is considered
"big data" varies depending on the capabilities of the users and their tools, and expanding
capabilities make Big Data a moving target.
Chapter 2: Problem Definition
Problem definition is probably one of the most complex and heavily neglected stages
in the big data analytics pipeline. Experience is essential in order to define the problem a
data product would solve, and most aspiring data scientists have little or no experience in
this stage. Most big data problems can be categorized in the following ways:
• Supervised Regression
In a regression problem, the response y ∈ ℝ is real valued. For example, we can develop a
model to predict the hourly salary of individuals given the corpus of their CVs.
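A supervised regression model of this kind can be sketched with a minimal example. The sketch below fits a one-feature linear model by ordinary least squares; the feature (years of experience), the salary figures, and the closed-form fit are all illustrative assumptions, not the method a real CV-based model would use.

```python
# Minimal sketch of supervised regression: predict hourly salary from a
# single numeric feature, fit with ordinary least squares (closed form).
# The training data below is invented purely for illustration.

def fit_linear(xs, ys):
    """Return (slope, intercept) minimising the squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    slope = num / den
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: (years of experience, hourly salary)
experience = [1, 3, 5, 7, 9]
salary = [12.0, 16.0, 20.0, 24.0, 28.0]

slope, intercept = fit_linear(experience, salary)

def predict(x):
    return slope * x + intercept
```

In practice the single feature would be replaced by many features extracted from the CV text, and the closed-form fit by a library model, but the supervised structure — real-valued response, labelled examples — is the same.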
• Unsupervised Learning
Management is often thirsty for new insights. Segmentation models can provide this insight in
order for the marketing department to develop products for different segments. A good
approach for developing a segmentation model, rather than thinking of algorithms, is to select
features that are relevant to the segmentation that is desired.
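One common way to build such a segmentation model is clustering. The sketch below runs a one-dimensional k-means over a hypothetical "monthly spend" feature; the data, the single feature, and the choice of k are all assumptions for illustration — a real segmentation would use several carefully selected features, as the text suggests.

```python
# Minimal sketch of unsupervised segmentation: 1-D k-means grouping
# customers by a single invented feature (monthly spend).
import random

def kmeans_1d(values, k, iters=20, seed=0):
    random.seed(seed)
    centroids = random.sample(values, k)       # initial centroids
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assign each value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)
        # Move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

spend = [10, 12, 11, 95, 100, 98, 50, 52]
centroids, clusters = kmeans_1d(spend, k=3)
```

Each resulting cluster is a candidate customer segment that the marketing department could target separately.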
• Learning to Rank
This problem can be considered a regression problem, but it has particular characteristics
and deserves separate treatment. Given a collection of documents and a query, we seek the
most relevant ordering of the documents for that query. In order to develop a supervised
learning algorithm, we need labels indicating how relevant an ordering is, given a query.
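The ranking setup above can be sketched as: score each document against the query, then order documents by descending score. The scorer here is simple term overlap and the documents are invented; in a learned system, a trained model would replace the hand-written scoring function.

```python
# Minimal sketch of ranking: score each document for a query, then sort
# documents by descending score. Term-overlap scoring stands in for a
# learned relevance model.

def score(query, document):
    """Fraction of query terms that appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(document.lower().split())
    return len(q_terms & d_terms) / len(q_terms)

def rank(query, documents):
    return sorted(documents, key=lambda d: score(query, d), reverse=True)

docs = [
    "cooking recipes for pasta",
    "big data analytics with hadoop",
    "hadoop cluster setup guide",
]
ordering = rank("hadoop big data", docs)
```

Supervision enters when the relevance labels for (query, document) pairs are used to train the scoring function instead of writing it by hand.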
Chapter 3: Architecture
• Data storage: Data for batch processing operations is typically stored in a distributed file
store that can hold high volumes of large files in various formats. This kind of store is
often called a data lake. Options for implementing this storage include Azure Data Lake
Store or blob containers in Azure Storage.
• Batch processing: Because the data sets are so large, often a big data solution must
process data files using long-running batch jobs to filter, aggregate, and otherwise
prepare the data for analysis. Usually these jobs involve reading source files, processing
them, and writing the output to new files.
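The read-process-write shape of a batch job can be sketched in miniature. The sketch below aggregates event counts from source files and writes the result to a new output file; the file names, the one-record-per-line "key,value" format, and the counting aggregation are invented for illustration — real jobs run the same pattern in parallel across many machines.

```python
# Minimal sketch of a batch job: read source files, aggregate event
# counts per key, and write the result to a new output file.
import os
import tempfile
from collections import Counter

def run_batch(source_paths, output_path):
    counts = Counter()
    for path in source_paths:            # read source files
        with open(path) as f:
            for line in f:               # one "key,value" record per line
                key, _, _ = line.partition(",")
                counts[key.strip()] += 1
    with open(output_path, "w") as out:  # write aggregated output
        for key, n in sorted(counts.items()):
            out.write(f"{key},{n}\n")

# Invented example data in a temporary directory.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "events-0001.csv")
with open(src, "w") as f:
    f.write("click,u1\nclick,u2\nview,u1\n")
run_batch([src], os.path.join(tmp, "daily-counts.csv"))
```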
• Real-time message ingestion: If the solution includes real-time sources, the
architecture must include a way to capture and store real-time messages for stream
processing. This might be a simple data store, where incoming messages are dropped into
a folder for processing.
• Stream processing: After capturing real-time messages, the solution must process
them by filtering, aggregating, and otherwise preparing the data for analysis. The
processed stream data is then written to an output sink.
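The filter-aggregate-emit cycle of stream processing can be sketched with a fixed-window count. Everything in the sketch — the timestamped messages, the ten-second window, the list standing in for an output sink — is an assumption for illustration; production systems use dedicated stream processors for the same pattern.

```python
# Minimal sketch of stream processing: consume time-ordered messages,
# count events per key over fixed time windows, and emit each completed
# window to an output sink (here, a plain list).
from collections import defaultdict

def process_stream(messages, window_seconds, sink):
    current_window = None
    counts = defaultdict(int)
    for ts, key in messages:                 # messages arrive in time order
        window = ts - ts % window_seconds    # start of this message's window
        if current_window is not None and window != current_window:
            sink.append((current_window, dict(counts)))  # emit closed window
            counts = defaultdict(int)
        current_window = window
        counts[key] += 1
    if current_window is not None:           # flush the final open window
        sink.append((current_window, dict(counts)))

sink = []
events = [(0, "click"), (5, "view"), (12, "click"), (14, "click")]
process_stream(events, window_seconds=10, sink=sink)
```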
• Analytical data store: Many big data solutions prepare data for analysis and then
serve the processed data in a structured format that can be queried using analytical tools.
The analytical data store used to serve these queries can be a Kimball-style relational
data warehouse.
• Analysis and reporting: The goal of most big data solutions is to provide insights
into the data through analysis and reporting. To empower users to analyze the data,
the architecture may include a data modeling layer, such as a multidimensional OLAP
cube or tabular data model in Azure Analysis Services. It might also support self-
service BI, using the modeling and visualization technologies in Microsoft Power BI or
Microsoft Excel.
• Orchestration: Most big data solutions consist of repeated data processing operations,
encapsulated in workflows, that transform source data, move data between multiple
sources and sinks, load the processed data into an analytical data store, or push the results
straight to a report or dashboard.
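The orchestration idea — a workflow of repeated processing steps run in a fixed order — can be sketched as an ordered list of named steps that each transform a running pipeline state. The step names and the state shape are invented; real orchestrators add scheduling, retries, and dependency graphs on top of this core loop.

```python
# Minimal sketch of orchestration: a workflow as an ordered list of named
# steps, each a function that transforms the pipeline state and is logged.

def run_pipeline(steps, state):
    for name, step in steps:               # execute steps in declared order
        state = step(state)
        state.setdefault("log", []).append(name)
    return state

pipeline = [
    ("ingest",    lambda s: {**s, "raw": [3, 1, 2]}),
    ("transform", lambda s: {**s, "clean": sorted(s["raw"])}),
    ("load",      lambda s: {**s, "stored": True}),
]
result = run_pipeline(pipeline, {})
```

The "log" entry shows the workflow's execution order, which is what an orchestrator would surface in its run history.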
Chapter 4: Components of Big Data
Big data projects have a number of different layers of abstraction, from abstraction of the
data through to running analytics against the abstracted data. The following figure shows the
basic elements of an analytical big data stack and their interrelationships. The higher-level
components help make big data projects easier and more dynamic. Hadoop is often at the
center of big data projects, but it is not a precondition.
Chapter 5: Technology
Chapter 6: Applications
Chapter 7: Conclusion
The availability of Big Data, low-cost commodity hardware, and new information
management and analytic software have produced a unique moment in the history of data
analysis. The convergence of these trends means that, for the first time in history, we have
the capabilities required to analyze astonishing data sets quickly and cost-effectively.
As more and more data is generated and collected, data analysis requires scalable, flexible,
and high performing tools to provide insights in a timely fashion. However, organizations are
facing a growing big data ecosystem where new tools emerge and “die” very quickly.
Therefore, it can be very difficult to keep pace and choose the right tools.
The Age of Big Data is here, and these are truly revolutionary times if both business and
technology professionals continue to work together and deliver on the promise.
Chapter 8: References
[1] https://www.oracle.com/in/big-data/what-is-big-data/
[2] https://learn.microsoft.com/en-us/azure/architecture/guide/architecture-styles/big-data
[3] https://subscription.packtpub.com/book/big-data-and-business-intelligence/9781784391409/1/ch01lvl1sec12/components-of-the-big-data-ecosystem
[4] https://www.techtarget.com/searchdatamanagement/definition/big-data
[5] https://www.javatpoint.com/what-is-big-data
[6] https://www.sap.com/india/insights/what-is-big-data.html