
distributed-dbms-2170714-lab-manual

The document is a laboratory manual for a Distributed Database Management Systems course at Gujarat Technological University, detailing the curriculum and objectives for various experiments related to database management. It includes instructions on SQL concepts, Oracle database operations, and practical exercises for creating, inserting, updating, and querying data in databases. The manual aims to equip students with the necessary skills and knowledge to effectively manage and manipulate databases.


Distributed DBMS (2170714) Lab Manual

Distributed Database Management Systems (Gujarat Technological University)


Distributed DBMS (2170714)


Laboratory Manual
Year: 2020-2021

CE Department Vision:

To produce technically sound and ethically responsible Computer Engineers for society by providing quality education.

CE Department Mission:
1) To provide a healthy learning environment based on current and future industrial demands.
2) To promote curricular, co-curricular and extra-curricular activities for the overall personality development of the students.
3) To groom technically powerful and ethically dominant engineers having real-life problem-solving capabilities.
4) To provide a platform for effective teaching and learning.

IT Department Vision:

To provide quality education and assistance to the students through innovative teaching-learning methodology for shaping young minds to be technically sound and ethically strong.

IT Department Mission:
1) To serve society by producing technically and ethically sound engineers.
2) To generate groomed and efficient problem solvers as per industrial needs by adopting innovative teaching-learning methods.
3) To emphasize the overall development of the students through various curricular, co-curricular and extra-curricular activities.


INDEX

Sr. No.  Experiment
1.       A) Introduction of Database Management Systems, Oracle concepts and Create a table.
         B) How to insert data in a table using insert and display the records in a table.
2.       A) Update or Delete records of a table and modifying structure of a table using Alter and Drop command.
         B) Study of character functions for manipulation of data items.
3.       To perform join operation between various tables.
4.       Applying constraint using two tables.
5.       How to retrieve data from different tables using sub queries and correlated queries.
6.       Create two databases either on single DBMS and Design Database to fragment and share the fragments from both database and write single query for creating view.
7.       Understanding of Database Objects: synonym, sequence, index and view.
8.       To study the concepts of Normalization.
9.       Case study on NoSQL.
10.      Case study on Hadoop.


EXPERIMENT NO: 1 DATE: / /


A) TITLE: Introduction of Database Management Systems, SQL Concepts,
Oracle concepts and Create a table.

OBJECTIVES: On completion of this experiment the student will be able to…

➢ know the concept of a database management system.
➢ know the concept of Oracle.
➢ create a table in a database.

THEORY:

❖ Introduction of Database Management Systems:


A DBMS is a collection of interrelated data and a set of programs to access those data.
The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and efficient.
Examples of DBMS applications are banking systems, universities, airlines, etc.

❖ Introduction of Oracle:
The relational model, proposed by Dr. E. F. Codd of IBM in June 1970, came to be accepted as the definitive model for RDBMS. The language developed by IBM to manipulate the data stored within Codd's model was originally called Structured English Query Language (SEQUEL), with the word English later dropped in favor of Structured Query Language (SQL).
In 1979 a company called Relational Software, Inc. released the first commercially available implementation of SQL. Relational Software later came to be known as Oracle Corporation. Oracle Corporation is a company that produces one of the most widely used server-based, multi-user RDBMS products, named Oracle.

❖ Oracle Tools:
The Oracle product is primarily divided into:
Oracle Server tools: The Oracle Server product is called either Oracle Workgroup Server or Oracle Enterprise Server. It is used for data storage.
Oracle Client tools: The client tool most commonly used for commercial application development is called Oracle Developer 2000. Oracle Developer 2000 is Oracle's tool box, which consists of Oracle Forms, Oracle Reports and Oracle Graphics. This suite of tools is used to capture, validate and display data according to user and system needs.


SQL*Plus is a separate Oracle client-side tool. Oracle Workgroup or Enterprise Server is bundled with SQL*Plus. It is a product that works on Microsoft Windows 95 and Windows NT, both of which are standard client-based GUI operating systems.
Oracle Workgroup Server and Oracle Developer 2000 are separate products and they must be purchased separately.

❖ SQL (Structured Query Language):


SQL (Structured Query Language) is a database sublanguage for querying and modifying relational databases. It was developed by IBM Research in the mid-1970s and standardized by ANSI in 1986.

❖ Components of SQL:
1) DDL (Data Definition Language):
DDL includes the commands used to set up, change and remove data structures such as tables, views and indexes. Examples are CREATE, ALTER and DROP.
2) DML (Data Manipulation Language):
DML includes the commands used to enter new rows, change existing rows and remove unwanted rows from the tables in a database. Examples are INSERT, UPDATE and DELETE.
3) DCL (Data Control Language):
DCL includes the commands used to give or remove access rights to both the Oracle database and the structures within it. Examples are GRANT and REVOKE.
4) DQL (Data Query Language):
DQL is the component of SQL that retrieves data from the database and imposes an ordering upon it. It consists of the SELECT statement, which gets data out of the database so that operations can be performed with it.
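The four components can be illustrated with one or two statements from each group. This is only a minimal sketch: the table demo_item and the user name scott are hypothetical and are not part of the manual's exercises.

```sql
-- DDL: define a new structure (demo_item is a hypothetical table)
CREATE TABLE demo_item (item_no NUMBER(4), item_name VARCHAR2(20));

-- DML: add, change and remove rows
INSERT INTO demo_item (item_no, item_name) VALUES (1, 'Pen');
UPDATE demo_item SET item_name = 'Pencil' WHERE item_no = 1;
DELETE FROM demo_item WHERE item_no = 1;

-- DQL: read data back
SELECT item_no, item_name FROM demo_item;

-- DCL: give and take away access rights (scott is an assumed user name)
GRANT SELECT ON demo_item TO scott;
REVOKE SELECT ON demo_item FROM scott;
```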

❖ The CREATE TABLE command: The CREATE TABLE command defines each column of the table uniquely. Each column has a minimum of three attributes: a name, a datatype and a size (i.e. column width).

Rules for creating Tables:


1. A name can have a maximum of 30 characters.
2. Letters A-Z, a-z and digits 0-9 are allowed.
3. A name should begin with a letter.


4. The use of the special character _ (underscore) is allowed and also recommended (special characters like $ and # are allowed only in Oracle).
5. SQL reserved words are not allowed, for example: create, select and so on.

Syntax: CREATE TABLE <TableName>
(<ColumnName1> <DataType>(<Size>),
<ColumnName2> <DataType>(<Size>), …… );

Example: CREATE TABLE client_master
(c_no VARCHAR2(5), name VARCHAR2(10), address VARCHAR2(20),
pincode NUMBER(6), bal_due NUMBER(10,2));
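As a further sketch of the same syntax, the hypothetical table order_log below mixes character, date and numeric columns. Note that the DATE datatype takes no size. DESCRIBE is a SQL*Plus command (not an SQL statement) that lists the columns of a table and can be used to check the result.

```sql
-- order_log is an illustrative table, not one of the manual's exercise tables
CREATE TABLE order_log
  (order_no   VARCHAR2(6),    -- character data, up to 6 characters
   order_date DATE,           -- DATE is declared without a size
   amount     NUMBER(8,2));   -- 8 digits in total, 2 after the decimal point

-- SQL*Plus command: show the column names, datatypes and sizes
DESCRIBE order_log
```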

EXERCISE:
1) Create a table “emp” with the following fields:
EMPNO ENAME JOB HIREDATE SAL COMM DEPTNO MGR

2) Create a table “dept” with the following fields:
DEPTNO DNAME LOCATION

3) Create a table “stud_master” with the following fields:
REG_NO S_NAME BRANCH

4) Create a table “stud_detail” with the following fields:
REG_NO COURSE_CODE COURSE_NAME MARKS SEM

EVALUATION:

Understanding / Problem solving (4) | Involvement (3) | Timely Completion (3) | Total (10)

Signature with date:


DATE: / /

B) TITLE: How to insert data in a table using insert command and display the records in a table.

OBJECTIVES: On completion of this experiment the student will be able to…

➢ insert records into a table.
➢ display records from a table.

THEORY:
❖ Inserting Data into Tables using INSERT INTO command:
Once a table is created, the most natural thing to do is load this table with data to be manipulated later.

When inserting a single row of data into the table, the insert operation:
• Creates a new (empty) row in the database table.
• Loads the values passed (by the SQL insert) into the columns specified.

Syntax: INSERT INTO <TableName> [(<ColumnName1>, <ColumnName2>, ……)]
VALUES (<value1>, <value2>, ……);

Example: INSERT INTO client_master (c_no, name, address, pincode, bal_due)
VALUES ('C001', 'Ajay', 'A-5, Bhandu', 384120, 500);

Note: Character values (expressions) placed within the INSERT INTO statement must be enclosed in single quotes (').
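The column list in the syntax is optional, which the following sketch tries to illustrate against the client_master table created earlier. The data values are made up for illustration only.

```sql
-- Column list omitted: the values must follow the table's column order
INSERT INTO client_master
VALUES ('C002', 'Vijay', 'B-12, Mehsana', 384002, 0);

-- Partial column list: the unlisted columns (address, pincode, bal_due)
-- are left as NULL because the table defines no default values
INSERT INTO client_master (c_no, name)
VALUES ('C003', 'Nita');
```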

❖ Display / Viewing data in the Tables using SELECT command:


Once data has been inserted into a table, the next most logical operation would be to view what has been inserted. The SELECT SQL verb is used to achieve this. The SELECT command is used to retrieve rows selected from one or more tables.
The SELECT statement can be used to:
• Display some or all of the columns from a specified table.
• Display some or all of the rows from a specified table.
• Display calculated values from the table.
• Display statistical information from the tables, like averages or sums of column values.
• Combine information from two or more tables.


In order to view table data, the general syntax is:

SELECT <ColumnName 1>, …, <ColumnName N> FROM <TableName>;

Note: Here, ColumnName 1 to ColumnName N represent table column names, and they are separated by ','.

All Rows and All Columns: When data from all rows and columns of the table are to be viewed, the following form of the SELECT statement is used:

Syntax: SELECT * FROM <TableName>;

Example: SELECT * FROM client_master;

Oracle allows the use of the meta character asterisk (*), which Oracle expands to mean all columns in the table. Because the statement has no WHERE clause, all rows are returned.

Displaying Some Columns from a Table:

Syntax: SELECT <ColumnName 1>, <ColumnName 2>, …, <ColumnName N>
FROM <TableName>;

Example: SELECT c_no, name FROM client_master;

Displaying Some Specified Rows from the Table:


If you want conditional retrieval of rows, i.e. only those rows which satisfy a certain condition, you can use the WHERE clause in the SELECT statement.

Syntax: SELECT <ColumnName 1>, <ColumnName 2>, …, <ColumnName N>
FROM <TableName>
WHERE <Condition>;
Here, <Condition> usually compares a column with a value, for example <ColumnName> = <Value>; other comparison operators such as >, <, >= and <= can be used in the same way.

Example: SELECT c_no, name FROM client_master WHERE bal_due > 500;
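A few more forms of the WHERE condition are sketched below against the client_master table. The literal values are illustrative assumptions, not data from the manual.

```sql
-- Range test: balances from 100 to 500 inclusive
SELECT c_no, name FROM client_master
WHERE bal_due BETWEEN 100 AND 500;

-- Two conditions combined with AND
SELECT c_no, name FROM client_master
WHERE bal_due > 0 AND pincode = 384120;

-- Pattern match: names beginning with the letter A
SELECT c_no, name FROM client_master
WHERE name LIKE 'A%';
```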

Elimination of duplicates from the select statement:


A table could hold duplicate rows. In such a case, to see only unique rows, you have to use the DISTINCT clause. The DISTINCT clause allows removing duplicates from the result set. The DISTINCT clause can only be used with SELECT statements.

Syntax: SELECT DISTINCT <ColumnName 1>, …, <ColumnName N>
FROM <TableName>;

Example: SELECT DISTINCT job FROM emp;

The SELECT DISTINCT * SQL syntax scans through entire rows, and eliminates rows that have exactly the same contents in each column.

Syntax: SELECT DISTINCT * FROM <TableName>;

Example: SELECT DISTINCT * FROM client_master;

Sorting data in a Table:


Oracle allows data from a table to be viewed in a sorted order. The rows retrieved from the table will be sorted in either ascending or descending order depending on the condition specified in the SELECT sentence.

Syntax: SELECT <ColumnName 1>, …, <ColumnName N> FROM <TableName>
ORDER BY <ColumnName 1>, …, <ColumnName N> <[Sort Order]>;
Sort Order can be ascending (use the word ASC) or descending (use the word DESC). If the sort order is not mentioned, the Oracle engine sorts in ascending order by default.

Example: a. SELECT * FROM client_master ORDER BY name; (in ascending order)
b. SELECT * FROM emp ORDER BY job DESC; (in descending order)
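When several columns are listed, each one can carry its own sort order. The following is a small sketch using the client_master table; the choice of columns is an assumption made for illustration.

```sql
-- Largest balance first; clients with equal balances are listed by name A to Z
SELECT c_no, name, bal_due
FROM client_master
ORDER BY bal_due DESC, name ASC;
```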


The format to display the records -

SELECT [DISTINCT] {*, column [alias], …}
FROM table
WHERE condition(s)
GROUP BY column(s)
HAVING group of row condition(s)
ORDER BY {column, expr} [ASC/DESC];
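The clauses of this format can be combined as in the sketch below. It assumes the emp table from the exercises has been created with numeric SAL and DEPTNO columns and that job titles are stored in upper case; both are assumptions, since the manual leaves the datatypes and data to the student.

```sql
-- Average salary per department, ignoring clerks,
-- keeping only departments with at least two such employees
SELECT deptno, COUNT(*) AS emp_count, AVG(sal) AS avg_sal
FROM emp
WHERE job <> 'CLERK'        -- filters individual rows before grouping
GROUP BY deptno             -- one result row per department
HAVING COUNT(*) >= 2        -- filters whole groups after grouping
ORDER BY avg_sal DESC;      -- highest average first
```

WHERE is applied to rows before they are grouped, while HAVING is applied to the groups, which is why the aggregate COUNT(*) can appear only in the latter.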

EXERCISES:

1) Insert records into emp table.


2) Insert records into dept table.
3) Insert records into stud_master table.
4) Insert records into stud_detail table.
5) Select all information from emp table.
6) List all the employees who have salary between 1000 and 2000.
7) List names and jobs of all clerks in department 20.
8) Display all the different job types.
9) List department numbers and names in department name order.
10) Select all information from stud_master table.
11) Display Registration number and name of students whose department is
“computer engineering”.

EVALUATION:

Understanding / Problem solving (4) | Involvement (3) | Timely Completion (3) | Total (10)

Signature with date:


EXPERIMENT NO: 2 DATE: / /


A) TITLE: Update or Delete records of a table and modifying structure of a table
using Alter and Drop command.

OBJECTIVES: On completion of this experiment the student will be able to…

➢ update or delete the data or records in a table.
➢ add or delete a column in a table.
➢ change the data type of a given column and rename a column.
➢ drop a table.

THEORY:

❖ Updating Rows: The UPDATE command is used to change or modify data values in a table.

Updating of All Rows:

Syntax: UPDATE <TableName> SET <ColumnName 1> = <Expression 1 or Value 1>, …, <ColumnName N> = <Expression N or Value N>;

Example: Update the address details by changing the city name to Ahmedabad.
UPDATE ADDR_DTLS SET City = 'Ahmedabad';

Updating Records Conditionally:

Syntax: UPDATE <TableName> SET <ColumnName 1> = <Expression 1 or Value 1>, …, <ColumnName N> = <Expression N or Value N> WHERE <Condition>;

Example: Update the branch details by changing AMP (HO) to Head Office.
UPDATE BRANCH_MSTR SET NAME = 'Head Office'
WHERE NAME = 'AMP (HO)';
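The right-hand side of SET may also be an expression, and several columns can be changed in one statement. A brief sketch using the client_master table follows; the values are illustrative only.

```sql
-- Expression on the right-hand side: raise large balances by 10 percent
UPDATE client_master
SET bal_due = bal_due * 1.10
WHERE bal_due > 500;

-- Several columns changed at once for a single client
UPDATE client_master
SET address = 'C-7, Patan', pincode = 384265
WHERE c_no = 'C001';
```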

❖ Delete Operations:
The DELETE command deletes the rows of a table that satisfy the condition provided by its WHERE clause, and returns the number of records deleted.

Removal of All Rows:

Syntax: DELETE FROM <TableName>;

Example: Empty the ACCT_DTLS table.
DELETE FROM ACCT_DTLS;


Removal of Specific Rows:

Syntax: DELETE FROM <TableName> WHERE <Condition>;

Example: Remove only the savings bank account details from the ACCT_DTLS table.
DELETE FROM ACCT_DTLS WHERE ACCT_NO LIKE 'SB%';

❖ Inserting Data into a Table from another Table:


In addition to inserting data one row at a time into a table, it is quite possible to populate a table with data that already exists in another table.

Syntax: INSERT INTO <TableName> SELECT <ColumnName 1>, …, <ColumnName N>
FROM <TableName> [WHERE <Condition>];

Here the WHERE clause is optional. If you do not specify the WHERE clause, then all the rows of the source table are copied to the target table.

Example: Insert only the savings bank account details in the target table ACCT_DTLS from the source table ACCT_MSTR.

INSERT INTO ACCT_DTLS SELECT ACCT_NO, BRANCH_NO, CURBAL FROM ACCT_MSTR
WHERE ACCT_NO LIKE 'SB%';

❖ Modifying the Structure of Tables:


The structure of a table can be modified by using ALTER TABLE command.
ALTER TABLE allows changing the structure of an existing table. With ALTER
TABLE it is possible to add or delete columns, change the data type of existing
columns.

Adding New Columns:


Syntax: ALTER TABLE <TableName>
ADD (<NewColumnName> <DataType>(<Size>),
<NewColumnName> <DataType>(<Size>), …);

Example: Enter a new field called city in the table BRANCH_MSTR.


ALTER TABLE BRANCH_MSTR ADD(CITY VARCHAR2(25));

Dropping a Column from a Table:


Syntax: ALTER TABLE <TableName> DROP COLUMN <ColumnName>;

Example: Drop the column city from the table BRANCH_MSTR.
ALTER TABLE BRANCH_MSTR DROP COLUMN CITY;

Modifying Existing Columns:

Syntax: ALTER TABLE <TableName>
MODIFY (<ColumnName> <NewDataType>(<NewSize>));

Example: Alter table BRANCH_MSTR to allow the NAME field to hold a maximum of 30 characters.
ALTER TABLE BRANCH_MSTR MODIFY (NAME VARCHAR2(30));
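The objectives of this experiment also mention renaming a column. In reasonably recent Oracle releases this can be done with the RENAME COLUMN clause of ALTER TABLE, as sketched below; the new name BRANCH_NAME is a hypothetical choice, and very old releases may not support the clause.

```sql
-- Rename the NAME column of BRANCH_MSTR; the data in the column is kept
ALTER TABLE BRANCH_MSTR RENAME COLUMN NAME TO BRANCH_NAME;
```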

❖ Destroying Tables:
Sometimes tables within a particular database become obsolete and need to be discarded. In such a situation, the DROP TABLE statement followed by the table name destroys a specific table. If a table is dropped, all records held within it are lost and cannot be recovered.
Syntax: DROP TABLE <TableName>;
Example: Remove the table BRANCH_MSTR along with the data held.
DROP TABLE BRANCH_MSTR;

EXERCISES:
1) Add a column “SPOUSE” to the emp table that will hold the name of an employee’s spouse.
2) Change the job of employees whose job is “trainee” to “programmer”.
3) Delete the record whose location is “Baroda” from the dept table.
4) Drop the table “stud_master”.
5) Create a table “ManagerHist” from emp containing the employees whose job is “Manager”.
6) Copy all the information of department 20 into the “ManagerHist” table.

EVALUATION:

Understanding / Problem solving (4) | Involvement (3) | Timely Completion (3) | Total (10)

Signature with date:


DATE: / /
B) TITLE: Study of character functions for manipulation of data items.

OBJECTIVES: On completion of this experiment the student will be able to…

➢ know the concept of character functions.

THEORY:

❖ Character functions:
Character functions are described as follows:

Function | Description | Example
INITCAP(s) | Returns the string s with the first character of each word in uppercase and all others in lowercase. | INITCAP('hello SIR') = 'Hello Sir'
LENGTH(s) | Returns the number of characters in the string s. | LENGTH('Welcome Sir') = 11
LOWER(s) | Returns the string s with all characters in lowercase. | LOWER('Welcome Sir') = 'welcome sir'
LTRIM(s, s1) | Returns s without any leading characters that appear in s1. If no s1 characters are leading characters in s, then s is returned unchanged. | LTRIM('welcome', 'slow') = 'elcome'
RTRIM(s, s1) | Returns s without any trailing characters that appear in s1. If no s1 characters are trailing characters in s, then s is returned unchanged. | RTRIM('Mississippi', 'pi') = 'Mississ'
REPLACE(s, s1[, s2]) | Returns s with all occurrences of substring s1 replaced with s2. By default s2 is NULL and all occurrences of s1 are removed. | REPLACE('www.yahoo.com', 'yahoo', 'google') = 'www.google.com'
SUBSTR(s, pos[, len]) | Returns the portion of the string s that is len characters long, beginning at position pos. If pos is negative, the position is counted backwards from the end of the string. The function returns NULL if len is less than or equal to zero. If len is omitted, it returns the remainder of s. | SUBSTR('welcome', 4) = 'come'; SUBSTR('welcome', 2, 2) = 'el'; SUBSTR('welcome', -3, 2) = 'om'
TRANSLATE(s, s1, s2) | Returns the string s with all occurrences of characters in s1 replaced with the positionally corresponding characters in s2. If s2 is shorter than s1, the unmatched characters in s1 are removed. | TRANSLATE('alfabet', 'abscde', 'BCDE') = 'BlfBCt'
TRIM([[LEADING | TRAILING | BOTH] [s1] FROM] s) | Returns the string s with all leading, trailing, or both (BOTH by default) occurrences of the character s1 removed. The default value of s1 is a space character. | TRIM(BOTH '.' FROM 'etc ...') = 'etc '
UPPER(s) | Returns the string s with all characters in uppercase. | UPPER('Welcome Sir') = 'WELCOME SIR'
VSIZE(s) | Returns the number of bytes in the internal representation of s. | VSIZE('SCT on the net') = 14
LPAD(s, n[, s1]) | Returns s, left-padded to length n with the sequence of characters specified in s1. If s1 is not specified, Oracle uses blanks by default. | LPAD('Page 1', 10, '*') = '****Page 1'
RPAD(s, n[, s1]) | Returns s, right-padded to length n with the sequence of characters specified in s1. If s1 is not specified, Oracle uses blanks by default. | RPAD('Page 1', 10, '*') = 'Page 1****'
INSTR(<string1>, <string2>[, <start position>[, <nth appearance>]]) | Returns the location of a substring in a string. String2 is the substring to search for in string1. The default start position is 1. If the start position is negative, the function counts back that number of characters from the end of string1 and then searches towards the beginning of string1. nth appearance is the nth appearance of string2, and its default is 1. | INSTR('SCT on the net', 't') = 8; INSTR('SCT on the net', 't', 1, 2) = 14
CONCAT(<string1>, <string2>) | String2 is concatenated to string1. | CONCAT('Nirma', 'University') = 'NirmaUniversity'
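These functions can be tried out without any table of your own by selecting from DUAL, Oracle's built-in one-row table. The sketch below evaluates a few of the examples listed above; the second query assumes the emp table from the exercises exists with character columns ENAME and JOB.

```sql
-- Evaluate a few of the documented examples in one row
SELECT INITCAP('hello SIR')         AS initcap_demo,  -- 'Hello Sir'
       LENGTH('Welcome Sir')        AS length_demo,   -- 11
       SUBSTR('welcome', 2, 2)      AS substr_demo,   -- 'el'
       LPAD('Page 1', 10, '*')      AS lpad_demo,     -- '****Page 1'
       INSTR('SCT on the net', 't') AS instr_demo     -- 8
FROM dual;

-- Functions can be nested and applied to table columns
SELECT ename,
       LENGTH(ename)              AS name_len,
       UPPER(SUBSTR(job, 1, 3))   AS job_code   -- first three letters of the job
FROM emp;
```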


EXERCISE:

1) Produce the following output:

EMPLOYEE_AND_SALARY
AJAY 10000
JASHVANT 52000
PRAKASH 40850
SUDHA 65000
-------- -------
-------- -------
-------- -------

2) Produce the following output:

EMPLOYEE
AJAY (Assistant Professor)
JASHVANT (Manager)
RAHUL (Project Leader)

3) Write the difference between the REPLACE and TRANSLATE functions.
4) The LENGTH function returns the length of a word. (State True / False with justification.)
5) The ________ function removes characters from the left of char, with initial characters removed up to the first character not in set.
6) LPAD returns the string passed as a parameter after left-padding it to a specified length. (State True / False with justification.)

EVALUATION:

Understanding / Problem solving (4) | Involvement (3) | Timely Completion (3) | Total (10)

Signature with date:


EXPERIMENT NO: 3 DATE: / /


TITLE: To perform join operation between various tables.

OBJECTIVES: On completion of this experiment the student will be able to…

➢ fetch data from more than one table in a database.
➢ know the different types of join.

THEORY:

❖ Join: A join is used when a SQL query requires data from more than one table in a database.
There are two main types of join conditions:
• Equi-join
• Non-equi join

❖ Equi-join: The relationship between two tables is an equi-join when a column in one table corresponds to the same column in the other table, e.g. deptno in the EMP table as well as in the DEPT table. Here the relationship is expressed using the “=” operator.

❖ Non Equi-join: The relationship between two tables is a non-equi join when no column in one table corresponds directly to a column in the other table. Here the relationship is expressed using an operator other than “=”.
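The two kinds of join condition might look as follows. This is a sketch that assumes the emp and dept tables from Experiment 1 exist; the salgrade table with columns grade, losal and hisal is a hypothetical addition used only to show a non-equi condition.

```sql
-- Equi-join: rows are matched on equal deptno values
SELECT e.ename, d.dname
FROM emp e, dept d
WHERE e.deptno = d.deptno;

-- Non-equi join: each salary is matched to the grade whose range contains it
-- (salgrade is an assumed table, not defined in this manual)
SELECT e.ename, e.sal, s.grade
FROM emp e, salgrade s
WHERE e.sal BETWEEN s.losal AND s.hisal;
```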

❖ Self Joins:
A self join is a join of a table to itself. This table appears twice in the FROM clause
and is followed by table aliases that qualify column names in the join condition.
To perform a self join, Oracle combines and returns rows of the table that satisfy
the join condition.
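A self join can be sketched on the emp table, under the assumption that its MGR column stores the EMPNO of each employee's manager. The aliases w (worker) and m (manager) are arbitrary names chosen for this example.

```sql
-- emp appears twice under two aliases; w.mgr points at the manager's row in m
SELECT w.ename AS employee, m.ename AS manager
FROM emp w, emp m
WHERE w.mgr = m.empno;
```

Employees whose MGR value is NULL do not appear in the result, because no row in m satisfies the join condition for them.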

❖ Inner Joins:
An inner join (sometimes called a "simple join") is a join of two or more tables that returns only those rows that satisfy the join condition.

❖ Cross Joins:
If two tables in a join query have no join condition, Oracle returns their Cartesian product. Oracle combines each row of one table with each row of the other. A Cartesian product always generates many rows and is rarely useful. For example, the Cartesian product of two tables, each with 100 rows, has 10,000 rows. Always include a join condition unless you specifically need a Cartesian product.
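Both joins can also be written with the ANSI JOIN keywords, which Oracle has supported since release 9i. The sketch below assumes the emp and dept tables from Experiment 1.

```sql
-- Inner join: only employees whose deptno exists in dept are returned
SELECT e.ename, d.dname
FROM emp e INNER JOIN dept d ON e.deptno = d.deptno;

-- Cross join: every employee paired with every department (Cartesian product)
SELECT e.ename, d.dname
FROM emp e CROSS JOIN dept d;
```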


❖ Outer Joins:
An outer join extends the result of a simple join. An outer join returns all rows
that satisfy the join condition and also returns some or all of those rows from one
table for which no rows from the other satisfy the join condition.
• To write a query that performs an outer join of tables A and B and returns all
rows from A (a left outer join), use the LEFT [OUTER] JOIN syntax in the
FROM clause, or apply the outer join operator (+) to all columns of B in the
join condition in the WHERE clause. For all rows in A that have no matching
rows in B, Oracle returns null for any select list expressions containing
columns of B.
• To write a query that performs an outer join of tables A and B and returns all
rows from B (a right outer join), use the RIGHT [OUTER] JOIN syntax in the
FROM clause, or apply the outer join operator (+) to all columns of A in the
join condition in the WHERE clause. For all rows in B that have no matching
rows in A, Oracle returns null for any select list expressions containing
columns of A.
• To write a query that performs an outer join and returns all rows from A and
B, extended with nulls if they do not satisfy the join condition (a full outer
join), use the FULL [OUTER] JOIN syntax in the FROM clause.
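The outer join variants described above can be sketched as follows, taking emp as table A and dept as table B. Both tables are assumed to exist as created in Experiment 1.

```sql
-- Left outer join, ANSI syntax: every employee is returned,
-- with NULL for dname when no department matches
SELECT e.ename, d.dname
FROM emp e LEFT OUTER JOIN dept d ON e.deptno = d.deptno;

-- The same left outer join using Oracle's (+) operator on the columns of dept
SELECT e.ename, d.dname
FROM emp e, dept d
WHERE e.deptno = d.deptno (+);

-- Full outer join: unmatched rows from both tables are kept
SELECT e.ename, d.dname
FROM emp e FULL OUTER JOIN dept d ON e.deptno = d.deptno;
```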

EXERCISE:
1) Define: Join. Explain self join.
2) Retrieve employee number, employee name and their department name, in department name order.
3) Show the details of all employees who live in Baroda.
4) Display the name, salary and department number of employees whose salary is more than 10000.
5) List the employee name, job, salary and department name for everyone in the company except clerks. Sort on salary, displaying the highest salary first.
6) List all employees by name and number along with their manager’s name and number.
7) Display all the employees who earn less than their managers.

EVALUATION:

Understanding / Problem solving (4) | Involvement (3) | Timely Completion (3) | Total (10)

Signature with date:


EXPERIMENT NO: 4 DATE: / /

TITLE: Applying constraint using two tables.

OBJECTIVES: On completion of this experiment the student will be able to…

➢ learn the different types of constraints.

THEORY:

❖ Constraints are classed as either:

1. Table constraints
These may reference one or more columns and are defined separately from the definitions of the columns in the table.

2. Column constraints
These reference a single column and are defined within the specification for the owning column.

❖ Constraint types-
You may define the following constraint types-
1. Primary key
2. Foreign key
3. Unique
4. Null / Not null
5. Check

Primary key constraint: A primary key is one or more column(s) in a table used to uniquely identify each row in the table. None of the fields that are part of the primary key can contain a null value. A table can have only one primary key.

PRIMARY KEY Constraint Defined at Column Level:

Syntax: <ColumnName> <Datatype>(<Size>) [CONSTRAINT constraint_name] PRIMARY KEY

PRIMARY KEY Constraint Defined at Table Level:

Syntax: [CONSTRAINT constraint_name] PRIMARY KEY (<ColumnName 1>, <ColumnName 2>)
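The two forms of the primary key syntax might be used as in the following sketch. The tables course_master and course_result and the constraint names are hypothetical and chosen only for illustration.

```sql
-- Column level: a single-column primary key with a named constraint
CREATE TABLE course_master
  (course_code VARCHAR2(8) CONSTRAINT pk_course PRIMARY KEY,
   course_name VARCHAR2(30));

-- Table level: a composite primary key over two columns,
-- which can only be declared at table level
CREATE TABLE course_result
  (reg_no      VARCHAR2(12),
   course_code VARCHAR2(8),
   marks       NUMBER(3),
   CONSTRAINT pk_course_result PRIMARY KEY (reg_no, course_code));
```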
Foreign key constraint:


➢ Foreign keys represent relationships between tables. A foreign key is a column (or a group of columns) whose values are derived from the primary key or unique key of some other table.
➢ The table in which the foreign key is defined is called a foreign table or detail table.
➢ The table that defines the primary or unique key and is referenced by the foreign key is called the primary table or master table.
➢ The master table can be referenced in the foreign key definition by using the REFERENCES keyword. If the name of the column is not specified, by default Oracle references the primary key in the master table.

FOREIGN KEY Constraint Defined at the Column Level:

Syntax: <ColumnName> <DataType>(<Size>) [CONSTRAINT constraint_name]
REFERENCES <TableName> [(<ColumnName>)]

FOREIGN KEY Constraint Defined at the Table Level:

Syntax: [CONSTRAINT constraint_name] FOREIGN KEY (<ColumnName> [, <ColumnName>])
REFERENCES <TableName> (<ColumnName> [, <ColumnName>])
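A master table and two detail tables are sketched below to show both forms of the foreign key syntax. All table and constraint names (dept_demo, emp_demo, emp_demo2, fk_...) are hypothetical.

```sql
-- Master table: defines the primary key that will be referenced
CREATE TABLE dept_demo
  (deptno NUMBER(2) CONSTRAINT pk_dept_demo PRIMARY KEY,
   dname  VARCHAR2(14));

-- Detail table, foreign key declared at column level
CREATE TABLE emp_demo
  (empno  NUMBER(4) PRIMARY KEY,
   ename  VARCHAR2(10),
   deptno NUMBER(2) CONSTRAINT fk_emp_dept REFERENCES dept_demo (deptno));

-- Detail table, foreign key declared at table level
CREATE TABLE emp_demo2
  (empno  NUMBER(4) PRIMARY KEY,
   deptno NUMBER(2),
   CONSTRAINT fk_emp2_dept FOREIGN KEY (deptno) REFERENCES dept_demo (deptno));
```

With these definitions, an insert into emp_demo with a deptno that does not exist in dept_demo is rejected, while a NULL deptno is accepted.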

Unique constraint: The Unique column constraint permits multiple entries of NULL into a column. These NULL values are clubbed at the top of the column in the order in which they were entered into the table. This is the essential difference between the Primary Key and Unique constraints when applied to table column(s).

UNIQUE Constraint Defined at Column Level:

Syntax: <ColumnName> <Datatype>(<Size>) [CONSTRAINT constraint_name] UNIQUE

UNIQUE Constraint Defined at the Table Level:

Syntax: [CONSTRAINT constraint_name] UNIQUE (<ColumnName 1>, <ColumnName 2>)
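The difference from a primary key can be seen in a small sketch. The table member_demo and its constraint names are hypothetical.

```sql
CREATE TABLE member_demo
  (member_no NUMBER(5) PRIMARY KEY,
   email     VARCHAR2(40) CONSTRAINT uq_member_email UNIQUE,  -- column level
   phone     VARCHAR2(15),
   CONSTRAINT uq_member_phone UNIQUE (phone));                -- table level

-- Both rows are accepted: NULL is allowed repeatedly in a UNIQUE column,
-- whereas member_no (the primary key) must be non-NULL and distinct
INSERT INTO member_demo VALUES (1, NULL, NULL);
INSERT INTO member_demo VALUES (2, NULL, NULL);
```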


The CHECK Constraint: Business rule validations can be applied to a table column by using a CHECK constraint. It must be specified as a logical expression that evaluates to either TRUE or FALSE.

Note: A CHECK constraint takes substantially longer to execute as compared to NOT NULL, PRIMARY KEY, FOREIGN KEY or UNIQUE. Thus CHECK constraints should be avoided if the constraint can be defined using the NOT NULL, PRIMARY KEY or FOREIGN KEY constraint.

The CHECK Constraint defined at the Column Level:


Syntax: <ColumnName> <DataType>(<Size>) CHECK (<Logical Expression>)
Example: Create a table CUST_MSTR with the following CHECK constraints:
• Data values being inserted into the column CUST_NO must start with the capital letter C.
• Data values being inserted into the columns FNAME and LNAME should be in upper case only.

CREATE TABLE CUST_MSTR
(CUST_NO VARCHAR2(10) CHECK (CUST_NO LIKE 'C%'),
FNAME VARCHAR2(20) CHECK (FNAME = UPPER(FNAME)),
LNAME VARCHAR2(20) CHECK (LNAME = UPPER(LNAME)),
DOB DATE, PANCOPY VARCHAR2(1), PHOTOGRAPH VARCHAR2(25));

The CHECK Constraint defined at the Table Level:


Syntax: CHECK (<Logical Expression>)
Example: Create a table CUST_MSTR with the following CHECK constraints:
• Data values being inserted into the column CUST_NO must start with the capital letter C.
• Data values being inserted into the columns FNAME and LNAME should be in upper case only.

CREATE TABLE CUST_MSTR
(CUST_NO VARCHAR2(10), FNAME VARCHAR2(20), LNAME VARCHAR2(20),
DOB DATE, PANCOPY VARCHAR2(1), PHOTOGRAPH VARCHAR2(25),
CHECK (CUST_NO LIKE 'C%'),
CHECK (FNAME = UPPER(FNAME)),
CHECK (LNAME = UPPER(LNAME)));


EXERCISE:
1) Create a table client_master with the following fields:
clientno, name, address, city, pincode, state, bal_due.
Consider the appropriate data type and size for the columns. In addition, define clientno as the primary key column.

2) Create a table product_master with the following fields:
Productno, Description, Qty_on_hand, Sell_price, Cost_price.
Consider the appropriate data type and size for the columns. In addition, define Productno as the primary key column and check that data values being inserted into the column Productno start with the capital letter ‘P’.

3) Create a table salesman_master with the following fields:
Sno, S_name, Address, city, Pincode, State, Sal_amt, Tgt_to_get.
Consider the appropriate data type and size for the columns. In addition, define Sno as the primary key column.

4) Create a table sales_order with the following fields:
Orderno, clientno, orderdate, delyaddr, sno, delydate.
Consider the appropriate data type and size for the columns. In addition, define Orderno as the primary key column, define its clientno column as a foreign key which references the client_master table, and define its sno column as a foreign key which references the salesman_master table.

5) Create a table sales_order_details with the following fields:
Orderno, Productno, qtyordered.
Consider the appropriate data type and size for the columns. In addition, define its Orderno column as a foreign key which references the sales_order table, and its Productno column as a foreign key which references the product_master table.

EVALUATION:

Understanding / Problem solving (4)    Involvement (3)    Timely Completion (3)    Total (10)

Signature with date:

EXPERIMENT NO: 5 DATE: / /

TITLE: How to retrieve data from different tables using sub queries and correlated
queries.

OBJECTIVES: On completion of this experiment the student will be able to…

➢ use the clauses ORDER BY, GROUP BY & HAVING.
➢ know the concept of sub queries.

THEORY:

The format of the sub query is:

SELECT column1, column2, ..., columnN FROM table
WHERE column <operator> (SELECT column FROM table WHERE condition);

❖ Steps:
1. The inner query must be enclosed in parentheses and must be on the right-hand
side of the condition.

2. The sub query may not have an ORDER BY clause.

3. The ORDER BY clause appears at the end of the main SELECT statement.

4. Sub queries are always executed from the most deeply nested to the least deeply
nested, unless they are correlated queries.

5. Logical and SQL operators may be used, as well as ANY and ALL.
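
The difference between a plain sub query and a correlated query can be seen on a small table. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Oracle and an invented five-row emp table; the SQL itself follows the format given above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, ename TEXT, deptno INTEGER, sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?, ?)", [
    (1, "ALLEN", 10, 1600), (2, "WARD", 10, 1250),
    (3, "JONES", 20, 2975), (4, "SCOTT", 20, 3000), (5, "ADAMS", 20, 1100),
])

# Plain sub query: the inner SELECT does not depend on the outer row,
# so it can be evaluated once and its result reused.
above_company_avg = conn.execute("""
    SELECT ename FROM emp
    WHERE sal > (SELECT AVG(sal) FROM emp)
    ORDER BY ename
""").fetchall()

# Correlated query: the inner SELECT refers to the outer row (e.deptno),
# so it is evaluated for each candidate row of the outer query.
above_dept_avg = conn.execute("""
    SELECT e.ename FROM emp e
    WHERE e.sal > (SELECT AVG(d.sal) FROM emp d WHERE d.deptno = e.deptno)
    ORDER BY e.ename
""").fetchall()

print(above_company_avg)
print(above_dept_avg)
```

Note that ORDER BY appears only at the end of the outer statement, as rule 3 requires.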

EXERCISE:

1. Find the employees who earn the maximum salary for their department. Display the
result in ascending order of salary.
2. Find the most recently hired employees in each department. Order by hire date.
3. Find the employees who earn the highest salary in each job type. Sort in descending
salary order.

4. Show the following details for any employee who earns a salary less than the average
for their department:
ENAME SALARY DNAME JOB
5. Who are the top three earners in the company? Display their name and salary.
6. Display the empno, name, job and deptno for employees whose salary is greater than
the highest salary in any SALES department.

EVALUATION:
Understanding / Problem solving (4)    Involvement (3)    Timely Completion (3)    Total (10)

Signature with date:

EXPERIMENT NO: 6 DATE: / /

TITLE: Create two databases on a single DBMS, design the databases so that the data
is fragmented across them, share the fragments from both databases, and write a
single query that creates a view over them.

OBJECTIVES: On completion of this experiment the student will be able to…

➢ learn about views.
➢ create a view.

❖ THEORY: Introduction of View:

A VIEW is a virtual table in the database whose contents are defined by a query.
A view holds no data of its own; its rows are produced from the underlying tables
each time the view is queried. This greatly reduces redundant data on disk.

❖ Creation of views:
Syntax: CREATE VIEW <ViewName> AS
SELECT <ColumnName1>, <ColumnName2>
FROM <TableName>
WHERE <ColumnName> = <Expression>
GROUP BY <Grouping Criteria>
HAVING <Predicate>;

Note: Do not rely on an ORDER BY clause in the view definition; sort the rows in the
SELECT statement that queries the view.

Example: Create a view on the emp table for department 10 that gives access to the
columns empno, ename and sal.
Answer: create view vw_emp10 as select empno, ename, sal from emp
where deptno = 10;

❖ Selecting a data set from a view:

Once a view has been created, it can be queried exactly like a base table.
The SELECT statement can include clauses such as WHERE and ORDER BY.

Syntax: SELECT <ColumnName1>, <ColumnName2> FROM <ViewName>;

Example: select * from vw_emp10 where sal < 35000 order by empno;
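
The idea in this experiment's title, a single view over fragments held in two databases, can be sketched as follows. This is an illustration under assumptions, not the Oracle procedure: it uses Python's built-in sqlite3 module, attaches a second in-memory database under the hypothetical name site_b, and stores a horizontal fragment of emp (here called emp_frag) in each database. SQLite only lets a TEMP view reference tables in another attached database, so the view is created as TEMP; in Oracle the second fragment would typically be reached through a database link instead.

```python
import sqlite3

# Autocommit mode, so ATTACH is never issued inside an open transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("ATTACH DATABASE ':memory:' AS site_b")  # second database (assumed name)

ddl = "CREATE TABLE {}.emp_frag (empno INTEGER PRIMARY KEY, ename TEXT, deptno INTEGER, sal INTEGER)"
conn.execute(ddl.format("main"))
conn.execute(ddl.format("site_b"))

# Horizontal fragmentation: department 10 rows in one database, department 20 in the other.
conn.executemany("INSERT INTO main.emp_frag VALUES (?, ?, ?, ?)",
                 [(7782, "CLARK", 10, 24500), (7839, "KING", 10, 50000)])
conn.executemany("INSERT INTO site_b.emp_frag VALUES (?, ?, ?, ?)",
                 [(7369, "SMITH", 20, 8000), (7788, "SCOTT", 20, 30000)])

# One query creates a view that reunites both fragments.
conn.execute("""
    CREATE TEMP VIEW vw_emp_all AS
    SELECT empno, ename, deptno, sal FROM main.emp_frag
    UNION ALL
    SELECT empno, ename, deptno, sal FROM site_b.emp_frag
""")

rows = conn.execute(
    "SELECT empno, ename FROM vw_emp_all WHERE sal < 35000 ORDER BY empno"
).fetchall()
print(rows)
```

The query against vw_emp_all does not need to know which database holds which row.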

EXERCISE:
1. Create a view on the emp table for the job "Clerk" that gives access to the columns
empno, ename, job and sal, and rename the column empno to "empnumber". Then
query the data through the view.

EVALUATION:

Understanding / Problem solving (4)    Involvement (3)    Timely Completion (3)    Total (10)

Signature with date:

EXPERIMENT NO: 7 DATE: / /

TITLE: Understanding of Database Objects: synonym, sequence, index and view.

OBJECTIVES: On completion of this experiment the student will be able to…


➢ learn view, synonym, index & sequence.
➢ create view, synonym, index & sequence.

THEORY:

❖ Introduction of Index:
An index is an ordered list of the contents of a column (or a group of columns) of a
table.
Indexing involves forming a two-dimensional matrix completely independent of the
table on which the index is being created. One column of this matrix holds sorted
data extracted from the table column(s) on which the index is created.
A second column, called the address field, identifies the location of the record in the
Oracle database.

❖ Creation of an Index:
An index can be created on one or more columns. An index can be:
• Simple Index: created on a single column.
• Composite Index: created on more than one column.
• Unique Index: does not allow duplicate values in the indexed column(s).

❖ Creation of Index:
An index created on a single column of a table is called a Simple Index. The syntax
for creating an index is described below; without the UNIQUE keyword the index
allows duplicate values:

Syntax: CREATE [UNIQUE] INDEX <IndexName> ON
<TableName>(<ColumnName1>[,<ColumnName2>,..,<ColumnNameN>]);
Example: CREATE INDEX idx_c_no ON client_master(c_no);
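
The three kinds of index can be tried on a small table. The sketch below assumes Python's built-in sqlite3 module as a stand-in for Oracle; the table contents and the index names idx_city_name and idx_u_name are invented for illustration. EXPLAIN QUERY PLAN is SQLite's way of showing how a query would be run, and its wording varies between versions, so the code only looks for the index name in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE client_master (c_no TEXT, name TEXT, city TEXT)")
conn.executemany("INSERT INTO client_master VALUES (?, ?, ?)", [
    ("C0001", "ASHA", "SURAT"), ("C0002", "RAVI", "RAJKOT"),
])

conn.execute("CREATE INDEX idx_c_no ON client_master (c_no)")             # simple index
conn.execute("CREATE INDEX idx_city_name ON client_master (city, name)")  # composite index
conn.execute("CREATE UNIQUE INDEX idx_u_name ON client_master (name)")    # unique index

# A unique index rejects a second row with the same indexed value.
try:
    conn.execute("INSERT INTO client_master VALUES ('C0003', 'ASHA', 'VADODARA')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Ask the engine how it would look up a row by the indexed column.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM client_master WHERE c_no = 'C0002'"
).fetchall()
uses_index = any("idx_c_no" in row[-1] for row in plan)
print(duplicate_rejected, uses_index)
```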

❖ Dropping Index:
Indexes associated with the tables can be removed by using the DROP INDEX
command.
Syntax: DROP INDEX <IndexName>;

Example: DROP INDEX idx_c_no;

When a table that has associated indexes is dropped, the Oracle engine
automatically drops all the associated indexes as well.

❖ Introduction of View:
A VIEW is a virtual table in the database whose contents are defined by a query.
A view holds no data of its own; its rows are produced from the underlying tables
each time the view is queried. This greatly reduces redundant data on disk.

❖ Creation of views:
Syntax: CREATE VIEW <ViewName> AS
SELECT <ColumnName1>, <ColumnName2>
FROM <TableName>
WHERE <ColumnName> = <Expression>
GROUP BY <Grouping Criteria>
HAVING <Predicate>;

Note: Do not rely on an ORDER BY clause in the view definition; sort the rows in the
SELECT statement that queries the view.

Example: Create a view on the emp table for department 10 that gives access to the
columns empno, ename and sal.
Answer: create view vw_emp10 as select empno, ename, sal from emp
where deptno = 10;

❖ Selecting a data set from a view:

Once a view has been created, it can be queried exactly like a base table.
The SELECT statement can include clauses such as WHERE and ORDER BY.

Syntax: SELECT <ColumnName1>, <ColumnName2> FROM <ViewName>;

Example: select * from vw_emp10 where sal < 35000 order by empno;

❖ Introduction of Sequence:
Most applications require automatic generation of numeric values.
Sequences are tools used to generate unique sequential numbers that can be used
in the data tables. One of the best features of sequences is that they guarantee that
you will get a unique value each time you access the sequence.
The value generated can have a maximum of 38 digits.

❖ Creation of Sequence:
Syntax: CREATE SEQUENCE <SequenceName>
[INCREMENT BY <IntegerValue>
START WITH <IntegerValue>
MAXVALUE <IntegerValue> / NOMAXVALUE
MINVALUE <IntegerValue> / NOMINVALUE
CYCLE / NOCYCLE
CACHE <IntegerValue> / NOCACHE
ORDER / NOORDER]

Note:
A sequence is always given a name so that it can be referenced later when required.
The ORDER / NOORDER clause has no significance if Oracle is configured with the
single server option. It is useful only when Oracle is running with the Parallel Server
option in parallel mode.
If the CACHE / NOCACHE clause is omitted, Oracle caches 20 sequence numbers by
default.

Example:
Create sequence order_seq, which will generate numbers from 1 to 9999 in ascending
order with an interval of 1. The sequence must restart from the number 1 after
generating number 9999.
CREATE SEQUENCE order_seq INCREMENT BY 1 START WITH 1
MINVALUE 1 MAXVALUE 9999 CYCLE;

❖ Referencing a Sequence:
Once a sequence is created, SQL can be used to view the values held in its cache. To
view the next sequence value, use a SELECT statement as described below.
SELECT <sequence_name>.NextVal FROM dual;
This displays the next value held in the cache. Every time NEXTVAL references a
sequence, its output is automatically incremented from the old value to the new
value, ready for use.
After creating a table you can add data by using the INSERT command like this:
INSERT INTO sales_order(o_no, o_date, c_no)
VALUES (order_seq.nextval, sysdate, 'c0001');
To reference the current value of a sequence:
SELECT <sequence_name>.CurrVal FROM dual;
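
The behaviour of NEXTVAL, CURRVAL and the CYCLE option can be modelled in a few lines. The class below is a hypothetical simulation written for this manual, not an Oracle API; it only mimics the documented behaviour (CURRVAL is undefined before the first NEXTVAL, and a cycling ascending sequence restarts at MINVALUE after MAXVALUE). The example uses MAXVALUE 3 instead of 9999 so that the wrap-around is visible.

```python
class SequenceSketch:
    """Toy model of a database sequence (illustrative only)."""

    def __init__(self, start=1, increment=1, minvalue=1, maxvalue=9999, cycle=False):
        if increment == 0:
            raise ValueError("INCREMENT BY must not be zero")
        self.start = start
        self.increment = increment
        self.minvalue = minvalue
        self.maxvalue = maxvalue
        self.cycle = cycle
        self._current = None  # no value until nextval() is called once

    def nextval(self):
        if self._current is None:
            nxt = self.start
        else:
            nxt = self._current + self.increment
            if nxt > self.maxvalue or nxt < self.minvalue:
                if not self.cycle:
                    raise OverflowError("sequence exhausted (NOCYCLE)")
                # Ascending sequences wrap to MINVALUE, descending ones to MAXVALUE.
                nxt = self.minvalue if self.increment > 0 else self.maxvalue
        self._current = nxt
        return nxt

    def currval(self):
        if self._current is None:
            raise RuntimeError("CURRVAL is not defined before the first NEXTVAL")
        return self._current


order_seq = SequenceSketch(start=1, increment=1, minvalue=1, maxvalue=3, cycle=True)
values = [order_seq.nextval() for _ in range(5)]
current = order_seq.currval()
print(values, current)
```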

❖ Introduction of Synonyms:
A synonym is an alternative name for objects such as tables, views, sequences,
stored procedures, and other database objects.
Syntax: CREATE [OR REPLACE] [PUBLIC] SYNONYM [SCHEMA.]
SYNONYM_NAME FOR [SCHEMA.] OBJECT_NAME [@DBLINK];

Example: Create a synonym for the table EMP held by the user SCOTT.

CREATE PUBLIC SYNONYM EMPLOYEES FOR SCOTT.EMP;

Now users of other schemas can reference the table EMP, which is now called
EMPLOYEES, without having to prefix the table name with the schema name
SCOTT.

For example, SELECT * FROM EMPLOYEES;

EXERCISE:
1. Create a sequence "seq3" with the following parameters:
Increment by -1, cache 20, cycle, noorder and which will generate the numbers
from 1 to 5000 in descending order.
2. Create a simple index on the "orderid" column of a table 'order'.
3. Create a synonym "employee" for the table emp.

EVALUATION:

Understanding / Problem solving (4)    Involvement (3)    Timely Completion (3)    Total (10)

Signature with date:

EXPERIMENT NO: 8 DATE: / /

TITLE: To study the concepts of Normalization.

OBJECTIVES: On completion of this experiment the student will be able to explain…

➢ what normalization is.
➢ the reasons for normalization.
➢ the forms of normalization.

THEORY:

❖ WHAT IS NORMALIZATION?
➢ "Normalization is essentially the process of taking a wide table with lots of columns
but few rows and redesigning it as several narrow tables with fewer columns but
more rows."
A properly normalized design allows you to use storage space efficiently, eliminate
redundant data, reduce or eliminate inconsistent data, and ease the data
maintenance burden. Before looking at the forms of normalization, you need to
know one cardinal rule for normalizing a database:
"You must be able to reconstruct the original flat view of the data."

❖ Normalization is carried out for the following reasons:

1. To structure the data between tables so that data maintenance is simplified.
2. To allow data retrieval at optimal speed.
3. To simplify data maintenance through updates, inserts and deletes.
4. To reduce the need to restructure tables as new application requirements arise.
5. To improve the quality of design for an application by rationalization of table data.

❖ Forms of normalization:
Relational database theorists have divided normalization into several rules called
normal forms.
• First Normal Form: No repeating groups.
• Second Normal Form: No non-key attributes depend on a portion of the primary
key.
• Third Normal Form: No non-key attributes depend on other non-key attributes.
• Boyce-Codd normal form (BCNF): Every non-trivial functional dependency in
the table is a dependency on a superkey.
• Fourth Normal Form: Every non-trivial multivalued dependency in the table is a
dependency on a superkey.

• Fifth Normal Form: Every non-trivial join dependency in the table is implied by
the superkeys of the table.
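
The cardinal rule quoted above, that the original flat view must be reconstructible, can be checked mechanically. The sketch below is an illustration on invented data, assuming Python's built-in sqlite3 module as a stand-in for Oracle: a flat order list in which the customer name and city depend only on the customer number is split into two tables, and a join rebuilds the flat rows.

```python
import sqlite3

# Flat view: (order_no, cust_no, cust_name, city). Customer details repeat per order.
flat_rows = [
    ("O1", "C1", "ASHA", "SURAT"),
    ("O2", "C1", "ASHA", "SURAT"),
    ("O3", "C2", "RAVI", "RAJKOT"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_no TEXT PRIMARY KEY, cust_name TEXT, city TEXT)")
conn.execute("CREATE TABLE orders (order_no TEXT PRIMARY KEY, "
             "cust_no TEXT REFERENCES customer(cust_no))")

for order_no, cust_no, cust_name, city in flat_rows:
    # Each customer is stored once, however many orders refer to it.
    conn.execute("INSERT OR IGNORE INTO customer VALUES (?, ?, ?)", (cust_no, cust_name, city))
    conn.execute("INSERT INTO orders VALUES (?, ?)", (order_no, cust_no))

customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]

# Joining the narrow tables reconstructs the original flat view.
rebuilt = conn.execute("""
    SELECT o.order_no, c.cust_no, c.cust_name, c.city
    FROM orders o JOIN customer c ON c.cust_no = o.cust_no
    ORDER BY o.order_no
""").fetchall()
print(customer_count, rebuilt == flat_rows)
```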

EXERCISE:
1) Normalize the following table up to third normal form:

Author Last Name | Author First Name | Book Title | Subject | Publisher | Collection or Library | Building
Berdahl | Robert | Politics | History | Wiley | PCL General Stacks | B – Block
Yudof | Mark | Child Abuse | Legal Procedures | Person | Law Library | C – Block
Harmon | Glynn | Human Memory and Knowledge | Cognitive Psychology | TMH | PCL General Stacks | B – Block
Graves | Robert | The Golden Fleece | Greek Literature | Wiley | Classics Library | D – Block
Miksa | Francis | Charles Ammi Cutter | Library Biography | Person | Library and Information Science Collection | B – Block
Hunter | David | Music Publishing and Collecting | Music Literature | TMH | Fine Arts Library | A – Block
Graves | Robert | English and Scottish Ballads | Folksong | Mahajan | PCL General Stacks | B – Block

EVALUATION:

Understanding / Problem solving (4)    Involvement (3)    Timely Completion (3)    Total (10)

Signature with date:

EXPERIMENT NO: 9 DATE: / /

TITLE: Case study on NoSQL

OBJECTIVES: On completion of this experiment the student will be able to…

➢ understand how to handle very large volumes of data properly.

THEORY:

❖ WHAT IS NOSQL?
➢ NoSQL refers to non-relational database management systems, which differ from traditional relational
database management systems in some significant ways. They are designed for distributed data stores
with very large-scale storage needs (for example Google or Facebook, which collect terabits
of data every day for their users). Such data stores may not require a fixed schema, avoid join
operations and typically scale horizontally.

Example:

Social-network graph:

➢ Each record: UserID1, UserID2
➢ Separate records: UserID, first_name, last_name, age, gender, ...
➢ Task: Find all friends of friends of friends of ... friends of a given user.

Wikipedia pages:

➢ Large collection of documents
➢ Combination of structured and unstructured data
➢ Task: Retrieve all pages about athletics at the Summer Olympics before 1950.

❖ RDBMS vs NoSQL
➢ RDBMS
- Structured and organized data
- Structured query language (SQL)
- Data and its relationships are stored in separate tables.
- Data Manipulation Language, Data Definition Language
- Tight Consistency
➢ NoSQL
- Stands for Not Only SQL
- No declarative query language
- No predefined schema
- Key-Value pair storage, Column Store, Document Store, Graph databases
- Eventual consistency rather than ACID properties
- Unstructured and unpredictable data
- CAP Theorem
- Prioritizes high performance, high availability and scalability
- BASE Transaction

➢ Brief history of NoSQL

The term NoSQL was coined by Carlo Strozzi in 1998. He used the term to name his open-source, lightweight
database, which did not have an SQL interface.
In early 2009, when last.fm wanted to organize an event on open-source distributed databases, Eric Evans, a
Rackspace employee, reused the term to refer to databases which are non-relational, distributed, and do not conform
to atomicity, consistency, isolation and durability, the four characteristic features of traditional relational database systems.
In the same year, NoSQL was discussed and debated a lot at the "no:sql(east)" conference held in Atlanta, USA.
After that, discussion and practice of NoSQL gained momentum, and NoSQL saw unprecedented growth.

➢ CAP Theorem (Brewer’s Theorem)

You must understand the CAP theorem when you talk about NoSQL databases, or in fact when designing any
distributed system. The CAP theorem states that there are three basic requirements which exist in a special relation when
designing applications for a distributed architecture.

Consistency - The data in the database remains consistent after the execution of an operation. For
example, after an update operation all clients see the same data.
Availability - The system is always on (service guarantee availability), with no downtime.

Partition Tolerance - The system continues to function even if the communication among the servers
is unreliable, i.e. the servers may be partitioned into multiple groups that cannot communicate with one another.

Theoretically it is impossible to fulfill all three requirements at once. Under CAP, a distributed
system can satisfy only two of the three requirements. Current NoSQL databases therefore follow different
combinations of C, A and P from the CAP theorem. Here is a brief description of the three combinations CA, CP and AP:
CA - Single site cluster, therefore all nodes are always in contact. When a partition occurs, the system blocks.
CP - Some data may not be accessible, but the rest is still consistent/accurate.
AP - System is still available under partitioning, but some of the data returned may be inaccurate.
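
The CP and AP trade-off can be made concrete with a toy model. The class below is a hypothetical two-replica store written purely for illustration (it is not how any particular NoSQL product is implemented): while the network is healthy every write reaches both replicas; during a partition a CP store refuses the write, while an AP store accepts it locally and lets the replicas disagree.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request in order to stay consistent."""


class TwoNodeStore:
    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP"
        self.nodes = [{}, {}]       # one dict per replica
        self.partitioned = False    # True when the replicas cannot talk

    def write(self, node, key, value):
        if not self.partitioned:
            for replica in self.nodes:      # healthy network: update both replicas
                replica[key] = value
        elif self.mode == "CP":
            raise Unavailable("cannot reach the other replica")
        else:                               # AP: accept locally, replicas diverge
            self.nodes[node][key] = value

    def read(self, node, key):
        return self.nodes[node].get(key)


def run(mode):
    store = TwoNodeStore(mode)
    store.write(0, "x", 1)
    store.partitioned = True                # the network splits
    try:
        store.write(0, "x", 2)
        accepted = True
    except Unavailable:
        accepted = False
    return accepted, store.read(0, "x"), store.read(1, "x")


cp_result = run("CP")   # write refused, both replicas still agree
ap_result = run("AP")   # write accepted, replica 1 returns stale data
print(cp_result, ap_result)
```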

❖ NoSQL pros/cons
➢ Advantages:

• High scalability
• Distributed computing
• Lower cost
• Schema flexibility, semi-structured data
• No complicated relationships

➢ Disadvantages:

• No standardization
• Limited query capabilities (so far)
• Eventual consistency is not intuitive to program for

➢ The BASE
The CAP theorem states that a distributed computer system cannot guarantee all of the following three properties at
the same time:

• Consistency
• Availability
• Partition tolerance
A BASE system gives up on consistency.

• Basically Available indicates that the system does guarantee availability, in terms of the CAP theorem.
• Soft state indicates that the state of the system may change over time, even without input. This is because of
the eventual consistency model.
• Eventual consistency indicates that the system will become consistent over time, given that the system
doesn't receive input during that time.

➢ ACID vs BASE

ACID | BASE
Atomic | Basically Available
Consistency | Soft state
Isolation | Eventual consistency
Durable |


❖ NoSQL Categories
There are four general types (most common categories) of NoSQL databases. Each of these categories has its own
specific attributes and limitations. No single solution is better than all the others; however, some
databases are better suited to specific problems. To clarify the NoSQL databases, let us discuss the most
common categories:

• Key-value stores
• Column-oriented

• Graph
• Document oriented

➢ Key-value stores

• Key-value stores are the most basic type of NoSQL database.
• Designed to handle huge amounts of data.
• Based on Amazon’s Dynamo paper.
• Key-value stores allow the developer to store schema-less data.
• In key-value storage, the database stores data as a hash table where each key is unique and the value can be a
string, JSON, BLOB (Binary Large Object) etc.
• In stores such as Redis, values may be strings, hashes, lists, sets or sorted sets, each stored against a key.
• For example, a key-value pair might consist of a key like "Name" that is associated with a value like "Robin".
• Key-value stores can be used as collections, dictionaries, associative arrays etc.
• Key-value stores follow the 'Availability' and 'Partition tolerance' aspects of the CAP theorem.
• Key-value stores work well for shopping cart contents, or individual values like color schemes, a
landing page URI, or a default account number.
Examples of key-value store databases: Redis, Dynamo, Riak, etc.
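
The access pattern of a key-value store is small enough to sketch with an in-memory dictionary. The KeyValueStore class below is a hypothetical illustration, not the API of Redis, Dynamo or Riak; it shows that the store only understands keys, while the value (here a JSON-encoded shopping cart) is opaque to it.

```python
import json


class KeyValueStore:
    """Minimal in-memory key-value store (illustrative only)."""

    def __init__(self):
        self._data = {}          # hash table: unique key -> opaque value

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()
store.put("Name", "Robin")
# The store does not interpret the value; the application encodes and decodes it.
store.put("cart:42", json.dumps({"items": ["P0001", "P0007"], "count": 2}))

cart = json.loads(store.get("cart:42"))
store.delete("Name")
print(cart["items"], store.get("Name"))
```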

➢ Column-oriented databases

• Column-oriented databases primarily work on columns, and every column is treated individually.
• Values of a single column are stored contiguously.
• A column store keeps data in column-specific files.
• In column stores, query processors work on columns too.
• All data within each column data file has the same type, which makes it ideal for compression.
• Column stores can improve the performance of queries because they can access only the specific columns needed.
• High performance on aggregation queries (e.g. COUNT, SUM, AVG, MIN, MAX).
• Well suited to data warehouses and business intelligence, customer relationship management (CRM), library
card catalogs etc.
Examples of column-oriented databases: BigTable, Cassandra, SimpleDB, etc.
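
Why aggregation is cheap in a column layout can be seen by pivoting a few rows. This is a minimal sketch on invented data using plain Python lists; real column stores add compression and on-disk formats that are not modelled here.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"empno": 1, "ename": "ALLEN", "sal": 1600},
    {"empno": 2, "ename": "WARD", "sal": 1250},
    {"empno": 3, "ename": "JONES", "sal": 2975},
]

# Column-oriented layout: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate reads only the column it needs; empno and ename are never touched.
total_sal = sum(columns["sal"])
max_sal = max(columns["sal"])

# Every value in one column has the same type, which is what makes it compress well.
same_type = all(isinstance(value, int) for value in columns["sal"])
print(total_sal, max_sal, same_type)
```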


➢ Graph databases
A graph data structure consists of a finite (and possibly mutable) set of ordered pairs, called edges or arcs, of certain
entities called nodes or vertices.
[Figure: a labeled graph of 6 vertices and 7 edges.]

What is a graph database?

• A graph database stores data in a graph.
• It is capable of elegantly representing any kind of data in a highly accessible way.
• A graph database is a collection of nodes and edges.
• Each node represents an entity (such as a student or business) and each edge represents a connection or
relationship between two nodes.

• Every node and edge is defined by a unique identifier.
• Each node knows its adjacent nodes.
• As the number of nodes increases, the cost of a local step (or hop) remains the same.
• Indexes are used for lookups.
Here is a comparison between the classic relational model and the graph model:

Relational model | Graph model
Tables | Vertices and Edges set
Rows | Vertices
Columns | Key/value pairs
Joins | Edges

Examples of graph databases: OrientDB, Neo4J, Titan, etc.
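
The "friends of friends" task mentioned earlier is a traversal, and because each node knows its adjacent nodes, every hop is a local lookup rather than a join. The sketch below uses a plain adjacency dictionary and breadth-first search; the names and the within_hops helper are invented for illustration and do not correspond to any graph database's query language.

```python
from collections import deque

# Each node knows its adjacent nodes (undirected friendship edges).
friends = {
    "asha": {"ravi", "meena"},
    "ravi": {"asha", "john"},
    "meena": {"asha"},
    "john": {"ravi", "lila"},
    "lila": {"john"},
}


def within_hops(graph, start, max_hops):
    """Return every node reachable from start in at most max_hops edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_hops:
            continue                      # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, dist + 1))
    seen.discard(start)
    return seen


friends_of_friends = within_hops(friends, "asha", 2)
print(sorted(friends_of_friends))
```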


➢ Document Oriented databases

• A document database is a collection of documents.
• Data in this model is stored inside documents.
• A document is a key-value collection where the key allows access to its value.
• Documents are not typically forced to have a schema and therefore are flexible and easy to change.
• Documents are stored in collections in order to group different kinds of data.
• Documents can contain many different key-value pairs, or key-array pairs, or even nested documents.
Here is a comparison between the classic relational model and the document model:

Relational model | Document model
Tables | Collections
Rows | Documents
Columns | Key/value pairs
Joins | not available

Examples of document-oriented databases: MongoDB, CouchDB, etc.
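
The document model can be imitated with nested dictionaries. In the sketch below the collection, its fields and the find helper are all invented for illustration; it is not the MongoDB or CouchDB API. It shows two documents with different fields living in one collection, and a lookup on a nested key written as a dotted path.

```python
# One collection, two documents with different shapes (no fixed schema).
collection = [
    {"_id": 1, "name": "Robin", "address": {"city": "Surat"}, "tags": ["sql", "nosql"]},
    {"_id": 2, "name": "Asha", "phones": ["98250", "97240"]},
]


def find(docs, path, value):
    """Return the documents whose field at the dotted path equals value."""
    matches = []
    for doc in docs:
        current = doc
        for part in path.split("."):
            # Walk into nested documents; a missing key yields None.
            current = current.get(part) if isinstance(current, dict) else None
        if current == value:
            matches.append(doc)
    return matches


surat_ids = [doc["_id"] for doc in find(collection, "address.city", "Surat")]
asha_ids = [doc["_id"] for doc in find(collection, "name", "Asha")]
print(surat_ids, asha_ids)
```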

➢ Production deployment
There is a large number of companies using NoSQL. To name a few :

• Google
• Facebook
• Mozilla
• Adobe
• Foursquare
• LinkedIn
• Digg
• McGraw-Hill Education
• Vermont Public Radio

EVALUATION:

Understanding / Problem solving (4)    Involvement (3)    Timely Completion (3)    Total (10)

Signature with date:

EXPERIMENT NO: 10 DATE: / /

TITLE: Case study on Hadoop

OBJECTIVES: On completion of this experiment the student will be able to…

➢ understand how to handle huge amounts of unstructured data properly.

THEORY:

❖ WHAT IS BIGDATA?
Data that is very large in size is called Big Data. Normally we work with data of a few MB (Word documents, Excel sheets)
or at most a few GB (movies, code), but data in petabytes, i.e. of the order of 10^15 bytes, is called Big Data. It is stated that
almost 90% of today's data has been generated in the past 3 years.

➢ Where does this data come from?

This data comes from many sources, such as:

o Social networking sites: Facebook, Google and LinkedIn all generate huge amounts of data on a day
to day basis, as they have billions of users worldwide.
o E-commerce sites: Sites like Amazon, Flipkart and Alibaba generate huge amounts of logs from which users'
buying trends can be traced.
o Weather stations: Weather stations and satellites produce very large volumes of data, which are stored and
processed to forecast the weather.
o Telecom companies: Telecom giants like Airtel and Vodafone study user trends and publish
their plans accordingly; for this they store the data of their millions of users.
o Share market: Stock exchanges across the world generate huge amounts of data through their daily transactions.

➢ 3V's of Big Data

1. Velocity: Data is growing at a very fast rate. It is estimated that the volume of data doubles
every 2 years.
2. Variety: Nowadays data is not stored only in rows and columns. Data is structured as well as unstructured.
Log files and CCTV footage are unstructured data. Data that can be saved in tables, such as the
transaction data of a bank, is structured data.
3. Volume: The amount of data we deal with is very large, of the order of petabytes.

❖ WHAT IS HADOOP?

Hadoop is an open source framework from Apache used to store, process and analyze data that is very
large in volume. Hadoop is written in Java and is not an OLAP (online analytical processing) system; it is used for
batch/offline processing. It is used by Facebook, Yahoo, Google, Twitter, LinkedIn and many more. Moreover,
it can be scaled up just by adding nodes to the cluster.
➢ Modules of Hadoop
1. HDFS: Hadoop Distributed File System. Google published its GFS (Google File System) paper, and HDFS
was developed on the basis of it. Files are broken into blocks and stored on nodes across the distributed
architecture.
2. YARN: Yet Another Resource Negotiator is used for job scheduling and for managing the cluster.
3. MapReduce: This is a framework that helps Java programs perform parallel computation on data using
key-value pairs. The Map task takes input data and converts it into a data set that can be computed as key-value
pairs. The output of the Map task is consumed by the Reduce task, and the output of the reducer gives the desired
result.
4. Hadoop Common: These Java libraries are used to start Hadoop and are used by the other Hadoop modules.
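
The Map and Reduce steps described in item 3 can be traced on a word count, the usual first example. The sketch below runs in a single Python process and is only a model of the data flow; it does not use the Hadoop API, and the function names map_task, shuffle and reduce_task are chosen here for illustration. In a real cluster the map and reduce tasks run in parallel on different nodes and the framework performs the shuffle.

```python
from collections import defaultdict


def map_task(line):
    """Map: turn one input line into (key, value) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_task(key, values):
    """Reduce: combine the values of one key into the final result."""
    return key, sum(values)


lines = ["Hadoop stores data", "Hadoop processes data"]
mapped = [pair for line in lines for pair in map_task(line)]
counts = dict(reduce_task(key, values) for key, values in shuffle(mapped).items())
print(counts)
```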

➢ Advantages of Hadoop
o Fast: In HDFS the data is distributed over the cluster and mapped, which helps in faster retrieval. The
tools that process the data often run on the same servers, which reduces the processing time. Hadoop is able to
process terabytes of data in minutes and petabytes in hours.
o Scalable: A Hadoop cluster can be extended just by adding nodes to the cluster.
o Cost effective: Hadoop is open source and uses commodity hardware to store data, so it is very cost effective
compared to a traditional relational database management system.
o Resilient to failure: HDFS replicates data over the network, so if one
node is down or some other network failure happens, Hadoop uses another copy of the data.
Normally, data is replicated three times, but the replication factor is configurable.

➢ Hadoop Installation

Environment required for Hadoop: The production environment of Hadoop is UNIX, but it can also be used on
Windows using Cygwin. Java 1.6 or above is needed to run MapReduce programs. For a Hadoop installation from a
tar ball on the UNIX environment you need:

1. Java Installation
2. SSH installation
3. Hadoop Installation and File Configuration

➢ 1) Java Installation

Step 1. Type "java -version" at the prompt to find out whether Java is installed. If it is not, download Java from
http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html . The tar file
jdk-7u71-linux-x64.tar.gz will be downloaded to your system.

Step 2. Extract the file using the command below

# tar zxf jdk-7u71-linux-x64.tar.gz

Step 3. To make Java available to all users of the UNIX system, move the extracted directory to /usr/lib and set the
path. At the prompt, switch to the root user and then type the command below to move the JDK to /usr/lib.

# mv jdk1.7.0_71 /usr/lib/

Now add the following lines to the ~/.bashrc file to set up the path.

export JAVA_HOME=/usr/lib/jdk1.7.0_71
export PATH=$PATH:$JAVA_HOME/bin

Now, you can check the installation by typing "java -version" in the prompt.

➢ 2) SSH Installation

SSH is used to interact with the master and slave computers without any prompt for a password. First of
all, create a hadoop user on the master and slave systems

# useradd hadoop
# passwd hadoop

To map the nodes, open the hosts file in the /etc/ folder on all the machines and enter the IP addresses along
with their host names.

# vi /etc/hosts

Enter the lines below

190.12.1.114 hadoop-master
190.12.1.121 hadoop-slave-one
190.12.1.143 hadoop-slave-two

Set up an SSH key on every node so that the nodes can communicate among themselves without a password.
The commands for this are:

# su hadoop
$ ssh-keygen -t rsa
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-master
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-one
$ ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-two
$ chmod 0600 ~/.ssh/authorized_keys
$ exit

➢ 3) Hadoop Installation

Hadoop can be downloaded from http://developer.yahoo.com/hadoop/tutorial/module3.html

Now extract Hadoop into its installation directory.

$ sudo mkdir /usr/hadoop
$ sudo tar vxzf hadoop-2.2.0.tar.gz -C /usr/hadoop --strip-components=1

Change the ownership of the Hadoop folder

$ sudo chown -R hadoop /usr/hadoop

Change the Hadoop configuration files:

All the files are present in /usr/hadoop/etc/hadoop

1) In the hadoop-env.sh file add

export JAVA_HOME=/usr/lib/jdk1.7.0_71

2) In core-site.xml add the following between the configuration tags,

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop-master:9000</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>

3) In hdfs-site.xml add the following between the configuration tags,

<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/hadoop/dfs/name/data</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/hadoop/dfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

4) Open mapred-site.xml and make the change shown below

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop-master:9001</value>
  </property>
</configuration>

5) Finally, update your $HOME/.bashrc

1. cd $HOME
2. vi .bashrc
3. Append the following lines at the end, then save and exit

#Hadoop variables
export JAVA_HOME=/usr/lib/jdk1.7.0_71
export HADOOP_INSTALL=/usr/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL

On the slave machines, install Hadoop using the commands below

# su hadoop
$ cd /usr
$ scp -r hadoop hadoop-slave-one:/usr/hadoop
$ scp -r hadoop hadoop-slave-two:/usr/hadoop

Configure the master node and the slave nodes

$ vi etc/hadoop/masters
hadoop-master

$ vi etc/hadoop/slaves
hadoop-slave-one
hadoop-slave-two

After this, format the name node and start all the daemons

# su hadoop
$ cd /usr/hadoop
$ bin/hadoop namenode -format

$ cd $HADOOP_INSTALL/sbin
$ start-all.sh

EVALUATION:

Understanding / Problem solving (4)    Involvement (3)    Timely Completion (3)    Total (10)

Signature with date:
