(AMIT) WaterMarking On Database Microproject Report
By
Submitted by
Prof. S.G. Deshmukh
Principal
ABSTRACT
With the rapid growth of the internet and networking techniques, transferring
and sharing multimedia data has become common for many people. Multimedia data
is easily copied and modified, so the need for copyright protection is
increasing. Digital watermarking, the imperceptible marking of multimedia data
to “brand” ownership, has been proposed as a technique for copyright
protection of multimedia data.
Key Words:
Multimedia data
Copyright protection
Digital watermarking
Invisibility
Copyright information
Finger protection
Fingerprinting
Copy protection
Broadcast monitoring
Images
Music clips
Digital video
Still images
Robust digital watermarking
Copyright infringement
Watermarking removal
INDEX
Abstract
Index
List of Figures
List of Tables
Acknowledgment
1 Introduction
2 Software and Hardware Required
3 Proposed Approach
3.1 Architecture
3.2 System Design
3.3 Procedure along with algorithm followed
3.4 Individual Contribution
4 Literature Review
5 Testing
6 Result and Discussion
7 Conclusion and Future Scope
References
List of Figures
Fig 1.1 Watermarking on Database
List of Tables
Table 3 Testing
ACKNOWLEDGMENT
Learning Department for his constant encouragement and patience throughout this
Machine Learning Department and Prof. A.N. Taur Coordinator for their constant
Amit Giram
Artificial Intelligence and Machine Learning
MIT Polytechnic, Chh. Sambhajinagar
1. Introduction
Figure 1.1
Creating a watermarked database involves embedding watermarks into the data stored in the
database for purposes such as copyright protection, data tracking, or security. The process
can be outlined as follows:
1. Select Data for Watermarking:
Determine which data in the database needs watermarking. This may include specific records,
columns, or files, depending on your objectives.
2. Choose a Watermarking Technique:
Select the appropriate watermarking technique based on your goals. Common techniques
include:
Visible Watermarking: Overlaying visible marks like text, logos, or symbols on the data
to indicate ownership or copyright.
Invisible Watermarking: Embedding information within the data in a way that is not
immediately apparent, often for tracking and identification.
3. Embed the Watermark:
Apply the chosen watermarking technique to the selected data. The method for
embedding the watermark will depend on the specific technique used.
For visible watermarks, you would overlay the mark on the data.
For invisible watermarks, you might alter specific bits or embed information in a way
that doesn't affect the visual appearance.
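As a minimal sketch of the "alter specific bits" idea for invisible watermarks, the following Python example replaces the least significant bit (LSB) of integer values with watermark bits. The function names, the sample column, and the watermark bits are assumptions made for illustration; they are not taken from the project itself.

```python
def embed_bit(value: int, bit: int) -> int:
    """Return value with its least significant bit replaced by bit (0 or 1)."""
    return (value & ~1) | (bit & 1)


def extract_bit(value: int) -> int:
    """Read back the least significant bit of value."""
    return value & 1


def embed_bits(values: list[int], bits: list[int]) -> list[int]:
    """Embed bits into the first len(bits) values; the rest stay unchanged.

    Assumes len(bits) <= len(values).
    """
    marked = list(values)
    for i, bit in enumerate(bits):
        marked[i] = embed_bit(marked[i], bit)
    return marked


salaries = [52000, 48751, 61300, 45999]  # hypothetical numeric column
watermark = [1, 0, 1, 1]                 # hypothetical watermark bits

marked = embed_bits(salaries, watermark)
recovered = [extract_bit(v) for v in marked]

print(marked)     # [52001, 48750, 61301, 45999]
print(recovered)  # [1, 0, 1, 1]
```

Each value changes by at most 1, which is why this kind of embedding is only suitable for attributes that can tolerate small numeric errors.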
Data Encryption Component: In some cases, data may be encrypted before being
stored in the database for additional security. This component manages data encryption
and decryption.
User Access Control: This component ensures that only authorized users have access to
the data, and it can enforce access policies.
The process typically involves the embedding component adding watermarks to data before it's
stored in the database. When data is retrieved, the extraction component checks for watermarks,
verifying data integrity and authenticity.
Data encryption and user access control may be integrated to enhance security and control over
data access. This architecture provides a high-level overview of how watermarking can be
integrated into a database system to enhance data security, integrity, and traceability. The
specific implementation details will vary based on the chosen watermarking technique, database
technology, and security requirements.
Figure 1.3
Regular Auditing:
Conduct regular audits of your database to ensure compliance with copyright and
licensing agreements. This can help identify any unauthorized or improper use of the
data.
Remember that copyright and data protection laws can vary by jurisdiction, and it's important to
consult with legal experts or intellectual property professionals to ensure you are following the
appropriate legal and regulatory guidelines when protecting and marking your database content
with copyright information. Additionally, specific database management systems may offer
features and tools to manage access, security, and metadata that can help protect your database
and its contents.
3. Proposed Approach
3.1 – Architecture Diagram
Figure 3.1
Watermarking a database typically involves adding metadata or audit information to the database
records. The architecture for watermarking a database is relatively straightforward and involves
the following components:
1. Database Management System (DBMS):
The database management system is responsible for storing and managing the data in the
database. It includes the database engine, which handles data storage, retrieval, and
modification.
2. Application Layer:
This layer consists of applications or services that interact with the database. These
applications can be web applications, desktop software, or other systems that read and
write data to the database.
3. Watermarking Module:
The watermarking module is responsible for adding metadata or watermark information
to the database records. It can be a part of the application layer and can be implemented
in various ways, such as through stored procedures, triggers, or application code.
4. Metadata Repository:
This component is where the watermarking information is stored. It can be a separate
table in the same database or a separate database. The metadata repository should include
fields for the watermark information, such as timestamps, user IDs, descriptions, or any
other relevant details.
5. Users and Auditors:
Users and auditors interact with the application layer to access and modify data in the
database. They may also have access to the watermark information stored in the
metadata repository for auditing and tracking purposes.
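The components above can be sketched with Python's built-in sqlite3 module. In this illustrative example, the watermarking module is implemented as two triggers and the metadata repository is a separate table in the same database. The table names (customers, watermark_log), column names, and sample data are assumptions. SQLite has no notion of a logged-in user, so the user ID field mentioned above is omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);

-- Metadata repository: one audit row per change to customers.
CREATE TABLE watermark_log (
    log_id    INTEGER PRIMARY KEY AUTOINCREMENT,
    record_id INTEGER NOT NULL,
    action    TEXT NOT NULL,
    logged_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Watermarking module implemented as triggers.
CREATE TRIGGER customers_insert_mark AFTER INSERT ON customers
BEGIN
    INSERT INTO watermark_log (record_id, action) VALUES (NEW.id, 'INSERT');
END;

CREATE TRIGGER customers_update_mark AFTER UPDATE ON customers
BEGIN
    INSERT INTO watermark_log (record_id, action) VALUES (NEW.id, 'UPDATE');
END;
""")

# Application layer: ordinary reads and writes, unaware of the triggers.
conn.execute("INSERT INTO customers (name) VALUES (?)", ("Asha",))
conn.execute("UPDATE customers SET name = ? WHERE id = ?", ("Asha K.", 1))
conn.commit()

# Auditor view: read the metadata repository.
audit_rows = conn.execute(
    "SELECT record_id, action FROM watermark_log ORDER BY log_id"
).fetchall()
print(audit_rows)  # [(1, 'INSERT'), (1, 'UPDATE')]
```

Using triggers keeps the audit logic inside the DBMS, so every application that writes to the table is covered without code changes.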
Figure 3.2
Here's an explanation of the components in the system diagram:
Data Source / App: This represents the source of data, which could include data
ingestion processes, various applications, or user interactions.
Database Server: This is where your data is stored, which can be a relational or NoSQL
database server.
Data Watermarking System: This component is responsible for handling
watermarking-related tasks. It includes various functionalities:
Watermark Generation Module: Generates watermarks containing relevant
information.
Watermark Insertion Module: Embeds the watermark into the data stored in the
database.
Watermark Verification Module: Verifies the watermarks to ensure data integrity.
Access Control Mechanism: Controls and manages who can access and modify
watermarked data.
Encryption Mechanism: Provides an additional layer of security by encrypting the data,
making unauthorized access more difficult.
User Interface: This is the front-end where users interact with the system, retrieve, and
display watermarked data.
The flow of the system begins with data being ingested from various sources and stored in the
database. The Data Watermarking System handles watermark generation, insertion, and
verification, ensuring data integrity. The User Interface allows users to interact with watermarked
data.
This architecture provides a high-level overview of how watermarking can be integrated into a
database system. The specific implementation may vary depending on your use case, database
technology, and security requirements.
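One possible way to realize the Watermark Generation Module is to derive the watermark bits from a secret key and an owner identifier with HMAC-SHA256, so that only the key holder can regenerate the same bits. This is a sketch under that assumption; the function names, key, and owner string are hypothetical.

```python
import hashlib
import hmac


def generate_watermark(secret_key: bytes, owner_id: str, n_bits: int = 32) -> list[int]:
    """Derive n_bits watermark bits from a secret key and an owner identifier."""
    if not 0 < n_bits <= 256:
        raise ValueError("n_bits must be between 1 and 256 for SHA-256")
    digest = hmac.new(secret_key, owner_id.encode("utf-8"), hashlib.sha256).digest()
    bits = []
    for byte in digest:
        for shift in range(7, -1, -1):  # most significant bit first
            bits.append((byte >> shift) & 1)
    return bits[:n_bits]


def verify_watermark(secret_key: bytes, owner_id: str, candidate_bits: list[int]) -> bool:
    """Regenerate the watermark and compare it with the candidate bits."""
    expected = generate_watermark(secret_key, owner_id, len(candidate_bits))
    return expected == list(candidate_bits)


key = b"demo-secret-key"   # hypothetical secret key
owner = "example-owner"    # hypothetical owner identifier

wm = generate_watermark(key, owner, n_bits=16)
print(len(wm), verify_watermark(key, owner, wm))
```

The generated bits would then be passed to the insertion module, and the verification module would regenerate them from the same key during checks.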
3.3 Procedure along with algorithms followed
Watermarking is a technique used to embed information into digital media, such as images,
audio, or video, in order to protect intellectual property rights and prevent unauthorized copying
or distribution. The insertion algorithm is one of the methods used to embed watermarks into
digital data.
In the context of watermarking on a database, the insertion algorithm can be utilized to embed
watermarks directly into the database records. This approach allows for the seamless integration
of watermarking techniques with the database management system (DBMS), providing an
efficient and effective way to protect data integrity and ownership.
The insertion algorithm works by modifying the original data in a way that is imperceptible to
human observers but can be detected and extracted by authorized parties. The process involves
selecting specific locations within the database records where the watermark bits can be inserted
without causing significant changes to the original data. These locations can be determined based
on various factors, such as the characteristics of the data and the desired robustness of the
watermark.
Once the locations are identified, the insertion algorithm modifies the selected bits in a controlled
manner to embed the watermark information. This modification can be achieved through various
techniques, such as bit substitution, quantization, or spread spectrum modulation. The goal is to
ensure that the embedded watermark remains robust against common attacks, such as
compression, filtering, or cropping.
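The location selection and bit substitution described above can be sketched as follows. This example assumes a keyed hash of each record's primary key decides whether the record carries a watermark bit and which bit it carries; the function names, the gamma parameter (roughly one in gamma records is marked), and the sample data are illustrative assumptions rather than the project's actual algorithm.

```python
import hashlib
import hmac


def keyed_hash(secret_key: bytes, primary_key: int) -> int:
    """Hash the primary key under the secret key and return a 64-bit integer."""
    mac = hmac.new(secret_key, str(primary_key).encode("utf-8"), hashlib.sha256)
    return int.from_bytes(mac.digest()[:8], "big")


def insert_watermark(rows: dict[int, int], secret_key: bytes,
                     bits: list[int], gamma: int = 2) -> dict[int, int]:
    """Return a marked copy of rows (primary key -> integer value)."""
    marked = dict(rows)
    for pk, value in rows.items():
        h = keyed_hash(secret_key, pk)
        if h % gamma == 0:                        # location selected by the key
            bit = bits[(h // gamma) % len(bits)]  # which watermark bit goes here
            marked[pk] = (value & ~1) | bit       # bit substitution in the LSB
    return marked


rows = {pk: 1000 + 7 * pk for pk in range(1, 51)}  # hypothetical records
bits = [1, 0, 1, 1, 0, 0, 1, 0]                    # hypothetical watermark
marked = insert_watermark(rows, b"demo-secret-key", bits, gamma=3)

changed = sum(1 for pk in rows if rows[pk] != marked[pk])
print(f"{changed} of {len(rows)} values were modified")
```

Because the selection depends on the secret key, someone without the key cannot tell which records carry watermark bits.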
This approach also enables seamless integration with existing DBMS functionalities, such as
querying and indexing, without compromising performance. In addition, it provides a
transparent and non-intrusive approach to watermarking that does not require modifications to
the underlying database structure or schema.
It is important to note that while insertion algorithms can effectively embed watermarks into a
database, the extraction and verification of these watermarks require additional techniques. These
techniques involve analyzing the database records and comparing them against the original data
or reference watermarks to detect any modifications or tampering.
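A simple form of the comparison against a reference watermark is to measure how many extracted bits agree with the reference and to list the positions that differ. The sketch below assumes the bits have already been extracted from the records; the function names and the 0.8 threshold are illustrative choices, not values from the project.

```python
def match_ratio(extracted: list[int], reference: list[int]) -> float:
    """Fraction of positions where the extracted bits equal the reference bits."""
    if not reference or len(extracted) != len(reference):
        raise ValueError("bit sequences must be non-empty and of equal length")
    matches = sum(1 for e, r in zip(extracted, reference) if e == r)
    return matches / len(reference)


def is_watermark_present(extracted: list[int], reference: list[int],
                         threshold: float = 0.8) -> bool:
    """Decide that the watermark is present if enough bits agree."""
    return match_ratio(extracted, reference) >= threshold


def tampered_positions(extracted: list[int], reference: list[int]) -> list[int]:
    """Indices where the extracted bits differ from the reference."""
    return [i for i, (e, r) in enumerate(zip(extracted, reference)) if e != r]


reference = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]  # hypothetical reference watermark
extracted = list(reference)
extracted[3] ^= 1                           # simulate one modified record

print(tampered_positions(extracted, reference))    # [3]
print(is_watermark_present(extracted, reference))  # True
```

A threshold below 1.0 lets the detector tolerate a few altered records while still rejecting data that does not carry the watermark.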
Consider the following array as a worked example of insertion sort:
12 11 13 5 6
First Pass:
Initially, the first two elements of the array are compared.
12 11 13 5 6
Here, 12 is greater than 11, so they are not in ascending order and 12 is not at its
correct position. Thus, swap 11 and 12.
11 12 13 5 6
For now, 11 is stored in the sorted sub-array.
Second Pass:
Now, move to the next two elements and compare them.
11 12 13 5 6
Here, 13 is greater than 12, so both elements are already in ascending order and no
swapping occurs. 12 is also stored in the sorted sub-array along with 11.
Third Pass:
Now, two elements are present in the sorted sub-array: 11 and 12.
Move forward to the next two elements, 13 and 5.
11 12 13 5 6
5 and 13 are not in their correct places, so swap them.
11 12 5 13 6
After swapping, elements 12 and 5 are not sorted, so swap again.
11 5 12 13 6
Here, 11 and 5 are still not sorted, so swap once more.
5 11 12 13 6
Now 5 is at its correct position.
Fourth Pass:
Move to the next two elements, 13 and 6.
5 11 12 13 6
Clearly, they are not sorted, so swap them.
5 11 12 6 13
Now 6 is smaller than 12, so swap again.
5 11 6 12 13
Swapping leaves 11 and 6 unsorted, so swap again.
5 6 11 12 13
Finally, the array is completely sorted.
Figure 3.3
Insertion Sort is a simple comparison-based sorting algorithm that builds the final sorted array
one item at a time. It is a particularly efficient algorithm for small datasets and is known for its
simplicity and ease of implementation. Here's a brief overview of the Insertion Sort algorithm:
Idea:
Start with a small portion of the array as a sorted segment (often a single element).
Repeatedly take the next unsorted element and insert it into its correct position within the
sorted segment.
Algorithm:
Iterate through the array from left to right.
For each element, compare it with the elements in the sorted segment and shift elements
to the right until the correct position is found.
Continue this process until the entire array is sorted.
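The steps above can be sketched in Python as follows. The function name is an illustrative choice, and the sample array is the one used in the worked example.

```python
def insertion_sort(items):
    """Return a new list with the elements of items in ascending order."""
    result = list(items)
    for i in range(1, len(result)):
        key = result[i]  # next unsorted element
        j = i - 1
        # Shift larger elements of the sorted segment one place to the right.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key  # insert the element at its correct position
    return result


print(insertion_sort([12, 11, 13, 5, 6]))  # [5, 6, 11, 12, 13]
```

The comparison uses a strict greater-than, so equal elements are never moved past each other and their relative order is preserved.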
Key Points:
It has a time complexity of O(n^2) in the worst and average cases, where n is the number
of elements to be sorted.
It's an in-place sorting algorithm, meaning it doesn't require additional memory for
sorting.
It's stable, meaning it preserves the relative order of equal elements.
It's best suited for small datasets or partially sorted data, where the number of inversions
is low.
It's simple to understand and implement.
Advantages:
Insertion Sort performs well for small arrays or nearly sorted data.
It's efficient for lists that are already partially sorted, as the inner loop has fewer
iterations in such cases.
Disadvantages:
It's not suitable for large datasets due to its quadratic time complexity.
There are more efficient sorting algorithms, such as Quick Sort and Merge Sort, for
larger datasets.
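The difference between nearly sorted and badly ordered input can be checked by counting how many element shifts the inner loop performs. This is an illustrative experiment; the function name and the input size of 100 are arbitrary choices.

```python
def count_shifts(items):
    """Run insertion sort on a copy of items and count inner-loop shifts."""
    data = list(items)
    shifts = 0
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        while j >= 0 and data[j] > key:
            data[j + 1] = data[j]
            j -= 1
            shifts += 1
        data[j + 1] = key
    return shifts


n = 100
sorted_input = list(range(n))
nearly_sorted = list(range(n))
nearly_sorted[10], nearly_sorted[11] = nearly_sorted[11], nearly_sorted[10]
reversed_input = list(range(n - 1, -1, -1))

print(count_shifts(sorted_input))    # 0
print(count_shifts(nearly_sorted))   # 1
print(count_shifts(reversed_input))  # 4950, which is n * (n - 1) / 2
```

The number of shifts equals the number of inversions in the input, which is why reversed input reaches the quadratic worst case while nearly sorted input finishes in close to linear time.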
In summary, Insertion Sort is a basic and intuitive sorting algorithm that is useful for small lists
or situations where the data is mostly ordered. However, for larger datasets, other sorting
algorithms like Merge Sort or Quick Sort are typically preferred due to their better performance
characteristics.
4. Literature Review
Watermarking in databases is an important and evolving area of research with various
applications in data security, privacy, access control, and provenance tracking. Below is a
literature review that highlights key studies, trends, and findings related to watermarking in
databases.
"Database Watermarking: A Survey" (2012)
This comprehensive survey paper provides an overview of watermarking techniques in
databases. It covers various watermarking schemes, their applications, and challenges.
The paper discusses both image watermarking and relational database watermarking,
outlining techniques and the issues related to protecting data integrity and ownership.
"Secure Database Systems: Watermarking Approaches and Challenges" (2008)
This research paper delves into the challenges and approaches for secure database
watermarking. It discusses the integration of watermarking within database management
systems (DBMS) and explores techniques for data provenance tracking, privacy
preservation, and authentication. The authors highlight the need for watermarking that is
adaptable to different DBMSs and emphasize the importance of watermarking schemes
that do not significantly degrade performance.
"Digital Watermarking for Relational Databases" (2013)
This paper introduces a watermarking technique specifically designed for relational
databases. It focuses on embedding watermarks in structured data, and it addresses issues
related to watermark placement, capacity, and performance. The study emphasizes the
importance of watermark detection mechanisms and the trade-offs between security and
efficiency.
"Database Watermarking: A Comprehensive Evaluation of Data Hiding in Database"
(2020)
This research evaluates various database watermarking techniques and assesses their
robustness, capacity, and computational overhead. It explores methods for protecting
data confidentiality, ownership, and tracking changes in databases. The study provides a
comparative analysis of watermarking techniques and their effectiveness.
"Watermarking in Big Data: Techniques and Challenges" (2017)
As big data becomes more prevalent, this paper discusses the application of
watermarking techniques to large-scale databases. It addresses the challenges of
watermarking in distributed and parallel processing environments. The authors highlight
the importance of scalable watermarking solutions and the need for considering the
unique characteristics of big data in watermarking schemes.
"Database Watermarking with Complex Queries" (2018)
This paper explores watermarking in the context of complex database queries. It
discusses techniques for embedding watermarks within query results and emphasizes the
importance of maintaining data integrity and authenticity, especially when databases are
queried for analytics or reporting.
"Privacy-Preserving Database Watermarking" (2016)
This study focuses on watermarking techniques that aim to protect the privacy of
individuals while maintaining data utility. It addresses the challenges of balancing data
privacy and watermarking in databases, considering regulations like GDPR and HIPAA.
"Scalable and Efficient Watermarking of Relational Databases" (2019)
Scalability is a significant concern in database watermarking. This paper discusses
techniques for efficient and scalable watermarking, addressing issues related to large
databases and distributed systems. The study also evaluates the performance of
watermarking schemes under various conditions.
In conclusion, the literature on watermarking in databases highlights the importance of protecting
data integrity, ownership, and privacy while considering the performance and scalability of
watermarking techniques. Researchers continue to explore innovative approaches to address the
evolving challenges in data security and privacy within the context of databases. Watermarking
techniques are essential for various applications, including intellectual property protection, data
provenance, access control, and compliance with data protection regulations.
5. TESTING
SR NO.  TYPE          DESCRIPTION
1.      Unit Testing  Unit testing is a fundamental practice in software
                      development that involves the testing of
                      individual components or units of a program to
                      ensure they function correctly in isolation. These
                      units are typically small, self-contained portions
                      of code, such as functions, methods, or classes.
                      The primary goal of unit testing is to validate that
                      each unit of code behaves as expected and to
                      detect and fix any defects or bugs early in the
                      development process.
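As an illustration of unit testing in this context, the sketch below tests a small bit-embedding function in isolation using Python's built-in unittest module. The function embed_bit and the test names are hypothetical examples, not the project's actual test suite.

```python
import unittest


def embed_bit(value: int, bit: int) -> int:
    """Unit under test: replace the least significant bit of value with bit."""
    return (value & ~1) | (bit & 1)


class EmbedBitTests(unittest.TestCase):
    def test_sets_bit_to_one(self):
        self.assertEqual(embed_bit(10, 1), 11)

    def test_clears_bit_to_zero(self):
        self.assertEqual(embed_bit(11, 0), 10)

    def test_keeps_value_when_bit_already_matches(self):
        self.assertEqual(embed_bit(11, 1), 11)
        self.assertEqual(embed_bit(10, 0), 10)


# Run the tests programmatically and keep the result object for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(EmbedBitTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```

Each test exercises one behaviour of the unit, so a failure points directly at the case that broke.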
Watermarking techniques need to be adaptable to various database management systems
(DBMSs). This adaptability is essential as organizations may use different DBMSs, and
a one-size-fits-all approach may not be suitable.
Privacy and Regulations:
Watermarking is increasingly relevant in the context of data privacy regulations such as
GDPR and HIPAA. It helps organizations protect personal and sensitive data while
ensuring compliance with privacy laws.
Data Provenance and Accountability:
Watermarking provides a means of establishing data provenance, which is valuable for
tracking changes in data and holding individuals or systems accountable for those
changes. It is crucial in scenarios where data integrity and auditability are paramount.
Scalability for Big Data:
Watermarking must adapt to the realities of big data. Researchers are working on
scalable watermarking solutions that can handle the enormous volumes of data generated
and processed in modern systems.
Customization and Trade-offs:
Watermarking solutions often need to be customized based on the specific needs of an
organization. There may be trade-offs between watermark capacity, security, and
performance that need to be carefully considered.
Future Scope:
The future of watermarking in databases holds significant potential for advancements and
innovations, including:
Performance Optimization:
Ongoing research will continue to focus on improving the performance of watermarking
techniques, minimizing computational overhead, and ensuring efficient processing.
Scalability for Big Data:
Watermarking solutions will evolve to accommodate the growing volumes of data in big
data environments, including handling large databases and distributed systems
effectively.
Customization and Adaptability:
Future watermarking solutions will be highly customizable and adaptable to various
database management systems (DBMSs) and tailored to the specific requirements of
organizations.
Privacy-Preserving Watermarking:
In line with data privacy regulations, future research will prioritize the development of
privacy-preserving watermarking techniques, offering robust protection for personal and
sensitive data.
Machine Learning Integration:
Machine learning algorithms will play a more significant role in watermarking,
automating watermark placement and optimizing detection, particularly for recognizing
anomalies and suspicious activities.
Cross-Domain Applications:
Watermarking in databases will find applications in diverse domains, extending beyond
traditional data management to data warehousing, cloud computing, the Internet of
Things (IoT), and blockchain.
Interoperability:
Efforts will be directed toward ensuring interoperability between various watermarking
solutions and databases, promoting flexibility in adopting different techniques.
Blockchain Integration:
Integration with blockchain technology can create tamper-evident data records,
contributing to data immutability and auditability.
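The tamper-evident property mentioned under Blockchain Integration can be illustrated with a simple hash chain, in which each record stores the hash of the previous one. This is only a sketch of the underlying idea and not a real blockchain; the function names, the genesis value, and the sample payloads are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first record


def record_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical form of the payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def build_chain(payloads: list[dict]) -> list[dict]:
    """Link each payload to the one before it through its hash."""
    chain = []
    prev = GENESIS
    for payload in payloads:
        current = record_hash(prev, payload)
        chain.append({"payload": payload, "prev_hash": prev, "hash": current})
        prev = current
    return chain


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash; any edited payload or broken link fails the check."""
    prev = GENESIS
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        if record_hash(prev, block["payload"]) != block["hash"]:
            return False
        prev = block["hash"]
    return True


chain = build_chain([{"id": 1, "amount": 100}, {"id": 2, "amount": 250}])
print(verify_chain(chain))           # True
chain[0]["payload"]["amount"] = 999  # simulate tampering with a stored record
print(verify_chain(chain))           # False
```

Changing any stored record invalidates its own hash and every link after it, which is what makes the records tamper-evident.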
In summary, watermarking in databases has already demonstrated its value in data protection and
management. The future scope of watermarking in databases lies in continuous research and
development, emphasizing performance enhancement, scalability, customization, privacy
preservation, integration with machine learning and emerging technologies, cross-domain
applications, interoperability, and blockchain integration. These developments will be essential
in addressing the evolving challenges in data security and privacy while meeting the demands of
a data-driven world.
REFERENCE