Monitoring and Supporting Data Conversion
NO: _____
MODULE TITLE: Monitoring and Supporting Data Conversion
NOMINAL DURATION: 40 hrs
LO1. Monitor data conversion
1.1. Defining concepts of data conversion and Data Terminologies
Data are raw, unorganized facts (such as letters, numbers, or symbols) that refer to, or represent,
conditions, ideas, or objects.
Data can be qualitative or quantitative.
Qualitative data is descriptive information (it describes something).
Quantitative data is numerical information (numbers). It is either discrete or continuous:
Discrete data can only take certain values (like whole numbers).
Continuous data can take any value (within a range).
Put simply: discrete data can be counted; continuous data can be measured.
Example:
Qualitative:
- It is brown and black
- It has long hair
- It has lots of energy
Quantitative:
- Discrete:
  - It has 4 legs
  - It has 10 fingers
- Continuous:
  - It weighs 25.5 kg
  - It is 565 mm tall
Data conversion is the translation of a file or database from one format (or one physical environment)
to another.
Often, when data is moved from one system to another, some form of data conversion is required to convert the data
to a format the receiving system can interpret.
Types of conversion:
Database conversion (SQL, MySQL, MS Access, XLS, XML, etc.)
File format conversion (PDF to Word)
Image conversion (GIF to JPG, TIFF, PNG, etc.)
Character or string conversion (numeric to alphabetic or vice versa)
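As a minimal sketch of two of the conversion types listed above, the Python snippet below converts numeric strings to numbers (character or string conversion) and converts CSV text to JSON (file format conversion). The function names and the sample data are illustrative assumptions, not part of any particular conversion tool.

```python
import csv
import io
import json


def string_to_number(text):
    """Convert a numeric string to int or float (string-to-numeric conversion)."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def csv_to_json(csv_text):
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows)


# Hypothetical sample data for illustration only.
sample_csv = "name,legs\nRex,4\nTweety,2\n"
converted = csv_to_json(sample_csv)
```

Note that the CSV reader returns every value as text, so a numeric column such as legs still needs a string-to-number conversion if the receiving system expects numbers.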
1.2. Reading and Analyzing Existing Data Conversion Documents
The data conversion process can often be a complex and difficult task during an implementation.
A data conversion begins with analysis of the source data and continues through to system
testing and user acceptance.
Throughout the conversion process, we perform quality control checks to ensure correctness of the conversion.
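One possible quality control check is to compare the source records with the converted records. The sketch below assumes both sets are available as lists of Python dictionaries that share a unique key field; the function name check_conversion and the record layout are assumptions made for illustration.

```python
def check_conversion(source_rows, converted_rows, key):
    """Compare source and converted records and return a list of problems found."""
    problems = []
    if len(source_rows) != len(converted_rows):
        problems.append(
            f"row count differs: {len(source_rows)} source, {len(converted_rows)} converted"
        )
    # Index the converted records by key so each source record can be looked up.
    converted_by_key = {row[key]: row for row in converted_rows}
    for row in source_rows:
        match = converted_by_key.get(row[key])
        if match is None:
            problems.append(f"missing record with {key}={row[key]}")
        elif match != row:
            problems.append(f"record with {key}={row[key]} changed during conversion")
    return problems
```

An empty list means the check found no differences; any other result lists what should be investigated before user acceptance.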
1.3. Understanding Data and Its Characteristics
1.3.1. Data Conversion Systems and Tools
A data conversion tool allows you to convert data both from and to a wide variety of formats (both directions
are supported), including:
SQL Server tables
Oracle tables
ODBC tables
OLE DB tables
Microsoft Access tables
XML files
Once a conversion type is defined, it can be saved and reused either in a future conversion or as a step within a
batch conversion.
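The idea of saving a conversion type and reusing it as a step in a batch can be sketched in Python as follows. This is not the interface of any real conversion tool: the registry SAVED_CONVERSIONS, the function run_batch, and the two sample conversions are hypothetical names chosen for the example.

```python
import csv
import io
import json


def csv_to_json(text):
    """Conversion step: CSV text with a header row to a JSON array."""
    return json.dumps(list(csv.DictReader(io.StringIO(text))))


def json_to_csv(text):
    """Conversion step: non-empty JSON array of flat objects to CSV text."""
    rows = json.loads(text)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Saved conversion types, looked up by name so they can be reused later.
SAVED_CONVERSIONS = {"csv_to_json": csv_to_json, "json_to_csv": json_to_csv}


def run_batch(data, step_names):
    """Apply saved conversions in order; each step receives the previous result."""
    for name in step_names:
        data = SAVED_CONVERSIONS[name](data)
    return data
```

For example, run_batch(text, ["csv_to_json", "json_to_csv"]) runs two saved conversions as one batch and, for simple data, returns the original CSV text.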
1.3.2. Data Modeling Methodologies
Data modeling is the formalization and documentation of existing processes and events that occur during
application software design and development.
Data modeling techniques and tools capture and translate complex system designs into easily understood
representations of the data flows and processes, creating a blueprint for construction or re-engineering.
A data model can be thought of as a diagram or flowchart that illustrates the relationships between data.
There are several different approaches to data modeling, including:
- Conceptual Data Modeling - identifies the highest-level relationships between different entities.
- Logical Data Modeling - illustrates the specific entities, attributes and relationships involved in a business
function.
- Physical Data Modeling - represents an application and database-specific implementation of a logical data
model.
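To make the difference between the logical and the physical level concrete, the sketch below describes two entities first as plain Python data classes (a logical view) and then as SQLite table definitions (one possible physical implementation). The Customer and Order entities and all column names are assumptions invented for this example.

```python
import sqlite3
from dataclasses import dataclass


# Logical model: entities and attributes, independent of any database product.
@dataclass
class Customer:
    customer_id: int
    name: str


@dataclass
class Order:
    order_id: int
    customer_id: int  # relationship: each order belongs to one customer
    total: float


# Physical model: the same entities expressed for one specific database (SQLite).
PHYSICAL_SCHEMA = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total       REAL NOT NULL
);
"""


def create_database():
    """Build an in-memory SQLite database from the physical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(PHYSICAL_SCHEMA)
    return conn
```

A conceptual model would sit one level higher still and state only that a customer places orders, without listing attributes.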
1.3.3. Data Conditioning and cleaning
Data conditioning (Pre-processing) is the use of data management and optimization techniques which result in
the intelligent routing, optimization and protection of data for storage or data movement in a computer system.
Data cleaning is the act of detecting and removing or correcting dirty data (i.e., data that is incorrect, out-of-date,
redundant, incomplete, or formatted incorrectly).
Data Cleaning helps to increase the overall efficiency of your data management systems and leads to an increase
in the productivity of the organization.
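A small Python sketch of data cleaning is shown below. It handles three of the kinds of dirty data named above (incorrect formatting, incomplete records, and redundant records) and assumes that records are flat dictionaries; the function name clean_records is hypothetical.

```python
def clean_records(records, required_fields):
    """Return cleaned copies of records: trimmed, complete, and de-duplicated."""
    cleaned = []
    seen = set()
    for record in records:
        # Correct formatting: strip stray whitespace from string values.
        fixed = {
            field: value.strip() if isinstance(value, str) else value
            for field, value in record.items()
        }
        # Remove incomplete records: every required field must have a value.
        if any(fixed.get(field) in (None, "") for field in required_fields):
            continue
        # Remove redundant records: skip exact duplicates already kept.
        fingerprint = tuple(sorted(fixed.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(fixed)
    return cleaned
```

Detecting out-of-date or factually incorrect values usually needs a trusted reference to compare against, so it is left out of this sketch.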
1.3.4. Data Transformation and integration
Data transformation is one step of the collective process known as extract, transform, and load (ETL), which is one
of the most important processes when implementing a data warehouse from different data sources.
Data integration is the process of combining heterogeneous data sources into a single queryable schema so as to
get a unified view of the data.
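The following Python sketch walks through extract, transform, and load on a very small scale, integrating two sources that use different formats and field names into one schema. The source contents, the field names, and the dictionary used as the target store are all assumptions made for illustration.

```python
import csv
import io
import json

# Hypothetical sources with different formats and field names.
CSV_SOURCE = "id,full_name\n1,Abebe Kebede\n"
JSON_SOURCE = '[{"student_no": 2, "name": "Sara Ali"}]'


def extract_and_transform():
    """Extract from both sources and transform rows into one unified schema."""
    unified = []
    for row in csv.DictReader(io.StringIO(CSV_SOURCE)):
        unified.append({"id": int(row["id"]), "name": row["full_name"]})
    for row in json.loads(JSON_SOURCE):
        unified.append({"id": row["student_no"], "name": row["name"]})
    return unified


def load(rows, target):
    """Load transformed rows into the target store, keyed by id."""
    for row in rows:
        target[row["id"]] = row
    return target


warehouse = load(extract_and_transform(), {})
```

After loading, both records can be queried through the same id and name fields, regardless of which source they came from.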
1.3.5. Sorting, updating, exporting and converting data
Sorting data
Sorting data is the process of arranging items into a meaningful order so that you can analyze them more effectively.
Example:
sort text data into alphabetical order
sort numeric data into numerical order
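Both kinds of sorting can be tried with Python's built-in sorted function. The sample values below are made up for the example.

```python
names = ["Sara", "abebe", "Kebede"]
weights_kg = [25.5, 4.0, 12.25]

# Alphabetical order; key=str.lower makes the comparison case-insensitive.
names_sorted = sorted(names, key=str.lower)

# Numerical order, smallest first; reverse=True would give largest first.
weights_sorted = sorted(weights_kg)

# Numbers stored as text sort character by character unless converted first.
text_numbers = ["10", "9", "100"]
as_text = sorted(text_numbers)
as_numbers = sorted(text_numbers, key=int)
```

The last two lines show why the data type matters: sorted as text, "100" comes before "9", while sorting with key=int gives the expected numerical order.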
Updating Data
The modification of data that is already in the database is referred to as updating. The update operation allows you
to change an existing database record in a logical or physical file. You can update individual rows or all the rows in a
table. Each column can be updated separately without affecting the other columns.
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE some_column = some_value;
To perform an update, you need three pieces of information:
1. The name of the table and column to update,
2. The new value of the column,
3. Which row(s) to update (the WHERE condition).
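The UPDATE statement can be tried out with Python's built-in sqlite3 module. In the sketch below the products table, its columns, and its rows are assumptions invented for the example; only the shape of the UPDATE statement comes from the syntax shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO products (id, name, price) VALUES (?, ?, ?)",
    [(1, "pen", 5.0), (2, "book", 50.0)],
)

# Table and column to update: products.price; new value: 6.5; row to update: id = 1.
# The ? placeholders keep the values separate from the SQL text.
cursor = conn.execute("UPDATE products SET price = ? WHERE id = ?", (6.5, 1))
conn.commit()

rows_changed = cursor.rowcount
prices = dict(conn.execute("SELECT id, price FROM products"))
```

Only the row with id 1 changes. Leaving out the WHERE clause would apply the new price to every row in the table, so it is worth checking the row count after an update.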
This step is also crucial to maintaining high rates of data completeness during the course of the monitoring program.
Therefore, data must be validated as soon as possible, within one to two days after they are transferred. The sooner the site
operator is notified of a potential measurement problem, the lower the risk of data loss.
• Verified data must be presented and approved by appropriate persons.
Data can be validated either manually or automatically (computer-based). The latter is preferred to take advantage of the
power and speed of computers, although some manual review will always be required. Validation software may be
purchased from some data logger vendors, created in-house using popular spreadsheet programs (e.g., Microsoft Excel,
Quattro Pro, Lotus 1-2-3), or adapted from other utility environmental monitoring projects. An advantage of using spreadsheet
programs is that they can also be used to process data and generate reports.
There are essentially two parts to data validation, data screening and data verification:
Data Screening: The first part uses a series of validation routines or algorithms to screen all the data for suspect
(questionable and erroneous) values.
Data Verification: The second part requires a case-by-case decision on what to do with the suspect values: retain them as
valid, reject them as invalid, or replace them with redundant, valid values (if available). This part is where personal judgment
by a qualified person familiar with the monitoring equipment and local device is needed.
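The two parts of validation can be sketched in Python as two functions. The screening routine here is a simple range test, and the verification step uses a fixed rule (replace with a redundant value if one exists, otherwise reject) purely as a stand-in for the case-by-case judgment described above. The function names, the range limits, and the sample readings are assumptions.

```python
def screen(readings, low, high):
    """Data screening: flag values that are missing or outside [low, high]."""
    suspects = []
    for index, value in enumerate(readings):
        if value is None or not (low <= value <= high):
            suspects.append(index)
    return suspects


def verify(readings, suspects, redundant):
    """Data verification: replace each suspect with a redundant value, or reject it.

    In practice this decision is made case by case by a qualified person;
    the rule here is only a stand-in for that judgment.
    """
    validated = list(readings)  # work on a copy so the raw readings are kept
    for index in suspects:
        replacement = redundant.get(index)
        validated[index] = replacement  # None marks a rejected value
    return validated
```

For example, screening the readings [12.1, None, 250.0, 13.0] against the range 0.0 to 60.0 flags positions 1 and 2 as suspect; the reviewer then decides what happens to each.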
• Back-up copies of conversion files must be maintained and documented according to requirements.
To preserve the original raw data, make a copy of the original raw data set and apply the validation steps to the copy.
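A minimal Python sketch of this practice is shown below: the raw file is copied first and all later changes are made to the copy. The file names and the "_working" suffix are assumptions chosen for the example, and the demonstration runs in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def make_working_copy(raw_path):
    """Copy the raw data file and return the path of the copy to validate."""
    raw_path = Path(raw_path)
    copy_path = raw_path.with_name(raw_path.stem + "_working" + raw_path.suffix)
    shutil.copy2(raw_path, copy_path)  # copy2 also preserves file timestamps
    return copy_path


# Demonstration in a temporary directory; the file name is hypothetical.
with tempfile.TemporaryDirectory() as folder:
    raw = Path(folder) / "raw_data.csv"
    raw.write_text("value\n12.1\n250.0\n")
    working = make_working_copy(raw)
    # Validation changes are applied to the copy only.
    working.write_text("value\n12.1\n")
    raw_unchanged = raw.read_text() == "value\n12.1\n250.0\n"
    copy_name = working.name
```

Because the validation steps touch only the working copy, the original raw data set remains available if a validation decision has to be revisited.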
• Developing clear and coherent technical documentation
The Handbook of Technical Writing, 6th edition (Alred et al.), lists five steps to successful technical
writing:
1. Preparation
2. Research
3. Organization
4. Writing a draft
5. Revision
Preparation
Let's look at this document. Before I sat down to write this tutorial, I asked myself some
questions about its purpose, audience, scope, and medium.
Audience: The audience is members of the Perl Monks community. This community is focused on
Perl. It's a diverse community with wide skill levels from complete novice to Perl gurus,
but since this is a tutorial, I'm covering fairly basic material. It's a fairly friendly and
informal place, so a conversational writing tone is appropriate.
Scope: This is an introductory tutorial. The purpose is not to turn each and every Perl Monk into
a technical writer, but to give an overview of the documentation process.
Medium: Perl Monks, of course! More generally, this tutorial is an online article presented on a
web site. This is important: writing for the web is not the same as writing for print or any
other medium. What works well on paper may not work on the web. The web also has
features that print does not, such as the ability to link to more information. Use these
features to the fullest.
Research
The purpose of documentation is to convey information. In order to convey information, you must
understand it. In this case we are talking about your code, so hopefully you understand it - but can you
explain it?
If you have worked with a technical writer, you have undoubtedly faced a barrage of questions. "What
does this do?" "What does that do?" "How does one...?" This is research.
Organization
Poorly organized documentation may in fact be worse than no documentation. Consider what you
need to tell users, and how that information can best be conveyed. For example, if you're writing
installation instructions for a program, you will want to go with a sequential method of development:
step 1, step 2, and so on. If you're writing a history of versions, use a chronological method of
development. Choose the method that best suits your subject, your readers, and your purpose.
In software, commonly used methods of development are division and classification (explain each part's
function and how the parts work together) and general-to-specific. In general-to-specific
development, you begin with general information about the function of the software, and move to
more specific information.
Once you've decided on an organization scheme, prepare an outline. This provides a road map for
your writing. How you outline is a personal choice. Personally, I use an iterative approach: I start with
a very broad outline, such as
Technical Writing/Documentation Tutorial
    Introduction
    Principles of Technical Writing
        Preparation
        Research
        Organization
        Draft
        Revision
    Conclusion
    References
I then go through my outline and break categories into sub-categories and sub-sub-categories until I
feel I have a clear enough map. For example, I broke the Preparation category up in this manner:
Preparation
    Purpose
    Audience
    Scope
    Medium
Institution Name: ባህር ዳር ፖሊ ቴክኒክ ኮሌጅ (BAHIR DAR POLYTECHNIC COLLEGE)
Document No.: BTC/133-14
Title: INFORMATION SHEET
Issue No.: B0
Page No.: Page 7 of 7
Name of trainer: Sisay Date: ____/____/04
Writing a Draft
Once you have your outline, you can begin writing. Expand your outline into paragraphs. Some
advise not worrying about matters such as grammar, spelling, punctuation, and layout at this point.
Personally, I find it difficult to follow this advice. I usually write fairly polished drafts. Regardless of
your tactic, you will be revising your work in the next step.
You may wish to save the introduction for last, since you will have a better idea of what is covered in
the body of the document. You will also need to write a conclusion to your document - remember the
basic rule of technical writing:
Conclusion
You may regard documentation as a chore. You may regard it as an opportunity to learn. Regardless,
you do need to document your code. In this tutorial, I've presented the steps used in technical
writing/documentation: preparation, research, organization, writing, and revision.
In preparation, key points discussed were: establishing the purpose, assessing the audience,
determining the scope, and selecting the appropriate medium. The research section covered the
reasons for research and strategies for commenting code. In organization, methods of development
used in software documentation were introduced: sequential, chronological, division and
classification, and general-to-specific. Outlining was also discussed. Finally, strategies for writing and
revision were covered.
References
Alred, Gerald J., et al. 2000. Handbook of Technical Writing, 6th edition. Boston: Bedford/St.
Martin's.