Data Processing
DATA PROCESSING.
Data processing deals with how data is organized & processed in the computer.
DATA:
Data is a collection of facts & figures, which can be processed to produce information.
Examples:
In an educational environment, when students sit for exams, the grades obtained represent the
data to be processed by the computer. In this case, data can be Names of students & Marks
obtained.
In a business environment, data can be the No. of Hours worked, names of employees, Stock
Data can also be described as Raw data if they are not yet processed, i.e. if they do not yet
convey any particular meaning to a given activity within a given environment.
It therefore means that data are unprocessed facts consisting of details relating to
business transactions. For example, in a Payroll system, data are employees' names, basic
salary, department number, marital status, etc.
DATA PROCESSING:
The collection, manipulation & distribution of data (i.e. letters, numbers & graphic symbols)
to achieve certain objectives.
The processing may involve calculations, comparisons, decision-making and/or any other
logic to produce the required result.
The activity of manipulating the raw facts to generate a set of meaningful data (described as
Information), which is able to convey some meaning.
Those activities, which are concerned with the systematic recording, arranging, filing,
processing, and dissemination of facts relating to the physical events occurring in a business.
Data processing is a very important activity in any organization of any size or nature because it
generates information for decision-making.
If the data processing uses complicated processing tools or aids, e.g. the computer, it is described
as Electronic Data Processing (EDP).
Data processing techniques
INFORMATION.
Information is data, which is summarized and processed in the way you want it, so that it is
useful in your work.
The information in Payroll activity includes; Net pay, Total Tax deductions, etc. In Stock
Control, the information generated includes; Closing stock, Total cost of the items, Purchases,
Sales, etc.
The information is obtained by applying some processing procedures onto the raw data being
input. For example, to get the Net pay in a Payroll activity, the procedure would be;
Information is the end product of data processing available at the right place, the right time and
in the right form.
The information generated by the data processing activities is very important in the working
strategies of any organization, because it is used by the organization to make decisions.
It should: -
Data are the facts which relate to any particular activity, and do not have any specific meaning.
In a Manufacturing industry, data may be compared to raw materials and Information to finished
products. Just as raw materials are transformed into finished products, raw data are transformed
into information.
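As a rough illustration of raw data being transformed into information, the sketch below turns
payroll data into Net pay figures. The field names, the flat 10% tax and the formula used
(net pay = gross pay less tax) are assumptions made for the example; they are not the
procedure of any particular payroll system.

```python
# Raw data: facts that carry no particular meaning on their own.
raw_data = [
    {"name": "Employee A", "hours_worked": 160, "rate_per_hour": 50},
    {"name": "Employee B", "hours_worked": 120, "rate_per_hour": 80},
]

TAX_PERCENT = 10  # assumed flat rate, for illustration only


def process_payroll(records, tax_percent=TAX_PERCENT):
    """Turn raw payroll data into information (gross pay, tax, net pay)."""
    information = []
    for rec in records:
        gross = rec["hours_worked"] * rec["rate_per_hour"]
        tax = gross * tax_percent // 100
        information.append(
            {"name": rec["name"], "gross_pay": gross, "tax": tax, "net_pay": gross - tax}
        )
    return information


for line in process_payroll(raw_data):
    print(line["name"], line["net_pay"])
```

The hours and rates on their own are data; the Net pay produced from them is the information
on which a decision (how much to pay) is based.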
In order to generate information from data items, a set of processing activities has to be
performed on the data items in a specific sequence depending on the desired final result.
Performing these processes is known as Data processing.
Exercise.
The Data processing cycle refers to the various stages involved in converting data into information.
There are 5 primary elements/functions of a data processing system: Input, Processing,
Storage, Output, and Control.
ORIGINATION OF DATA
Data originates from Source documents, e.g. Time cards, Sales orders, Purchase orders,
Invoices, etc.
INPUT OF DATA
Data is recorded in a medium suitable for Input & handling by the data processing system,
e.g. Punched cards, floppy disks, etc.
STORAGE OF DATA
Data is stored in Filing cabinets, Microfilms, floppy disks, magnetic tapes, etc.
PROCESSING OF DATA
Data is entered into the data processing system, Processed, Sorted, Calculated, Compared,
Analyzed, etc.
OUTPUT OF INFORMATION
Summaries, Reports, & documents are prepared. Output consists of printed or typewritten
forms, etc.
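The stages above can be pictured as a small pipeline. The sketch below uses student marks as
the data; the function names, the pass mark of 50 and the in-memory list standing in for a
storage medium are all hypothetical.

```python
PASS_MARK = 50  # assumed for illustration

storage = []  # stands in for a file on disk or tape


def input_data(source_document):
    """Input: record data from a source document in a form suitable for processing."""
    return [(name.strip(), int(marks)) for name, marks in source_document]


def process(records):
    """Processing: sort by marks (highest first) and compare against the pass mark."""
    ranked = sorted(records, key=lambda rec: rec[1], reverse=True)
    return [(name, marks, "PASS" if marks >= PASS_MARK else "FAIL") for name, marks in ranked]


def store(results):
    """Storage: keep the processed records for later use."""
    storage.extend(results)


def output(results):
    """Output: prepare a report from the processed records."""
    return [f"{name}: {marks} ({remark})" for name, marks, remark in results]


source_document = [(" Student A ", "45"), ("Student B", "72")]
results = process(input_data(source_document))
store(results)
report = output(results)
print("\n".join(report))
```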
Notes.
Exercise I.
DATA COLLECTION.
Data Collection is the process involved in getting the data from the point of its origin to the
computer in a form suitable for processing.
Note. Data collection starts at the source of the raw data & ends when valid data is within the
computer in a form ready for processing.
Data Entry:
Nowadays, most end-users input data to the computer using Keyboards on PCs, Workstations, or
Terminals.
Data can originate in many forms, but the computer can only accept it in a machine-sensible
form.
1. The data to be processed by the computer must be presented in a Machine-sensible form (i.e.
in the language of a particular input device).
Note that most of the data originates in a form that is not machine-sensible. Therefore, the
data must undergo the process of Transcription before it is suitable for input to the
computer.
2. The process of Data collection involves getting the original data to the “processing center”,
transcribing it, sometimes converting it from one medium to another, and finally getting it
into the computer. This process involves a great number of people, many machines, and
much expense.
Data Capture:
Data Capture is the process of obtaining data in a computer-sensible form at the point of origin.
Obtaining of data in a computer-sensible form helps to avoid many of the problems of data entry.
The captured data may be stored in some intermediate form for later entry into the main
computer in the required form. If data is input directly into the computer at its point of origin,
the data entry is said to be On-Line. In addition, if the method of direct input is a terminal or
workstation, the method of input is known as Direct Data Entry (DDE).
STAGES IN DATA COLLECTION.
The process of data collection may involve any number of the following stages depending on the
methods used.
1. Data Creation.
Source document is the original document used to record data and/or instructions.
Most data is in the form of manually written or typewritten documents, i.e. the data
is on clerically prepared source documents.
(b). Data capture. This involves preparing the source document itself in a machine-
sensible form so that it may be used as input to the computer without the need for
transcription. The prepared source document is then read directly by a suitable device,
e.g. a Bar code reader.
Note. The method and medium adopted for data creation will depend on factors such as
Cost, Type of application, etc.
2. Data Transmission.
This will depend on the method & medium of data collection involved/adopted.
If the computer is located at a central point, the documents will be physically “transmitted”,
i.e. by the Post office or a Courier to the central point.
The data can also be transmitted by means of Telephone lines to the central computer. In this
case, no source documents would be involved in the transmission process.
3. Data Preparation.
Data Preparation is the term given to the transcription of data from the source document to
a machine-sensible medium.
Data is prepared in a particular medium & converted to another medium for faster input into
the computer.
For example; data might be prepared on Diskette, or captured onto Cassette, and then
converted to magnetic Tape for input.
The conversion will be done on a computer that is separate from the one for which the data is
intended.
5. Input.
The data, now in magnetic form, is put into the computer and subjected to validity checks by
a computer program before being used for processing.
6. Sorting.
This stage is required to re-arrange the data into the sequence required for processing.
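A minimal sketch of this stage, assuming each transaction is a small dictionary and that
`account_no` (a hypothetical field name) is the key on which processing is sequenced:

```python
transactions = [
    {"account_no": 1042, "amount": 250},
    {"account_no": 1007, "amount": 900},
    {"account_no": 1023, "amount": 120},
]

# Re-arrange the transactions into ascending key order before processing.
sorted_transactions = sorted(transactions, key=lambda rec: rec["account_no"])

print([rec["account_no"] for rec in sorted_transactions])
```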
7. Control.
In all the stages of data collection, control must be established and applied where necessary.
In other words, Control is usually applied throughout the whole process of data collection.
The System designer must guard against the following types of errors:
DATA INTEGRITY.
(i). Accuracy:
(ii). Timeliness:
(iii). Relevance:
DATA CONTROL.
The quality of Input data is important to the accuracy of output. Control must be instituted as
early as possible in the system & everything possible must be done to ensure that data is
complete and accurate before being input to the computer.
Note. Control must be designed into the system & thoroughly tested. Failure to build in
adequate control may cause expensive systems to fail. In addition, all users must be fully
consulted to ensure that adequate controls are implemented.
The following are controls that can be used to ensure data accuracy:
(1). Verification:
This is the process of checking & ensuring that data has been transcribed/ written out
correctly.
In verification, the same data is given to two or more computer users to enter into the
computer and the results are compared. Alternatively, a second transcription is compared
with the first one. If the results differ, then the data contains an error.
Note. Verification calls for manual intervention, hence errors are still possible. Some
copying/transcription mistakes are difficult to detect during verification, e.g. the
confusion of l (letter l) and 1 (one). In this case, l might be input instead of 1 or
vice versa, and such mistakes go undetected.
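The comparison of two transcriptions can be sketched as follows. The function name and the
sample entries are hypothetical; the idea is only that two independently keyed copies of the
same source document are compared field by field, and every disagreement is reported so that
it can be checked against the source document.

```python
def verify(first_entry, second_entry):
    """Return the positions at which two keyed-in copies of the same data differ."""
    if len(first_entry) != len(second_entry):
        raise ValueError("the two transcriptions have different numbers of fields")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(first_entry, second_entry))
        if a != b
    ]


first = ["A1041", "Student A", "67"]
second = ["A1041", "Student A", "76"]  # digits transposed by the second operator

print(verify(first, second))
```

If both operators make the same mistake, the two copies agree and the comparison reports
nothing, which is why some errors pass through verification.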
(2).Manual controls.
VALIDATION CHECKS.
A Computer cannot notice errors in the data being processed in the way that a Clerk or Machine
operator does.
Data validation is the process of preventing wrong data from being processed. It involves
checking whether the results generated by the computer are valid or applicable. During input or
data preparation, the data must be checked for transcription errors, through a process known as
Verification.
Once the data is brought into the computer memory directly from an input device, and
immediately before processing, the data is again subjected to checks built into the program.
These are described as validation checks, and they check the integrity of the data or its
conformity to the processing requirements.
(b). Test for numbers.
E.g., a numeric field should not contain alphabetic characters.
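A test for numbers might look like the sketch below. It is an illustration rather than a
complete set of validation checks: the function name is made up, and the range limits of 0 to
100 are an assumed extra check for a marks field.

```python
def validate_marks(value, lowest=0, highest=100):
    """Return a list of error messages for one input field; an empty list means valid."""
    text = value.strip()
    if not (text.isascii() and text.isdigit()):
        return [f"{value!r} is not a number"]
    # Assumed additional check: the number must fall inside the allowed range.
    if not lowest <= int(text) <= highest:
        return [f"{value!r} is outside the range {lowest} to {highest}"]
    return []


for item in ["67", "6l", "167"]:
    print(item, validate_marks(item))
```

Here the entry `6l` (with the letter l keyed instead of 1) is rejected by the program,
whereas a wrong but well-formed number such as 76 for 67 would pass this check.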
(1).Input stage: When data is first input to the computer, different checks can be applied to
prevent errors going forward for processing. For this reason, the first computer run is often
referred to as Validation or Data vet.
(2).Updating stage: Further checking is possible during data processing (or when the data input
are being processed).
The program checks the consistency of the input data with existing stored data. This check is
possible during the input run if the stored data is on-line at the time.
Note. Validation is an online process (i.e. validation checks are built into the computer
programs using the input data, so that incorrect data items are detected and reported). Since the
checks are carried out by the computer, they are not prone to human error.
Exercise I.
1. Distinguish between Data verification and Validation as used in the context of data
collection.
(1).MANUAL SYSTEMS.
In Manual systems, the data processing activities are carried out manually by the human
Clerks assisted by some calculating tools such as Slide rule, Logarithms, etc.
In individual business units, the transactions are recorded on the source documents, which are
taken to the data processing department for processing. Human beings work on source
documents mentally or with the aid of some simple manipulation tools.
The files maintained are updated appropriately to reflect the correct image of the business.
The records are stored in form of Ledger cards, in the filing trays or in cabinets. The Ledger
cards contain the sales data (the amount owed by customers) and purchases data (the
amounts owed to suppliers).
The Information (in the form of business documents) is generated, e.g., Statements of
Accounts, and sent to the customers.
Control is carried out/ monitored by the Supervisor guided by the instructions written down
in a Procedure manual.
In Manual systems, the data being used by one individual becomes inaccessible to another
individual.
(2).MECHANICAL SYSTEMS.
Mechanical systems are data processing systems whose activities are carried out by
Keyboard devices operated by human beings. The devices include; Accounting machines,
Cash registers, Calculators, etc.
Data is keyed in by the Machine operator, manipulated by the machine, and the output is
obtained in form of printed documents.
Once the machine is switched on & given the relevant instructions, it works on the data input
automatically.
Note. The instructions, in this case, may be pressing the relevant Keyboard button, e.g.
pressing the button for addition, after a set of values have already been keyed in or as they
are being keyed in.
The control activity is carried out automatically by the machine itself or by a human
machine-operator guided by the instructions laid down in a Procedure manual. Other control
strategies include; Self-experience on the job and Supervision.
(3).ELECTRONIC SYSTEMS.
Electronic Data Processing (E.D.P) systems use electronic machines, such as Computers, to
process data. This is because of the volume of data to be processed, and timing of the
information expected from such processing activities.
Data that is to be input into an E.D.P system should first be prepared in machine-sensible
form. This means that, data cannot be input directly through the terminal or Keyboard,
connected online to the computer system. In such a case, the Key-to-disk data preparation
method could be used. The contents of the disk are input using the reading/writing unit of the
disk. The disk pack is mounted onto its drive and the computer is activated to read or
transfer the contents of the disk into its memory, where the data are held temporarily to await
processing based on the instructions given.
The processing is done automatically by the computer under the influence of a set of
instructions (programs).
The master files are stored in the mass storage media, e.g. Disk. The disk contents are
updated accordingly during the processing run.
The type of output generated by the E.D.P system is influenced by the type of output device
used, e.g. hardcopy outputs are produced through the Printers, while Softcopies are produced
through the Screen displays.
The control of the Electronic systems is automatic under the influence of the Control unit
(CU) of the computer, whose actions are influenced by stored programs.
The following are the factors, which may necessitate the change from Manual to Mechanical or
to Electronic data processing method:
The timing aspect of information availability (i.e. when the information is required) is very
important.
Electronic & Mechanical systems provide automatic processing of the input data. This
quickens the operations on the input data to produce timely information.
For example, a Clerk assisted by mechanical or electronic devices takes a shorter time to
complete the posting of a transaction.
The use of mechanical or electronic data processing tools makes information more accurate
& neat, by removing the use of illegible hand written entries.
In addition, verification is made easy; hence wrong data are easily prevented from entering
the processing stage.
The data processing method selected should be able to cope with the processing tasks, with
respect to the data held. The volume of data (records) held by an organization depends on the
size & the nature of the business.
Small organizations with low volumes of data, require few personnel with little or no data
processing aids.
Large or complex business organizations, with high volumes of data, require the use of
sophisticated processing tools, if the information is to be produced on time.
(iv). Convenient.
Data processing that requires repeated operations may be boring & tedious when carried out
manually. In such a case, mechanical or computer machines may be employed to assist in
the processing depending on the nature of the business.
In a situation where there is a common data pool that supports several applications and a
Manual D.P method is used, different operations may be required to produce different
pieces of information. However, if an Electronic D.P method is used, the information can easily
be produced from the same data. This is because the computer is versatile and can operate in
any desired manner provided the relevant programs are available.
As Data processing systems produce information, the recipients of such information should
receive it promptly to enable them to take decisions that control their business
operations.
Using sophisticated processing aids, such as the Computer in Electronic D.P systems,
improves the quality of information produced, e.g. statistical summaries are produced in
good time, enquiries are answered in good time, and orders are dispatched promptly.
The following are the factors that influence the method of data processing selected:
Simple or small business organizations require relatively fewer personnel and processing
methods that are less complicated.
In a very small company, a single person can be used to produce all the information required, but
as the volume of business increases, more people and tools/aids in the form of Calculators and
small Computers may be employed. Large volumes of data and information will require the use
of large computers.
For example;
In some companies, the Payroll may involve paying a member of staff the same amount each
month, while in others a complex payment system may be involved.
Similarly, producing an Invoice may be a matter of simply copying from the customer’s order, or
it may require complex discount calculations.
Simple calculations indicate the need for fewer people and tools to produce the information,
while complex situations indicate the need for more people and aids.
Timing Aspects of the information produced.
Some applications/ jobs require much shorter time between the origination of the transaction and
the production of information (e.g. Hotel bookings), while other business applications may
require the information to be made available after a relatively longer period, e.g. in Passport
application, where information is required periodically.
Some information requirements are less important than others. E.g., the Payroll and Statement of
Accounts may only be produced once a month, whereas in certain companies, the Invoices may
be produced all the time (i.e. as a customer collects the goods).
In some applications, the same data items may be used in producing more than one piece of
information; hence, the most suitable data processing system should be used depending on the
circumstances surrounding these information requirements.
E.g. a particular item sold may be needed to produce the Invoice & to amend the recorded Stock
position (i.e. to make adjustment of Stock level, and the Bank account or Cash account).
Exercise I.
Exercise II.
COMPUTER FILES
A File is a collection of related records (i.e. several records put together) that give a complete set
of information about a certain item or a particular business entity.
Files are important in any business because they provide up-to-date information relating to the
entity sets of the business, e.g., the suppliers, employees, customers, etc. of the organization.
Entities are things whose facts need to be recorded. Each entity has its attributes (i.e.,
individual properties), e.g., Employee (which is an entity) has attributes such as; Name, salary,
address, etc.
A file can be stored manually in a file cabinet or electronically in a computer’s secondary storage
device such as a Floppy disk or hard disk.
Logical files.
A Logical file is a type of file viewed by the user in terms of what data items it contains & what
processing operations may be performed on the stored data items.
Physical files.
A Physical file is viewed in terms of how the data items found in a file are arranged on the
surface of the storage media (e.g., disk, tape), and how the stored data items can be processed.
DATA HIERARCHY.
In data processing, data is organized from the smallest element to the most comprehensive.
Bits
Characters
Bit:
A Bit is the smallest item that can be stored in a physical file.
The bit can either be a ‘0’ or a ‘1’; these are the two states that define the storage cells of
a computer memory & a storage medium.
Bits combine together to form the Byte (which is the unit of measuring the computer storage). A
Byte is the collection of several bits that represent a Character.
1. Characters.
2. Fields.
3. Records.
Character
A character is the smallest element in a computer file, and can refer to a letter, number, or
symbol that can be entered, stored and output by a computer.
A character is formed by several bits combined together, depending on the character coding
system used, e.g., in a 6-bit character coding system, a character is represented by a combination
of 6 bits.
Characters are normally used to represent data items such as Names, Prices, etc.
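To make the relationship between bits and characters concrete, the sketch below shows the bit
patterns of two characters. It assumes an 8-bit ASCII coding, as produced by Python's
`str.encode("ascii")`, rather than the 6-bit system mentioned above; the function name is
hypothetical.

```python
def to_bits(text):
    """Return the 8-bit pattern of each character in an ASCII string."""
    return [format(byte, "08b") for byte in text.encode("ascii")]


print(to_bits("A1"))
```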
Fields
A Field is made up of a combination of characters, and forms an attribute of a given entity, e.g.,
in a student’s record, the student’s Admission number is a field.
Fixed length fields – these are fields with the same numbers of characters.
Variable length fields - fields within a record that are made up of different numbers of
characters (i.e., fields with different spaces allocated for their characters).
Records
A record is a collection of related fields, which together form or represent a single entity.
In any particular file, there is a separate record for each entity, e.g., in a class score sheet, the
details of each individual student in a row such as name, admission number, total marks, and
position form a record.
Fixed length records – records in a file that are made up of the same number of fields.
Variable length records – records that are made up of different numbers of fields. If the
records have different amounts of space reserved for them, then not all the records in the
file will have the same size.
Note. Variable length records normally utilize storage efficiently. However, processing
or updating them in a computer is difficult because the programmer is dealing with unknown
quantities.
On the other hand, fixed length records do not utilize storage efficiently, but they are easy
to process because the programmer is dealing with known character quantities.
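The trade-off can be sketched with Python's standard `struct` module. The record layout used
here (a 10-character name followed by a 4-byte number) and the sample names are assumptions
made for the example.

```python
import struct

# Fixed length record: 10-byte name field followed by a 4-byte unsigned integer.
RECORD = struct.Struct("<10sI")


def pack_fixed(name, marks):
    # The "10s" format pads short names with zero bytes up to 10 bytes.
    return RECORD.pack(name.encode("ascii"), marks)


def read_fixed(data, n):
    """Read record number n (counting from 0) directly from its known position."""
    name, marks = RECORD.unpack_from(data, n * RECORD.size)
    return name.rstrip(b"\x00").decode("ascii"), marks


students = [("Ann", 67), ("Christine", 81), ("Cy", 54)]
fixed_file = b"".join(pack_fixed(name, marks) for name, marks in students)

# Variable length alternative: one line per record, fields separated by commas.
variable_file = "\n".join(f"{name},{marks}" for name, marks in students).encode("ascii")

print(RECORD.size, len(fixed_file), len(variable_file))
print(read_fixed(fixed_file, 2))
```

The fixed length file spends space on padding, but record n always starts at byte
n × 14, so it can be read without examining the records before it. In the variable length
file the start of a record is not known until the preceding records have been read.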
There are various types of files used to store data needed for processing. Data processing files
are classified according to:
- Their uses within the overall data processing activities.
- The kind of data/ information they store.
Master files.
A Master file is the main file that contains relatively permanent records about particular items or
entries against which transactions are processed.
Master files contain records, which have long-term significance, and are very important for the
running of the organization.
Master files normally contain 2 types of data: Static data and Dynamic data.
Static data is relatively permanent, and contains details which do not change, e.g., Name,
Sex, Date of birth, Date of hiring, etc.
Static data is processed by amending (i.e. making occasional changes to) the existing
records, e.g., inserting new records, deleting outdated records, etc
Dynamic data is temporary and is likely to change frequently, e.g., Salary, Tax rates, hours
worked, Rate of pay, etc.
Dynamic data is processed by updating (i.e. changing the values of the various fields).
The accuracy of data within the operational files is achieved by Updating the Master file (i.e.,
changing the contents in the master files regularly in order to reflect the current state of affairs).
This involves adding, removing or adjusting the data in the Master file.
A Transaction file contains individual data about the transactions (activities) that occurred in a
business during a particular period of time.
The file contains relatively temporary information such as all incoming or outgoing records
resulting from a transaction.
Transaction files are usually created from the source documents, which contain data from the
point of their origin.
The contents in a Transaction file are used to update the dynamic data on Master files. For
example, in a busy supermarket, daily sales are recorded on a transaction file, and later used to
update the stock file. The file is also used by the management to check on the daily or periodic
transactions.
Transaction files have a short life span. This is because, once the contents of the file have been
used to update the master file, its contents are no longer required, and can be replaced by the next
business transactions.
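A minimal sketch of the supermarket example, assuming the stock Master file is held as a
dictionary keyed by item code and each sale is an (item code, quantity sold) pair. The item
codes, quantities and function name are invented for the illustration.

```python
# Master file: relatively permanent records, keyed by item code.
stock_master = {"SUG01": 120, "MLK02": 80, "BRD03": 45}

# Transaction file: the day's sales as (item code, quantity sold).
daily_sales = [("MLK02", 5), ("SUG01", 12), ("MLK02", 3)]


def update_master(master, transactions):
    """Apply each transaction to the dynamic data (stock level) in the master file."""
    rejected = []
    for item_code, quantity in transactions:
        if item_code not in master:
            rejected.append((item_code, quantity))  # no matching master record
            continue
        master[item_code] -= quantity
    return rejected


rejected = update_master(stock_master, daily_sales)
daily_sales.clear()  # contents are no longer required once the master is updated

print(stock_master, rejected)
```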
Examples of Transaction files are files that contain the Earnings & deductions of an Employee,
or payments received from customers.
Reference files.
A Reference file is used for reference or look-up purposes.
Lookup information is that information which is stored in a separate file, but is required during
processing. E.g., the item code entered either manually or using a bar-code reader in a point-of-
sale terminal is used to look-up the item description & price from a reference file stored on a
storage device.
Reference files contain records that are fairly permanent or semi-permanent such as tax
deductions, Wage rates, Customer address, etc, and therefore, they need to be revised
occasionally.
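The point-of-sale look-up described above can be sketched with a dictionary standing in for
the Reference file. The item codes, descriptions and prices are invented, and `look_up` is a
hypothetical name.

```python
# Reference file: item code -> (description, price). Values are invented.
price_list = {
    "5012345": ("Bread 400g", 55),
    "5067890": ("Milk 500ml", 60),
}


def look_up(item_code):
    """Return the description and price for an item code, or None if it is unknown."""
    return price_list.get(item_code)


print(look_up("5067890"))
print(look_up("0000000"))
```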
Backup files.
A Backup file is used to hold duplicate copies (backups) of data or information from the
computer’s fixed storage (hard disk). These files are kept for security purposes.
This is because the operational files held on the hard disk may be corrupted, lost or changed
accidentally, leading to loss or damage of existing information. It is therefore important to keep
copies of the recently updated files so that, in case the original file is corrupted or deleted, the
backup file can be used in its place or to reconstruct the original file.
Note. The backup file & the operational file should be kept at separate places so that in case of
loss or damage, both are not affected.
Sort files.
Sort files are created from existing files, such as Master or Transaction files, and are used mainly
for sorting data (i.e., they are used to alter the sequence of the existing files).
A sort file is mainly used where data is to be processed sequentially. In sequential processing,
data or records are first sorted and held on a magnetic tape before updating the master file.
Report files.
A Report file contains a set of relatively permanent records extracted from the data in a Master
file or generated after processing.
Report files are used to prepare reports, which can be printed at a later date.
Examples include a report on Overtime, a report on Taxes, a report on students’ class performance
in the term, etc.
Scratch file.
A Scratch file is a temporary file used to hold data during processing. It contains temporary data,
which can be erased when the task is finished.
History files are usually old files retained for historical use or for reference purposes, e.g., a
History file can contain Employee details for the last 10 years.
Key field.
A Key field is one or more fields in a record that uniquely identifies the record or a group of
records.
E.g., an Employee’s Serial number may be used to identify the employee records in a Payroll file.
Note. Any field in the record can be used as the key field. However, it should display unique
identification characteristics.
Review questions
FILE ORGANIZATION
File organization refers to the way records are arranged (laid out) within a particular file.
The term file organization can also refer to the relationship of the Key of a record to the physical
location of that record in the computer file.
File organization is very important because it determines the method of access, efficiency,
flexibility, and storage devices to be used.
Methods of file organization.
There are 4 methods by which records of a file can be arranged and accessed. These include:
1. Random.
2. Serial.
3. Sequential.
4. Indexed sequential.
An Algorithm (mathematical procedure) is applied onto the record key to generate the address of
the location where the record would be stored.
Random files are usually accessed directly. To access the file, the record key is used to
determine where a record is stored on the storage media. Once the record is located, it is then
read into the computer memory.
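One simple algorithm of this kind is division-remainder hashing, sketched below. The notes do
not say which algorithm is used, so the choice of method, the number of storage locations and
the handling of two keys that produce the same address (by moving to the next free location)
are all assumptions.

```python
NUM_LOCATIONS = 7  # assumed size of the storage area

storage_area = [None] * NUM_LOCATIONS


def address_of(key):
    """Division-remainder hashing: the remainder is the home address of the record."""
    return key % NUM_LOCATIONS


def store_record(key, data):
    address = address_of(key)
    for step in range(NUM_LOCATIONS):
        slot = (address + step) % NUM_LOCATIONS  # on a collision, try the next location
        if storage_area[slot] is None or storage_area[slot][0] == key:
            storage_area[slot] = (key, data)
            return slot
    raise RuntimeError("storage area is full")


def fetch_record(key):
    address = address_of(key)
    for step in range(NUM_LOCATIONS):
        slot = (address + step) % NUM_LOCATIONS
        if storage_area[slot] is None:
            return None  # an empty location means the key was never stored
        if storage_area[slot][0] == key:
            return storage_area[slot][1]
    return None


store_record(1042, "Employee A")
store_record(1049, "Employee B")  # same home address as 1042
print(address_of(1042), address_of(1049), fetch_record(1049))
```

The record is located from its key alone, without reading the records stored before it.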
1. Data may be accidentally erased or overwritten unless special precautions are taken.
2. Random files are less efficient in the use of storage space compared to sequentially organized
files.
3. Expensive hardware and software resources are required.
4. Relatively complex when programming.
5. System design based on random file organization is complex and costly.
[Figure: records 1, 2, 3 laid out one after the other, separated by inter-record gaps (IRG).]
Serial files can be accessed serially. This involves searching through the entire file record by
record starting from the ‘head’ of the file towards the ‘tail’ of the file.
Note. Serial access is suitable where all the records in the file are to be read. This is because
even the records that are not required must be passed over before the record of interest is located.
E.g., to access the 10th record in the file, the computer reads the first 9 records before
reading the 10th record.
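Serial access amounts to a record-by-record search from the head of the file, as in the
following sketch. The file is represented as a list of (key, data) pairs, and the function
name and sample records are hypothetical.

```python
def serial_search(file_records, wanted_key):
    """Read records from the head of the file until the wanted key is found.

    Returns the record and the number of records read, or (None, count) if absent.
    """
    records_read = 0
    for key, data in file_records:
        records_read += 1
        if key == wanted_key:
            return (key, data), records_read
    return None, records_read


# Records in the order they were written; the keys are in no particular sequence.
serial_file = [(17, "R"), (4, "S"), (29, "T"), (11, "U")]

print(serial_search(serial_file, 29))
```

A record that is not in the file is only known to be missing after the whole file has been read.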
In Sequential file organization, the records are arranged within the file serially one after the
other. However, in sequential file organization, the records are stored in a particular order sorted
using a key field; hence, there is a relationship that exists between adjacent records and the key
fields.
[Figure: records stored in the sequence of their record keys, K1 – K4.]
Sequential files are accessed sequentially, i.e. the key field is used to search for the particular
record required. Searching starts at the beginning of the file and proceeds sequentially towards
the ‘tail’ of the file, until the required record is located.
3. Loading or reading a record requires only the Record Key.
4. It is efficient & economical if the number of file records to be processed is high.
5. Relatively inexpensive Input/Output media and devices may be used.
6. Errors in the files remain localized.
1. The entire file must be processed even when the no. of file records to be processed is low.
2. Transactions must be sorted in the sequence of the Master file before they can be processed
or updated.
3. Data redundancy/idleness is high since the same data may be stored in several files
sequenced in different keys.
4. Random enquiries are almost impossible to handle.
The records are arranged sequentially as in sequential files. However, indexed sequential files
have an Index that enables the computer to locate individual records on the storage media.
An Index is the address of a particular cylinder or track. The indexes are used to point at the
portions where the records are stored in groups. This allows a group of records that are not
required in a particular processing run to be bypassed.
To access a record in an indexed sequential file, the Index and the record’s key field are used by
the computer to search for the required record before it is read into the computer memory.
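The use of an index to bypass groups of records can be sketched as follows. This is a
simplified model, not a description of real disk hardware: each group of records (standing in
for a track or cylinder) is a list, and the index is assumed to hold the highest key in each
group.

```python
from bisect import bisect_left

# Records held in key sequence, in groups (blocks). Keys and data are invented.
blocks = [
    [(101, "A"), (105, "B"), (110, "C")],
    [(120, "D"), (134, "E"), (140, "F")],
    [(151, "G"), (160, "H"), (172, "I")],
]

# The index holds the highest key in each block.
index = [block[-1][0] for block in blocks]


def find(key):
    """Use the index to pick one block, then search only that block sequentially."""
    block_no = bisect_left(index, key)
    if block_no == len(blocks):
        return None  # key is higher than every key in the file
    for record_key, data in blocks[block_no]:
        if record_key == key:
            return data
    return None


print(find(134), find(135))
```

To find key 134, only the second group is read; the first and third groups are bypassed.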
Sequential access:
In sequential access, the computer reads the records in sequential order (i.e., one record after the
other) using the index until the record matching the search key is found. The record is then read
into the Main memory.
Selective sequential access:
In selective sequential access, the transaction file must first be sorted into the same key
sequence as the master file. The access mechanism then moves forward in an ordered progression
(sequence), and only those records needed are read/processed.
The records in a Random file are not stored in any particular sequence of the key field. This
means that the records can be processed in any sequence, i.e., by moving the access mechanism
forwards and backwards along the file in a non-orderly manner to access the records required.
On a Magnetic tape, the file records are placed one after the other along the tape.
1). Serial:
In serial organization, the records are written onto the tape without having any relationship
between the record keys.
√ Serial files on a tape are accessed serially, i.e., each record is read from the tape into main
storage one after the other in the order they occur on the tape.
2). Sequential.
In Sequential organization, the records are written onto tape in sequence according to the
record keys. Sequential files are accessed sequentially.
Explanation;
To process a sequential Master file on a tape, the transaction file must be in the sequence of
the Master file. The transaction file is read first, followed by the Master file until the
matching file record is found. E.g., if the record required is the 20th record of the file, the
computer must first read all the 19 preceding records.
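The explanation above can be sketched as a single forward pass over two sorted files. This is a minimal sketch under stated assumptions: each master record is a (key, balance) pair, each transaction is a (key, amount) pair, and the name `update_master` is hypothetical.

```python
def update_master(master, transactions):
    """Apply sorted transactions to a sorted master file in one pass.

    Returns (updated_master, unmatched_transactions).
    """
    updated = []
    unmatched = []
    t = 0
    for key, balance in master:
        # Transactions with a smaller key have no matching master record.
        while t < len(transactions) and transactions[t][0] < key:
            unmatched.append(transactions[t])
            t += 1
        # Apply every transaction that matches this master record.
        while t < len(transactions) and transactions[t][0] == key:
            balance += transactions[t][1]
            t += 1
        updated.append((key, balance))
    # Anything left over has a key beyond the end of the master file.
    unmatched.extend(transactions[t:])
    return updated, unmatched
```

Because neither file is ever re-read, the method only works if the transaction file has been sorted into the key sequence of the master file beforehand.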
1). Serial:
The records are placed onto the disk one after the other with no regard for sequence.
√ Serial files on a disk are accessed Serially, i.e. each record is read from the disk into main
storage one after the other in the order they occur on the disk.
2). Sequential:
In sequential organization, the records are written onto the disk but in a defined sequence
according to the record keys.
3). Random:
In random organization, the records are placed onto the disk “randomly”, (i.e. there is no
obvious relationship between the records).
A mathematical formula is applied to the record key to generate the address of the location
where the record is placed on the disk. During processing, the same formula is applied to the
record key to regenerate the address, which shows the location from which the record is read.
4). Indexed sequential:
In Indexed Sequential organization, the records are stored in sequence, but an Index (key
field/guide) is provided to enable individual records to be located. In this case, the index will
always enable the sequence of the records to be determined.
Indexed sequential files can be accessed using the sequential access, selective sequential access,
or random access methods.
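The address generation used in random organization (item 3 above) can be illustrated with a short sketch. The notes do not say which formula is used; the division-remainder formula below is an assumed, commonly taught choice, and `NUM_LOCATIONS`, `address_of`, `store` and `fetch` are hypothetical names.

```python
NUM_LOCATIONS = 7  # assumed number of addressable locations on the disk


def address_of(record_key):
    """Assumed formula: remainder after dividing the key by the number of locations."""
    return record_key % NUM_LOCATIONS


# Each location can hold several records, since two keys may give the same address.
disk = [[] for _ in range(NUM_LOCATIONS)]


def store(record_key, data):
    """Place the record at the address generated from its key."""
    disk[address_of(record_key)].append((record_key, data))


def fetch(record_key):
    """Regenerate the address from the key and read the record from there."""
    for key, data in disk[address_of(record_key)]:
        if key == record_key:
            return data
    return None
```

The same formula is used for writing and for reading, which is why a record can be retrieved directly without reading the records stored before it.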
The file designer should determine how often the file is going to require updating.
For periodic updates (e.g., monthly update), the transactions are used to update the master
files in one run. For non-periodic systems, the transactions may be applied at any time as
required.
The file design selected should therefore be able to support the chosen update strategy within
the required time.
The type of file organization adopted should be based on the expected number of records to
be processed/accessed in a particular run.
This refers to the method the computer shall use to transfer the contents of the file from the
storage media into the computer.
Before designing the file(s) to be maintained by a computer system, you have to consider
whether the system runs periodically or is an event-driven system.
In periodically run systems, all transactions relating to particular business are accumulated
over a period of time, after which they are applied to the relevant master files in a single run.
Such systems produce periodic reports from the maintained files.
On the other hand, event-driven systems allow file enquiries and instant updates of the
maintained master files as soon as transactions are available, so that information can be
produced instantly.
Computer files are stored in the storage media. The type of file organization adopted
depends on the medium that will be used to store the computer file.
E.g., Serial access devices, such as Magnetic Tapes, cannot be used to store Random files or
Indexed-sequential files. This is because searching for a particular record on such devices
proceeds serially, regardless of the file organization method used.
Review Questions.
Data processing modes describe the ways in which a computer, under the influence of an
operating system, is designed to handle data or transactions during processing.
1. Batch processing
2. Online processing
3. Real-time processing
4. Time-sharing
5. Multi-programming
6. Multi-processing
7. Distributed processing
8. Interactive processing
Review questions
Batch processing
In batch processing, data or transactions are collected & accumulated together over a specified
period of time, e.g., daily, weekly, or monthly. The data is then input & processed at once (or as
a single unit) to produce a batch of output.
For example:
In a payroll processing system, details of employees, such as the number of hours worked and the
rate of pay, may be collected for a period of 1 month, after which they are used to process the
payment for the duration worked.
Data collection is usually done off-line (i.e. away from the CPU) on special machines known as
Data entry terminals. The data is entered & stored on a disk in a batch queue for a while. It is
then input & processed one or more at a time under the control of the Batch operating system,
and the result obtained.
Batches of transactions are scheduled for processing by assigning them priorities. The priorities
are assigned in terms of percentage ratio, e.g. 95%, 60%, etc. The highest-priority jobs are
processed first, while the lower-priority jobs are processed once the computer resources (i.e., CPU
time, Memory & I/O devices) are released by the higher-priority jobs.
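The priority-based scheduling of batches described above can be sketched in Python. This is an illustrative sketch only: the batch names, the priority figures and the function name `schedule` are assumptions made for the example.

```python
def schedule(batches):
    """Return the batch names in processing order, highest priority first.

    batches is a list of (name, priority_percent) pairs. Batches with equal
    priority keep the order in which they were queued.
    """
    ordered = sorted(batches, key=lambda batch: batch[1], reverse=True)
    return [name for name, _priority in ordered]


# Hypothetical batch queue built up during the collection period.
queue = [("stock update", 60), ("payroll", 95), ("invoices", 80)]
```

Here `schedule(queue)` places the payroll batch first, then invoices, then the stock update, regardless of the order in which the batches were entered.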
Once the processing of a given batch starts, there is no interaction between the operator & the
CPU. Therefore, the user cannot intervene to perform amendments to the program.
A job is not processed until it is fully input. In addition, a program must wait its turn before
processing the data. This means that, there will be a delay in obtaining results. For instance, a
job may wait in the batch queue for minutes or hours depending on the workload. Hence, Batch
processing cannot be used when the results are needed immediately.
√ The input device does not necessarily need to be connected to the computer.
If the device used for data entry is not connected to the computer, it is said to be Off-Line
(away from the computer).
√ The data is not immediately input into the computer, and it is not even immediately recorded
in a machine-readable form.
√ The speed of processing is not important. This implies that, processing of the data is done at
whatever time is most convenient.
1. Payroll systems.
The attendance data of each employee is collected regularly. It is then input weekly or
monthly as per the demands of the system and processed, and the pay figures for each
employee are obtained.
Review questions
Online processing
In online processing, data or the input transactions are processed immediately they are received
to produce the information required.
Online processing occurs when the transactions are processed to update (or make any change in)
a computer file immediately after the transactions occur.
In online processing, all the Input/Output facilities and communication equipment are under
the direct influence of the central Processor.
In online processing, the operator communicates directly to the computer’s operating system
using commands, which are then interpreted by the supervisor. This means that, the operator can
interact with the system at any point of processing using the Input/Output facilities.
Note. In online processing, the data input units (terminals) are connected directly to the central
computer using communication links.
In such a configuration, the data (input transactions) are communicated from the workstations to
the central computer for processing, & the results communicated back to the workstations
through the telecommunication links.
√ The input device is connected directly to the computer.
√ The input data is processed immediately. Processing is completed within a short time (usually
1 or 2 minutes), depending on the speed of the system.
1. Banking:
A bank customer can make an inquiry using an online terminal. The system would then
respond immediately by accessing the relevant file, and inform the customer on the status of
his/her account.
2. Stock exchanges:
Terminals located in major stock exchanges throughout the country enable quick processing
of share dealings.
3. Stock control:
1. Files are held online; therefore the information generated can be used to update the master
files directly.
2. The Information is readily available for immediate decision-making.
3. File enquiries are possible at any given time through the terminals (workstations).
Review questions
Real-time systems.
A Real-time system is capable of processing data so quickly that the results (output)
produced are able to influence, control, or affect the outcome of the activity or process currently
taking place.
In a Real-time data processing system, the computer receives & processes the incoming data as
soon as it occurs, updates the transaction file, and gives an immediate response that would affect
the events as they happen.
The main purpose of real-time processing is to provide accurate, up-to-date information, and
hence better services based on the real situation.
1). There must be a direct connection between Input/Output devices & the central Processor.
2). The Response time should be fairly fast, to allow a 2-way communication (interaction)
between the user & central processor.
Quick response.
A much shorter time cycle before the information is available to affect the functioning of its
environment.
Examples:
An individual cannot be booked before enquiring whether the airline seat is available. The
customer may request airline booking information through a remote terminal, and the
requested information will be given out immediately by the reservation system. If a booking
is made, the system immediately updates the reservations file to avoid double-booking, and
sends the response back to the customer.
This implies that, before the next transaction can be processed, the files must have been
updated by the previous transactions.
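The reservation example above can be sketched as follows. This is a simplified, single-terminal illustration: the class name `ReservationFile`, its methods and the seat labels are assumptions, and a real reservation system would also need to handle many terminals updating the file at once.

```python
class ReservationFile:
    """Hypothetical reservations file that is updated as each booking is made."""

    def __init__(self, seats):
        self.available = set(seats)
        self.booked = {}

    def enquire(self, seat):
        """Answer an enquiry from the current state of the file."""
        return seat in self.available

    def book(self, seat, passenger):
        """Book the seat and update the file at once; refuse a double-booking."""
        if seat not in self.available:
            return False
        self.available.remove(seat)
        self.booked[seat] = passenger
        return True
```

Because the file is updated before the next transaction is processed, a second request for the same seat is refused rather than accepted in error.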
1. Real-time systems are very expensive & require complex Operating systems.
2. The systems are not easy to develop.
3. They require extensive communication equipment, e.g., they require a Front End Processor
(FEP), which is used to relieve the central computer by handling some of the limited
processing activities, and also to link the terminals to the central computer.
4. Real-time systems use 2 or more computer processors to share the workload, which is
expensive.
Review questions
Time-sharing systems
In time-sharing processing, the central processor allows 2 or more users, who have different
processing requirements, to use one computer at the same time.
The terminal users are usually connected to the central computer using communication links.
The CPU time is divided out equally among the users, and each user is allowed a “Time slice” –
a brief period when he/she is allowed to access the CPU.
The amount of time allocated to each user & the switching from one job to another is controlled
by a multi-user operating system. The OS normally assigns priorities to the various jobs entering
the system.
Illustration;
The OS may give each terminal user 5 seconds to submit a job. The user sits at the terminal &
issues commands to the OS.
After every 5 seconds, the central computer checks all the terminals to see if there is any user
who needs assistance. If a particular terminal does not need service, the computer goes onto the
next terminal. But if a new command has been issued, the computer will allocate a time-slice to
the user. During this time, the computer devotes its full attention to this user. When the time-
slice is over or the user’s requests have been satisfied, the computer goes on to the next terminal.
The user must now wait until he/she is allocated his/her next time-slice.
Note. The switching of control from one user to another during assigning of the time slices
happens so fast that an individual user may think that he/she is the only one using the system.
E.g., for 50 users each allocated 10 milliseconds; it takes only 500 milliseconds (½ a second) to
service them all.
Question. What happens to a user’s job if his/her time-slice is up and the job is not completed?
The job is interrupted and rolled out to space allocated on the disk, together with all its
relevant status information. When the time comes to resume the job (or during the
next allocated time-slice), the job is rolled in from the disk, and processing continues at the
point at which the interruption occurred.
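The allocation of time-slices and the resumption of unfinished jobs can be sketched as a simple round-robin loop. This is a simplified model under stated assumptions: every user gets an equal slice, priorities are ignored, and the names `time_share` and `cycle_time_ms` are hypothetical.

```python
from collections import deque


def time_share(jobs, time_slice):
    """Serve the users in turn, one time-slice each, until all jobs finish.

    jobs maps each user to the CPU time his/her job needs.
    Returns the list of (user, time_used) slices in the order served.
    """
    queue = deque(jobs.items())
    served = []
    while queue:
        user, remaining = queue.popleft()
        used = min(time_slice, remaining)
        served.append((user, used))
        remaining -= used
        if remaining > 0:
            # Job not finished: it waits and resumes on the user's next turn.
            queue.append((user, remaining))
    return served


def cycle_time_ms(users, slice_ms):
    """Time taken to give every user one time-slice."""
    return users * slice_ms
```

For the figures given in the Note above, `cycle_time_ms(50, 10)` gives 500 milliseconds, i.e. half a second to service all 50 users once.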
√ Each user has one or more Input/Output devices connected to the central computer by
communication lines.
√ Each user is independent of the others who are connected to the system.
√ Each user has his/her own private set of programs plus access to a set of public programs.
√ The central computer accepts the data & instructions arriving simultaneously from many
users, and gives each user a small but frequently repeated segment of computer time.
√ Data files, program files, and Input/Output devices are all directly connected to the computer,
so that processing can be performed at random as requests for transactions are made.
1. In Bureaus that serve individuals or small companies who cannot afford the computer
facilities.
2. In learning institutions where there are many users.
Multi-programming systems
Multi-programming (also referred to as Multi-tasking) refers to a type of processing where more
than one program residing in the computer memory is executed concurrently by a single
Processor.
A multi-programming system allows the user to run 2 or more programs, all of which are in the
computer’s Main memory, at the same time.
The jobs are scheduled to run automatically by the Processor under the influence of a
Multi-programming (or Multi-tasking) operating system.
The schedule is such that; the Processor bound jobs (i.e., jobs that require much of the C.P.U
time as compared to the peripheral time) are assigned low priorities for them not to tie up the
C.P.U time. The Peripheral or Print bound jobs (i.e., jobs that require much of the peripheral
time as compared to the C.P.U time) are allocated the C.P.U time whenever it is available.
The OS allocates each program a time-slice, and decides the order in which they will be
executed. In this case, the programs take turns at short intervals of processing time. The
programs to be run are loaded into the memory and the CPU begins execution of the first one.
When the request is satisfied, the second program is brought into memory and its execution
starts, and so on.
Note. A Multi-programming system is able to work on several programs at the same time. It
works on the programs one after the other, and at any given time it executes instructions from
one program only. However, the computer works so quickly that it appears to be executing the
programs at the same time.
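The idea in the Note above, that only one program's instruction is executed at any given time while the programs take turns, can be illustrated with Python generators. This is a toy model only: the names `program` and `run_concurrently` are assumptions, and each "instruction" is just a line of text.

```python
def program(name, steps):
    """A hypothetical program that yields one instruction at a time."""
    for i in range(1, steps + 1):
        yield f"{name}: instruction {i}"


def run_concurrently(programs):
    """Let the programs in memory take turns, one instruction per turn."""
    trace = []
    ready = list(programs)
    while ready:
        for prog in list(ready):
            try:
                trace.append(next(prog))
            except StopIteration:
                ready.remove(prog)  # this program has finished
    return trace
```

The resulting trace interleaves the instructions of the programs, yet each entry comes from exactly one program, which matches the description of a single Processor switching quickly between jobs.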
Advantages of multi-programming.
Disadvantages of multi-programming.
Review questions
(d). Discuss the hardware and software facilities necessary to facilitate Multi-programming.
Distributed processing.
Distributed data processing refers to dividing of processing tasks among 2 or more computers
that are located on physically separate sites, but connected by data transmission media.
For example;
An organization may have various computers that are located at various departments or business
sites, but linked together by communication lines. In such a case, each individual department or
business site is being served individually by its own computer resources.
The computers at different departments are usually of limited processing power (e.g.,
microcomputers), and only serve as terminals from the various departments. They are then
connected to a central computer of enhanced processing ability such as a Mini or a Mainframe
computer.
√ Each department or business site is served individually by the computer resources employed.
√ The Information generated in each department is used to influence the decisions of individual
departments appropriately.
The following are computer arrangements that can be used for Distributed processing systems:
This is whereby the network is within the same locality, and does not require the use of
telecommunication links, e.g., same building.
These are networks that involve computers separated by long distance; hence, they
communicate through telecommunication links.
Note. Networks within the same city may be linked through Telephone lines or special Coaxial
cables, while far distant places may be linked through Satellite transmission channels or
ground Microwave systems.
1. In Banks:
All the branches have Intelligent terminals (usually microcomputers) linked to a big
computer at the Head Office. The customers’ accounts are operated on the servers in the
branches, while data from the branches is sent to the main server where it is processed.
Review questions
1. Most companies are now shifting from the use of centralized mainframe computers to the use
of geographically distributed personal computers. This method of data processing is known
as Distributed data processing (DDP).
Required:
(i). Name any three computing resources that can be distributed.
(ii). Name four examples of industries and business organizations that extensively use
distributed processing systems.
(iii). List and explain three ways of networking microcomputers/personal computers to form
a distributed data processing system.
(iv). Name three risks that might be associated with the distributed data processing system.
Interactive processing
Interactive processing occurs if the computer & the terminal user can communicate with each
other. It allows a 2-way communication between the user & the computer.
As the program executes, it keeps on prompting the user to provide input or respond to prompts
displayed on the screen. In other words, the user makes the requests and the computer gives the
responses.
In Interactive processing, the data is processed individually and continuously as transactions take
place and output is generated instantly.
Multi-processing systems
Multiprocessing refers to the processing of more than one task at the same time on different
processors of the same computer.
This means that, at any given time, the processors could execute instructions from two or more
different programs, or different parts of one program simultaneously. In such systems, each CPU
is dedicated to one type of application, e.g., one CPU may handle all terminal users, while
another may process only the batch jobs.
The activities of the system are coordinated by the Multi-processing operating system.
Advantage: - if one CPU fails, the other(s) can take over the workload until repairs are made.
Review questions
This refers to batch processing where jobs are entered at a terminal remote from the computer
and transmitted into the computer, e.g., by means of telecommunication links.
Conversational mode.
This is interactive computer operation where the response to the user’s message is immediate.
Review questions
1. Mention 5 factors to be considered when selecting the data processing mode suitable for use
in an organization.