3rd Term DP Notes For Ss2
SS2 DP, WK 1
Repairing means rectifying a problem in hardware or software: either the part has malfunctioned, or it has become worn to the point where it needs to be replaced in order to maintain the performance of the computer system. While finding or analyzing faults, it can be decided which hardware or software can be repaired.
Repairing may also include replacement of a component. It is an essential part of troubleshooting.
Repairing components may add cost and delay operations. Some failures occur because of the repairs themselves; these are called repair-generated failures.
Repairs are termed corrective maintenance. Corrective maintenance is done when a fault occurs.
Preventive maintenance should be favoured over corrective maintenance; it may add to the cost but saves operation time.
Preventive maintenance is often neglected in favour of a repair-only maintenance policy, yet it enforces maintenance through regular servicing.
Repair-generated failures
These failures depend on the performance of the technician. During the repair process, the technician may leave loose connections, wrong connections, or broken pins and wires. These can be avoided if the technician rechecks and revises the work done.
MAINTENANCE
Maintenance is a process which starts with installation of the system and runs throughout the life of it.
It includes both
• Hardware maintenance and
• Software maintenance.
Hardware Maintenance
Computer hardware maintenance involves taking care of the computer's physical components, such as
its keyboard, hard drive and internal CD or DVD drives. Cleaning the computer, keeping its fans free
from dust, and defragmenting its hard drives regularly are all parts of a computer hardware
maintenance program. It includes proper cleaning, servicing, repairing or replacing components of the
computer.
Maintaining hardware helps to extend the computer's lifespan. It helps to prevent wear and tear, and
keeps the system functioning smoothly.
The following are the two types of maintenance methods used to keep the hardware intact:
• Preventive maintenance.
• Corrective maintenance.
Preventive maintenance means maintenance through preventions. Careful handling of the computer
enhances the life of the system and is called preventive maintenance. Preventive maintenance can be
done by taking some general precautions and some special precautions.
Corrective Maintenance
It refers to the maintenance procedures that are adopted when any error occurs in the system. It is
contrary to preventive maintenance and starts when a failure or crash occurs in the system. It includes
repair and troubleshooting techniques.
Corrective maintenance steps
• In case of failure, general troubleshooting should be performed first.
• If the problem remains, locate the fault using various tools or diagnostic software.
• Once the fault is determined, troubleshoot or replace the component, as required.
• Corrective maintenance also includes periodic enhancements.
Various tools can be used during corrective maintenance: data recovery tools from the operating system, third-party data recovery tools, virus vaccines, etc.
Though preventive maintenance is preferable, there are times when corrective maintenance must be used due to unforeseen factors leading to sudden failures.
Although computer cleaning products are available, you can also use household items to clean your
computer and its peripherals. Below is a listing of items you may need or want to use while cleaning
your computer.
o Cloth - A cotton cloth is the best tool used when rubbing down computer components. Paper
towels can be used with most hardware, but we always recommend using a cloth whenever
possible. However, only use a cloth when cleaning components such as the case, a drive, mouse,
and keyboard. You should not use a cloth to clean any circuitry such as the RAM or motherboard.
o Water or rubbing alcohol - When moistening a cloth, it is best to use water or rubbing alcohol.
Other solvents may be bad for the plastics used with your computer.
o Portable Vacuum - Sucking the dust, dirt, hair, cigarette particles, and other particles out of a
computer can be one of the best methods of cleaning a computer. However, do not use a
vacuum that plugs into the wall since it creates lots of static electricity that can damage your
computer.
o Compressed air - Using compressed air on electronics helps protect the components in your devices by preventing them from overheating and shorting out, and it extends the life of your computer. Compressed air is one of the easiest and fastest ways of cleaning.
o Cotton swabs - Cotton swabs moistened with rubbing alcohol or water are excellent tools for wiping hard-to-reach areas in your keyboard, mouse, and other locations.
o Foam swabs - Whenever possible, it is better to use lint-free swabs such as foam swabs.
• When cleaning, be careful to not accidentally adjust any knobs or controls. Also, when cleaning
the back of the computer, if anything is connected make sure not to disconnect the plugs.
• When cleaning fans, especially smaller fans, hold the fan or place something in-between the fan
blades to prevent it from spinning. Spraying compressed air into a fan or cleaning a fan with a
vacuum may cause damage or generate a back voltage.
• Limit smoking around the computer.
Keyboard cleaning
Dust, dirt, and bacteria
The computer keyboard is usually one of the most germ-infested items in your home or office. A keyboard may even contain more bacteria than your toilet seat. Cleaning it helps remove any dangerous bacteria and keeps the keyboard working properly.
Procedure: Before cleaning the keyboard, first turn off the computer or if you are using a USB keyboard
unplug it from the computer. Not unplugging the keyboard can cause other computer problems as you
may press keys that cause the computer to perform a task you do not want it to perform.
Many people clean the keyboard by turning it upside down and shaking. A more efficient method is to
use compressed air. The crumbs, dust, and other particulate that fall between the keys and build up
underneath are loosened by spraying pressurized air into the keyboard, then removed with a low-
pressure vacuum cleaner.
After the dust, dirt, and hair have been removed, spray a disinfectant onto a cloth, or use disinfectant cloths, and rub each of the keys on the keyboard. As mentioned in our general cleaning tips, never spray any liquid onto the keyboard.
A plastic-cleaning agent applied to the surface of the keys with a cloth is used to remove the accumulation of oil and dirt from repeated contact with a user's fingertips. If this is not sufficient for a more severely dirty keyboard, keys can be physically removed for more focused individual cleaning, or for better access to the area beneath.
Use a cleaning kit or a damp, clean cotton cloth to clean CDs, DVDs, and other discs. The cleaning kit consists of a single disc that is designed to spin in the user's drive and remove dust from the lens.
Place the CD/DVD laser-lens cleaning disc inside the DVD drive's tray and close the tray.
As it spins, it will clear most, if not all, of the dust on the lens.
As an extra precaution, use a can of air spray to gently spray into the open tray to remove any residual dust.
Try to read or write a DVD once again to make sure everything is working well now.
You can also clean the face of a disc. When cleaning a disc, wipe against the tracks, starting from the
middle of the CD or DVD and wiping towards the outer side. Never wipe with the tracks; doing so may
put more scratches on the disc.
Tip: If the substance on a CD cannot be removed using water, pure alcohol can also be used.
Having the battery fully charged while the laptop is plugged in is not harmful, because as soon as the charge level reaches 100% the battery stops receiving charging energy, and this energy is routed directly to the power supply system of the laptop.
However, there is a disadvantage in keeping the battery in its socket while the laptop is plugged in, but only if the laptop is currently suffering from excessive heating caused by its hardware.
So:
- In normal usage, if the laptop does not get too hot (CPU and hard disk around 40ºC to 50ºC), the battery should remain in the laptop socket;
- In intensive usage which produces a large amount of heat (e.g. games, temperatures above 60ºC), the battery should be removed from the socket in order to prevent unwanted heating.
Heat, together with being held at 100% charge, is the great enemy of the lithium battery; it is not the plug, as many might think.
Charging tips:
• For regular usage or when the laptop doesn’t go above 40ºC to 50ºC, keep the battery attached to its
socket.
• When the laptop is new or when a replacement battery is initially installed, be sure to fully charge it
before usage.
• Do not keep the battery and the A/C adapter plugged in too frequently during intensive use. This causes a chemical reaction that reduces the battery's capacity to hold charge. Worse, eventually it won't be able to hold any charge without the AC plugged in.
• The battery should be at a low charge level before recharging. This significantly increases the likelihood of a longer serviceable life.
• Do not leave it plugged in all the time.
Even if you treat your laptop's battery properly, its capacity will decrease over time. Its built-in power meter estimates how much juice is available and how much time on battery you have left, but it can sometimes give you incorrect estimates.
This basic technique will work in Windows 10, 8, 7, Vista. Really, it will work for any device with a
battery, including older MacBooks. It may not be necessary on some newer devices, however.
If you’re taking proper care of your laptop’s battery, you should be allowing it to discharge somewhat
before plugging it back in and topping it off. You shouldn’t be allowing your laptop’s battery to die
completely each time you use it, or even get extremely low. Performing regular top-ups will extend your
battery’s life.
However, this sort of behavior can confuse the laptop’s battery meter. No matter how well you take
care of the battery, its capacity will still decrease as a result of unavoidable factors like typical usage,
age, and heat. If the battery isn’t allowed to run from 100% down to 0% occasionally, the battery’s
power meter won’t know how much juice is actually in the battery. That means your laptop may think
it’s at 30% capacity when it’s really at 1%—and then it shuts down unexpectedly.
Calibrating the battery won’t give you longer battery life, but it will give you more accurate estimates of
how much battery power your device has left.
Manufacturers that do recommend calibration often suggest calibrating the battery every two to three months. This helps keep your battery readings accurate.
In reality, you likely don’t have to do this that often if you’re not too worried about your laptop’s battery
readings being completely precise.
To calibrate the battery:
- Leave the computer discharging, non-stop, until it hibernates itself. You may use the computer normally during this period.
- When the computer shuts down completely, let it stay in the hibernation state for 5 hours or even more.
- Plug the computer into A/C power and perform a full, non-stop charge up to maximum capacity (100%). You may use the computer normally during this period.
After the calibration process, the reported wear level is usually higher than before. This is natural, since it now reports the true capacity the battery can currently hold. Lithium-ion batteries have a limited number of discharge cycles (generally 200 to 300) and they retain less capacity over time.
Many people tend to think, "If calibrating gives a higher wear level, then it's a bad thing." This is wrong: as noted, calibration is meant to make the battery report the true capacity it can hold, and to avoid surprises such as being in the middle of a presentation when the computer suddenly shuts down at 30% of charge.
Prolonged storage
To store a battery for long periods of time, its charge should be around 40% and it should be stored in a place as cool and dry as possible. A fridge can be used (0ºC - 10ºC), but only if the battery stays isolated from any humidity.
It must be said again that the battery's worst enemy is heat, so leaving the laptop in a car on a hot summer day is halfway to killing the battery.
Software Maintenance
Software maintenance includes updates, enhancements, changes, repairs and replacements.
Altered environment or changed conditions may result in software maintenance.
It is of the following types:
• Corrective
• Adaptive
• Perfective
• Preventive
Corrective maintenance is concerned with fixing errors that are observed when the software is in
use.
Adaptive maintenance is concerned with the change in the software that takes place to make the
software adaptable to new environment such as to run the software on a new operating system.
Perfective maintenance is concerned with the change in the software that occurs while adding
new functionalities in the software.
Preventive maintenance involves implementing changes to prevent the occurrence of errors.
INDEXES (week 3)
Indexing in Databases
Indexing is a way to optimize performance of a database by minimizing the number of disk accesses
required when a query is processed.
An index or database index is a data structure which is used to quickly locate and access the data in a database table (to speed up queries). Indexes are similar to textbook indexes. In textbooks, if you need to go
to a particular chapter, you go to the index, find the page number of the chapter and go directly to that
page. Without indexes, the process of finding your desired chapter would have been very slow.
The same applies to indexes in databases. Without indexes, a DBMS has to go through all the records in
the table in order to retrieve the desired results. This process is called table-scanning and is extremely
slow. On the other hand, if you create indexes, the database goes to that index first and then retrieves
the corresponding table records directly.
Indexes are created using some database columns.
The first column is the Search key that contains a copy of the primary key or candidate key of the
table. These values are stored in sorted order so that the corresponding data can be accessed
quickly (Note that the data may or may not be stored in sorted order).
The second column is the Data Reference which contains a set of pointers holding the address of
the disk block where that particular key value can be found.
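The two-column structure described above can be sketched in a few lines of Python. This is a minimal illustration only: the sample table and the use of list positions as stand-ins for "disk block addresses" are assumptions made for the example, not features of any real DBMS.

```python
import bisect

# Hypothetical table of records; a record's position in the list
# plays the role of its disk block address.
table = [
    (103, "Ada"), (101, "Ben"), (104, "Cara"), (102, "Dan"),
]

# Build the index: (search key, data reference) pairs kept in sorted
# order, mirroring the two columns described above.
index = sorted((key, pos) for pos, (key, _) in enumerate(table))

def lookup(search_key):
    """Binary-search the sorted index instead of scanning the table."""
    keys = [k for k, _ in index]
    i = bisect.bisect_left(keys, search_key)
    if i < len(index) and index[i][0] == search_key:
        return table[index[i][1]]  # follow the pointer to the record
    return None

print(lookup(104))  # (104, 'Cara')
```

Because the index is sorted, the lookup touches O(log n) index entries rather than scanning every record, which is exactly what makes indexed access faster than table-scanning.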
Overview of Indexes
As we noted earlier, an index on a file is an auxiliary structure designed to speed up operations that are
not efficiently supported by the basic organization of records in that file.
An index can be viewed as a collection of data entries, with an efficient way to locate all data entries
with search key value k. Each such data entry, which we denote as k*, contains enough information to
enable us to retrieve (one or more) data records with search key value k. (Note that a data entry is, in
general, different from a data record!) The following figure shows an index with search key sal that
contains (sal, rid) pairs as data entries. The rid component of a data entry in this index is a pointer to a
record with search key value sal.
A hash function used for this purpose can be simple; for example, it may convert the search key value to its binary representation and use the two least significant bits as the bucket identifier.
Another way to organize data entries is to build a data structure that directs a search for data entries.
Several index data structures are known that allow us to efficiently find data entries with a given search
key value.
Based on this, there are two ways indexing can be done:
1. Ordered indices: Indices are based on a sorted ordering of the values. The indices are usually
sorted so that the searching is faster. The indices which are sorted are known as ordered indices.
2. Hash indices: Hashing is the transformation of a string of characters into a usually shorter fixed-
length value or key that represents the original string. Hashing is used to index and retrieve items
in a database because it is faster to find the item using the shorter hashed key than to find it using
the original value.
Hash indices are based on the shorter fixed-length values being distributed uniformly across a range of buckets. The bucket to which a value is assigned is determined by a function called a hash function (refer to the note on file organization).
There is no strict comparison between the two techniques; the better choice depends on the database application to which the index is being applied.
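The bucket idea behind a hash index can be sketched as follows. The two-least-significant-bits hash function and the sample keys are illustrative assumptions for the sketch:

```python
NUM_BUCKETS = 4

def hash_fn(key):
    # Use the two least significant bits of the key as the bucket id.
    return key & 0b11

buckets = [[] for _ in range(NUM_BUCKETS)]

def insert(key, rid):
    buckets[hash_fn(key)].append((key, rid))

def find(key):
    # Only one bucket is searched, which is faster than scanning
    # every index entry.
    return [rid for k, rid in buckets[hash_fn(key)] if k == key]

for key, rid in [(10, "r1"), (14, "r2"), (7, "r3")]:
    insert(key, rid)

print(find(14))  # 14 & 0b11 == 2, so only bucket 2 is scanned
```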
An index can be evaluated on the following factors:
Access Types: e.g. value-based search, range access, etc.
Access Time: Time to find particular data element or set of elements.
Insertion Time: Time taken to find the appropriate space and insert a new data.
Deletion Time: Time taken to find an item and delete it as well as update the index structure.
Space Overhead: Additional space required by the index.
INDEXING METHODS
Clustered Indexes
A clustered index determines the physical order of the data in a table. The example below contains different levels of pointers pointing to the base table.
Non-Clustered Indexes
A non-clustered index does not sort the physical data inside the table. Instead, a non-clustered index is stored in one place and the table data in another, with the index holding pointers to the storage location of the data. A table can have multiple non-clustered indices because each index is stored separately from the table data. For example, a book can have more than one index: one at the beginning which shows the contents of the book unit by unit, and another at the end which shows an index of terms in alphabetical order.
It is important to mention here that inside the table the data will be sorted by a clustered index.
However, inside the non-clustered index, data is stored in the specified order. The index contains
column values on which the index is created and the address of the record that the column value
belongs to.
A non-clustered index just tells us where the data lies, i.e. it gives us a list of virtual pointers or references to the locations where the data is actually stored. Data is not physically stored in the order of the index; instead, data is present in the leaf nodes. Consider, for example, the contents page of a book: each entry gives us the page number or location of the information stored. The actual data (the information on each page of the book) is not reordered, but we have an ordered reference (the contents page) to where the data actually lies.
When a query is issued against a column on which the index is created, the database will first go to the
index and look for the address of the corresponding row in the table. It will then go to that row address
and fetch other column values. It is due to this additional step that non-clustered indexes are slower
than clustered indexes.
It requires more time than a clustered index because extra work is done to extract the data by following the pointer, whereas with a clustered index the data is directly present alongside the index.
Dense Index
In the dense index, there is an index record for every search key value in the database. This makes
searching faster but requires more space to store index records itself. Index records contain search key
value and a pointer/reference to the actual record on the disk (the first data record with that search key
value).
Sparse Index
The index record appears only for a few items in the data file. Each item points to a block, as shown.
To locate a record, we find the index record with the largest search key value less than or equal to the search key value we are looking for. We then start at the record pointed to by that index record and proceed along the pointers in the file (that is, sequentially) until we find the desired record.
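The two lookup steps above can be sketched in Python. The sorted data file, the block size of two records, and the sample keys are illustrative assumptions:

```python
import bisect

# Hypothetical data file, sorted on the search key.
records = [(10, "A"), (20, "B"), (30, "C"), (40, "D"), (50, "E"), (60, "F")]

# Sparse index: one entry per block of 2 records, pointing at the
# position of the first record in that block.
BLOCK = 2
sparse = [(records[i][0], i) for i in range(0, len(records), BLOCK)]

def find(key):
    # Step 1: index entry with the largest key <= the target key.
    keys = [k for k, _ in sparse]
    i = bisect.bisect_right(keys, key) - 1
    if i < 0:
        return None
    # Step 2: scan sequentially from that position.
    for k, value in records[sparse[i][1]:]:
        if k == key:
            return value
        if k > key:
            break  # keys are sorted, so the record cannot appear later
    return None

print(find(40))  # 'D'
```

The sparse index here holds only three entries for six records, trading a short sequential scan for a much smaller index, which is the space saving described above.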
Secondary Index
It is used to optimize query processing and access records in a database with some information other
than the usual search key (primary key).
It helps to reduce the size of the mapping by introducing another level of indexing. Two levels of indexing are used in order to reduce the mapping size of the first level. At the initial stage, a large range of numbers is selected for the columns, so that the mapping size of the first level stays small. Then, this index method divides each range into smaller sub-ranges. Generally, primary memory stores the first-level mapping so that addresses can be fetched faster, while secondary memory stores the second-level mapping and the actual data. The actual physical location of the data is determined by the second mapping level.
Conclusion
A clustered index is a way of storing the rows of a table in some particular order, so that when the desired data is searched, only the corresponding row containing that data is touched and returned as output. A non-clustered index, on the other hand, resides in a physically separate structure that references the base data when it is searched, and it can have a different sort order.
The figure above illustrates the difference between a composite index with key (age, sal), a composite
index with key (sal, age), an index with key age, and an index with key sal.
All indexes shown in the figure use Alternative (2) above, for data entries.
If the search key is composite, an equality query is one in which each field in the search key is bound to
a constant. For example, we can ask to retrieve all data entries with age = 20 and sal = 10. The hashed
file organization supports only equality queries, since a hash function identifies the bucket containing
desired records only if a value is specified for each field in the search key.
A range query is one in which not all fields in the search key are bound to constants. For example, we
can ask to retrieve all data entries with age = 20; this query implies that any value is acceptable for the
sal field. As another example of a range query, we can ask to retrieve all data entries with age < 30 and
sal > 40.
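The difference can be sketched with a scan over hypothetical (age, sal) data entries. The equality query binds both fields of the composite key, so a hash on (age, sal) could locate the bucket directly; the range queries leave a field unbound, so no single bucket can be identified and a scan is needed instead:

```python
# Hypothetical data entries with composite search key (age, sal).
entries = [(20, 10, "r1"), (20, 40, "r2"), (25, 50, "r3"), (33, 70, "r4")]

# Equality query: every field bound to a constant (age = 20 AND sal = 10).
eq = [rid for age, sal, rid in entries if age == 20 and sal == 10]

# Range query: sal is unbounded (age = 20), so the hash function on the
# full key cannot identify one bucket; here we fall back to scanning.
rng = [rid for age, sal, rid in entries if age == 20]

# Another range query: age < 30 AND sal > 40.
rng2 = [rid for age, sal, rid in entries if age < 30 and sal > 40]

print(eq, rng, rng2)
```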
Durability means that once a transaction commits, its effects survive any subsequent failure (even if the system crashes before all its changes are reflected on disk). Durability is the responsibility of the Recovery Manager.
The recovery manager is one of the hardest components of a DBMS to design and implement. It must
deal with a wide variety of database states because it is called on during system failures.
The Recovery Manager also interacts to a lesser degree with the Buffer Manager and the Transaction
Manager. It is invoked by the Transaction Manager for transaction rollback. It requests the Buffer
Manager for the Dirty Page list and the Transaction Manager for the Transaction table.
In a DBMS, it is also necessary to keep track of current and completed transactions. This is done using a data structure called a log. Each log entry typically describes the operation performed, the initial value of
any updated item and the final value of any updated item. The log must be written to stable storage
because it is needed by the recovery manager for recovery.
On restart after a failure, the basic recovery process is to use the log stored on stable storage to undo
the effects of aborted and incomplete transactions (in reverse order) and to redo the effects of
committed transactions (in forward order).
A complication is that both the database itself and the log use memory buffers, so data written to the
database or to the log are not necessarily recorded in stable storage immediately. Related to this is the
fact that both the database and the log are written to disk a page at a time, not an item at a time. A
transaction is regarded as committed when the "commit" entry written to the log is recorded on stable storage.
Key choices for the recovery manager implementor include:
• Whether to require all changed database pages to be written to disk when a transaction commits ("force"). Forcing avoids the need to redo on restart.
• Whether to allow changed database pages to be written to disk before the transaction commits ("steal"). Stealing requires undo on restart.
Recovery managers may decide whether or not to use force and/or steal independently. Most recovery managers use WAL (write-ahead logging) to allow STEAL/NO-FORCE without sacrificing correctness.
Checkpoints are used to periodically write the log and changed database pages to disk, recording the
fact on the disk, to reduce the work required on restart after failure.
It's important that restart be idempotent: if a failure occurs during restart, and a second restart is
performed, the effect should be the same as if the first restart had completed.
Some of the terms encountered above (in boldface) will later be discussed in detail.
States of Transactions
Active − In this state, the transaction is being executed. This is the initial state of every transaction.
Partially Committed − When a transaction executes its final operation, it is said to be in a partially
committed state.
Failed − A transaction is said to be in a failed state if any of the checks made by the database recovery
system fails. A failed transaction can no longer proceed further.
Aborted − If any of the checks fails and the transaction has reached a failed state, then the recovery
manager rolls back all its write operations on the database to bring the database back to its original
state where it was prior to the execution of the transaction. Transactions in this state are called aborted.
The database recovery module can select one of the two operations after a transaction aborts –
- Re-start the transaction
- Kill the transaction
Committed − If a transaction executes all its operations successfully, it is said to be committed. All its
effects are now permanently established on the database system.
In a general sense, a commit is the updating of a record in a database. In the context of
a database transaction, a commit refers to the saving of data permanently after a set of tentative
changes. A commit ends a transaction within a relational database and allows all other users to see the
changes.
Transaction failure
A transaction has to abort when it fails to execute or when it reaches a point from where it can't go any further. This is called transaction failure, where only a few transactions or processes are affected.
DATABASE RECOVERY TECHNIQUES
A transaction T reaches its commit point when all its operations that access the database have been executed
successfully i.e. the transaction has reached the point at which it will not abort (terminate without completing).
Once committed, the transaction is permanently recorded in the database. Commitment always involves writing a
commit entry to the log and writing the log to disk. At the time of a system crash, the log is searched backward for all transactions T that have written a start_transaction(T) entry into the log but have not yet written a commit(T) entry; these transactions may have to be rolled back to undo their effect on the database during the recovery process.
Undoing – If a transaction crashes, then the recovery manager may undo transactions, i.e. reverse the operations of a transaction. This involves examining the log for each write_item(T, x, old_value, new_value) entry of the transaction and setting the value of item x in the database back to old_value.
There are two major techniques for recovery from non-catastrophic transaction failures: deferred updates and immediate updates.
Deferred update – This technique does not physically update the database on disk until a transaction has
reached its commit point. Before reaching commit, all transaction updates are recorded in the local
transaction workspace. If a transaction fails before reaching its commit point, it will not have changed the
database in any way so UNDO is not needed. It may be necessary to REDO the effect of the operations
that are recorded in the local transaction workspace, because their effect may not yet have been written
in the database. Hence, a deferred update is also known as the No-undo/redo algorithm
Immediate update – In the immediate update, the database may be updated by some operations of a
transaction before the transaction reaches its commit point. However, these operations are recorded in a
log on disk before they are applied to the database, making recovery still possible. If a transaction fails to
reach its commit point, the effect of its operation must be undone i.e. the transaction must be rolled back
hence we require both undo and redo. This technique is known as undo/redo algorithm.
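The undo step described above can be sketched as a backward scan over a log of write_item(T, x, old_value, new_value) entries. The database contents and log entries are hypothetical, and the updates are applied before commit in the immediate-update style:

```python
# Hypothetical in-memory database and log.
db = {"x": 5, "y": 8}
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "x", 5, 7),  # (T, item, old_value, new_value)
    ("write_item", "T1", "y", 8, 9),
    # no commit(T1) entry: T1 was still active at crash time
]

db["x"], db["y"] = 7, 9  # immediate update: changes applied before commit

def undo(transaction):
    """Roll back by scanning the log in reverse, restoring old values."""
    for entry in reversed(log):
        if entry[0] == "write_item" and entry[1] == transaction:
            _, _, item, old_value, _ = entry
            db[item] = old_value

undo("T1")
print(db)  # {'x': 5, 'y': 8}
```

Scanning in reverse order matters when a transaction writes the same item more than once: the earliest old_value is restored last, leaving the item at its pre-transaction value.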
Caching/Buffering – In this technique, one or more disk pages that include data items to be updated are cached into main-memory buffers and then updated in memory before being written back to disk. A collection of in-memory buffers called the DBMS cache is kept under the control of the DBMS for holding these buffers. A directory is used to keep track of which database items are in the buffers. A dirty bit is associated with each buffer: it is 0 if the buffer has not been modified and 1 if it has.
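A minimal sketch of the cache directory and dirty bits, using hypothetical page names and contents:

```python
# Hypothetical disk and DBMS cache; each cached page carries a dirty bit.
disk = {"P1": "alpha", "P2": "beta"}
cache = {}  # page id -> [contents, dirty_bit]

def read_page(pid):
    if pid not in cache:
        cache[pid] = [disk[pid], 0]  # freshly read pages are clean (bit 0)
    return cache[pid][0]

def write_page(pid, contents):
    read_page(pid)              # ensure the page is in the cache
    cache[pid] = [contents, 1]  # modified in memory: dirty bit set to 1

def flush(pid):
    contents, dirty = cache[pid]
    if dirty:                   # only dirty pages need a disk write
        disk[pid] = contents
        cache[pid][1] = 0

read_page("P1")
write_page("P2", "beta-v2")
flush("P2")
print(disk["P2"], cache["P1"][1])  # beta-v2 0
```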
Shadow paging – The AFIM (After Image) does not overwrite its BFIM (Before Image) but is recorded at another place on the disk. Thus, at any time a data item has its AFIM and its BFIM (the shadow copy of the data item) at two different places on the disk.
- To recover, it is sufficient to free the modified pages and discard the current directory. The state of
the database before transaction execution is available through the shadow directory. Database can
be returned to its previous state.
- Committing a transaction corresponds to discarding the previous shadow directory.
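Shadow paging can be sketched as two directories over a hypothetical page store: recovery simply keeps the shadow directory, while commit adopts the current directory and discards the shadow one.

```python
# Hypothetical page store on disk and its directories.
pages = {0: "old-A", 1: "old-B"}
shadow_dir = {"A": 0, "B": 1}     # directory as of transaction start
current_dir = dict(shadow_dir)    # working copy for the transaction

def write(item, value):
    new_slot = max(pages) + 1
    pages[new_slot] = value       # AFIM written to a new place on disk
    current_dir[item] = new_slot  # BFIM still reachable via shadow_dir

write("A", "new-A")

# Recovery: discard current_dir; the shadow directory still sees "old-A".
assert pages[shadow_dir["A"]] == "old-A"

# Commit: adopt current_dir and discard the previous shadow directory.
shadow_dir = current_dir
print(pages[shadow_dir["A"]])  # new-A
```

Note that no page is ever overwritten in place, which is why no undo log is needed to restore the pre-transaction state.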
Page
In a computer's random access memory (RAM), a page is a group of memory cells that are accessed as
part of a single operation. That is, all the bits in the group of cells are changed at the same time. In some
kinds of RAM, a page is all the memory cells in the same row of cells. In other kinds of RAM, a page may
represent some other group of cells than all those in a row.
In computer systems that use virtual memory (also known as virtual storage), a page is a unit of data
storage that is brought into real storage (on a personal computer, RAM) from auxiliary storage (on a
personal computer, usually the hard disk) when a requested item of data is not already in real storage
(RAM). It is a fixed-length contiguous block of virtual memory.
Pages are the internal basic structure to organize the data in the database files.
Dirty Page
When a page is read from disk into memory, it is considered a clean page because it is identical to its equivalent on disk.
However, once the page has been modified in memory by a data modification (insert/update/delete), it is marked as a dirty page: any page in the buffer pool that differs from its on-disk copy is known as a dirty page. Simply put, a page that has been modified in the buffer cache is called a dirty page.
A dirty page is simply a page that has been changed in memory since it was loaded from disk and is now
different from the on-disk page. "Dirty" pages contain data that has been changed but has not yet been
written to disk.
Cache
A cache is the part of the memory which transparently stores data so that future requests for that data
can be served faster.
We want to keep as much data as possible in memory, especially data that we need to access
frequently. The technique of keeping frequently used disk data in main memory is called caching. A cache
is also something that has been "read" from the disk and stored for later use.
Buffer
A buffer is a region of a physical memory storage used to temporarily hold data while it is being moved
from one place to another. Operating systems generally read and write entire blocks. Thus, reading a
single byte from disk can take as much time as reading the entire block. We call the part of main
memory where a block being read or written is stored a buffer.
The buffer keeps track of changes happening in a running program by temporarily storing them before
the changes are finally saved in the disk. A buffer is something that has yet to be "written" to disk.
A buffer pool is an area of main memory that has been allocated by the database manager for the
purpose of caching table and index data as it is read from disk.
When a row of data in a table is first accessed, the database manager places the page that contains that
data into a buffer pool. Pages stay in the buffer pool until the database is shut down or until the space
occupied by the page is required by another page.
Pages in the buffer pool can be either in-use or not, and they can be dirty or clean.
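As a minimal sketch (assuming a toy dictionary-backed "disk"), the clean/dirty life cycle of pages in a buffer pool might look like this; the class is illustrative, not a real database manager API:

```python
# Toy buffer pool: pages fetched from "disk" start clean; modifying a page
# marks it dirty until it is flushed back to disk.

class BufferPool:
    def __init__(self, disk):
        self.disk = disk          # page_no -> contents on disk
        self.pool = {}            # page_no -> in-memory copy
        self.dirty = set()        # pages modified since they were read

    def fetch(self, page_no):
        if page_no not in self.pool:
            self.pool[page_no] = self.disk[page_no]   # clean copy, same as disk
        return self.pool[page_no]

    def modify(self, page_no, value):
        self.fetch(page_no)
        self.pool[page_no] = value
        self.dirty.add(page_no)   # now differs from its on-disk version

    def flush(self, page_no):
        self.disk[page_no] = self.pool[page_no]       # write back to disk
        self.dirty.discard(page_no)                   # clean again
```

A page that is fetched but never modified stays clean; only `modify` makes it dirty, and only `flush` makes it clean again.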
Buffer Management Policies specify rules that govern when a page from the database cache can be
written to disk
Force versus no-force concerns the writing of modified (dirty) pages from the buffer pool. The simple
question here is: who decides, and when, that a modified page is written out to disk? There are two basic
approaches:
Force policy. At phase 1 of a transaction’s commit, the buffer manager locates all pages modified by that
transaction and writes the pages to disk. All pages updated by a transaction are immediately written to
disk before the transaction commits. It provides durability without REDO logging, but can cause poor
performance.
Forcing means that every time a transaction commits, all the affected pages will be pushed to stable
storage. This is inefficient, because each page may be written by many transactions and will slow the
system down.
No-force policy. This is the liberal counterpart. A page, whether modified or not, stays in the buffer as
long as it is still needed. Only if it becomes the replacement victim will it be written to disk. A no-force
policy is in effect if, when a transaction commits, we need not ensure that all the changes it has made to
objects in the buffer pool are immediately forced to disk.
Advantage of the force policy
It avoids any REDO recovery during restart. If transaction is successfully committed, then, by definition,
all its modified pages must be on disk.
Why not use it as a standard buffer management policy? Because of “hotspot” pages.
The force policy simplifies restart, because no work needs to be done for transactions that committed
before the crash – it avoids REDO. The price for that is significantly more I/O for frequently modified
pages.
Another drawback is that a transaction will not be completed before the last write has been executed
successfully, and the response time may be increased significantly as a consequence. With no-force
policy, the only synchronous write operation goes to the log, and the volume of data to be written is
usually about two orders of magnitude less.
Most crash recovery uses a steal/no-force approach, accepting the risks of writing possibly uncommitted
data to disk (steal) to gain the speed of not forcing all commit effects to disk (no-force). This avoids the
need for very large buffer space and reduces disk I/O operations for heavily updated pages.
INTRODUCTION TO ARIES
ARIES stands for Algorithms for Recovery and Isolation Exploiting Semantics. It has the following
general characteristics:
It maintains various data structures to identify dirty pages in the memory buffers and the active
transactions. (Pages are dirty if they are changed but not written to disk.)
On restart, it redoes the actions of all transactions to restore the state at the time of the failure.
It then undoes the actions of all uncommitted transactions.
Phases of ARIES
When the recovery manager is invoked after a crash, restart proceeds in three phases:
Analysis: Identifies dirty pages in the buffer pool (i.e., changes that have not been written to disk) and
active transactions at the time of the crash, by scanning through the log and other records. It determines
which transactions committed since the last checkpoint and which ones failed.
By the end of the Analysis phase, the Redo phase has the information it needs to do its job.
Redo: Repeats all actions, starting from an appropriate point in the log, and restores the database state
to what it was at the time of the crash. To REDO an action, the logged action is reapplied.
Undo: Undoes the actions of transactions that did not commit, so that the database reflects only the
actions of committed transactions.
As an example, suppose the Analysis phase identifies T1 and T3 as transactions that were active
(therefore not committed) at the time of the crash, and therefore to be undone;
T2 as a committed transaction, and all its actions, therefore, to be written to disk; and P1, P3,
and P5 as potentially dirty pages.
All the updates (including those of T1 and T3) are reapplied in log order during the Redo
phase.
Finally, the actions of T1 and T3 are undone in reverse order during the Undo phase; that is, T3’s
write of P3 is undone, T3’s write of P1 is undone, and then T1’s write of P5 is undone.
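The three phases can be sketched roughly as follows. This toy version omits the LSNs, checkpoints, dirty page table, and compensation log records that real ARIES uses; the log is just a list of dictionaries.

```python
# Simplified restart in the spirit of ARIES (illustrative only).

def restart(log, db):
    # Analysis: find transactions with no commit record (the "losers").
    committed = {r["txn"] for r in log if r["op"] == "commit"}
    losers = {r["txn"] for r in log if r["op"] == "update"} - committed

    # Redo: repeat history, reapplying every logged update in log order.
    for r in log:
        if r["op"] == "update":
            db[r["page"]] = r["new"]

    # Undo: roll back the losers' updates in reverse log order.
    for r in reversed(log):
        if r["op"] == "update" and r["txn"] in losers:
            db[r["page"]] = r["old"]
    return db
```

Running this on the T1/T2/T3 example above first reapplies every update, then undoes T3's and T1's writes in reverse order, leaving only committed T2's effects in the database.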
ARIES PRINCIPLES
There are three main principles behind the ARIES recovery algorithm:
Write-ahead logging: Any change to a database object is first recorded in the log (more on log shortly);
the record in the log must be written to stable storage before the change to the database object is
written to disk.
Repeating history during Redo: Upon restart following a crash, ARIES retraces all actions of the DBMS
before the crash and brings the system back to the exact state that it was in at the time of the crash.
Then, it undoes the actions of transactions that were still active at the time of the crash (effectively
aborting them).
Logging changes during Undo: Changes made to the database while undoing a transaction are logged to
ensure that such an action is not repeated in the event of repeated restarts (caused by repeated failures).
The second point distinguishes ARIES from other recovery algorithms and is the basis for much of its
simplicity and flexibility. In particular, ARIES can support concurrency control protocols that involve
locks of finer granularity than a page (e.g., record-level locks). The second and third points are also
important in dealing with operations such that redoing and undoing the operation are not exact inverses
of each other.
THE LOG
The log, sometimes called the trail or journal, is a history of actions executed by the DBMS. Physically,
the log is a file of records stored in stable storage, which is assumed to survive crashes; this durability
can be achieved by maintaining two or more copies of the log on different disks (perhaps in different
locations), so that the chance of all copies of the log being simultaneously lost is negligibly small.
The most recent portion of the log, called the log tail, is kept in main memory and is periodically forced
to stable storage. This way, log records and data records are written to disk at the same granularity
(pages or sets of pages).
Every log record is given a unique id called the log sequence number (LSN). As with any record id, we
can fetch a log record with one disk access given the LSN.
A log record is written for each of the following actions:
Updating a page: After modifying the page, an update type record is appended to the log tail.
The pageLSN of the page is then set to the LSN of the update log record. (The page must be
pinned in the buffer pool while these actions are carried out.)
Commit
Abort
End
Undoing an update
OTHER RECOVERY-RELATED DATA STRUCTURES
In addition to the log, the following two tables contain important recovery-related information:
Transaction table: This table contains one entry for each active transaction. The entry contains
(among other things) the transaction id, the status, and a field called lastLSN, which is the LSN of the
most recent log record for this transaction.
The status of a transaction can be that it is in progress, is committed, or is aborted. (In the latter two
cases, the transaction will be removed from the table once certain ‘clean up’ steps are completed.)
Dirty page table: This table contains one entry for each dirty page in the buffer pool, that is, each
page with changes that are not yet reflected on disk. The entry contains a field recLSN, which is the
LSN of the first log record that caused the page to become dirty. Note that this LSN identifies the
earliest log record that might have to be redone for this page during restart from a crash.
During normal operation, these are maintained by the transaction manager and the buffer manager,
respectively, and during restart after a crash, these tables are reconstructed in the Analysis phase of
restart.
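A minimal sketch of how these two tables might be maintained as updates are logged; the field names (`lastLSN`, `recLSN`) follow the text, but everything else is illustrative:

```python
# Toy maintenance of the transaction table and dirty page table.

transaction_table = {}   # txn id -> {"status": ..., "lastLSN": ...}
dirty_page_table = {}    # page id -> recLSN of the FIRST record that dirtied it

def log_update(lsn, txn, page):
    """Record that log record `lsn` was written for `txn` updating `page`."""
    entry = transaction_table.setdefault(
        txn, {"status": "in progress", "lastLSN": None})
    entry["lastLSN"] = lsn                 # most recent log record for this txn
    dirty_page_table.setdefault(page, lsn) # recLSN is set only once per page
```

Note that `setdefault` keeps the first LSN for a page: that is the earliest log record that might have to be redone for it during restart.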
THE WRITE-AHEAD LOG PROTOCOL
Before writing a page to disk, every update log record that describes a change to this page must be
forced to stable storage. This is accomplished by forcing all log records up to and including the one with
LSN equal to the pageLSN to stable storage before writing the page to disk.
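The WAL check itself is a simple comparison. A sketch, assuming a `flushed_lsn` counter tracks how much of the log is already on stable storage (the names are illustrative):

```python
# Sketch of the write-ahead log rule: a page may go to disk only after all
# log records up to and including its pageLSN are on stable storage.

def can_write_page(page_lsn, flushed_lsn):
    """True if the page's most recent log record has already been flushed."""
    return page_lsn <= flushed_lsn

def write_page(page_lsn, flushed_lsn, flush_log):
    """Force the log tail first if the WAL rule would otherwise be violated."""
    if not can_write_page(page_lsn, flushed_lsn):
        flushed_lsn = flush_log(page_lsn)   # flush log up to page_lsn
    assert page_lsn <= flushed_lsn          # WAL invariant now holds
    return flushed_lsn
```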
The importance of the WAL protocol cannot be overemphasized—WAL is the fundamental rule that
ensures that a record of every change to the database is available while attempting to recover from a
crash. If a transaction made a change and committed, the no-force approach means that some of these
changes may not have been written to disk at the time of a subsequent crash. Without a record of these
changes, there would be no way to ensure that the changes of a committed transaction survive crashes.
Note that the definition of a committed transaction is effectively “a transaction whose log records,
including a commit record, have all been written to stable storage”!
When a transaction is committed, the log tail is forced to stable storage, even if a no-force approach is
being used. It is worth contrasting this operation with the actions taken under a force approach: If a
force approach is used, all the pages modified by the transaction, rather than a portion of the log that
includes all its records, must be forced to disk when the transaction commits. The set of all changed
pages is typically much larger than the log tail, because the size of an update log record is close to twice
the size of the changed bytes, which is likely to be much smaller than the page size.
Further, the log is maintained as a sequential file, and thus all writes to the log are sequential writes.
Consequently, the cost of forcing the log tail is much smaller than the cost of writing all changed pages
to disk.
CHECKPOINTING
A checkpoint is like a snapshot of the DBMS state, and by taking checkpoints periodically, the DBMS can
reduce the amount of work to be done during restart in the event of a subsequent crash.
In its simplest form, a checkpoint declares a point before which the DBMS was in a consistent state and
all transactions were committed, so older log records can be archived to permanent storage. (ARIES, as
described below, uses a weaker "fuzzy" checkpoint.)
Checkpointing in ARIES has three steps.
First, a begin checkpoint record is written to indicate when the checkpoint starts.
Second, an end checkpoint record is constructed, including in it the current contents of the
transaction table and the dirty page table, and appended to the log.
The third step is carried out after the end checkpoint record is written to stable storage: A special
master record containing the LSN of the begin checkpoint log record is written to a known place
on stable storage.
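The three steps can be sketched as follows, with the log as a list and an LSN equal to a record's position in it (a simplification; real LSNs are byte offsets or similar):

```python
# Sketch of the three ARIES checkpointing steps.

def take_checkpoint(log, txn_table, dpt, master):
    # Step 1: write a begin_checkpoint record.
    log.append({"type": "begin_checkpoint"})
    begin_lsn = len(log) - 1

    # Step 2: append an end_checkpoint record containing the current
    # contents of the transaction table and the dirty page table.
    log.append({"type": "end_checkpoint",
                "txn_table": dict(txn_table),
                "dirty_page_table": dict(dpt)})

    # Step 3: after the end_checkpoint record reaches stable storage, store
    # the LSN of the begin_checkpoint record in the master record, which
    # lives at a known place on stable storage.
    master["begin_checkpoint_lsn"] = begin_lsn
    return begin_lsn
```

On restart, the recovery manager reads the master record to find the begin-checkpoint LSN, then starts the Analysis scan from there.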
While the end checkpoint record is being constructed, the DBMS continues executing transactions and
writing other log records; the only guarantee we have is that the transaction table and dirty page table
are accurate as of the time of the begin checkpoint record.
This kind of checkpoint is called a fuzzy checkpoint and is inexpensive because it does not require
quiescing* the system or writing out pages in the buffer pool (unlike some other forms of
checkpointing). On the other hand, the effectiveness of this checkpointing technique is limited by the
earliest recLSN of pages in the dirty pages table, because during restart we must redo changes starting
from the log record whose LSN is equal to this recLSN. Having a background process that periodically
writes dirty pages to disk helps to limit this problem.
When the system comes back up after a crash, the restart process begins by locating the most recent
checkpoint record. For uniformity, the system always begins normal execution by taking a checkpoint, in
which the transaction table and dirty page table are both empty.
* To quiesce is to pause or alter a device or application to achieve a consistent state, usually in
preparation for a backup or other maintenance.
MEDIA RECOVERY
Media recovery is most often used to recover from media failure, such as the loss of a file or disk, or a
user error, such as the deletion of the contents of a table. Media recovery can be a complete recovery
or a point-in-time recovery.
Media recovery is based on periodically making a copy of the database. Because copying a large
database object such as a file can take a long time, and the DBMS must be allowed to continue with its
operations in the meantime, creating a copy is handled in a manner similar to taking a fuzzy checkpoint.
When a database object such as a file or a page is corrupted, the copy of that object is brought up-to-
date by using the log to identify and reapply the changes of committed transactions and undo the
changes of uncommitted transactions (as of the time of the media recovery operation).
What is the difference between media recovery & crash recovery?
Media recovery is a process to recover the database from backup when a physical disk failure occurs.
Crash recovery is an automated process taken care of by the DBMS when an instance failure occurs, i.e.,
when there is a failure with an instance of the database.
The difference between a local database management system and a DDBMS is that a local DBMS is
allowed to access a single site, whereas a DDBMS is allowed to access several sites.
Distributed DBMS should have at least the following components.
Disadvantages of DDBMS
o Economics: Increased complexity and a more extensive infrastructure mean extra labour costs.
o Security: Remote database fragments must be secured, and since they are not centralized, the
remote sites must be secured as well.
o Difficult to Maintain Integrity: In a distributed database, enforcing integrity over a network may
require too much of the network's resources to be feasible.
o Lack of Standards: There are no tools or methodologies yet to help users convert a centralized
DBMS into a distributed DBMS.
o Additional software is required
o HOMOGENEOUS
In a homogeneous distributed database, all the sites use identical DBMS and operating systems. Its
properties are −
• The sites use very similar software.
• The sites use identical DBMS or DBMS from the same vendor.
• Each site is aware of all other sites and cooperates with other sites to process user requests
(there is transparency).
• The database is accessed through a single interface as if it is a single database.
There are two types of homogeneous distributed database −
• Autonomous − Each database is independent and functions on its own. They are integrated by a
controlling application and use message passing to share data updates.
• Non-autonomous − Data is distributed across the homogeneous nodes and a central or master
DBMS co-ordinates data updates across the sites.
o HETEROGENEOUS
In a heterogeneous distributed database, different sites have different operating systems, DBMS
products and data models. Its properties are −
• Different sites use dissimilar schemas and software.
• The system may be composed of a variety of DBMSs like relational, network, hierarchical or
object oriented.
• Query processing is complex due to dissimilar schemas.
• Transaction processing is complex due to dissimilar software.
• A site may not be aware of other sites and so there is limited co-operation in processing user
requests (no transparency).
A distributed DBMS can be organized using one of three architectures:
Client-Server
Collaborating Server
Middleware.
Client-Server Systems
A Client-Server system has one or more client processes and one or more server processes, and a
client process can send a query to any one server process. Clients are responsible for user-
interface issues, and servers manage data and execute transactions.
Thus, a client process could run on a personal computer and send queries to a server running on
a mainframe.
Collaborating Server Systems
The Client-Server architecture does not allow a single query to span multiple servers because the
client process would have to be capable of breaking such a query into appropriate subqueries to
be executed at different sites and then piecing together the answers to the subqueries.
The client process would thus be quite complex, and its capabilities would begin to overlap with
the server; distinguishing between clients and servers becomes harder.
Eliminating this distinction leads us to an alternative to the Client-Server architecture: a
Collaborating Server system. We can have a collection of database servers, each capable of
running transactions against local data, which cooperatively execute transactions spanning
multiple servers.
When a server receives a query that requires access to data at other servers, it generates
appropriate subqueries to be executed by other servers and puts the results together to
compute answers to the original query. Ideally, the decomposition of the query should be done
using cost-based optimization, taking into account the costs of network communication as well
as local processing costs.
Middleware Systems
The Middleware architecture is designed to allow a single query to span multiple servers,
without requiring all database servers to be capable of managing such multisite execution
strategies. It is especially attractive when trying to integrate several legacy systems, whose basic
capabilities cannot be extended.
The idea is that we need just one database server that is capable of managing queries and
transactions spanning multiple servers; the remaining servers only need to handle local queries
and transactions.
We can think of this special server as a layer of software that coordinates the execution of
queries and transactions across one or more independent database servers; such software is
often called middleware. The middleware layer is capable of executing joins and other relational
operations on data obtained from the other servers, but typically, does not itself maintain any
data.
PARALLEL DATABASE SYSTEMS
Parallel DBMS improves performance through parallelizing various operations: loading data, indexing,
query evaluation. Data may be distributed, but purely for performance reasons. In parallel database
system, parallelization of operations is performed for enhancing the performance of the architecture;
• divide a big problem into many smaller ones to be solved in parallel
• Increase bandwidth (in our case decrease queries’ response time)
In practice, there are situations where centralized systems are not flexible enough to handle some
applications.
The architectures related to Parallel DBMS are:
Shared memory: In this architecture, a common global memory is shared by all processors. Any
processor has access to any memory module.
Shared disk: All processors have private memory (not accessible by others), but direct access to
all disks in the system. The number of disks does not necessarily match the number of
processors.
Shared nothing: Each processor has exclusive access to its own main memory and disk unit. Each
processor's memory/disk acts as a server for the data it owns. It is the most common architecture
nowadays.
Types of Parallelism:
1. Data-partitioned parallelism (Intra-operation): the input data is partitioned and we work on
each partition in parallel. A task is divided over all machines to run in parallel.
2. Pipe-Lined Parallelism (Interoperation): Execution of different operations in pipe-lined
fashion, one operator consumes the output of another operator. For instance, if we need to
join three tables, one processor may join two tables and send the result set records as and
when they are produced to the other processor. In the other processor the third table can be
joined with the incoming records and the final result can be produced.
It involves ordered (or partially ordered) tasks and different machines are performing
different tasks.
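Both kinds of parallelism can be illustrated with a toy sequential simulation; a real system would place each partition or pipeline stage on a different processor, but the structure of the computation is the same:

```python
# Toy illustration of the two kinds of parallelism.

def partitioned_sum(data, n_parts):
    """Data-partitioned (intra-operation): split the input, compute a
    partial result per partition, then combine the partial results."""
    parts = [data[i::n_parts] for i in range(n_parts)]
    partials = [sum(p) for p in parts]   # each could run on its own machine
    return sum(partials)

def pipeline(rows, *stages):
    """Pipelined (inter-operation): each operator consumes the output of
    the previous operator as rows are produced, one row at a time."""
    for stage in stages:
        rows = map(stage, rows)          # lazy: rows stream through stages
    return list(rows)
```

For example, `partitioned_sum(range(10), 3)` splits the numbers into three partitions and combines three partial sums, while `pipeline([1, 2, 3], double, add_one)` streams each row through both operators without materializing the intermediate result.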
Advantages of Parallel Database
1) Capacity: A parallel database allows a large online trader to have thousands of users accessing
information at the same time.
2) Speed: The server breaks up a user database request into parts and posts each part to a separate
computer. They work on the parts concurrently and combine the results, passing them back to
the user. This speeds things up, allowing faster access to very complex databases.
3) Reliability: A parallel database, properly configured, can continue to work in spite of the failure
of any computer in the cluster.
Disadvantages of Parallel Database
1) Programming to target a parallel architecture is a bit difficult, but with proper understanding and
practice you are good to go.
2) Various code alterations have to be performed for different target architectures for improved
performance.
3) Communication of results might be a problem in certain cases.
4) Power utilization is huge in multi-core architectures.
5) Also, better cooling technologies are required in the case of clusters.
NB: Distributed processing usually implies parallel processing (not vice versa): you can have parallel
processing on a single machine.
Assumptions about architecture
Parallel Databases
• Machines are physically close to each other, e.g., same server room
• Machines connects with dedicated high-speed LANs and switches
• Communication cost is assumed to be small
• Can be shared-memory, shared-disk, or shared-nothing architecture
Distributed Databases
• Machines can be far from each other, e.g., on different continents
• Can be connected using a general-purpose network, e.g., the Internet
• Communication cost and problems cannot be ignored
• Usually shared-nothing architecture
A computer virus is a piece of code that spreads from one computer to another by attaching itself to
other files through a process called self-replication. In other words, the computer virus spreads by itself
into other executable code or documents. The code in the virus usually executes when the file it is
attached to is opened.
The purpose of creating a computer virus is to infect vulnerable systems, gain admin control and steal
sensitive user data. Hackers design computer viruses with malicious intent and prey on online users by
tricking them.
Other forms of malicious software (malware) are worms, adware, spyware, Trojans, ransomware, logic
bombs, etc. If you own or use a computer, you are vulnerable to malware. Computer viruses are
deployed every day in an attempt to wreak havoc, whether it be by stealing your personal passwords or
as weapons of international sabotage.
VIRUS is said to be an acronym meaning Vital Information Resource Under Siege.
How does a computer virus operate?
A computer virus operates in two ways. The first kind, as soon as it lands on a new computer, begins to
replicate. The second type plays dead until a trigger kick-starts the malicious code; in other words, the
infected program needs to run for the virus to execute. It is therefore very important to stay protected
by installing a robust antivirus program.
Any user who has ever been infected can tell you that computer viruses are very real. These programs
are typically distributed from host to host via email or a website that has been compromised. Some are
even attached to legitimate files and unknowingly executed by a user when they launch a particular
program. A virus is much more than the commonly perceived malicious code that functions with the
intent to destroy. They are classified by type, origin, location, files infected and degree of damage.
These common attributes are relative to most and all can have an adverse effect on your operating
system.
Computer viruses come in different forms and infect the system in different ways. Some of the most
common viruses are:
Macro Virus
This type is written in a macro language and infects Microsoft Word or similar applications (e.g., word
processors and spreadsheet applications), causing a sequence of actions to be performed
automatically when the application is started or something else triggers it. As the name suggests, macro
viruses particularly target macro language commands in applications like Microsoft Word. The same
applies to other programs too.
In MS Word, the macros are keystrokes that are embedded in the documents or saved sequences for
commands. The macro viruses are designed to add their malicious code to the genuine macro
sequences in a Word file. However, over the years, Microsoft disabled macros by default in more recent
versions of Word. Thus, cybercriminals started to use social engineering schemes to target users: they
trick the user into enabling macros in order to launch the virus.
File Infector Virus
An executable virus is a non-resident computer virus that stores itself in an executable file and infects
other files each time the file is run. The majority of all computer viruses are spread when a file is
executed or opened. A non-resident virus is a computer virus that does not store or execute itself from
the computer memory; executable viruses are an example of a non-resident virus.
Some file infector viruses come attached to program files, such as .com or .exe files. Others infect any
program for which execution is requested, including .sys, .ovl, .prg, and .mnu files. Consequently, when
the particular program is loaded, the virus is also loaded.
Besides these, the other file infector viruses come as a completely included program or script sent in
email attachments.
Multipartite Virus
This type of virus spreads through multiple ways. It infects both the boot sector and executable files at
the same time.
Polymorphic Virus
These types of viruses are difficult to identify with a traditional anti-virus program, because a
polymorphic virus alters its signature pattern whenever it replicates.
More and more cybercriminals are depending on the polymorphic virus. It is a malware type which has
the ability to change or mutate its underlying code without changing its basic functions or features. This
helps the virus on a computer or network to evade detection from many antimalware and threat
detection products.
Since virus removal programs depend on identifying signatures of malware, these viruses are carefully
designed to escape detection and identification. When security software detects a polymorphic virus,
the virus modifies itself so that it is no longer detectable using the previous signature.
Overwrite Virus
This type of virus destroys the contents of the files it infects. The only way to remove it is to delete the
infected files, and the end user loses all the contents in them. Identifying an overwrite virus is difficult,
as it spreads through emails.
Virus design purposes vary, and overwrite viruses are predominantly designed to destroy a file or
application's data. As the name says, after attacking the computer the virus starts overwriting files with
its own code. Not to be taken lightly, these viruses can target specific files or applications or
systematically overwrite all files on an infected device.
On the flip side, the overwrite virus is capable of installing new code in files or applications which
programs them to spread the virus to additional files, applications, and systems.
Spacefiller Virus
These are also called "cavity viruses", so named because they fill up the empty spaces within the code
and hence do not cause any damage to the file.
Example of viruses:
- Sleeper
- Alabama virus
- Christmas virus
- Friday the 13th
- ILOVEYOU (ILOVEYOU is one of the most well-known and destructive viruses of all time)
- MyDoom
- Storm Worm
- Melissa virus
Sources of viruses:
Downloading Programs
Downloadable programs are the commonest source of malware, including freeware, worms, and other
executable files. Whether you download image-editing software, a music file or an e-book, it is
important to ensure the reliability of the source of the media. Unknown, new or less popular sources
should be avoided.
Email Attachments
Anyone can send you an email attachment, whether you know them or not. Clicking on unknown links or
attachments can harm your device. Think twice before clicking anything, and make sure the file type is
not ‘.exe’.
Internet
One of the easiest ways to get a virus on your device is through the Internet. Make sure to check the
URL before accessing any website; for a secure URL, always look for ‘https’ in it. For example, videos
published on social media websites may require you to install a particular type of plug-in to watch them,
but in reality these plug-ins might be malicious software that can steal your sensitive information.
Bluetooth
Bluetooth transfers can also infect your system, so it is crucial to know what type of media file is being
sent to your computer whenever a transfer takes place. An effective armor would be to allow Bluetooth
connectivity with only known devices and activate it only when required.
Unpatched Software
Often overlooked, unpatched software is also a leading source of virus infection (unpatched software
means there are vulnerabilities in a program or code that a company is aware of and will not or cannot
fix). Security holes in software are exploited by attackers and remain unknown to software makers until
the attackers release them in the form of zero-day attacks. It is therefore recommended to install
software updates as soon as they are available on your PC.
Other sources include infected diskettes, infected CD-ROMs, illegal duplication of software, etc.
Apart from the above-mentioned sources, file-sharing networks can be a source of computer virus
attacks too. Therefore, use PC security software to keep your device safe and secure from malicious
attempts.
• Pop-ups bombarding the screen/ presence of tiny dots wandering across the screen
If you come across any of these signs, there is a chance your computer is infected by a virus or malware.
Do not delay; immediately stop all commands and download an
antivirus software. If you are unsure what to do, get the assistance of authorized computer
personnel.
How can you help protect your devices against computer viruses? Here are some of the things you can
do to help keep your computer safe.
Use a trusted antivirus product, such as Norton AntiVirus Basic, and keep it updated with the
latest virus definitions. Norton Security Premium offers additional protection for even more
devices, plus backup.
Avoid clicking on any pop-up advertisements.
Always scan your email attachments before opening them.
Always scan the files that you download using file sharing programs.
You can take two approaches to removing a computer virus. One is the manual do-it-yourself approach.
The other is by enlisting the help of a reputable antivirus program.
Antivirus software was originally developed to detect and remove computer viruses, hence the name.
However, with the proliferation of other kinds of malware, antivirus software started to provide
protection from other computer threats.
Antivirus software is practically a requirement for anyone using the Windows operating system. While
it's true you can avoid computer viruses if you practice safe habits, the truth is that the people who
write computer viruses are always looking for new ways to infect machines. There are several different
antivirus programs on the market -- some are free and some you have to purchase. Keep in mind that
free versions often lack some of the nicer features you'll find in commercial products.
Some examples of antivirus software:
- Norton Anti-Virus
- McAfee VirusScan
- Dr. Solomon’s Tool Kit
- Kaspersky
- Avast Antivirus
- Panda Cloud Antivirus
- Microsoft Security Essentials
- Avira AntiVirus
- AVG Anti-Virus
- Comodo Antivirus
- Immunet Protect
- PC Tools AntiVirus
- Bitdefender Family Pack
- Trend Micro
- Norton 360
- Watchdog
- ESET
Assuming your antivirus software is up to date, it should detect malware on your machine. Most
antivirus programs have an alert page that will list each and every virus or other piece of malware it
finds. You should write down the names of each malware application your software discovers.
Many antivirus programs will attempt to remove or isolate malware for you. You may have to select an
option and confirm that you want the antivirus software to tackle the malware. For most users, this is
the best option -- it can be tricky removing malware on your own.
If the antivirus software says it has removed the malware successfully, you should shut down your
computer, reboot and run the antivirus software again. This time, if the software comes back with a
clean sweep, you're good to go. If the antivirus software finds different malware, you may need to
repeat the previous steps. If it finds the same malware as before, you might have to try something else.
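The scan, remove, reboot, and rescan routine described above can be sketched as a simple loop. This is only an illustration: `clean_machine`, `scan`, and `remove` are hypothetical names standing in for whatever interface a real antivirus product actually exposes.

```python
# Hypothetical sketch of the scan -> remove -> rescan routine described above.
# scan() and remove() stand in for what a real antivirus product provides.

def clean_machine(scan, remove, max_passes=3):
    """Repeat scanning until nothing is found or the same malware reappears."""
    previous = None
    for _ in range(max_passes):
        found = scan()          # list of malware names detected this pass
        if not found:
            return "clean"      # clean sweep: you're good to go
        if found == previous:
            return "stuck"      # same malware as before: try something else
        remove(found)           # let the antivirus isolate/remove it
        previous = found        # note the names for comparison next pass
    return "stuck"

# Example run with canned scan results standing in for a real scanner:
results = iter([["Trojan.X"], []])
print(clean_machine(lambda: next(results), lambda names: None))  # clean
```

The loop mirrors the advice in the text: a repeat detection of the same malware is the signal to stop and seek another tool.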
INTRODUCTION
• After having learnt the rudiments of data processing for several months, you should know the
career options available.
• As computers and technology continue to become the cornerstone for just about every business,
data processors will be in constant demand to help corporations, individuals, and government
offices adapt and more effectively use technology in the office and in the home.
• From creating computer networks within a company that allow offices to share files and data, to working as a computer service administrator, data processing majors will be equipped with a wide array of computer and office skills that have real, practical applications in the job market.
CAREER OPTIONS
The career options for computer graduates can be classified into different categories (Some of these
professionals have similar functions):
1. Programming & Software Development
2. Information Systems Operation and Management
3. Telecoms and Networking
4. Computer Science Research
5. Web and Internet
6. Graphics & Multimedia
7. Training and Support
8. Computer Industry Specialists
Some careers require additional training, further study, or experience working in the field.
1. Programming & Software Development
Computer programmers of any kind write and test code that allows computer applications and software
programs to function properly. They turn the program designs created by software developers and
engineers into instructions that a computer can follow.
a) System Analyst
Computer systems analysts study an organization’s current computer systems and procedures and
design information systems solutions to help the organization operate more efficiently and effectively.
They bring business and information technology (IT) together by understanding the needs and
limitations of both.
b) System Consultant
The systems consultant reviews a firm's internal processes and aids the customer network department and IT staff in providing initial technical support to end-users. He or she leads or participates in projects that apply technology solutions to business problems.
c) Software Engineer
A software engineer is a person who applies the principles of software engineering to the design,
development, maintenance, testing, and evaluation of computer software.
Computer software engineers apply engineering principles and systematic methods to develop
programs and operating data for computers. They follow the SDLC (Software Development Life Cycle)
phases in developing software Systems.
d) Systems Programmer
A systems programmer engages in the activity of programming computer system software.
The primary distinguishing characteristic of systems programming when compared to application
programming is that application programming aims to produce software which provides services to the
user directly (e.g. word processor), whereas systems programming aims to produce software and
software platforms which provide services to other software, are performance constrained, or both (e.g.
operating systems, computational science applications, game engines, industrial automation, and
software as a service applications).
Most programmers are application programmers, in contrast with systems programmers.
e) Database Analyst
A person responsible for analyzing data requirements within an organization and modeling the data and
data flows from one department to another. Formerly called a "data administrator," the database
analyst may also perform "database administration" functions, which deal with the particular databases
employed.
f) Artificial intelligence (AI) Programmer
An artificial intelligence programmer helps develop operating software that can be used for robots,
artificial intelligence programs or other artificial intelligence applications. They may work closely with
electrical engineers or robotics engineers and others in order to produce systems that utilize artificial
intelligence.
Artificial intelligence refers to the capability of a system to adapt or change as more data is added. It may also mean programming a system to look for or seek out specific conditions and respond based on those factors.
For example, their programming may enable robots to learn to interact with other robots or work
together collaboratively. Other systems they program may be designed to take specific actions only
under certain conditions.
g) Scientific Application Programmer
An individual who writes scientific application programs.
In computer programming, a scientific language is a programming language optimized for the use of
mathematical formulas and matrices. Although these functions can be performed using any language,
they are more easily expressed in scientific languages.
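As a small illustration of this point, here is a matrix product written out in a general-purpose language: it takes three explicit loops, whereas a scientific language would express it as a single formula such as C = A × B.

```python
# Matrix multiplication spelled out in plain Python: three nested loops
# for what a scientific language writes as one expression (C = A x B).

def matmul(a, b):
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                c[i][j] += a[i][k] * b[k][j]
    return c

a = [[1, 2],
     [3, 4]]
b = [[5, 6],
     [7, 8]]
print(matmul(a, b))  # [[19, 22], [43, 50]]
```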
h) UI (User Interface) Designer
User Interface Design is a crucial subset of UX (User eXperience). User interface (UI) design is the
process of making interfaces in software or computerized devices with a focus on looks or style.
Designers aim to create designs users will find easy to use and pleasurable. UI design typically refers to
graphical user interfaces but also includes others, such as voice-controlled ones.
The role is one part Graphic Designer and one part behaviorist. UI Designers figure out the steps
consumers will use when accessing technology, then design models that shorten or streamline the steps
in the process to create a better user experience.
i) Embedded Systems Application Programmer
An embedded system is a controller with a dedicated function within a larger mechanical or electrical system, often with real-time computing constraints. It is embedded as part of a complete device, often including hardware and mechanical parts. An embedded systems application programmer writes the software that runs on such controllers.
2. Information Systems Operation and Management.
Information technology (IT) management consultants analyze the technology needs of organizations and
then make computer systems recommendations. They are mostly involved in decision making.
3. Telecommunications and Networking
a) Network Engineer/Consultant
The Network Consultant is an experienced and educated professional who certifies network
functionality and performance. They are responsible for designing, setting up and maintaining computer
networks at either an organization or client location.
Consultants meet with the organization's managers and network engineers to discuss networking requirements.
b) Network administrator
This role is essentially the same as that of a systems administrator. Network and computer systems administrators are responsible for the day-to-day operation of these networks. They organize, install, and support an organization's computer systems, including local area networks (LANs), wide area networks (WANs), network segments, intranets, and other data communication systems.
4. Computer Science Research
a) Computer Scientist/Researcher
A computer and information research scientist is an expert in the field of computer science, usually
holding a PhD or professional degree. These scientists use the collective knowledge of the field of
computer science to solve existing problems and devise solutions to complex situations.
They invent and design new approaches to computing technology and find innovative uses for existing
technology. They study and solve complex problems in computing for business, science, medicine, and
other fields.
Computer scientists are often hired by software publishing firms and scientific research and development organizations, where they develop the theories that allow new technologies to be created. Computer scientists are also employed by educational institutions such as universities.
b) Computer Science Professor
Computer science professors teach courses in computer science. They may specialize in a field of computer science, such as the design and function of computers or operations and research analysis. The role includes both teachers primarily engaged in teaching and those who do a combination of teaching and research.
c) AI Researcher
An AI researcher carries out research involving reasoning, knowledge representation, planning, learning,
natural language processing, perception and the ability to move and manipulate objects. General
intelligence is among the field's long-term goals.
d) Data Miner
Data mining involves exploring and analyzing large blocks of information to uncover meaningful patterns and trends; in short, it is the discovery of patterns in large data sets.
The Data Miner/Data Mining Specialist's role is to design data modeling/analysis services that are used
to mine enterprise systems and applications for knowledge and information that enhances business
processes.
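As a toy illustration of pattern discovery, the sketch below counts how often each pair of items appears together across a set of made-up shopping transactions. This is a much-simplified version of the frequent-itemset mining used in real data mining work.

```python
# Toy frequent-pair mining: count how often each pair of items occurs
# together across transactions, a simplified taste of pattern discovery.
from itertools import combinations
from collections import Counter

transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk"},
]

pair_counts = Counter()
for basket in transactions:
    # count every pair of items bought together in this basket
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

print(pair_counts.most_common(1))  # [(('bread', 'milk'), 3)]
```

The most frequent pair (bread and milk appearing together in 3 of the 4 baskets) is exactly the kind of pattern a data miner would surface from an enterprise system, just at a vastly larger scale.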
e) Bioinformatics Specialist
Bioinformatics specialists are computer scientists who apply their knowledge to the management of
biological and genomic data. They build databases to contain the information, write scripts to analyze it,
and queries to retrieve it.
Bioinformatics scientists conduct research to study huge molecular datasets including DNA, microarray,
and proteomics data.
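A minimal sketch of the database side of this work, using made-up gene names and sequences: store a few DNA strings in an SQLite database, then query for the ones that begin with the ATG start codon.

```python
# Minimal sketch of bioinformatics database work: store made-up DNA
# sequences in SQLite, then query them back with SQL.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sequences (gene TEXT, dna TEXT)")
con.executemany(
    "INSERT INTO sequences VALUES (?, ?)",
    [("geneA", "ATGCGT"), ("geneB", "TTAGGC"), ("geneC", "ATGAAA")],
)

# Query: which genes start with the ATG start codon?
rows = con.execute(
    "SELECT gene FROM sequences WHERE dna LIKE 'ATG%' ORDER BY gene"
).fetchall()
print([gene for (gene,) in rows])  # ['geneA', 'geneC']
```

Real genomic databases hold millions of sequences, but the pattern of building a database, loading data with scripts, and retrieving it with queries is the same.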
5. Web and Internet
a) Web/Internet Applications programmer
Internet/Web Application Programming focuses on systems that are used over the Internet or an
intranet. A web application is a computer program that utilizes web browsers and web technology to
perform tasks over the Internet.
A Web/Internet applications programmer creates these programs.
b) Internet Consultant
Internet consultants use their technological and computer skills to help people or businesses access and
utilize the Internet. Their work may include implementing or refining a networking system, creating a
Web site, establishing an online ordering or product support system, or training employees to maintain
and update their newly established Web site. Some consultants work independently, and others may be
employed by a consulting agency.
c) Web developer/Webmaster
A web developer or webmaster creates or maintains a Web site; provides content and programming or supervises writers and programmers; monitors the performance and popularity of the site; and provides secure forms and transactions for Internet-based businesses.
Web developers assess the needs of users for information-based resources. They create the technical
structure for websites and make sure that web pages are accessible and easily downloaded through a
variety of browsers and interfaces.
Web developers structure sites to maximize the number of page views and visitors through search engine optimization. They must have the communication ability and creativity to make sure the website meets its users' needs.
d) Digital/Internet Advertising Designer
This professional designs online ads for businesses and organizations using tools like cookies, search engine marketing, email ads, banner ads, blogs, social network ads, and more.
c) Technical Writer
A technical writer is a professional writer who communicates complex information. Technical writers create technical documentation that includes things like instruction manuals, user manuals, quick reference guides, and white papers. They may also create more common types of content, including social media posts, press releases, and web pages.
Essentially, technical writers break down complex technical products into easy-to-understand guides
that help the end-user understand how to use the products and services.
d) Computer Operator
A Computer Operator is responsible for the technical operation of the computer system. They resolve
user problems by answering questions and requests.
8. Computer Industry Specialists
a) System Integrator
Abbreviated as SI, a system integrator is an individual or company that specializes in building complete computer systems by putting together components from different vendors. Unlike software developers, systems integrators typically do not produce any original code. Instead, they enable a company to use off-the-shelf hardware and software packages to meet the company's computing needs.
b) IT Recruitment Consultant
IT Recruitment consultants are responsible for attracting candidates for IT jobs and matching them to
temporary or permanent positions with client companies. They look for and discover talents.
c) IT Sales Professional
A sales professional is someone who sells products or services to potential customers. They seek to
solve prospects' challenges through the products they sell. Great sales professionals will have strong
selling and communication skills.
The role of an IT Sales Professional falls into three categories: pre-sales, sales, and post-sales support.
d) Journalist, Computer-Related Publicist
Practices journalism in IT only. Publicists work as the bridge between their customers and the media.
They represent their clients by managing the media's perception of them.
Attention to Detail
The slightest mistake can affect how a web page looks or how a program runs. Computer personnel
must pay close attention to detail to ensure everything works correctly and efficiently.
A Commitment to Learning
Technology is constantly changing, and those who keep abreast of the latest developments in
information technology are the ones who will be the most successful.
Versatility
The most successful computer professionals will be the ones who have skills that extend beyond
information technology, such as skills in business and finance.
• Establishing and maintaining a register of professionals registered under the decree to practice the profession of computing in Nigeria, and publishing a list of registered persons from time to time.
• Carrying out every other function granted to it by the provisions of the decree, including organizing and controlling the practice of computing in the country.
• Supervising the computing profession in Nigeria; screening all individuals who want to be registered as computer professionals; screening and registering all corporate organizations that are involved, or want to be involved, in selling or using computing facilities and providing computing professional services in Nigeria; and maintaining high standards of professional ethics, professionalism, and discipline.
• Determining the academic standards in computing programmes/degrees such as computer engineering, computer science, information science, etc.; accrediting degree-awarding institutions and their courses; evaluating certificates in computing; and conducting professional exams in conjunction with associations and bodies external to the council.
• Publicizing the activities of the council and publishing computing professional works such as books, journals, magazines, and newsletters.
Note: The above are the two major professional bodies for professionals in the computing, information technology, and systems industry in Nigeria. There are other associations.
ITAN is an association of over 350 Information Technology driven companies in Nigeria. It was founded
in 1991 to promote IT literacy and penetration in Nigeria; and to promote members’ interest in the area
of trade, public policy formulation and negotiations with government on IT policy matters.
ITAN keeps its members informed about ongoing trends and issues relevant to the industry.
Their Services include:
The Nigeria Internet Group (NIG), founded in 1995, is a not-for-profit, non-governmental organization,
promoting the Internet in Nigeria.
To achieve its mandate, the Group engages in a number of activities which include; policy advocacy,
awareness creation and education.
The Institute for Management of Information Systems (IMIS) was founded in 1978 and is one of the leading associations promoting excellence in the field of Information Systems Management through education and professional association.
IMIS was previously known as the Institute of Data Processing Management (IDPM). The institute's headquarters is located in the United Kingdom. It has approximately 12,000 members, the majority of whom reside outside the UK.
The institute has consistently played a prominent role in promoting the importance of Information Systems Management. IMIS focuses specifically on the practical application and management of Information Systems within society, while most other professional associations concentrate primarily on the technical side of information systems. The institute makes great efforts towards the recognition of Information Systems Management as one of the key professions influencing the future of the world. IMIS and the British Computer Society (BCS) are regarded as the two main UK professional institutes for computer professionals.
f. The Institute Of Software Practitioners Of Nigeria (ISPON)
The Institute of Software Practitioners of Nigeria (ISPON) is the body of the computer software and related services industry in Nigeria. ISPON is concerned with the growth of a software-driven Information Technology industry in Nigeria.
Others include:
g. The Internet Service Providers' Association of Nigeria (ISPAN) regulates and monitors ISPs.
h. Nigerian Information Technology Professionals in the America (NITPA)
i. Association of Telecom Companies of Nigeria (ATCN), a professional, non-profit, non-political
umbrella organization of telecommunications companies in Nigeria.
NB: Nigerian Communications Commission (NCC) is an independent regulatory authority for the
telecommunications industry in Nigeria. It is a commission and not a professional body.