Effective Storage and Data Management
Consulting LLC
Prepared by:
Rick Leopoldi
August 4, 2009
This paper discusses the processes and methods used to define, characterize, and
orient data within the storage technology hierarchy. In addition, it addresses the
data usage and profile characteristics needed to manage data and storage
effectively from an IT service management perspective.
Introduction
Managing physical and logical storage devices is a necessary part of storage
management. That is, understanding the performance, capacity, and utilization of
storage devices is critical to any storage management effort. However, within the
larger scope of IT service management (ITSM), it is not the only part. Managing the
data that resides on the storage devices is essential to an effective ITSM data
and storage management effort. Data management implies understanding the data
usage requirements, characteristics, and life cycle, and applying them to
manage data on physical and logical storage devices.
Analyzing three separate areas of data can accomplish this: data definition, data
usage life cycle, and data profile characteristics. Once this analysis is complete,
a more effective ITSM data and storage management effort can be achieved, allowing
a more definitive understanding of where data can be placed physically and
logically within the storage technology hierarchy.
Email databases would also fall into this category, allowing a wide community to
access them. In addition, there is usually a need for some security at the individual or
group level even though the whole database is still "Public". A shared
Repository/Dictionary would be another example of "Public" data.
Divisional "Corporate" Data
This is data that is vital to the corporation but accessed and used by a smaller
divisional community within the organization. This data could be files and/or
databases that support corporate business functions and have corporate data
characteristics such as restricted access requirements for human resource
information, payroll, security, and trade secrecy.
Departmental Data
This is data that is of interest to a more localized community, literally a department,
or maybe a geographical region such as a regional office. The data, at least in
summary form, may well be part of a feeder system to databases required at the
organization's business unit level or a corporate-wide database, but in its detailed
form it can typically reside on a departmental system.
Private Data
This is data that is private and/or personal to an individual, or a small
homogeneous group of individuals. It may consist of working files that are a
preamble to supplying data to departmental or corporate databases, but at the
"working" stage are private. Security needs are typically lower. The data may be
downloaded copies of departmental or corporate data, where direct updates to the
original file are not being done.
As with departmental data, there may be other factors that cause private data to
be located on a departmental shared LAN or possibly a NAS or SAN, but this is
atypical.
It is natural to associate the private data type with workstations, and this is an
area where platform overlap may occur. Typically, when LANs behave as
departmental systems, there may be issues of security or transaction rates that can
affect placing this data on a departmental shared LAN, NAS, or SAN.
The process for deciding where a given database should reside should begin with
this initial data requirement and characterization analysis. This is a recommended
process for managing data on a smaller number of standard platforms.
Possible Location
NAS, SAN
Shared LAN, NAS or SAN
Shared LAN, NAS or SAN
Shared LAN or NAS
Workstation, Shared LAN, SAN
Beginning with this initial high-level data definition process, the basic idea is to
determine, for each file, dataset, or database, the data usage characteristics using
the data profile characteristics as a base set of criteria.
The approach is to take criteria such as security, response time, concurrency,
updating, and currency, and use a metric to determine whether each is mostly
true for this data, partially true, or not true at all (High, Medium, or Low could
be used).
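The rating approach described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed tool: the `Rating` scale, the `DataProfile` class, its method names, and the `payroll_db` example are all hypothetical, and the criteria list contains only the criteria named in the text.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Rating(IntEnum):
    """How strongly a criterion applies to a piece of data."""
    LOW = 1      # not true at all
    MEDIUM = 2   # partially true
    HIGH = 3     # mostly true


# Criteria named in the text; an organization would extend this list.
CRITERIA = ("security", "response_time", "concurrency", "updating", "currency")


@dataclass
class DataProfile:
    """Ratings for one file, dataset, or database (hypothetical structure)."""
    name: str
    ratings: dict = field(default_factory=dict)

    def rate(self, criterion: str, rating: Rating) -> None:
        """Record a rating, rejecting criteria that are not in CRITERIA."""
        if criterion not in CRITERIA:
            raise ValueError(f"unknown criterion: {criterion}")
        self.ratings[criterion] = rating

    def unrated(self) -> list:
        """Criteria that still need a rating."""
        return [c for c in CRITERIA if c not in self.ratings]

    def at_least(self, threshold: Rating) -> list:
        """Rated criteria whose rating meets or exceeds the threshold."""
        return [
            c for c in CRITERIA
            if c in self.ratings and self.ratings[c] >= threshold
        ]


# Hypothetical example: a payroll database with two criteria rated so far.
payroll = DataProfile("payroll_db")
payroll.rate("security", Rating.HIGH)
payroll.rate("concurrency", Rating.LOW)
print(payroll.at_least(Rating.HIGH))  # criteria that are "mostly true"
print(payroll.unrated())              # criteria still to be assessed
```

An ordered scale is used so that ratings can be compared against a threshold. An organization could equally keep the ratings in a spreadsheet, since the point is only that every file, dataset, or database is scored against the same criteria.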
Rating each criterion in this way allows an organization's unique requirements to
qualify and quantify the importance of the particular file, dataset, or database
for each criterion. As an
example: