Computer Storage Architecture - From Ancient Stone Age to
Modern Computer Storage
Our ancestors used to carve on stones to record their important events,
histories and stories. Then, the Chinese invented paper and methods for mass
producing it some two thousand years ago, after getting tired of carving on
stones and bamboo. This made storing information much easier and less
labor-intensive. Paper has remained throughout the last two thousand years the
primary medium for societies to record information.
However, there are many issues related to the use of paper for storing
information. For one thing, paper is not very durable. It can also be quite
heavy, although not as heavy as stone! But by far the biggest issue with paper
is accessibility: finding the information that we need within a document quickly
and easily.
Fast forward to today. Since the invention of the computer disk, we have devised
three main ways of storing massive amounts of data and information onto disks.
These three ways are Direct Attached Storage (DAS), Network Attached Storage
(NAS), and Storage Area Network (SAN).
Direct Attached Storage
By my estimate, 95% of all computer storage devices today are DAS. For example,
just about all desktops and laptops use DAS. By DAS, we mean disk storage that
is directly connected to the computer that uses it, either internal to the
computer or external to it, such as an external USB disk drive. Most servers
today also use DAS. One of the main reasons why so many computers use DAS is a
design decision made when computers were first introduced: connecting them
together in a networked environment was, in general, not part of the
consideration. Computer networking has become pervasive only during the last 15
or so years. Although some computers have long been operated without any DAS
(so-called diskless workstations), the diskless model never caught on as a
popular way of deploying computers.
In some cases, DAS is unavoidable. Laptops are one example: we need the storage
to go with us wherever we take our laptop, so not having DAS is just not an
option. DAS is also ideal for low-cost deployments with relatively localized
applications.
There are a few disadvantages to using DAS. Generally speaking, DAS is captive
storage. Once a storage device is attached to a computer, it is very difficult
to redeploy, even when one computer has plenty of storage space and another is
running out. There is no easy way to reallocate the space to resolve this
situation. And since it is usually hard to predict how the disk storage on a
given computer is going to be used, installing the right amount of DAS is
always tricky. With DAS, there are also backup issues: since computers with DAS
are distributed, it is next to impossible to back them up easily and quickly.
So most companies just back up the servers and rarely, if ever, back up the
desktops and laptops. The distributed nature of DAS can also make managing and
maintaining it a nightmare for Information Technology (IT) professionals.
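The captive-storage problem can be illustrated with a little arithmetic. The sketch below is only an illustration: the machine names and sizes are made up, and the two helper functions are hypothetical. It compares the shortfall when each machine can use only its own disk with the shortfall if the same capacity were one shared pool.

```python
def stranded_capacity(machines):
    """Return (total_free_gb, total_shortfall_gb) for DAS-only machines.

    Each machine is a (name, capacity_gb, needed_gb) tuple. With DAS, free
    space on one machine cannot cover a shortfall on another, so both
    totals can be non-zero at the same time.
    """
    free = sum(max(cap - need, 0) for _, cap, need in machines)
    shortfall = sum(max(need - cap, 0) for _, cap, need in machines)
    return free, shortfall


def pooled_shortfall(machines):
    """Return the shortfall if the same capacity were one shared pool."""
    total_cap = sum(cap for _, cap, _ in machines)
    total_need = sum(need for _, _, need in machines)
    return max(total_need - total_cap, 0)


# Hypothetical fleet: names and sizes are invented for illustration.
fleet = [
    ("desktop-a", 500, 120),
    ("desktop-b", 500, 650),
    ("server-c", 1000, 900),
]

free_gb, short_gb = stranded_capacity(fleet)
print(f"DAS: {free_gb} GB free elsewhere, yet {short_gb} GB short")
print(f"Pooled: {pooled_shortfall(fleet)} GB short")
```

In this made-up fleet, one desktop runs out of space while the other machines still have room to spare, which is exactly the situation described above.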
Network Attached Storage
With the development of the Network File System (NFS) by Sun Microsystems, Inc.
for UNIX and the Common Internet File System (CIFS) by Microsoft for Windows, a
new way of storing files became possible. With these widely accepted standards,
it has become quite popular to store files on devices that are connected
directly to the computer network and are specially designed and optimized for
file access. These devices are what we call NAS.
NAS largely removes the need to predict disk storage usage, since most NAS
implementations allow storage to be expanded easily and dynamically. NAS is
also relatively easy to deploy and maintain. Since NAS is built independently
of the operating system, it can serve any computer that supports NFS or CIFS,
and just about all computers support one or both protocols. NAS can also be
centrally deployed and managed, and it supports two of the most popular file
systems: NT File System (NTFS) and UNIX File System (UFS). As NAS holds more
and more mission-critical data, many fault-tolerance, data-protection, and
expansion features have been added to it over the last several years.
Despite its popularity as network-based storage, NAS has several drawbacks.
Computers that use NAS store their files remotely over the network, so
applications are unable to access data at the block level. However, block-level
access is essential for many database and server-based applications. Backup is
also generally slow and inefficient, since all backups have to be performed at
the file system level. In some environments, NAS may also have scalability
issues.
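The difference between file-level and block-level access can be sketched in a few lines. This is a simplified illustration, not how any NAS or SAN product is implemented: an ordinary temporary file stands in for a disk, and the 512-byte block size and all function names are assumptions made for the example.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size for this sketch


def read_file(path):
    """File-level access: ask for a named file and get its contents back."""
    with open(path, "rb") as f:
        return f.read()


def read_block(device_path, block_number, block_size=BLOCK_SIZE):
    """Block-level access: ask for block N, located by its byte offset."""
    with open(device_path, "rb") as dev:
        dev.seek(block_number * block_size)
        return dev.read(block_size)


def make_demo_image(num_blocks=4):
    """Create a temp file standing in for a disk; block i is filled with byte i."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_blocks):
            f.write(bytes([i]) * BLOCK_SIZE)
    return path


image = make_demo_image()
try:
    whole = read_file(image)      # file-level: the whole named object
    block = read_block(image, 2)  # block-level: one block, by position
    print(len(whole), block[:4])
finally:
    os.remove(image)
```

A database engine typically wants the second style, because it manages its own layout on disk and needs to fetch or rewrite individual blocks rather than whole files.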
Storage Area Network
Two main characteristics define SAN: how it transfers data within its own
system and to and from external devices, and how it lays out its hardware to
serve data with high performance and increased redundancy. SAN transfers data
in blocks, both to increase throughput and to allow databases and some
server-based applications to work seamlessly in their native environment.
Unlike NAS, which usually comes as one or two separate units, SAN comes with
several different units and other support devices, such as specialized SAN
switches and separate management units.
Just about all SAN units can also serve up NFS and CIFS, just like NAS.
However, not all SAN units use NTFS and UFS; most have their own proprietary
file system that understands NTFS and UFS. While NAS uses a general-purpose
network for data transfers, SAN uses its own highly sophisticated network to
perform some of its data transfers via its web of Fibre Channel links, hundreds
to thousands of gigabits at a time. Backup in a SAN environment can also be
done in blocks to reduce the time and resources necessary to back up mountains
of data.
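One way to see why block-based backup can save time is to copy only the blocks that have changed since the last backup. The sketch below shows that general idea with hashed, fixed-size blocks. It is an assumption-laden illustration, not a description of how any particular SAN product performs backups; the function names and the 4096-byte default are made up.

```python
import hashlib


def block_digests(data, block_size=4096):
    """Split data into fixed-size blocks and return one SHA-256 digest per block."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=4096):
    """Return indices of blocks in `new` that differ from, or do not exist in, `old`."""
    old_d = block_digests(old, block_size)
    new_d = block_digests(new, block_size)
    return [
        i for i, digest in enumerate(new_d)
        if i >= len(old_d) or digest != old_d[i]
    ]


# Tiny demonstration with a 4-byte block size so the result is easy to follow.
last_backup = b"aaaaaaaa"
current = b"aaaabbbbcc"
print(changed_blocks(last_backup, current, block_size=4))
```

Only the blocks listed would need to be copied, whereas a file-level backup would have to copy the whole file again even if a single byte had changed.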
Cost, complexity, and the lack of common standards are the top three reasons
why SAN has not been as widely deployed as NAS up to this point.
Like all areas of IT, the storage industry is ever-changing. We are seeing
convergence among many of the methods described above, and with the
introduction and standardization of the iSCSI protocol and IP SAN technologies,
we are seeing more creative and less costly ways of using these new
combinations as well.
As an old IT saying goes, garbage in, garbage out. What all of today's storage
technologies have in common is that none of them addresses the issue of
managing the content being stored inside. By managing, I mean that users should
be able to find the files they need when they need them; there should be no
unintended duplicates anywhere, yet certain users should be able to keep
different versions of the same file; and those users should also be able to set
up automatic data migration based on watermarks or retention policies, in order
to meet certain governmental requirements.
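Two of these content-management ideas, spotting duplicates and migrating data when a watermark is crossed, can be sketched briefly. Everything here is a hypothetical illustration: the function names, the 90% and 70% watermarks, and the choice to move the least recently accessed files first are assumptions, not part of any storage standard.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files):
    """Group file names by content hash; return groups holding more than one name.

    `files` maps a file name to its content as bytes.
    """
    by_hash = defaultdict(list)
    for name, content in files.items():
        by_hash[hashlib.sha256(content).hexdigest()].append(name)
    return sorted(sorted(names) for names in by_hash.values() if len(names) > 1)


def files_to_migrate(files, capacity, high=0.9, low=0.7):
    """Pick files to move off a volume once usage passes the high watermark.

    `files` is a list of (name, size, last_access) tuples. Files are chosen
    oldest access first until usage would drop to the low watermark.
    """
    used = sum(size for _, size, _ in files)
    if used <= high * capacity:
        return []
    chosen = []
    for name, size, _ in sorted(files, key=lambda f: f[2]):
        if used <= low * capacity:
            break
        chosen.append(name)
        used -= size
    return chosen


docs = {"a.txt": b"report", "b.txt": b"report", "c.txt": b"notes"}
print(find_duplicates(docs))

volume = [("old.dat", 30, 1), ("mid.dat", 30, 2), ("new.dat", 35, 3)]
print(files_to_migrate(volume, capacity=100))
```

Note that hashing finds only byte-for-byte duplicates; telling an intended version apart from an accidental copy would still need information that the storage layer alone does not have.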
I am an optimist and I have a hope. The day will come when we all can carry
the entire Library of Congress, the library collections of all the major
universities, and all the content currently found on the Internet in our
handheld devices, and we can search, read, or listen to this content anytime,
anywhere. More importantly, owning such a device will cost no more than dinner
in a nice restaurant; refreshing it monthly via our wireless network will cost
no more than a can of soda; and the refresh will take no longer than it takes
to drink that very same can of soda! One day…
By Benson Yeung, Senior Partner
