
Why won't we just hit delete?
Published: 21 October 2003 10:00 GMT
Is policing how much data your organisation stores too much effort? Martin Brampton wonders where current trends are taking us…
The striking thing about data is just how much of it there is, and the speed with which it continues to grow. Storage devices are giving us ever greater raw capacities but that can be as much a problem as a solution.
Of course, just about every kind of communication is now digitised and therefore amenable to being stored as data. That is one of the driving forces behind the phenomenal growth in storage. Another is the inefficiency with which much information is stored. The first paragraph of this article is only a few dozen words, yet required 20,000 bytes of disk storage.
We also have a strong tendency to hang on to data more or less indefinitely. IT managers figured out long ago that only a small minority of users create storage burdens that need to be controlled. By and large, it is far easier and cheaper to simply provide the increasing storage that people want than to attempt to police their activities.
This does lead to growing technical complexity, though. For the sake of good management, there is a strong preference for data to be stored on network servers. The general-purpose network server has increasingly been replaced by flexible provision of processing backed up by large-scale pooled storage facilities.
As with most technology, installing a major storage facility is often easier than the inevitable subsequent upgrades. Admittedly, the upgrades are easier than the corresponding upgrades to servers with directly attached storage. But problems certainly arise, especially as one of the demands is that the storage facility should look like it continues forever.
After all, when we talk about the availability of servers, what we really mean is often the availability of data. Processing power is easily replaced, and if there is enough of it around, processing can be moved from one place to another in moments. There are certainly some transitional issues but for all but the most critical applications they are manageable.
Storage is different. There are situations where data can be kept in more than one place but generally this creates more problems than solutions. Mostly it is much more effective to have only a single master copy of any set of data and that is the resource that needs to sustain availability and integrity.
Behind the scenes the data might be duplicated through techniques such as mirroring. But this is the kind of area where things start getting decidedly complicated. As demands change, so the hardware has to be changed. Even sticking with the same hardware provider will not avoid significant technology shifts over time. The next layer up from the storage hardware tends to be proprietary software that also changes and creates compatibility problems.
Administration of large-scale storage facilities can also become a nightmare. As capacity is added on different occasions, often the best that hard-pressed operations staff can manage is to secure basic backup and restore functions. More proactive management of data to achieve the best economy and flexibility is frequently impractical.
This is the logic that drives the latest layer to be added to the mix - the storage portal. As a specialised kind of systems management tool, it papers over the various underlying technologies to give a single uniform interface to storage. The only snag is a new dependence on the proprietary vendor of the portal.
Simultaneously, the provision of processing power is driving towards blade servers, where the physical infrastructure supports rapid commissioning of extra capacity. The complexity of data storage is pushed away to the storage system, via the portal.
Perhaps there is one odd thing about all this. We are moving to a situation where the storage we see as users is an abstraction built on top of a physical system that we know little about, and processing is spread across arbitrary devices. We have seen this kind of thing before. It was called the mainframe. Is all progress an illusion?
Next week: Are we retaining too much information?
Martin Brampton is founder of Black Sheep Research, an independent consultancy providing research, writing and speaking services on a wide range of business and technology issues. Martin was previously a director at Bloor Research, and has worked with IT as a user and analyst for over 20 years. He is a longtime contributor to silicon.com and his blog can be found on his website.
Copyright © 2008 CBS Interactive Limited. All rights reserved.