Issues and Implications for LAN-Based Imaging Systems
Systems Engineering Division
01 May 1992
Imaging is one of the hottest buzzwords currently being heard throughout the networking industry, but the technology involved is surrounded by confusion. This Application Note introduces imaging, defines what an image is, and looks at traditional and evolving methods for manipulating images. It then describes various issues and implications to be considered for implementing imaging on a LAN.
After a first look at imaging, most people come away confused about the technology. While this reaction is not unusual with new technologies, it seems especially true of imaging. This AppNote aims to cut through some of the confusion, organize the key issues surrounding imaging, and present some of the main implications to consider for LAN-based imaging implementations.
The subject of imaging is far too broad to cover in one AppNote. Simply describing the technological breakthroughs that have made imaging technology possible could fill a small volume. And since almost every issue and technological device for imaging is currently aimed directly at the LAN as the preferred implementation platform, there are numerous network issues to be examined as well.
This introductory AppNote on imaging focuses on two central questions:
What is imaging?
What issues must be considered before implementing imaging on a LAN?
This discussion is geared mainly toward network designers, implementers, and application developers who are wrestling with decisions about how and when to implement imaging-related technology. Currently, NetWare is a suitable platform for imaging applications. The AppNote includes a sample scenario that details areas that must be studied to justify the use of imaging in an organization. Future AppNotes in the series will look at specific imaging hardware and software, process architectures, and potential bottlenecks.
What Is Imaging?
The use of pictures to communicate goes back much farther than petroglyphs to the very core of human thought. We look, we see, we form images in our brains that can be instantly recalled. We even speak using picturesque language to bring vivid perceptions to our listener's mind. Almost everyone enjoys being entertained with vivid pictures that communicate thoughts and ideas. It seems only natural, then, that pictures should be used to communicate on the computer as well.
Over the last several years we've seen a definite trend toward picture-oriented computing. The icons and other graphical elements of today's popular user interfaces are, at the least, a promise of communication by pictures. However, computer imaging technology intends to go even farther and let the pictures themselves contain the actual information.
One reason for the difficulty in defining imaging is that the industry itself is not clear on the matter. Imaging is not a technology of its own; it is a conglomeration of hardware and software applications. Imaging has no single inventor, and no one company can be credited with its introduction. It is true that major companies (Xerox, Kodak, IBM, DEC, Wang, et al) offer imaging products, but they did not invent imaging, nor do they subscribe to a standard for it.
In an environment where standards abound, many imaging implementations were put together with an insensitivity to standards that keeps them from being widely transportable, even on their own networks. Often, the vendors who initiated imaging had no standards to follow beyond a vision of how images could be used in the computer industry.
What Is an Image?
To understand imaging, we must first establish a suitable answer to this question: what is an image? We might say that an image is simply a picture of something, like a photograph. To be stored electronically on a computer, the photograph would have to be converted into a bit map consisting of tens of thousands of dots, as shown in Figure 1.
Figure 1: To the computer, a picture is represented by a bit map made up of tens of thousands of dots.
Of course, anything that could be a snapshot would fit this definition of an image. But then what we have as "images" are huge files that are difficult to work with and take up lots of computer storage space. As shown in Figure 2, a single 36MB bit map would fill up 25 1.44MB floppy diskettes.
Figure 2: The storage requirements for bit maps are typically very large - much more than fits on a single floppy diskette.
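The arithmetic behind these sizes is easy to sketch. In the snippet below, the page dimensions, resolution, and bit depth are illustrative assumptions; only the 36MB-to-25-diskette figure comes from the text.

```python
import math

def bitmap_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Size of an uncompressed bit map: total pixels times bits per pixel."""
    pixels = int(width_in * dpi) * int(height_in * dpi)
    return pixels * bits_per_pixel // 8

def floppies_needed(size_bytes, floppy_mb=1.44):
    """How many 1.44MB diskettes a file of this size would fill."""
    return math.ceil(size_bytes / (floppy_mb * 1024 * 1024))

# A letter-size page at 300 dpi, 8 bits per dot (illustrative assumptions):
page = bitmap_bytes(8.5, 11, 300, 8)
print(f"{page / (1024 * 1024):.1f} MB")    # 8.0 MB for a single page

# The 36MB bit map of Figure 2 fills exactly 25 diskettes:
print(floppies_needed(36 * 1024 * 1024))   # 25
```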
Consider what would happen if we took a group of images, such as the pictures in a book, and filed them away with only one index to all of them. When we went to print one of them, it could easily require printing all of them. Consequently, while the size of the images going into the imaging system may not have been a problem, the size of the images coming out might well be.
Figure 3: Images should be retrievable individually or in selected groups.
Certainly both text and graphics should be included in a set of requirements for what an image is. But an image can be more than just text and graphics. Eventually, electronic images of spreadsheets, reports, and graphics should become part of the images that are captured and stored for other uses, such as snapshots and archival data. These will significantly economize and streamline the paper-handling part of business.
Figure 4: Data, text, graphics, and on-screen information could all become images.
From the moment we create graphic files in the computer, we have captured the images for storage. Yet without a programmed method for accessing them, they cannot be easily retrieved.
Imaging as a Utility
In trying to define exactly what imaging really is, we came up with this helpful idea: imaging should be thought of not as a noun, but as a verb. With that in mind, we propose the following definition.
Imaging: Capturing information graphically and associating it with a pointer.
By this definition, imaging becomes a "utility." We can therefore propose that this utility can have several dimensions, for instance, making a picture of something and extracting data directly from the image with OCR (optical character recognition) techniques. This definition gives us a much better grasp of where the imaging industry is headed in the future.
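Taken as a verb, the definition can be sketched in a few lines of code: capture produces a stored graphic, and imaging is the act of associating it with a pointer. All names below are hypothetical, chosen for illustration only.

```python
# A minimal sketch of "imaging" as defined above: the captured graphic
# is stored once, and what the system really manages is the pointer.
image_base = {}   # pointer -> stored graphic (here, raw bytes)

def capture(document_bytes, pointer):
    """Associate a captured graphic with a pointer."""
    image_base[pointer] = document_bytes
    return pointer

def retrieve(pointer):
    """Follow the pointer back to the stored graphic."""
    return image_base[pointer]

p = capture(b"...scanned bit map...", "PO-1992-0001")
assert retrieve(p) == b"...scanned bit map..."
```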
Since capturing information electronically and making electronic data available as a resource are two of the computer's main functions, it seems only reasonable to expect that data contained in images created in the computer should be accessible in the computer. But that is not necessarily true, especially in business.
Electronic data capture, in the form of data entry personnel keying in data, has been around for quite a while. Yet with all of that data going into the computer, it is interesting that only about 5 percent of the total number of documents that businesses use are available electronically. The other 95 percent of the documents are filed away in the traditional filing cabinet.
In other words, the letter you create with a word processor and mail out is most likely stored in a cabinet, not in a computer, by the recipient. If the letter were imaged at the other end, it could be associated with some method of retrieval.
Moving from EDI to IDE
Traditional electronic data interchange (EDI) is the data entry paradigm proliferated by mainstream imaging today. As illustrated in Figure 5, this paradigm calls for data entry personnel to sit at the computer, review the image on-screen, and enter data from it manually.
Figure 5: Mainstream imaging relies on data entry personnel to review on-screen images and enter data manually.
But data entry personnel performing the tasks of reviewing and entering data is not what imaging is really about. Whereas imaging applications now involve document preparation and data entry, one hope is to eliminate the tedious manual entry of data by utilizing image-aided data entry (IDE).
IDE is a method for directly incorporating data from the scanning process. It is not "see the image on screen and manually enter the relevant fields into the database." Rather, it is "scan the document and the relevant fields are entered automatically." This concept of how to work with images is far removed from treating the image as a picture.
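The "scan and the fields are entered automatically" idea could be sketched as below. The form layout and the trivial `ocr` stand-in are hypothetical; a real system would substitute actual recognition hardware or software.

```python
# Image-aided data entry (IDE), sketched: fixed zones on a known form
# are recognized, and the relevant fields enter the database automatically.
FORM_ZONES = {            # field name -> line number on the scanned form
    "po_number": 0,
    "vendor": 1,
    "amount": 2,
}

def ocr(zone_text):
    # Stand-in for real optical character recognition.
    return zone_text.strip()

def image_aided_entry(scanned_lines):
    """Scan the document; the relevant fields are extracted automatically."""
    return {field: ocr(scanned_lines[line])
            for field, line in FORM_ZONES.items()}

record = image_aided_entry(["  PO-4711 ", " Acme Corp ", " 1,250.00 "])
# record == {"po_number": "PO-4711", "vendor": "Acme Corp", "amount": "1,250.00"}
```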
The Many Faces of Imaging
With the realization that imaging is a multi-faceted utility, an imaging system should be looked at as more than just a storage system whereby an initial image is placed in the computer. Currently, the imaging market could be said to include four areas:
Storage and retrieval
Image processing
Document management
Work-flow
Other areas, such as databases (with which imaging interfaces to a very high degree), are also important.
Storage and Retrieval
Figure 6 illustrates the traditional view of imaging as a storage and retrieval system for scanned-in documents. Note that the entry of key words is a manual procedure to be performed by the operator.
Figure 6: Traditional imaging is a mechanism for storing and retrieving scanned images of documents.
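The store-and-retrieve loop of Figure 6 amounts to a keyword index over stored images. A minimal sketch, with hypothetical names:

```python
# Traditional storage and retrieval: the operator keys in keywords,
# and each keyword points at the stored image(s) it describes.
from collections import defaultdict

keyword_index = defaultdict(set)   # keyword -> set of image ids

def store(image_id, keywords):
    """File a scanned image under operator-entered keywords."""
    for kw in keywords:
        keyword_index[kw.lower()].add(image_id)

def retrieve(keyword):
    """Look up every image filed under a keyword."""
    return sorted(keyword_index[keyword.lower()])

store("img-001", ["invoice", "Acme"])
store("img-002", ["invoice", "Widgets Inc"])
print(retrieve("invoice"))   # ['img-001', 'img-002']
```

Note that the index is only as good as the keywords the operator chose at scan time, which is exactly the limitation image processing tries to remove.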
Image Processing
One source defines image processing as "processing digitized data in a manner which extracts information for object detection, recognition, and classification" (Kendall Preston, Advanced Imaging, May 1990). As illustrated in Figure 7, image processing adds a front-end piece (such as OCR) to the traditional store-and-retrieve scenario so that data can be extracted from the image.
Figure 7: Image processing allows data to be extracted from the image before it is indexed and stored.
Image processing can span the distance between graphics as a scanned image and graphics as converted to text, changing scanned sheets to indexable words.
Scanned images process at different rates depending upon their complexity. Line drawings and graphic textual images have less density, compress faster, transmit more rapidly, and require smaller files. Image processing can be used to identify individual objects, such as type fonts, faces, and shapes, in continuous tone photographic images or line drawings, as shown in Figure 8.
As alluded to earlier, image processing is significant as a methodology within imaging itself. It is mainly directed toward the extraction of information from the image. This is technology that exacts a heavy toll on performance and the checkbook.
Figure 8: Graphic images can be continuous tone (even color) photographs or line drawings before scanning.
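The density point above can be illustrated with a toy run-length encoder: sparse line art produces long runs and compresses well, while dense continuous-tone data does not. Production systems use schemes such as CCITT Group 3/4, not this toy; it is included only to show why complexity drives file size.

```python
# Toy run-length encoder illustrating why sparse images compress better.
def rle(bits):
    """Collapse a string of pixel values into (value, run length) pairs."""
    runs, count = [], 1
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append((prev, count))
            count = 1
    runs.append((bits[-1], count))
    return runs

line_art = "0" * 90 + "1" * 10   # sparse: mostly white space
busy     = "01" * 50             # dense: the value changes at every pixel

print(len(rle(line_art)))   # 2 runs  -> compresses very well
print(len(rle(busy)))       # 100 runs -> no savings at all
```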
Document Management
Document management is different from storage/retrieval and image processing, although it may make use of either or both. Document management might be likened to file cabinet management, where the folders in the cabinets are documents.
This is different from treating each scanned image as a record unto itself. In the simplest application of document management, all of the scanned images become part of the document, and yet all images may need to be accessible individually. This puts a lot of demand on the system that images the documents to track multiple images, possibly in a sorted or categorized order. It may require that image processing be done on the original document during the scanning phase.
All in all, document management can be an extremely complex imaging application implementation.
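The tracking demand described above, where many scanned images form one document yet each page remains individually retrievable, could be sketched as follows (the structure and names are hypothetical):

```python
# Document management sketch: a document is an ordered folder of page
# images, and any single page can still be fetched on its own.
documents = {}   # document id -> ordered list of page image ids

def file_document(doc_id, page_image_ids):
    """File a set of scanned pages as one document, in order."""
    documents[doc_id] = list(page_image_ids)

def get_document(doc_id):
    return documents[doc_id]                    # the whole folder

def get_page(doc_id, page_number):
    return documents[doc_id][page_number - 1]   # one page individually

file_document("PO-4711", ["img-010", "img-011", "img-012"])
print(get_page("PO-4711", 2))   # img-011
```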
Work-Flow Technology
This technological implementation of imaging involves the programming which makes imaging useful to the many instead of just to the few. It holds the promise of delivering the right document to the right place, hopefully at the right time. While this programming is actually an adjunct to imaging technology, it has the potential for the highest return on the invested dollar. According to the imaging authorities, this is the reason for imaging.
A method by which all of the imaging information can be distributed to the people who need it may be what you've been searching for all along, but work-flow technology is not a cure-all. In fact, the network problems it creates are so vast that they need solutions at two levels:
At the network level: the network operating system must be able to handle imaging and work-flow technology.
At the application design level: imaging solutions must put store-and-forward technology to its best use or risk clogging the entire network.
Imaging and Databases
Imaging is a technology that has caused much stir in the database community. It fits well with database operations, allowing pointers in the database to indicate images in the image base. Yet many questions arise about how the relational model would handle images. The interactions between imaging and databases are a subject we'll have to address in a future AppNote.
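One pattern that fits the relational model is to keep the bulky image in a separate image base and store only a pointer to it in the table. A sketch with SQLite; the schema and paths are hypothetical:

```python
import sqlite3

# The relational record stays small; the image lives in the image base,
# and the table holds only a pointer (here, a file path) to it.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE employee (
    name TEXT, city TEXT, photo_ptr TEXT)""")
con.execute("INSERT INTO employee VALUES (?, ?, ?)",
            ("A. Smith", "Provo", "/imagebase/photos/asmith.img"))

ptr, = con.execute(
    "SELECT photo_ptr FROM employee WHERE name = ?",
    ("A. Smith",)).fetchone()
print(ptr)   # /imagebase/photos/asmith.img
```

Queries, sorts, and joins then run against small records, and the image is fetched only when someone actually asks to see it.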
What Imaging Is Not
Now that we've discussed what imaging is, we need to spend a few paragraphs discussing what it is not. Sometimes imaging is confused with multimedia, but imaging does not normally incorporate sound, voice, or video. Imaging is also mistaken for "pen and forms" computing. For many operations, forms-based computing is elegantly efficient when compared to image-based computing. Few organizations want to generate more internal paper than they need to.
The fact that imaging is often confused with other technologies has some scary implications. It hints that perhaps the computer industry may be moving a bit too fast for even the most technical among us to keep up with.
Traditionally, imaging solutions resemble the process shown in Figure 6. When vendors talk about imaging, this is often what they are talking about. What gets scanned into the computer and why are decisions pretty much left up to the company employing the imaging technology and the vendor supplying it.
This process application or process flow is not a process architecture. Some of the architectures employed to obtain imaging services will be covered in a later AppNote. This process flow results in the next set of issues surrounding the implementation of imaging technology: compatibility between products from different vendors.
Because it seems there are almost as many imaging vendors as there are proposed application solutions, you can easily find more than one vendor who can offer you a solution compatible with your needs. With no intention of denigrating the imaging industry, there are far too many vendors with too many different stories for the average company to evaluate. This is a problem that causes another problem: product variation.
Most vendors employ their own file format for image storage; one uses Btrieve. Although image compression techniques vary, several hardware vendors provide translations between one compression type and another. This is a great idea if you only need to send one or two images to someone else who is using a different technology. But it's a useless feature if you are considering transferring many images from one system to another, because you typically have to bring each image up on the screen before you can make the conversion.
Since imaging technology is a conglomeration of different technologies, it is surrounded with standards from each of the participants. Unfortunately, there is no standard method for architecting an imaging system, nor is there a standard for deploying imaging applications. In short:
There are no imaging compatibility standards between imaging systems.
There are no transport standards to migrate images from one computer system to another.
There are no standards for implementing imaging applications.
The Need for a Consultant
Letting a vendor who sells a single imaging solution do the initial design and analysis takes on the aspect of letting the fox run the hen house. There are a few knowledgeable vendors who will not take on a project beyond their capability, but "few" is the key word.
Finding a qualified consultant with experience is a good first step, and for several good reasons:
There are several technologies which should be examined before you make an imaging decision; a good consultant will have the numbers for all of them.
A qualified consultant does not recommend imaging for everything, and probably not for most things.
A qualified consultant would be sure you did all of your homework concerning the legal issues before starting implementation studies.
A qualified consultant will keep your management apprised of the costs and the benefits before implementation.
Having upper management support is vital to implementing an imaging solution. Some vendors or consultants are well suited to clarifying issues, as well as maintaining management support. Imaging can be expensive; it involves upgrades for workstation monitors and graphics controllers, costly scanners, and a lot of time spent in preparing the documents. And it is impossible to enable enterprise-wide imaging without significant preparation of the network. Keeping management support through rising costs and before future returns are realized requires great clarity of purpose.
Personnel movement after imaging implementation becomes a real factor to deal with. There are two types of difficulties with personnel and imaging from the very start. First, people need to be retrained for new positions; paper shuffling tends to go away. Second, imaging can eliminate the need for employees to leave their desks to find documents, thus cutting out much of the "people motion" in an organization. People who use imaging must make a conscious effort to get away from their computer periodically and interact with others around the office.
IDE brings added complexity to EDI issues, specifically the data entry for indexing, query, and reports. With IDE, clerical personnel can often be reassigned as document preparation and entry personnel, incorporating vast amounts of information into the corporate database.
Global System Perspectives
Understanding corporate issues is critical, even when management mandates a single-station solution, or a "grow no farther" solution. In a network system, integration, migration, and topological considerations are no less important.
When you see the "intuitive" power of imaging and document management coupled with the corporate database, system growth becomes a real issue. Data migration to and from an imaging system is also an important consideration. How do you attach images to a database when the original size of a database record (Name, Address, City, State, Zip) might be 250 bytes, while the compressed image of an employee photograph could be 10KB or more, dozens of times the size of the record itself?
An increase of that magnitude is going to cause a stir, especially for a database. Searching and sorting algorithms degrade as a database grows too large. A comparable jump in network traffic may be far too much; network bandwidth is not without limits.
For the network administrator, then, handling increasing data storage and bandwidth can be real issues, depending upon the size and complexity of the imaging implementation decided upon.
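The administrator's arithmetic is worth running. The record and image sizes below come from the example above; the head count is an illustrative assumption.

```python
# How attaching images inflates a database, using the sizes cited above.
record_bytes = 250          # Name, Address, City, State, Zip
image_bytes = 10 * 1024     # compressed employee photograph

records = 10_000            # illustrative head count
plain_db = records * record_bytes
with_images = records * (record_bytes + image_bytes)

print(plain_db // 1024, "KB without images")          # 2441 KB
print(with_images // (1024 * 1024), "MB with images") # 100 MB
print(f"growth factor: {with_images / plain_db:.0f}x")  # 42x
```

Even a modest departmental table balloons from a couple of megabytes into the hundred-megabyte range, which is what drives the storage and bandwidth concerns above.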
Justifying Your Images
Before implementing any new technology, it is a good idea to ask yourself whether there is a real benefit--both strategic and tactical--in that technology. If the answer is yes, the technology warrants a thorough examination. Little questions like "What technology does this new technology replace?" often bring up other propositions that might nullify any justification for a transition, especially a transition to what might be a more expensive technology.
What technologies might nullify imaging as a solution? Certainly there are alternatives:
Computer output to laser disk (COLD)
Computer output to micrographics (COM)
Micrographics technologies (microfilm)
These three techniques are illustrated in Figure 9. The other alternative, of course, is to continue with the current paper-based system.
Figure 9: Alternatives to imaging systems include laser disk and microfilm storage.
Each of these techniques is an archival technique. Each has merit for specific application implementation. COLD might be excellent for optical jukebox retrieval rates; COM is useful for periodic review of document discrepancies.
The point is that you must choose the technology that is right for what you are trying to do. Imaging is an expensive storage and archival technology, but a very beneficial work-flow document technology with incredible side benefits.
As an example of how complex this issue is, consider the plight of the records manager who would like to employ imaging in the area of purchase orders. Here is the scenario.
The records manager knows that Purchasing is retaining the current year's purchase orders. Due to necessary document retrieval, they are kept in filing cabinets in the Purchasing department. Once a year, Purchasing pulls the following documents from its filing cabinets for shipment to the retention facility: the original request for purchase order, the purchase order itself, and any copies and correspondences. At the same time, the Accounting department puts out reams of paper: copies of Purchasing's purchase orders, copies of shipping and receiving documents, and originals of payment vouchers. There is a lot of duplication, and several departments are involved. What's worse, this mountain of paper represents only last year's documents, which exist in addition to the current year's documents.
The records manager knows that there is paper duplication between Purchasing and Accounting. (For now, we'll ignore the legal issue of whether each copy of the purchase order is a truly separate document). Additionally, the records manager knows that if Purchasing and Accounting go to imaging-based processing, no one in the record manager's department will have to go to a file cabinet for a recent document, or to the warehouse and search through the boxes for an archived document. These manual records searches can be expensive, a point to be covered in any justification study.
Having looked briefly from the record manager's point of view, let's look at some of the design questions surrounding an imaging system for that records department.
First, the original request was for the Records department, but from the looks of it, the Purchasing, Accounting, and Receiving departments might benefit more from imaging for the current (not retained) documents. This brings up the question of involving one or all departments and delays an immediate solution.
From another point of view, would internal documents be better off handled in a forms-based computing paradigm, where internal paper is never created and only external and outbound documents are archived? This approach brings with it the added difficulty of obtaining the consent of all department heads after you tell them that every report, data gathering technique, and other procedure in their department will need to be completely recreated electronically. Such is the nature of forms-based computing. There is no paper to create them from, only a data source.
No matter how complex the above issues seem, over the long term they pose potentially great solutions to problems of redundant paperwork. The major difficulty remains that people are not ready to move completely from paper to electronic methods--at least not all in one step. Thus, a transition phase may well be in order, and imaging could well be it. Imaging offers the electronic accessibility of computer documents. The paper flow changes, but the ability to count, locate, and typify from the original image is still there.
The records manager's imaging system to retrieve documents for Purchasing may well be the best initial site. Eventually, the Records department may be the recipient site for snapshots of outbound purchase orders sent directly from computers in Purchasing and stored on optical media.
But even when only one department initiates imaging technology, LAN supervisors need to be wary. Short of a complete forms-based computing installation, imaging is hot technology. Once you put in one installation and show that it serves the community for which it was initiated, several more departments will be anxious to use it.
When Purchasing has its old orders retained as images, you can bet that Accounting will want to look them up electronically. This phenomenon is called "turnpiking." Turnpiking occurs when a solution initially sought by one group is found to be useful by another. Where initially a road was put in for one reason, more and more people find it useful for their own reasons, and the traffic increases.
Ramifications to Be Studied
The above example highlights the issue of turnpiking. We've also alluded to personnel shuffle issues. The offshoot issues from these two considerations alone can take weeks to analyze and integrate into an imaging proposal:
How many departments have documents to be imaged?
What other departments would need them?
What are the retention times for documents in storage?
What would the traffic pattern look like?
What is the cost vs. COLD, COM, micrographics?
If the current technology used by the records manager is a paper-based system with a single computer program to track the information, two comparisons will need to be done to justify the cost of an imaging system. In the first study, micrographic techniques (the use of small-image microfilm) will have to be explored to determine whether they are quicker and cheaper in that specific application. This is the cost-effective alternative technology alluded to earlier.
Additionally, within that study, one would want to consider the reasons for moving to imaging versus paper or micrographics. There are three areas to consider when justifying the move to an imaging system (please forgive the "ute" alliteration):
The cost of an imaging system versus your present system must compute. Imaging is primarily a storage and retrieval technology, in some respects very much like micrographic techniques. Imaging is an expensive technology, while micrographics is cost efficient for what you get.
On the other side of the question, one dispute that leads to a lawsuit against you will resolve the compute question rapidly. However, being sued over a missing document can be just as expensive as being sued over a document you have retained too long. Even after the implementation of imaging technology, retention schedules can carry with them a much higher gravity with regard to potential disaster.
No matter how you wish to justify the matter, please do not let anyone convince you that the compute reason is based upon the National Archives argument of $120 as the average cost to manually retrieve a lost document. Very few people believe this is a common occurrence. Besides, you can lose a document far more thoroughly with an imaging system than with retained paper, given the time to build a large archive of documents.
After the first study gives you a baseline, a second study based on imaging technology and processing can be done. In addition to going over all of the factors involved in the first study, you'll need to examine other issues:
The legality of imaging the documents (state laws)
Retention and disposal schedules for all documents
The effect of imaging storage (media requirements)
Retrieval speed (time required to locate a document)
Imaging Internal Documents
One last consideration is whether imaging will be successful in replacing internal document handling, or whether it should be primarily a method for handling externally-created documents.
Imaging works best for the external document: that document which comes to the enterprise from an outside source, or one which originates within the enterprise, travels around, and returns. With these documents, imaging and image processing techniques can be shown at their finest advantage.
Internal documents can be imaged, but only when they are ready to be archived or "locked down" to their final format.
In the long term, imaging should be looked to for handling external documents before consideration is given to internal documents that need to be streamlined. This is an extremely important consideration in the design of both networks and applications, because handling internal documents electronically from the start shifts the burden to application development and away from network bandwidth.
In a full blown implementation, imaging brings with it some very real management difficulties for the organization. Once you are satisfied with your resolution of the legal issues involving image retention of documents, there are the personnel issues to face. Do not be misled by vendors with regard to personnel or legal issues. Imaging is an extremely powerful technological breakthrough. Some of the ramifications in the workplace have not even been estimated yet.
When and where you decide to implement either a storage and retrieval system or an image processing system can make more difference to your long-term survivability than almost any other technological decision. By its very nature, imaging is most economical on the LAN. There are no mainframe implementation scenarios that can show the long-term economics of LAN-based storage (with the exception of tape storage, which is incredibly slow even on a LAN).
No matter how you decide to control storage requirements and media, turnpiking may well be a major concern. Conceptually, imaging at the departmental level could resemble the early days of printing on the LAN. The only difference is that instead of output from the shared departmental printer, you have documents being input at the departmental scanners and traversing the LAN to individual workstations. The resulting traffic could potentially add up to hundreds or even thousands of kilobytes per second.
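A back-of-the-envelope estimate makes the turnpiking concern concrete. Every rate below is an illustrative assumption, not a measured figure.

```python
# Rough LAN load from departmental imaging traffic.
compressed_page_kb = 50                  # one compressed page image

scanners, pages_per_min = 3, 10          # input side: departmental scanners
scan_kb_s = scanners * pages_per_min * compressed_page_kb / 60

stations, views_per_min = 40, 2          # output side: pages viewed per station
view_kb_s = stations * views_per_min * compressed_page_kb / 60

total = scan_kb_s + view_kb_s
print(f"{total:.0f} KB/s of imaging traffic")   # 92 KB/s
# Turnpiking: let a second department adopt the system and the
# retrieval side of this load roughly doubles.
```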
IDE-assisted EDI is a real consideration for external paper documents for which you have only marginal control or input as to their construction. Of course, you would want to give forms-based implementations some consideration for internally developed applications.
If allowed to develop intra-organizationally, image processing will eventually require some very sophisticated database management. The areas of imaging and database are inseparably linked, as are imaging and work-flow. For organizations located at multiple sites, communications considerations, locality of data, transportability of image format, and security become very real issues.
In this AppNote, we have tried to highlight some of the more penetrating considerations surrounding the implementation and justification of an imaging system. One item to keep in mind is that imaging works on all NetWare platforms, right now. In our next imaging AppNote, we will look at topological issues as well as process architectures for building an imaging solution.
* Originally published in Novell AppNotes
The origin of this information may be internal or external to Novell. While Novell makes all reasonable efforts to verify this information, Novell does not make explicit or implied claims to its validity.