<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0"><channel><title><![CDATA[Capture Latest Wiki Entries]]></title><link><![CDATA[http://www.aiim.org/community/Wiki/Capture]]></link><description /><language>en-us</language><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/aiim/Capture-Latest-Wiki-Entries" /><feedburner:info uri="aiim/capture-latest-wiki-entries" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item><title><![CDATA[Digital Imaging Standard]]></title><link><![CDATA[http://www.aiim.org/Community/Wiki/view/Digital-Imaging-Standard]]></link><description><![CDATA[<i>NOTE: The AIIM Standards Committee, C24, Document Imaging, is seeking the assistance of the AIIM Capture Community to complete a section of a standard it is working on for digital imaging. Please feel free to edit the following.</i><br /><br /><h2 class="separator">2. System design considerations<a class="headeranchor" id="C_System_design_considerations__ABIC_6" href="#C_System_design_considerations__ABIC_6" title="%%%%SectionLinkTextPlaceHolder%%%%">&#0182;</a></h2>As with any project, the place to start is with the system design. Many elements must be considered to achieve an effective imaging system design. 
While not every item discussed in this section needs to be considered, it is important to understand the elements of the system that may affect system performance and scalability.<br /><br /><h3 class="separator">Purposes of Digital Imaging<a class="headeranchor" id="Purposes_of_Digital_Imaging__ABIC_0" href="#Purposes_of_Digital_Imaging__ABIC_0" title="%%%%SectionLinkTextPlaceHolder%%%%">&#0182;</a></h3>One of the main attributes and benefits of a digital imaging system is the ability to provide distributed access to document information to a worldwide audience. It also provides a way to preserve information contained in documents in other formats.<br /><br />Reasons for converting paper documents to images include:<br /><br />1. Establish redundancy – Paper can decay, be lost or stolen, and is susceptible to fire, flood, insects, etc. Digital imaging ensures that a digital copy is available. <br /><br />2. Ensure document security – Losing documents may be costly to any organization, as it may cause a loss of productivity or an embarrassing leak. Having information in a secured environment prevents ad hoc deletions and can reduce risk. Documents are online and can require usernames and passwords for access.<br /><br />3. Enable access to content anywhere – Providing secure image access over the web allows users to find the image they need whenever desired.<br /><br />4. Increase the readership of the document – Easy access to images means that images or a link to images may be sent to users anywhere in the world. Users are no longer tethered to a physical document for the information needed for their job functions.<br /><br />5. 
Reduce copying and shipping costs – Copying paper documents and shipping them by mail or other carrier are significant costs in any paper-intensive operation. Scanning reduces the need for a physical copy and, if a physical copy is needed, it may be printed at the user's location instead of being shipped to the user.<br /><br />6. Reduce on-site and off-site paper storage costs – Reducing the amount of paper in an organization lowers its storage costs. There are costs associated with storing and/or preserving images. However, these costs can be reduced with sound records management and retention policies.<br /><br />7. Provide the foundation of a sound records management solution – Having one digital image instead of multiple paper copies allows the records manager to establish the item of record and ensure that records are truly deleted when the retention policy has been met.<br /><br />8. Reduce costs of sorting/routing, indexing, data entry, and redaction by providing the foundation for automatic data recognition solutions (OCR/ICR, barcode, MICR, CAR/LAR, etc.) and auto-redaction.<br /><br /><h3 class="separator">End Users<a class="headeranchor" id="End_Users__ABIC_1" href="#End_Users__ABIC_1" title="%%%%SectionLinkTextPlaceHolder%%%%">&#0182;</a></h3>As with any computer software solution, the end user must always be considered. Failure to discuss how the documents are used in the organization today may result in a failed implementation.<br /><br />End users need to be confident in the system's ability to access documents. They also need to have confidence that the documents are properly secured. 
The challenge for the imaging solution is to ensure that secure access to images is fast and easy, so that users can quickly find the desired scanned document without a cumbersome process. Replacing paper or effecting any change may be difficult, but making sure end users are aware of the change and considering how images will affect their jobs should help achieve successful job performance.<br /><br />End user design considerations include:<br /><br />1. Computer Operating System – Can the user quickly log in and access data? Are there single sign-on programs enabled that would allow the user's credentials to be cached so that a user name and password do not have to be entered every time the system is accessed?<br /><br />2. Application Integration – <br />a) User – Does the imaging solution work with other applications that users must access? Will the user need to have the imaging solution and other applications displayed as separate applications, or is application integration possible? Will the imaging system provide the means for improved user productivity?<br />b) Automation – Does the imaging solution provide a secure API for access by "backroom" processing applications? Does the solution handle the high volumes of queries and retrievals that such applications can generate? Can the solution provide extensible, structured storage for large amounts of metadata (e.g., full page recognition, image transactional type, indices, data fields, etc.), or provide connections to other systems for such storage?<br /><br />3. Monitor Size – Is the typical monitor size large enough for the user to read the data that exists on the image? What default image zoom reduces the need for scrolling up, down, left, or right on the image?<br /><br />4. Printing – Will printing be allowed? If yes, what will be the mechanism to ensure the printed copy is secure? Will color printing be allowed? Is the ability to audit the print function required?<br /><br />5. 
Software Access – Are the searching capabilities easy to use? Must the users choose several options to get to an image? Are there default values and saved searches available? Does access to the content need to be audited?<br /><br />6. External Devices – Will end users be able to save images to a USB drive or an external hard drive? Will the user be able to save a copy to a local hard drive?<br /><br />7. Content/Metadata Editing – Will end users be able to manipulate the content or metadata? Does the system provide version control?<br /><br />Considering the end user in the design of a digital imaging project will assist in user acceptance and help ensure the success of the project.<br /><br /><h3 class="separator">Audience<a class="headeranchor" id="Audience__ABIC_2" href="#Audience__ABIC_2" title="%%%%SectionLinkTextPlaceHolder%%%%">&#0182;</a></h3>Separate from the end user experience is the audience that may need access to the images. Documents created or used by one department may need to be used by many different departments in the organization or externally.<br /><br />1. Security – Who must see the document for his or her job? Does the user need view-only access? Should the user have the ability to annotate or add notes to an image? Should the user be able to modify the metadata?<br /><br />2. Redaction – Are there parts of the image that may only be seen by a small group of users? Redacting or obscuring part or most of an image may be needed to protect personally identifiable information. This is particularly important for government organizations that must release data and for any organization that deals with private, restricted, or confidential information. <br /><br />3. Role – What is the role of each group of users? Content creator? Information user? 
Information manager? Security officer concerned with access to the information? Understanding each group's need for the data helps ensure that the appropriate information is available and reduces risk by eliminating unnecessary document access. <br /><br /><h3 class="separator">Image Formats<a class="headeranchor" id="Image_Formats__ABIC_3" href="#Image_Formats__ABIC_3" title="%%%%SectionLinkTextPlaceHolder%%%%">&#0182;</a></h3>When a piece of paper is scanned, the resulting electronic document may be created in many different formats. There is a wide variety of image formats, and each type has its own set of characteristics.<br /><br />For example, TIFF and PDF can each contain different embedded formats, and both allow multiple pages in a single file. <br /><br />In addition, the format of the resulting image will affect storage requirements, image quality, and system performance. Some file compression types reduce storage requirements at the cost of image quality; these are "lossy." Those that do not lose image quality are "lossless." Refer to ISO 12033 for details on selecting the appropriate file format.<br /><br />When OCR is part of the process, TIFF Group 4 is the ideal input format for OCR accuracy. The post-OCR output can be in any digital or image format available in the OCR application.<br /><br />Some standard and industry-standard image file formats and file compression considerations include:<br /><br />Output Formats:<br />1. TIFF (Tagged Image File Format) – <br />    TIFF multibit <br />    TIFF Group 4 single bit<br /><br />2. BMP<br /><br />3. LZW <br /><br />4. PNG<br /><br />5. JPEG2000<br /><br />6. JPEG<br /><br />7. GIF<br /><br />8. JBIG/JBIG2<br /><br />9. 
PDF, PDF/A, PDF/E<br />  <br /><a class="unknownlink" href="Replace%20list%20above%20with%20table.ashx" title="Replace list above with table">Replace list above with table</a><br /><br /><br /><b>Metadata and Indexing</b><br />The ability to quickly locate and access the images must be considered in system design. Added metadata and indexing enhance the user's ability to quickly locate the desired image and will increase the chances of a successful imaging solution implementation.<br /><br /><h3 class="separator">Imaging Architecture<a class="headeranchor" id="Imaging_Architecture__ABIC_4" href="#Imaging_Architecture__ABIC_4" title="%%%%SectionLinkTextPlaceHolder%%%%">&#0182;</a></h3>The hardware and network components used in a digital imaging system include:<br /><br />Create a table with definition, use, required/optional, characteristics<br /><br />1. Server Hardware<br /><br />2. Web Server<br /><br />3. Capture devices<br /><br />4. User workstation<br /><br />5. Storage and backup<br /><br />6. Network<br /><br />7. Printers<br /><br />8. Remote Access<br /><br />9. Handheld Devices<br /><br /><h3 class="separator">???????<a class="headeranchor" id="ABIC_5" href="#ABIC_5" title="%%%%SectionLinkTextPlaceHolder%%%%">&#0182;</a></h3>1. – System backups ensure that the images will not be lost, and hot sites or server clusters ensure that images will always be available. ????? 
(fit under Imaging Architecture)]]></description><category domain="http://www.aiim.org/Community/search/keyword?w=imaging"><![CDATA[imaging]]></category><category domain="http://www.aiim.org/Community/search/keyword?w=documentmanagement"><![CDATA[documentmanagement]]></category><category domain="http://www.aiim.org/Community/search/keyword?w=capture"><![CDATA[capture]]></category><pubDate>Fri, 23 Sep 2011 15:55:52 GMT</pubDate><dc:creator><![CDATA[Rick Laxman]]></dc:creator><guid /></item><item><title><![CDATA[Distributed Capture]]></title><link><![CDATA[http://www.aiim.org/Community/Wiki/view/Distributed-Capture]]></link><description><![CDATA[The Internet, greater bandwidth, and smaller, less expensive scanners (and MFDs) enable paper to be inserted into the business process at the point of creation. Rather than being collected and shipped to a central location, documents can be scanned on the same day.<br /><br />Distributed scanning can support branch operations and even mobile workers, but scanning is not those workers' main job. To be successful, a distributed capture application needs to be as simple as possible to use while meeting business needs.]]></description><pubDate>Mon, 02 Aug 2010 23:26:15 GMT</pubDate><dc:creator><![CDATA[Bryant Duhon]]></dc:creator><guid /></item><item><title><![CDATA[Document Preparation]]></title><link><![CDATA[http://www.aiim.org/Community/Wiki/view/Document-Preparation]]></link><description><![CDATA[Prior to scanning, especially high-volume scanning, documents have to be manually prepared. Document preparation is time-consuming and too often underestimated, and poor document prep will also slow down document throughput (costing extra time and money). The goal is to ensure that documents are successfully scanned while minimizing paper jams, misfeeds, double feeds, and image quality problems. There are two basic kinds of preparation:<br /><br /><strong>1. 
Physical preparation of documents.</strong> This includes removing all bindings from documents and files, such as paper clips, staples, binder clips, three-ring binders, and rubber bands. Separator sheets are inserted in place of bindings in order to keep track of pages that go together in a document or file. It also includes mundane activities such as moving “Post-it” notes to open areas on a page or onto a clean sheet of paper so as not to cover up important information.<br /><br /><strong>2. Batch preparation of documents.</strong> Activities include indexing documents and organizing them by date, type, case, instance, occurrence, or whatever way the application or job requires that they be sorted.<br /><br /><strong>TIPS FOR DOCUMENT PREPARATION</strong><br /><strong>1. Throw out unneeded documents.</strong> Folders often include extra copies of different documents. Getting rid of them diminishes the number of pages that must be scanned, which can reduce licensing costs.<br /><br /><strong>2. Sort out over-sized documents that may be heeled or folded.</strong> Such documents need to be separated into individual sheets.<br /><br /><strong>3. Fan the paper after removing staples.</strong> Fanning helps to separate sheets that are stuck together where the document was stapled. This is also true for documents with hole punches. Multiple sheets of paper that are hole-punched together tend to stick together at the punch point.<br /><br /><strong>4. On documents with hole punches at the top of the page, turn the sheets upside down to prevent scanner misfeeds.</strong> Scanner pick rollers tend to line up with the two-hole punches at the top of the page, which can result in mispicks. Many high-production scanners have settings that automatically rotate the image 180 degrees immediately upon scan, so documents may be scanned upside down, then automatically rotated to the correct orientation prior to saving.<br /><br /><strong>5. 
Photocopy unscannable pages so that the copies may be scanned instead.</strong> Keep track of where to replace the original paper when scanning is completed.<br /><br /><strong>6. Ensure that all of the paper sheets have been jogged so that the tops of all of the pages are evenly aligned.</strong> Use a paper jogger to even out and align the pages.<br /><br /><strong>7. When scanning documents of different lengths, make sure they are at least the same width.</strong> Check for odd-sized documents and scan them one at a time. For documents such as paycheck stubs, W-2 forms, and so forth, photocopy the document prior to batch scanning, or attach the document to a standard letter-size page. If smaller documents need to be attached to a letter sheet, take the following precautions:<br /><ul><li>Photos, attachments, and varying types of paper affect scanning. Eliminate unneeded materials and single feed the rest.</li><li>Don’t use glue because it gums up scanners; use tape instead. Cellophane tape is slippery and causes pick roller misfeeds, so avoid using it. A rough-surface transparent tape makes the pick roller less likely to slip.</li><li>Tape the document in the middle of the letter sheet, not the top. A good portion of the page should be loaded in the feeder before it hits the attached sheet.</li><li>Tape documents straight on the page. This will provide a better image and ensure smoother feeding.</li><li>Tape both the top and bottom of the document for most of the document length. The scanner will most likely rip the document off the letter-sized sheet if you don’t tape the bottom of the odd-sized sheet.</li><li>Tape wrapped around a roller can be difficult to remove and clean. You will need to clean the scanner more frequently when using a lot of taped documents or the scanner will eat them.<br /></li></ul><br /><strong>8. 
Barcode recognition.</strong> To optimize barcode recognition, follow these guidelines:<br /><ul><li>Laser printers should be set at 300 dpi for printing barcodes. Inkjet can be used occasionally if an equivalently high dpi setting is used, along with a full inkjet cartridge and good quality paper, but don’t use inkjet printers for production-level printing.</li><li>Keep frequent tabs on the barcode printing quality and the toner cartridge. Replace the cartridge at the first indication of fading or degradation, regardless of how much toner is left in it.</li><li>Do not reuse barcode separator sheets. If you must, don’t reuse them too often. Make sure that they are clean and fully legible. Document separators that are reused too much become dirty or smudged, or the toner flakes off the page, so discard them as soon as these warning signs appear.</li><li>Do not repeatedly photocopy separator sheets. Create a set of master sheets, and then copy those only. Don’t make copies of copies. With each copy generation, the barcode gets slightly reduced in size and image quality.<br /></li></ul><br /><strong>9. Scanner maintenance.</strong> Cleaning instructions are there for a purpose, so adhere rigidly to the daily maintenance instructions and follow these rules:<br /><ul><li>Replace used ink and toner when recommended. Monitor the frequency of misfeeds, double feeds, and mispicks for signs of scanner wear and tear.</li><li>Periodically check the scanner feed areas to remove any staples, paper clips, paper fragments, or other objects that may have fallen into them.</li><li>Another cause of double feeds, jams, and mispicks is dust accumulation. Clean the paper path daily with scanner cleaning fluid to remove dust accumulation on the rollers, which makes them slippery.</li><li>Clean the lamp lenses daily to ensure that residue does not build up. When scanning, inspect images to check for vertical or horizontal black or white stripes. 
If stripes appear, stop scanning and clean the scanner.</li><li>When cleaning, also check the camera lens and lamp areas for scratches that a staple or paper clip may have caused. Replace all scratched components.<br /></li></ul>]]></description><pubDate>Fri, 30 Jul 2010 12:13:20 GMT</pubDate><dc:creator><![CDATA[Bryant Duhon]]></dc:creator><guid /></item><item><title><![CDATA[Forms Processing]]></title><link><![CDATA[http://www.aiim.org/Community/Wiki/view/Forms-Processing]]></link><description><![CDATA[Automated forms processing is used to capture data on forms that are external to a company: forms that are filled in by manual means (using hand print, machine print, and check boxes) and then returned to a centralized location for batch processing. Imaged hand or machine print is of little value until it is converted into computer-usable (ASCII) data. Since over 80% of all business documents are forms, converting forms by manual data entry constitutes an enormous expense, which can be significantly diminished through the use of recognition-based, automated forms processing.<br /><br /><strong>Steps in Forms Processing</strong><br />Forms automation relies heavily on intelligent character recognition (ICR) and involves an eight-step process. Any forms processing solution, whether turnkey or built from modular components, must use software that goes through these stages to convert bitmapped image data to ASCII data:<br /><br /><strong>1. Scanning</strong> – Pages of forms are scanned and converted into bitmapped (usually TIFF) images, which are either compressed and stored for later batch processing or passed immediately in an uncompressed format to an ICR engine for recognition.<br /><br /><strong>2. Image Analysis</strong> – The document image is cleaned up. Character images are enhanced using image enhancement techniques. 
The document is identified, and its image is registered and deskewed so that the recognition zones containing the fields designated for recognition can be located by a predefined recognition template. The template identifies which individual fields on the form image require recognition and what the nature of each field is: barcode, signature, hand print, machine print, numeric, alphabetic, alphanumeric, etc.<br /><br /><strong>3. Document Image Background Removal</strong> – This stage is not necessary if the document is a form that was originally printed in a colored (“drop out”) ink that is invisible to the scanner being used. Otherwise, the form image may contain lines, boxes, fine print, and other form attributes that tend to confuse the ICR engine. These form attributes must be removed from the image of the form so that only the character images are left behind. Broken and fragmented characters are then repaired and restored to their original shapes.<br /><br /><strong>4. Character Location</strong> – The fields that contain character data are located by the predefined recognition template.<br /><br /><strong>5. Character Segmentation</strong> – Custom software routines analyze, separate, and break down the character fields into isolated characters.<br /><br /><strong>6. Character Classification</strong> – Individual characters are classified by ICR algorithms according to their ASCII category and assigned a confidence value, which is an index of how “certain” the ICR engine is about the selection it has made. Alternate character choices are ranked according to those values so that they can be incorporated into editing procedures that improve ICR accuracy. For example, the alternate choice “1” might be used instead of the first-ranked choice “I” when contextual analysis reports that the field is all numeric.<br /><br /><strong>7. 
Post-processing</strong> – The initial or “raw” recognition results are validated using edit procedures such as grammatical rules, spell-checkers, dictionaries, check-sum routines, and look-up tables. Ambiguous and erroneous data fields (the “rejects”) are identified and set aside.<br /><br /><strong>8. Manual Correction of Rejected Character Fields</strong> – Rejects are sent to data entry operators at workstations for manual correction. Unlike generic text recognition applications that use standard English-language spell-checkers, forms processing applications always contain words and terms that are idiosyncratic to the industry of the form that is being processed. Consequently, forms processing applications require more sophisticated validation routines than full-text processing applications.<br /><br />Forms automation accuracy can be improved by redesigning a form specifically for recognition. The idea is to force the separation of the fixed elements of a form, such as lines and fine print (the “passive data”), from the hand and machine print that a user enters on a form (the “active data”). This is accomplished by printing the fixed form elements in carbonless, colored ink that is invisible to a scanner and therefore “drops out” of the form image, leaving only the hand and machine print characters visible to the scanner.<br /><br />Forms automation is also used to process so-called “unstructured” documents such as invoices, which is done without the aid of a predefined recognition template.<br /><br /><a class="unknownlink" href="From-an-original-column-written-by-Arthur-Gingrande-of-IMERGE-Consulting-.ashx" title="From-an-original-column-written-by-Arthur-Gingrande-of-IMERGE-Consulting-">From an original column written by Arthur Gingrande of IMERGE Consulting</a>]]></description><pubDate>Fri, 30 Jul 2010 12:49:23 GMT</pubDate><dc:creator><![CDATA[Bryant Duhon]]></dc:creator><guid /></item></channel></rss>

