<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr="http://purl.org/syndication/thread/1.0" gd:etag="W/&quot;A0QMQnc8cSp7ImA9WhRRFE4.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403</id><updated>2011-11-27T16:16:23.979-08:00</updated><category term="speed" /><category term="OCR Software" /><category term="OCR results" /><category term="OCR Software Definition" /><category term="OCR application" /><category term="data capture" /><category term="zone ocr software" /><category term="ICR Software" /><category term="distributed ocr" /><category term="Optical Character Recognition" /><category term="hand printing" /><category term="Optical Character Recognition Software" /><category term="settings" /><category term="centralized ocr" /><category term="DPI" /><category term="OCR accuracy" /><category term="Document Capture" /><category term="OMR Software" /><category term="test" /><category term="ocr performance" /><category term="recognition software" /><category term="open source OCR" /><category term="data extraction" /><category term="scanning" /><category term="zone ocr" /><category term="index" /><category term="character correction" /><category term="full text" /><category term="ade" /><category term="handwriting" /><category term="capture" /><category term="OCR" /><title>OCR Software</title><subtitle type="html">Information about Optical Character Recognition Software / OCR Software</subtitle><link rel="http://schemas.google.com/g/2005#feed" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/posts/default" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/" 
/><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><generator version="7.00" uri="http://www.blogger.com">Blogger</generator><openSearch:totalResults>22</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/OcrSoftware" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="ocrsoftware" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry gd:etag="W/&quot;DkMFQH8zfip7ImA9Wx9aF00.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-6534354839531059600</id><published>2011-03-09T13:00:00.000-08:00</published><updated>2011-03-09T13:00:11.186-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2011-03-09T13:00:11.186-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="recognition software" /><category scheme="http://www.blogger.com/atom/ns#" term="handwriting" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR" /><category scheme="http://www.blogger.com/atom/ns#" term="hand printing" /><category scheme="http://www.blogger.com/atom/ns#" term="ICR Software" /><title>What is Intelligent Character Recognition (ICR)?</title><content type="html">So, &lt;b&gt;&lt;i&gt;Optical Character Recognition (OCR)&lt;/i&gt;&lt;/b&gt; is the process of recognizing computer generated text from an image, typically one that is scanned using &lt;a href="http://www.psigen.com/"&gt;document capture software&lt;/a&gt;. 
&amp;nbsp; If you don't know the difference between OCR and Capture, see my other post: &lt;a href="http://ocrsoftware-1.blogspot.com/2010/01/ocr-software-versus-document-capture.html"&gt;OCR vs. Capture&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Intelligent Character Recognition (ICR)&lt;/b&gt; is the process of recognizing hand-printed or handwritten information from a scanned document. &amp;nbsp;It matches patterns of pixels against specific written characters. &amp;nbsp;This form of recognition is typically not as accurate as OCR, but there are several ways to make the accuracy acceptable. &amp;nbsp;The main one is to provide combed fields or spaced boxes on the form, which ensures character spacing, or whitespace between symbols.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-6534354839531059600?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/6534354839531059600/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2011/03/what-is-intelligent-character.html#comment-form" title="3 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/6534354839531059600?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/6534354839531059600?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2011/03/what-is-intelligent-character.html" title="What is Intelligent Character Recognition (ICR)?" 
/><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>3</thr:total></entry><entry gd:etag="W/&quot;CE4GQX07fSp7ImA9WxBaGUU.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-964600966191164331</id><published>2010-03-30T14:02:00.000-07:00</published><updated>2010-03-30T14:02:00.305-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-03-30T14:02:00.305-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="ade" /><category scheme="http://www.blogger.com/atom/ns#" term="data capture" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR" /><category scheme="http://www.blogger.com/atom/ns#" term="Document Capture" /><category scheme="http://www.blogger.com/atom/ns#" term="data extraction" /><title>What is Advanced Data Extraction (ADE)?</title><content type="html">&lt;span style="font-size: large;"&gt;&lt;strong&gt;OCR and Data Extraction&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
A big part of any &lt;strong&gt;OCR solution&lt;/strong&gt; is the process of &lt;a href="http://www.psigen.com/"&gt;data capture&lt;/a&gt; and extraction.&amp;nbsp; Most &lt;a href="http://www.psigen.com/"&gt;document capture&lt;/a&gt; applications can process the converted text and &lt;strong&gt;extract expressions&lt;/strong&gt; from it.&amp;nbsp; So how can this help?&amp;nbsp; Well, parsing the OCR text provides automation: it lets you populate fields based on what you find.&lt;br /&gt;
&lt;br /&gt;
An example might be a form that has &lt;strong&gt;DOB: 1/2/1968&lt;/strong&gt; &lt;br /&gt;
&lt;br /&gt;
Say you want to extract everything to the right of "DOB:" from a document.&amp;nbsp; You can do this with an ADE engine.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-964600966191164331?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
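To make the DOB example concrete, here is a minimal sketch of that kind of extraction using a plain regular expression in Python. It is not how any particular ADE engine works internally; the sample OCR text, the pattern, and the function name extract_dob are all assumptions made up for illustration.

```python
import re

# Hypothetical OCR output for one page; only the DOB line comes from the post.
ocr_text = "Name: J. Smith\nDOB: 1/2/1968\nCity: Springfield"

# Capture everything to the right of "DOB:" up to the end of that line.
DOB_PATTERN = re.compile(r"DOB:\s*(.+)")


def extract_dob(text):
    """Return the text following 'DOB:' on its line, or None if the label is absent."""
    match = DOB_PATTERN.search(text)
    return match.group(1).strip() if match else None


print(extract_dob(ocr_text))
```

The captured value could then be used to populate an index field. A real form would likely need a stricter pattern, for example one that only accepts date-shaped text.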
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/964600966191164331/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/03/what-is-advanced-data-extraction-ade.html#comment-form" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/964600966191164331?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/964600966191164331?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/03/what-is-advanced-data-extraction-ade.html" title="What is Advanced Data Extraction (ADE)?" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>2</thr:total></entry><entry gd:etag="W/&quot;A0AHRHg7eSp7ImA9WxBUGUw.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-8009196576078004389</id><published>2010-03-06T15:55:00.000-08:00</published><updated>2010-03-06T15:55:35.601-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-03-06T15:55:35.601-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="OCR accuracy" /><category scheme="http://www.blogger.com/atom/ns#" term="settings" /><category scheme="http://www.blogger.com/atom/ns#" term="DPI" /><title>OCR and the Right Settings</title><content type="html">&lt;span style="font-size: large;"&gt;What DPI should be set for optimal OCR Accuracy?&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
So, I get this question all the time and decided it might be good to post about it.&amp;nbsp; What is the best DPI setting for Optical Character Recognition (OCR)?&lt;br /&gt;
&lt;br /&gt;
I have been at clients that erroneously believe the higher the DPI, the better the results, and I feel pain whenever I see an &lt;strong&gt;OCR Scanner&lt;/strong&gt; set beyond 300 DPI, and some even at 600 DPI!!&amp;nbsp; Holy cow, how do you handle those file sizes?&lt;br /&gt;
&lt;br /&gt;
The fact remains that almost all OCR engines on the market are tuned and optimized for 300 DPI for optimal conversion and recognition.&amp;nbsp; Going beyond this will typically provide no better results, and will significantly increase your file size: the pixel count grows with the square of the DPI, so a 600 DPI scan holds four times the pixels of a 300 DPI scan.&amp;nbsp; Most &lt;a href="http://www.psigen.com/"&gt;Document Capture&lt;/a&gt; companies provide image processing prior to OCR that will allow you to scan at 200 DPI, with fairly consistent results.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-8009196576078004389?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
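To put rough numbers on the file size point, here is a small Python sketch that computes the raw, uncompressed size of a scanned page at different DPI settings. The page size (US Letter, 8.5 x 11 inches) and the bit depth (1 bit per pixel, black and white) are assumptions for illustration; real files are usually compressed, so actual sizes will differ.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel=1):
    """Raw (uncompressed) image size in bytes for a page scanned at the given DPI."""
    # Pixel count is width times height, and both dimensions scale with DPI.
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Assumed page: US Letter, scanned as 1-bit black and white.
for dpi in (200, 300, 600):
    print(dpi, "DPI:", uncompressed_bytes(8.5, 11, dpi), "bytes")
```

Doubling the DPI from 300 to 600 quadruples the raw size, because both the width and the height in pixels double.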
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/8009196576078004389/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/03/ocr-and-right-settings.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/8009196576078004389?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/8009196576078004389?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/03/ocr-and-right-settings.html" title="OCR and the Right Settings" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry><entry gd:etag="W/&quot;CkcASX88eyp7ImA9WxBUEE8.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-5820672867572284451</id><published>2010-02-24T06:00:00.001-08:00</published><updated>2010-02-24T06:00:48.173-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-24T06:00:48.173-08:00</app:edited><title>OCR Software Post</title><content type="html">&lt;span style="font-size: large;"&gt;OCR Software for Business&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://aiim.typepad.com/aiim_blog/2009/10/8-things-i-learned-about-ocr-from-small-and-midsized-organizations.html"&gt;8 Things About OCR&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-5820672867572284451?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/5820672867572284451/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/ocr-software-post.html#comment-form" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/5820672867572284451?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/5820672867572284451?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/ocr-software-post.html" title="OCR Software Post" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>2</thr:total></entry><entry gd:etag="W/&quot;CUUGQ3w9eCp7ImA9WxBVGU4.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-6559984103910163487</id><published>2010-02-23T05:53:00.001-08:00</published><updated>2010-02-23T05:53:42.260-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-23T05:53:42.260-08:00</app:edited><title>Google Goggles and OCR</title><content type="html">Cool stuff:&lt;br /&gt;
&lt;br /&gt;
&lt;object width="560" height="340"&gt;&lt;param name="movie" value="http://www.youtube.com/v/FYFSNy9FGqA&amp;hl=en_US&amp;fs=1&amp;"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/FYFSNy9FGqA&amp;hl=en_US&amp;fs=1&amp;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="560" height="340"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-6559984103910163487?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/6559984103910163487/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/google-goggles-and-ocr.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/6559984103910163487?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/6559984103910163487?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/google-goggles-and-ocr.html" title="Google Goggles and OCR" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry><entry gd:etag="W/&quot;C0ANR3o-cCp7ImA9WxBVGU4.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-7228150729908134903</id><published>2010-02-23T05:29:00.000-08:00</published><updated>2010-02-23T05:29:56.458-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-23T05:29:56.458-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="OCR accuracy" /><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software" /><category scheme="http://www.blogger.com/atom/ns#" term="character correction" /><title>OCR Software and Character Correction</title><content type="html">&lt;span style="font-size: large;"&gt;&lt;strong&gt;Optical Character 
Recognition and Character Correction&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
So what is &lt;em&gt;character correction&lt;/em&gt; when associated with &lt;a href="http://ocrsoftware-1.blogspot.com/2010/01/what-is-ocr-icr-and-omr.html"&gt;OCR&lt;/a&gt;?&amp;nbsp; The OCR process provides the recognition and &lt;strong&gt;conversion&lt;/strong&gt; of images to text, and characters can be misidentified during this &lt;strong&gt;conversion process&lt;/strong&gt;.&amp;nbsp; Typically, &lt;a href="http://www.psigen.com/"&gt;document capture&lt;/a&gt; applications can identify and fix commonly misinterpreted characters through a table of correction mappings.&amp;nbsp; So let's say a particular &lt;strong&gt;&lt;a href="http://ocrsoftware-1.blogspot.com/2010/02/zone-ocr-and-accuracy-within.html"&gt;zone OCR&lt;/a&gt;&lt;/strong&gt; field was designated as numbers only, and the engine read a "1" as an "l" (that is, a one as a lowercase L).&amp;nbsp; The correction piece of the recognition engine can apply logic to the OCR output and make sure the text is properly interpreted.&amp;nbsp; &lt;br /&gt;
&lt;br /&gt;
This is just one of many ways to improve accuracy, but note that you will need an OCR application that supports this feature.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-7228150729908134903?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
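As a rough illustration of a correction table for a numbers-only zone, here is a minimal Python sketch. The mapping below (DIGIT_CORRECTIONS) and the function name are assumptions invented for this example, not the table any specific capture product ships with.

```python
# Hypothetical correction table for a digits-only zone: look-alike letters to digits.
DIGIT_CORRECTIONS = str.maketrans(
    {"l": "1", "I": "1", "O": "0", "o": "0", "S": "5", "B": "8"}
)


def correct_numeric_zone(raw):
    """Apply the correction table, then report whether the result is all digits."""
    corrected = raw.translate(DIGIT_CORRECTIONS)
    return corrected, corrected.isdigit()


print(correct_numeric_zone("l0O45"))
```

The second value in the result flags zones that still contain non-digit characters after correction, which would be good candidates for manual review.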
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/7228150729908134903/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/ocr-software-and-character-correction.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7228150729908134903?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7228150729908134903?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/ocr-software-and-character-correction.html" title="OCR Software and Character Correction" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry><entry gd:etag="W/&quot;CEcFRHs9eyp7ImA9WxBVF0Q.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-7432043946624769190</id><published>2010-02-19T05:44:00.000-08:00</published><updated>2010-02-21T14:40:15.563-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-21T14:40:15.563-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="index" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR results" /><category scheme="http://www.blogger.com/atom/ns#" term="centralized ocr" /><category scheme="http://www.blogger.com/atom/ns#" term="ocr performance" /><title>Are index fileds really necessary when you have the full text OCR?</title><content type="html">&lt;span 
style="font-size: large;"&gt;&lt;strong&gt;Full text OCR&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Ah, the old debate: do I just perform &lt;strong&gt;&lt;a href="http://ocrsoftware-1.blogspot.com/"&gt;optical character recognition&lt;/a&gt;&lt;/strong&gt; on all my scanned documents, make them &lt;strong&gt;searchable &lt;a href="http://ocrpdf.blogspot.com/"&gt;OCR PDFs&lt;/a&gt;&lt;/strong&gt;, and rely on the &lt;strong&gt;OCR to retrieve documents&lt;/strong&gt;?&amp;nbsp; Why use index fields when I already have all the converted text?&lt;br /&gt;
&lt;br /&gt;
Index fields, created by the indexing process, provide structured data about the documents.&amp;nbsp; This data can be mapped, especially when using document capture software, to columns and index fields in your document management system.&amp;nbsp; Index fields provide faster retrieval, especially if you want to retrieve by specifying several criteria.&amp;nbsp; Relying only on &lt;em&gt;OCR&lt;/em&gt;, or the recognized text, can get you in trouble.&amp;nbsp; First of all, you are assuming that the document will always have recognized text, and that all the items you are searching for are in that text.&amp;nbsp; Secondly, depending on the type of &lt;strong&gt;OCR&lt;/strong&gt; format you have, you may have to find the document first, and then open it and parse out what you are looking for.&amp;nbsp; Full text search can also lead to false positives in retrieval if many documents have the same terms in their &lt;strong&gt;OCR&lt;/strong&gt; text.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-7432043946624769190?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
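The difference between the two retrieval styles can be sketched in a few lines of Python. Everything here is hypothetical: the three sample records, the field names doc_type and vendor, and both search functions are made up to illustrate the point, and a real document management system would use a database and a proper full text index.

```python
# Hypothetical records: index fields plus the OCR full text for each document.
documents = [
    {"id": 1, "doc_type": "invoice", "vendor": "Acme",
     "text": "Invoice from Acme for contract work"},
    {"id": 2, "doc_type": "contract", "vendor": "Acme",
     "text": "Contract with Acme, see invoice terms"},
    {"id": 3, "doc_type": "invoice", "vendor": "Globex",
     "text": ""},  # OCR produced no recognized text for this one
]


def full_text_search(term):
    """Return ids of documents whose OCR text contains the term."""
    return [d["id"] for d in documents if term.lower() in d["text"].lower()]


def index_search(**criteria):
    """Return ids of documents whose index fields match every criterion."""
    return [d["id"] for d in documents
            if all(d.get(key) == value for key, value in criteria.items())]


print(full_text_search("invoice"))
print(index_search(doc_type="invoice"))
print(index_search(doc_type="invoice", vendor="Acme"))
```

In this toy data set, the full text search for "invoice" returns the contract (a false positive) and misses the invoice that has no recognized text, while the index search finds exactly the invoices and can narrow by vendor.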
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/7432043946624769190/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/are-index-fileds-really-necessary-when.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7432043946624769190?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7432043946624769190?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/are-index-fileds-really-necessary-when.html" title="Are index fileds really necessary when you have the full text OCR?" 
/><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry><entry gd:etag="W/&quot;CEcHSHw9eip7ImA9WxBVF0Q.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-1909274509389196455</id><published>2010-02-16T19:43:00.000-08:00</published><updated>2010-02-21T14:40:39.262-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-21T14:40:39.262-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="OCR accuracy" /><category scheme="http://www.blogger.com/atom/ns#" term="zone ocr software" /><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition" /><category scheme="http://www.blogger.com/atom/ns#" term="zone ocr" /><title>Zone OCR and Accuracy within Recognition Zones</title><content type="html">&lt;span style="font-size: large;"&gt;&lt;strong&gt;Zone OCR Accuracy&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
So when doing &lt;strong&gt;&lt;a href="http://ocrsoftware-1.blogspot.com/2009/12/what-is-zone-ocr.html"&gt;zone OCR&lt;/a&gt;&lt;/strong&gt;, or Optical Character Recognition on a portion of a page, what features do I need to ensure I have the best possible accuracy?&amp;nbsp; They are listed below:&lt;br /&gt;
&lt;br /&gt;
&lt;ul&gt;&lt;li&gt;Utilize a &lt;strong&gt;&lt;a href="http://www.psigen.com/"&gt;document capture&lt;/a&gt;&lt;/strong&gt; application that provides some type of page registration.&amp;nbsp; The problem with zone OCR is that most engines use a set template of coordinates on the page, and just repeat this "zone" on each page.&amp;nbsp; If the scanner is off, or the page is skewed, you can get erroneous readings.&amp;nbsp; Page registration gives the &lt;strong&gt;recognition engine&lt;/strong&gt; the ability to anchor on a page feature and always locate the zone relative to that feature's coordinates.&lt;/li&gt;
&lt;li&gt;Utilize a &lt;strong&gt;scanning application&lt;/strong&gt; that provides the ability to perform image processing on the zone prior to running &lt;strong&gt;&lt;a href="http://www.psigen.com/modules/capture/ocr/optical_character_mark_recognition.aspx"&gt;Optical Character Recognition&lt;/a&gt;&lt;/strong&gt;.&amp;nbsp; Removing lines, deshading, and despeckling can provide a cleaner zone, and thus improve overall accuracy.&lt;/li&gt;
&lt;li&gt;Some &lt;strong&gt;&lt;em&gt;advanced capture&lt;/em&gt;&lt;/strong&gt; applications provide the ability to filter zones based on character sets.&amp;nbsp; This allows you to interpret the characters within a zone as, say, all numbers, or perhaps a date, which gives the engine a narrower character set for the whole recognition process.&amp;nbsp; &lt;a href="http://www.psigen.com/"&gt;PSIGEN PSI:Capture&lt;/a&gt;, for example, not only allows character set mapping to &lt;em&gt;zone OCR&lt;/em&gt; templates, but also provides auto-correction for the most commonly misinterpreted characters.&lt;/li&gt;
&lt;li&gt;Finally, and highly recommended for the highest level of accuracy, is the ability to set a character &lt;strong&gt;matching filter for a zone&lt;/strong&gt;.&amp;nbsp; This technology, sometimes called ADE, provides the ability to utilize regular expressions to ensure a match, and lets you overdraw the recognition area / zone and filter to your liking.&lt;/li&gt;
&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-1909274509389196455?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
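The page registration idea from the first bullet can be sketched as a simple coordinate shift in Python. This is an assumption-heavy illustration: the Zone type, the register_zone function, and the pixel coordinates are all hypothetical, and the sketch only handles a page that is shifted, not one that is rotated or skewed.

```python
from typing import NamedTuple


class Zone(NamedTuple):
    """A rectangular recognition zone in pixel coordinates."""
    left: int
    top: int
    width: int
    height: int


def register_zone(template_zone, template_anchor, found_anchor):
    """Shift a template zone by the offset between the expected and detected anchor."""
    dx = found_anchor[0] - template_anchor[0]
    dy = found_anchor[1] - template_anchor[1]
    return Zone(template_zone.left + dx, template_zone.top + dy,
                template_zone.width, template_zone.height)


# Hypothetical template: the zone was drawn when the anchor feature sat at (100, 100).
zone = Zone(left=1200, top=300, width=600, height=80)

# On this scan the same feature was detected at (112, 93), so the zone moves with it.
shifted = register_zone(zone, template_anchor=(100, 100), found_anchor=(112, 93))
print(shifted)
```

Finding the anchor feature on the scanned page is the hard part and is left out here; the sketch only shows how the zone follows the anchor once it has been located.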
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/1909274509389196455/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/zone-ocr-and-accuracy-within.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/1909274509389196455?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/1909274509389196455?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/zone-ocr-and-accuracy-within.html" title="Zone OCR and Accuracy within Recognition Zones" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry><entry gd:etag="W/&quot;CEcDQ306fCp7ImA9WxBVF0Q.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-7717137203213605718</id><published>2010-02-13T07:48:00.000-08:00</published><updated>2010-02-21T14:41:12.314-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-21T14:41:12.314-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="full text" /><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition" /><category scheme="http://www.blogger.com/atom/ns#" term="capture" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software" /><category scheme="http://www.blogger.com/atom/ns#" term="centralized ocr" /><category 
scheme="http://www.blogger.com/atom/ns#" term="open source OCR" /><title>Why use OCR Software to perform full text conversion of images?</title><content type="html">&lt;span style="font-size: large;"&gt;&lt;strong&gt;OCR Software&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
When we scan documents, they are just images, pictures of our paper.&amp;nbsp; For many organizations, this &lt;strong&gt;scanned image&lt;/strong&gt; is exactly what they need, and a little index information about the document is sufficient to provide them with retrieval capability.&lt;br /&gt;
&lt;br /&gt;
So why take the time and spend the money to utilize &lt;strong&gt;OCR Software&lt;/strong&gt; to convert the scanned document to a &lt;strong&gt;searchable&lt;/strong&gt; format?&amp;nbsp; Below are some reasons to always perform full text OCR of scanned documents:&lt;br /&gt;
&lt;br /&gt;
&lt;ol&gt;&lt;li&gt;Always provide every means possible for retrieval.&amp;nbsp;&amp;nbsp;Just using index fields to search for scanned documents may seem like a fantastic idea, but what if the document is misidentified?&amp;nbsp; Or the indexer enters incorrect information?&amp;nbsp; Performing a &lt;em&gt;full text OCR&lt;/em&gt; of the document can provide an insurance policy that a document can always be found through full text search.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.psigen.com/"&gt;Document Capture&lt;/a&gt; software today provides &lt;strong&gt;fast reliable OCR&lt;/strong&gt;.&amp;nbsp; Most capture software on the market provides the ability to automatically convert the documents to searchable format for a small expense.&amp;nbsp; Some of the engines on the market can do the conversion at 100+ pages per minute, so there is really not much time wasted in the &lt;strong&gt;OCR conversion / recognition process&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://ocrpdf.blogspot.com/"&gt;OCR to PDF&lt;/a&gt; for a format that contains both image and text in one container.&amp;nbsp; Adobe provides the PDF image with hidden text option to give you a seachable file format that contains a pristine image.&lt;/li&gt;
&lt;li&gt;Plan for the worst case.&amp;nbsp; Audits...legal issues...sometimes you need to search beyond the index fields, and full text can give you the ability to find the needle in the haystack.&lt;/li&gt;
&lt;/ol&gt;&lt;strong&gt;OCR applications&lt;/strong&gt; give you the means and capabilities to convert images to searchable formats, and there are many reasons to do the full text conversion.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-7717137203213605718?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/l7aDUnlKwBK-AM_DjRxbVIoTLV4/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/l7aDUnlKwBK-AM_DjRxbVIoTLV4/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/l7aDUnlKwBK-AM_DjRxbVIoTLV4/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/l7aDUnlKwBK-AM_DjRxbVIoTLV4/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/7717137203213605718/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/why-use-ocr-software-to-perform-full.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7717137203213605718?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7717137203213605718?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/why-use-ocr-software-to-perform-full.html" title="Why use OCR Software to perform full text conversion of images?" 
/><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry><entry gd:etag="W/&quot;CEYFQn85eyp7ImA9WxBVF0Q.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-43296633732986395</id><published>2010-02-09T21:27:00.000-08:00</published><updated>2010-02-21T14:41:53.123-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-21T14:41:53.123-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="distributed ocr" /><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition" /><category scheme="http://www.blogger.com/atom/ns#" term="Document Capture" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software" /><category scheme="http://www.blogger.com/atom/ns#" term="centralized ocr" /><title>OCR Software - Distributed vs. Centralized</title><content type="html">&lt;span style="font-size: large;"&gt;&lt;strong&gt;OCR Software - Distributed vs. Centralized&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Ah, the centralized versus distributed question...it is one that is continually asked in the scanning, capture and document capture space.&amp;nbsp; Most associate &lt;strong&gt;OCR Software &lt;/strong&gt;with familiar desktop applications like eCopy Desktop, OmniPage, PaperPort, etc.&amp;nbsp; These provide, in a way, distribution of the overall OCR process to end users.&lt;br /&gt;
&lt;br /&gt;
There are applications on the market that can provide centralized and controlled OCR capabilities, through either a server or a workstation deployment.&amp;nbsp; One example is PSI:Capture from PSIGEN, an advanced &lt;a href="http://www.psigen.com/"&gt;document capture&lt;/a&gt; application that allows &lt;strong&gt;centralized OCR processing&lt;/strong&gt;.&amp;nbsp; Why would you want to do this?&amp;nbsp; Well, in most cases, this type of OCR deployment model is utilized in conjunction with a document capture system, for centralized capture, indexing, QA, OCR and migration to a centralized DM / ECM system.&amp;nbsp; Typically, these systems give a broad and expansive feature set, providing all different types of &lt;a href="http://ocrsoftware-1.blogspot.com/2009/12/what-is-ocr-software.html"&gt;OCR functionality&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-43296633732986395?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/tj2Qb1h5EUuGMkCsyAdOBhWKhqw/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/tj2Qb1h5EUuGMkCsyAdOBhWKhqw/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/tj2Qb1h5EUuGMkCsyAdOBhWKhqw/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/tj2Qb1h5EUuGMkCsyAdOBhWKhqw/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/43296633732986395/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/ocr-software-distributed-vs-centralized.html#comment-form" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/43296633732986395?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/43296633732986395?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/ocr-software-distributed-vs-centralized.html" title="OCR Software - Distributed vs. Centralized" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total></entry><entry gd:etag="W/&quot;CEYAQng-fCp7ImA9WxBVF0Q.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-7755110202260975485</id><published>2010-02-02T07:15:00.000-08:00</published><updated>2010-02-21T14:42:23.654-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-21T14:42:23.654-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="speed" /><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition" /><category scheme="http://www.blogger.com/atom/ns#" term="test" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software" /><category scheme="http://www.blogger.com/atom/ns#" term="ocr performance" /><title>How fast is OCR Software? 
OCR Performance Testing</title><content type="html">&lt;span style="font-size: large;"&gt;&lt;strong&gt;OCR Performance Testing&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
So which Desktop Optical Character Recognition Software is the fastest? Has the best overall performance when converting images to Word? When converting images to PDF?&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I ran some testing with 4 basic desktop OCR applications to see which would have the fastest conversion times. The OCR applications are:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
-eCopy Desktop (Uses the ReadIRIS OCR Engine)&lt;br /&gt;
&lt;br /&gt;
-Adobe 8&lt;br /&gt;
&lt;br /&gt;
-PaperPort 11&lt;br /&gt;
&lt;br /&gt;
-OmniPage 15&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I ran all the tests on a 9-month-old laptop with a Dual Core 2 GHz processor and 2 GB of memory. I used the "out of the box" settings on every app, with no performance tuning, and I timed how long each application took to convert a 100 page TIFF image to Word and to Adobe Image and Text PDF.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Results of the OCR Speed Test in minutes and seconds (Word/PDF):&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
eCopy Desktop 4:25/2:58&lt;br /&gt;
&lt;br /&gt;
Adobe 8 3:54/3:22&lt;br /&gt;
&lt;br /&gt;
OmniPage 15 2:16/2:16*&lt;br /&gt;
&lt;br /&gt;
PaperPort 11 2:35**&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
*With OmniPage you run the conversion process and then save to your preferred format.&lt;br /&gt;
&lt;br /&gt;
**PaperPort just had text conversion capabilities.&lt;br /&gt;
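&lt;br /&gt;
To compare these timings with the pages-per-minute figures that OCR engines usually advertise, the results above can be converted to throughput.&amp;nbsp; Below is a minimal Python sketch of that arithmetic.&amp;nbsp; The names (RESULTS, to_seconds, pages_per_minute) are my own for illustration, and the PaperPort run is labeled "Text" on the assumption that its single time covers text output only.&lt;br /&gt;
&lt;br /&gt;
```python
# Timings copied from the results above (minutes:seconds for a 100 page TIFF).
PAGES = 100

RESULTS = {
    "eCopy Desktop": {"Word": "4:25", "PDF": "2:58"},
    "Adobe 8": {"Word": "3:54", "PDF": "3:22"},
    "OmniPage 15": {"Word": "2:16", "PDF": "2:16"},
    # Assumption: PaperPort's single time is for plain text output.
    "PaperPort 11": {"Text": "2:35"},
}


def to_seconds(mm_ss: str) -> int:
    """Convert a 'minutes:seconds' string to total seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def pages_per_minute(mm_ss: str, pages: int = PAGES) -> float:
    """Throughput in pages per minute for one timed run."""
    return pages * 60 / to_seconds(mm_ss)


if __name__ == "__main__":
    for app, runs in RESULTS.items():
        for output_format, elapsed in runs.items():
            rate = pages_per_minute(elapsed)
            print(f"{app} ({output_format}): {rate:.1f} pages/min")
```
&lt;br /&gt;
By this arithmetic, OmniPage's 2:16 works out to roughly 44 pages per minute on this laptop, and eCopy Desktop's 4:25 Word run to roughly 23.&lt;br /&gt;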
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
I have to note that the eCopy Desktop test can be misleading in that it performs auto-orientation on all the pages before performing OCR. Also note that when evaluating an OCR application, speed is not the only factor. You need to decide up front whether you want speed, accuracy, both, or want to focus on formatting. I will write another article on formatting and which application is best in the near future.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-7755110202260975485?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/uxNJ0GM1svlQvjdHRhcNDrth8h0/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/uxNJ0GM1svlQvjdHRhcNDrth8h0/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/uxNJ0GM1svlQvjdHRhcNDrth8h0/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/uxNJ0GM1svlQvjdHRhcNDrth8h0/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/7755110202260975485/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/how-fast-is-ocr-software-ocr.html#comment-form" title="4 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7755110202260975485?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7755110202260975485?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/02/how-fast-is-ocr-software-ocr.html" title="How fast is OCR Software? OCR Performance Testing" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>4</thr:total></entry><entry gd:etag="W/&quot;CEYDQHk6eyp7ImA9WxBVF0Q.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-7917987348170469465</id><published>2010-01-30T08:07:00.000-08:00</published><updated>2010-02-21T14:42:51.713-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-21T14:42:51.713-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition Software" /><category scheme="http://www.blogger.com/atom/ns#" term="scanning" /><category scheme="http://www.blogger.com/atom/ns#" term="Document Capture" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software" /><title>OCR Software versus Document Capture Software</title><content type="html">&lt;strong&gt;&lt;span 
style="font-size: large;"&gt;OCR Software versus Document Capture Software&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;
&lt;br /&gt;
So all &lt;strong&gt;OCR Software&lt;/strong&gt; companies provide the ability to &lt;strong&gt;convert scanned files&lt;/strong&gt; into text or searchable PDFs via the &lt;strong&gt;Optical Character Recognition&lt;/strong&gt; process, but how do I capture/scan the images so the applications can do their conversion?&lt;br /&gt;
&lt;br /&gt;
This is an interesting question.&amp;nbsp; Let's talk about &lt;a href="http://www.psigen.com/"&gt;Document Capture&lt;/a&gt; first.&amp;nbsp; This type of application is built from the ground up to &lt;strong&gt;scan/capture documents&lt;/strong&gt; at a high rate of speed, provide the means to collect information about the documents through a number of means, and then export the document/data to a back end repository.&amp;nbsp; All document capture companies provide all types of OCR options, and usually OEM their &lt;a href="http://ocrsoftware-1.blogspot.com/2010/01/what-is-ocr-icr-and-omr.html"&gt;OCR, ICR, OMR&lt;/a&gt;&amp;nbsp;components from the major OCR application companies, like:&amp;nbsp; &lt;a href="http://www.abbyy.com/"&gt;ABBYY&lt;/a&gt;, &lt;a href="http://www.opentext.com/"&gt;OpenText&lt;/a&gt;, &lt;a href="http://www.opentext.com/"&gt;Nuance&lt;/a&gt;, &lt;a href="http://www.readsoft.com/"&gt;ReadSoft&lt;/a&gt;, etc.&amp;nbsp; Most of these companies have diversified their offering to include document capture, but their offerings fall way short on the capture side in my opinion...they are OCR companies.&lt;br /&gt;
&lt;br /&gt;
The real goal here is to get the best OCR results possible through a powerful OCR engine, and also minimize your time required to scan and process through the best document capture software.&amp;nbsp; So, if you are looking to do high volume OCR processing, I highly recommend choosing a capture application that utilizes your OCR engine of choice to get the best of both worlds.&amp;nbsp; I will write more on this topic in upcoming posts.&amp;nbsp; If you want some guidance on &lt;a href="http://ocrsoftware-1.blogspot.com/2009/12/how-do-i-pick-right-ocr-software.html"&gt;How to pick the right OCR Software&lt;/a&gt;, click on the link text.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-7917987348170469465?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/OXvAl9JBZaGvF-FK50H3P0QRdd4/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/OXvAl9JBZaGvF-FK50H3P0QRdd4/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/OXvAl9JBZaGvF-FK50H3P0QRdd4/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/OXvAl9JBZaGvF-FK50H3P0QRdd4/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/7917987348170469465/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/01/ocr-software-versus-document-capture.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7917987348170469465?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7917987348170469465?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/01/ocr-software-versus-document-capture.html" title="OCR Software versus Document Capture Software" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry><entry gd:etag="W/&quot;CEYNQno_eCp7ImA9WxBVF0Q.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-7942375819928716295</id><published>2010-01-26T05:26:00.000-08:00</published><updated>2010-02-21T14:43:13.440-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-21T14:43:13.440-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="OCR application" /><category scheme="http://www.blogger.com/atom/ns#" term="recognition software" /><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software" /><title>Microsoft SharePoint and OCR</title><content type="html">&lt;span style="font-size: 
large;"&gt;&lt;strong&gt;Microsoft SharePoint and OCR&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://scanningwithsharepoint.wordpress.com/"&gt;Scanning with Microsoft SharePoint&lt;/a&gt;&amp;nbsp;is an interesting endeavor, and typically the main reason for this undertaking is to have a searchable body of information.&amp;nbsp;&amp;nbsp;So what type of &lt;strong&gt;Optical Character Recognition (OCR)&amp;nbsp;Software&lt;/strong&gt; can be utilized with SharePoint?&amp;nbsp;&amp;nbsp; First of all, all the same rules apply in picking the right recognition software to do the conversion from image to text, as outlined in &lt;a href="http://ocrsoftware-1.blogspot.com/2009/12/how-do-i-pick-right-ocr-software.html"&gt;"How do I pick the right OCR Software?"&lt;/a&gt;.&amp;nbsp; You need to evaluate what you are trying to accomplish and look at your business process and workflows to get a good idea of how to initiate the conversion process:&lt;br /&gt;
&lt;br /&gt;
&lt;em&gt;Are your paper images scanned en masse, through a centralized capture process?&lt;/em&gt;&lt;br /&gt;
&lt;br /&gt;
If this is the case, you would typically do all of your &lt;strong&gt;OCR processing&lt;/strong&gt; and recognition in front-end &lt;a href="http://www.psigen.com/"&gt;document capture software&lt;/a&gt;.&amp;nbsp; These applications provide the fastest OCR engines, and their recognition processing speed can be anywhere from 100-600 pages per minute, depending on the types of pages you are scanning.&amp;nbsp; &lt;br /&gt;
&lt;br /&gt;
&lt;em&gt;Do you utilize MFPs / Copiers to scan documents to SharePoint?&lt;/em&gt;&lt;br /&gt;
&lt;br /&gt;
Most companies are trying to leverage their investment in their copier hardware to provide end users a great scanning and capture onramp to SharePoint.&amp;nbsp; In this case, you typically want an OCR application that can provide recognition on the fly, and do the conversion process behind the scenes.&amp;nbsp; There are many MFP integrated applications on the market that can provide the &lt;strong&gt;OCR engine&lt;/strong&gt;: &lt;a href="http://www.psigen.com/"&gt;PSIGEN PSI:Capture&lt;/a&gt;, &lt;a href="http://www.nsius.com/"&gt;NSI AutoStore&lt;/a&gt;, &lt;a href="http://www.ecopy.com/"&gt;eCopy&lt;/a&gt; to name a few.&lt;br /&gt;
&lt;br /&gt;
&lt;em&gt;Do the end users compile, combine and work with documents at their desktops?&lt;/em&gt;&lt;br /&gt;
&lt;br /&gt;
In environments where end users are constantly working in their documents, and need desktop scanning access, typically an &lt;strong&gt;OCR Desktop application&lt;/strong&gt; can be the best solution.&amp;nbsp; These applications can put the control of the conversion process in the end user's hands, and can provide them OCR capability at the click of the mouse.&amp;nbsp; Some apps in this class are &lt;a href="http://www.ecopy.com/"&gt;eCopy Paperworks&lt;/a&gt;, &lt;a href="http://www.nuance.com/products/"&gt;PaperPort and OmniPage&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
All of the OCR Solutions on this page focus on doing the process before the documents hit SharePoint.&amp;nbsp; I will write a later article on solutions that can OCR documents within SharePoint Libraries.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-7942375819928716295?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/5FFyAtFw7EBRER3zkdGuNrFzkQI/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/5FFyAtFw7EBRER3zkdGuNrFzkQI/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/5FFyAtFw7EBRER3zkdGuNrFzkQI/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/5FFyAtFw7EBRER3zkdGuNrFzkQI/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/7942375819928716295/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/01/microsoft-sharepoint-and-ocr.html#comment-form" title="3 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7942375819928716295?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7942375819928716295?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/01/microsoft-sharepoint-and-ocr.html" title="Microsoft SharePoint and OCR" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>3</thr:total></entry><entry gd:etag="W/&quot;CEUFRHg_eSp7ImA9WxBVF0Q.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-6166680377425814270</id><published>2010-01-23T06:43:00.000-08:00</published><updated>2010-02-21T14:43:35.641-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-21T14:43:35.641-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="OMR Software" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software" /><category scheme="http://www.blogger.com/atom/ns#" term="ICR Software" /><title>What is OCR, ICR and OMR?</title><content type="html">&lt;span style="font-size: large;"&gt;&lt;strong&gt;What is OCR, ICR and OMR?&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
In the area of text conversion, there is often confusion on the acronyms that surround the industry, and what each one designates.&amp;nbsp; Below are some quick overviews of each of the recognition technologies, and what they accomplish:&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-size: large;"&gt;&lt;strong&gt;Optical Character Recognition (OCR) Software&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
OCR Software takes images and converts them to searchable text.&amp;nbsp; The output can be a plain text file or, as is the industry standard today, a PDF image with hidden text.&amp;nbsp; OCR can also be utilized to extract data from scanned images, providing a means to either harvest information, or create index fields for later search. &lt;a href="http://en.wikipedia.org/wiki/OCR_software"&gt;OCR Software Definition&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-size: large;"&gt;&lt;strong&gt;Intelligent Character Recognition (ICR) Software&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
ICR Software provides the ability to recognize handwritten or hand printed text.&amp;nbsp; This process can be extremely accurate when the printed text is bound by boxes, or combed form fields.&amp;nbsp; Handwriting is a little more complex, and typically requires many samples to be accurate. &lt;a href="http://en.wikipedia.org/wiki/Intelligent_character_recognition"&gt;ICR Software Definition&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-size: large;"&gt;&lt;strong&gt;Optical Mark Recognition (OMR) Software&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
OMR Software, sometimes called "mark sense", provides the ability to read checked boxes on forms or documents.&amp;nbsp; The software senses the difference between an unmarked and marked box using a baseline reading, and then allows the recognition to take place.&lt;br /&gt;
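&lt;br /&gt;
As a rough illustration of the baseline idea, here is a minimal Python sketch.&amp;nbsp; It assumes the checkbox region has already been cropped and binarized (1 = dark pixel, 0 = light pixel); the function names and the 0.15 margin are hypothetical choices for this example, not how any particular OMR engine works.&lt;br /&gt;
&lt;br /&gt;
```python
from typing import List

Box = List[List[int]]  # 1 = dark pixel, 0 = light pixel


def fill_ratio(box: Box) -> float:
    """Fraction of dark pixels inside a checkbox region."""
    total = sum(len(row) for row in box)
    dark = sum(sum(row) for row in box)
    return dark / total if total else 0.0


def is_marked(box: Box, baseline: float, margin: float = 0.15) -> bool:
    """A box counts as marked when it is clearly darker than the blank baseline."""
    return fill_ratio(box) - baseline >= margin


# Baseline reading: the checkbox on a blank form (only the border is dark).
blank_box = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
# The same region on a filled-in form, with an X drawn inside the border.
checked_box = [
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
]

baseline = fill_ratio(blank_box)

if __name__ == "__main__":
    print("blank marked:", is_marked(blank_box, baseline))
    print("checked marked:", is_marked(checked_box, baseline))
```
&lt;br /&gt;
Comparing against a baseline instead of a fixed darkness level means the printed border of the box does not count as a mark by itself.&lt;br /&gt;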
&lt;br /&gt;
Many manufacturers combine all 3 into a single recognition engine that provides powerful analysis capabilities for scanned documents and forms.&amp;nbsp; &lt;a href="http://en.wikipedia.org/wiki/Optical_mark_recognition"&gt;OMR Software Definition&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-6166680377425814270?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/qSuwAD8E9CDNSzHNTETsM4Tbfe4/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/qSuwAD8E9CDNSzHNTETsM4Tbfe4/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/qSuwAD8E9CDNSzHNTETsM4Tbfe4/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/qSuwAD8E9CDNSzHNTETsM4Tbfe4/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/6166680377425814270/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/01/what-is-ocr-icr-and-omr.html#comment-form" title="3 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/6166680377425814270?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/6166680377425814270?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/01/what-is-ocr-icr-and-omr.html" title="What is OCR, ICR and OMR?" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>3</thr:total></entry><entry gd:etag="W/&quot;CEUHRnY7eSp7ImA9WxBVF0Q.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-4329726330938623356</id><published>2010-01-16T16:53:00.000-08:00</published><updated>2010-02-21T14:43:57.801-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-21T14:43:57.801-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="OCR application" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR accuracy" /><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software" /><title>OCR Software and Image Processing</title><content type="html">&lt;span style="font-size: large;"&gt;&lt;strong&gt;OCR Software and Image 
Processing&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
Why is image processing so important when utilizing &lt;a href="http://ocrsoftware-1.blogspot.com/2009/12/what-is-ocr-software.html"&gt;&lt;strong&gt;Optical Character Recognition&lt;/strong&gt; Software&lt;/a&gt;?&lt;br /&gt;
&lt;br /&gt;
In order to get the highest possible accuracy with your OCR Application, the recognition process needs to have a clean image to examine.&amp;nbsp; The most important image processing functions are &lt;strong&gt;auto-orientation, deskew and despeckle&lt;/strong&gt;.&amp;nbsp; Auto-orientation examines the page and makes sure it is oriented correctly for the whole recognition process.&amp;nbsp; Deskew examines the page for any skewing, which may occur during the scan process, and "rights" the page to make sure the text is in line throughout the page.&amp;nbsp; Despeckle removes speckles on the page that can be falsely identified as characters, or that can cause nearby characters to be misread.&lt;br /&gt;
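&lt;br /&gt;
To make despeckle a little more concrete, here is a minimal Python sketch that clears isolated dark pixels from a binarized page.&amp;nbsp; This is a simplified stand-in of my own (the function names and the min_neighbors rule are hypothetical); commercial engines use more sophisticated filters.&lt;br /&gt;
&lt;br /&gt;
```python
from typing import List

Image = List[List[int]]  # 1 = dark pixel, 0 = light pixel


def dark_neighbors(image: Image, row: int, col: int) -> int:
    """Count dark pixels among the up-to-8 neighbors of (row, col)."""
    count = 0
    for r in range(max(row - 1, 0), min(row + 2, len(image))):
        for c in range(max(col - 1, 0), min(col + 2, len(image[r]))):
            if (r, c) != (row, col):
                count += image[r][c]
    return count


def despeckle(image: Image, min_neighbors: int = 1) -> Image:
    """Return a copy where dark pixels with too few dark neighbors are cleared."""
    return [
        [
            pixel if pixel and dark_neighbors(image, r, c) >= min_neighbors else 0
            for c, pixel in enumerate(row)
        ]
        for r, row in enumerate(image)
    ]


# A tiny page: one stray speck at the top left, one connected stroke at the right.
page = [
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
]

if __name__ == "__main__":
    for row in despeckle(page):
        print(row)
```
&lt;br /&gt;
The stray pixel has no dark neighbors and is cleared, while the connected stroke is left alone because each of its pixels touches another dark pixel.&lt;br /&gt;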
&lt;br /&gt;
Older documents may require other functions, such as font improvement and deshading, to ensure the highest possible accuracy in the overall OCR process.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3486681360798130403-4329726330938623356?l=ocrsoftware-1.blogspot.com' alt='' /&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/1_P2dYcoEr-4WNGwCc6NsCzuBo8/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/1_P2dYcoEr-4WNGwCc6NsCzuBo8/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/1_P2dYcoEr-4WNGwCc6NsCzuBo8/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/1_P2dYcoEr-4WNGwCc6NsCzuBo8/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/4329726330938623356/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/01/ocr-software-and-image-processing.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/4329726330938623356?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/4329726330938623356?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2010/01/ocr-software-and-image-processing.html" title="OCR Software and Image Processing" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry><entry gd:etag="W/&quot;CEUBSHk4fyp7ImA9WxBVF0Q.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-5390822684096437490</id><published>2009-12-30T07:25:00.000-08:00</published><updated>2010-02-21T14:44:19.737-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-21T14:44:19.737-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="scanning" /><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition" /><category scheme="http://www.blogger.com/atom/ns#" term="capture" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software" /><title>Optical Character Recognition (OCR) and Capture</title><content type="html">&lt;strong&gt;&lt;span style="font-size: 
large;"&gt;Optical Character Recognition (OCR) and Capture&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;
&lt;br /&gt;
So what is document capture software and what does it have to do with OCR applications?&amp;nbsp; First, I think we need to differentiate between scanning software and capture software.&amp;nbsp; Here is a good blog post that goes over the differences, with regard to&amp;nbsp;&lt;a href="http://scanningwithsharepoint.wordpress.com/2009/11/24/sharepoint-scanning-vs-capture-software/"&gt;SharePoint Scanning&lt;/a&gt;.&amp;nbsp; Scanning Software just gives you the ability to convert paper to a digital form, and then OCR.&amp;nbsp; Capture Software takes this a step further, and is really a catalyst for some enhanced processing with your recognition engine.&amp;nbsp; Typical capture software will allow you to perform zone OCR, scan multiple documents in a single stack through separation, perform OCR based separation or even analyze the OCR text for expressions and then automatically extract the data.&amp;nbsp; PSIGEN &lt;a href="http://www.psigen.com/"&gt;Document Capture&lt;/a&gt;&amp;nbsp;software provides enhanced data extraction, as an example, as do other vendors like Kofax, AnyDoc, Captiva, etc.&lt;br /&gt;
&lt;br /&gt;
So the whole point here is that OCR software, in most cases, just provides a basic framework for the conversion process.&amp;nbsp; You really need a capture application to harness the true power of any OCR or recognition engine.
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/5390822684096437490/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/12/optical-character-recognition-ocr-and.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/5390822684096437490?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/5390822684096437490?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/12/optical-character-recognition-ocr-and.html" title="Optical Character Recognition (OCR) and Capture" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry><entry gd:etag="W/&quot;DEQCRng8eSp7ImA9WxBSGU4.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-6348138355927230912</id><published>2009-12-27T09:52:00.000-08:00</published><updated>2009-12-27T09:52:47.671-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-12-27T09:52:47.671-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="OCR accuracy" /><category scheme="http://www.blogger.com/atom/ns#" term="recognition software" /><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR results" /><title>How do I 
pick the right OCR Software?</title><content type="html">In the space of &lt;strong&gt;OCR Software&lt;/strong&gt;, or &lt;strong&gt;Optical Character Recognition&lt;/strong&gt;, it can be confusing, to say the least, to decide which option you should pick.&amp;nbsp; It really comes down to the use case, or how you will utilize the software.&amp;nbsp; Below are some good questions to ask yourself:&lt;br /&gt;
&lt;br /&gt;
&lt;em&gt;What do I need to convert with my OCR Software?&lt;/em&gt;&amp;nbsp; &lt;br /&gt;
This question is very important, and it really comes down to what you are looking to output with your software.&amp;nbsp; Do you want a Word file that you can edit, or are you just looking to create a &lt;strong&gt;searchable PDF&lt;/strong&gt;?&amp;nbsp; Many engines are tuned for accuracy and will give you the best-formatted output; others are built for speed.&amp;nbsp; &lt;a href="http://www.nuance.com/imaging/products/omnipage.asp"&gt;OmniPage&lt;/a&gt; is an excellent engine for creating nicely formatted output, but can be rather slow due to its focus on accuracy.&amp;nbsp; A production engine, like &lt;a href="http://www.psigen.com/psicapture-enterprise.aspx"&gt;PSI:Capture&lt;/a&gt;, which offers multiple OCR choices, can give you great flexibility, no matter your output choice.&lt;br /&gt;
&lt;br /&gt;
&lt;em&gt;Are they pre-existing images, or ones that I will scan?&amp;nbsp; PDFs or TIFFs?&lt;/em&gt;&lt;br /&gt;
When you are choosing Optical Character &lt;strong&gt;Recognition Software&lt;/strong&gt;, it is really important to make sure that you have all the functionality you require, whether you are scanning or just processing non-searchable PDFs from a directory.&amp;nbsp; Most &lt;strong&gt;OCR Software&lt;/strong&gt; will let you choose the file to perform recognition on, and some will also let you scan in paper for conversion.&amp;nbsp; If you are utilizing &lt;strong&gt;MFPs or Scanning copiers&lt;/strong&gt;, and want to perform OCR on the &lt;strong&gt;scanned documents&lt;/strong&gt;, you may want to choose a product that performs auto-import, or one that is focused on &lt;a href="http://www.psigen.com/psicapture-for-mfp.aspx"&gt;MFP Scanning&lt;/a&gt;.&amp;nbsp; You also want flexibility in the types of files you can process, so that you can OCR any image type:&amp;nbsp; &lt;strong&gt;PDF, TIFF, JPG, GIF,&lt;/strong&gt; BMP, etc.&lt;br /&gt;
&lt;br /&gt;
&lt;em&gt;How fast can I do conversions?&lt;/em&gt;&lt;br /&gt;
Some engines are built for OCR accuracy, others for speed in the OCR process. Most of the desktop engines, like &lt;a href="http://www.ecopy.com/products_ecopy_desktop.asp"&gt;eCopy Desktop&lt;/a&gt;, provide a good mix of both.&amp;nbsp; Other engines, like Glyphreader or Docustar, let you choose whether you want speed or accuracy in your OCR results.&amp;nbsp; It is always good to choose a &lt;a href="http://www.psigen.com/modules/capture/imaging/core_imaging_scanning_capture_modules.aspx"&gt;document capture&lt;/a&gt; option that gives you multiple OCR engine options for different recognition tasks.&lt;br /&gt;
&lt;br /&gt;
&lt;em&gt;How do I get the best accuracy in the OCR output?&lt;/em&gt;&lt;br /&gt;
All of the OCR software mentioned in this post requires a high-quality image for the best recognition accuracy.&amp;nbsp; With that said, high-quality scanning software with image processing options will lead to the best &lt;strong&gt;OCR accuracy when converting from image to text&lt;/strong&gt;.&amp;nbsp; So what does image processing have to do with OCR software?&amp;nbsp; The cleaner the image, the better the accuracy, and if you can deskew, despeckle, deshade and sharpen text, you will get better OCR results.
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/6348138355927230912/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/12/how-do-i-pick-right-ocr-software.html#comment-form" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/6348138355927230912?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/6348138355927230912?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/12/how-do-i-pick-right-ocr-software.html" title="How do I pick the right OCR Software?" 
/><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total></entry><entry gd:etag="W/&quot;CEUMR3Y9eSp7ImA9WxBVF0Q.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-5460058906767220100</id><published>2009-12-13T13:29:00.000-08:00</published><updated>2010-02-21T14:44:46.861-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-21T14:44:46.861-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition Software" /><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition" /><category scheme="http://www.blogger.com/atom/ns#" term="zone ocr" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software" /><title>What is Zone OCR?</title><content type="html">&lt;strong&gt;&lt;span style="font-size: large;"&gt;What is Zone OCR?&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;strong&gt;Zone OCR Software&lt;/strong&gt; provides the ability to focus on one or more sections (zones) of a scanned document or image.&amp;nbsp; Converting specific zones to text is an important &lt;strong&gt;optical character recognition&lt;/strong&gt; feature, and one that can be applied in just about any type of business.&amp;nbsp; Its main use is to harvest values from images and use them as index values to provide search capability later.&amp;nbsp; Not all zone OCR engines are equal, and you typically need a very accurate engine to produce the required results.&amp;nbsp; Some accurate engines include &lt;a href="http://www.atalasoft.com/Products/DotImage/OCR/GlyphReader/default.aspx"&gt;Glyphreader&lt;/a&gt;, &lt;a href="http://www.captaris-dt.com/product/recostar/en/"&gt;Recostar&lt;/a&gt;, Docustar and many others.&lt;br /&gt;
&lt;br /&gt;
It is often imperative to "clean up" the zone prior to attempting the conversion to text.&amp;nbsp; Clean-up can include line removal, despeckle, deskew, etc., which are found in almost any product that provides &lt;a href="http://www.psigen.com/modules/capture/ocr/optical_character_mark_recognition.aspx"&gt;OCR and Image Processing&lt;/a&gt; features.
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/5460058906767220100/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/12/what-is-zone-ocr.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/5460058906767220100?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/5460058906767220100?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/12/what-is-zone-ocr.html" title="What is Zone OCR?" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry><entry gd:etag="W/&quot;CEQERng5cSp7ImA9WxBVF0Q.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-578917536319135749</id><published>2009-12-07T22:20:00.000-08:00</published><updated>2010-02-21T14:45:07.629-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-21T14:45:07.629-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition Software" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software" /><category scheme="http://www.blogger.com/atom/ns#" term="open source OCR" /><title>Open Source OCR Software</title><content type="html">&lt;span style="font-size: large;"&gt;&lt;strong&gt;Open Source OCR Software&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
The open source movement has created some great &lt;strong&gt;OCR Software / Optical Character Recognition&lt;/strong&gt; Software.&amp;nbsp; Below are links and info:&lt;br /&gt;
&lt;br /&gt;
&lt;strong&gt;OCRopus OCR Software&lt;/strong&gt;&lt;br /&gt;
This project is sponsored by Google and is a state-of-the-art &lt;strong&gt;OCR application&lt;/strong&gt;.&amp;nbsp; It is focused on high-volume OCR needs, and includes a conversion engine, layout analysis, modeling and multi-lingual capabilities.&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://code.google.com/p/ocropus/"&gt;OCRopus OCR Software Download&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;strong&gt;GOCR OCR Application&lt;/strong&gt;&lt;br /&gt;
Developed under the GNU General Public License, it can be used with various front ends to convert images to text, and it accepts a range of image formats.&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://jocr.sourceforge.net/"&gt;GOCR OCR Application Download&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;strong&gt;Tesseract OCR Engine&lt;/strong&gt;&lt;br /&gt;
An engine originally developed by HP in the late 80s, when OCR Software was in its infancy.&amp;nbsp; Google uses the engine in its OCRopus project.&amp;nbsp; &lt;a href="http://www.psigen.com/"&gt;Document Capture&lt;/a&gt; companies like PSIGEN have made the Tesseract Engine an option for advanced capture.&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://code.google.com/p/tesseract-ocr/"&gt;Tesseract OCR Engine Download&lt;/a&gt;
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/578917536319135749/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/12/open-source-ocr-software.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/578917536319135749?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/578917536319135749?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/12/open-source-ocr-software.html" title="Open Source OCR Software" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry><entry gd:etag="W/&quot;A04HQXs5fip7ImA9WxBTF04.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-5231085371455904099</id><published>2009-12-05T08:22:00.000-08:00</published><updated>2009-12-13T13:32:10.526-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-12-13T13:32:10.526-08:00</app:edited><title>What is OCR Software?</title><content type="html">&lt;span style="font-size: large;"&gt;&lt;em&gt;OCR Software and Application Features&lt;/em&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
We typically use the term &lt;strong&gt;OCR (Optical Character Recognition)&lt;/strong&gt; to mean the conversion of an image to text. There are several OCR functions that are utilized in &lt;a href="http://www.psigen.com/"&gt;advanced document capture software&lt;/a&gt;, such as PSIGEN PSI:Capture, and in your everyday scanning software, such as &lt;a href="http://www.vizitsp.com/"&gt;VizitSP&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
For an in-depth definition, go here -&amp;gt;&amp;nbsp; &lt;a href="http://en.wikipedia.org/wiki/Optical_character_recognition"&gt;Wikipedia OCR Software&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So let’s talk a bit about the different OCR functionalities:&lt;br /&gt;
&lt;br /&gt;
&lt;strong&gt;Full Text OCR&lt;/strong&gt;&lt;br /&gt;
&lt;br /&gt;
Full Text OCR takes the entire image and converts it to a text output. The OCR output can be in several formats, including: Searchable PDF, Microsoft Word, Plain Text, HTML, etc. The main goal of full text OCR is typically “searchability”, and the results are usually placed into a backend repository, such as Microsoft SharePoint (For more about SharePoint Scanning and Capture – &lt;a href="http://scanningwithsharepoint.wordpress.com/"&gt;SharePoint Scanning&lt;/a&gt;).&lt;br /&gt;
&lt;br /&gt;
&lt;strong&gt;Zone OCR&lt;/strong&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://ocrsoftware-1.blogspot.com/2009/12/what-is-zone-ocr.html"&gt;Zone OCR&lt;/a&gt; only looks at a particular region, or zone, of the scanned page and converts just that portion to text. There are several reasons to use zone OCR rather than full text:&lt;br /&gt;
&lt;br /&gt;
• You only want to search on the information in that zone.&lt;br /&gt;
&lt;br /&gt;
• Full Text OCR takes much longer, so Zone OCR can speed up processing time.&lt;br /&gt;
&lt;br /&gt;
• You want to extract the contents of the zone, and place it into an index field.&lt;br /&gt;
&lt;br /&gt;
Most advanced document capture applications provide the ability to map the contents of a zone to an index field, which can then provide granular search capabilities based only on that field.&lt;br /&gt;
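&lt;br /&gt;
To make the zone-to-index-field mapping a bit more concrete, here is a minimal Python sketch.&amp;nbsp; It assumes the OCR engine has already returned each word together with its pixel position; the Word class, the ZONES table, the field names and the sample page are all hypothetical and are not taken from any particular capture product.&lt;br /&gt;
&lt;pre&gt;
```python
# Sketch only: the word boxes below stand in for the output of a real OCR engine.
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: int  # left edge in pixels
    y: int  # top edge in pixels


def in_zone(word, zone):
    """Return True when the word's top-left corner lies inside the zone."""
    left, top, right, bottom = zone
    return word.x in range(left, right) and word.y in range(top, bottom)


def zone_text(words, zone):
    """Join the words that fall inside the zone, in reading order."""
    hits = sorted((w for w in words if in_zone(w, zone)), key=lambda w: (w.y, w.x))
    return " ".join(w.text for w in hits)


# Hypothetical zones: index field name mapped to (left, top, right, bottom).
ZONES = {
    "invoice_number": (400, 40, 600, 80),
    "vendor": (40, 40, 300, 80),
}

# Hypothetical OCR output for one scanned page.
page_words = [
    Word("Acme", 50, 50),
    Word("Supply", 110, 50),
    Word("INV-1042", 420, 50),
    Word("Total", 50, 700),
]

# Each index field receives only the text found inside its own zone.
index_fields = {name: zone_text(page_words, zone) for name, zone in ZONES.items()}
print(index_fields)
```
&lt;/pre&gt;
Each index field ends up holding only the text found inside its zone, which is what makes a search on just that field possible.&lt;br /&gt;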
&lt;br /&gt;
&lt;strong&gt;OCR-Assisted Indexing&lt;/strong&gt;&lt;br /&gt;
&lt;br /&gt;
OCR-Assisted Indexing, or point-and-click indexing, gives the user the capability to just click on words or segments of a document and convert that image portion to text. This capability exists in many different capture applications, and provides a simple, easy indexing function on documents.&lt;br /&gt;
&lt;br /&gt;
&lt;strong&gt;Rubberband OCR&lt;/strong&gt;&lt;br /&gt;
&lt;br /&gt;
Rubberband OCR provides the ability to drag a box with the mouse over a portion of text, and automatically convert that segment into text, and even place it into an index field. It is similar to OCR-Assisted Indexing, but allows the capture of large portions of text on a scanned image.&lt;br /&gt;
&lt;br /&gt;
&lt;strong&gt;OCR Separation&lt;/strong&gt;&lt;br /&gt;
&lt;br /&gt;
One of the key challenges in document scanning and capture is the ability to easily split a stack of paper into individual documents. Advanced Document Capture Software can provide the ability to split whenever a key term or word is found on a page through the OCR process. Utilizing the Optical Character Recognition engine in this manner can save on document preparation time before scanning and capture.&lt;br /&gt;
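&lt;br /&gt;
As a rough illustration of the idea, the Python sketch below starts a new document every time a key term shows up in a page's OCR text.&amp;nbsp; The function name split_on_key_term, the key term and the sample stack are made up for this example; a real capture product would run its own engine and separation rules.&lt;br /&gt;
&lt;pre&gt;
```python
# Sketch only: page_texts stands in for per-page OCR output from a real engine.
def split_on_key_term(page_texts, key_term):
    """Start a new document at every page whose OCR text contains key_term."""
    documents = []
    for text in page_texts:
        starts_new = key_term.lower() in text.lower()
        # The very first page always opens a document, even without the term.
        if starts_new or not documents:
            documents.append([])
        documents[-1].append(text)
    return documents


# Hypothetical OCR text for a five-page stack scanned in one pass.
stack = [
    "INVOICE Acme Supply page 1",
    "line items continued",
    "Invoice Globex page 1",
    "INVOICE Initech page 1",
    "terms and conditions",
]

documents = split_on_key_term(stack, "invoice")
print([len(doc) for doc in documents])
```
&lt;/pre&gt;
Pages that do not contain the key term simply stay attached to the document that came before them.&lt;br /&gt;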
&lt;br /&gt;
&lt;strong&gt;Advanced Data Extraction&lt;/strong&gt;&lt;br /&gt;
&lt;br /&gt;
Many of the Document Capture applications on the market today provide a means of extracting data through some type of expression matching, or an extraction engine. OCR Software is utilized to do the text conversion prior to the extraction.&amp;nbsp; For an example of data extraction, see this YouTube Video - &lt;a href="http://www.youtube.com/psigensoftware#p/u/5/aSSgSd5cqEM"&gt;EOB Processing and Data Extraction&lt;/a&gt;.&lt;br /&gt;
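&lt;br /&gt;
Expression matching can be sketched in a few lines of Python with regular expressions run over the OCR text.&amp;nbsp; The patterns, field names and sample text below are assumptions made up for this example; commercial capture products use their own rule syntax and extraction engines.&lt;br /&gt;
&lt;pre&gt;
```python
import re

# Hypothetical expressions; real capture products ship their own rule syntax.
PATTERNS = {
    "claim_number": re.compile(
        r"Claim\s*(?:No\.?|Number)\s*:?\s*([A-Z0-9-]+)", re.IGNORECASE
    ),
    "paid_amount": re.compile(
        r"Amount\s+Paid\s*:?\s*\$?([0-9,]+\.[0-9]{2})", re.IGNORECASE
    ),
}


def extract_fields(ocr_text):
    """Return the first match for each expression, or None when it is absent."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1) if match else None
    return fields


# Hypothetical OCR text from an explanation-of-benefits page.
sample_text = "Claim No: AB-20391 Patient: J. Doe Amount Paid: $1,204.50"
fields = extract_fields(sample_text)
print(fields)
```
&lt;/pre&gt;
The OCR step has to come first, because the expressions can only match text that the engine has already recognized.&lt;br /&gt;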
&lt;br /&gt;
&lt;strong&gt;Forms Identification&lt;/strong&gt;&lt;br /&gt;
&lt;br /&gt;
Another key use of the OCR results can be the identification of documents. &lt;strong&gt;Optical Character Recognition&lt;/strong&gt; can identify key elements on a document, and then determine how to process it based on those elements.
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/5231085371455904099/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/12/what-is-ocr-software.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/5231085371455904099?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/5231085371455904099?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/12/what-is-ocr-software.html" title="What is OCR Software?" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry><entry gd:etag="W/&quot;D0IGQXcyeCp7ImA9WxNaFU4.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-5811566341987931442</id><published>2009-11-29T14:58:00.000-08:00</published><updated>2009-11-29T14:58:40.990-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-11-29T14:58:40.990-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software Definition" /><title>What is OCR? (Optical Character Recognition)</title><content type="html">&lt;span style="font-size: large;"&gt;&lt;em&gt;OCR Definition&lt;/em&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;strong&gt;OCR&lt;/strong&gt; &lt;strong&gt;software&lt;/strong&gt; or &lt;strong&gt;Optical Character Recognition&lt;/strong&gt; &lt;strong&gt;Software&lt;/strong&gt; is a function of certain software applications that provides the means to convert images, or portions of images, to text.&amp;nbsp; Scanned documents are almost always created as non-text image formats, such as &lt;strong&gt;TIFF, PDF, JPG&lt;/strong&gt;, etc.&amp;nbsp; The process of basic OCR makes them searchable, and thus more useful when you require the ability to search the contents of scanned documents.&amp;nbsp; The core system uses a combination of pattern recognition and artificial intelligence to interpret the images and create the most accurate output.&amp;nbsp; Many of the more popular engines provide the ability to output not only to text, but also to word processor formats, HTML, PDF, etc.
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/5811566341987931442/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/11/what-is-ocr-optical-character.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/5811566341987931442?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/5811566341987931442?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/11/what-is-ocr-optical-character.html" title="What is OCR? (Optical Character Recognition)" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry><entry gd:etag="W/&quot;CEQBQH8_fyp7ImA9WxBVF0Q.&quot;"><id>tag:blogger.com,1999:blog-3486681360798130403.post-7619266609756139883</id><published>2009-11-29T14:40:00.000-08:00</published><updated>2010-02-21T14:45:51.147-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-21T14:45:51.147-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Optical Character Recognition" /><category scheme="http://www.blogger.com/atom/ns#" term="OCR Software" /><title>OCR Software</title><content type="html">&lt;span style="font-size: large;"&gt;&lt;strong&gt;OCR Software&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
This is a new blog dedicated to &lt;strong&gt;OCR Software&lt;/strong&gt;, &lt;strong&gt;OCR Technologies&lt;/strong&gt; and &lt;strong&gt;Optical Character Recognition Software&lt;/strong&gt; review.
</content><link rel="replies" type="application/atom+xml" href="http://ocrsoftware-1.blogspot.com/feeds/7619266609756139883/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/11/ocr-software.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7619266609756139883?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/3486681360798130403/posts/default/7619266609756139883?v=2" /><link rel="alternate" type="text/html" href="http://ocrsoftware-1.blogspot.com/2009/11/ocr-software.html" title="OCR Software" /><author><name>Stephen</name><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total></entry></feed>

