<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>In-Depth Understanding</title>
	
	<link>http://blogs.northernlight.com/ceo</link>
	<description>by David Seuss, CEO of Northern Light.  Search engines must evolve to have an in-depth understanding of the searched material.  It is necessary that the search engine grasp the business purpose for the search and that search goes beyond presenting document lists to users.  The search engine must interpret and analyze the search results and then present findings that would be considered most significant by the user if the user were able to read all of the retrieved documents.</description>
	<lastBuildDate>Wed, 17 Feb 2010 19:57:02 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/DavidSeuss" /><feedburner:info uri="davidseuss" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><creativeCommons:license>http://creativecommons.org/licenses/by-nc-nd/2.0/</creativeCommons:license><feedburner:emailServiceId>DavidSeuss</feedburner:emailServiceId><feedburner:feedburnerHostname>http://feedburner.google.com</feedburner:feedburnerHostname><item>
		<title>Why Search Is the Wrong Answer To the Wrong Problem</title>
		<link>http://feedproxy.google.com/~r/DavidSeuss/~3/aXRbQvbN8rI/</link>
		<comments>http://blogs.northernlight.com/ceo/2010/02/17/why-search-is-the-wrong-answer-solving-the-wrong-problem/#comments</comments>
		<pubDate>Wed, 17 Feb 2010 19:52:03 +0000</pubDate>
		<dc:creator>David Seuss</dc:creator>
				<category><![CDATA[Connects]]></category>
		<category><![CDATA[SinglePoint]]></category>

		<guid isPermaLink="false">http://blogs.northernlight.com/ceo/?p=148</guid>
		<description><![CDATA[Search is the enterprise application that suffers the most from reinvention of the wheel.  While you are researching a business opportunity, competitive development, or market event using a search engine, other people in your company are probably repeating the same search.  They may be in an office near you, on another floor of the building, across the corporate [...]]]></description>
			<content:encoded><![CDATA[<p>Search is the enterprise application that suffers the most from reinvention of the wheel.  While you are researching a business opportunity, competitive development, or market event using a search engine, other people in your company are probably repeating the same search.  They may be in an office near you, on another floor of the building, across the corporate campus, or in a business unit on another continent.  No enterprise application loses more productivity than search: many times each day, employees in different locations, divisions, and product groups of the same company, unbeknownst to each other, search their enterprise repositories and licensed market intelligence information for the same (or closely related) pieces of information.  The search engine industry has even institutionalized this problem, making it a “feature not a bug,” by popularizing notions such as “most popular queries.”  Think about it: most popular queries?  How, again, is it good that the same query is repeated over and over?</p>
<p> </p>
<p>Not only are huge amounts of professional labor wasted, the results and success of these duplicated searches vary widely based on the different levels of skill and domain expertise of the individuals conducting them.  That means some people (the less proficient searchers and those with less time to spend on the search process) will end up making less well-informed decisions, because they didn’t find the most relevant information, the material with the best commentary, analysis, and perspective when they did their research. </p>
<p><span id="more-148"></span></p>
<p> </p>
<p>The good news is search applications are evolving from “find me a list of documents to read” to “find me a person that knows the answer.”   It turns out that a particular capability that strategic research applications can borrow from the Web 2.0 world can form a basis for turning the search engine into a collaborative environment.  That capability is bookmarking and tagging.  </p>
<p> </p>
<p>For instance, you are researching a particular topic and find a report especially insightful or useful.  You want to be able to recall this document later.  Of course, all search applications suffer from the same problem: the document you loved was hit number 3 today, but four months from now, when you want to find it again, it is hit number 367 because so much new content has been added.  And you have to remember how you found it in the first place…what were those query terms, was the document from this source or that one, was its title this or that?   The solution to this problem is to bookmark the document and tag it with a useful word or two to help you find it later: tags that indicate why you liked it in the first place.  Maybe you also put a note on the document explaining even more, a note that becomes part of the document record and travels with it.  To find the document again later, you simply click on the relevant word in a tag cloud of all of your bookmarked documents and presto! there it is.</p>
<p> </p>
<p>Users bookmark and tag documents to make their own lives easier, not for altruistic reasons.  They are not taking their precious time “rating” documents for unknown future users who may or may not benefit from the ratings.  (You don&#8217;t rate for yourself, which explains why document rating applications never work in the enterprise.)  But who cares what the motivations are – they’ve done it.  A strategic research application can observe each user’s tagging behavior and then expose it in other places to other users who will benefit from seeing the tags and notes when the context is right.   Such a strategic research application becomes a vehicle not only for discovering documents that have already been deemed especially useful or relevant in the context of a particular research inquiry, <em>but also</em> for identifying the individuals who were there first and most often.  So the search application can lead subsequent researchers to a domain expert and natural collaboration partner without that person doing anything other than what he or she already does for his or her own purposes.  This has the effect of greatly levering that domain expert&#8217;s impact on the business.</p>
<p> </p>
<p>Users of such a “collaborative” enterprise search application get to know over time those individuals in their extended organization who have research interests similar to their own, and can subscribe to the bookmarking behavior of those individuals, following them a la Twitter.   Users can follow the path of breadcrumbs to the internal experts and go directly to them with the hardest, most important research problems.  What’s more, each user can affect the relevance ranking of documents identified through a particular search query through their tagging activity.  There is no mathematical algorithm that a search engine company will ever come up with for relevance ranking that can match the intelligence of a domain expert deeming a document worthy by tagging it and by providing additional texture through the notes and commentary they add. </p>
<p> </p>
<p>It’s time to advance to enterprise search applications that lead people to the domain experts on a topic, as well as to documents that contain information about it.  Better than asking a search engine a question is asking a person that knows the answer.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.northernlight.com/ceo/2010/02/17/why-search-is-the-wrong-answer-solving-the-wrong-problem/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://blogs.northernlight.com/ceo/2010/02/17/why-search-is-the-wrong-answer-solving-the-wrong-problem/</feedburner:origLink></item>
		<item>
		<title>Beyond Sentiment Scoring</title>
		<link>http://feedproxy.google.com/~r/DavidSeuss/~3/Uc7HNBd_g2g/</link>
		<comments>http://blogs.northernlight.com/ceo/2009/10/26/beyond-sentiment-scoring/#comments</comments>
		<pubDate>Mon, 26 Oct 2009 19:38:03 +0000</pubDate>
		<dc:creator>David Seuss</dc:creator>
				<category><![CDATA[MI Analyst]]></category>

		<guid isPermaLink="false">http://blogs.northernlight.com/ceo/?p=140</guid>
		<description><![CDATA[There are times I think the text analytics industry has painted itself into a corner with sentiment scoring.  Not too long ago I attended an industry event in which every provider of text analytics that presented talked about how their solution could do sentiment scoring, and also a few other things.  Speaker after speaker, I thought [...]]]></description>
			<content:encoded><![CDATA[<p>There are times I think the text analytics industry has painted itself into a corner with sentiment scoring.  Not too long ago I attended an industry event in which every provider of text analytics that presented talked about how their solution could do sentiment scoring, and also a few other things.  Speaker after speaker, I thought the &#8220;few other things&#8221; mentioned in passing were way more useful than sentiment scoring.  But the speakers appeared to feel that sentiment scoring was what text analytics is about, at least from a PR and marketing perspective.</p>
<p>I suspect it was history and commercial pressure that caused this.  The history part is that the origin of text analytics was in intelligence agencies trying to analyze world media.  Around 1997, I was invited to observe what must have been one of the first implementations of text analytics.  The developer was government research firm Bolt, Beranek, and Newman.  (Right, that BBN, the one that invented the Internet.)  This modest project was recording all of the media broadcasts in all languages around the world, translating them to English, performing entity extraction, and scoring the sentiment toward each entity.  The client of BBN was, of course, the obvious three-letter intelligence agency.  The entities in question were nations, governments, political leaders, militaries, guerrilla organizations, and the like.  The use was to assess such things as developing threats and political upheavals.   Needless to say, I was blown away.  And sentiment scoring was the essence of the value add the application delivered.  The whole operation was carried out, fundamentally, for the purpose of scoring the sentiment toward the entities and watching for trends.</p>
<p>A few years later sentiment scoring made the jump to the commercial space.   Companies doing media monitoring, counting stories, and providing &#8220;clips&#8221; were able to add sentiment scoring as a flashy new technology.  And (Whamo! Presto!)  the industry of reputation management was born.  From a marketing perspective, sentiment scoring for reputation management was a brilliant move.  If you weren&#8217;t tracking your positives and negatives, you were just plain inadequate as a marketing communications manager.  And the economics were (and still are) terrific for those of us in the sentiment scoring business; we had a tool that we could sell to almost every company and industry with few if any changes to the core platform.</p>
<p><span id="more-140"></span></p>
<p>But now I think our success as text analytics vendors of sentiment scoring solutions has painted us into the corner I mentioned above.  Selling sentiment scoring worked so well we haven&#8217;t learned how to apply the techniques of text analytics to other problems.  And these other problems may be more interesting to solve.  Take this piece of text for example.</p>
<p style="padding-left: 30px">“Investors were cheered by Company A’s announcement that it is engaged in a cost reduction effort in order to improve profits.”</p>
<p>Entity extraction can clearly identify <span style="text-decoration: underline">Company A</span> as a company entity mentioned in this article.  Also, a sentiment scoring engine might conclude that <span style="text-decoration: underline">Company A</span>  has positive sentiment being expressed toward it because of the emotionally laden words <span style="text-decoration: underline">cheered</span> and <span style="text-decoration: underline">improve</span>. </p>
<p>Of course, the real world is messy.  Now consider the rest of the news article.</p>
<p style="padding-left: 30px">&#8220;Company A will be laying off employees, cutting salaries, terminating health benefits for retirees, and closing plants.”</p>
<p>Suddenly, we don&#8217;t know how to score the sentiment of the article or the sentiment toward the entity Company A.    <span style="text-decoration: underline">Cheered</span> and <span style="text-decoration: underline">improve</span> are mixed in with <span style="text-decoration: underline">laying off</span>, <span style="text-decoration: underline">cutting</span>, <span style="text-decoration: underline">terminating</span>, and <span style="text-decoration: underline">closing</span>.  We still suspect the investors are positive toward Company A, but now worry that the retirees are not.  We are not sure about the author, editor, or publisher of the story; they may just be dispassionately reporting facts, and have no feelings whatsoever toward Company A. </p>
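<p>To see how quickly a word-counting approach gets confused, here is a minimal sketch in Python.  It is not how any particular vendor&#8217;s engine works; the word lists and the <code>naive_sentiment</code> function are hypothetical, chosen only to mirror the underlined words in the example above.</p>
<pre>
```python
# Hypothetical word lists for illustration only; real sentiment engines use
# far larger lexicons and more sophisticated models.
POSITIVE = {"cheered", "improve"}
NEGATIVE = {"laying off", "cutting", "terminating", "closing"}


def count_phrases(text, phrases):
    """Count how many times any phrase in `phrases` occurs in `text`."""
    lowered = text.lower()
    return sum(lowered.count(phrase) for phrase in phrases)


def naive_sentiment(text):
    """Return (positive hits, negative hits, net score) for `text`."""
    pos = count_phrases(text, POSITIVE)
    neg = count_phrases(text, NEGATIVE)
    return pos, neg, pos - neg


FIRST = ("Investors were cheered by Company A's announcement that it is "
         "engaged in a cost reduction effort in order to improve profits.")
REST = ("Company A will be laying off employees, cutting salaries, "
        "terminating health benefits for retirees, and closing plants.")

print(naive_sentiment(FIRST))               # first sentence on its own
print(naive_sentiment(FIRST + " " + REST))  # the whole two-sentence story
```
</pre>
<p>On the first sentence alone the net score is positive; once the second sentence is included it turns negative, even though nothing about the investors&#8217; view of Company A has changed.</p>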
<p>But more important than how to score the sentiment, who cares about sentiment in this story anyway?</p>
<p>This news story is rich with strategic information about Company A.   A text analytics solution that is trying to extract business meaning rather than score sentiment would latch onto these &#8220;meaning-loaded entities,&#8221; expressed as strategic scenarios of interpretation:</p>
<p style="padding-left: 30px"><span style="text-decoration: underline">Company A</span> is <span style="text-decoration: underline">Reducing Costs</span> with a <span style="text-decoration: underline">Staff Reduction</span>.</p>
<p style="padding-left: 30px"><span style="text-decoration: underline">Company A</span> is <span style="text-decoration: underline">Reducing Costs</span> with a <span style="text-decoration: underline">Salary and Wage Reduction</span>.</p>
<p style="padding-left: 30px"><span style="text-decoration: underline">Company A</span> is <span style="text-decoration: underline">Reducing Costs</span> with <span style="text-decoration: underline">Plant Closings</span>.</p>
<p>Automatically extracted meaning such as this, when performed not on one story that is all of two sentences long but on tens of thousands of full-length news articles being published weekly on all the companies doing business with one&#8217;s own organization, can greatly shorten the time to insight for business analysts who may be competitors, customers, or suppliers of organizations like Company A.  If you are a competitor, you may be facing a leaner, meaner, more price-aggressive Company A in the marketplace.  If you are a customer, you may want to look around for alternative sources of supply if one of the plants being closed is the one you have been buying from.  If you are a supplier, you may want to pitch Company A on ways they can cut costs by using your products in place of your own competitor&#8217;s offering.</p>
<p>Sentiment scoring is a nice idea.  But it is time for the text analytics industry to move forward with new, more powerful capabilities.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.northernlight.com/ceo/2009/10/26/beyond-sentiment-scoring/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		<feedburner:origLink>http://blogs.northernlight.com/ceo/2009/10/26/beyond-sentiment-scoring/</feedburner:origLink></item>
		<item>
		<title>Innovation and Customer Insight</title>
		<link>http://feedproxy.google.com/~r/DavidSeuss/~3/Bpqf1VXTmTs/</link>
		<comments>http://blogs.northernlight.com/ceo/2009/08/20/innovation-and-customer-insight/#comments</comments>
		<pubDate>Thu, 20 Aug 2009 14:06:41 +0000</pubDate>
		<dc:creator>David Seuss</dc:creator>
				<category><![CDATA[Case Studies]]></category>
		<category><![CDATA[MI Analyst]]></category>
		<category><![CDATA[SinglePoint]]></category>

		<guid isPermaLink="false">http://blogs.northernlight.com/ceo/?p=112</guid>
		<description><![CDATA[In April, my old employer, the Boston Consulting Group (BCG), and BusinessWeek released a study identifying the most innovative global companies.   The report is based on a survey by BCG of 2,700 senior executives &#8220;representing all major markets and industries.&#8221;   Of even more interest is the research report published by BCG detailing the survey&#8217;s complete findings.  (I will provide [...]]]></description>
			<content:encoded><![CDATA[<p>In April, my old employer, the Boston Consulting Group (BCG), and BusinessWeek released a study identifying the most innovative global companies.   The report is based on a survey by BCG of 2,700 senior executives &#8220;representing all major markets and industries.&#8221;   Of even more interest is the research report published by BCG detailing the survey&#8217;s complete findings.  (I will provide links to both the BusinessWeek article and the BCG research report at the end of this blog post.)  The research report gives considerably more depth on questions like why a company would want to be an innovator, what degree of importance enterprises ascribe to innovation, what the actual contributions of innovation are to operating performance, and what factors facilitate innovation.</p>
<p>First, why innovate?  BCG found that companies that rank as innovative have stock market performance that is 30% higher over 10 years compared to companies that are not considered innovative.  In this &#8220;big fish eats the smaller fish&#8221; world, a stock price premium of 30% can be tantamount to survival.</p>
<p>Second, how important is the need to innovate perceived to be?  Two out of three senior executives rank innovation as one of their top three strategic priorities. </p>
<p><span id="more-112"></span></p>
<p>Third, how do senior executives measure the contribution of innovation?  The key benefits of being an innovator were identified as: customer satisfaction and revenue growth.  Innovation makes customers believe you are serving their needs better than the other guy and then they buy more from you.  Now that&#8217;s a bottom line for you. </p>
<p>Fourth, what are the key drivers of innovation identified by the senior executives surveyed in the report?  Here are the findings in order of significance:</p>
<ol type="1">
<li>Executive level support</li>
<li>Developing a deep understanding of customers</li>
<li>Fostering a culture of innovation</li>
<li>Partnering with suppliers and others for new ideas</li>
<li>Earmarking sufficient funds</li>
<li>Balancing risks, time frames, and returns across the portfolio of projects</li>
<li>Enforcing project timelines and milestones</li>
<li>Moving quickly from idea generation to initial sales</li>
</ol>
<p>What I find interesting about this list is how high customer insight places.   Senior executives around the world see developing a deep understanding of customers as the second most important strength to have to become an innovative company.  (And, number one and number two on the list scored almost exactly the same - a statistical tie for first place.)</p>
<p>I would submit that one of the key mechanisms for facilitating innovation is having information systems that are effective at searching and interpreting research on customers, markets, competitors, and technologies.  Such systems greatly lever the innovation process.  Let&#8217;s call them &#8220;strategic research portals&#8221; to distinguish them from common-variety enterprise search engines.</p>
<p>What often makes strategic research portals so difficult to develop and operate is that they require integration of research content from external sources, where the most valuable strategic information about customers resides.  IT departments are unprepared technically and culturally to deal with diverse content sets over which they have no control, produced by organizations they cannot influence, and provided via business models that are inconsistent with enterprise computing security conventions.  The most innovative companies often solve this problem first, because without solving it, all subsequent efforts at innovation are handicapped from the start.  </p>
<p>Meaning extraction has a key role to play here.  Search engines of the common variety sold by enterprise computing companies like Microsoft and Google only produce lists of documents for users to wade into.  A common-variety search engine applied to a market research database of rich customer information puts a great burden on the user to read enough of the reports to gain an overall understanding of a topic like, &#8220;What are my customer&#8217;s strategic priorities right now?&#8221;  Let me illustrate how a meaning-extraction enabled search engine can help this process.</p>
<p>A market intelligence professional for an information technology company was assigned to work with a key account team that was planning an important sales call at a strategic client, a leading freight and package delivery company.  The account team was preparing a presentation that dealt with long term information technology platform issues like enhancing &#8220;business agility&#8221; (whatever that means).  The researcher was asked to suggest sales strategies to the account team that would give the team a competitive advantage in winning more business. </p>
<p>This was during the summer of 2008, right after oil prices had hit historic highs.  The market intelligence professional searched for news and analyst commentary on the customer using Northern Light&#8217;s meaning extraction-enabled search engine.  The researcher was presented with this actual search result from the automated analysis of hundreds of news articles and industry-authority blog posts concerning the customer:</p>
<p style="padding-left: 30px">&lt;Customer&gt; is being affected by the business issue Falling Profits</p>
<p style="padding-left: 30px">&lt;Customer&gt; is being affected by the business issue Oil Prices</p>
<p style="padding-left: 30px">&lt;Customer&gt; is being affected by the business issue Energy Costs</p>
<p style="padding-left: 30px">&lt;Customer&gt; is being affected by the business issue Falling Demand</p>
<p>Looking at the automated analysis of such a body of current information about the client, the research professional <em>instantly</em> realized that the client was in no mood to consider long-term, somewhat conceptual benefits of new technology solutions.  Rather, the client was dealing with the present and immediate crisis of oil prices driving up costs and driving down demand for its delivery services &#8212; a double-whammy that was driving an earnings collapse. </p>
<p>The market intelligence professional suggested that the account team throw out its presentation on long-term conceptual benefits of new technology and replace it with a presentation showing how the innovative application of information solutions could help the client lower its fuel costs in the short run.  This key piece of customer insight gave the account team an idea that could be a home run in the competition for new business.</p>
<p>Meaning extraction = more customer insight.  More customer insight = more successful innovation.  More successful innovation = more satisfied customers buying more.  More satisfied customers buying more = higher stock price.   Higher stock price = corporate survival.</p>
<p>Lastly, I was pleased that three of the top ten on the BCG list of most innovative companies are Northern Light SinglePoint strategic research portal clients. This is not an accident.  Providing solutions that help develop deep understanding of customers is what we do.</p>
<p> </p>
<p>p.s. Links:</p>
<h5><a href="http://images.businessweek.com/ss/09/04/0409_most_innovative_cos/1.htm">http://images.businessweek.com/ss/09/04/0409_most_innovative_cos/1.htm</a></h5>
<h5><a href="http://www.bcg.com/publications/files/BCG_Innovation_2009_Apr_2009.pdf">http://www.bcg.com/publications/files/BCG_Innovation_2009_Apr_2009.pdf</a> </h5>
]]></content:encoded>
			<wfw:commentRss>http://blogs.northernlight.com/ceo/2009/08/20/innovation-and-customer-insight/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://blogs.northernlight.com/ceo/2009/08/20/innovation-and-customer-insight/</feedburner:origLink></item>
		<item>
		<title>How To Save $50 million</title>
		<link>http://feedproxy.google.com/~r/DavidSeuss/~3/6uQQ7jfdCJg/</link>
		<comments>http://blogs.northernlight.com/ceo/2009/07/07/how-to-save-50-million/#comments</comments>
		<pubDate>Tue, 07 Jul 2009 21:42:04 +0000</pubDate>
		<dc:creator>David Seuss</dc:creator>
				<category><![CDATA[Case Studies]]></category>
		<category><![CDATA[SinglePoint]]></category>
		<category><![CDATA[linkedin]]></category>

		<guid isPermaLink="false">http://blogs.northernlight.com/ceo/?p=98</guid>
		<description><![CDATA[Woody Allen is famously reported to have said, &#8220;80% of success is showing up.&#8221;  Well, there is something about the ROI for a strategic research portal that is analogous.  By simply concentrating all of a company&#8217;s market research in one repository that is available to everyone involved in strategic business or product  research, huge benefits can be realized [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal" style="margin: 10pt 0in">Woody Allen is famously reported to have said, &#8220;80% of success is showing up.&#8221;  Well, there is something about the ROI for a strategic research portal that is analogous.  By simply concentrating all of a company&#8217;s market research in one repository that is available to everyone involved in strategic business or product  research, huge benefits can be realized without a lot of fuss.</p>
<p class="MsoNormal" style="margin: 10pt 0in">One of our clients just gave us a wonderful example.  They were in a super-time-pressured bidding-war for a potential acquisition.  Like most Fortune 500 companies, this client is global, with many geographic and product divisions around the world employing thousands of people in marketing, strategic planning, and product research.  The acquisition team that was assembled to consider the opportunity was not aware of anyone inside the company who had studied this market in great depth.  Mostly, the acquisition team was using market analysis provided by the target and what they could find with general searching on the Web.  Based on information from these sources, combined with a corporate interest in entering markets like this one, the acquisition team was preparing what was hoped would be a winning bid of over $50 million for the target company.</p>
<p class="MsoNormal" style="margin: 10pt 0in">One of the members of the acquisition team thought to check their new Northern Light SinglePoint Strategic Research Portal to see if there were any reports in the research repository evaluating the market addressed by the potential acquisition.  There in the aggregated collection of tens of thousands of reports from around the world, the acquisition team discovered studies of the market segment performed in earlier years by different geographic and product divisions.  These studies had not been carried out in the time-curtailed setting of an acquisition bidding-war that demanded an immediate response.  Rather, the study teams had the luxury of sufficient time to research trends in fine detail, to challenge conventional industry wisdom, to question commonly-held assumptions, and to evaluate emerging alternative technologies that addressed the same underlying customer need.  Significantly, the prior studies raised serious questions about the long-term potential of the technical approach being pursued by the acquisition target.  With this knowledge in hand, our client dropped out of the bidding and <em>avoided a $50 million mistake</em>.</p>
<p><span id="more-98"></span></p>
<p class="MsoNormal" style="margin: 10pt 0in">Wow.  Now that&#8217;s an ROI on the strategic research portal.</p>
<p class="MsoNormal" style="margin: 10pt 0in">There is a $50 million decision being made somewhere out there in your company right now.  Wouldn&#8217;t you like to know that the best marketing and technology analysis ever created or purchased by your company that could affect this decision is in the hands of the decision makers?  Perhaps if Woody Allen had been a marketing, product development, or business strategy professional, he might have instead made the observation, &#8220;80% of success is being able to find your research,&#8221; though it would not have been nearly as funny.</p>
<p class="MsoNormal" style="margin: 10pt 0in"> </p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.northernlight.com/ceo/2009/07/07/how-to-save-50-million/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.northernlight.com/ceo/2009/07/07/how-to-save-50-million/</feedburner:origLink></item>
		<item>
		<title>Using Meaning Extraction To Improve Search Results</title>
		<link>http://feedproxy.google.com/~r/DavidSeuss/~3/Pg0kaAJQO6g/</link>
		<comments>http://blogs.northernlight.com/ceo/2009/03/30/using-meaning-extraction-to-improve-search-results/#comments</comments>
		<pubDate>Mon, 30 Mar 2009 21:49:28 +0000</pubDate>
		<dc:creator>David Seuss</dc:creator>
				<category><![CDATA[MI Analyst]]></category>
		<category><![CDATA[SinglePoint]]></category>
		<category><![CDATA[linkedin]]></category>
		<category><![CDATA[Meaning Extraction]]></category>

		<guid isPermaLink="false">http://blogs.northernlight.com/ceo/?p=14</guid>
		<description><![CDATA[In my prior post, &#8220;Meaning Extraction for Business Strategy&#8221;, I describe a new type of search result: &#8220;scenarios of meaning.&#8221;  For example, in a search on &#8216;Cisco and VOIP&#8217;, we get back the scenario &#8220;Cisco is using a product marketing strategy of Professional Services&#8221; that presents a finding about how Cisco is using a particular marketing [...]]]></description>
			<content:encoded><![CDATA[<p>In my prior post, &#8220;Meaning Extraction for Business Strategy&#8221;, I describe a new type of search result: &#8220;scenarios of meaning.&#8221;  For example, in a search on &#8216;Cisco and VOIP&#8217;, we get back the scenario &#8220;Cisco is using a product marketing strategy of Professional Services&#8221; that presents a finding about how Cisco is using a particular marketing initiative in that market.   This new type of search result is unique and exciting.  But the story doesn&#8217;t really stop there.  Meaning extraction can be used to bump up the value of plain old search results as well.  Let me explain.</p>
<p> </p>
<p><strong>Plain Old Search Results</strong></p>
<p><span id="more-14"></span></p>
<p>Let us consider the ubiquitous search result.  Sometime around, oh, 1994, the search result made its debut with the Web search engines Lycos, InfoSeek, Excite, and Alta Vista.  A user would input words; the search engine would compare them to an index of the words in Web pages (or &#8220;documents&#8221; in industry parlance), generate a list of documents that matched the user&#8217;s words, relevance-rank the list with a secret formula, and then present the list to the user to browse along with a little summary of each document.  The list of relevance-ranked, summarized documents &#8211; when translated into an HTML page and delivered to the user&#8217;s browser &#8211; constituted search results as the search engine industry conceived them in 1994, or as I like to call them, <em>plain old search results</em>.</p>
<p> </p>
<p>Sound familiar?  Of course it does, because not much has changed with search results since 1994.  The indexes have gotten bigger, the secret formulas for relevance ranking are now better, and the techniques used to generate the summaries have improved.  But the 1994 format for search results is with us today.  Google has supplanted the industry&#8217;s pioneers for Web search, Microsoft and Autonomy have established the enterprise search market, and Northern Light has defined search for research portals, but we later innovators all adopted the established, traditional structure for search results.  Name another software or web application user interface that is unchanged since 1994!   Come on search engine industry, we can do better than this.  It is time to dump <em>plain old search results.</em></p>
<p><em></em> </p>
<p>What is the pressing need to do better?  Well, the present structure of search results, when a user is performing research as opposed to just looking for a fact like a phone number, just doesn&#8217;t help the user enough.  The little summaries of each document are often not sufficient to judge how helpful the document will be to the user.   This is especially true for substantive documents like journal articles, market research reports, patents etc. </p>
<p> </p>
<p>For example, in Northern Light&#8217;s database of information technology analyst firm research reports, the median report length is five pages, and many have 25 pages; some have 50 pages.  Typical four-line summaries of such reports do not reveal very much about such documents.  What traditional search results do is force the user to guess as to what might be relevant, download a few and try to scan them quickly looking for helpful indicators of whether the document will be useful for the business purposes of the search.  The user persists at this until he or she finds something that might be useful or becomes frustrated by the process and abandons it.  </p>
<p> </p>
<p>Our portal logs suggest that most business researchers on a serious project will actually download between two and four business research documents from the typically 10,000+ hits on a <em>plain old search result</em> and make do with those.  So as users, we are not all that persistent in this interrupt-driven business world.</p>
<p> </p>
<p><strong>Meaning-Loaded Entities</strong></p>
<p>One of the really great things about search applications enhanced with meaning extraction is that we learn a great deal about each document while we are indexing it.  The industry term for what a text analytics application identifies in a document is <em>entities</em>, and the process is called <em>entity extraction.</em>  Almost always, what the industry means by the word <em>entity</em> is really <em>proper noun</em>.  The big three are company names, people names, and locations.  You see those discussed over and over at industry conferences and in sales literature.  Interface an extraction of those three items to a sentiment scoring engine and (presto!) you have a nice generalizable horizontal reputation management application that you can sell to companies in any industry.  At Northern Light, we just don&#8217;t think this approach makes enough difference.</p>
<p> </p>
<p>Northern Light also looks for company names.  But after that our entity extraction jumps to a different level.  We extract references to conditions, circumstances, events, technologies, strategies, trends, and outcomes that have significance to the business purpose of the search.  Northern Light has coined a special term for these types of entities: <em>meaning-loaded entities.  </em>Examples of <em>meaning-loaded entities</em> would include price cut, market share gain, credit crisis, acquisition strategy, brand and customer loyalty, energy price increase, and government bailout.  In the life sciences arena, <em>meaning-loaded entities</em> include concepts like diseases, drugs, proteins, and clinical trials.  Currently, Northern Light&#8217;s meaning extraction application, <em>MI Analyst</em>, has around 20,000 <em>meaning-loaded entities </em>in the taxonomy.</p>
<p> </p>
<p>Using <em>MI Analyst</em> to index a document, we learn all kinds of interesting things about what is in each document.  For example not only can we find out what company names are mentioned, but also what technology trends are discussed, and how business issues are affecting the players in a market.  Or, in a life sciences setting, we can learn what diseases, drugs, and therapeutic strategies are discussed in a journal article.   </p>
<p> </p>
<p>We also know where each of these items is within the document, which gives us an opportunity to assess how related they are to one another, and to find scenarios of meaning when <em>meaning-loaded entities </em>are close to one another.  (Northern Light calls these scenarios of meaning either <em>Business Scenarios</em>  or <em>Life Sciences Scenarios</em> depending on the context).  For example, if we find <em>Cisco</em> near <em>Strategic Partnerships </em>near <em>Internet Telephony Market</em> we may have just learned something significant.   Or if we find the drug <em>Taxol</em>  near <em>Phase I Clinical Trial</em>  near <em>Lung Neoplasms </em>(&#8220;cancer&#8221; for you non-life sciences types), this might suggest a new avenue of research. </p>
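<p>To make the idea of entities being &#8220;near&#8221; one another concrete, here is a minimal sketch in Python.  It is an illustration only, not how <em>MI Analyst</em> is implemented: the word offsets, the 30-word window, and the function name <code>find_scenario</code> are all assumptions made up for the example.</p>

```python
from itertools import product

# Hypothetical word offsets recorded for one document at indexing time.
entity_positions = {
    "Cisco": [12, 140, 310],
    "Strategic Partnerships": [18, 500],
    "Internet Telephony Market": [25, 900],
}


def find_scenario(positions, entities, window=30):
    """Return one offset per entity such that all offsets fit inside `window` words.

    Returns None when an entity is absent or no combination is close enough.
    The window size is an assumed tuning parameter, not a documented value.
    """
    offset_lists = [positions.get(name, []) for name in entities]
    # Try every combination of one occurrence per entity.
    for combo in product(*offset_lists):
        if max(combo) - min(combo) <= window:
            return combo
    return None


match = find_scenario(
    entity_positions,
    ["Cisco", "Strategic Partnerships", "Internet Telephony Market"],
)
print(match)
```

<p>A real system would need something faster than trying every combination, but the brute-force version shows the test being applied: one occurrence of each entity, all falling within a small span of words.</p>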
<p> </p>
<p>Once we know all these things about a document, why not use that intelligence about the document to help our poor user we left three paragraphs up above in this post still struggling with <em>plain old search results</em>?</p>
<p> </p>
<p><strong>Changing the Structure of Search Results</strong></p>
<p>Having recorded the presence of <em>meaning-loaded entities, Business Scenarios, </em>and<em> Life Sciences Scenarios </em>in each document, it is time to rethink the 1994-vintage search results that started us down this path.   One of the really nice things about research applications, as opposed to general Web search engines, is that we know the business purpose of the search.  It might be a search engine of market research reports for product managers to help determine new features in an information technology company, or a competitive intelligence monitoring tool, or a pharmaceutical research application.  Given this, we can select the types of <em>meaning-loaded entities</em> that will be most helpful to the users.  </p>
<p> </p>
<p>For example, in a market research setting, <em>Companies</em>, <em>Venture-Funded Companies</em>, <em>Technologies</em>, and <em>Business Issues</em> might be most helpful.   Then as each search is performed by users, the search result for each document can be enhanced to show the selected entities, which can raise the value of search results, as in the example below based on a search of &#8216;VOIP&#8217; on an information technology analyst database. </p>
<p><em>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;</em></p>
<p><em></em> </p>
<p><em>Plain Old Search Results</em></p>
<p> <span style="font-size: x-small"><span style="font-family: Arial"><span> </span></span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><strong><span style="font-size: x-small"><span style="font-family: Arial">1.<span>  </span><span style="text-decoration: underline"><span style="color: #1f497d"><a href="http://">Measuring and Diagnosing VoIP Voice Quality</a></span></span><span>    </span></span></span></strong></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial">97%,Licensed Content, 03/22/2009</span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial">Voice over Internet Protocol (VoIP) places strict requirements on the network infrastructure. If there are any problems in network configuration or operation, voice quality suffers and users complain. This report, updated by Senior Analyst Tom Smith to incorporate the latest changes in measurement standards, technologies, and vendor products, examines the tools and techniques for both pre-deployment testing and post-deployment problem detection and diagnosis of voice quality.<span>  </span>Report number 98736</span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small;font-family: Arial"> </span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial">2. <strong><span style="text-decoration: underline"><span style="color: #1f497d"><a href="http://">Cable Voice Brings VoIP Into the Mainstream</a></span></span></strong></span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial">91%, Licensed Content, 10/06/2008</span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial"><span> </span>While pure-play and telco voice over IP (VoIP) providers continue to struggle to win subscribers to their over-the-top (OTT) services, cablecos have been wildly successful in bringing VoIP technology to 12% of US consumers. What&#8217;s more, pure-play and telco VoIP users remain the same niche early adopters who&#8217;ve been using VoIP for the past three years, while cable voice subscribers are more mainstream. &#8211; This is report number 149147</span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"> </p>
<p class="MsoNormal" style="margin: 6pt 0in">&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-</p>
<p class="MsoNormal" style="margin: 6pt 0in"><em></em> </p>
<p class="MsoNormal" style="margin: 6pt 0in"><em>Meaning-Loaded Search Results</em></p>
<p class="MsoNormal" style="margin: 6pt 0in"> </p>
<p> <span style="font-size: x-small"><strong><span style="font-size: x-small"><span style="font-family: Arial">1.<span>  </span><a href="http://"><span style="text-decoration: underline"><span style="color: #1f497d">Measuring and Diagnosing VoIP Voice Quality<span>  </span></span></span><span>  </span></a></span></span></strong></span><em> </em></p>
<p><span style="font-size: x-small"><span style="font-family: Arial">Licensed Content, 03/22/2009</span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial">Voice over Internet Protocol (VoIP) places strict requirements on the network infrastructure. If there are any problems in network configuration or operation, voice quality suffers and users complain. This report, updated by Senior Analyst Tom Smith to incorporate the latest changes in measurement standards, technologies, and vendor products, examines the tools and techniques for both pre-deployment testing and post-deployment problem detection and diagnosis of voice quality.<span> </span></span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial"><strong>Business Issues mentioned: Legacy Systems (2), Competitors (1), High Growth Product Market (1) <a href="http://">more</a></strong></span></span><span style="font-size: x-small"><span style="font-family: Arial"><strong></strong></span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial"><strong>Companies mentioned: Cisco Systems Inc (38), Nortel (21), NetIQ (9) <a href="http://">more</a></strong></span></span><span style="font-size: x-small"><span style="font-family: Arial"><strong></strong></span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial"><strong>Technologies mentioned: Voice Over IP (VoIP) (195), Private Branch Exchange (PBX) (45), Simple Network Management Protocol (SNMP) (13) <a href="http://">more</a></strong></span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial"><strong><span>  </span></strong></span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial">2. <a href="http://"><strong><span style="text-decoration: underline"><span style="color: #1f497d">Cable Voice Brings VoIP Into the Mainstream</span></span></strong> </a></span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial">Licensed Content, 10/06/2008</span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial"><span> </span>While pure-play and telco voice over IP (VoIP) providers continue to struggle to win subscribers to their over-the-top (OTT) services, cablecos have been wildly successful in bringing VoIP technology to 12% of US consumers. What&#8217;s more, pure-play and telco VoIP users remain the same niche early adopters who&#8217;ve been using VoIP for the past three years, while cable voice subscribers are more mainstream.  Report</span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial"><strong>Business Issues mentioned: Product Strategy and Roadmap (8), Benchmarks (4), Customer Demand (1) <a href="http://">more</a></strong></span></span><span style="font-size: x-small"><span style="font-family: Arial"><strong></strong></span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial"><strong>Companies mentioned: Time Warner Inc (4), Comcast Inc (3), Verizon Communications (2) <a href="http://">more</a></strong></span></span><span style="font-size: x-small"><span style="font-family: Arial"><strong></strong></span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial"><strong>Venture Funded Companies mentioned: PurePlay (1) <a href="http://">more</a></strong></span></span><span style="font-size: x-small"><span style="font-family: Arial"><strong></strong></span></span></p>
<p class="MsoNormal" style="margin: 6pt 0in"><span style="font-size: x-small"><span style="font-family: Arial"><strong>Technologies mentioned: Quality of Service (QoS) (1), Bundled Services (1), Mobile Phones (1) <a href="http://">more</a></strong></span></span></p>
<p> </p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8211;</p>
<p>As you can see from the above example, the second set of search results is way more helpful.  Take the first document for example: with <em>meaning-loaded search results</em> we learn important new facts about the report that the<em> plain old search results</em> did not reveal.  We learn that <em>Legacy Systems </em>is an important issue and that <em>PBX </em>is a technology that is mentioned often in the report.  Bing &#8211; the light bulb goes on over our heads.  It suddenly dawns on us that in the enterprise VOIP market how to deal with the omnipresent legacy PBX system with its epicenter right at the front reception desk could be a major issue. </p>
<p> </p>
<p>Also, right from the search results I learn that Cisco and Nortel are companies to pay attention to in the enterprise VOIP market.  Bing bing.  </p>
<p> </p>
<p>Glancing at the second hit&#8217;s <em>meaning-loaded result</em>, I learn that this report most likely will lay out the <em>Product Strategy and Roadmap</em>  for cable providers Time Warner and Comcast.  Bing, bing, bing &#8211; my interest in this report just jumped off the scale if I am a product manager for an IT company that makes VOIP network gear.</p>
<p> </p>
<p>Note that with <em>plain old search results</em> none of this intelligence from either document comes through at all.  We have the little summaries that provide a whisper of what the documents are about, but the additional information that both taught me things and got me interested in reading more was absent.  Now, finally, we can dump the structure of search results first used for users of search engines in 1994 and can make search results way more useful.  Northern Light believes that <em>meaning-loaded search results</em> are a game-changer for users. </p>
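<p>For the technically curious, the &#8220;mentioned&#8221; lines in the example results above could be produced along these lines.  This is only a sketch under assumptions: the input structure, the function name <code>summary_lines</code>, and the &#8220;Other Vendor&#8221; entry are hypothetical, and the counts simply mirror the first example result.</p>

```python
from collections import Counter

# Hypothetical extraction output for one document: entity type -> every mention found.
mentions = {
    "Business Issues": [
        "Legacy Systems",
        "Competitors",
        "Legacy Systems",
        "High Growth Product Market",
    ],
    # "Other Vendor" is a made-up entry that falls outside the top three.
    "Companies": ["Cisco Systems Inc"] * 38 + ["Nortel"] * 21 + ["NetIQ"] * 9 + ["Other Vendor"] * 2,
}


def summary_lines(mentions_by_type, top_n=3):
    """Build one 'X mentioned: A (n), B (m), ...' line per entity type."""
    lines = []
    for entity_type, names in mentions_by_type.items():
        # most_common orders by count, keeping first-seen order for ties.
        top = Counter(names).most_common(top_n)
        listed = ", ".join(f"{name} ({count})" for name, count in top)
        lines.append(f"{entity_type} mentioned: {listed}")
    return lines


for line in summary_lines(mentions):
    print(line)
```

<p>Anything beyond the top three would sit behind the &#8220;more&#8221; link.</p>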
<p> </p>
<p>Lastly, at this point, my dear reader, you may be wondering what the heck the &#8220;more&#8221; link in each of the <em>meaning-loaded results</em> above connects to.  The answer is that each one links to a <em>meaning-loaded document summary.</em>  But to keep this blog post from growing to <em>War and Peace</em> proportions (if it hasn&#8217;t already), I will save that discussion for the next post.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.northernlight.com/ceo/2009/03/30/using-meaning-extraction-to-improve-search-results/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.northernlight.com/ceo/2009/03/30/using-meaning-extraction-to-improve-search-results/</feedburner:origLink></item>
		<item>
		<title>Meaning Extraction for Business Strategy</title>
		<link>http://feedproxy.google.com/~r/DavidSeuss/~3/8H2Po_wl6Bg/</link>
		<comments>http://blogs.northernlight.com/ceo/2009/03/24/meaning-extraction-for-business-strategy/#comments</comments>
		<pubDate>Tue, 24 Mar 2009 06:26:45 +0000</pubDate>
		<dc:creator>David Seuss</dc:creator>
				<category><![CDATA[MI Analyst]]></category>
		<category><![CDATA[SinglePoint]]></category>

		<guid isPermaLink="false">http://blogs.northernlight.com/ceo/?p=15</guid>
		<description><![CDATA[
 In my previous post, I gave an example of how meaning extraction works in a life sciences research setting.  I am pleased to report to you all that the essence of that blog post was expanded into an article that is going to run in Future Pharmaceuticals magazine soon.  I will post a link to [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal" style="margin: 0in 0in 0pt"><strong></strong></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><strong><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot"> </span></strong><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">In my previous post, I gave an example of how meaning extraction works in a life sciences research setting.  I am pleased to report to you all that the essence of that blog post was expanded into an article that is going to run in Future Pharmaceuticals magazine soon.  I will post a link to it when it runs if that is practical.</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">But since most people work for companies that are not in pharmaceuticals or life sciences research,  I thought I should provide another example in a different context.  <span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">So let&#8217;s say we want to analyze the strategy of a company like Cisco in the arena of voice over Internet protocol (VOIP) and that we have a content repository of a few hundred thousand market research reports from scores of authoritative analyst firms like Gartner, Forrester, and IDC.  (Northern Light clients actually have such a database available to them.)   </span></span></p>
<p><span id="more-15"></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot"><strong></strong></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot"><strong>Hope for Amazing Good Luck</strong></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">If we perform our search &#8216;Cisco and VOIP&#8217; on a traditional search engine we will get back a search result of thousands of  reports.   Most search engines then wash their hands of the situation, metaphorically dumping the pile of documents on the user&#8217;s desk and saying, &#8220;see ya&#8221; as the search engine bolts out the door.  The user is left to sort through the pile, find some that might be interesting, and then start reading them.  The search result itself provides a little guidance in this process, though precious little.  For example, the search result will be sorted by some secret formula that will attempt to put more relevant documents nearer to the top of the list.  And there will be a little summary provided of each document, perhaps a short paragraph of text, that the user can review.   Acting on these scant hints, the user selects a few reports and starts reading.  Some are helpful, some are not, and the user perseveres for as long as he or she has time or can tolerate this hit-or-miss process.  </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">Because one cannot know what one did not find, there is no objective way for a user to assess whether the documents that he or she actually took the time to read comprehensively represent the body of knowledge contained in the thousands of returned documents on the search result.  What the user is actually doing is desperately wishing that the documents he or she selects to read contain all the important findings, analysis, and perspective.   As even the most determined researcher will read only a very small percentage of the reports on any given search result, Northern Light calls this research strategy: <em>hope for amazing good luck.  </em></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">The use of <em>hope for amazing good luck </em>as a strategy for dealing with search results is not, of course, the fault of the user in question.  It is the fault of a search engine industry that believes lists of documents are the right response to a user whose business purpose for doing the search is to gain intellectual command of a body of knowledge, to answer a profound question, to explore the meaning of events and trends, or, in our example, to analyze the business strategy of a leading global company that can drive the evolution of a new technology and its market.</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">So how might it work better?  Let&#8217;s start by reviewing how meaning extraction works:</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">1.  Extract references to important terms and concepts, particularly concepts that imply meaning for the business purpose of the search.</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">2.  Apply proximity intelligence to determine which terms and concepts are often in proximity to one another.</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">3.  Identify patterns of proximity-related concepts that imply meaning to a knowledgeable practitioner, then scan all the documents responsive to a query and point out to the user wherever those patterns occur.  We like to call these patterns &#8220;scenarios&#8221; since we cannot really tell if they are significant, only that they are present.</span></p>
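<p>Step 1 in the list above can be sketched in a few lines of Python.  This is a toy illustration under assumptions, not the MI Analyst implementation: the two-entry taxonomy, the sample sentence, and the function name <code>extract_concepts</code> are invented for the example, and a real taxonomy would have tens of thousands of entries.</p>

```python
import re

# Hypothetical miniature taxonomy: concept name -> phrases that signal it.
TAXONOMY = {
    "Price Cut": ["price cut", "price reduction"],
    "Acquisition Strategy": ["acquisition strategy", "acquisitions"],
}


def extract_concepts(text, taxonomy=TAXONOMY):
    """Return (concept, word_offset) pairs for each taxonomy phrase found in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    hits = []
    for concept, phrases in taxonomy.items():
        for phrase in phrases:
            target = phrase.split()
            # Slide a window of the phrase's length across the document's words.
            for start in range(len(words) - len(target) + 1):
                if words[start:start + len(target)] == target:
                    hits.append((concept, start))
    return sorted(hits, key=lambda hit: hit[1])


# Made-up sample sentence, used only to exercise the function.
sample = "Cisco announced a price cut while pursuing acquisitions in the VOIP market."
print(extract_concepts(sample))
```

<p>Recording the word offset alongside each concept is what makes the proximity work in steps 2 and 3 possible.</p>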
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><strong></strong></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><strong></strong></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><strong></strong></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><strong><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">&#8220;Meaning Taxonomies&#8221;</span></strong></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><strong></strong></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">Northern Light MI Analyst, our meaning extraction application, can find all references in search results to terms and concepts such as IT technologies, business issues, product marketing initiatives, corporate strategies, and company names.  For example, MI Analyst can find references to product marketing and strategy concepts like price cut, market share gain, new products, clinical trials, acquisition strategy, financial crisis, energy costs, or government bailout.  </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">MI Analyst exposes these terms and concepts to the user at both the document level and the search results level.  At the document level, MI Analyst assists a user in gaining an at-a-glance understanding of what is in the document so the user can make a more informed decision about which reports to download and read.  This helps a user find those reports and articles that are most likely to be of most value.  </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">At the level of search results, MI Analyst assists a user in determining what overall concepts are found in all the documents that are responsive to the query.  This helps the user discover knowledge (e.g., what strategy is Cisco following in the VOIP market), as well as drill down into a subset of the search results that will be most helpful to the user.</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">Northern Light maintains an extensive taxonomy of terms and concepts (with tens of thousands of entries) to facilitate MI Analyst text analytics.    We refer to these taxonomies as &#8220;meaning taxonomies&#8221; because they are designed to organize concepts that when found in proximity to one another and in proximity to company names, imply meaning to the users of the service.</span><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><strong></strong></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><strong><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">Proximity Analysis on Terms, Concepts, and Words Used In Queries</span></strong></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><strong></strong></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">At indexing time, MI Analyst stores the location within each document in the repository of all the words in the document and all the analytical terms in our meaning taxonomies.  This allows the search engine to  then perform analysis using proximity as an indicator of relationship.  So for example, a searcher could find research reports that have &#8216;VOIP&#8217; within 20 words of the company name &#8216;Cisco.&#8217;<span>  </span>Proximity analysis permits the user to force very tight associations between concepts so that the resulting document set is highly relevant and more likely to produce an in-depth understanding of the topic.  </span><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot"> </span></p>
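<p>The &#8216;VOIP&#8217; within 20 words of &#8216;Cisco&#8217; example can be sketched with a small positional index.  Treat this as an illustration under assumptions rather than a description of MI Analyst internals: the two sample documents, the function names, and the in-memory dictionary layout are all hypothetical.</p>

```python
import re
from collections import defaultdict


def build_positional_index(documents):
    """Map each lowercased word to {doc_id: [word offsets]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        for offset, word in enumerate(re.findall(r"[a-z0-9]+", text.lower())):
            index[word][doc_id].append(offset)
    return index


def within(index, term_a, term_b, max_distance=20):
    """Return sorted ids of documents where the two terms occur within max_distance words."""
    docs_a = index.get(term_a, {})
    docs_b = index.get(term_b, {})
    matches = []
    for doc_id in set(docs_a).intersection(docs_b):
        if any(
            abs(a - b) <= max_distance
            for a in docs_a[doc_id]
            for b in docs_b[doc_id]
        ):
            matches.append(doc_id)
    return sorted(matches)


# Made-up documents: the terms are 3 words apart in the first, 33 in the second.
documents = {
    "report-1": "Cisco expanded its VOIP product line this quarter.",
    "report-2": "Cisco reported earnings. " + "Filler text here. " * 10 + "VOIP adoption grew.",
}
index = build_positional_index(documents)
print(within(index, "cisco", "voip"))
```

<p>Both reports mention both terms, so a plain keyword search would return both.  Only the proximity constraint separates the report that actually discusses Cisco and VOIP together from the one that merely contains the two words.</p>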
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><strong></strong></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><strong><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">Automatic Identification of Relationships</span></strong></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><strong></strong></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">MI Analyst trolls all the documents on each search result and automatically identifies relationships between concepts and company names, which we call scenarios, flagging those that it finds for the user to review.  These scenarios operate on business issues, corporate strategy concepts, and technologies.  </span><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">For example, a search using MI Analyst on market research reports discussing the VOIP market produces these <strong>actual search</strong> <strong>results</strong>:</span></p>
<ul>
<li>
<div class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">Cisco is using a corporate strategy of Acquisitions</span></div>
</li>
<li>
<div class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">Cisco is using a corporate strategy of Strategic Partnerships</span></div>
</li>
<li>
<div class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">Cisco is using a product marketing strategy of Market Segmentation</span></div>
</li>
<li>
<div class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">Cisco is using a product marketing strategy of Target Market</span></div>
</li>
<li>
<div class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">Cisco is using a product marketing strategy of Professional Services</span></div>
</li>
<li>
<div class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">Cisco is using a product marketing strategy of Service and Support</span></div>
</li>
</ul>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> <span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">I am willing to bet you a contribution to your favorite charity that you have never, ever, seen search results like the ones above.  </span>Cisco’s strategy in the VOIP market jumps right off the page of MI Analyst search results.<span>  </span>Specifically, Cisco is <em>targeting specific market segments</em> in the VOIP market and using a combination of <em>high levels of professional services and support</em> and <em>partnerships/acquisitions</em> to penetrate the market.<span>  </span>Each of the search results listed above is linked to a list of reports that discuss that strategy concept, sorted by the number of times the strategy concept is in the report so users can rapidly drill into the documents that best elaborate on Cisco&#8217;s strategy.  </span></p>
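<p class="MsoNormal" style="margin: 0in 0in 0pt">The drill-down ordering described above, reports sorted by how often they mention the strategy concept, can be sketched in a few lines of Python.  This is a simplified illustration with made-up report titles and plain substring counting, not the ranking MI Analyst actually uses.</p>

```python
def rank_reports(reports, concept):
    """Sort (title, text) reports by how often `concept` appears, most mentions first.

    Reports that never mention the concept are left out.
    """
    scored = [(text.lower().count(concept.lower()), title) for title, text in reports]
    ranked = sorted(scored, key=lambda item: -item[0])
    return [title for hits, title in ranked if hits]


# Hypothetical reports, for illustration only.
reports = [
    ("Report B", "The vendor announced acquisitions."),
    ("Report A", "Acquisitions drove growth. More acquisitions are planned."),
    ("Report C", "Pricing stayed flat."),
]

print(rank_reports(reports, "acquisitions"))
```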
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">And by the way, when a user is presented with a group of documents that are conceptually related to his or her research interest with a meaningful indication of the concepts the documents contain, the user persists and actually reads more reports than when the user is forced to use the <em>hope for amazing good luck</em> strategy.</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot"><strong>Time To Insight</strong></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">When the above capabilities are taken together, meaning extraction permits strategy analysts, market planners, product managers, and competitive intelligence professionals to understand markets, technologies, and competitors more thoroughly and more rapidly.<span>  </span></span><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"> </p>
<p class="MsoNormal" style="margin: 0in 0in 0pt"><span style="font-size: 11pt;font-family: &quot;Calibri&quot;,&quot;sans-serif&amp;quot">The key benefit: meaning extraction significantly reduces the <em>time to insight</em>.</span></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.northernlight.com/ceo/2009/03/24/meaning-extraction-for-business-strategy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.northernlight.com/ceo/2009/03/24/meaning-extraction-for-business-strategy/</feedburner:origLink></item>
		<item>
		<title>How Does Meaning Extraction Actually Work?</title>
		<link>http://feedproxy.google.com/~r/DavidSeuss/~3/92jP9fC30A8/</link>
		<comments>http://blogs.northernlight.com/ceo/2009/01/06/how-does-meaning-extraction-actually-work/#comments</comments>
		<pubDate>Tue, 06 Jan 2009 19:42:27 +0000</pubDate>
		<dc:creator>David Seuss</dc:creator>
				<category><![CDATA[Case Studies]]></category>
		<category><![CDATA[MI Analyst]]></category>
		<category><![CDATA[SinglePoint]]></category>

		<guid isPermaLink="false">http://blogs.northernlight.com/ceo/2009/01/06/how-does-meaning-extraction-actually-work/</guid>
		<description><![CDATA[Meaning extraction. What exactly is that, you ask?  Other than a catch phrase, one of those unique combinations of words that marketing folks crave to have associated with their brand. I submit this definition of “meaning extraction” for your consideration:
Meaning extraction is an emerging technology that identifies elements of information and concepts contained within documents [...]]]></description>
			<content:encoded><![CDATA[<p><font size="2" face="Arial">Meaning extraction.</font> <font size="2" face="Arial">What exactly is that, you ask?  Other than a catch phrase, one of those unique combinations of words that marketing folks crave to have associated with their brand.</font> <font size="2" face="Arial">I submit this definition of “meaning extraction” for your consideration:</font><font size="2" face="Arial"></p>
<blockquote><p>Meaning extraction is an emerging technology that identifies elements of information and concepts contained within documents and document repositories, and surfaces combinations of these informative elements and concepts that imply meaning in the context of the business, professional, or technical purpose of the search process.  Meaning extraction applied to search applications dramatically improves and accelerates a searcher’s ability to gain insight into a topic and answer specific research questions.</p></blockquote>
<p>Since the above has a theoretical tone that drains all the flash and boom from the concept, I thought it might be useful to provide a real-world example from the pharmaceutical industry.   But as the private meaning extraction applications Northern Light operates for its clients cannot be viewed by anyone but our clients, I will illustrate meaning extraction using a database of research documents that is publicly available.   </p>
<p><span id="more-13"></span></p>
<p>The National Institutes of Health (NIH) operates PubMed, a research database of journal abstracts that is freely available to researchers in life sciences.  PubMed indexes the abstracts of over 5,000 journals and 18 million scientific articles.  Life sciences researchers can easily access PubMed (<a href="http://www.ncbi.nlm.nih.gov/pubmed/">http://www.ncbi.nlm.nih.gov/pubmed/</a>) and execute searches using the standard keyword search techniques familiar to anyone who works with a web search engine such as Google.  PubMed returns relevance-ranked lists of documents that match the search criteria, in the traditional manner of search engines.  Over 100,000 searches per month are carried out on PubMed.</p>
<p>Separately, the NIH maintains a controlled vocabulary of life sciences terms under its Medical Subject Headings (MeSH) program.  MeSH consists of sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity.  There are thousands of descriptors in MeSH.  Article citations in PubMed are indexed using MeSH and knowledgeable users that understand the structure and term lists of MeSH can use terms from MeSH to search PubMed at multiple levels of aggregation since MeSH is a hierarchical system with inheritance.</p>
<p>As useful as the above system is, it suffers from a severe limitation in that text analytics have not been applied to the PubMed document repository nor to the search technology supporting it.  If you do a search using a text string as a query, you will get a traditional search engine result in the form of a list of documents that contain the search terms.  As with most search engines, these lists of search results are dauntingly long.  And worse, the summary information on the search results often, perhaps <em>most often</em>, provides little help in deciding if the document would actually be helpful.   For example, assume a researcher is interested in finding out what diseases and drugs are related to air pollution.  Using PubMed&#8217;s search engine, here are the first five hits on a search on &#8216;air pollution&#8217;:</p>
<p>1: Responses of herbaceous plants to urban air pollution: Effects on growth, phenology and leaf surface characteristics. Honour SL, B Bell JN, Ashenden TW, Cape JN, Power SA. Environ Pollut. 2008 Dec 29. [Epub ahead of print] PMID: 19117655 [PubMed - as supplied by publisher]</p>
<p>2: Long-Term Exposure to Road Traffic Noise and Myocardial Infarction. Selander J, Nilsson ME, Bluhm G, Rosenlund M, Lindqvist M, Nise G, Pershagen G. Epidemiology. 2008 Dec 29. [Epub ahead of print] PMID: 19116496 [PubMed - as supplied by publisher]</p>
<p>3: Urinary 8-oxodeoxyguanosine levels in children exposed to air pollutants. Svecova V, Rossner P Jr, Dostal M, Topinka J, Solansky I, Sram RJ. Mutat Res. 2008 Dec 9. [Epub ahead of print] PMID: 19114049 [PubMed - as supplied by publisher]</p>
<p>4: Air pollution and mutations in the germline: are humans at risk? Somers CM, Cooper DN. Hum Genet. 2008 Dec 27. [Epub ahead of print] PMID: 19112582 [PubMed - as supplied by publisher]</p>
<p>5: Emissions investigation for a novel medical waste incinerator. Xie R, Li WJ, Li J, Wu BL, Yi JQ. J Hazard Mater. 2008 Nov 18. [Epub ahead of print]</p>
<p>    Etc.</p>
<p>There are over 33,000 hits in all.  As you can see, there is no way to actually answer the research question without examining each document in detail.  Good luck!</p>
<p>One idea might be to use snippets of the full text on the search result that would show search terms in context.  Snippets are useful on short documents like web pages and news stories where two to five snippets might well represent the document, but are of greatly reduced value in long documents like research reports. Snippets inherently make a selection of text excerpts to display using some algorithm set by the search technology developer, and only a small number of snippets can be practically displayed for any set of search terms and documents.  When there are far more references to a text string in a document than the number of snippets that can be displayed, the ability of snippets to represent the document in any meaningful way declines sharply.</p>
<p>The intellectual frontier in search is meaning extraction.  To support the excellent and comprehensive content in PubMed for life sciences research using meaning extraction, the following steps would be required:</p>
<ul>
<li>Create full-text, metadata, and phrase indexes of the PubMed documents. </li>
<li>Convert MeSH terms to forms suitable for entity extraction/text analytics.</li>
<li>Extract entities from the PubMed document text using the converted MeSH vocabularies.</li>
<li>Create word, phrase, and entity proximity indexes of the PubMed documents. </li>
<li>Specify algorithms that can be used by the text analytics technology to discover knowledge.</li>
<li>Embody the indexes, extracted entities, proximity intelligence, and analytical algorithms in a user-friendly application that can be used by researchers.</li>
</ul>
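<p>As a rough illustration of the entity extraction step in the list above, the Python sketch below matches a small controlled vocabulary against document text and records where each term occurs.  The three-entry vocabulary and the sample sentence are hypothetical stand-ins; a real application would load terms derived from MeSH, and this simple longest-match lookup is an assumption rather than a description of how MI Analyst does it.</p>

```python
import re

# Hypothetical mini-vocabulary: lowercased phrase -> (canonical name, entity type).
VOCABULARY = {
    "asthma": ("Asthma", "Disease"),
    "pulmonary fibrosis": ("Pulmonary Fibrosis", "Disease"),
    "insulin": ("Insulin", "Drug"),
}


def extract_entities(text, vocabulary):
    """Return (word_position, canonical_name, entity_type) for each phrase found."""
    words = re.findall(r"[a-z0-9\-]+", text.lower())
    longest = max(len(phrase.split()) for phrase in vocabulary)
    found = []
    for start in range(len(words)):
        # Try the longest candidate phrase first so multi-word terms win.
        for length in range(longest, 0, -1):
            phrase = " ".join(words[start:start + length])
            if phrase in vocabulary:
                name, kind = vocabulary[phrase]
                found.append((start, name, kind))
                break
    return found


abstract = ("Air pollution exposure was associated with asthma and "
            "pulmonary fibrosis; insulin use was recorded.")
for mention in extract_entities(abstract, VOCABULARY):
    print(mention)
```

<p>Keeping the word position alongside each extracted entity is what allows the proximity indexes in the next step to be built from the same pass over the text.</p>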
<p>Automated meaning discovery is enabled by the entity extraction, word and phrase indexes, proximity indexes, and analytical algorithms.  With these foundations in place, it is possible to specify algorithms that search automatically across the entire repository for meaning.  For example, an algorithm might be:</p>
<ul>
<li>Identify all two- and three-element combinations of Disease, Therapy, Drug, Gene, Protein, and Enzyme names that are within 40 words of each other in documents containing a text string specified by the researcher.</li>
</ul>
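<p>A minimal Python sketch of the two-element case of such an algorithm follows.  It assumes each document has already been reduced to a list of (word position, entity name) mentions; the function name, the sample mentions, and the brute-force pairing are illustrative assumptions, not the production method.</p>

```python
from itertools import combinations


def nearby_pairs(mentions, window=40):
    """Return the unordered entity pairs whose mentions fall within `window` words.

    mentions: list of (word_position, entity_name) tuples for one document.
    """
    pairs = set()
    for (pos_a, name_a), (pos_b, name_b) in combinations(mentions, 2):
        if name_a != name_b and abs(pos_a - pos_b) <= window:
            pairs.add(tuple(sorted((name_a, name_b))))
    return pairs


# Hypothetical mentions from a single document.
mentions = [(3, "Asthma"), (30, "Insulin"), (120, "Stroke")]

print(nearby_pairs(mentions))
print(nearby_pairs(mentions, window=100))
```

<p>The three-element case follows the same pattern with <code>combinations(mentions, 3)</code> and a check that all three positions fit inside the window.</p>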
<p>Northern Light has performed these tasks on the PubMed document repository using our meaning extraction platform, which we call MI Analyst.</p>
<p>At this point, I would like to pause and do a little combinatorial math.  Our term list for identifying elements of information in the application we have running on PubMed content currently numbers 12,281 terms and phrases.  MI Analyst examines each document for 150 million potential two-element combinations and 1.9 trillion potential three-element combinations, making this information available on each and every user query against the system.  This requires some seriously clever software engineering to accomplish while returning a search result in a second or two.  But more importantly, it illustrates the leveraging of human capacity that meaning extraction technology can bring to bear.</p>
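<p>For readers who want to check the arithmetic, the short Python sketch below computes the number of two- and three-element selections from a 12,281-term list.  The post does not say how its figures were derived; the quoted numbers appear to correspond to ordered selections (roughly 150.8 million pairs and 1.85 trillion triples), which is an inference on my part.  Unordered combinations are shown alongside for comparison.</p>

```python
import math

TERM_COUNT = 12_281  # size of the term list quoted in the post

# Ordered selections (permutations) of distinct terms.
ordered_pairs = math.perm(TERM_COUNT, 2)
ordered_triples = math.perm(TERM_COUNT, 3)

# Unordered selections (combinations) of distinct terms.
unordered_pairs = math.comb(TERM_COUNT, 2)
unordered_triples = math.comb(TERM_COUNT, 3)

print(f"ordered pairs:     {ordered_pairs:,}")
print(f"ordered triples:   {ordered_triples:,}")
print(f"unordered pairs:   {unordered_pairs:,}")
print(f"unordered triples: {unordered_triples:,}")
```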
<p>Now let&#8217;s return to our research question.  Using the PubMed database of research reports, what diseases are mentioned in the context of air pollution?  With a system like the one above, the researcher enters his or her search terms, “air pollution” in this case, and the search engine returns a list that answers the question directly.</p>
<p>For example, here is the list of diseases found by Northern Light&#8217;s MI Analyst meaning extraction application running against the PubMed database and exposed to the user via a single click on the Diseases facet on the search results list:</p>
<p>Diseases mentioned in documents with “air pollution”</p>
<ol>
<li>Asthma (462)</li>
<li>Cough (259)</li>
<li>Rhinitis (133)</li>
<li>Pneumonia (131)</li>
<li>Stroke (129)</li>
<li>Williams Syndrome (76)</li>
<li>Influenza (110)</li>
<li>Bronchitis (105)</li>
<li>Sinusitis (87)</li>
<li>Bronchial Spasm (86)</li>
<li>Silicosis (79)</li>
<li>Bronchiectasis (77)</li>
<li>Hemoptysis (73)</li>
<li>Pulmonary Fibrosis (73)</li>
<li>Atelectasis (72)</li>
<li>Bronchopulmonary Dysplasia (72)</li>
<li>Ciliary Motility Disorders (72)</li>
<li>Pulmonary Hypertension (72)</li>
<li>Anti-Glomerular Basement Membrane Disease (71)</li>
<li>Berylliosis (71)</li>
<li>Hantavirus Pulmonary Syndrome (71)</li>
<li>Lung Neoplasms (71)</li>
<li>Pulmonary Embolism (71)</li>
<li>Sleep Apnea (48)</li>
<li>Lymphangioleiomyomatosis (45)</li>
<li>Pneumothorax (45)</li>
<li>Tracheobronchomegaly (43)</li>
<li>Pleurisy (29)</li>
<li>Bronchiolitis (25)</li>
<li>Dyspnea (22)</li>
<li>Confusion (32)</li>
<li>Neurologic Disorders (17)</li>
<li>Syncope (14)</li>
<li>Deafness (13)</li>
<li>Multiple Sclerosis (10)</li>
</ol>
<p>   Etc.</p>
<p>The number following each item represents the count in the PubMed database of documents that have both the element (e.g., Deafness) and the search term (“air pollution”).  In the actual application, every entry in the above list links to the documents contributing to the result for that line item so the researcher can drill down where he or she sees items of interest.</p>
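<p>The counting behind a facet like this can be sketched in Python as follows.  The document structure, field names, and sample records are hypothetical; the sketch only shows the idea of counting, per extracted element, the documents that also contain the search string.</p>

```python
from collections import Counter


def facet_counts(documents, query, facet_type):
    """Count documents that contain `query` and mention each entity of `facet_type`."""
    counts = Counter()
    for doc in documents:
        if query.lower() not in doc["text"].lower():
            continue
        # Use a set so a document counts once per entity, however often it is mentioned.
        names = {name for name, kind in doc["entities"] if kind == facet_type}
        counts.update(names)
    return counts.most_common()


# Hypothetical pre-extracted documents.
documents = [
    {"text": "Urban air pollution and childhood asthma.",
     "entities": [("Asthma", "Disease")]},
    {"text": "Air pollution, asthma and cough in adults.",
     "entities": [("Asthma", "Disease"), ("Cough", "Disease")]},
    {"text": "Insulin dosing in diabetes.",
     "entities": [("Insulin", "Drug")]},
]

print(facet_counts(documents, "air pollution", "Disease"))
```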
<p>While elements like Asthma and Rhinitis can hardly be surprising as outcomes related to air pollution,  suppose that the researcher did not already know that Williams Syndrome, Sleep Apnea, or Deafness were implicated as a consequence of air pollution.  In that case, the above results list would be a moment of revelation and discovery.   This is an example of an immediate form of meaning extraction.  By telling the researcher what is in the documents on the results lists, the search technology contributes to the user’s understanding of the topic.  The search engine has evolved from just providing document lists into an analytical tool that can assist in understanding.  This by itself is a great surge ahead.</p>
<p>But it gets better. </p>
<p>Suppose that the researcher wonders what drugs are being discussed as therapy for the diseases he or she identifies using the tool above.  For example, what drugs are being discussed in the 76 papers that mention Williams Syndrome and air pollution?  Using Northern Light MI Analyst and investing a total of three mouse clicks in the analytical process, the researcher sees that the research papers that contain “air pollution” and Williams Syndrome mention these drugs:</p>
<ol>
<li>Insulin (27)</li>
<li>Bayer ASA (8)</li>
<li>Accutane (1)</li>
<li>Decadron (1)</li>
<li>Folvite (1)</li>
<li>Mucomyst (1)</li>
<li>Neoral (1)</li>
</ol>
<p>     Etc.</p>
<p>Now, perhaps 10 seconds after starting the process, the researcher has two new ideas that he or she did not have before: that Williams Syndrome is related to air pollution and that insulin and aspirin may be common treatments for Williams Syndrome in settings where air pollution is mentioned as a factor.  The researcher might then be tempted to consider if these drugs would help with the other diseases related to air pollution, and potentially the process of knowledge creation takes over from that of meaning discovery.</p>
<p>Please imagine the effort to get to this result using traditional search engines.  It is obviously not practical for the researcher to perform exhaustive repetitive searches substituting one disease after another and then one drug name after another in the query with air pollution as there are millions of combinations of these elements.  What our researcher would really do out here in the real world is examine a relatively small sample of documents from the 33,000 on the initial search result and hope for amazing good luck in noticing something relevant, insightful, and unique.</p>
<p>But it gets even better.</p>
<p>The meaning extraction application can be directed to analyze the documents returned on a search result and identify relationships that imply meaning, surfacing those for the researcher to consider.  In our search on air pollution, here are some of the relationships that MI Analyst finds in the PubMed research database:</p>
<ol>
<li>Osteoporosis is related to Skin Disease (82)</li>
<li>Chelation Therapy is related to Williams Syndrome (75)</li>
<li>Atelectasis is related to Bronchiectasis (72)</li>
<li>Bronchiectasis is related to Hemoptysis (72)</li>
<li>Ciliary Motility Disorder is related to Dyskinesias (72)</li>
</ol>
<p>    Etc.</p>
<p>As an aside, it doesn’t require a genius looking at these results to wonder about the relationship between Atelectasis and Hemoptysis.  (MI Analyst actually suggests this as well a little further down on the search result when it considers three-element relationships.)</p>
<p>We like to call these relationships “scenarios” because at the level of MI Analyst, we cannot really tell if they are significant or spurious.  All we can say is that the relationships are there in the document repository, and we can measure how many times each one is there. </p>
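<p>Counting how many times each relationship is there can be illustrated with a small Python sketch.  It assumes every document in the search result has already been reduced to its set of extracted elements; the sample sets are invented, and plain per-document co-occurrence stands in for whatever proximity rules the real application applies.</p>

```python
from collections import Counter
from itertools import combinations


def count_scenarios(result_set):
    """Count, for each unordered pair of elements, the documents containing both.

    result_set: one set of element names per document on the search result.
    """
    scenarios = Counter()
    for entities in result_set:
        for pair in combinations(sorted(entities), 2):
            scenarios[pair] += 1
    return scenarios.most_common()


# Hypothetical element sets for three documents.
result_set = [
    {"Atelectasis", "Bronchiectasis", "Hemoptysis"},
    {"Atelectasis", "Bronchiectasis"},
    {"Bronchiectasis", "Hemoptysis"},
]

for pair, count in count_scenarios(result_set):
    print(pair, count)
```

<p>The counts say only that the two elements co-occur, which is exactly why the results are offered as scenarios to investigate and not as conclusions.</p>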
<p>MI Analyst identifies these scenarios and presents them to the researcher as possibly worthy of follow-up.  For example, with a few more mouse clicks MI Analyst facilitates investigation of whether there are common elements contributing to the relationship in the form of overlapping genes or proteins or other elements. The identification of the relationships is done automatically for the researcher, without any specific direction other than the initial restriction, in this case, to documents with the text string “air pollution.”  After that, the meaning extraction application analyzes all the text in all the documents of interest and finds the elements and the relationships between them. </p>
<p>In many cases, the researcher will already know about the relationship, and in these cases meaning extraction is helping the researcher narrow down a document list to those that contain the scenarios he or she finds the most interesting. </p>
<p>But in some cases, the researcher will not have previously considered the relationship that is identified, and then breakthroughs are enabled.  I have been present in the room when a lead researcher for a major pharmaceutical firm spontaneously reacted to the scenarios presented on the Northern Light MI Analyst results list he was looking at with the exclamation:  “This is an Ah Ha Moment!” </p>
<p>Now that’s meaning extraction.</p>
<p></font></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.northernlight.com/ceo/2009/01/06/how-does-meaning-extraction-actually-work/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.northernlight.com/ceo/2009/01/06/how-does-meaning-extraction-actually-work/</feedburner:origLink></item>
		<item>
		<title>IT in crisis</title>
		<link>http://feedproxy.google.com/~r/DavidSeuss/~3/GRxVePmarbI/</link>
		<comments>http://blogs.northernlight.com/ceo/2008/12/02/it-in-crises/#comments</comments>
		<pubDate>Tue, 02 Dec 2008 20:34:15 +0000</pubDate>
		<dc:creator>David Seuss</dc:creator>
				<category><![CDATA[Enterprise IT]]></category>

		<guid isPermaLink="false">http://blogs.northernlight.com/ceo/2008/12/02/it-in-crises/</guid>
		<description><![CDATA[IT organizations are in crisis.  In the name of control, standards, security, and economies of scale centralized IT organizations have taken over many tasks that were previously managed in whole and in part by business units and functional departments.  As a result, IT budgets have been growing faster than corporate revenues for many years.  But [...]]]></description>
			<content:encoded><![CDATA[<p><font size="2"><font face="Arial">IT organizations are in crisis.  In the name of control, standards, security, and economies of scale, centralized IT organizations have taken over many tasks that were previously managed in whole or in part by business units and functional departments.  As a result, IT budgets have been growing faster than corporate revenues for many years.  But that task was never doable; no single organization can support the nuance of every use-case across an entire enterprise with a toolkit of, by necessity, standardized lowest-common-denominator solutions.  And now that the financial crisis of 2008 is hitting 2009 corporate budgets, cutting deeply into IT resources, what was undoable has become unimaginable.  The massively centralized IT process has become the bottleneck that affects every project, product, operational initiative and revenue plan, freezing companies into the glacier.  </font></font></p>
<p><font size="2"><font face="Arial">The solution is to change the mindset of IT from trying to execute every project and manage every application to performing an active role of facilitator within a decentralized IT model.  In this model, many, if not most, of the applications that business units and departments need are carried out by the business units themselves, using IT as a trusted advisor.   Business unit and departmental level applications are therefore executed by “local” (as opposed to “centralized”) staff working in the organizational units that both need an application and understand its peculiarities.  This local staff may not even carry IT job titles; they may be functional and departmental professionals using inside and outside IT specialists as appropriate and cost-effective.  IT then becomes a value-add resource in this process:  consulting, advising, critiquing plans, setting requirements for security, and vetting outside resources.  </font></font></p>
<p><font size="2"><font face="Arial">Some enterprise computing applications, such as financial reporting and email, will always be the responsibility of a centralized IT organization. But many enterprise computing projects can be successfully handed down to business units and functional departments where the affected business managers can determine project priority and investment-worthiness against other needs at the business unit level.  The advantage of this approach is that projects that pass muster with the business managers can actually be carried out in a timeframe consistent with the business needs.   The status-quo alternative to this new model is an endless backlog in IT, with the resulting loss of competitive advantage and consequent business decline as other companies achieve higher levels of IT agility and responsiveness.  </font></font></p>
<p><font size="2"><font face="Arial">We can already see early indicators of the reallocation of application responsibility back to business units and departments in the high growth rates of Software as a Service (SaaS) and custom applications that are based on open source.  IT does not have to fail in the new economic environment we find ourselves in; there is no inescapable Fate here as in a Greek tragedy.  IT can succeed &#8212; and succeed spectacularly &#8212; by embracing the new paradigm.  The IT strategies of highly effective competitors in many industries are already evolving in this direction.  In 2009, we will see the mind shift in IT organizations accelerate toward a more decentralized model as the most successful companies stretch their lead over those that cling to the old ways.   </font></font></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.northernlight.com/ceo/2008/12/02/it-in-crises/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.northernlight.com/ceo/2008/12/02/it-in-crises/</feedburner:origLink></item>
		<item>
		<title>Why meaning extraction?</title>
		<link>http://feedproxy.google.com/~r/DavidSeuss/~3/Os0cPGaIqV0/</link>
		<comments>http://blogs.northernlight.com/ceo/2008/12/02/why-meaning-extraction/#comments</comments>
		<pubDate>Tue, 02 Dec 2008 20:31:11 +0000</pubDate>
		<dc:creator>David Seuss</dc:creator>
				<category><![CDATA[MI Analyst]]></category>

		<guid isPermaLink="false">http://blogs.northernlight.com/ceo/2008/12/02/why-meaning-extraction/</guid>
		<description><![CDATA[Every year, for enterprise clients, Northern Light provides aggregation and search for 750,000 market research reports with a value of $1 billion (if you bought each report at its list price) from 80 of the leading analyst firms.  On a late night two summers ago I was sitting alone, brooding about the future of information technology, as [...]]]></description>
			<content:encoded><![CDATA[<p>Every year, for enterprise clients, Northern Light provides aggregation and search for 750,000 market research reports with a value of $1 billion (if you bought each report at its list price) from 80 of the leading analyst firms.  On a late night two summers ago I was sitting alone, brooding about the future of information technology, as I often do.   The thought occurred to me that if I could read every report we aggregate each year, then I would be a lot smarter about this question, or at least much better informed.  Well, I mused, I cannot read 750,000 reports, <em>but the computer can.</em></p>
<p>In that moment, the future just reached back and hit me.  What if search engines could read all the market intelligence documents a researcher has access to, identify the business issues reported on, suggest the trends, flag the threats, highlight the opportunities, and distinguish those documents that are the most important, not from a search relevance perspective, but from a <u>meaning</u><em> </em>perspective?</p>
<p>For example, what if you could conduct a search on one of your product lines and have the search engine zero in on the reports that describe threats to your company’s market share or pricing strategy?   What if you could feed the search engine a company name and have it provide you not with a list of documents, but with a report that highlights the company’s corporate strategy, business position, and opportunities and challenges?</p>
<p><span id="more-8"></span></p>
<p>This thought led us to launch the development project that culminated in <em>MI Analyst</em>, which is to my knowledge the first search engine providing automated analysis and discovery of business meaning from large stores of market intelligence content. </p>
<p>In the past, there have been many attempts to improve the intelligence of search engines, and these attempts have generally failed.  For example:</p>
<ul>
<li>Increase the size of the database.  All users and search engine journalists believe that bigger is better in search engine databases.  If one is looking for a fact, the bigger the database the better the chance of finding the fact.  For example, if I want the phone number of my local pharmacy, I hope the search engine searches billions of documents and that the results summary of the first hit has the number I need, hopefully with the phone number in bold type.  But unlimited raw document count actually hurts the goal of the search if one is looking for analysis, commentary, and perspective on, say, a business issue.  The more documents searched that contain uninformed opinion, the lower the density of  quality search results, which means the user never sees some or a lot of the best material from the most informed commentators.  Database size has failed to make search engines more intelligent, and in many cases makes them dumber.</li>
<li>Expand the search semantically.  The argument put forward by the proponents of semantic search is that related terms can improve the recall of relevant documents.  So, for example, if an analysis of the corpus of material reveals that “federal budget” is often found with “federal spending,”  then a search on “federal budget” that is expanded by the search engine to include a search for “federal spending” will find documents that have just the second term but not the first.  The argument of supporters of semantic search is that it helps users by giving them this extra increment of documents.  The problem here is that the search engine is not doing anything more than expanding the list of search results, and most searches result in far more hits than any user has a chance of considering.  If the user is only going to look at 10-30 hits anyway, what exactly is the value of expanding the total list of hits from 500,000 to 600,000?  Zero, I would argue.  Semantic search does not make search engines smarter, just more daunting to work with.</li>
<li>Parse the query with natural language processing.  The idea here is that if people could ask questions in plain English that the search engine could parse, more on-target results could be returned.  After all, if you ask a human researcher a question in English, you get back a cogent reply, so why not apply this idea to search engines?  But I would submit that the designers of natural language parsing solutions are not examining the actual reason why natural language questioning of a human researcher actually works.  It is not that the researcher simply understands the question; it is that the researcher takes the question and then examines relevant material, identifies the trends, summarizes the issues, and constructs a conceptual framework for relating bits of information back to the original question.  These are the real things that make the process work when a human researcher is asked a question in natural language.  What search engine natural language front-ends do is take the user’s question, parse it, intelligently structure the question into a well-formed Boolean query, and then turn it over to the usual search process to spawn a long list of hits.  All of the intelligent processing is done on the question, not on the answer.  The user gets back a traditional search engine results page, and is abandoned to the usual process of manually examining the documents in order to figure out what they mean.  Natural language processing does not make search engines more intelligent; it just makes it easier to communicate with the dumb back-end.</li>
<li>Reorganize the UI or trick-up the results list display.  We have all witnessed the stream of UI tweaks from one company or another, like showing hits from images and videos as well as text pages.  Or how about graphical tools for displaying search results, for example, as blobs with connecting lines showing related websites or subjects?  The fact is that these efforts are superficial; the search engines in question are not interpreting the documents meaningfully, they are just finding jazzy ways of showing you the same old dumb results list of documents for you to examine.</li>
</ul>
<p>Pretty much all of these efforts have failed to transform the search experience when one is trying to understand a complicated subject in depth.  One of my favorite marketing slogans is the ever-popular <em>stop searching and start finding</em>, which has been used by dozens of companies over the years (marketing people sometimes appear to have no “industry memory,” so they keep inventing the same marketing campaign over and over).  Despite all this flash and smoke, search today is pretty much like it was in 1994.  You enter a search term, get a list of documents, and then have at it, personally examining each document one at a time trying to sort it all out.  While there have been improvements in relevance ranking, and in the ability of search engines to find facts due to database size, the ability of search to assist in meaningful analysis is at its 1994 level, or at least was until recently, when applications involving text analytics like <em>MI Analyst</em> started to appear.</p>
<p>I know of two applications other than <em>MI Analyst</em> that turn search into an analytical process.  One is a pharmaceutical solution that identifies candidate pathways for drug research by looking for terms that are near each other; the other is available on the Web: Google’s research project on flu trends at <a href="http://www.google.org/flutrends"><font color="#585d8b">http://www.google.org/flutrends</font></a>.  Flu Trends looks for searches on terms like “flu,” “cough,” “sneeze,” etc. and, using the frequency of the search terms and the IP addresses of the searchers, constructs trend and geographic information on the progress of the annual flu epidemic.  Google has announced that they worked with experts at the CDC in selecting the terms to analyze.  When you use Flu Trends, you do not get a list of documents to read; you get an analysis of user behavior on the search engine, with charts, graphs, and maps, that is a proxy for the underlying flu season.</p>
<p>The text analytic solutions that work have one component in common: they are built (or assisted) by people who understand the research purpose of the search and who can use the search process to facilitate that research purpose by providing frameworks, analytic routines, and algorithms for interpreting textual information.  This is not the job of horizontal, one-technology-fits-all search companies.  What does one care about when researching a competitor, how does one tell that a new treatment strategy might work, where are the problems and how severe are they?  These are questions of analysis, not of search, or at least not of search alone.</p>
<p>Search engines must evolve to have an in-depth understanding of the searched material.  Beyond search, categorization, faceted navigation, and entity extraction, which we all understand by this point, the future of search is <em>meaning extraction</em>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.northernlight.com/ceo/2008/12/02/why-meaning-extraction/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.northernlight.com/ceo/2008/12/02/why-meaning-extraction/</feedburner:origLink></item>
		<item>
		<title>ROI for Market Research Portals</title>
		<link>http://feedproxy.google.com/~r/DavidSeuss/~3/GN10A-GjMuk/</link>
		<comments>http://blogs.northernlight.com/ceo/2008/12/02/roi-for-market-research-portals/#comments</comments>
		<pubDate>Tue, 02 Dec 2008 20:29:29 +0000</pubDate>
		<dc:creator>David Seuss</dc:creator>
				<category><![CDATA[Case Studies]]></category>
		<category><![CDATA[SinglePoint]]></category>

		<guid isPermaLink="false">http://blogs.northernlight.com/ceo/2008/12/02/roi-for-market-research-portals/</guid>
		<description><![CDATA[When it comes to SinglePoint  market research portal implementations, our clients have experienced quantifiable savings in a number of areas that are well worth remembering when the time comes to build the corporate case.  Since we get the ROI question all the time, I thought I would share what we know about this topic with you.  Here [...]]]></description>
			<content:encoded><![CDATA[<p><font size="2"><font face="Arial">When it comes to <em>SinglePoint</em>  market research portal implementations, our clients have experienced quantifiable savings in a number of areas that are well worth remembering when the time comes to build the corporate case.  Since we get the ROI question all the time, I thought I would share what we know about this topic with you.  Here are some things to consider:</font></font><font size="2" face="Arial"> </font></p>
<ul>
<li><font size="2"><font face="Arial"><u>Cost of time saved</u>.  </font></font><font size="2" face="Arial">Our experience is that, even when estimated conservatively, the value of time saved can be pretty staggering once you actually do the calculation.  For example, our client Verizon estimates that its <em>SinglePoint</em> portal saves 1.5 hours per user session.  Our average client runs 36,000 user sessions per year, and saving 1.5 hours per user session would total 54,000 hours of saved professional time per year.  One can cost out professional time at perhaps $100 per hour fully loaded, so such a savings could be valued at $5.4 million per year.  Yikes!</font></li>
<li><font size="2"><font face="Arial"><u>Improved decision making.</u>  Without a comprehensive research portal that is simple to access and use, the fact is that most users give up and stop looking for the best information that bears on their projects and just make do with whatever they can find.  One of our clients estimated that, before their <em>SinglePoint</em> portal was implemented, critical strategic research projects were often attempted using Google’s Web search engine for an average of six hours of research, with the result most often being failure to find relevant, high-quality information.  With <em>SinglePoint</em>, users have the best information available on every search easily and quickly, and even more important than the time they save is that their business analyses are well-informed.  What is the value of having better researched and analyzed business decisions?  It could easily dwarf all the millions of dollars of savings combined from all the other ROI considerations.</font></font></li>
<li><font size="2"><font face="Arial"><u>Supporting large numbers of users with limited staff.</u>  In this era of budget cuts and staff reductions, a self-service market research portal makes it feasible to support a wide audience of consumers of research with a very limited internal staff.  For example, one of our clients has one person supporting 5,000 users of secondary research via a <em>SinglePoint</em> portal.  Another client has 5 people supporting over 50,000 users of secondary research using <em>SinglePoint</em>.  In another case, 6 internal market research professionals produce important original research reports and publish them to an audience of 300 users in the marketing department via the <em>SinglePoint</em> portal, without having to divert precious researcher time from actually doing the research in order to field numerous requests from users for reports that already exist.  In all of these examples, without the <em>SinglePoint</em> portal, significantly more staff would be required to help the users of research find the research that had been purchased or created for them.</font></font></li>
<li><font size="2"><font face="Arial"><u>Obtaining new business</u>.  Each organization has a good idea of what value a closed sale might bring to the table or what being able to rapidly respond to a competitive situation means to a company.  Most of Northern Light’s clients use <em>SinglePoint</em> to prepare for sales presentations and customer briefings, and to assess their product roadmaps to make them more competitive.  Our client HP requested success stories from its sales and marketing staff to document the contribution of HP’s <em>SinglePoint</em> (which they call MarketVision) to business and customer wins, and so many poured in that they had to stop collecting them because the market research staff did not have time to read them all!  Suffice it to say that many millions of dollars in wins were identified in which HP’s <em>SinglePoint</em> directly supported the sales and marketing teams.</font></font></li>
<li><font size="2"><font face="Arial"><u>Reduced number of websites/portals</u>.  It is not uncommon for departments and functional groups within an organization to provision and field numerous websites, each facing a different audience such as Sales, Marketing, or Product Management.  HP has spoken publicly of having saved many millions of dollars when <em>SinglePoint</em> enabled them to unify 150 distinct intranet sites that hosted research in all their divisions around the world.  On the IT side alone, they saved in excess of $1 million in year one in hardware and administration.</font></font></li>
<li><font size="2"><font face="Arial"><u>Intellectual property issues and fair usage.</u>  It is easy for users to unknowingly violate the usage terms of their agreements with sources of secondary research, an exposure that no large company wants.  This happens when users post documents to multiple internal portals without any system for enforcing licensing arrangements.  For example, Northern Light was told of one company, not a <em>SinglePoint</em> user, that was presented with a $460,000 bill from a market research provider for a single report that a well-meaning but naive employee posted to a departmental website for general consumption without access controls reflecting the report’s seat license business rules.  Northern Light <em>SinglePoint</em> enforces the terms of the content licensed and frees organizations from day-to-day concern with such usage issues.</font></font></li>
<li><font size="2"><font face="Arial"><u>Consolidated purchasing of information</u>.  Duplicated and underutilized information contracts are the norm in many large organizations.  <em>SinglePoint</em> makes consolidated licensing and enterprise-wide sharing of purchased content a practical goal.  For the many clients who have adopted this approach, the savings are substantial.  HP has estimated that its <em>SinglePoint</em> portal saves them over $1 million per year in avoiding duplicate research purchases in their operations around the world.</font></font></li>
<li><font size="2"><font face="Arial"><u>Primary research savings</u>.  Primary market research projects are, by nature, strategic and closely held ventures.  And they are very expensive undertakings.  Often, the research has already been performed but the reports are neither widely known nor findable since they are scattered on network folders and laptops.  <em>SinglePoint</em> can consolidate primary research into a single repository and make it available to authorized users throughout the organization.  This can eliminate the need for duplicate primary research, saving substantial amounts of money and increasing the impact of primary research that has been performed.</font></font></li>
</ul>
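<p><font size="2" face="Arial">If you want to rerun the time-savings arithmetic from the first bullet with your own numbers, here is a minimal sketch in Python.  The function and parameter names are hypothetical and chosen only for illustration; the inputs shown are the figures quoted above (36,000 sessions per year, 1.5 hours saved per session, roughly $100 per hour fully loaded), and your own organization's values will differ.</font></p>

```python
def annual_time_savings(sessions_per_year, hours_saved_per_session, hourly_cost):
    """Return (hours saved per year, dollar value of those hours).

    All three inputs are estimates supplied by the reader; this is a
    back-of-the-envelope calculation, not a formal ROI model.
    """
    hours_saved = sessions_per_year * hours_saved_per_session
    dollar_value = hours_saved * hourly_cost
    return hours_saved, dollar_value


# Figures quoted in the "Cost of time saved" bullet above.
hours, dollars = annual_time_savings(
    sessions_per_year=36_000,
    hours_saved_per_session=1.5,
    hourly_cost=100,
)
print(f"{hours:,.0f} hours saved, valued at ${dollars:,.0f} per year")
```

<p><font size="2" face="Arial">Swapping in a more conservative hours-saved figure or a different hourly cost is a quick way to see how sensitive the result is to each assumption.</font></p>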
<p><span id="more-7"></span></p>
<p><font size="2" face="Arial">Hope this helps!   Let me know if you have any stories to add.</font></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.northernlight.com/ceo/2008/12/02/roi-for-market-research-portals/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blogs.northernlight.com/ceo/2008/12/02/roi-for-market-research-portals/</feedburner:origLink></item>
	</channel>
</rss>
