<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" version="2.0">
<channel>
	<title>Comments for Andrei Zmievski</title>
	
	<link>http://zmievski.org</link>
	<description>Life, technology, and other good things</description>
	<lastBuildDate>Mon, 20 May 2013 16:48:08 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.1</generator>
	<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/gravitonic-comments-rss2" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="gravitonic-comments-rss2" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Comment on Duplicates Detection with ElasticSearch by directory of professional uk companies</title>
		<link>http://zmievski.org/2011/03/duplicates-detection-with-elasticsearch/comment-page-1#comment-2817</link>
		<dc:creator>directory of professional uk companies</dc:creator>
		<pubDate>Mon, 20 May 2013 16:48:08 +0000</pubDate>
		<guid isPermaLink="false">http://zmievski.org/?p=1122#comment-2817</guid>
		<description>Amazing blog! Is your theme custom made or did you download it from somewhere?
A theme like yours with a few simple adjustments would really make my blog stand out.
Please let me know where you got your theme.
Bless you</description>
		<content:encoded><![CDATA[<p>Amazing blog! Is your theme custom made or did you download it from somewhere?<br />
A theme like yours with a few simple adjustments would really make my blog stand out.<br />
Please let me know where you got your theme.<br />
Bless you</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Duplicates Detection with ElasticSearch by citebuzz.com</title>
		<link>http://zmievski.org/2011/03/duplicates-detection-with-elasticsearch/comment-page-1#comment-2816</link>
		<dc:creator>citebuzz.com</dc:creator>
		<pubDate>Fri, 17 May 2013 15:58:58 +0000</pubDate>
		<guid isPermaLink="false">http://zmievski.org/?p=1122#comment-2816</guid>
		<description>The write-up offers proven necessary to me. It’s quite informative and you really are certainly quite experienced in this region.
You possess opened up my own sight in order to various views on this 
kind of topic along with intriguing, notable and strong articles.</description>
		<content:encoded><![CDATA[<p>The write-up offers proven necessary to me. It’s quite informative and you really are certainly quite experienced in this region.<br />
You possess opened up my own sight in order to various views on this<br />
kind of topic along with intriguing, notable and strong articles.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Duplicates Detection with ElasticSearch by body Vitamins minerals</title>
		<link>http://zmievski.org/2011/03/duplicates-detection-with-elasticsearch/comment-page-1#comment-2815</link>
		<dc:creator>body Vitamins minerals</dc:creator>
		<pubDate>Thu, 16 May 2013 04:49:37 +0000</pubDate>
		<guid isPermaLink="false">http://zmievski.org/?p=1122#comment-2815</guid>
		<description>An outstanding share! I've just forwarded this onto a coworker who was doing a little homework on this. And he actually ordered me lunch simply because I discovered it for him... lol. So allow me to reword this.... Thank YOU for the meal!! But yeah, thanx for spending the time to discuss this issue here on your web page.</description>
		<content:encoded><![CDATA[<p>An outstanding share! I&#8217;ve just forwarded this onto a coworker who was doing a little homework on this. And he actually ordered me lunch simply because I discovered it for him&#8230; lol. So allow me to reword this&#8230;. Thank YOU for the meal!! But yeah, thanx for spending the time to discuss this issue here on your web page.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Duplicates Detection with ElasticSearch by vinyl34linen.xanga.com</title>
		<link>http://zmievski.org/2011/03/duplicates-detection-with-elasticsearch/comment-page-1#comment-2813</link>
		<dc:creator>vinyl34linen.xanga.com</dc:creator>
		<pubDate>Tue, 14 May 2013 02:41:02 +0000</pubDate>
		<guid isPermaLink="false">http://zmievski.org/?p=1122#comment-2813</guid>
		<description>Thanks for sharing your thoughts on partypoker bonus code july.

Regards</description>
		<content:encoded><![CDATA[<p>Thanks for sharing your thoughts on partypoker bonus code july.</p>
<p>Regards</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Duplicates Detection with ElasticSearch by Kerrie</title>
		<link>http://zmievski.org/2011/03/duplicates-detection-with-elasticsearch/comment-page-1#comment-2812</link>
		<dc:creator>Kerrie</dc:creator>
		<pubDate>Tue, 14 May 2013 00:13:10 +0000</pubDate>
		<guid isPermaLink="false">http://zmievski.org/?p=1122#comment-2812</guid>
		<description>Your own write-up provides established necessary to 
me. It’s very helpful and you're simply certainly really experienced of this type. You have popped our eye to different opinion of this particular subject together with interesting and strong articles.</description>
		<content:encoded><![CDATA[<p>Your own write-up provides established necessary to<br />
me. It’s very helpful and you&#8217;re simply certainly really experienced of this type. You have popped our eye to different opinion of this particular subject together with interesting and strong articles.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Duplicates Detection with ElasticSearch by Formal Gowns</title>
		<link>http://zmievski.org/2011/03/duplicates-detection-with-elasticsearch/comment-page-1#comment-2811</link>
		<dc:creator>Formal Gowns</dc:creator>
		<pubDate>Mon, 13 May 2013 07:40:16 +0000</pubDate>
		<guid isPermaLink="false">http://zmievski.org/?p=1122#comment-2811</guid>
		<description>I believe that is one of the so much vital information for me.

And I am happy studying your article. But should statement on some basic issues, The site taste is great, the articles is really excellent :D. Good process, cheers</description>
		<content:encoded><![CDATA[<p>I believe that is one of the so much vital information for me.</p>
<p>And I am happy studying your article. But should statement on some basic issues, The site taste is great, the articles is really excellent :D. Good process, cheers</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Duplicates Detection with ElasticSearch by candy crush saga cheats</title>
		<link>http://zmievski.org/2011/03/duplicates-detection-with-elasticsearch/comment-page-1#comment-2810</link>
		<dc:creator>candy crush saga cheats</dc:creator>
		<pubDate>Sun, 12 May 2013 07:13:05 +0000</pubDate>
		<guid isPermaLink="false">http://zmievski.org/?p=1122#comment-2810</guid>
		<description>Hi there to all, as I am actually keen of reading 
this weblog's post to be updated on a regular basis. It carries pleasant material.</description>
		<content:encoded><![CDATA[<p>Hi there to all, as I am actually keen of reading<br />
this weblog&#8217;s post to be updated on a regular basis. It carries pleasant material.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Duplicates Detection with ElasticSearch by thermogenics</title>
		<link>http://zmievski.org/2011/03/duplicates-detection-with-elasticsearch/comment-page-1#comment-2809</link>
		<dc:creator>thermogenics</dc:creator>
		<pubDate>Sun, 12 May 2013 00:57:52 +0000</pubDate>
		<guid isPermaLink="false">http://zmievski.org/?p=1122#comment-2809</guid>
		<description>This is really interesting, You are a very skilled 
blogger. I've joined your feed and look forward to seeing more of your fantastic posts. Also, I've shared your site in my 
social networks!</description>
		<content:encoded><![CDATA[<p>This is really interesting, You are a very skilled<br />
blogger. I&#8217;ve joined your feed and look forward to seeing more of your fantastic posts. Also, I&#8217;ve shared your site in my<br />
social networks!</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Duplicates Detection with ElasticSearch by Andrei</title>
		<link>http://zmievski.org/2011/03/duplicates-detection-with-elasticsearch/comment-page-1#comment-2788</link>
		<dc:creator>Andrei</dc:creator>
		<pubDate>Wed, 06 Mar 2013 03:52:40 +0000</pubDate>
		<guid isPermaLink="false">http://zmievski.org/?p=1122#comment-2788</guid>
		<description>Your project looks quite interesting, and I'm sure it produces better quality results than my approach. I wish I had run across it at the time. But I wanted to keep the number of pieces of technology as small as possible, and using ElasticSearch was a quick &amp; dirty approach that seemed to yield decent results.</description>
		<content:encoded><![CDATA[<p>Your project looks quite interesting, and I&#8217;m sure it produces better quality results than my approach. I wish I had run across it at the time. But I wanted to keep the number of pieces of technology as small as possible, and using ElasticSearch was a quick &#038; dirty approach that seemed to yield decent results.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Duplicates Detection with ElasticSearch by Lars Marius Garshol</title>
		<link>http://zmievski.org/2011/03/duplicates-detection-with-elasticsearch/comment-page-1#comment-2786</link>
		<dc:creator>Lars Marius Garshol</dc:creator>
		<pubDate>Sat, 02 Mar 2013 09:19:08 +0000</pubDate>
		<guid isPermaLink="false">http://zmievski.org/?p=1122#comment-2786</guid>
		<description>It's interesting to see that you chose to approach this problem by simply querying the search engine directly. I'm surprised your results seem so good, because generally this is a tricky problem, where information from many fields (name, address, phone number, geoposition, etc.) all needs to be considered and weighed against one another.

I had the same problem and chose to build &lt;a href="http://code.google.com/p/duke/" rel="nofollow"&gt;a full record linkage engine&lt;/a&gt; on top of Lucene. That basically uses Lucene to find candidate matches (much like you do), but then does a configurable detailed comparison with weighted Levenshtein, q-grams, etc., and combines results for different properties using Bayes's Theorem. It also cleans and normalizes data before comparison.

Even that requires a lot of tuning and work to produce good results, so, like I said, I'm surprised your results look so good. But maybe I'm missing something.</description>
		<content:encoded><![CDATA[<p>It&#8217;s interesting to see that you chose to approach this problem by simply querying the search engine directly. I&#8217;m surprised your results seem so good, because generally this is a tricky problem, where information from many fields (name, address, phone number, geoposition, etc.) all needs to be considered and weighed against one another.</p>
<p>I had the same problem and chose to build <a href="http://code.google.com/p/duke/" rel="nofollow">a full record linkage engine</a> on top of Lucene. That basically uses Lucene to find candidate matches (much like you do), but then does a configurable detailed comparison with weighted Levenshtein, q-grams, etc., and combines results for different properties using Bayes&#8217;s Theorem. It also cleans and normalizes data before comparison.</p>
<p>Even that requires a lot of tuning and work to produce good results, so, like I said, I&#8217;m surprised your results look so good. But maybe I&#8217;m missing something.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
