<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" gd:etag="W/&quot;DkUGR3sycSp7ImA9WhRaE0U.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108</id><updated>2012-02-16T17:23:46.599+08:00</updated><category term="TBB" /><category term="hpc" /><category term="auto" /><category term="programming" /><category term="cloud" /><category term="algorithm" /><category term="OpenMP" /><category term="multi core" /><category term="Cilk++" /><category term="gpu" /><category term="parallelization" /><category term="PThread" /><category term="uC++" /><category term="matrix" /><category term="Ruby" /><category term="CUDA" /><category term="parallel" /><category term="Brook+" /><category term="OpenCL" /><category term="Charm++" /><category term="PVM" /><category term="apu" /><category term="cpu" /><category term="MPI" /><category term="compiler" /><title>SpeedGo Computing</title><subtitle type="html">Speed breaking computational problems with multi-core CPU and GPU</subtitle><link rel="http://schemas.google.com/g/2005#feed" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/posts/default" /><link rel="alternate" type="text/html" href="http://blog.speedgocomputing.com/" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><generator version="7.00" 
uri="http://www.blogger.com">Blogger</generator><openSearch:totalResults>22</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/SpeedgoComputing" /><feedburner:info uri="speedgocomputing" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry gd:etag="W/&quot;AkYNRHsyeip7ImA9WhZaEEk.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-295106998493068235</id><published>2011-06-26T08:29:00.000+08:00</published><updated>2011-06-26T08:29:55.592+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2011-06-26T08:29:55.592+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="CUDA" /><title>Being Nvidia CUDA Certified Programmer!</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/295106998493068235/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2011/06/being-nvidia-cuda-certified-programmer.html#comment-form" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/295106998493068235?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/295106998493068235?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/JvXUMpij2r4/being-nvidia-cuda-certified-programmer.html" title="Being Nvidia CUDA Certified Programmer!" 
/><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>2</thr:total><content type="html">It takes some courage and effort to take the Nvidia CUDA Certification exam. You'll have to pay S$350 for it, yet there is no guarantee that it will be of real use in business or your career. The exam questions are designed to squeeze out all your brain juice.

After much feedback, a long wait, and delayed plans, I finally received an email saying that I am now an Nvidia CUDA certified programmer. It's better arrived late 
</content><feedburner:origLink>http://blog.speedgocomputing.com/2011/06/being-nvidia-cuda-certified-programmer.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DEYNQnk8fSp7ImA9WhZXGUk.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-6500095993166424993</id><published>2011-05-09T21:03:00.000+08:00</published><updated>2011-05-09T21:03:13.775+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2011-05-09T21:03:13.775+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="CUDA" /><category scheme="http://www.blogger.com/atom/ns#" term="Ruby" /><category scheme="http://www.blogger.com/atom/ns#" term="gpu" /><title>The Choice is Yours: CUDA in C++ or Ruby</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/6500095993166424993/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2011/05/choice-is-yours-cuda-in-c-or-ruby.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/6500095993166424993?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/6500095993166424993?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/1xX_qmNAEII/choice-is-yours-cuda-in-c-or-ruby.html" title="The Choice is Yours: CUDA in C++ or Ruby" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image 
rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">
See the output here: Ruby Query Output


See the output here: C++ Query Output
</content><feedburner:origLink>http://blog.speedgocomputing.com/2011/05/choice-is-yours-cuda-in-c-or-ruby.html</feedburner:origLink></entry><entry gd:etag="W/&quot;D08CSXo6cCp7ImA9WhZXFEw.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-2908664133889538528</id><published>2011-05-03T17:44:00.000+08:00</published><updated>2011-05-03T17:44:28.418+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2011-05-03T17:44:28.418+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="CUDA" /><category scheme="http://www.blogger.com/atom/ns#" term="gpu" /><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><title>Web Seminar: Programming GPUs Beyond CUDA</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/2908664133889538528/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2011/05/web-seminar-programming-gpus-beyond.html#comment-form" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/2908664133889538528?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/2908664133889538528?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/bG6pOE8r-RM/web-seminar-programming-gpus-beyond.html" title="Web Seminar: Programming GPUs Beyond CUDA" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image 
rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total><content type="html">GPU/CUDA programming is easy if we ignore the performance, or even the correctness, of the program. It becomes tough when performance is critical and one has to optimize hard for the specific hardware. Fortunately, GPU hardware performance improves drastically every 2 years. Unfortunately, that performance is not portable across different generations of GPUs.

Prof Chen from Tsing Hua 
</content><feedburner:origLink>http://blog.speedgocomputing.com/2011/05/web-seminar-programming-gpus-beyond.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CEEDQXc9fyp7ImA9WhZXE08.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-266462278933999672</id><published>2011-04-30T17:56:00.003+08:00</published><updated>2011-05-02T15:51:10.967+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2011-05-02T15:51:10.967+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="hpc" /><category scheme="http://www.blogger.com/atom/ns#" term="CUDA" /><category scheme="http://www.blogger.com/atom/ns#" term="Ruby" /><category scheme="http://www.blogger.com/atom/ns#" term="gpu" /><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><title>First Release of SGC Ruby CUDA - Beginning of a long way path</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/266462278933999672/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2011/04/first-release-of-sgc-ruby-cuda.html#comment-form" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/266462278933999672?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/266462278933999672?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/Y3fgbp9zong/first-release-of-sgc-ruby-cuda.html" title="First Release of SGC Ruby CUDA - Beginning of a long 
way path" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total><content type="html">Today we decided to put up the first release of SGC Ruby CUDA, v0.1.0, as a means to attract Rubyists to try out GPU programming in their new toy projects, and also to encourage HPC developers to evaluate whether Ruby is a good fit for their HPC applications.

When important software libraries are not available in Ruby, we certainly do not expect much Ruby usage in the area. As time is running short
</content><feedburner:origLink>http://blog.speedgocomputing.com/2011/04/first-release-of-sgc-ruby-cuda.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CEQDSX89fSp7ImA9WhZQFkw.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-177248029062926548</id><published>2011-04-24T10:32:00.001+08:00</published><updated>2011-04-24T10:32:58.165+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2011-04-24T10:32:58.165+08:00</app:edited><title>GPU Computing with Ruby</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/177248029062926548/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2011/04/gpu-computing-with-ruby.html#comment-form" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/177248029062926548?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/177248029062926548?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/u1qWSncLHMA/gpu-computing-with-ruby.html" title="GPU Computing with Ruby" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>2</thr:total><content type="html">Presented at RedDotRubyConf 2011 - PechaKucha Night Singapore: GPU Computing with Ruby.
</content><feedburner:origLink>http://blog.speedgocomputing.com/2011/04/gpu-computing-with-ruby.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CUYHRHkzfip7ImA9Wx9TFUo.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-2538637127840456512</id><published>2010-11-19T21:16:00.003+08:00</published><updated>2010-11-24T12:32:15.786+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-11-24T12:32:15.786+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="CUDA" /><category scheme="http://www.blogger.com/atom/ns#" term="cloud" /><category scheme="http://www.blogger.com/atom/ns#" term="Ruby" /><category scheme="http://www.blogger.com/atom/ns#" term="gpu" /><title>Using SGC-Ruby-CUDA on the Newly Launched Amazon EC2 Cluster GPU</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/2538637127840456512/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/11/using-sgc-ruby-cuda-on-newly-launched.html#comment-form" title="5 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/2538637127840456512?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/2538637127840456512?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/lqqlotom8SE/using-sgc-ruby-cuda-on-newly-launched.html" title="Using SGC-Ruby-CUDA on the Newly Launched Amazon EC2 Cluster GPU" 
/><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>5</thr:total><content type="html">Wondering whether a GPU would work for you? No budget for a system with a decent GPU? Is installation and configuration too much trouble? You can now try out SGC-Ruby-CUDA on Amazon EC2 using the system image SGCRubyCUDA.1, located in the US East (Virginia) zone, which is available as a community AMI.

Compile the rubycu shared library and run the tests.

[root@ip-10-17-130-174 sgc-ruby-cuda.git]# rake
(in
</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/11/using-sgc-ruby-cuda-on-newly-launched.html</feedburner:origLink></entry><entry gd:etag="W/&quot;Dk4FQn48fip7ImA9Wx5aGEo.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-2873007604611679152</id><published>2010-11-16T10:35:00.000+08:00</published><updated>2010-11-16T10:35:13.076+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-11-16T10:35:13.076+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="hpc" /><category scheme="http://www.blogger.com/atom/ns#" term="cloud" /><category scheme="http://www.blogger.com/atom/ns#" term="gpu" /><title>GPU Anywhere with Cloud Computing</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/2873007604611679152/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/11/gpu-anywhere-with-cloud-computing.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/2873007604611679152?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/2873007604611679152?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/AI0U7ClaQpg/gpu-anywhere-with-cloud-computing.html" title="GPU Anywhere with Cloud Computing" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image 
rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">Simulations taking months to run? Buying and maintaining new systems causing too much hassle? Perhaps a Cluster GPU instance would be a good candidate to save time and trouble. A cloud solution is an excellent platform for a proof of concept before committing to a large in-house system.

You pay $2.10 per hour (Amazon pricing as of 16 Nov 2010) for the following specification:

22 GB of memory
33.5 EC2 Compute Units (2 x Intel 
</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/11/gpu-anywhere-with-cloud-computing.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CkMFSHY6eSp7ImA9Wx5UEkQ.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-2793087737173429540</id><published>2010-09-26T08:10:00.001+08:00</published><updated>2010-10-17T12:40:19.811+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:40:19.811+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><category scheme="http://www.blogger.com/atom/ns#" term="parallel" /><title>Parallel programming knowledge is must-have skill for Wall Street</title><link rel="related" href="http://www.multicoreinfo.com/2010/09/parallel-wall/?utm_source=feedburner&amp;utm_medium=feed&amp;utm_campaign=Feed:+MulticoreInfo+(MulticoreInfo.com)" title="Parallel programming knowledge is must-have skill for Wall Street" /><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/2793087737173429540/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/09/parallel-programming-knowledge-is-must.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/2793087737173429540?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/2793087737173429540?v=2" /><link rel="alternate" type="text/html" 
href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/keA2QJwj8qU/parallel-programming-knowledge-is-must.html" title="Parallel programming knowledge is must-have skill for Wall Street" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">Parallel programming knowledge is must-have skill for Wall Street
</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/09/parallel-programming-knowledge-is-must.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CkMCSX4-eyp7ImA9Wx5UEkQ.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-6750439634358197261</id><published>2010-09-17T21:58:00.001+08:00</published><updated>2010-10-17T12:41:08.053+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:41:08.053+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="CUDA" /><category scheme="http://www.blogger.com/atom/ns#" term="OpenCL" /><category scheme="http://www.blogger.com/atom/ns#" term="gpu" /><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><title>Unigine crew: CUDA vs OpenCL vs SPU Part IV</title><link rel="related" href="http://unigine.blogspot.com/2010/09/cuda-vs-opencl-vs-spu-part-iv.html#links" title="Unigine crew: CUDA vs OpenCL vs SPU Part IV" /><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/6750439634358197261/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/09/unigine-crew-cuda-vs-opencl-vs-spu-part.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/6750439634358197261?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/6750439634358197261?v=2" /><link rel="alternate" type="text/html" 
href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/jwtDeM9lxAE/unigine-crew-cuda-vs-opencl-vs-spu-part.html" title="Unigine crew: CUDA vs OpenCL vs SPU Part IV" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">The language or library you choose for your software development has a great and prolonged impact on the software. I just came across a simple yet interesting benchmark. Perhaps more details on why such numbers were obtained would be even more enlightening. Unigine crew: CUDA vs OpenCL vs SPU Part IV
</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/09/unigine-crew-cuda-vs-opencl-vs-spu-part.html</feedburner:origLink></entry><entry gd:etag="W/&quot;AkQERX85fSp7ImA9Wx5UF0Q.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-3053667535687587978</id><published>2010-09-17T10:46:00.005+08:00</published><updated>2010-10-23T09:45:04.125+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-23T09:45:04.125+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="CUDA" /><category scheme="http://www.blogger.com/atom/ns#" term="Ruby" /><category scheme="http://www.blogger.com/atom/ns#" term="gpu" /><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><title>CUDA Programming with Ruby</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/3053667535687587978/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/09/cuda-programming-with-ruby.html#comment-form" title="3 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/3053667535687587978?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/3053667535687587978?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/0Q4KjR1Tees/cuda-programming-with-ruby.html" title="CUDA Programming with Ruby" 
/><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>3</thr:total><content type="html">Need GPU computing power in your Ruby program? Great! SpeedGo Computing is developing Ruby bindings for CUDA, called sgc-ruby-cuda. Take advantage of your Nvidia CUDA-enabled graphics cards with Ruby now.

Currently, only part of the CUDA Driver API is included. More components, such as the CUDA Runtime API, will be included to make it as complete as possible.

CUDA Programming with Ruby

require '
</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/09/cuda-programming-with-ruby.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CkcMQ388cSp7ImA9Wx5UEkQ.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-3830094285084907840</id><published>2010-09-07T19:23:00.003+08:00</published><updated>2010-10-17T12:34:42.179+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:34:42.179+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="hpc" /><category scheme="http://www.blogger.com/atom/ns#" term="cpu" /><category scheme="http://www.blogger.com/atom/ns#" term="multi core" /><category scheme="http://www.blogger.com/atom/ns#" term="gpu" /><title>High Performance for All</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/3830094285084907840/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/09/high-performance-for-all.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/3830094285084907840?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/3830094285084907840?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/RGVnVytjLEA/high-performance-for-all.html" title="High Performance for All" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image 
rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">Parallel programming is much more affordable now that multi-core CPUs and programmable GPUs have become commodity products. A decade ago, a minimal dual-socket system equipped with lower-clocked CPUs and RAM would cost a typical desktop user a relative fortune, whereas dual-core systems are basically everywhere nowadays. Dual-core systems are in use not really because they are affordable, but 
</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/09/high-performance-for-all.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CkYER3o-fCp7ImA9Wx5UEkQ.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-5226702773969879284</id><published>2010-08-25T12:38:00.020+08:00</published><updated>2010-10-17T12:35:06.454+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:35:06.454+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="cpu" /><category scheme="http://www.blogger.com/atom/ns#" term="multi core" /><category scheme="http://www.blogger.com/atom/ns#" term="apu" /><title>AMD’s Bulldozer vs Intel's Hyper-Threading?</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/5226702773969879284/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/08/amds-bulldozer-vs-intels-hyper.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/5226702773969879284?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/5226702773969879284?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/YrSEIb-0Q-o/amds-bulldozer-vs-intels-hyper.html" title="AMD’s Bulldozer vs Intel&amp;#39;s Hyper-Threading?" 
/><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://1.bp.blogspot.com/_YbIXF-lMDF8/THSb8XK1uaI/AAAAAAAAAt4/DA5kTmuc8Nk/s72-c/bulldozer_module.jpg" height="72" width="72" /><thr:total>0</thr:total><content type="html">Is AMD's so-called Strong Thread approach in the Bulldozer module really that compelling? Extra cores are added when a processor can't operate at a faster clock speed. That's a good and easy way to expand a product line with effectively faster products, even though they may NOT be any faster, depending on whether the applications take advantage of the multiple cores. But fully duplicating x86 
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/qrNwmcsxBO_gypFwf4AicbVBdds/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/qrNwmcsxBO_gypFwf4AicbVBdds/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/qrNwmcsxBO_gypFwf4AicbVBdds/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/qrNwmcsxBO_gypFwf4AicbVBdds/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/SpeedgoComputing/~4/YrSEIb-0Q-o" height="1" width="1"/&gt;</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/08/amds-bulldozer-vs-intels-hyper.html</feedburner:origLink><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="enclosure" href="http://feedproxy.google.com/~r/SpeedgoComputing/~5/X_qKKyT0N_0/bulldozer-bobcat-hot-chips,2724.html" length="0" type="text/html" /><feedburner:origEnclosureLink>http://www.tomshardware.com/reviews/bulldozer-bobcat-hot-chips,2724.html</feedburner:origEnclosureLink></entry><entry gd:etag="W/&quot;A08HRXY7fSp7ImA9Wx5UEkU.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-3630854547807992104</id><published>2010-08-17T12:41:00.005+08:00</published><updated>2010-10-17T12:30:34.805+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:30:34.805+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="parallelization" /><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><category scheme="http://www.blogger.com/atom/ns#" term="parallel" /><category scheme="http://www.blogger.com/atom/ns#" term="MPI" /><category scheme="http://www.blogger.com/atom/ns#" term="matrix" /><title>Parallelizing Matrix Multiplication using MPI</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/3630854547807992104/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/08/parallelizing-matrix-multiplication_17.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" 
href="http://www.blogger.com/feeds/4718119077220204108/posts/default/3630854547807992104?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/3630854547807992104?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/rs0-Jtz99X4/parallelizing-matrix-multiplication_17.html" title="Parallelizing Matrix Multiplication using MPI" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">MPI is a popular mechanism in high performance computing. It works in both cluster and shared-memory environments. Why don't we simply use MPI when it works for both environments? Why do we care about OpenMP, Cilk++, etc.? Perhaps that depends on the complexity of the applications you are dealing with. Parallel Matrix Multiplication using MPI /* matrix-mpi.cpp */ #include &amp;lt;mpi.h&amp;gt; const int size = 1000
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/s0jP1zERCLCxW6xToRQOkhY-YhE/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/s0jP1zERCLCxW6xToRQOkhY-YhE/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/s0jP1zERCLCxW6xToRQOkhY-YhE/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/s0jP1zERCLCxW6xToRQOkhY-YhE/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/SpeedgoComputing/~4/rs0-Jtz99X4" height="1" width="1"/&gt;</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/08/parallelizing-matrix-multiplication_17.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A08HRHgyfSp7ImA9Wx5UEkU.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-3689730311957495531</id><published>2010-08-15T14:13:00.004+08:00</published><updated>2010-10-17T12:30:35.695+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:30:35.695+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="parallelization" /><category scheme="http://www.blogger.com/atom/ns#" term="TBB" /><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><category scheme="http://www.blogger.com/atom/ns#" term="parallel" /><category scheme="http://www.blogger.com/atom/ns#" term="matrix" /><title>Parallelizing Matrix Multiplication using TBB</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/3689730311957495531/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/08/parallelizing-matrix-multiplication_8641.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/3689730311957495531?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/3689730311957495531?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/AbkxwrCQXYo/parallelizing-matrix-multiplication_8641.html" title="Parallelizing Matrix 
Multiplication using TBB" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">Parallelizing matrix multiplication using TBB isn't too difficult. It's just a little more work than OpenMP or Cilk++. Parallel Matrix Multiplication using TBB /* matrix-tbb.cpp */ #include &amp;lt;tbb/parallel_for.h&amp;gt; #include &amp;lt;tbb/blocked_range.h&amp;gt; using namespace tbb; const int size = 1000; float a[size][size]; float b[size][size]; float c[size][size]; class Multiply{ public:    void operator()(blocked_range&amp;lt;int&amp;gt;
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/fo7ZjuiI4x4Ps62B7ZCqNF8R8O4/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/fo7ZjuiI4x4Ps62B7ZCqNF8R8O4/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/fo7ZjuiI4x4Ps62B7ZCqNF8R8O4/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/fo7ZjuiI4x4Ps62B7ZCqNF8R8O4/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/SpeedgoComputing/~4/AbkxwrCQXYo" height="1" width="1"/&gt;</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/08/parallelizing-matrix-multiplication_8641.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A08HR3g7fSp7ImA9Wx5UEkU.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-7031147384088228471</id><published>2010-08-15T11:01:00.011+08:00</published><updated>2010-10-17T12:30:36.605+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:30:36.605+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="parallelization" /><category scheme="http://www.blogger.com/atom/ns#" term="Cilk++" /><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><category scheme="http://www.blogger.com/atom/ns#" term="parallel" /><category scheme="http://www.blogger.com/atom/ns#" term="matrix" /><title>Parallelizing Matrix Multiplication using Cilk++ in Two Lines</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/7031147384088228471/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/08/parallelizing-matrix-multiplication_15.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/7031147384088228471?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/7031147384088228471?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/ch51VIEr4mQ/parallelizing-matrix-multiplication_15.html" 
title="Parallelizing Matrix Multiplication using Cilk++ in Two Lines" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">Following the parallelization of matrix multiplication using OpenMP in Parallelizing Matrix Multiplication using OpenMP in One Line, can we do the same using Cilk++? Parallel Matrix Multiplication using Cilk++ /* matrix.cilk */ const int size = 1000; float a[size][size]; float b[size][size]; float c[size][size]; int cilk_main(){    // Initialize buffers.    for (int i = 0; i &amp;lt; size; ++i) {        for (
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/Ab-jk34big0LwpQZJ3WugW5o62U/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/Ab-jk34big0LwpQZJ3WugW5o62U/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/Ab-jk34big0LwpQZJ3WugW5o62U/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/Ab-jk34big0LwpQZJ3WugW5o62U/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/SpeedgoComputing/~4/ch51VIEr4mQ" height="1" width="1"/&gt;</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/08/parallelizing-matrix-multiplication_15.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A08HRnc9fCp7ImA9Wx5UEkU.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-4724360251246360907</id><published>2010-08-14T22:29:00.006+08:00</published><updated>2010-10-17T12:30:37.964+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:30:37.964+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="OpenMP" /><category scheme="http://www.blogger.com/atom/ns#" term="parallelization" /><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><category scheme="http://www.blogger.com/atom/ns#" term="parallel" /><category scheme="http://www.blogger.com/atom/ns#" term="matrix" /><title>Parallelizing Matrix Multiplication using OpenMP in One Line</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/4724360251246360907/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/08/parallelizing-matrix-multiplication.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/4724360251246360907?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/4724360251246360907?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/NW87uu3RWak/parallelizing-matrix-multiplication.html" title="Parallelizing 
Matrix Multiplication using OpenMP in One Line" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">Matrix multiplication is often used for academic study. It's well suited to parallelization because its O(N^3) computation is intensive and each output element can be computed independently. Parallel programming is hard. Would it surprise you that we can parallelize matrix multiplication with merely one line of OpenMP directive? Serial Matrix Multiplication /* matrix.cpp */ const int size = 1000; float a[size][size]; float b[size][size];
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/TacDk8_kRvHjz-Ewnr4vxcrxg1I/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/TacDk8_kRvHjz-Ewnr4vxcrxg1I/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/TacDk8_kRvHjz-Ewnr4vxcrxg1I/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/TacDk8_kRvHjz-Ewnr4vxcrxg1I/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/SpeedgoComputing/~4/NW87uu3RWak" height="1" width="1"/&gt;</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/08/parallelizing-matrix-multiplication.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A08HSH47fyp7ImA9Wx5UEkU.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-4023392998806497194</id><published>2010-08-11T18:14:00.013+08:00</published><updated>2010-10-17T12:30:39.007+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:30:39.007+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="OpenMP" /><category scheme="http://www.blogger.com/atom/ns#" term="Cilk++" /><category scheme="http://www.blogger.com/atom/ns#" term="Ruby" /><category scheme="http://www.blogger.com/atom/ns#" term="TBB" /><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><category scheme="http://www.blogger.com/atom/ns#" term="parallel" /><category scheme="http://www.blogger.com/atom/ns#" term="MPI" /><category scheme="http://www.blogger.com/atom/ns#" term="PThread" /><title>Parallel Programming - Hello World</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/4023392998806497194/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/08/parallel-programming-hello-world.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/4023392998806497194?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/4023392998806497194?v=2" /><link rel="alternate" 
type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/nWnJ8vQRULA/parallel-programming-hello-world.html" title="Parallel Programming - Hello World" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">Many computer science/engineering students learn to write a Hello World program in their first programming lecture. What's your first parallel program? What about a Hello World program in OpenMP, MPI, Cilk++, TBB, Ruby thread, or PThread? Hello World in C /* hello.c */ #include &amp;lt;stdio.h&amp;gt; int main(){    printf("hello world\n");    return 0;} $ gcc hello.c -o hello $ ./hello hello world Hello World in OpenMP /* 
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/8PopmCs2MrVG1aDItyGfSDMN5i4/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/8PopmCs2MrVG1aDItyGfSDMN5i4/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/8PopmCs2MrVG1aDItyGfSDMN5i4/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/8PopmCs2MrVG1aDItyGfSDMN5i4/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/SpeedgoComputing/~4/nWnJ8vQRULA" height="1" width="1"/&gt;</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/08/parallel-programming-hello-world.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A08AQX4_fSp7ImA9Wx5UEkU.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-1351378876605459310</id><published>2010-07-31T02:21:00.013+08:00</published><updated>2010-10-17T12:30:40.045+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:30:40.045+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="OpenCL" /><category scheme="http://www.blogger.com/atom/ns#" term="Cilk++" /><category scheme="http://www.blogger.com/atom/ns#" term="Brook+" /><category scheme="http://www.blogger.com/atom/ns#" term="MPI" /><category scheme="http://www.blogger.com/atom/ns#" term="parallel" /><category scheme="http://www.blogger.com/atom/ns#" term="CUDA" /><category scheme="http://www.blogger.com/atom/ns#" term="OpenMP" /><category scheme="http://www.blogger.com/atom/ns#" term="Charm++" /><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><category scheme="http://www.blogger.com/atom/ns#" term="TBB" /><category scheme="http://www.blogger.com/atom/ns#" term="uC++" /><category scheme="http://www.blogger.com/atom/ns#" term="PThread" /><category scheme="http://www.blogger.com/atom/ns#" term="PVM" /><title>Parallel Programming - What Are The Options?</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/1351378876605459310/comments/default" title="Post Comments" /><link rel="replies" type="text/html" 
href="http://blog.speedgocomputing.com/2010/07/parallel-programming-what-are-options.html#comment-form" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/1351378876605459310?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/1351378876605459310?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/rsLDe4Mwhbs/parallel-programming-what-are-options.html" title="Parallel Programming - What Are The Options?" /><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>2</thr:total><content type="html">There are simply way too many parallel programming languages and libraries to keep track of. Many of them are no longer actively developed, or are difficult to get working on decent operating systems. What are the practical options currently available for multi-core CPU or GPU? OpenMP Hardware: Shared memory multi-core CPU system. Parallelization: Use directives, e.g. #pragma omp parallel {} in C
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/Kp1ICcStdAq8AhBfnsiaz1AzstU/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/Kp1ICcStdAq8AhBfnsiaz1AzstU/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/Kp1ICcStdAq8AhBfnsiaz1AzstU/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/Kp1ICcStdAq8AhBfnsiaz1AzstU/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/SpeedgoComputing/~4/rsLDe4Mwhbs" height="1" width="1"/&gt;</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/07/parallel-programming-what-are-options.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A08AQH4ycSp7ImA9Wx5UEkU.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-2054027409662684340</id><published>2010-07-29T16:10:00.002+08:00</published><updated>2010-10-17T12:30:41.099+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:30:41.099+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="cpu" /><category scheme="http://www.blogger.com/atom/ns#" term="multi core" /><category scheme="http://www.blogger.com/atom/ns#" term="gpu" /><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><title>Who Is Responsible For The Programming Of Multi Core CPU And GPU?</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/2054027409662684340/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/07/who-is-responsible-for-programming-of.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/2054027409662684340?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/2054027409662684340?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/4HmaQ6u7rN0/who-is-responsible-for-programming-of.html" title="Who Is Responsible For The Programming Of Multi Core CPU And GPU?" 
/><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">Multi-core CPUs and GPUs are now commodity products. But where is the software that could take advantage of their parallel architecture? Who should be developing such software? The domain expert? The HPC (high performance computing) software engineer? Or parallel programming tools such as auto-parallelizing compilers? Domain experts typically do not wish to spend too much time on computing problems. 
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/lTv9J7fGCrcMxxw0Zh7HuMgFAWM/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/lTv9J7fGCrcMxxw0Zh7HuMgFAWM/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/lTv9J7fGCrcMxxw0Zh7HuMgFAWM/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/lTv9J7fGCrcMxxw0Zh7HuMgFAWM/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/SpeedgoComputing/~4/4HmaQ6u7rN0" height="1" width="1"/&gt;</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/07/who-is-responsible-for-programming-of.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CkYNQ38yeCp7ImA9Wx5UEkQ.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-5851352855862165993</id><published>2010-07-28T17:33:00.007+08:00</published><updated>2010-10-17T12:36:32.190+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:36:32.190+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="parallelization" /><category scheme="http://www.blogger.com/atom/ns#" term="compiler" /><category scheme="http://www.blogger.com/atom/ns#" term="auto" /><title>Why Can't Compilers Auto Parallelize Serial Code Effectively?</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/5851352855862165993/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/07/why-cant-compilers-auto-parallelize.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/5851352855862165993?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/5851352855862165993?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/B7FL9VaSWmc/why-cant-compilers-auto-parallelize.html" title="Why Can&amp;#39;t Compilers Auto Parallelize Serial Code Effectively?" 
/><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">An auto-parallelizing tool takes a serial code base in C/C++/Fortran etc. and produces a parallel version of the code. For instance, specifying the -parallel option when compiling with the Intel compiler produces a parallelized binary that uses the OpenMP runtime. The MIPSpro compiler provides a similar auto-parallelizing function with the -apo option, where you can view the code transformation, which consists of SGI OpenMP 
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/U8aKNz6LbRUhaSXoyuc-9MTGt9o/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/U8aKNz6LbRUhaSXoyuc-9MTGt9o/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/U8aKNz6LbRUhaSXoyuc-9MTGt9o/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/U8aKNz6LbRUhaSXoyuc-9MTGt9o/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/SpeedgoComputing/~4/B7FL9VaSWmc" height="1" width="1"/&gt;</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/07/why-cant-compilers-auto-parallelize.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A08AQn08fSp7ImA9Wx5UEkU.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-5504948159830210305</id><published>2010-07-22T12:14:00.007+08:00</published><updated>2010-10-17T12:30:43.375+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:30:43.375+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="algorithm" /><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><category scheme="http://www.blogger.com/atom/ns#" term="parallel" /><title>Where Are All The Practical Parallel Algorithms and Libraries?</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/5504948159830210305/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/07/where-are-all-practical-parallel.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/5504948159830210305?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/5504948159830210305?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/oalYcaBUcNE/where-are-all-practical-parallel.html" title="Where Are All The Practical Parallel Algorithms and Libraries?" 
/><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">Multi-core CPUs and GPUs are everywhere nowadays, from laptops to desktops to high-end computing clusters. Is your particular application running any faster? Nope. Generally, you need parallel algorithms for an application to make full use of the multiple cores. Perhaps you'd expect that some searching on the web and in research publications and academic books would provide you with all the state-of-the-art 
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/gVen7uYH2nW6F4CJ_MhEf_OaaIw/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/gVen7uYH2nW6F4CJ_MhEf_OaaIw/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/gVen7uYH2nW6F4CJ_MhEf_OaaIw/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/gVen7uYH2nW6F4CJ_MhEf_OaaIw/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/SpeedgoComputing/~4/oalYcaBUcNE" height="1" width="1"/&gt;</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/07/where-are-all-practical-parallel.html</feedburner:origLink></entry><entry gd:etag="W/&quot;A08ARXY6cSp7ImA9Wx5UEkU.&quot;"><id>tag:blogger.com,1999:blog-4718119077220204108.post-5482603593589817550</id><published>2010-07-21T03:14:00.020+08:00</published><updated>2010-10-17T12:30:44.819+08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-10-17T12:30:44.819+08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="programming" /><category scheme="http://www.blogger.com/atom/ns#" term="parallel" /><title>Why Is Parallel Programming Difficult?</title><link rel="replies" type="application/atom+xml" href="http://blog.speedgocomputing.com/feeds/5482603593589817550/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://blog.speedgocomputing.com/2010/07/why-is-parallel-programming-difficult.html#comment-form" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/5482603593589817550?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/4718119077220204108/posts/default/5482603593589817550?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/SpeedgoComputing/~3/_e8VzgCJt6g/why-is-parallel-programming-difficult.html" title="Why Is Parallel Programming Difficult?" 
/><author><name>xman</name><uri>http://www.blogger.com/profile/05695636905017529897</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><content type="html">Parallel programming is generally perceived as an activity only for people going after high-tech, bleeding-edge research. It is difficult and alien enough to drive most software engineers away, whether that is really the case or merely a misconception. The fact is, software engineers run away from parallel programming while modern general-purpose processors contain more and more cores
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/0YhMZnxbuQevK-JIOlBQOnUA9YE/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/0YhMZnxbuQevK-JIOlBQOnUA9YE/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/0YhMZnxbuQevK-JIOlBQOnUA9YE/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/0YhMZnxbuQevK-JIOlBQOnUA9YE/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/SpeedgoComputing/~4/_e8VzgCJt6g" height="1" width="1"/&gt;</content><feedburner:origLink>http://blog.speedgocomputing.com/2010/07/why-is-parallel-programming-difficult.html</feedburner:origLink></entry></feed>

