<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>hgpu.org</title>
	
	<link>http://hgpu.org</link>
	<description>High performance computing on Graphics Processing Units</description>
	<lastBuildDate>Thu, 23 May 2013 20:38:14 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/hgpuorg" /><feedburner:info uri="hgpuorg" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Graphics Processing Unit (GPU) Implementation Methodology of AERMOD Model</title>
		<link>http://feedproxy.google.com/~r/hgpuorg/~3/27LzGH7UzFw/</link>
		<comments>http://hgpu.org/?p=9433#comments</comments>
		<pubDate>Thu, 23 May 2013 20:38:14 +0000</pubDate>
		<dc:creator>hgpu</dc:creator>
				<category><![CDATA[CUDA]]></category>
		<category><![CDATA[Earth and Space Sciences]]></category>
		<category><![CDATA[paper]]></category>
		<category><![CDATA[nVidia]]></category>

		<guid isPermaLink="false">http://hgpu.org/?p=9433</guid>
		<description><![CDATA[Air pollution is one of the major problems the world is facing today. Air pollution is caused by the release of dangerous chemical substances such as carbon monoxide, CFCs (chlorofluorocarbons), carbon dioxide, hydrocarbons, sulfur dioxide, etc. into the &#8230; <a href="http://hgpu.org/?p=9433"><span class="meta-nav">>>></span></a>]]></description>
				<content:encoded><![CDATA[<p>Air pollution is one of the major problems the world is facing today. Air pollution is caused by the release of dangerous chemical substances such as carbon monoxide, CFCs (chlorofluorocarbons), carbon dioxide, hydrocarbons, sulfur dioxide, etc. into the atmosphere. These substances are produced by various anthropogenic activities such as vehicle use, factory operations, etc. Air quality must be assessed to prevent the ill effects of pollutants on the environment. Air Quality Modeling (AQM) is an attempt to predict or simulate the ambient concentrations of contaminants in the atmosphere. These models are used primarily as a quantitative tool to correlate cause and effect of the concentration levels found in an area. Numerous models have been proposed in this regard. This paper proposes a methodology for a GPU implementation of the American Meteorological Society/Environmental Protection Agency Regulatory Model (AERMOD) to evaluate pollutant dispersion in the atmosphere.</p>
<img src="http://feeds.feedburner.com/~r/hgpuorg/~4/27LzGH7UzFw" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://hgpu.org/?feed=rss2&amp;p=9433</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://hgpu.org/?p=9433</feedburner:origLink></item>
		<item>
		<title>Composing multiple StarPU applications over heterogeneous machines: a supervised approach</title>
		<link>http://feedproxy.google.com/~r/hgpuorg/~3/DH0E-x8WvE4/</link>
		<comments>http://hgpu.org/?p=9432#comments</comments>
		<pubDate>Thu, 23 May 2013 20:37:24 +0000</pubDate>
		<dc:creator>hgpu</dc:creator>
				<category><![CDATA[Computer science]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[paper]]></category>
		<category><![CDATA[Benchmarking]]></category>
		<category><![CDATA[Heterogeneous systems]]></category>
		<category><![CDATA[Linear Algebra]]></category>
		<category><![CDATA[nVidia]]></category>
		<category><![CDATA[Tesla M2070]]></category>

		<guid isPermaLink="false">http://hgpu.org/?p=9432</guid>
		<description><![CDATA[Enabling HPC applications to perform efficiently when invoking multiple parallel libraries simultaneously is a great challenge. Even if a single runtime system is used underneath, scheduling tasks or threads coming from different libraries over the same set of hardware resources &#8230; <a href="http://hgpu.org/?p=9432"><span class="meta-nav">>>></span></a>]]></description>
				<content:encoded><![CDATA[<p>Enabling HPC applications to perform efficiently when invoking multiple parallel libraries simultaneously is a great challenge. Even if a single runtime system is used underneath, scheduling tasks or threads coming from different libraries over the same set of hardware resources introduces many issues, such as resource oversubscription, undesirable cache flushes or memory bus contention. This paper presents an extension of StarPU, a runtime system specifically designed for heterogeneous architectures, that allows multiple parallel codes to run concurrently with minimal interference. Such parallel codes run within scheduling contexts that provide confined execution environments which can be used to partition computing resources. Scheduling contexts can be dynamically resized to optimize the allocation of computing resources among concurrently running libraries. We introduce a hypervisor that automatically expands or shrinks contexts using feedback from the runtime system (e.g. resource utilization). We demonstrate the relevance of our approach using benchmarks invoking multiple high performance linear algebra kernels simultaneously on top of heterogeneous multicore machines. We show that our mechanism can dramatically improve the overall application run time (-34%), most notably by reducing the average cache miss ratio (-50%).</p>
<img src="http://feeds.feedburner.com/~r/hgpuorg/~4/DH0E-x8WvE4" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://hgpu.org/?feed=rss2&amp;p=9432</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://hgpu.org/?p=9432</feedburner:origLink></item>
		<item>
		<title>Sequential Consistency for Heterogeneous-Race-Free: Programmer-centric Memory Models for Heterogeneous Platforms</title>
		<link>http://feedproxy.google.com/~r/hgpuorg/~3/aLrGwW_dg1k/</link>
		<comments>http://hgpu.org/?p=9431#comments</comments>
		<pubDate>Thu, 23 May 2013 20:36:22 +0000</pubDate>
		<dc:creator>hgpu</dc:creator>
				<category><![CDATA[Computer science]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[OpenCL]]></category>
		<category><![CDATA[paper]]></category>
		<category><![CDATA[Heterogeneous systems]]></category>
		<category><![CDATA[Memory model]]></category>
		<category><![CDATA[nVidia]]></category>
		<category><![CDATA[Performance]]></category>

		<guid isPermaLink="false">http://hgpu.org/?p=9431</guid>
		<description><![CDATA[Hardware vendors now provide heterogeneous platforms in commodity markets (e.g., integrated CPUs and GPUs), and are promising an integrated, shared memory address space for such platforms in future iterations. Because not all threads in a heterogeneous platform can communicate with &#8230; <a href="http://hgpu.org/?p=9431"><span class="meta-nav">>>></span></a>]]></description>
				<content:encoded><![CDATA[<p>Hardware vendors now provide heterogeneous platforms in commodity markets (e.g., integrated CPUs and GPUs), and are promising an integrated, shared memory address space for such platforms in future iterations. Because not all threads in a heterogeneous platform can communicate with the same latency, vendors are proposing synchronization mechanisms that allow threads to communicate with a subset of threads (called a scope). However, vendors have yet to define a comprehensive and portable memory model that programmers can use to reason about scopes. Moreover, existing CPU memory models, such as Sequential Consistency for Data-Race-Free (SC for DRF), are ill-suited, in part, because they define all synchronization operations globally and preclude low-energy, high-performance local coordination. Towards this end, we embrace scoped synchronization with a new class of memory consistency models: Sequential Consistency for Heterogeneous-Race-Free (SC for HRF). Inspired by SC for DRF (C++, Java), the new models provide programmers with SC for programs with &quot;sufficient&quot; synchronization (no data races) of &quot;sufficient&quot; scope. We develop the first such model, called HRF0, show how it can be used to develop high-performance code, show example hardware support, and motivate future work.</p>
<img src="http://feeds.feedburner.com/~r/hgpuorg/~4/aLrGwW_dg1k" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://hgpu.org/?feed=rss2&amp;p=9431</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://hgpu.org/?p=9431</feedburner:origLink></item>
		<item>
		<title>Surface Reconstruction from Scattered Point via RBF Interpolation on GPU</title>
		<link>http://feedproxy.google.com/~r/hgpuorg/~3/sU8WraJznBU/</link>
		<comments>http://hgpu.org/?p=9430#comments</comments>
		<pubDate>Thu, 23 May 2013 20:35:51 +0000</pubDate>
		<dc:creator>hgpu</dc:creator>
				<category><![CDATA[Computer science]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[paper]]></category>
		<category><![CDATA[Numerical Analysis]]></category>
		<category><![CDATA[nVidia]]></category>
		<category><![CDATA[Tesla C1060]]></category>

		<guid isPermaLink="false">http://hgpu.org/?p=9430</guid>
		<description><![CDATA[In this paper we describe a parallel implicit method based on radial basis functions (RBF) for surface reconstruction. The applicability of RBF methods is hindered by their computational demand, which requires the solution of linear systems of size equal to &#8230; <a href="http://hgpu.org/?p=9430"><span class="meta-nav">>>></span></a>]]></description>
				<content:encoded><![CDATA[<p>In this paper we describe a parallel implicit method based on radial basis functions (RBF) for surface reconstruction. The applicability of RBF methods is hindered by their computational demand, which requires the solution of linear systems of size equal to the number of data points. Our reconstruction implementation relies on parallel scientific libraries and targets massively multi-core architectures, namely Graphics Processing Units (GPUs). The performance of the proposed method, in terms of reconstruction accuracy and computing time, shows that the RBF interpolant can be very effective for such problems.</p>
<img src="http://feeds.feedburner.com/~r/hgpuorg/~4/sU8WraJznBU" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://hgpu.org/?feed=rss2&amp;p=9430</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://hgpu.org/?p=9430</feedburner:origLink></item>
		<item>
		<title>GPU Enhancement of the Trigger to Extend Physics Reach at the LHC</title>
		<link>http://feedproxy.google.com/~r/hgpuorg/~3/jBnJPDCeLUw/</link>
		<comments>http://hgpu.org/?p=9429#comments</comments>
		<pubDate>Thu, 23 May 2013 20:35:49 +0000</pubDate>
		<dc:creator>hgpu</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[paper]]></category>
		<category><![CDATA[Physics]]></category>
		<category><![CDATA[Hadronic Colliders]]></category>
		<category><![CDATA[High Energy Physics - Experiment]]></category>
		<category><![CDATA[nVidia]]></category>
		<category><![CDATA[Tesla K20]]></category>

		<guid isPermaLink="false">http://hgpu.org/?p=9429</guid>
		<description><![CDATA[Significant new challenges are continuously confronting the High Energy Physics (HEP) experiments, in particular the two detectors at the Large Hadron Collider (LHC) at CERN, where nominal conditions deliver proton-proton collisions to the detectors at a rate of 40 MHz. &#8230; <a href="http://hgpu.org/?p=9429"><span class="meta-nav">>>></span></a>]]></description>
				<content:encoded><![CDATA[<p>Significant new challenges are continuously confronting the High Energy Physics (HEP) experiments, in particular the two detectors at the Large Hadron Collider (LHC) at CERN, where nominal conditions deliver proton-proton collisions to the detectors at a rate of 40 MHz. This rate must be significantly reduced to comply with both the performance limitations of the mass storage hardware and the capabilities of the computing resources to process the collected data in a timely fashion for physics analysis. At the same time, the physics signals of interest must be retained with high efficiency. The quest for rare new physics phenomena at the LHC leads us to evaluate a Graphics Processing Unit (GPU) enhancement of the existing High-Level Trigger (HLT), made possible by the current flexibility of the trigger system, which not only provides faster and more efficient event selection, but also includes the possibility of new complex triggers that were not previously feasible. A new tracking algorithm is evaluated on an NVIDIA Tesla K20c GPU, allowing for the first time the reconstruction of long-lived particles at the tracker system in the trigger. Preliminary time performance and efficiency will be presented.</p>
<img src="http://feeds.feedburner.com/~r/hgpuorg/~4/jBnJPDCeLUw" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://hgpu.org/?feed=rss2&amp;p=9429</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://hgpu.org/?p=9429</feedburner:origLink></item>
		<item>
		<title>Evaluating the Performance of Legacy Applications on Emerging Parallel Architectures</title>
		<link>http://feedproxy.google.com/~r/hgpuorg/~3/SkdlK2Lsims/</link>
		<comments>http://hgpu.org/?p=9427#comments</comments>
		<pubDate>Tue, 21 May 2013 20:57:20 +0000</pubDate>
		<dc:creator>hgpu</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Computer science]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[OpenCL]]></category>
		<category><![CDATA[paper]]></category>
		<category><![CDATA[ATI]]></category>
		<category><![CDATA[ATI FirePro V7800]]></category>
		<category><![CDATA[Benchmarking]]></category>
		<category><![CDATA[MPI]]></category>
		<category><![CDATA[nVidia]]></category>
		<category><![CDATA[nVidia GeForce 8400 GS]]></category>
		<category><![CDATA[nVidia GeForce 9800 GT]]></category>
		<category><![CDATA[nVidia GeForce GTX 680]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Tesla C1060]]></category>
		<category><![CDATA[Tesla C2050]]></category>
		<category><![CDATA[Thesis]]></category>

		<guid isPermaLink="false">http://hgpu.org/?p=9427</guid>
		<description><![CDATA[The gap between a supercomputer&#8217;s theoretical maximum (&#34;peak&#34;) floating-point performance and that actually achieved by applications has grown wider over time. Today, a typical scientific application achieves only 5-20% of any given machine&#8217;s peak processing capability, and this gap leaves &#8230; <a href="http://hgpu.org/?p=9427"><span class="meta-nav">>>></span></a>]]></description>
				<content:encoded><![CDATA[<p>The gap between a supercomputer&#8217;s theoretical maximum (&quot;peak&quot;) floating-point performance and that actually achieved by applications has grown wider over time. Today, a typical scientific application achieves only 5-20% of any given machine&#8217;s peak processing capability, and this gap leaves room for significant improvements in execution times. This problem is most pronounced for modern &quot;accelerator&quot; architectures &#8212; collections of hundreds of simple, low-clocked cores capable of executing the same instruction on dozens of pieces of data simultaneously. This is a significant change from the low number of high-clocked cores found in traditional CPUs, and effective utilisation of accelerators typically requires extensive code and algorithmic changes. In many cases, the best way in which to map a parallel workload to these new architectures is unclear. The principal focus of the work presented in this thesis is the evaluation of emerging parallel architectures (specifically, modern CPUs, GPUs and Intel MIC) for two benchmark codes &#8212; the LU benchmark from the NAS Parallel Benchmark Suite and Sandia&#8217;s miniMD benchmark &#8212; which exhibit complex parallel behaviours that are representative of many scientific applications. Using combinations of low-level intrinsic functions, OpenMP, CUDA and MPI, we demonstrate performance improvements of up to 7x for these workloads. We also detail a code development methodology that permits application developers to target multiple architecture types without maintaining completely separate implementations for each platform. Using OpenCL, we develop performance portable implementations of the LU and miniMD benchmarks that are faster than the original codes, and at most 2x slower than versions highly tuned for particular hardware.
Finally, we demonstrate the importance of evaluating architectures at scale (as opposed to on single nodes) through performance modelling techniques, highlighting the problems associated with strong-scaling on emerging accelerator architectures.</p>
<img src="http://feeds.feedburner.com/~r/hgpuorg/~4/SkdlK2Lsims" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://hgpu.org/?feed=rss2&amp;p=9427</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://hgpu.org/?p=9427</feedburner:origLink></item>
		<item>
		<title>Implementing Continuous Integration Software in an Established Computational Chemistry Software Package</title>
		<link>http://feedproxy.google.com/~r/hgpuorg/~3/ClUmqewIcyE/</link>
		<comments>http://hgpu.org/?p=9426#comments</comments>
		<pubDate>Tue, 21 May 2013 20:56:19 +0000</pubDate>
		<dc:creator>hgpu</dc:creator>
				<category><![CDATA[Chemistry]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[paper]]></category>
		<category><![CDATA[Benchmarking]]></category>
		<category><![CDATA[Computational chemistry]]></category>
		<category><![CDATA[Molecular dynamics]]></category>
		<category><![CDATA[nVidia]]></category>
		<category><![CDATA[Software Engineering]]></category>
		<category><![CDATA[Tesla M2090]]></category>

		<guid isPermaLink="false">http://hgpu.org/?p=9426</guid>
		<description><![CDATA[Continuous integration is the software engineering principle of rapid and automated development and testing. We identify several key points of continuous integration and demonstrate how they relate to the needs of computational science projects by discussing the implementation and relevance &#8230; <a href="http://hgpu.org/?p=9426"><span class="meta-nav">>>></span></a>]]></description>
				<content:encoded><![CDATA[<p>Continuous integration is the software engineering principle of rapid and automated development and testing. We identify several key points of continuous integration and demonstrate how they relate to the needs of computational science projects by discussing the implementation and relevance of these principles to AMBER, a large and widely used molecular dynamics software package. The use of a continuous integration server has both improved collaboration and communication between AMBER developers, who are globally distributed, and made failure and benchmark information, which would be time-consuming for individual developers to obtain by themselves, available in real time. Continuous integration servers currently available are aimed at the software engineering community and can be difficult to adapt to the needs of computational science projects. However, as demonstrated in this paper, the effort can pay off rapidly, since uncommon errors are found and contributions from geographically separated researchers are unified into one easily accessible web-based interface.</p>
<img src="http://feeds.feedburner.com/~r/hgpuorg/~4/ClUmqewIcyE" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://hgpu.org/?feed=rss2&amp;p=9426</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://hgpu.org/?p=9426</feedburner:origLink></item>
		<item>
		<title>An Investigation of the Performance Portability of OpenCL</title>
		<link>http://feedproxy.google.com/~r/hgpuorg/~3/w8Q9xw8qNT4/</link>
		<comments>http://hgpu.org/?p=9425#comments</comments>
		<pubDate>Tue, 21 May 2013 20:55:36 +0000</pubDate>
		<dc:creator>hgpu</dc:creator>
				<category><![CDATA[Computer science]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[OpenCL]]></category>
		<category><![CDATA[paper]]></category>
		<category><![CDATA[ATI]]></category>
		<category><![CDATA[ATI FirePro V7800]]></category>
		<category><![CDATA[Benchmarking]]></category>
		<category><![CDATA[Fortran]]></category>
		<category><![CDATA[MPI]]></category>
		<category><![CDATA[nVidia]]></category>
		<category><![CDATA[Package]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Tesla C1060]]></category>
		<category><![CDATA[Tesla C2050]]></category>

		<guid isPermaLink="false">http://hgpu.org/?p=9425</guid>
		<description><![CDATA[This paper reports on the development of an MPI/OpenCL implementation of LU, an application-level benchmark from the NAS Parallel Benchmark Suite. An account of the design decisions addressed during the development of this code is presented, demonstrating the importance of &#8230; <a href="http://hgpu.org/?p=9425"><span class="meta-nav">>>></span></a>]]></description>
				<content:encoded><![CDATA[<p>This paper reports on the development of an MPI/OpenCL implementation of LU, an application-level benchmark from the NAS Parallel Benchmark Suite. An account of the design decisions addressed during the development of this code is presented, demonstrating the importance of memory arrangement and work-item/work-group distribution strategies when applications are deployed on different device types. The resulting platform-agnostic, single-source application is benchmarked on a number of different architectures, and is shown to be 1.3-1.5x slower than native FORTRAN or CUDA implementations on a single node and 1.3-3.1x slower on multiple nodes. We also explore the potential performance gains of OpenCL&#8217;s device fissioning capability, demonstrating up to a 3x speed-up over our original OpenCL implementation.</p>
<img src="http://feeds.feedburner.com/~r/hgpuorg/~4/w8Q9xw8qNT4" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://hgpu.org/?feed=rss2&amp;p=9425</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://hgpu.org/?p=9425</feedburner:origLink></item>
		<item>
		<title>Super Earths and Dynamical Stability of Planetary Systems: First Parallel GPU Simulations Using GENGA</title>
		<link>http://feedproxy.google.com/~r/hgpuorg/~3/R1LzAYHGQqc/</link>
		<comments>http://hgpu.org/?p=9424#comments</comments>
		<pubDate>Tue, 21 May 2013 20:55:00 +0000</pubDate>
		<dc:creator>hgpu</dc:creator>
				<category><![CDATA[Astrophysics]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[paper]]></category>
		<category><![CDATA[Earth and Planetary Astrophysics]]></category>
		<category><![CDATA[N-body simulation]]></category>
		<category><![CDATA[nVidia]]></category>
		<category><![CDATA[nVidia GeForce GTX 590]]></category>
		<category><![CDATA[Particle simulation]]></category>

		<guid isPermaLink="false">http://hgpu.org/?p=9424</guid>
		<description><![CDATA[We report on the stability of hypothetical Super-Earths in the habitable zone of known multi-planetary systems. Most of them have not yet been studied in detail concerning the existence of additional low-mass planets. The new N-body code GENGA developed at &#8230; <a href="http://hgpu.org/?p=9424"><span class="meta-nav">>>></span></a>]]></description>
				<content:encoded><![CDATA[<p>We report on the stability of hypothetical Super-Earths in the habitable zone of known multi-planetary systems. Most of them have not yet been studied in detail concerning the existence of additional low-mass planets. The new N-body code GENGA developed at the UZH allows us to perform numerous N-body simulations in parallel on GPUs. With this numerical tool, we can study the stability of orbits of hypothetical planets in the semi-major axis and eccentricity parameter space in high resolution. Massless test particle simulations give good predictions of the extent of the stable region and show that HIP 14180 and HD 37124 do not provide stable orbits in the habitable zone. Based on these simulations, we carry out simulations of 10 Earth mass planets in several systems (HD 11964, HD 47186, HD 147018, HD 163607, HD 168443, HD 187123, HD 190360, HD 217107 and HIP 57274). They provide more exact information about orbits at the location of mean motion resonances and at the edges of the stability zones. Besides the stability of orbits, we study the secular evolution of the planets to constrain probable locations of hypothetical planets. Assuming that planetary systems are in general closely packed, we find that apart from HD 168443, all of the systems can harbor 10 Earth mass planets in the habitable zone.</p>
<img src="http://feeds.feedburner.com/~r/hgpuorg/~4/R1LzAYHGQqc" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://hgpu.org/?feed=rss2&amp;p=9424</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://hgpu.org/?p=9424</feedburner:origLink></item>
		<item>
		<title>3DES ECB Optimized for Massively Parallel CUDA GPU Architecture</title>
		<link>http://feedproxy.google.com/~r/hgpuorg/~3/z3GvI8p_Yjs/</link>
		<comments>http://hgpu.org/?p=9423#comments</comments>
		<pubDate>Tue, 21 May 2013 20:54:57 +0000</pubDate>
		<dc:creator>hgpu</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Computer science]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[paper]]></category>
		<category><![CDATA[nVidia]]></category>
		<category><![CDATA[nVidia GeForce GTS 250]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Tesla C2050]]></category>

		<guid isPermaLink="false">http://hgpu.org/?p=9423</guid>
		<description><![CDATA[Modern computers have graphics cards with much higher theoretical efficiency than conventional CPUs. The paper presents the possibilities of applying CUDA GPU acceleration to data encryption using a new architecture tailored to the 3DES algorithm, which is characterized by increased security compared to &#8230; <a href="http://hgpu.org/?p=9423"><span class="meta-nav">>>></span></a>]]></description>
				<content:encoded><![CDATA[<p>Modern computers have graphics cards with much higher theoretical efficiency than conventional CPUs. The paper presents the possibilities of applying CUDA GPU acceleration to data encryption using a new architecture tailored to the 3DES algorithm, which is characterized by increased security compared to the normal DES. The algorithm is used in ECB (Electronic Codebook) mode, in which 64-bit data blocks are encrypted independently by stream processors (CUDA cores).</p>
<img src="http://feeds.feedburner.com/~r/hgpuorg/~4/z3GvI8p_Yjs" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://hgpu.org/?feed=rss2&amp;p=9423</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://hgpu.org/?p=9423</feedburner:origLink></item>
	</channel>
</rss>
