<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
    <title>ServerFramework.com</title>
    <link rel="alternate" type="text/html" href="http://www.serverframework.com/" />
    
    <id>tag:www.serverframework.com,2010-04-27:/1</id>
    <updated />
    <subtitle>Leave the networking to us and get Online, On Time...
The C++ framework for developing highly scalable, high performance servers on Windows platforms. </subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type Pro 5.12</generator>

<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/Serverframework" /><feedburner:info uri="serverframework" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
    <title>Windows 8 Registered I/O Performance - 10 Gigabit networking... - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/uZuyMbbb_1I/windows-8-registered-io-performance---10-gigabit-networking.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1178</id>

    <published>2012-03-26T15:44:35Z</published>
    <updated>2012-03-26T15:56:42Z</updated>

    <summary> When I switched to looking at the performance of the more advanced RIO server designs that use IOCP it quickly became apparent that even multiple 1 Gigabit connections weren't enough of a challenge to give me any meaningful figures;...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
When I switched to looking at the performance of the more advanced RIO server designs that use IOCP it quickly became apparent that even multiple 1 Gigabit connections weren't enough of a challenge to give me any meaningful figures; my traditional IOCP datagram servers were easily able to keep up, and the workload per datagram had to be increased so far to stress them that the tests became meaningless. So, we've brought forward the purchase of the hardware that we intended to use for our private cloud scalability testing and we now have 2 Intel 10 Gigabit AT2 cards. Switch prices are still prohibitive for lab use and so these two cards are directly connected, point to point.
&lt;/div&gt; 
&lt;div&gt;&lt;br/&gt;&lt;/div&gt;
&lt;div&gt;
The good news is that we now have a 10 Gigabit network link between two of our test servers. The bad news is that I now have to work out how to use it; the traditional datagram generation program that I was previously using for testing simply doesn't scale far enough to saturate the new link.
&lt;/div&gt; 
 
        
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-performance---10-gigabit-networking.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O Performance - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/ZcB-hAsQW3w/windows-8-registered-io-performance.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1168</id>

    <published>2012-03-15T15:15:00Z</published>
    <updated>2012-03-19T09:53:07Z</updated>

    <summary> I've been looking at the Windows 8 Registered I/O Networking Extensions since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;some simple UDP servers&lt;/a&gt; using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Of course these comparisons should be taken as preliminary since we're working with a beta version of the operating system. However, though I wouldn't put much weight on the exact numbers until we have a non-beta OS to test on, it's useful to see how things are coming along and familiarise ourselves with the designs that might be required to take advantage of RIO once it ships. The main things to take away from these discussions of RIO's performance are the example server designs, the testing methods and a general understanding of why RIO performs better than the traditional Windows networking APIs. With these you can run your own tests, build your own servers and get value from using RIO where appropriate.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Do bear in mind that I'm learning as I go here; RIO is a new API and there is precious little in the way of documentation about how and why to use it. Comments and suggestions are more than welcome. Feel free to put me straight if I've made mistakes, and to submit code to help make these examples better.
&lt;/div&gt;
        &lt;h2 class="entry-body"&gt;How to test RIO's performance&lt;/h2&gt;
&lt;div&gt;
The tests consist of sending a large number of datagrams to the server under test. We send two sizes of datagram, the test datagram and the shutdown datagram. The server counts the datagrams that it receives and the time taken. It shuts down as soon as it receives a shutdown datagram. The servers that we are using for these tests &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;are detailed here&lt;/a&gt; and the datagram generator &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---generating-load-for-the-performance-tests.html"&gt;is available here&lt;/a&gt;.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Whilst the numbers that the servers report are useful for getting a rough idea of how the various APIs compare, they're not the whole story. It's also useful to look at performance counter logs that are taken whilst the test server is running. The CPU usage of the server under test, and of the entire machine, is a useful indicator of how much further we could push a given server. The number of datagrams received, and the number dropped by the network and by Winsock, are useful to see, as is the non-paged pool usage.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To make the testing repeatable I've put together some simple scripts which create the required performance logs using &lt;a href="http://technet.microsoft.com/en-us/library/cc753820(WS.10).aspx" target="_blank"&gt;logman&lt;/a&gt;, the command line interface to &lt;a href="http://technet.microsoft.com/en-us/library/bb490957.aspx" target="_blank"&gt;perfmon&lt;/a&gt;. This means that for each test run we can run a single command which creates and starts a performance counter log, runs the server and then stops the performance counter log. It would be nice to include custom performance counters in each of the example servers so that we can see more of what's going on inside, but whilst easy to do, using our &lt;a href="http://www.serverframework.com/products---the-performance-counters-option.html"&gt;Performance Counters Option pack&lt;/a&gt;, that's beyond the scope of these tests.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The test client, or clients for when we're using two network links into the test machine, are started manually. We could automate this with &lt;a href="http://www.windowsnetworking.com/articles_tutorials/How-Windows-Server-2008-WinRM-WinRS.html" target="_blank"&gt;winrs&lt;/a&gt;, as we've &lt;a href="http://www.lenholgate.com/blog/2010/05/performance-comparisons-for-recent-code-changes.html" target="_blank"&gt;done in the past&lt;/a&gt;, but these tests don't really warrant that level of complexity. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Our test system&lt;/h2&gt;
&lt;div&gt;
Our test system consists of a dual Xeon E5620 @ 2.40GHz; that's 16 logical CPUs in 2 NUMA nodes, with 16GB of memory. The machine has four 1Gb Ethernet network interfaces: a Broadcom BCM571C NetXtreme II GigE with two channels and an Intel 82576 Gigabit dual port adapter. We're using the Intel adapter for all of the tests shown here, sometimes using one NIC and sometimes two.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Windows Server 8 beta Datacentre edition is running directly on the hardware.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The client hardware is less impressive, but both client machines can push their 1Gb network interfaces to around 98% whilst running our datagram generator and that's more than enough for our purposes here.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;The first tests&lt;/h2&gt;
&lt;div&gt;
To get a feel for how the RIO API differs from the traditional APIs, the first test will compare a polled RIO server with a traditional, blocking, polled server. &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;The code for the servers is available here&lt;/a&gt; along with some commentary on their designs. You'll need Visual Studio 11 to build the examples.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The test scripts, mentioned above, can be &lt;a href="http://www.serverframework.com/zips/RIO-TestScripts.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-TestScripts.zip']);"&gt;downloaded from here&lt;/a&gt;. Each server has its own script and a text file that details the performance counters to capture during the test run. All of the scripts call a common script which sets up the performance counter log and then starts the server. You shouldn't start the clients until the server is running and has output its configuration details. Once the server receives its first datagram it will display "TimingStarted" and when it has received a shutdown datagram it will display "TimingStopped" along with the number of datagrams that it managed to receive, the time taken and the datagrams per second. You need to copy the x64 release builds of the example servers into the same directory as the test scripts and then be sure to run the batch file and not the exe directly.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
As an initial test we will run the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;traditional UDP server&lt;/a&gt; with one test client. We'll set the test client to send 10,000,000 datagrams, which takes a little over one minute. Once the test was completed the server reported that it had processed 9,952,510 datagrams in 86,880ms, a rate of 114,000 per second. Running the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;RIO polled server example&lt;/a&gt; with the same network load the results were broadly similar; 9,932,228 datagrams in 86,681ms, a rate of 114,000 per second.
&lt;/div&gt;
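&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The rates quoted above are simply the datagram count divided by the elapsed time. As a minimal sketch of that arithmetic, and not code taken from the example servers, the hypothetical helper below works in whole datagrams per second. The name DatagramsPerSecond is my own and the code assumes a compiler with constexpr support.
&lt;/div&gt;

```cpp
// Illustrative helper, not taken from the example servers.
// Whole datagrams per second from a datagram count and an elapsed time in
// milliseconds; the fractional part is discarded. elapsedMs must be non-zero.
constexpr unsigned long long DatagramsPerSecond(
   const unsigned long long datagrams,
   const unsigned long long elapsedMs)
{
   return (datagrams * 1000ULL) / elapsedMs;
}

// The counts and times reported by the two servers in the run described above.
constexpr unsigned long long traditionalRate = DatagramsPerSecond(9952510ULL, 86880ULL);
constexpr unsigned long long rioRate = DatagramsPerSecond(9932228ULL, 86681ULL);
```

&lt;div&gt;
Fed with the figures reported above, both servers work out at a little over 114,500 datagrams per second, which is where the rounded figure of 114,000 comes from.
&lt;/div&gt;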
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
At first glance it seems that RIO isn't so impressive. However, we need to remind ourselves of what these example servers are doing: all they're doing is pulling datagrams off the wire as fast as they can. They're both doing so on a single CPU of a 16 CPU machine and, from these results, it seems that, on this hardware, both APIs can quite easily handle a single saturated 1Gb network link.
&lt;/div&gt;
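&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To get a rough feel for what "saturated" means in datagrams per second, here is a back of the envelope sketch. Everything in it is an assumption on my part: the datagram size isn't stated here, so the payload size is a parameter, and the per-datagram overhead assumes IPv4 without options over standard Ethernet framing. The names are hypothetical and don't appear in the example code.
&lt;/div&gt;

```cpp
// Illustrative only: every constant here is an assumption, not a value taken
// from the example servers or the test scripts.
constexpr unsigned long long LINK_BITS_PER_SECOND = 1000000000ULL; // 1Gb Ethernet
constexpr unsigned long long ETHERNET_OVERHEAD = 38ULL; // preamble 8, header 14, FCS 4, inter-frame gap 12
constexpr unsigned long long IPV4_HEADER = 20ULL;       // assumes no IP options
constexpr unsigned long long UDP_HEADER = 8ULL;

// Upper bound on whole datagrams per second for a given UDP payload size.
constexpr unsigned long long MaxDatagramsPerSecond(const unsigned long long payloadBytes)
{
   return LINK_BITS_PER_SECOND /
      ((payloadBytes + UDP_HEADER + IPV4_HEADER + ETHERNET_OVERHEAD) * 8ULL);
}
```

&lt;div&gt;
For an assumed payload of 1,024 bytes this gives a ceiling of 114,678 datagrams per second. If the test datagrams are around that size then both servers are receiving at very nearly the rate that the link itself allows.
&lt;/div&gt;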
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Digging deeper into RIO's performance&lt;/h2&gt;
&lt;div&gt;
Whilst the two servers at first appear to behave almost identically under load, once we start looking at the performance counters we can see that the two APIs actually have completely different performance characteristics.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Here's the graph for the traditional UDP server. Note the thick blue line; that's the amount of time the process spends in kernel mode, on average 37.133% of its time.
&lt;a href="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-SimplePolledUDP_03151103-313.html" onclick="window.open('http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-SimplePolledUDP_03151103-313.html','popup','width=807,height=564,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false"&gt;&lt;img src="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-SimplePolledUDP_03151103-thumb-500x349-313.gif" width="500" height="349" alt="RIO-Perf-SimplePolledUDP_03151103.gif" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /&gt;&lt;/a&gt;
The graph from the RIO server is a little different. The thick blue line is still there, it's just that it's 0 most of the time. The average is 0.167%.
&lt;a href="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-RIOPolledUDP_03151109-316.html" onclick="window.open('http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-RIOPolledUDP_03151109-316.html','popup','width=784,height=541,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false"&gt;&lt;img src="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-RIOPolledUDP_03151109-thumb-500x345-316.gif" width="500" height="345" alt="RIO-Perf-RIOPolledUDP_03151109.gif" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /&gt;&lt;/a&gt;
Another thing worth noting is that the spinning that the RIO server does is obvious from the fact that it uses 100% of a CPU (see the thick red line) and that most of that is spent in user mode code (the dotted green line that runs across the thick red line).
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Another thing that is interesting to see is the non-paged pool usage. The RIO server uses a fixed amount, 8,064 bytes, for the life of the process, whereas the traditional server uses 4,192 bytes for most of the time but has some random peaks, the highest of which is 133,656 bytes.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Increasing the network load&lt;/h2&gt;
&lt;div&gt;
Running two clients, each sending 10,000,000 datagrams to a different network card on the test machine, gives us similar figures; the traditional server remains ahead of the RIO server with 19,847,578 datagrams to 19,279,842 datagrams. It seems that with the given hardware both APIs are capable of dealing with two saturated 1Gb links.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Increasing the work done per datagram received&lt;/h2&gt;
&lt;div&gt;
The example servers all have a &lt;code&gt;DoWork()&lt;/code&gt; function which allows us to add some "processing" for each datagram that is received. This gives us a slightly more realistic test as, except for &lt;a href="http://en.wikipedia.org/wiki/Discard_Protocol" target="_blank"&gt;discard servers&lt;/a&gt;, most servers need to do some work with each datagram that arrives.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Running the tests again, this time with a 'workload' of 100 gives the following results. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;b&gt;Traditional Server&lt;/b&gt;
&lt;ul&gt;
&lt;li&gt;6,386,294 datagrams out of 10,000,000 on one 1Gb link, 63%&lt;/li&gt;
&lt;li&gt;4,824,707 datagrams out of 20,000,000 on two 1Gb links, 24%&lt;/li&gt;
&lt;li&gt;38,830,887 datagrams out of 100,000,000 on two 1Gb links, 38%&lt;/li&gt;
&lt;/ul&gt;
&lt;b&gt;RIO Server&lt;/b&gt;
&lt;ul&gt;
&lt;li&gt;9,985,323 datagrams out of 10,000,000 on one 1Gb link, 99%&lt;/li&gt;
&lt;li&gt;19,730,003 datagrams out of 20,000,000 on two 1Gb links, 98%&lt;/li&gt;
&lt;li&gt;93,640,607 datagrams out of 100,000,000 on two 1Gb links, 93%&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
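&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The percentages in the lists above are the number of datagrams counted by the server as a proportion of the number sent. A minimal sketch of the calculation follows; the helper name PercentReceived is hypothetical, it is not part of the example servers, and it assumes a compiler with constexpr support.
&lt;/div&gt;

```cpp
// Illustrative helper, not taken from the example servers.
// Percentage of the datagrams sent that the server counted, truncated to a
// whole number. sent must be non-zero.
constexpr unsigned long long PercentReceived(
   const unsigned long long received,
   const unsigned long long sent)
{
   return (received * 100ULL) / sent;
}
```

&lt;div&gt;
Note that the result is truncated rather than rounded, which matches the whole number figures listed above; for example, 6,386,294 out of 10,000,000 is 63.86%, which is shown as 63%.
&lt;/div&gt;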
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Clearly a more realistic example allows the RIO API to show what it's capable of. Note that I ran a longer test, with two clients each sending 50,000,000 datagrams, as the second test showed some results that seemed to imply that the traditional server had become overwhelmed near the end of the test. The longer test was to see if it could recover. It didn't; it simply entered the overwhelmed state and stayed there until the end of the test. This is possibly due to the socket's receive buffer filling up.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
As can be seen from these graphs, the traditional server quickly gets into a state where it is dropping vast numbers of datagrams (thick pink line) whilst burning more user mode CPU than kernel mode CPU having maxed out the single CPU that it's running on.
&lt;a href="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-SimplePolledUDP_03151306-319.html" onclick="window.open('http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-SimplePolledUDP_03151306-319.html','popup','width=807,height=564,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false"&gt;&lt;img src="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-SimplePolledUDP_03151306-thumb-500x349-319.gif" width="500" height="349" alt="RIO-Perf-SimplePolledUDP_03151306.gif" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /&gt;&lt;/a&gt;
The RIO server doesn't drop any datagrams and the graph looks surprisingly like the previous one with no load.
&lt;a href="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-RIOPolledUDP_03151330-322.html" onclick="window.open('http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-RIOPolledUDP_03151330-322.html','popup','width=807,height=564,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false"&gt;&lt;img src="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-RIOPolledUDP_03151330-thumb-500x349-322.gif" width="500" height="349" alt="RIO-Perf-RIOPolledUDP_03151330.gif" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To look at how the performance of the RIO server degrades as the workload per datagram increases I ran some more tests.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;b&gt;RIO Server, 10,000,000 datagrams&lt;/b&gt;
&lt;ul&gt;
&lt;li&gt;9,985,323 datagrams, 99% at a workload of  100&lt;/li&gt;
&lt;li&gt;9,888,733 datagrams, 98% at a workload of  300&lt;/li&gt;
&lt;li&gt;7,174,653 datagrams, 71% at a workload of  500&lt;/li&gt;
&lt;li&gt;5,573,046 datagrams, 55% at a workload of  700&lt;/li&gt;
&lt;li&gt;4,361,820 datagrams, 43% at a workload of 1000&lt;/li&gt;
&lt;li&gt;2,927,590 datagrams, 29% at a workload of 2000&lt;/li&gt;
&lt;/ul&gt;
And just to compare...
&lt;br /&gt;
&lt;b&gt;Traditional server, 10,000,000 datagrams&lt;/b&gt;
&lt;ul&gt;
&lt;li&gt;2,522,667 datagrams, 25% at a workload of 1000&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Some conclusions&lt;/h2&gt;
&lt;div&gt;
Bear in mind that these results are specific to the test machine I was running on and that we're testing on a beta version of the Windows Server 8 operating system. Even so, the figures are impressive. The lack of kernel mode transitions allows much more CPU to be used for real work on each datagram that arrives. The registering of I/O buffers once at program start up reduces the work done per operation and also means that your server will use a known amount of non-paged pool rather than a completely variable amount. Though &lt;a href="http://www.lenholgate.com/blog/2009/03/excellent-article-on-non-paged-pool.html" target="_blank"&gt;non-paged pool is more plentiful than it used to be pre-Vista&lt;/a&gt; this is likely still an advantage.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The RIO API isn't especially complicated but your server designs will be different. The simple polling example server that we used here is unlikely to be an ideal choice as it uses 100% of its CPU for the whole time that the server is running. It's also a little unfair to compare RIO to such a simple traditional server, as there are better alternatives, but it's a useful line in the sand. As we'll see in the following performance articles there are better, more scalable ways to use both APIs.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
If you're interested in digging deeper into the results used in this article then all of the performance logs taken whilst running the tests are available &lt;a href="http://www.serverframework.com/zips/RIO-Perf-PerfLogs1.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-Perf-PerfLogs1.zip']);"&gt;here&lt;/a&gt;.
&lt;/div&gt;
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-performance.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O - Generating load for the performance tests - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/ctdIdMENlGw/windows-8-registered-io---generating-load-for-the-performance-tests.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1176</id>

    <published>2012-03-14T17:15:00Z</published>
    <updated>2012-03-14T17:19:06Z</updated>

    <summary> Now that we have five example servers, four RIO designs and a traditional polled UDP design, we can begin to look at how the RIO API performs compared to the traditional APIs. Of course these comparisons should be taken...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
Now that we have &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;five example servers&lt;/a&gt;, four RIO designs and a traditional polled UDP design, we can begin to look at how the RIO API performs compared to the traditional APIs. Of course these comparisons should be taken as preliminary since we're working with a beta version of the operating system. However, though I wouldn't put much weight on the exact numbers until we have a non-beta OS to test on, it's useful to see how things are coming along and familiarise ourselves with the designs that might be required to take advantage of RIO once it ships.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Sending a stream of datagrams&lt;/h2&gt;
&lt;div&gt;
Before we can compare performance we need to be able to push the example servers hard. We do this by sending a stream of datagrams at them as fast as we can for a period of time. The servers start timing when they get the first datagram and then count the number of datagrams that they process. The test finishes by sending a series of smaller datagrams at the server. When the server sees one of these smaller datagrams it shuts down and reports the time taken, the number of datagrams processed and the rate at which they were processed.
&lt;/div&gt;
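&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The protocol described above can be sketched as a small piece of bookkeeping on the server side. This is a hypothetical illustration and not the code from the example servers: the TestRun type and its members are names of my own, the value given to EXPECTED_DATA_SIZE is assumed, and timestamps are passed in as plain millisecond counts so that the sketch doesn't depend on any particular timer.
&lt;/div&gt;

```cpp
// Hypothetical sketch of the test protocol. The value of EXPECTED_DATA_SIZE
// is an assumption and TestRun is not a type from the example servers.
constexpr unsigned long EXPECTED_DATA_SIZE = 1024;

struct TestRun
{
   bool started = false;
   bool finished = false;
   unsigned long long startMs = 0;
   unsigned long long stopMs = 0;
   unsigned long long datagrams = 0;

   // Feed in the size and arrival time of one received datagram.
   // Returns true once a shutdown datagram has been seen.
   bool OnDatagram(const unsigned long bytes, const unsigned long long nowMs)
   {
      if (finished)
      {
         return true;            // ignore anything that arrives after shutdown
      }

      if (!started)
      {
         started = true;         // timing starts with the first datagram
         startMs = nowMs;
      }

      if (bytes == EXPECTED_DATA_SIZE)
      {
         ++datagrams;            // a test datagram; count it
         return false;
      }

      finished = true;           // any other size is a shutdown datagram
      stopMs = nowMs;
      return true;
   }

   // Only meaningful once OnDatagram() has returned true.
   unsigned long long ElapsedMs() const
   {
      return stopMs - startMs;
   }
};
```

&lt;div&gt;
Since UDP gives no delivery guarantee, sending a series of shutdown datagrams rather than just one makes it more likely that at least one of them reaches the server; in this sketch any that arrive after the first are simply ignored.
&lt;/div&gt;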
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
All we need to be able to do to stress the servers is to send datagrams at a rate that gets close to 100% utilisation of a 1Gb Ethernet link. This is fairly simple to achieve using the traditional blocking sockets API.
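&lt;pre class="brush: cpp gutter: false"&gt;   // Send the test datagrams back to back using blocking calls.
   for (size_t i = 0; i &amp;lt; DATAGRAMS_TO_SEND; ++i)
   {
      if (SOCKET_ERROR == ::WSASendTo(
         s,
         &amp;amp;buf,          // the buffer that holds the datagram
         1,             // number of buffers
         &amp;amp;bytesSent,
         flags,
         reinterpret_cast&amp;lt;sockaddr *&amp;gt;(&amp;amp;addr),   // address of the server under test
         sizeof(addr),
         0,             // no OVERLAPPED structure
         0))            // no completion routine
      {
         ErrorExit("WSASendTo");
      }
   }&lt;/pre&gt;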
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
There's not much more to it than that. We use similar code to set up and clean up, but if you've been following along with the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;other examples&lt;/a&gt; then there's nothing that needs to be explained about that.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The code for this example can be downloaded from &lt;a href="http://www.serverframework.com/zips/SimpleUDPTrafficGenerator.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'SimpleUDPTrafficGenerator.zip']);"&gt;here&lt;/a&gt;. This code requires Visual Studio 11, but would work with earlier compilers if you have a Windows SDK that supports RIO. Note that &lt;code&gt;Shared.h&lt;/code&gt; and &lt;code&gt;Constants.h&lt;/code&gt; contain helper functions and tuning constants for ALL of the examples and so there will be code in there that is not used by this example. You should be able to unzip each example into the same directory structure so that they all share the same shared headers. This allows you to tune all of the examples in the same way so that any performance comparisons make sense. This program can be run on versions of Windows prior to Windows 8, which is useful for testing as you only need one machine set up with the beta of Windows 8 server.
&lt;/div&gt;

        
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---generating-load-for-the-performance-tests.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O - Traditional Polled UDP Example Server - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/-arn1YIc1Jg/windows-8-registered-io---traditional-polled-udp-example-server.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1175</id>

    <published>2012-03-14T14:00:00Z</published>
    <updated>2012-03-14T17:24:36Z</updated>

    <summary> This article presents the fifth in my series of example servers for comparing the performance of the Windows 8 Registered I/O Networking extensions, RIO, and traditional Windows networking APIs. This example server is a traditional polled UDP design that...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
This article presents the fifth in my series of example servers for comparing the performance of the Windows 8 Registered I/O Networking extensions, RIO, and traditional Windows networking APIs. This example server is a traditional polled UDP design that we can use to compare to &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;the RIO polled UDP example server&lt;/a&gt;. I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together some simple UDP servers using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance. This series of blog posts describes each of the example servers in turn. You can find an index to all of the articles about the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;Windows 8 Registered I/O example servers here&lt;/a&gt;. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;A traditional polled UDP server&lt;/h2&gt;
&lt;div&gt;
This server is probably the simplest UDP server you could have. It's pretty much just a tight loop around a blocking call to &lt;code&gt;WSARecv()&lt;/code&gt;. There's none of the complexity required by RIO for registering memory buffers for I/O and so we use a single buffer that we create on the stack.
&lt;pre class="brush: cpp gutter: false"&gt;   do
   {
      workValue += DoWork(g_workIterations);

      if (SOCKET_ERROR == ::WSARecv(
         s,
         &amp;amp;buf,
         1,
         &amp;amp;bytesRecvd,
         &amp;amp;flags,
         0,
         0))
      {
         ErrorExit("WSARecv");
      }

      if (bytesRecvd == EXPECTED_DATA_SIZE)
      {
         g_packets++;
      }
      else
      {
         done = true;
      }
   }
   while (!done);&lt;/pre&gt;
There is some added complexity to allow us to compare performance, just as there is in the RIO server examples. We can add an arbitrary processing overhead to each datagram by setting &lt;code&gt;g_workIterations&lt;/code&gt; to a non-zero value. We count each datagram that arrives and stop the test when a datagram of an unexpected size is received.
&lt;/div&gt; 
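&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The real &lt;code&gt;DoWork()&lt;/code&gt; lives in &lt;code&gt;Shared.h&lt;/code&gt; and isn't reproduced in this article. To give a rough idea of the kind of per-datagram overhead that it could add, here's a minimal sketch. &lt;code&gt;BusyWork()&lt;/code&gt; is a hypothetical stand-in, named and written purely for illustration, and the code in the download may well differ.
&lt;pre class="brush: cpp gutter: false"&gt;#include &amp;lt;cstddef&amp;gt;

// Hypothetical stand-in for DoWork(); the real helper is in Shared.h and may differ.
// Spends CPU time in proportion to 'iterations' and returns a value derived
// from the loop so that the caller has a result to accumulate.
inline int BusyWork(
   const std::size_t iterations)
{
   unsigned int result = 0;

   for (std::size_t i = 0; i &amp;lt; iterations; ++i)
   {
      result = (result * 31u) + static_cast&amp;lt;unsigned int&amp;gt;(i);
   }

   return static_cast&amp;lt;int&amp;gt;(result % 1000u);
}&lt;/pre&gt;
With zero iterations the function does nothing and returns &lt;code&gt;0&lt;/code&gt;, which corresponds to running the test with no added processing overhead.
&lt;/div&gt;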
        &lt;h2 class="entry-body"&gt;Setting up for the datagram processing loop&lt;/h2&gt;
&lt;div&gt;
As with the RIO examples we do some setup before we can process datagrams. See the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;polled RIO example server&lt;/a&gt; for details of how and why we set up the timing system and initialise Winsock, and for details on our error handling policy.
&lt;pre class="brush: cpp gutter: false"&gt;int _tmain(int argc, _TCHAR* argv[])
{
   SetupTiming("Simple polled UDP");

   InitialiseWinsock();

   SOCKET s = CreateSocket();

   Bind(s, PORT);

   SetSocketRecvBufferToMaximum(s);

   bool done = false;

   CHAR buffer[RECV_BUFFER_SIZE];

   WSABUF buf;

   buf.buf = buffer;
   buf.len = RECV_BUFFER_SIZE;

   DWORD bytesRecvd = 0;

   DWORD flags = 0;

   if (SOCKET_ERROR == ::WSARecv(
      s,
      &amp;amp;buf,
      1,
      &amp;amp;bytesRecvd,
      &amp;amp;flags,
      0,
      0))
   {
      ErrorExit("WSARecv");
   }

   g_packets++;

   StartTiming();

   int workValue = 0;&lt;/pre&gt;
We create a traditional blocking UDP socket, bind it to a port and set its receive buffer size to the maximum. We then create our receive buffer on the stack, set up our &lt;code&gt;WSABUF&lt;/code&gt; and call &lt;code&gt;WSARecv()&lt;/code&gt; for the first time. We make this call outside of our processing loop so that we can start timing when the first datagram arrives. The code then proceeds into the processing loop, shown above, and processes datagrams until a datagram of an unexpected size signals that the test is complete.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The code for &lt;code&gt;CreateSocket()&lt;/code&gt;, &lt;code&gt;Bind()&lt;/code&gt; and &lt;code&gt;SetSocketRecvBufferToMaximum()&lt;/code&gt; can be found in &lt;code&gt;Shared.h&lt;/code&gt;. Remember that the use of globals isn't clever; it's simply convenient for some of the other example servers that use the shared code.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;After the processing loop&lt;/h2&gt;
&lt;div&gt;
Once the performance test completes we stop our timing and report the results.
&lt;pre class="brush: cpp gutter: false"&gt;   StopTiming();

   PrintTimings();

   return workValue;
}&lt;/pre&gt;
&lt;/div&gt;
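&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The timing helpers are in &lt;code&gt;Shared.h&lt;/code&gt; and aren't shown here. As a sketch of the kind of figure that the timings let us calculate, the following converts a datagram count and an elapsed time into a rate. &lt;code&gt;DatagramsPerSecond()&lt;/code&gt; is a hypothetical name, the use of &lt;code&gt;std::chrono&lt;/code&gt; is an assumption, and the real &lt;code&gt;PrintTimings()&lt;/code&gt; may report something different.
&lt;pre class="brush: cpp gutter: false"&gt;#include &amp;lt;chrono&amp;gt;

// Hypothetical reporting helper; the real timing code is in Shared.h and may differ.
// Converts a datagram count and an elapsed time into datagrams per second.
inline double DatagramsPerSecond(
   const long packets,
   const std::chrono::steady_clock::duration elapsed)
{
   const double seconds = std::chrono::duration&amp;lt;double&amp;gt;(elapsed).count();

   // Guard against a zero length test rather than dividing by zero.
   return seconds &amp;gt; 0.0 ? static_cast&amp;lt;double&amp;gt;(packets) / seconds : 0.0;
}&lt;/pre&gt;
A caller would take a &lt;code&gt;std::chrono::steady_clock::now()&lt;/code&gt; reading when the first datagram arrives and another when the test ends, and pass in the difference along with the datagram count.
&lt;/div&gt;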
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The code for this example can be downloaded from &lt;a href="http://www.serverframework.com/zips/SimplePolledUDP.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'SimplePolledUDP.zip']);"&gt;here&lt;/a&gt;. This code requires Visual Studio 11, but would work with earlier compilers if you have a Windows SDK that supports RIO. Note that &lt;code&gt;Shared.h&lt;/code&gt; and &lt;code&gt;Constants.h&lt;/code&gt; contain helper functions and tuning constants for ALL of the examples and so there will be code in there that is not used by this example. You should be able to unzip each example into the same directory structure so that they all use the same shared headers. This allows you to tune all of the examples in the same way so that any performance comparisons make sense.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;h2 class="entry-body"&gt;Join in&lt;/h2&gt;
Comments and suggestions are more than welcome. I'm learning as I go here and I'm quite likely to have made some mistakes or come up with some erroneous conclusions; feel free to put me straight and help make these examples better.
&lt;/div&gt;
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---traditional-polled-udp-example-server.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O - Multi threaded RIO IOCP UDP Example Server - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/3TyycbzWSpE/windows-8-registered-io---multi-threaded-rio-iocp-udp-example-server.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1174</id>

    <published>2012-03-12T18:10:00Z</published>
    <updated>2012-05-03T08:54:18Z</updated>

    <summary> This article presents the fourth in my series of example servers using the Windows 8 Registered I/O Networking extensions, RIO. This example server, like the last example, uses the I/O Completion Port notification method to handle RIO completions, but...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
This article presents the fourth in my series of example servers using the Windows 8 Registered I/O Networking extensions, RIO. This example server, like &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-iocp-udp-example-server.html"&gt;the last example&lt;/a&gt;, uses the I/O Completion Port notification method to handle RIO completions, but where the last example used only a single thread to service the IOCP, this one uses multiple threads to scale the load. I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together some simple UDP servers using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance. This series of blog posts describes each of the example servers in turn. You can find an index to all of the articles about the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;Windows 8 Registered I/O example servers here&lt;/a&gt;. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Using an I/O Completion Port for RIO completions&lt;/h2&gt;
&lt;div&gt;
As I mentioned back in October, there are &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;three ways to receive completion notifications from RIO&lt;/a&gt;: polling, event driven and via an I/O Completion Port. Using an IOCP for RIO completions allows you to easily scale your completion handling across multiple threads, as we do here, and this is the first of my example servers that allows more than one thread to be used to process completions.
&lt;/div&gt;
        &lt;h2 class="entry-body"&gt;Creating an IOCP driven RIO completion queue&lt;/h2&gt;
&lt;div&gt;
We start by initialising things in the same way that we did with the earlier &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;example RIO servers&lt;/a&gt;. In fact, this initialisation is &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-iocp-udp-example-server.html"&gt;identical to the previous IOCP example&lt;/a&gt; except for one thing.  
&lt;pre class="brush: cpp gutter: false"&gt;int _tmain(int argc, _TCHAR* argv[])
{
   SetupTiming("RIO IOCP UDP");

   InitialiseWinsock();

   CreateRIOSocket();

   g_hIOCP = ::CreateIoCompletionPort(
      INVALID_HANDLE_VALUE,
      0,
      0,
      0);

   if (0 == g_hIOCP)
   {
      ErrorExit("CreateIoCompletionPort");
   }

   OVERLAPPED overlapped;

   RIO_NOTIFICATION_COMPLETION completionType;

   completionType.Type = RIO_IOCP_COMPLETION;
   completionType.Iocp.IocpHandle = g_hIOCP;
   completionType.Iocp.CompletionKey = (void*)1;
   completionType.Iocp.Overlapped = &amp;amp;overlapped;

   g_queue = g_rio.RIOCreateCompletionQueue(
      RIO_PENDING_RECVS,
      &amp;amp;completionType);

   if (g_queue == RIO_INVALID_CQ)
   {
      ErrorExit("RIOCreateCompletionQueue");
   }&lt;/pre&gt;
With the previous design we passed &lt;code&gt;0&lt;/code&gt; as the completion key. This server passes &lt;code&gt;1&lt;/code&gt;. This is an arbitrary change, made purely so that we can post completions with a key of &lt;code&gt;0&lt;/code&gt; to cause all of the threads waiting on the I/O completion port to shut down. This is a common idiom with normal, non-RIO, IOCP designs, where the completion key is usually a pointer to a "per socket" data structure and so is never &lt;code&gt;0&lt;/code&gt; for a real completion. A RIO design with multiple completion queues would likely use the completion key for "per queue" data.
&lt;/div&gt;
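&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To make the "per queue" idea a little more concrete, here's a minimal, portable sketch of how a worker might interpret the completion key. &lt;code&gt;PerQueueData&lt;/code&gt; and &lt;code&gt;QueueFromCompletionKey()&lt;/code&gt; are hypothetical names, and &lt;code&gt;std::uintptr_t&lt;/code&gt; stands in for the &lt;code&gt;ULONG_PTR&lt;/code&gt; that &lt;code&gt;GetQueuedCompletionStatus()&lt;/code&gt; hands back. None of this is in the example server itself.
&lt;pre class="brush: cpp gutter: false"&gt;#include &amp;lt;cstdint&amp;gt;

// Hypothetical "per queue" data; a real server might hold a RIO_CQ in here.
struct PerQueueData
{
   int queueId;
};

// Interprets a completion key. A key of 0 is the shutdown sentinel and gives
// a null pointer; any other key is treated as a pointer to PerQueueData.
inline const PerQueueData *QueueFromCompletionKey(
   const std::uintptr_t completionKey)
{
   if (completionKey == 0)
   {
      return nullptr;
   }

   return reinterpret_cast&amp;lt;const PerQueueData *&amp;gt;(completionKey);
}&lt;/pre&gt;
A worker would call this with the key that it gets back from the IOCP, shut down if the result is null and otherwise dequeue from the queue that the structure refers to.
&lt;/div&gt;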
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Creating a RIO request queue&lt;/h2&gt;
&lt;div&gt;
Creating the request queue and posting our receives is identical to &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;the polled example&lt;/a&gt;. The only difference is how we handle the completions.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Starting our worker threads&lt;/h2&gt;
&lt;div&gt;
This example is the first to use more than just a single thread. Because of this we need to have a way to start and manage our worker threads and for them to communicate with each other and the main thread.
&lt;pre class="brush: cpp gutter: false"&gt;   CreateIOCPThreads(NUM_IOCP_THREADS);

   INT notifyResult = g_rio.RIONotify(g_queue);

   if (notifyResult != ERROR_SUCCESS)
   {
      ErrorExit("RIONotify", notifyResult);
   }

   WaitForProcessingStarted();&lt;/pre&gt;
First we call &lt;code&gt;CreateIOCPThreads()&lt;/code&gt;, which is shown below. This creates some events that the threads will use to communicate and then creates and starts the threads themselves. As with the earlier examples, we use globals for convenience and not as an example of good design.
&lt;pre class="brush: cpp gutter: false"&gt;inline void CreateIOCPThreads(
   const DWORD numThreads)
{
   g_hStartedEvent = ::CreateEvent(0, TRUE, FALSE, 0);

   if (0 == g_hStartedEvent)
   {
      ErrorExit("CreateEvent");
   }

   g_hStoppedEvent = ::CreateEvent(0, TRUE, FALSE, 0);

   if (0 == g_hStoppedEvent)
   {
      ErrorExit("CreateEvent");
   }

   // Start our worker threads

   for (DWORD i = 0; i &amp;lt; numThreads; ++i)
   {
      unsigned int notUsed;

      const uintptr_t result = ::_beginthreadex(
         0,
         0,
         ThreadFunction,
         0,
         0,
         &amp;amp;notUsed);

      if (result == 0)
      {
         ErrorExit("_beginthreadex", errno);
      }

      g_threads.push_back(reinterpret_cast&amp;lt;HANDLE&amp;gt;(result));
   }

   cout &amp;lt;&amp;lt; numThreads &amp;lt;&amp;lt; " threads running" &amp;lt;&amp;lt; endl;
}&lt;/pre&gt;
The main thread then calls &lt;code&gt;RIONotify()&lt;/code&gt; to enable notifications and then waits for the first datagram to be processed before it starts the timer.
&lt;pre class="brush: cpp gutter: false"&gt;inline void WaitForProcessingStarted()
{
   if (WAIT_OBJECT_0 != ::WaitForSingleObject(
      g_hStartedEvent,
      INFINITE))
   {
      ErrorExit("WaitForSingleObject");
   }

   StartTiming();
}&lt;/pre&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Calling RIODequeueCompletion() and processing results&lt;/h2&gt;
&lt;div&gt;
This example's processing loop is similar to the previous examples, especially the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-iocp-udp-example-server.html"&gt;single threaded IOCP example server&lt;/a&gt;. It's slightly more complicated because it's being run on separate worker threads.
&lt;pre class="brush: cpp gutter: false"&gt;unsigned int __stdcall ThreadFunction(
   void *pV)
{
   DWORD numberOfBytes = 0;

   ULONG_PTR completionKey = 0;

   OVERLAPPED *pOverlapped = 0;

   const DWORD recvFlags = 0;

   if (!::GetQueuedCompletionStatus(
      g_hIOCP,
      &amp;amp;numberOfBytes,
      &amp;amp;completionKey,
      &amp;amp;pOverlapped,
      INFINITE))
   {
      ErrorExit("GetQueuedCompletionStatus");
   }

   int workValue = 0;

   if (completionKey == 1)
   {
      RIORESULT results[RIO_MAX_RESULTS];

      bool done = false;

      ::SetEvent(g_hStartedEvent);

      ULONG numResults = g_rio.RIODequeueCompletion(
         g_queue,
         results,
         RIO_MAX_RESULTS);

      if (0 == numResults ||
          RIO_CORRUPT_CQ == numResults)
      {
         ErrorExit("RIODequeueCompletion");
      }

      INT notifyResult = g_rio.RIONotify(g_queue);

      if (notifyResult != ERROR_SUCCESS)
      {
         ErrorExit("RIONotify", notifyResult);
      }

      do
      {
         for (DWORD i = 0; i &amp;lt; numResults; ++i)
         {
            EXTENDED_RIO_BUF *pBuffer = reinterpret_cast&amp;lt;EXTENDED_RIO_BUF *&amp;gt;(results[i].RequestContext);

            if (results[i].BytesTransferred == EXPECTED_DATA_SIZE)
            {
               ::InterlockedIncrement(&amp;amp;g_packets);

               workValue += DoWork(g_workIterations);

               if (!g_rio.RIOReceive(
                  g_requestQueue,
                  pBuffer,
                  1,
                  recvFlags,
                  pBuffer))
               {
                  ErrorExit("RIOReceive");
               }

               done = false;
            }
            else
            {
               done = true;
            }
         }

         if (!done)
         {
            if (!::GetQueuedCompletionStatus(
               g_hIOCP,
               &amp;amp;numberOfBytes,
               &amp;amp;completionKey,
               &amp;amp;pOverlapped,
               INFINITE))
            {
               ErrorExit("GetQueuedCompletionStatus");
            }

            if (completionKey == 0)
            {
               done = true;
            }
            else
            {
               numResults = g_rio.RIODequeueCompletion(
                  g_queue,
                  results,
                  RIO_MAX_RESULTS);

               if (0 == numResults ||
                   RIO_CORRUPT_CQ == numResults)
               {
                  ErrorExit("RIODequeueCompletion");
               }

               INT notifyResult = g_rio.RIONotify(g_queue);

               if (notifyResult != ERROR_SUCCESS)
               {
                  ErrorExit("RIONotify", notifyResult);
               }
            }
         }
      }
      while (!done);
   }

   ::SetEvent(g_hStoppedEvent);

   return workValue;
}&lt;/pre&gt;
The first thing we do is wait for a completion. Once we have a completion we dequeue the results and then call &lt;code&gt;RIONotify()&lt;/code&gt; to allow more completions to occur. It's important to realise that, until we call &lt;code&gt;RIONotify()&lt;/code&gt;, no further completions will be posted to the I/O Completion Port and that this effectively acts as synchronisation around the calls to &lt;code&gt;RIODequeueCompletion()&lt;/code&gt;. With this design only one thread can ever be calling &lt;code&gt;RIODequeueCompletion()&lt;/code&gt; at a time, which is a good thing as &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/hh448845(v=vs.85).aspx" target="_blank"&gt;the documentation for &lt;code&gt;RIODequeueCompletion()&lt;/code&gt;&lt;/a&gt; states that this is a requirement for users of the API.
&lt;/div&gt;
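&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Since the design relies on only one thread being inside &lt;code&gt;RIODequeueCompletion()&lt;/code&gt; at a time, it can be handy to check that assumption in a debug build. The following is a minimal sketch of such a check. &lt;code&gt;SingleDequeuerCheck&lt;/code&gt; is a hypothetical helper that isn't part of the example server.
&lt;pre class="brush: cpp gutter: false"&gt;#include &amp;lt;atomic&amp;gt;

// Hypothetical debug aid, not part of the downloadable example.
// Records whether a thread is currently inside the dequeue section.
class SingleDequeuerCheck
{
   public :

      // Marks the section as in use. Returns true if no other thread
      // was already inside it, false if the assumption has been broken.
      bool Enter()
      {
         return !m_inUse.exchange(true);
      }

      // Marks the section as free again.
      void Leave()
      {
         m_inUse.store(false);
      }

   private :

      std::atomic&amp;lt;bool&amp;gt; m_inUse{false};
};&lt;/pre&gt;
The idea would be to call &lt;code&gt;Enter()&lt;/code&gt; just before &lt;code&gt;RIODequeueCompletion()&lt;/code&gt;, treat a &lt;code&gt;false&lt;/code&gt; result as a broken assumption, and call &lt;code&gt;Leave()&lt;/code&gt; once the dequeue returns and before &lt;code&gt;RIONotify()&lt;/code&gt; is called.
&lt;/div&gt;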
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Remember that this example is made more complex due to the way we profile the servers. See the explanation in the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;completion handling section of the polled RIO server example&lt;/a&gt; for details of why this is.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Shutting down and displaying results&lt;/h2&gt;
&lt;div&gt;
Whilst our worker threads are processing datagrams our main thread is waiting for the performance test to end. 
&lt;pre class="brush: cpp gutter: false"&gt;   WaitForProcessingStopped();

   StopIOCPThreads();

   PrintTimings();

   return 0;
}&lt;/pre&gt;
Our thread function's main loop can exit in two ways. The first is when a datagram arrives that isn't of the expected size, which signals the end of the performance test. The second is when &lt;code&gt;GetQueuedCompletionStatus()&lt;/code&gt; returns a completion key of &lt;code&gt;0&lt;/code&gt;, which means that the main thread has posted completions to request that we shut down. This means that when a "shutdown" datagram arrives, the first thread to process it will shut down and set the &lt;code&gt;g_hStoppedEvent&lt;/code&gt; event. The main thread is waiting on this event; when the event is set it wakes and shuts the rest of the worker threads down. Once all of the threads have terminated the main thread displays details of the datagrams received and the test timings.
&lt;pre class="brush: cpp gutter: false"&gt;inline void WaitForProcessingStopped()
{
   if (WAIT_OBJECT_0 != ::WaitForSingleObject(
      g_hStoppedEvent,
      INFINITE))
   {
      ErrorExit("WaitForSingleObject");
   }

   StopTiming();
}

inline void StopIOCPThreads()
{
   // Tell all threads to exit

   for (Threads::const_iterator it = g_threads.begin(),
      end = g_threads.end();
      it != end;
      ++it)
   {
      if (0 == ::PostQueuedCompletionStatus(
         g_hIOCP,
         0,
         0,
         0))
      {
         ErrorExit("PostQueuedCompletionStatus");
      }
   }

   cout &amp;lt;&amp;lt; "Threads stopping" &amp;lt;&amp;lt; endl;

   // Wait for all threads to exit

   for (Threads::const_iterator it = g_threads.begin(),
      end = g_threads.end();
      it != end;
      ++it)
   {
      HANDLE hThread = *it;

      if (WAIT_OBJECT_0 != ::WaitForSingleObject(
         hThread,
         INFINITE))
      {
         ErrorExit("WaitForSingleObject");
      }

      ::CloseHandle(hThread);
   }   

   cout &amp;lt;&amp;lt; "Threads stopped" &amp;lt;&amp;lt; endl;
}&lt;/pre&gt;
&lt;/div&gt;
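&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Note that &lt;code&gt;StopIOCPThreads()&lt;/code&gt; posts one completion per thread. Each call to &lt;code&gt;GetQueuedCompletionStatus()&lt;/code&gt; consumes one completion and so every worker needs its own &lt;code&gt;0&lt;/code&gt; key. The portable sketch below models that using standard C++ threads and a simple blocking queue in place of the IOCP. &lt;code&gt;KeyQueue&lt;/code&gt; and &lt;code&gt;RunAndStopWorkers()&lt;/code&gt; are hypothetical names used purely for illustration.
&lt;pre class="brush: cpp gutter: false"&gt;#include &amp;lt;atomic&amp;gt;
#include &amp;lt;condition_variable&amp;gt;
#include &amp;lt;cstddef&amp;gt;
#include &amp;lt;deque&amp;gt;
#include &amp;lt;mutex&amp;gt;
#include &amp;lt;thread&amp;gt;
#include &amp;lt;vector&amp;gt;

// Hypothetical, portable stand-in for the IOCP: a blocking queue of keys.
class KeyQueue
{
   public :

      // Plays the role of PostQueuedCompletionStatus().
      void Post(
         const std::size_t key)
      {
         {
            std::lock_guard&amp;lt;std::mutex&amp;gt; lock(m_mutex);

            m_keys.push_back(key);
         }

         m_ready.notify_one();
      }

      // Plays the role of GetQueuedCompletionStatus(); blocks until a key arrives.
      std::size_t Get()
      {
         std::unique_lock&amp;lt;std::mutex&amp;gt; lock(m_mutex);

         m_ready.wait(lock, [this] { return !m_keys.empty(); });

         const std::size_t key = m_keys.front();

         m_keys.pop_front();

         return key;
      }

   private :

      std::mutex m_mutex;
      std::condition_variable m_ready;
      std::deque&amp;lt;std::size_t&amp;gt; m_keys;
};

// Starts numThreads workers, posts one 0 key per worker and waits for them all.
// Returns the number of workers that saw a 0 key and exited.
inline std::size_t RunAndStopWorkers(
   const std::size_t numThreads)
{
   KeyQueue queue;

   std::atomic&amp;lt;std::size_t&amp;gt; stopped(0);

   std::vector&amp;lt;std::thread&amp;gt; threads;

   for (std::size_t i = 0; i &amp;lt; numThreads; ++i)
   {
      threads.emplace_back([&amp;amp;queue, &amp;amp;stopped]
      {
         while (queue.Get() != 0)
         {
            // A non-zero key would be handled here.
         }

         ++stopped;
      });
   }

   // One sentinel per worker, as each worker consumes exactly one before exiting.
   for (std::size_t i = 0; i &amp;lt; numThreads; ++i)
   {
      queue.Post(0);
   }

   for (std::thread &amp;amp;thread : threads)
   {
      thread.join();
   }

   return stopped.load();
}&lt;/pre&gt;
If fewer sentinels were posted than there are workers, the remaining workers would stay blocked in &lt;code&gt;Get()&lt;/code&gt; and the joins would never complete.
&lt;/div&gt;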
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Unexpected performance issues...&lt;/h2&gt;
&lt;div&gt;
The slight problem with this design is that, in some scenarios, it doesn't perform as well as we might like. Being able to scale out across multiple threads is a plus point, but the operations that we have to perform to achieve that scaling are considerably more expensive. This is more of an issue when we're looking for a general purpose solution which works as well for low throughput and fast processing of each datagram as it does for high throughput and/or slow processing. Luckily there are a couple of things we can do to fix this, but we'll look at those once we've done some performance comparisons and seen the problems first hand.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The code for this example can be downloaded from &lt;a href="http://www.serverframework.com/zips/RIO-IOCPUDPMT.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-IOCPUDPMT.zip']);"&gt;here&lt;/a&gt;. This code requires Visual Studio 11, but would work with earlier compilers if you have a Windows SDK that supports RIO. Note that &lt;code&gt;Shared.h&lt;/code&gt; and &lt;code&gt;Constants.h&lt;/code&gt; contain helper functions and tuning constants for ALL of the examples and so there will be code in there that is not used by this example. You should be able to unzip each example into the same directory structure so that they all use the same shared headers. This allows you to tune all of the examples in the same way so that any performance comparisons make sense.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;h2 class="entry-body"&gt;Join in&lt;/h2&gt;
Comments and suggestions are more than welcome. I'm learning as I go here and I'm quite likely to have made some mistakes or come up with some erroneous conclusions; feel free to put me straight and help make these examples better.
&lt;/div&gt;

    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---multi-threaded-rio-iocp-udp-example-server.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O - Single threaded RIO IOCP UDP Example Server - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/TeqJ_vsRtew/windows-8-registered-io---single-threaded-rio-iocp-udp-example-server.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1171</id>

    <published>2012-03-12T15:45:00Z</published>
    <updated>2012-03-13T08:34:24Z</updated>

    <summary> This article presents the third in my series of example servers using the Windows 8 Registered I/O Networking extensions, RIO. This example server uses the I/O Completion Port notification method to handle RIO completions, but only uses a single...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
This article presents the third in my series of example servers using the Windows 8 Registered I/O Networking extensions, RIO. This example server uses the I/O Completion Port notification method to handle RIO completions, but only uses a single thread to service the IOCP. I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together some simple UDP servers using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance. This series of blog posts describes each of the example servers in turn. You can find an index to all of the articles about the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;Windows 8 Registered I/O example servers here&lt;/a&gt;. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Using an I/O Completion Port for RIO completions&lt;/h2&gt;
&lt;div&gt;
As I mentioned back in October, there are &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;three ways to receive completion notifications from RIO&lt;/a&gt;: polling, event driven and via an I/O Completion Port. Using an IOCP for RIO completions allows you to easily scale your completion handling across multiple threads, though in this first IOCP example server we use a single thread so that we can compare the performance against the polled and event driven servers. The next example server will adapt this server for multiple threads and allow us to scale our completion processing across more CPUs.
&lt;/div&gt;

        &lt;h2 class="entry-body"&gt;Creating an IOCP driven RIO completion queue&lt;/h2&gt;
&lt;div&gt;
We start by initialising things in the same way that we did with the earlier &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;example RIO servers&lt;/a&gt;.  
&lt;pre class="brush: cpp gutter: false"&gt;int _tmain(int argc, _TCHAR* argv[])
{
   SetupTiming("RIO IOCP UDP");

   InitialiseWinsock();

   CreateRIOSocket();

   g_hIOCP = ::CreateIoCompletionPort(
      INVALID_HANDLE_VALUE,
      0,
      0,
      0);

   if (0 == g_hIOCP)
   {
      ErrorExit("CreateIoCompletionPort");
   }

   OVERLAPPED overlapped;

   RIO_NOTIFICATION_COMPLETION completionType;

   completionType.Type = RIO_IOCP_COMPLETION;
   completionType.Iocp.IocpHandle = g_hIOCP;
   completionType.Iocp.CompletionKey = (void*)0;
   completionType.Iocp.Overlapped = &amp;amp;overlapped;

   g_queue = g_rio.RIOCreateCompletionQueue(
      RIO_PENDING_RECVS,
      &amp;amp;completionType);

   if (g_queue == RIO_INVALID_CQ)
   {
      ErrorExit("RIOCreateCompletionQueue");
   }&lt;/pre&gt;
Once that is done we create an I/O Completion Port and then create a RIO completion queue which uses the IOCP for notification. In this simple design we have no need for a completion key: we only have a single completion queue, so there's no need to differentiate between completion types. We also use a plain old &lt;code&gt;OVERLAPPED&lt;/code&gt; rather than extending it to carry more information. More complex designs could use either the completion key or an extended overlapped structure to pass queue specific information to our completion handler, in much the same way that we do with normal IOCP server designs.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Creating a RIO request queue&lt;/h2&gt;
&lt;div&gt;
Creating the request queue and posting our receives is identical to &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;the polled example&lt;/a&gt;. The only difference is how we handle the completions.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Calling RIODequeueCompletion() and processing results&lt;/h2&gt;
&lt;div&gt;
Processing completions is almost identical to &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-event-driven-udp-example-server.html"&gt;processing event driven completions&lt;/a&gt;. We simply change the call to &lt;code&gt;WaitForSingleObject()&lt;/code&gt; that we were using in the event driven example to the following to retrieve a completion notification from the IOCP.
&lt;pre class="brush: cpp gutter: false"&gt;   DWORD numberOfBytes = 0;

   ULONG_PTR completionKey = 0;

   OVERLAPPED *pOverlapped = 0;

   if (!::GetQueuedCompletionStatus(
      g_hIOCP,
      &amp;amp;numberOfBytes,
      &amp;amp;completionKey,
      &amp;amp;pOverlapped,
      INFINITE))
   {
      ErrorExit("GetQueuedCompletionStatus");
   }&lt;/pre&gt;
Everything else is identical. Things change somewhat when we switch to using multiple threads for our completion handling.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The code for this example can be downloaded from &lt;a href="http://www.serverframework.com/zips/RIO-IOCPUDP.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-IOCPUDP.zip']);"&gt;here&lt;/a&gt;. This code requires Visual Studio 11, but would work with earlier compilers if you have a Windows SDK that supports RIO. Note that &lt;code&gt;Shared.h&lt;/code&gt; and &lt;code&gt;Constants.h&lt;/code&gt; contain helper functions and tuning constants for ALL of the examples and so there will be code in there that is not used by this example. You should be able to unzip each example into the same directory structure so that they all use the same shared headers. This allows you to tune all of the examples in the same way so that any performance comparisons make sense.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;h2 class="entry-body"&gt;Join in&lt;/h2&gt;
Comments and suggestions are more than welcome. I'm learning as I go here and I'm quite likely to have made some mistakes or come up with some erroneous conclusions; feel free to put me straight and help make these examples better.
&lt;/div&gt;
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-iocp-udp-example-server.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O - Single threaded RIO Event Driven UDP Example Server - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/ZMFsIwiTQq0/windows-8-registered-io---single-threaded-rio-event-driven-udp-example-server.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1172</id>

    <published>2012-03-09T22:25:00Z</published>
    <updated>2012-03-14T17:11:18Z</updated>

    <summary> This article presents the second in my series of example servers using the Windows 8 Registered I/O Networking extensions, RIO. This example server uses the event driven notification method to handle RIO completions. I've been looking at the Windows...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
This article presents the second in my series of example servers using the Windows 8 Registered I/O Networking extensions, RIO. This example server uses the event driven notification method to handle RIO completions. I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together some simple UDP servers using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance. This series of blog posts describes each of the example servers in turn. You can find an index to all of the articles about the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;Windows 8 Registered I/O example servers here&lt;/a&gt;. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Using an event for RIO completions&lt;/h2&gt;
&lt;div&gt;
As I mentioned back in October, there are &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;three ways to receive completion notifications from RIO&lt;/a&gt;: polling, event driven and via an I/O Completion Port. Using the event driven approach is similar to using the polling approach that I described in &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;the previous article&lt;/a&gt; except that the server doesn't burn CPU in a tight polling loop.
&lt;/div&gt;
        &lt;h2 class="entry-body"&gt;Creating an event driven RIO completion queue&lt;/h2&gt;
&lt;div&gt;
We start by initialising things in the same way that we did with the earlier &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;example RIO servers&lt;/a&gt;.  
&lt;pre class="brush: cpp gutter: false"&gt;int _tmain(int argc, _TCHAR* argv[])
{
   SetupTiming("RIO Event Driven UDP");

   InitialiseWinsock();

   CreateRIOSocket();

   HANDLE hEvent = WSACreateEvent();

   if (hEvent == WSA_INVALID_EVENT)
   {
      ErrorExit("WSACreateEvent");
   }

   RIO_NOTIFICATION_COMPLETION completionType;

   completionType.Type = RIO_EVENT_COMPLETION;
   completionType.Event.EventHandle = hEvent;
   completionType.Event.NotifyReset = TRUE;

   g_queue = g_rio.RIOCreateCompletionQueue(
      RIO_PENDING_RECVS,
      &amp;amp;completionType);

   if (g_queue == RIO_INVALID_CQ)
   {
      ErrorExit("RIOCreateCompletionQueue");
   }&lt;/pre&gt;
Once that is done we create an event and then create a RIO completion queue which uses the event for notification. The event is signalled when there are completions to process and reset when we call &lt;code&gt;RIONotify()&lt;/code&gt;.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Creating a RIO request queue&lt;/h2&gt;
&lt;div&gt;
Creating the request queue and posting our receives is identical to &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;the polled example&lt;/a&gt;. The only difference is how we handle the completions.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Calling RIODequeueCompletion() and processing results&lt;/h2&gt;
&lt;div&gt;
The processing loop is, again, similar to the polled example. Unsurprisingly, rather than polling we wait on the event and dequeue the completions once the event is set. This reduces the amount of CPU used as there's no need to spin whilst waiting for new datagrams to process. The only complication is that we need to call &lt;code&gt;RIONotify()&lt;/code&gt; to indicate that we're ready to process more completions. Note that in a real server you would probably want to wait on both your 'completions available' event and a 'we're ready to shut down' event so that you can shut the server down cleanly.
&lt;pre class="brush: cpp gutter: false"&gt;   bool done = false;

   DWORD recvFlags = 0;

   RIORESULT results[RIO_MAX_RESULTS];

   const INT notifyResult = g_rio.RIONotify(g_queue);

   if (notifyResult != ERROR_SUCCESS)
   {
      ErrorExit("RIONotify");
   }

   const DWORD waitResult = WaitForSingleObject(
      hEvent,
      INFINITE);

   if (waitResult != WAIT_OBJECT_0)
   {
      ErrorExit("WaitForSingleObject");
   }

   ULONG numResults = g_rio.RIODequeueCompletion(
      g_queue,
      results,
      RIO_MAX_RESULTS);

   if (0 == numResults ||
       RIO_CORRUPT_CQ == numResults)
   {
      ErrorExit("RIODequeueCompletion");
   }

   StartTiming();

   int workValue = 0;

   bool running = true;

   do
   {
      for (DWORD i = 0; i &amp;lt; numResults; ++i)
      {
         EXTENDED_RIO_BUF *pBuffer = reinterpret_cast&amp;lt;EXTENDED_RIO_BUF *&amp;gt;(results[i].RequestContext);

         if (results[i].BytesTransferred == EXPECTED_DATA_SIZE)
         {
            g_packets++;

            workValue += DoWork(g_workIterations);

            if (!g_rio.RIOReceive(
               g_requestQueue,
               pBuffer,
               1,
               recvFlags,
               pBuffer))
            {
               ErrorExit("RIOReceive");
            }

            done = false;
         }
         else
         {
            done = true;
         }
      }

      if (!done)
      {
         const INT notifyResult = g_rio.RIONotify(g_queue);

         if (notifyResult != ERROR_SUCCESS)
         {
            ErrorExit("RIONotify");
         }

         const DWORD waitResult = WaitForSingleObject(
            hEvent,
            INFINITE);

         if (waitResult != WAIT_OBJECT_0)
         {
            ErrorExit("WaitForSingleObject");
         }

         numResults = g_rio.RIODequeueCompletion(
            g_queue,
            results,
            RIO_MAX_RESULTS);

         if (0 == numResults ||
             RIO_CORRUPT_CQ == numResults)
         {
            ErrorExit("RIODequeueCompletion");
         }
      }
   }
   while (!done);

   StopTiming();

   PrintTimings();

   return workValue;
}&lt;/pre&gt;
As &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;before&lt;/a&gt;, the structure of the processing loop is complicated somewhat by the fact that we want to start and stop the timing for the performance testing, and the &lt;code&gt;DoWork()&lt;/code&gt; function can be used to add 'processing overhead' to each datagram. This can be configured using &lt;code&gt;g_workIterations&lt;/code&gt;, which is defined in &lt;code&gt;Constants.h&lt;/code&gt;. With this set to 0 there is no overhead and we can compare how quickly each API can receive datagrams. Setting larger values will affect how the various multi-threaded examples perform and can be useful if you're unable to saturate the test machine's network interfaces.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
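&lt;div&gt;
As a rough sketch of the shutdown point mentioned above, the single wait could be replaced by a wait on two events. Nothing here is taken from the downloadable example: &lt;code&gt;WaitForCompletionsOrShutdown()&lt;/code&gt; and &lt;code&gt;hShutdownEvent&lt;/code&gt; are hypothetical names, and &lt;code&gt;ErrorExit()&lt;/code&gt; is assumed to be the helper from &lt;code&gt;Shared.h&lt;/code&gt;.
&lt;pre class="brush: cpp gutter: false"&gt;// Hypothetical helper, not part of the example code.
// Returns true if completions are ready to dequeue and
// false if a shutdown has been requested.
inline bool WaitForCompletionsOrShutdown(
   HANDLE hCompletionEvent,
   HANDLE hShutdownEvent)
{
   // The shutdown event comes first so that it takes priority
   // if both events are signalled at the same time.
   HANDLE handles[2] = { hShutdownEvent, hCompletionEvent };

   const DWORD waitResult = WaitForMultipleObjects(
      2,
      handles,
      FALSE,      // wake when either event is signalled
      INFINITE);

   if (waitResult == WAIT_OBJECT_0)
   {
      return false;
   }

   if (waitResult != WAIT_OBJECT_0 + 1)
   {
      ErrorExit("WaitForMultipleObjects");
   }

   return true;
}&lt;/pre&gt;
The processing loop would then also finish when this returns &lt;code&gt;false&lt;/code&gt;, rather than relying only on a datagram of an unexpected size.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;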
&lt;div&gt;
This example can be optimised slightly so that we revert to straight polling as long as calling &lt;code&gt;RIODequeueCompletion()&lt;/code&gt; returns us at least one result. We'll look at this variation after we've studied the performance of the example shown here.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
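&lt;div&gt;
To give a feel for the idea, here is a minimal sketch of how that optimisation might look. &lt;code&gt;DequeueOrWait()&lt;/code&gt; is a hypothetical helper that is not in the downloadable code; it assumes the &lt;code&gt;g_rio&lt;/code&gt; and &lt;code&gt;g_queue&lt;/code&gt; globals and the &lt;code&gt;ErrorExit()&lt;/code&gt; helper used by the examples, and the variation that we look at later may well differ.
&lt;pre class="brush: cpp gutter: false"&gt;// Sketch only: DequeueOrWait() is a hypothetical helper.
inline ULONG DequeueOrWait(
   HANDLE hEvent,
   RIORESULT *pResults,
   const ULONG maxResults)
{
   // Try to dequeue without waiting first.
   ULONG numResults = g_rio.RIODequeueCompletion(
      g_queue,
      pResults,
      maxResults);

   if (0 == numResults)
   {
      // The queue is empty, so request a notification and block.
      const INT notifyResult = g_rio.RIONotify(g_queue);

      if (notifyResult != ERROR_SUCCESS)
      {
         ErrorExit("RIONotify");
      }

      const DWORD waitResult = WaitForSingleObject(
         hEvent,
         INFINITE);

      if (waitResult != WAIT_OBJECT_0)
      {
         ErrorExit("WaitForSingleObject");
      }

      numResults = g_rio.RIODequeueCompletion(
         g_queue,
         pResults,
         maxResults);
   }

   if (0 == numResults ||
       RIO_CORRUPT_CQ == numResults)
   {
      ErrorExit("RIODequeueCompletion");
   }

   return numResults;
}&lt;/pre&gt;
The trade-off is that &lt;code&gt;RIONotify()&lt;/code&gt; and the wait are only paid for when the queue runs dry, at the cost of one extra dequeue call each time it does.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;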
&lt;div&gt;
The code for this example can be downloaded from &lt;a href="http://www.serverframework.com/zips/RIO-EventDrivenUDP.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-EventDrivenUDP.zip']);"&gt;here&lt;/a&gt;. This code requires Visual Studio 11, but would work with earlier compilers if you have a Windows SDK that supports RIO. Note that &lt;code&gt;Shared.h&lt;/code&gt; and &lt;code&gt;Constants.h&lt;/code&gt; contain helper functions and tuning constants for ALL of the examples and so there will be code in there that is not used by this example. You should be able to unzip each example into the same directory structure so that they all share the same shared headers. This allows you to tune all of the examples the same so that any performance comparisons make sense.   
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;h2 class="entry-body"&gt;Join in&lt;/h2&gt;
Comments and suggestions are more than welcome. I'm learning as I go here and I'm quite likely to have made some mistakes or come up with some erroneous conclusions, so feel free to put me straight and help make these examples better.
&lt;/div&gt;

    &lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/Serverframework?a=ZMFsIwiTQq0:Z_N7N7gn_0I:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/Serverframework?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/Serverframework?a=ZMFsIwiTQq0:Z_N7N7gn_0I:ZC7T4KBF6Nw"&gt;&lt;img src="http://feeds.feedburner.com/~ff/Serverframework?d=ZC7T4KBF6Nw" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/Serverframework/~4/ZMFsIwiTQq0" height="1" width="1"/&gt;</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-event-driven-udp-example-server.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O - Simple RIO Polled UDP Example Server - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/KiFFoHzoOBo/windows-8-registered-io---simple-rio-polled-udp-example-server.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1170</id>

    <published>2012-03-07T11:28:00Z</published>
    <updated>2012-03-13T18:46:54Z</updated>

    <summary> I've been looking at the Windows 8 Registered I/O Networking Extensions since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together some simple UDP servers using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance. This series of blog posts describes each of the example servers in turn. You can find an index to all of the articles about the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;Windows 8 Registered I/O example servers here&lt;/a&gt;. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Polling RIO for completions&lt;/h2&gt;
&lt;div&gt;
As I mentioned back in October, there are &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;three ways to receive completion notifications from RIO&lt;/a&gt;: polling, event driven and via an I/O Completion Port. The first is the simplest, though it burns CPU time even when no datagrams are being received.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
At its simplest a polled RIO server obtains datagrams to process like this: 
&lt;pre class="brush: cpp gutter: false"&gt;   RIORESULT results[RIO_MAX_RESULTS];

   ULONG numResults = 0;

   do
   {
      numResults = g_rio.RIODequeueCompletion(
         g_queue,
         results,
         RIO_MAX_RESULTS);

      if (0 == numResults)
      {
         YieldProcessor();
      }
      else if (RIO_CORRUPT_CQ == numResults)
      {
         ErrorExit("RIODequeueCompletion");
      }
   }
   while (0 == numResults);&lt;/pre&gt;
You then loop over the results array and process each result in turn before looping back to dequeue more completions. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Getting to the point where you can call &lt;code&gt;RIODequeueCompletion()&lt;/code&gt; takes a bit of setting up though...
&lt;/div&gt;
        &lt;h2 class="entry-body"&gt;Creating a RIO completion queue&lt;/h2&gt;
&lt;div&gt;
The examples are each stand alone but can share two common header files. The first, &lt;code&gt;Constants.h&lt;/code&gt;, contains all constants that are used to tune the examples. The second, &lt;code&gt;Shared.h&lt;/code&gt;, contains inline helper functions which hide some of the complexity and allow the individual example programs to focus on the area of the API that they're demonstrating. We use several of these helper functions as we prepare to create our RIO completion queue.
&lt;pre class="brush: cpp gutter: false"&gt;int _tmain(int argc, _TCHAR* argv[])
{
   SetupTiming("RIO polled UDP");

   InitialiseWinsock();

   CreateRIOSocket();

   g_queue = g_rio.RIOCreateCompletionQueue(RIO_PENDING_RECVS, 0);

   if (g_queue == RIO_INVALID_CQ)
   {
      ErrorExit("RIOCreateCompletionQueue");
   }&lt;/pre&gt;
So that we can compare the performance of these examples we first call &lt;code&gt;SetupTiming()&lt;/code&gt;, which prepares us for calling &lt;code&gt;StartTiming()&lt;/code&gt; and &lt;code&gt;StopTiming()&lt;/code&gt; later in the program. &lt;code&gt;SetupTiming()&lt;/code&gt; locks this thread to a single CPU, using &lt;code&gt;SetThreadAffinityMask()&lt;/code&gt; as &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms644904(v=vs.85).aspx" target="_blank"&gt;recommended by the help for &lt;code&gt;QueryPerformanceCounter()&lt;/code&gt;&lt;/a&gt;. Once this is done we call &lt;code&gt;QueryPerformanceFrequency()&lt;/code&gt; and store the resulting value for use by &lt;code&gt;StopTiming()&lt;/code&gt;. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
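&lt;div&gt;
The timing helpers themselves aren't shown in this article, so here is a minimal sketch of the approach just described. The names &lt;code&gt;SketchSetupTiming()&lt;/code&gt;, &lt;code&gt;SketchStartTiming()&lt;/code&gt;, &lt;code&gt;SketchStopTiming()&lt;/code&gt;, &lt;code&gt;s_frequency&lt;/code&gt; and &lt;code&gt;s_startCounter&lt;/code&gt; are all hypothetical, the choice of the first CPU is an assumption, and the real helpers in &lt;code&gt;Shared.h&lt;/code&gt; may be structured differently.
&lt;pre class="brush: cpp gutter: false"&gt;#include &amp;lt;windows.h&amp;gt;

// Hypothetical names; the real helpers live in Shared.h.
static LARGE_INTEGER s_frequency;
static LARGE_INTEGER s_startCounter;

inline void SketchSetupTiming()
{
   // Pin the calling thread to the first CPU so that every
   // counter reading is taken on the same processor.
   if (0 == ::SetThreadAffinityMask(::GetCurrentThread(), 1))
   {
      ErrorExit("SetThreadAffinityMask");
   }

   // Ticks per second; needed to turn a tick count into a time.
   if (!::QueryPerformanceFrequency(&amp;amp;s_frequency))
   {
      ErrorExit("QueryPerformanceFrequency");
   }
}

inline void SketchStartTiming()
{
   if (!::QueryPerformanceCounter(&amp;amp;s_startCounter))
   {
      ErrorExit("QueryPerformanceCounter");
   }
}

// Returns the elapsed time in seconds since SketchStartTiming().
inline double SketchStopTiming()
{
   LARGE_INTEGER stopCounter;

   if (!::QueryPerformanceCounter(&amp;amp;stopCounter))
   {
      ErrorExit("QueryPerformanceCounter");
   }

   const LONGLONG elapsedTicks = stopCounter.QuadPart - s_startCounter.QuadPart;

   return static_cast&amp;lt;double&amp;gt;(elapsedTicks) / static_cast&amp;lt;double&amp;gt;(s_frequency.QuadPart);
}&lt;/pre&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;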
&lt;div&gt;
With that done we initialise Winsock and then create our RIO socket. Note that these examples use quite a few global variables for convenience. This isn't how I would suggest you write production code, but it's convenient for these examples as it makes much of the code simpler and allows us to focus on the RIO API. For example, &lt;code&gt;CreateRIOSocket()&lt;/code&gt; does the following: it creates a socket and assigns it to &lt;code&gt;g_s&lt;/code&gt;, binds the socket to &lt;code&gt;PORT&lt;/code&gt;, which is a constant that is defined in &lt;code&gt;Constants.h&lt;/code&gt;, and then initialises the RIO API function table so that we can use it through &lt;code&gt;g_rio&lt;/code&gt;.
&lt;pre class="brush: cpp gutter: false"&gt;inline void CreateRIOSocket()
{
   g_s = CreateSocket(WSA_FLAG_REGISTERED_IO);

   Bind(g_s, PORT);

   InitialiseRIO(g_s);
}

inline SOCKET CreateSocket(
   const DWORD flags = 0)
{
   g_s = ::WSASocket(
      AF_INET,
      SOCK_DGRAM,
      IPPROTO_UDP,
      NULL,
      0,
      flags);

   if (g_s == INVALID_SOCKET)
   {
      ErrorExit("WSASocket");
   }

   return g_s;
}

inline void InitialiseRIO(
   SOCKET s)
{
   GUID functionTableId = WSAID_MULTIPLE_RIO;

   DWORD dwBytes = 0;

   bool ok = true;

   if (0 != WSAIoctl(
      s,
      SIO_GET_MULTIPLE_EXTENSION_FUNCTION_POINTER,
      &amp;amp;functionTableId,
      sizeof(GUID),
      (void**)&amp;amp;g_rio,
      sizeof(g_rio),
      &amp;amp;dwBytes,
      0,
      0))
   {
      ErrorExit("WSAIoctl");
   }
}&lt;/pre&gt;
As you can see, we check all API calls for failure and report errors via our &lt;code&gt;ErrorExit()&lt;/code&gt; functions. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
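&lt;div&gt;
&lt;code&gt;ErrorExit()&lt;/code&gt; isn't shown here, so the following is only a guess at the general shape of such a helper: report which call failed, along with the last error code, and then exit. &lt;code&gt;SketchErrorExit()&lt;/code&gt; is a hypothetical name and the version in &lt;code&gt;Shared.h&lt;/code&gt; may differ.
&lt;pre class="brush: cpp gutter: false"&gt;#include &amp;lt;windows.h&amp;gt;
#include &amp;lt;cstdlib&amp;gt;
#include &amp;lt;iostream&amp;gt;

// Hypothetical sketch; the ErrorExit() in Shared.h may differ.
inline void SketchErrorExit(
   const char *pFunction)
{
   // Capture the error code first, before any other call can change it.
   const DWORD lastError = ::GetLastError();

   std::cout &amp;lt;&amp;lt; "Error: " &amp;lt;&amp;lt; pFunction &amp;lt;&amp;lt; " failed: " &amp;lt;&amp;lt; lastError &amp;lt;&amp;lt; std::endl;

   exit(1);
}&lt;/pre&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;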
&lt;div&gt;
Finally we can create the RIO completion queue. We use the tunable constant &lt;code&gt;RIO_PENDING_RECVS&lt;/code&gt; to specify how large the queue should be.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Creating a RIO request queue&lt;/h2&gt;
&lt;div&gt;
Setting up a RIO request queue seems to be one of the few places where RIO API changes between the Windows 8 Developer Preview and the Windows 8 Server Beta are visible to us. With the Developer Preview we could pass any value for the &lt;code&gt;maxReceiveDataBuffers&lt;/code&gt; and &lt;code&gt;maxSendDataBuffers&lt;/code&gt; whereas in the beta &lt;code&gt;RIOCreateRequestQueue()&lt;/code&gt; only accepts a value of 1. See &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;my earlier posting on RIO for more details&lt;/a&gt;, but the documentation and the API are now in sync and the API currently doesn't support scatter/gather I/O.   
&lt;pre class="brush: cpp gutter: false"&gt;   ULONG maxOutstandingReceive = RIO_PENDING_RECVS;
   ULONG maxReceiveDataBuffers = 1;
   ULONG maxOutstandingSend = 0;
   ULONG maxSendDataBuffers = 1;

   void *pContext = 0;

   g_requestQueue = g_rio.RIOCreateRequestQueue(
      g_s,
      maxOutstandingReceive,
      maxReceiveDataBuffers,
      maxOutstandingSend,
      maxSendDataBuffers,
      g_queue,
      g_queue,
      pContext);

   if (g_requestQueue == RIO_INVALID_RQ)
   {
      ErrorExit("RIOCreateRequestQueue");
   }

   PostRIORecvs(RECV_BUFFER_SIZE, RIO_PENDING_RECVS);&lt;/pre&gt;
Once the request queue has been created we can post some read requests. Note that we specify the size of the buffers to use and the number of receives that we want to have pending; both of these values can be changed easily in &lt;code&gt;Constants.h&lt;/code&gt;.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Registering buffers and posting RIO read requests&lt;/h2&gt;
&lt;div&gt;
Before we can issue some read requests we need to register some I/O buffers. The &lt;code&gt;PostRIORecvs()&lt;/code&gt; function is complicated by the fact that we're trying to keep things simple (!) and by the fact that the example is tunable for different buffer sizes and the number of receives that we may want pending. We loop allocating and registering buffers and then slicing the buffer into buffer slices. We use an extended &lt;code&gt;RIO_BUF&lt;/code&gt; structure so that we can pass an "operation" code with each buffer slice. These examples don't use this operation, but a real server might need to pass additional information with each I/O request, especially if it's using a single completion queue for reads and writes. We deliberately leak our &lt;code&gt;EXTENDED_RIO_BUF&lt;/code&gt; structures but this isn't much of a problem in this example as they're in use from this point until the program exits.
&lt;pre class="brush: cpp gutter: false"&gt;inline void PostRIORecvs(
   const DWORD recvBufferSize,
   const DWORD pendingRecvs)
{
   DWORD totalBuffersAllocated = 0;

   while (totalBuffersAllocated &amp;lt; pendingRecvs)
   {
      DWORD bufferSize = 0;
   
      DWORD receiveBuffersAllocated = 0;

      char *pBuffer = AllocateBufferSpace(
         recvBufferSize,
         pendingRecvs,
         bufferSize,
         receiveBuffersAllocated);

      totalBuffersAllocated += receiveBuffersAllocated;

      RIO_BUFFERID id = g_rio.RIORegisterBuffer(
         pBuffer,
         static_cast&amp;lt;DWORD&amp;gt;(bufferSize));

      if (id == RIO_INVALID_BUFFERID)
      {
         ErrorExit("RIORegisterBuffer");
      }

      DWORD offset = 0;

      const DWORD recvFlags = 0;

      EXTENDED_RIO_BUF *pBufs = new EXTENDED_RIO_BUF[receiveBuffersAllocated];

      for (DWORD i = 0; i &amp;lt; receiveBuffersAllocated; ++i)
      {
         // now split into buffer slices and post our recvs

         EXTENDED_RIO_BUF *pBuffer = pBufs + i;

         pBuffer-&amp;gt;operation = 0;
         pBuffer-&amp;gt;BufferId = id;
         pBuffer-&amp;gt;Offset = offset;
         pBuffer-&amp;gt;Length = recvBufferSize;

         offset += recvBufferSize;

         if (!g_rio.RIOReceive(g_requestQueue, pBuffer, 1, recvFlags, pBuffer))
         {
            ErrorExit("RIOReceive");
         }
      }

      if (totalBuffersAllocated != pendingRecvs)
      {
         cout &amp;lt;&amp;lt; pendingRecvs &amp;lt;&amp;lt; " receives pending" &amp;lt;&amp;lt; endl;
      }
   }

   cout &amp;lt;&amp;lt; totalBuffersAllocated &amp;lt;&amp;lt; " total receives pending" &amp;lt;&amp;lt; endl;
}&lt;/pre&gt;
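The extended &lt;code&gt;RIO_BUF&lt;/code&gt; structure isn't shown above. Judging purely from how it's used (it has an &lt;code&gt;operation&lt;/code&gt; member and can be passed wherever a &lt;code&gt;RIO_BUF&lt;/code&gt; pointer is expected), a definition along these lines would fit. The type of &lt;code&gt;operation&lt;/code&gt; is an assumption, so check &lt;code&gt;Shared.h&lt;/code&gt; for the real definition.
&lt;pre class="brush: cpp gutter: false"&gt;// Assumed definition, inferred from usage in PostRIORecvs().
struct EXTENDED_RIO_BUF : public RIO_BUF
{
   // Spare per-request value; a real server could use it to tell
   // reads from writes when they share one completion queue.
   DWORD operation;
};&lt;/pre&gt;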
Before we can register our I/O buffer we need to allocate the memory that we will be using. As I mentioned back in October when I was first looking at the RIO API, it's important to allocate your I/O buffer memory in a particular way so that you &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-buffer-strategies.html"&gt;use memory efficiently for RIO's registered I/O buffers&lt;/a&gt;.   
&lt;pre class="brush: cpp gutter: false"&gt;inline char *AllocateBufferSpace(
   const DWORD recvBufferSize,
   const DWORD pendingRecvs,
   DWORD &amp;amp;bufferSize,
   DWORD &amp;amp;receiveBuffersAllocated)
{
   const DWORD preferredNumaNode = 0;

   const SIZE_T largePageMinimum = USE_LARGE_PAGES ? ::GetLargePageMinimum() : 0;

   SYSTEM_INFO systemInfo;

   ::GetSystemInfo(&amp;amp;systemInfo);

   const unsigned __int64 granularity = (largePageMinimum == 0 ? systemInfo.dwAllocationGranularity : largePageMinimum);

   const unsigned __int64 desiredSize = static_cast&amp;lt;unsigned __int64&amp;gt;(recvBufferSize) * pendingRecvs;

   unsigned __int64 actualSize = RoundUp(desiredSize, granularity);

   if (actualSize &amp;gt; std::numeric_limits&amp;lt;DWORD&amp;gt;::max())
   {
      actualSize = (std::numeric_limits&amp;lt;DWORD&amp;gt;::max() / granularity) * granularity;
   }

   receiveBuffersAllocated = std::min&amp;lt;DWORD&amp;gt;(pendingRecvs, static_cast&amp;lt;DWORD&amp;gt;(actualSize / recvBufferSize));

   bufferSize = static_cast&amp;lt;DWORD&amp;gt;(actualSize);

   char *pBuffer = reinterpret_cast&amp;lt;char *&amp;gt;(VirtualAllocExNuma(
      GetCurrentProcess(),
      0,
      bufferSize,
      MEM_COMMIT |
      MEM_RESERVE  |
      (largePageMinimum != 0 ? MEM_LARGE_PAGES : 0),
      PAGE_READWRITE,
      preferredNumaNode));

   if (pBuffer == 0)
   {
      ErrorExit("VirtualAllocExNuma");
   }

   return pBuffer;
}&lt;/pre&gt;
Our allocation function is again slightly more complex than it need be, but that complexity allows us to explore various options by simply changing our configuration constants; you can ignore things like the &lt;code&gt;USE_LARGE_PAGES&lt;/code&gt; flag and the fact that we're allocating to a preferred NUMA node unless you're interested in the details and your hardware supports these features. The important thing is that we allocate in terms of the system's allocation granularity and that we use a variant of &lt;code&gt;VirtualAlloc()&lt;/code&gt; to do so. Once again, in the name of simplicity, we leak these buffers (which will be in use for the life of the program) and allow program exit to clean them up.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
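&lt;div&gt;
The &lt;code&gt;RoundUp()&lt;/code&gt; helper used by &lt;code&gt;AllocateBufferSpace()&lt;/code&gt; isn't shown above. A minimal sketch of what such a function could look like follows; &lt;code&gt;SketchRoundUp()&lt;/code&gt; is a hypothetical name and the real helper in &lt;code&gt;Shared.h&lt;/code&gt; may be written differently.
&lt;pre class="brush: cpp gutter: false"&gt;// Hypothetical sketch of a RoundUp() style helper.
// Rounds value up to the next multiple of 'multiple',
// which must not be zero.
inline unsigned __int64 SketchRoundUp(
   const unsigned __int64 value,
   const unsigned __int64 multiple)
{
   const unsigned __int64 remainder = value % multiple;

   // Already a whole multiple, nothing to add.
   if (remainder == 0)
   {
      return value;
   }

   return value + (multiple - remainder);
}&lt;/pre&gt;
For example, if the granularity were 65536 bytes then a desired size of 100000 bytes would be rounded up to 131072 bytes, which is two granularity units.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;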
&lt;h2 class="entry-body"&gt;Calling RIODequeueCompletion() and processing results&lt;/h2&gt;
&lt;div&gt;
We finally have queues created, buffers registered and reads pending. Processing these reads in our simple polled RIO server is fairly straightforward. First we enter a polling loop for completions and spin until completions are available. Once we have at least one completion we call &lt;code&gt;StartTiming()&lt;/code&gt; to start our performance timing. We then process the completion results. Our performance tests are simple: we send a number of datagrams of &lt;code&gt;EXPECTED_DATA_SIZE&lt;/code&gt; and then indicate that the test is complete by sending a series of datagrams of a different size. Once our servers receive a datagram of an unexpected size they consider the test to be complete and shut down. Thus our main completion loop is the &lt;code&gt;do/while&lt;/code&gt; loop below. We process datagrams, issue new reads and then dequeue more results. Once we're done we stop our timer and display details about the time taken and the number of datagrams that we processed.
&lt;pre class="brush: cpp gutter: false"&gt;   bool done = false;

   DWORD recvFlags = 0;

   RIORESULT results[RIO_MAX_RESULTS];

   ULONG numResults = 0;

   do
   {
      numResults = g_rio.RIODequeueCompletion(
         g_queue,
         results,
         RIO_MAX_RESULTS);

      if (0 == numResults)
      {
         YieldProcessor();
      }
      else if (RIO_CORRUPT_CQ == numResults)
      {
         ErrorExit("RIODequeueCompletion");
      }
   }
   while (0 == numResults);

   StartTiming();

   int workValue = 0;

   bool running = true;

   do
   {
      for (DWORD i = 0; i &amp;lt; numResults; ++i)
      {
         EXTENDED_RIO_BUF *pBuffer = reinterpret_cast&amp;lt;EXTENDED_RIO_BUF *&amp;gt;(results[i].RequestContext);

         if (results[i].BytesTransferred == EXPECTED_DATA_SIZE)
         {
            g_packets++;

            workValue += DoWork(g_workIterations);

            if (!g_rio.RIOReceive(
               g_requestQueue,
               pBuffer,
               1,
               recvFlags,
               pBuffer))
            {
               ErrorExit("RIOReceive");
            }

            done = false;
         }
         else
         {
            done = true;
         }
      }

      if (!done)
      {
         do
         {
            numResults = g_rio.RIODequeueCompletion(
               g_queue,
               results,
               RIO_MAX_RESULTS);

            if (0 == numResults)
            {
               YieldProcessor();
            }
            else if (RIO_CORRUPT_CQ == numResults)
            {
               ErrorExit("RIODequeueCompletion");
            }
         }
         while (0 == numResults);
      }
   }
   while (!done);

   StopTiming();

   PrintTimings();

   return workValue;
}&lt;/pre&gt;
The &lt;code&gt;DoWork()&lt;/code&gt; function above can be used to add 'processing overhead' to each datagram. This can be configured using &lt;code&gt;g_workIterations&lt;/code&gt;, which is defined in &lt;code&gt;Constants.h&lt;/code&gt;. With this set to 0 there is no overhead and we can compare how quickly each API can receive datagrams. Setting larger values will affect how the various multi-threaded examples perform and can be useful if you're unable to saturate the test machine's network interfaces.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
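&lt;div&gt;
&lt;code&gt;DoWork()&lt;/code&gt; isn't shown in this article. As an illustration only, a function that burns a configurable amount of CPU per datagram could be as simple as the sketch below; &lt;code&gt;SketchDoWork()&lt;/code&gt; is a hypothetical name, the parameter type is an assumption, and the real function in &lt;code&gt;Shared.h&lt;/code&gt; may do something quite different.
&lt;pre class="brush: cpp gutter: false"&gt;// Hypothetical sketch; the DoWork() in Shared.h may differ.
inline int SketchDoWork(
   const DWORD iterations)
{
   // volatile stops the compiler from optimising the loop away.
   volatile int result = 0;

   for (DWORD i = 0; i &amp;lt; iterations; ++i)
   {
      result = result + 1;
   }

   return result;
}&lt;/pre&gt;
Returning the value, and accumulating it in &lt;code&gt;workValue&lt;/code&gt; as the loop above does, gives the compiler one more reason to keep the work.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;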
&lt;div&gt;
The code for this example can be downloaded from &lt;a href="http://www.serverframework.com/zips/RIO-PolledUDP.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-PolledUDP.zip']);"&gt;here&lt;/a&gt;. This code requires Visual Studio 11, but would work with earlier compilers if you have a Windows SDK that supports RIO. Note that &lt;code&gt;Shared.h&lt;/code&gt; and &lt;code&gt;Constants.h&lt;/code&gt; contain helper functions and tuning constants for ALL of the examples and so there will be code in there that is not used by this example. You should be able to unzip each example into the same directory structure so that they all share the same shared headers. This allows you to tune all of the examples the same so that any performance comparisons make sense.   
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;h2 class="entry-body"&gt;Join in&lt;/h2&gt;
Comments and suggestions are more than welcome. I'm learning as I go here and I'm quite likely to have made some mistakes or come up with some erroneous conclusions, so feel free to put me straight and help make these examples better.
&lt;/div&gt;
    &lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/Serverframework?a=KiFFoHzoOBo:bEmAcqyrkpk:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/Serverframework?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/Serverframework?a=KiFFoHzoOBo:bEmAcqyrkpk:ZC7T4KBF6Nw"&gt;&lt;img src="http://feeds.feedburner.com/~ff/Serverframework?d=ZC7T4KBF6Nw" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/Serverframework/~4/KiFFoHzoOBo" height="1" width="1"/&gt;</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O Example UDP Servers - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/OX7vwnASdRo/windows-8-registered-io-example-udp-servers.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1169</id>

    <published>2012-03-07T11:27:00Z</published>
    <updated>2012-03-15T15:28:35Z</updated>

    <summary> I've been looking at the Windows 8 Registered I/O Networking Extensions since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together some simple UDP servers using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;RIO API demonstration&lt;/h2&gt;
&lt;div&gt;
The examples are simple in that they do the bare minimum to demonstrate the APIs in question but they are configurable so that you can tune them to the hardware on which you're running them. You can run them to compare the maximum speed at which you can pull UDP datagrams off of the wire using each API and then adjust the examples so that they do a specific amount of "work" with each datagram to simulate a slightly more realistic scenario.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Simplified error handling&lt;/h2&gt;
&lt;div&gt;
Error handling is limited: we display an error and exit the program. We don't skip error checking, though; all API calls are checked for errors. The examples are each standalone but can share two common header files. The first, &lt;code&gt;Constants.h&lt;/code&gt;, contains all constants that are used to tune the examples. The second, &lt;code&gt;Shared.h&lt;/code&gt;, contains inline helper functions which hide some of the complexity and allow the individual example programs to focus on the area of the API that they're demonstrating.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;This is the index page&lt;/h2&gt;
&lt;div&gt;
I will be blogging about the construction of the various examples over the next few weeks and updating this entry as an index page for all of the examples. I've listed the examples that I'll be talking about and I'll link to each blog post as they go live. Once I've presented the RIO examples I'll present the more traditional examples and finally some performance comparisons.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;RIO server examples&lt;/h2&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;RIO Polled UDP&lt;/a&gt; - A server which uses a single thread and a tight loop to poll for RIO completions.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-event-driven-udp-example-server.html"&gt;RIO Event Driven UDP&lt;/a&gt; - A server which uses a single thread and event driven notifications to handle RIO completions.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-iocp-udp-example-server.html"&gt;RIO IOCP UDP&lt;/a&gt; - A server which uses a single thread and I/O Completion Port notifications to handle RIO completions.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---multi-threaded-rio-iocp-udp-example-server.html"&gt;RIO IOCP MT UDP&lt;/a&gt; - A server which uses a configurable number of threads and I/O Completion Port notifications to handle RIO completions.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Traditional server examples&lt;/h2&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---traditional-polled-udp-example-server.html"&gt;Simple Polled UDP&lt;/a&gt; - A server which uses a single thread and a tight loop to poll &lt;code&gt;WSARecv()&lt;/code&gt; for datagrams.&lt;/li&gt;
&lt;li&gt;IOCP UDP - A server which uses a single thread and I/O Completion Port notifications to handle overlapped &lt;code&gt;WSARecv()&lt;/code&gt; completions.&lt;/li&gt;
&lt;li&gt;IOCP MT UDP - A server which uses a configurable number of threads and I/O Completion Port notifications to handle overlapped &lt;code&gt;WSARecv()&lt;/code&gt; completions.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;A simple UDP datagram traffic generator&lt;/h2&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---generating-load-for-the-performance-tests.html"&gt;Simple UDP traffic generator&lt;/a&gt; - A client which uses a single thread and a tight loop to send datagrams using &lt;code&gt;WSASendTo()&lt;/code&gt;. This easily saturates a 1000BASE-T, 1Gb Ethernet connection.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Test scripts&lt;/h2&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/zips/RIO-TestScripts.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-TestScripts.zip']);"&gt;Test scripts&lt;/a&gt; - These simple scripts create performance counter logs and run the test servers.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Performance Test results&lt;/h2&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-performance.html"&gt;The first tests&lt;/a&gt; - Where we compare the simple polled traditional server with the polled RIO server.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;h2 class="entry-body"&gt;Join in&lt;/h2&gt;
Comments and suggestions are more than welcome. I'm learning as I go here and I'm quite likely to have made some mistakes or come up with some erroneous conclusions, so feel free to put me straight and help make these examples better.
&lt;/div&gt;
 
        
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html</feedburner:origLink></entry>

<entry>
    <title>Latest release of The Server Framework: 6.5.4 - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/I7M5ArjIws0/latest-release-of-the-server-framework-654.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1167</id>

    <published>2012-02-14T08:28:41Z</published>
    <updated>2012-02-15T10:38:13Z</updated>

    <summary> Version 6.5.4 of The Server Framework was released today. This release contains two important bug fixes and a selection of minor improvements. If you run your code on Vista/Windows Server 2003 or later and you don't explicitly disable FILE_SKIP_COMPLETION_PORT_ON_SUCCESS...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Releases" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
Version 6.5.4 of The Server Framework was released today.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
This release contains two important bug fixes and a selection of minor improvements. If you run your code on Windows Vista/Windows Server 2008 or later and you don't explicitly disable &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; in your Config.h then you should install this update.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;This release includes the following changes; see the release notes, &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/sockettoolsreleasenotes.html"&gt;here&lt;/a&gt;, for full details of all changes.&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;Bug fix. If &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; was enabled in Config.h but &lt;code&gt;JetByteTools::Socket::CanEnableSkipCompletionPortOnSuccess()&lt;/code&gt;
returned false then the code that handled issuing read and write calls would fail if &lt;code&gt;ERROR_SUCCESS&lt;/code&gt; was returned.
It would assume that &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; was in effect and that it should
handle the completion directly, but a completion would also have been posted to the IOCP and so the completion would
get handled twice. We now correctly check whether we have actually enabled &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; rather
than just whether we want to enable it.&lt;/li&gt;
&lt;li&gt;Change to &lt;code&gt;JetByteTools::Socket::CConnectionMaintainingStreamSocketConnectionFilter&lt;/code&gt; so that we do not attempt to maintain a connection
if the reconnect delay is 0.&lt;/li&gt;
&lt;li&gt;Added the concept of being able to force a write request to go via the I/O pool even if marshalling is currently turned off.&lt;/li&gt;
&lt;li&gt;Added &lt;code&gt;JetByteTools::Socket::IManageStreamSocketConnectionFilters::TryRequestWrite()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Changed how &lt;code&gt;JetByteTools::Socket::CFlowControlStreamSocketConnectionFilter&lt;/code&gt; issues write requests and how it
deals with write failure due to socket closure. We now purge any queued data when we detect the socket has
been closed, rather than continuing to try and send more.&lt;/li&gt;
&lt;li&gt;Added &lt;code&gt;JetByteTools::Socket::IDatagramSendSocket&lt;/code&gt; which is a common base class for &lt;code&gt;JetByteTools::Socket::IDatagramSocket&lt;/code&gt;
and &lt;code&gt;JetByteTools::Socket::IDatagramServerSocket&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Added &lt;code&gt;JetByteTools::Socket::IFilterableStreamSocket::CanIssueFilteredWrite()&lt;/code&gt; which is now called instead of
&lt;code&gt;JetByteTools::Socket::IStreamSocketEx::CanWrite()&lt;/code&gt; by &lt;code&gt;JetByteTools::Socket::CFilteringStreamSocketConnectionManagerBase::TryRequestWrite()&lt;/code&gt;.
This removes a race condition during the shutdown of the write side of a socket in situations where filtering is being used and the
filter wishes to write to the socket after the application level code has requested that the write side of the socket be shut down. We always
tracked the outstanding write count before actually issuing the shutdown and the filter could manage this count so that it could
send after a shutdown had been requested, BUT the filtered send could still fail as the socket's write shutdown flag would be set. This
new function does not check the write shutdown flag and so allows the filter to write successfully.
&lt;/li&gt;&lt;li&gt;Changed &lt;code&gt;JetByteTools::IO::IAllocateBufferHandles::Flush()&lt;/code&gt; so that it returns a bool indicating
if buffers were active when the flush was done. This brings it in line with &lt;code&gt;JetByteTools::IO::IAllocateBuffers&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Changed the &lt;code&gt;JetByteTools::IO::IAsyncIOStream::Write()&lt;/code&gt; methods so that they take an optional bool
that enables you to force the write to go via the I/O pool even if I/O marshalling is turned off.
&lt;/li&gt;&lt;li&gt;Changed &lt;code&gt;JetByteTools::IO::CAsyncFileLog&lt;/code&gt; to monitor its own write thread to remove the chance that
it might hang during destruction if the thread has terminated due to an exception.&lt;/li&gt;
&lt;/ul&gt;

&lt;/div&gt;
        
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/02/latest-release-of-the-server-framework-654.html</feedburner:origLink></entry>

<entry>
    <title>The advantage of having lots of clients and clients with lots of clients - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/PT2qt0IHqRQ/the-advantage-of-having-lots-of-clients-and-clients-with-lots-of-clients.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1165</id>

    <published>2012-02-08T12:11:02Z</published>
    <updated>2012-02-08T12:38:42Z</updated>

    <summary> Our Secretive Online Game Company client uses The Server Framework for their custom application server for the games industry. They have thousands of users who run their server on a very diverse set of hardware. This is great for...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Bug fixes" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
Our &lt;a href="http://www.serverframework.com/clients/online-game-company.html" target="_blank"&gt;Secretive Online Game Company&lt;/a&gt; client uses The Server Framework for their custom application server for the games industry. They have thousands of users who run their server on a very diverse set of hardware. This is great for us as it really helps to shake down The Server Framework. There's nothing like running your multi-threaded code on lots of different hardware to help find all of the hidden race conditions and the like. I'm pleased that we have so few bug reports coming in from our clients, especially knowing that our Online Game Company client has the latest code out in the field or at least in use internally on their cloud system. Unfortunately one of their clients has recently exposed a latent bug in a rarely used corner of The Server Framework.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
One of the features of The Server Framework is that we track new features in various Windows operating systems so that you can take advantage of them simply by upgrading. Often you only need to adjust your &lt;code&gt;Config.h&lt;/code&gt; file to enable powerful new features. One of these features is &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt;; you can &lt;a href="http://www.lenholgate.com/cgi-bin/mt/mt-search.cgi?search=FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&amp;amp;IncludeBlogs=11%2C12&amp;amp;limit=20" target="_blank"&gt;read more about it over on Len's blog&lt;/a&gt;. This allows some optimisation in thread scheduling and context switching and generally improves performance on busy servers.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Back in April 2011 we updated our &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; support to include protection from incompatible networking providers, see &lt;a href="http://support.microsoft.com/kb/2568167" target="_blank"&gt;this Microsoft Knowledge Base article for details of the potential problem&lt;/a&gt;. In addition to the compile time support for &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; which can be used to turn the feature on and off in The Server Framework, we added a run time check to ensure that the machine on which the code was running did not have any incompatible networking providers installed.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Unfortunately this code wasn't tested as well as it could have been and it contained a bug which causes problems on systems that have an incompatible networking provider installed and &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; support turned on. The completions for some operations are processed twice, which then causes reference counting (over-release) problems with the corresponding socket and buffer structures.
&lt;/div&gt;
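&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To make the failure mode concrete, here's a minimal, portable sketch of the decision involved. This is not code from The Server Framework; &lt;code&gt;SkipOnSuccessState&lt;/code&gt; and the functions below are hypothetical names invented for illustration, and the real code deals with sockets, buffers and reference counts rather than a simple counter. It assumes that when the mode isn't actually in effect a completion packet is queued to the IOCP even for operations that succeed immediately.
&lt;/div&gt;
<![CDATA[
```cpp
// Hypothetical state for one socket; not part of The Server Framework.
struct SkipOnSuccessState
{
   bool requested;   // compile time switch: we would like to use the feature
   bool supported;   // run time check: no incompatible provider was found
};

// The feature is only really on if we asked for it AND the run time check passed.
inline bool IsActuallyEnabled(const SkipOnSuccessState &state)
{
   return state.requested && state.supported;
}

// Called when an overlapped read or write reports success immediately.
// Returns true if the caller must process the completion itself, false if
// a completion packet will arrive via the IOCP and must be handled there.
inline bool ShouldHandleCompletionInline(const SkipOnSuccessState &state)
{
   return IsActuallyEnabled(state);
}

// The buggy variant: it looks only at what was requested.
inline bool BuggyShouldHandleCompletionInline(const SkipOnSuccessState &state)
{
   return state.requested;
}

// Counts how many times one immediately successful operation gets processed.
inline int TimesCompletionHandled(
   const SkipOnSuccessState &state,
   bool (*decide)(const SkipOnSuccessState &))
{
   int handled = 0;

   if (decide(state))
   {
      ++handled;   // processed inline by the code that issued the call
   }

   if (!IsActuallyEnabled(state))
   {
      ++handled;   // a packet was still queued, so the IOCP thread processes it
   }

   return handled;
}
```
]]>
&lt;div&gt;
With &lt;code&gt;requested&lt;/code&gt; set and &lt;code&gt;supported&lt;/code&gt; clear, the buggy decision gives a count of two, which corresponds to the double processing and over-release described above; the corrected decision gives one.
&lt;/div&gt;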
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
We've fixed this issue and the fix will be included in the 6.5.4 release, which is currently in test. If you think you're suffering from the problems caused by this and need the fix immediately then please get in touch.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The offending networking provider in this case was the "AVSDA" provider which, a quick web search suggests, is part of Avira Anti virus.
&lt;/div&gt;
        
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/02/the-advantage-of-having-lots-of-clients-and-clients-with-lots-of-clients.html</feedburner:origLink></entry>

<entry>
    <title>WASP download of XP versions now fixed  - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/TVFMDKbXBfg/wasp-download-of-xp-versions-now-fixed.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1164</id>

    <published>2012-02-02T14:58:40Z</published>
    <updated>2012-02-02T15:04:19Z</updated>

    <summary> I've just noticed a problem with downloading the XP versions of WASP. This is now fixed. The XP versions can now be downloaded correctly again from here. Sorry for any inconvenience caused....</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="WASP" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
I've just noticed a problem with downloading the XP versions of WASP.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
This is now fixed. The XP versions can now be downloaded correctly again from &lt;a href="http://www.serverframework.com/products---download-wasp.html"&gt;here&lt;/a&gt;. Sorry for any inconvenience caused.
&lt;/div&gt;
        
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/02/wasp-download-of-xp-versions-now-fixed.html</feedburner:origLink></entry>

<entry>
    <title>Latest release of The Server Framework: 6.5.3 - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/TLzTWfdj-2U/latest-release-of-the-server-framework-653.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1162</id>

    <published>2011-12-14T08:47:43Z</published>
    <updated>2011-12-14T09:58:41Z</updated>

    <summary> Version 6.5.3 of The Server Framework was released today. This release updates the WebSockets Option pack to the final version of the protocol as detailed in RFC 6455 which was released yesterday. There is also a bug fix to...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Releases" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
Version 6.5.3 of The Server Framework was released today.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
This release updates the &lt;a href="http://www.serverframework.com/products---the-websockets-option.html"&gt;WebSockets Option pack&lt;/a&gt; to the final version of the protocol as detailed in &lt;a href="http://tools.ietf.org/html/rfc6455" target="_blank"&gt;RFC 6455&lt;/a&gt; which was released yesterday. There is also a bug fix to WebSocket status reason processing. If you have 6.5, 6.5.1 or 6.5.2 and you are NOT using WebSockets then you probably don't need this release.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;This release includes the following changes; see the release notes, &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/sockettoolsreleasenotes.html"&gt;here&lt;/a&gt;, for full details of all changes.&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;Updated to support &lt;a href="http://tools.ietf.org/html/rfc6455" target="_blank"&gt;RFC 6455&lt;/a&gt; - added close status codes 1011 and 1015.&lt;/li&gt;
&lt;li&gt;Fixed a bug in the handling of long status result messages; we now truncate them correctly.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
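&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
As an illustration of what these two changes involve, here's a small sketch of building the body of a close frame. It isn't the WebSockets Option pack's code and the function and constant names are hypothetical. The sizes come from RFC 6455: a control frame payload can be at most 125 bytes and a close frame's body starts with a 2 byte status code in network byte order, which leaves 123 bytes for the UTF-8 reason text. The sketch assumes that the reason is valid UTF-8 and truncates it on a character boundary.
&lt;/div&gt;
<![CDATA[
```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Close status codes named in the release note (values from RFC 6455).
const std::uint16_t CloseInternalError = 1011;
const std::uint16_t CloseTlsHandshakeFailed = 1015;

// A control frame payload is at most 125 bytes and the status code uses 2.
const std::size_t MaxCloseReasonBytes = 123;

// Hypothetical helper: shortens a UTF-8 reason so that it fits, without
// cutting a multi-byte character in half.
inline std::string TruncateCloseReason(const std::string &reason)
{
   if (reason.size() <= MaxCloseReasonBytes)
   {
      return reason;
   }

   std::size_t length = MaxCloseReasonBytes;

   // Bytes of the form 10xxxxxx are continuation bytes. If the first byte
   // that we would drop is one of these then a character straddles the
   // limit, so step back to its lead byte and drop the whole character.
   while (length > 0 &&
      (static_cast<unsigned char>(reason[length]) & 0xC0) == 0x80)
   {
      --length;
   }

   return reason.substr(0, length);
}

// Hypothetical helper: status code in network byte order, then the reason.
inline std::string BuildClosePayload(std::uint16_t status, const std::string &reason)
{
   std::string payload;

   payload.push_back(static_cast<char>((status >> 8) & 0xFF));
   payload.push_back(static_cast<char>(status & 0xFF));
   payload += TruncateCloseReason(reason);

   return payload;
}
```
]]>
&lt;div&gt;
Note that 1015 is one of the codes that RFC 6455 reserves for reporting a failure to the application; an endpoint must not send it in a close frame.
&lt;/div&gt;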

        
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/12/latest-release-of-the-server-framework-653.html</feedburner:origLink></entry>

<entry>
    <title>New client profile: Smart Moves Software Systems - Online gaming - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/Ru4Of8hZZrc/new-client-profile-smart-moves-software-systems-ltd---online-gaming.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1159</id>

    <published>2011-11-23T08:38:32Z</published>
    <updated>2011-11-23T08:45:03Z</updated>

    <summary><![CDATA[We have a new client profile available here for a new client who selected The&nbsp;Server&nbsp;Framework to help it expand its online gaming platform to incorporate a WebSockets interface....]]></summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="General" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        We have a new client profile available &lt;a href="http://www.serverframework.com/clients/smart-moves-software-systems-ltd.html"&gt;here&lt;/a&gt; for a new client who selected The&amp;nbsp;Server&amp;nbsp;Framework to help it expand its online gaming platform to incorporate &lt;a href="http://www.serverframework.com/products---the-websockets-option.html"&gt;a WebSockets interface&lt;/a&gt;.
        
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/11/new-client-profile-smart-moves-software-systems-ltd---online-gaming.html</feedburner:origLink></entry>

<entry>
    <title>A new release of WASP, now with SSL/TLS support - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/Serverframework/~3/1aiTb3RRxlE/a-new-release-of-wasp-now-with-ssltls-support.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1157</id>

    <published>2011-11-22T10:10:47Z</published>
    <updated>2011-11-22T10:48:41Z</updated>

    <summary> We've just released a new version of WASP, our pluggable application server platform. This release is built with release 6.5.2 of The Server Framework and includes support for secure TCP connections using SSL/TLS via our SChannel Option pack. Setting...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="WASP" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="WASP Tutorial" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
We've just released a new version of &lt;a href="http://www.serverframework.com/products---wasp.html"&gt;WASP, our pluggable application server platform&lt;/a&gt;. This release is built with release 6.5.2 of The Server Framework and includes support for secure TCP connections using SSL/TLS via our &lt;a href="http://www.serverframework.com/products---the-ssltls-using-schannel-option.html"&gt;SChannel Option pack&lt;/a&gt;.
&lt;/div&gt;
&lt;br /&gt;
&lt;div&gt;
Setting up a secure TCP endpoint with WASP is easy: simply add the &lt;b&gt;Secure&lt;/b&gt; configuration option to the &lt;b&gt;&amp;lt;EndPoint&amp;gt;&lt;/b&gt; node like this:
&lt;/div&gt;
&lt;pre class="brush: xml gutter: false"&gt;&amp;lt;?xml version="1.0" encoding="Windows-1252"?&amp;gt;
&amp;lt;Configuration&amp;gt;
  &amp;lt;WASP&amp;gt;
    &amp;lt;TCP&amp;gt;
      &amp;lt;Endpoints&amp;gt;
        &amp;lt;EndPoint
          Name="Echo Server"
          Port="5050"
          HandlerDLL="[CONFIG]\EchoServer.dll"
          Secure="true"&amp;gt;
        &amp;lt;/EndPoint&amp;gt;
      &amp;lt;/Endpoints&amp;gt;
    &amp;lt;/TCP&amp;gt;
  &amp;lt;/WASP&amp;gt;
&amp;lt;/Configuration&amp;gt;
&lt;/pre&gt;
&lt;div&gt;
This tells WASP to secure the endpoint using a default certificate called "Wasp" that is located in the "MY" certificate store. You can add a self signed test certificate using the standard Microsoft "make cert" utility, makecert.exe. A simple script which creates and installs the correct type of certificate can be downloaded from &lt;a href="http://www.serverframework.com/WASP/Examples/MakeCert.zip" onclick="javascript: pageTracker._trackPageview('/downloads/WASP-MakeCert'); "&gt;here&lt;/a&gt;.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
If you do not want to use a certificate called "Wasp" in the "MY" certificate store then you can configure the certificate used by adding the &lt;b&gt;StoreName&lt;/b&gt;, &lt;b&gt;CertificateName&lt;/b&gt; and &lt;b&gt;UseMachineStore&lt;/b&gt; config values.
&lt;/div&gt;
&lt;pre class="brush: xml gutter: false"&gt;&amp;lt;?xml version="1.0" encoding="Windows-1252"?&amp;gt;
&amp;lt;Configuration&amp;gt;
  &amp;lt;WASP&amp;gt;
    &amp;lt;TCP&amp;gt;
      &amp;lt;Endpoints&amp;gt;
        &amp;lt;EndPoint
          Name="Echo Server"
          Port="5050"
          HandlerDLL="[CONFIG]\EchoServer.dll"
          Secure="true"
          StoreName="OurSpecialStore"
          CertificateName="OurCertificate"
          UseMachineStore="true"&amp;gt;
        &amp;lt;/EndPoint&amp;gt;
      &amp;lt;/Endpoints&amp;gt;
    &amp;lt;/TCP&amp;gt;
  &amp;lt;/WASP&amp;gt;
&amp;lt;/Configuration&amp;gt;
&lt;/pre&gt;
&lt;div&gt;Testing your new secure endpoint can be done using either our OpenSSL server test or our SChannel server test. These are example clients that ship with 
&lt;a href="http://www.serverframework.com/products---the-server-framework.html"&gt;The Server Framework&lt;/a&gt; and that allow you to create thousands of concurrent connections and control how they send data to a server. This is an easy way to build a test system for your server as all of the complexity of managing and controlling the connections is done for you and you simply have to adjust the messages that are generated and how the response validation is done. The default message that is built is a network byte order integer length prefixed message and so this program can be used to stress test &lt;a href="http://www.serverframework.com/products---wasp.html"&gt;WASP&lt;/a&gt; with either of the first two example plugins that were discussed in the tutorial.
&lt;/div&gt;
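&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
For anyone adjusting the message generation, here's a sketch of that style of framing. The test program's own source is the reference for the real format; this sketch assumes a 4 byte unsigned length that counts only the payload bytes, and &lt;code&gt;BuildLengthPrefixedMessage()&lt;/code&gt; and &lt;code&gt;TryReadLength()&lt;/code&gt; are hypothetical names rather than functions from The Server Framework.
&lt;/div&gt;
<![CDATA[
```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumption: the prefix is 4 bytes and does not include itself in the count.
const std::size_t LengthPrefixBytes = 4;

// Builds [length][payload] with the length in network byte order.
inline std::vector<std::uint8_t> BuildLengthPrefixedMessage(const std::string &payload)
{
   const std::uint32_t length = static_cast<std::uint32_t>(payload.size());

   std::vector<std::uint8_t> message;
   message.reserve(LengthPrefixBytes + payload.size());

   // Network byte order is big-endian: most significant byte first.
   message.push_back(static_cast<std::uint8_t>((length >> 24) & 0xFF));
   message.push_back(static_cast<std::uint8_t>((length >> 16) & 0xFF));
   message.push_back(static_cast<std::uint8_t>((length >> 8) & 0xFF));
   message.push_back(static_cast<std::uint8_t>(length & 0xFF));

   message.insert(message.end(), payload.begin(), payload.end());

   return message;
}

// Reads the prefix back. Returns false if fewer than 4 bytes have arrived,
// which is the normal case for a partial read on a TCP stream.
inline bool TryReadLength(const std::vector<std::uint8_t> &data, std::uint32_t &length)
{
   if (data.size() < LengthPrefixBytes)
   {
      return false;
   }

   length = (static_cast<std::uint32_t>(data[0]) << 24) |
            (static_cast<std::uint32_t>(data[1]) << 16) |
            (static_cast<std::uint32_t>(data[2]) << 8) |
            static_cast<std::uint32_t>(data[3]);

   return true;
}
```
]]>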
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;You can download the SChannelEchoServerTest program from &lt;a href="http://www.serverframework.com/WASP/Examples/SChannelEchoServerTest.zip" onclick="javascript: pageTracker._trackPageview('/downloads/WASP-SChannelEchoServerTest'); "&gt;here&lt;/a&gt;. See our &lt;a href="http://www.serverframework.com/asynchronousevents/2010/10/stress-testing-wasp-using-the-echoservertest-program.html"&gt;tutorial on testing WASP&lt;/a&gt; for details of how to run this tool.&lt;/div&gt;
        
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/11/a-new-release-of-wasp-now-with-ssltls-support.html</feedburner:origLink></entry>

</feed>

