<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
    <title>www.lenholgate.com - Rambling Comments - Len Holgate's C++ programming blog</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/" />
    
    <id>tag:www.lenholgate.com,2010-12-10://11</id>
    <updated>2012-03-27T06:04:32Z</updated>
    <subtitle>/* Rambling comments... */

Len Holgate's thoughts on this and that...
Mainly test driven software development in C++ on Windows platforms...</subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type Pro 5.12</generator>

<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/LenHolgate" /><feedburner:info uri="lenholgate" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
    <title>Netmap, RIO and the challenges in using a 10 Gigabit pipe - Rambling Comments</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/gVyWtUR_4cs/netmap-rio-and-the-challenges-in-using-a-10-gigabit-pipe.html" />
    <id>tag:www.lenholgate.com,2012:/blog//12.1179</id>

    <published>2012-03-26T20:46:06Z</published>
    <updated>2012-03-27T06:04:32Z</updated>

    <summary> This link, Revisiting Network I/O APIs: The Netmap Framework, via highscalability.com makes for interesting reading. Especially given my current interest in the performance of the Winsock Registered I/O networking extensions, RIO, and the fact that my new network cards...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        &lt;div&gt;
This link, &lt;a href="http://highscalability.com/blog/2012/3/22/paper-revisiting-network-io-apis-the-netmap-framework.html" target="_blank"&gt;Revisiting Network I/O APIs: The Netmap Framework&lt;/a&gt;, via &lt;a href="http://highscalability.com/" target="_blank"&gt;highscalability.com&lt;/a&gt;, makes for interesting reading, especially given my current interest in the performance of &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-performance.html" target="_blank"&gt;the Winsock Registered I/O networking extensions, RIO&lt;/a&gt;, and the fact that &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-performance---10-gigabit-networking.html" target="_blank"&gt;my new network cards&lt;/a&gt; are so difficult to fully utilise.
&lt;/div&gt; 
        
    </content>
<feedburner:origLink>http://www.lenholgate.com/blog/2012/03/netmap-rio-and-the-challenges-in-using-a-10-gigabit-pipe.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O Performance - 10 Gigabit networking... - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/uZuyMbbb_1I/windows-8-registered-io-performance---10-gigabit-networking.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1178</id>

    <published>2012-03-26T15:44:35Z</published>
    <updated>2012-03-26T15:56:42Z</updated>

    <summary> When I switched to looking at the performance of the more advanced RIO server designs that use IOCP it quickly became apparent that even multiple 1 Gigabit connections weren't enough of a challenge to give me any meaningful figures;...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
When I switched to looking at the performance of the more advanced RIO server designs that use IOCP it quickly became apparent that even multiple 1 Gigabit connections weren't enough of a challenge to give me any meaningful figures; my traditional IOCP datagram servers were easily able to keep up and increasing the workload per datagram required such high workloads that the tests became meaningless. So, we've brought forward the purchase of the hardware that we intended to use for our private cloud scalability testing and we now have 2 Intel 10 Gigabit AT2 cards. Switch prices are still prohibitive for lab use and so these two cards are directly connected, point to point.  
&lt;/div&gt; 
&lt;div&gt;&lt;br/&gt;&lt;/div&gt;
&lt;div&gt;
The good news is that we now have a 10 Gigabit network link between two of our test servers. The bad news is that I now have to work out how to use it; the traditional datagram generation program that I was previously using for testing simply doesn't scale to saturate the new link.
&lt;/div&gt; 
 
        
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-performance---10-gigabit-networking.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O Performance - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/ZcB-hAsQW3w/windows-8-registered-io-performance.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1168</id>

    <published>2012-03-15T15:15:00Z</published>
    <updated>2012-03-19T09:53:07Z</updated>

    <summary> I've been looking at the Windows 8 Registered I/O Networking Extensions since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;some simple UDP servers&lt;/a&gt; using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Of course these comparisons should be taken as preliminary since we're working with a beta version of the operating system. However, though I wouldn't put much weight in the exact numbers until we have a non-beta OS to test on, it's useful to see how things are coming along and familiarise ourselves with the designs that might be required to take advantage of RIO once it ships. The main things to take away from these discussions of RIO's performance are the example server designs, the testing methods and a general understanding of why RIO performs better than the traditional Windows networking APIs. With this you can run your own tests, build your own servers and get value from using RIO where appropriate.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Do bear in mind that I'm learning as I go here; RIO is a new API and there is precious little in the way of documentation about how and why to use it. Comments and suggestions are more than welcome; feel free to put me straight if I've made some mistakes, and submit code to help make these examples better.
&lt;/div&gt;
        &lt;h2 class="entry-body"&gt;How to test RIO's performance&lt;/h2&gt;
&lt;div&gt;
The tests consist of sending a large number of datagrams to the server under test. We send two sizes of datagram, the test datagram and the shutdown datagram. The server counts the datagrams that it receives and the time taken. It shuts down as soon as it receives a shutdown datagram. The servers that we are using for these tests &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;are detailed here&lt;/a&gt; and the datagram generator &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---generating-load-for-the-performance-tests.html"&gt;is available here&lt;/a&gt;.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Whilst the numbers that the servers report are useful for getting a rough idea of how the various APIs compare, they're not the whole story. It's useful to look at performance counter logs that are taken whilst the test server is running. The CPU usage of the server under test, and of the entire machine, are useful indicators of how much further we could push a given server. The number of datagrams received, and dropped, by the network and Winsock is useful to see, as is the non-paged pool usage, etc.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To make the testing repeatable I've put together some simple scripts which create the required performance logs using &lt;a href="http://technet.microsoft.com/en-us/library/cc753820(WS.10).aspx" target="_blank"&gt;logman&lt;/a&gt;, the command line interface to &lt;a href="http://technet.microsoft.com/en-us/library/bb490957.aspx" target="_blank"&gt;perfmon&lt;/a&gt;. This means that for each test run we can run a single command which creates and starts a performance counter log, runs the server and then stops the performance counter log. It would be nice to include custom performance counters in each of the example servers so that we can see more of what's going on inside, but whilst easy to do, using our &lt;a href="http://www.serverframework.com/products---the-performance-counters-option.html"&gt;Performance Counters Option pack&lt;/a&gt;, that's beyond the scope of these tests.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The test client, or clients for when we're using two network links into the test machine, are started manually. We could automate this with &lt;a href="http://www.windowsnetworking.com/articles_tutorials/How-Windows-Server-2008-WinRM-WinRS.html" target="_blank"&gt;winrs&lt;/a&gt;, as we've &lt;a href="http://www.lenholgate.com/blog/2010/05/performance-comparisons-for-recent-code-changes.html" target="_blank"&gt;done in the past&lt;/a&gt;, but these tests don't really warrant that level of complexity. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Our test system&lt;/h2&gt;
&lt;div&gt;
Our test system consists of a dual Xeon E5620 @ 2.40GHz; that's 16 CPUs in 2 NUMA nodes with 16GB of memory. The machine has four 1Gb Ethernet network interfaces: a Broadcom BCM571C NetXtreme II GigE with two channels and an Intel 82576 Gigabit dual port adapter. We're using the Intel adapter for all of the tests shown here, sometimes using one NIC and sometimes two. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Windows Server 8 beta Datacentre edition is running directly on the hardware.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The client hardware is less impressive, but both client machines can push their 1Gb network interfaces to around 98% whilst running our datagram generator and that's more than enough for our purposes here.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;The first tests&lt;/h2&gt;
&lt;div&gt;
To get a feel for how the RIO API differs from the traditional APIs the first test will compare a polled RIO server with a traditional, blocking, polled server. &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;The code for the servers is available here&lt;/a&gt; along with some commentary on their designs. You'll need Visual Studio 11 to build the examples.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The test scripts, mentioned above, can be &lt;a href="http://www.serverframework.com/zips/RIO-TestScripts.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-TestScripts.zip']);"&gt;downloaded from here&lt;/a&gt;. Each server has its own script and a text file that details the performance counters to capture during the test run. All of the scripts call a common script which sets up the performance counter log and then starts the server. You shouldn't start the clients until the server is running and has output its configuration details. Once the server receives its first datagram it will display "TimingStarted". When it has received a shutdown datagram it will display "TimingStopped" followed by the number of datagrams that it managed to receive, the time taken and the datagrams per second. You need to copy the x64 release builds of the example servers into the same directory as the test scripts and then be sure to run the batch file and not the exe directly. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
As an initial test we ran the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;traditional UDP server&lt;/a&gt; with one test client. We set the test client to send 10,000,000 datagrams, which takes a little under a minute and a half. Once the test had completed the server reported that it had processed 9,952,510 datagrams in 86,880ms, a rate of roughly 114,000 per second. Running the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;RIO polled server example&lt;/a&gt; with the same network load gave broadly similar results: 9,932,228 datagrams in 86,681ms, again a rate of roughly 114,000 per second.
&lt;/div&gt;
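&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
For anyone who wants to check the arithmetic behind those rates, here's a minimal sketch. &lt;code&gt;DatagramsPerSecond()&lt;/code&gt; is a hypothetical helper written for this illustration; it isn't taken from the example servers, which may well calculate the figure differently.
&lt;/div&gt;

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (an assumption for illustration, not code from the
// example servers): converts a datagram count and an elapsed time in
// milliseconds into a rate in datagrams per second.
double DatagramsPerSecond(
   const std::uint64_t datagrams,
   const std::uint64_t elapsedMs)
{
   // A zero elapsed time would make the rate meaningless.
   assert(elapsedMs != 0);

   return static_cast<double>(datagrams) * 1000.0 / static_cast<double>(elapsedMs);
}
```

&lt;div&gt;
Calling &lt;code&gt;DatagramsPerSecond(9952510, 86880)&lt;/code&gt; and &lt;code&gt;DatagramsPerSecond(9932228, 86681)&lt;/code&gt; gives a little over 114,500 datagrams per second in both cases, which is why the two servers look so similar at this point.
&lt;/div&gt;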
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
At first glance it seems that RIO isn't so impressive; however, we need to remind ourselves of what these example servers are doing: all they're doing is pulling datagrams off of the wire as fast as they can. They're both doing so on a single CPU of a 16 CPU machine and, from these results, it seems that, on this hardware, both APIs can quite easily handle a single saturated 1Gb network link. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Digging deeper into RIO's performance&lt;/h2&gt;
&lt;div&gt;
Whilst the two servers at first appear to behave almost identically under the load it's only when we start looking at the performance counters that we can see that actually the two APIs have completely different performance characteristics.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Here's the graph for the traditional UDP server. Note the thick blue line, that's the amount of time the process spends in kernel mode, on average 37.133% of its time.
&lt;a href="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-SimplePolledUDP_03151103-313.html" onclick="window.open('http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-SimplePolledUDP_03151103-313.html','popup','width=807,height=564,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false"&gt;&lt;img src="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-SimplePolledUDP_03151103-thumb-500x349-313.gif" width="500" height="349" alt="RIO-Perf-SimplePolledUDP_03151103.gif" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /&gt;&lt;/a&gt;
The graph from the RIO server is a little different. The thick blue line is still there, it's just that it's 0 most of the time. The average is 0.167%.
&lt;a href="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-RIOPolledUDP_03151109-316.html" onclick="window.open('http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-RIOPolledUDP_03151109-316.html','popup','width=784,height=541,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false"&gt;&lt;img src="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-RIOPolledUDP_03151109-thumb-500x345-316.gif" width="500" height="345" alt="RIO-Perf-RIOPolledUDP_03151109.gif" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /&gt;&lt;/a&gt;
Another thing worth noting is that the spinning that the RIO server does is obvious from the fact that it uses 100% of a CPU (see the thick red line) and that most of that is spent in user mode code (the dotted green line that runs across the thick red line).
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Another thing that is interesting to see is the non-paged pool usage; the RIO server uses a fixed amount for the life of the process, 8,064 bytes, whereas the traditional server uses 4,192 bytes for most of the time but has some random peaks, the highest of which is 133,656 bytes. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Increasing the network load&lt;/h2&gt;
&lt;div&gt;
Running two clients, each sending 10,000,000 datagrams to a different network card on the test machine, gives us similar figures; the traditional server remains ahead of the RIO server with 19,847,578 datagrams to 19,279,842 datagrams. It seems that with the given hardware both APIs are capable of dealing with two saturated 1Gb links.  
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Increasing the work done per datagram received&lt;/h2&gt;
&lt;div&gt;
The example servers all have a &lt;code&gt;DoWork()&lt;/code&gt; function which allows us to add some "processing" for each datagram that is received. This gives us a slightly more realistic test as, except for &lt;a href="http://en.wikipedia.org/wiki/Discard_Protocol" target="_blank"&gt;discard servers&lt;/a&gt;, most servers need to do some work with each datagram that arrives.
&lt;/div&gt;
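&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To give an idea of what a per-datagram workload could look like, here's a rough sketch. Note that &lt;code&gt;SimulateWork()&lt;/code&gt; is a hypothetical stand-in that I've made up for illustration; it is not the &lt;code&gt;DoWork()&lt;/code&gt; from the example servers and the meaning of its 'workload' value is an assumption. It simply burns CPU in proportion to the workload by passing over the datagram's bytes that many times.
&lt;/div&gt;

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a per-datagram workload; NOT the DoWork() from
// the example servers. It mixes the datagram's bytes into a checksum
// 'workload' times, so the CPU cost scales with 'workload'.
std::uint32_t SimulateWork(
   const unsigned char *pData,
   const std::size_t length,
   const std::size_t workload)
{
   assert(pData != nullptr || length == 0);

   // volatile stops the compiler from optimising the loops away.
   volatile std::uint32_t checksum = 0;

   for (std::size_t pass = 0; pass != workload; ++pass)
   {
      for (std::size_t i = 0; i != length; ++i)
      {
         checksum = (checksum * 31u) + pData[i];
      }
   }

   return checksum;
}
```

&lt;div&gt;
A workload of 0 does nothing, which corresponds to the "discard" style tests above, and larger values take CPU time away from the receive loop.
&lt;/div&gt;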
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Running the tests again, this time with a 'workload' of 100 gives the following results. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;b&gt;Traditional Server&lt;/b&gt;
&lt;ul&gt;
&lt;li&gt; 6,386,294 datagrams out of  10,000,000 on 1 1Gb link,  63%&lt;/li&gt;
&lt;li&gt; 4,824,707 datagrams out of  20,000,000 on 2 1Gb links, 24%&lt;/li&gt;
&lt;li&gt;38,830,887 datagrams out of 100,000,000 on 2 1Gb links, 38%&lt;/li&gt;
&lt;/ul&gt;
&lt;b&gt;RIO Server&lt;/b&gt;
&lt;ul&gt;
&lt;li&gt; 9,985,323 datagrams out of  10,000,000 on 1 1Gb link,  99%&lt;/li&gt;
&lt;li&gt;19,730,003 datagrams out of  20,000,000 on 2 1Gb links, 98%&lt;/li&gt;
&lt;li&gt;93,640,607 datagrams out of 100,000,000 on 2 1Gb links, 93%&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Clearly a more realistic example allows the RIO API to show what it's capable of. Note that I ran a longer test, with two clients each sending 50,000,000 datagrams, as the second test showed some results that seemed to imply that the traditional server had become overwhelmed near the end of the test. The longer test was to see if it could recover. It didn't; it simply entered the overwhelmed state and stayed there until the end of the test. This is possibly due to the socket's recv buffer filling up.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
As can be seen from these graphs, the traditional server quickly gets into a state where it is dropping vast numbers of datagrams (thick pink line) whilst burning more user mode CPU than kernel mode CPU having maxed out the single CPU that it's running on.
&lt;a href="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-SimplePolledUDP_03151306-319.html" onclick="window.open('http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-SimplePolledUDP_03151306-319.html','popup','width=807,height=564,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false"&gt;&lt;img src="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-SimplePolledUDP_03151306-thumb-500x349-319.gif" width="500" height="349" alt="RIO-Perf-SimplePolledUDP_03151306.gif" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /&gt;&lt;/a&gt;
The RIO server doesn't drop any datagrams and the graph looks surprisingly like the previous one with no load.
&lt;a href="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-RIOPolledUDP_03151330-322.html" onclick="window.open('http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-RIOPolledUDP_03151330-322.html','popup','width=807,height=564,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false"&gt;&lt;img src="http://www.serverframework.com/asynchronousevents/assets_c/2012/03/RIO-Perf-RIOPolledUDP_03151330-thumb-500x349-322.gif" width="500" height="349" alt="RIO-Perf-RIOPolledUDP_03151330.gif" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To look at how the performance of the RIO server degrades as the workload per datagram increases I ran some more tests.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;b&gt;RIO Server, 10,000,000 datagrams&lt;/b&gt;
&lt;ul&gt;
&lt;li&gt;9,985,323 datagrams, 99% at a workload of  100&lt;/li&gt;
&lt;li&gt;9,888,733 datagrams, 98% at a workload of  300&lt;/li&gt;
&lt;li&gt;7,174,653 datagrams, 71% at a workload of  500&lt;/li&gt;
&lt;li&gt;5,573,046 datagrams, 55% at a workload of  700&lt;/li&gt;
&lt;li&gt;4,361,820 datagrams, 43% at a workload of 1000&lt;/li&gt;
&lt;li&gt;2,927,590 datagrams, 29% at a workload of 2000&lt;/li&gt;
&lt;/ul&gt;
And just to compare...
&lt;br /&gt;
&lt;b&gt;Traditional server, 10,000,000 datagrams&lt;/b&gt;
&lt;ul&gt;
&lt;li&gt;2,522,667 datagrams, 25% at a workload of 1000&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Some conclusions&lt;/h2&gt;
&lt;div&gt;
Bear in mind that these results are specific to the test machine I was running on and that we're testing on a beta version of the Windows 8 Server operating system. Even so, the figures are impressive. The lack of kernel mode transitions allows much more CPU to be used for real work on each datagram that arrives. The registering of I/O buffers once at program start up reduces the work done per operation and also means that your server will use a known amount of non-paged pool rather than a completely variable amount. Though &lt;a href="http://www.lenholgate.com/blog/2009/03/excellent-article-on-non-paged-pool.html" target="_blank"&gt;non-paged pool is more plentiful than it used to be pre-Vista&lt;/a&gt; this is likely still an advantage.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The RIO API isn't especially complicated but your server designs will be different. The simple polling example server that we used here is unlikely to be an ideal choice as it uses 100% of its CPU for the whole time that the server is running. It's also a little unfair to compare RIO to such a simple traditional server; there are better alternatives, but it's a useful line in the sand. As we'll see in the following performance articles there are better, and more scalable, ways to use both APIs.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
If you're interested in digging deeper into the results used in this article then all of the performance logs taken whilst running the tests are available &lt;a href="http://www.serverframework.com/zips/RIO-Perf-PerfLogs1.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-Perf-PerfLogs1.zip']);"&gt;here&lt;/a&gt;.
&lt;/div&gt;
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-performance.html</feedburner:origLink></entry>

<entry>
    <title>Security Clearance granted - Company News</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/A77Xzc5va6s/security-clearance-granted.html" />
    <id>tag:www.jetbyte.com,2012:/news//14.1177</id>

    <published>2012-03-15T08:45:21Z</published>
    <updated>2012-03-15T08:53:16Z</updated>

    <summary>Len now has SC level Security Clearance.</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="C++" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Server Development" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.jetbyte.com/news/">
        &lt;div&gt;
As we &lt;a href="http://www.jetbyte.com/news/2012/01/happy-new-year.html"&gt;mentioned back in January&lt;/a&gt;, one of &lt;a href="http://www.serverframework.com/clients/industrial-control-client.html"&gt;our clients&lt;/a&gt; has sponsored Len for &lt;a href="http://www.security-clearance.org.uk/"&gt;SC level Security Clearance&lt;/a&gt; for one of the projects that we're bidding for with them.
&lt;/div&gt;
&lt;div&gt;&lt;br/&gt;&lt;/div&gt;
&lt;div&gt;
We're pleased to announce that this has been granted and that Len now has SC level Security Clearance.
&lt;/div&gt;
&lt;div&gt;&lt;br/&gt;&lt;/div&gt;
&lt;div&gt;
Whilst this clearance was obtained for work with our sponsoring client, that work may or may not materialise as it's currently very early in the bidding process. So, if you require the services of a Security Cleared C++ network specialist, please get in touch.  
&lt;/div&gt;

        
    </content>
<feedburner:origLink>http://www.jetbyte.com/news/2012/03/security-clearance-granted.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O - Generating load for the performance tests - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/ctdIdMENlGw/windows-8-registered-io---generating-load-for-the-performance-tests.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1176</id>

    <published>2012-03-14T17:15:00Z</published>
    <updated>2012-03-14T17:19:06Z</updated>

    <summary> Now that we have five example servers, four RIO designs and a traditional polled UDP design, we can begin to look at how the RIO API performs compared to the traditional APIs. Of course these comparisons should be taken...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
Now that we have &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;five example servers&lt;/a&gt;, four RIO designs and a traditional polled UDP design, we can begin to look at how the RIO API performs compared to the traditional APIs. Of course these comparisons should be taken as preliminary since we're working with a beta version of the operating system. However, though I wouldn't put much weight in the exact numbers until we have a non-beta OS to test on, it's useful to see how things are coming along and familiarise ourselves with the designs that might be required to take advantage of RIO once it ships.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Sending a stream of datagrams&lt;/h2&gt;
&lt;div&gt;
Before we can compare performance we need to be able to push the example servers hard. We do this by sending a stream of datagrams at them as fast as we can for a period of time. The servers start timing when they get the first datagram and then count the number of datagrams that they process. The test finishes by sending a series of smaller datagrams at the server. When the server sees one of these smaller datagrams it shuts down and reports on the time taken and the number of datagrams processed and the rate at which they were processed.
&lt;/div&gt;
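&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The counting and timing rules described above can be sketched roughly as follows. This is only an illustration under some assumptions: &lt;code&gt;DatagramCounter&lt;/code&gt;, its member names and the "anything smaller than a test datagram is a shutdown datagram" check are hypothetical and aren't taken from the example servers.
&lt;/div&gt;

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of the server-side bookkeeping; the names and the
// size-based shutdown check are assumptions, not code from the examples.
class DatagramCounter
{
   public :

      explicit DatagramCounter(const std::size_t testDatagramSize)
         :  m_testDatagramSize(testDatagramSize)
      {
         assert(testDatagramSize != 0);
      }

      // Call once per received datagram with its size and the current time.
      // Returns false once a shutdown datagram has been seen.
      bool OnDatagram(const std::size_t bytes, const std::uint64_t nowMs)
      {
         if (m_stopped)
         {
            return false;
         }

         if (!m_started)
         {
            // Timing starts when the first datagram arrives.
            m_started = true;
            m_startMs = nowMs;
         }

         if (bytes < m_testDatagramSize)
         {
            // A smaller datagram signals the end of the test.
            m_stopped = true;
            m_stopMs = nowMs;
            return false;
         }

         ++m_count;
         return true;
      }

      std::uint64_t Count() const { return m_count; }

      // Only meaningful once a shutdown datagram has been seen.
      std::uint64_t ElapsedMs() const { return m_stopped ? m_stopMs - m_startMs : 0; }

   private :

      const std::size_t m_testDatagramSize;
      bool m_started = false;
      bool m_stopped = false;
      std::uint64_t m_startMs = 0;
      std::uint64_t m_stopMs = 0;
      std::uint64_t m_count = 0;
};
```

&lt;div&gt;
Sending a series of shutdown datagrams rather than just one makes sense with a design like this, since UDP gives no guarantee that any single datagram will arrive.
&lt;/div&gt;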
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
All we need to be able to do to stress the servers is to send datagrams at a rate that gets close to 100% utilisation of a 1Gb Ethernet link. This is fairly simple to achieve using the traditional blocking sockets API.
&lt;pre class="brush: cpp gutter: false"&gt;   for (size_t i = 0; i &amp;lt; DATAGRAMS_TO_SEND; ++i)
   {
      if (SOCKET_ERROR == ::WSASendTo(
         s,
         &amp;amp;buf,
         1,
         &amp;amp;bytesSent,
         flags,
         reinterpret_cast&amp;lt;sockaddr *&amp;gt;(&amp;amp;addr),
         sizeof(addr),
         0,
         0))
      {
         ErrorExit("WSASendTo");
      }
   }&lt;/pre&gt;
&lt;/div&gt;
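&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To get a feel for what "close to 100% utilisation" means in datagrams per second, the arithmetic can be sketched as below. This is only an illustration and isn't part of the downloadable example: the function names are made up for this sketch, and the overhead figures assume UDP over IPv4 with no IP options, untagged Ethernet frames and a payload that fits in a single frame without padding (roughly 18 to 1472 bytes).
&lt;pre class="brush: cpp gutter: false"&gt;// Assumed per-datagram overhead, in bytes, for UDP over IPv4 on Ethernet
// (no IP options, no VLAN tag). These names are hypothetical.
const unsigned long long UDP_HEADER_BYTES = 8;
const unsigned long long IPV4_HEADER_BYTES = 20;
const unsigned long long ETHERNET_HEADER_AND_FCS_BYTES = 14 + 4;
const unsigned long long PREAMBLE_AND_INTERFRAME_GAP_BYTES = 8 + 12;

// Bytes of link time used by one datagram carrying the given payload.
// Assumes one datagram per frame, with no fragmentation and no padding.
inline unsigned long long WireBytesPerDatagram(
   const unsigned long long payloadBytes)
{
   return payloadBytes +
      UDP_HEADER_BYTES +
      IPV4_HEADER_BYTES +
      ETHERNET_HEADER_AND_FCS_BYTES +
      PREAMBLE_AND_INTERFRAME_GAP_BYTES;
}

// Whole datagrams per second that fit on a link of the given speed.
inline unsigned long long MaxDatagramsPerSecond(
   const unsigned long long linkBitsPerSecond,
   const unsigned long long payloadBytes)
{
   return linkBitsPerSecond / (WireBytesPerDatagram(payloadBytes) * 8);
}&lt;/pre&gt;
With a hypothetical 1024 byte payload each datagram occupies 1090 bytes on the wire, so a 1Gb link tops out at a little under 115,000 datagrams per second. A sender that sustains something near that figure is pushing the link about as hard as it can be pushed.
&lt;/div&gt;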
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
There's not much more to it than that. We use similar code to set up and clean up, but if you've been following along with the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;other examples&lt;/a&gt; then there's nothing that needs to be explained about that.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The code for this example can be downloaded from &lt;a href="http://www.serverframework.com/zips/SimpleUDPTrafficGenerator.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'SimpleUDPTrafficGenerator.zip']);"&gt;here&lt;/a&gt;. This code requires Visual Studio 11, but would work with earlier compilers if you have a Windows SDK that supports RIO. Note that &lt;code&gt;Shared.h&lt;/code&gt; and &lt;code&gt;Constants.h&lt;/code&gt; contain helper functions and tuning constants for ALL of the examples and so there will be code in there that is not used by this example. You should be able to unzip each example into the same directory structure so that they all share the same headers. This allows you to tune all of the examples in the same way so that any performance comparisons make sense. This program can be run on versions of Windows prior to Windows 8, which is useful for testing as you only need one machine set up with the beta of Windows 8 server.
&lt;/div&gt;

        
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---generating-load-for-the-performance-tests.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O - Traditional Polled UDP Example Server - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/-arn1YIc1Jg/windows-8-registered-io---traditional-polled-udp-example-server.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1175</id>

    <published>2012-03-14T14:00:00Z</published>
    <updated>2012-03-14T17:24:36Z</updated>

    <summary> This article presents the fifth in my series of example servers for comparing the performance of the Windows 8 Registered I/O Networking extensions, RIO, and traditional Windows networking APIs. This example server is a traditional polled UDP design that...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
This article presents the fifth in my series of example servers for comparing the performance of the Windows 8 Registered I/O Networking extensions, RIO, and traditional Windows networking APIs. This example server is a traditional polled UDP design that we can compare with &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;the RIO polled UDP example server&lt;/a&gt;. I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together some simple UDP servers using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance. This series of blog posts describes each of the example servers in turn. You can find an index to all of the articles about the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;Windows 8 Registered I/O example servers here&lt;/a&gt;.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;A traditional polled UDP server&lt;/h2&gt;
&lt;div&gt;
This server is probably the simplest UDP server you could have. It's pretty much just a tight loop around a blocking call to &lt;code&gt;WSARecv()&lt;/code&gt;. There's none of the complexity required by RIO for registering memory buffers for I/O and so we use a single buffer that we create on the stack.
&lt;pre class="brush: cpp gutter: false"&gt;   do
   {
      workValue += DoWork(g_workIterations);

      if (SOCKET_ERROR == ::WSARecv(
         s,
         &amp;amp;buf,
         1,
         &amp;amp;bytesRecvd,
         &amp;amp;flags,
         0,
         0))
      {
         ErrorExit("WSARecv");
      }

      if (bytesRecvd == EXPECTED_DATA_SIZE)
      {
         g_packets++;
      }
      else
      {
         done = true;
      }
   }
   while (!done);&lt;/pre&gt;
There is some added complexity to allow us to compare performance, and this is similar to the RIO server examples. We can add an arbitrary processing overhead to each datagram by setting &lt;code&gt;g_workIterations&lt;/code&gt; to a non-zero value. We count each datagram that arrives and stop the test when a datagram of an unexpected size is received.
&lt;/div&gt; 
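&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The real &lt;code&gt;DoWork()&lt;/code&gt; lives in &lt;code&gt;Shared.h&lt;/code&gt; and isn't shown here. Purely as an illustration of the idea, a per-datagram "work" function might look something like the sketch below. The name &lt;code&gt;DoWorkSketch()&lt;/code&gt; and the arithmetic it performs are assumptions made for this example and the real helper may well differ.
&lt;pre class="brush: cpp gutter: false"&gt;// Hypothetical stand-in for DoWork(); not the code from Shared.h.
// Performs 'iterations' rounds of cheap integer arithmetic and returns a
// value that depends on every round. The caller accumulates the result,
// which makes it harder for the optimiser to discard the loop as dead code.
inline unsigned int DoWorkSketch(
   const unsigned long iterations)
{
   unsigned int value = 1;

   for (unsigned long i = 0; i != iterations; ++i)
   {
      // Each step depends on the result of the previous step.
      // Unsigned arithmetic wraps on overflow, which is well defined.
      value = value * 1664525u + 1013904223u;
   }

   // Keep the returned value small so that summing it cannot overflow quickly.
   return value % 1000u;
}&lt;/pre&gt;
With an iteration count of zero the loop body never runs, so the cost per datagram is close to nothing. Larger counts add a roughly proportional amount of CPU work to each datagram.
&lt;/div&gt;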
        &lt;h2 class="entry-body"&gt;Setting up for the datagram processing loop&lt;/h2&gt;
&lt;div&gt;
As with the RIO examples we do some setup before we can process datagrams. See the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;polled RIO example server&lt;/a&gt; for details of how and why we set up the timing system and initialise Winsock, and for details on our error handling policy.
&lt;pre class="brush: cpp gutter: false"&gt;int _tmain(int argc, _TCHAR* argv[])
{
   SetupTiming("Simple polled UDP");

   InitialiseWinsock();

   SOCKET s = CreateSocket();

   Bind(s, PORT);

   SetSocketRecvBufferToMaximum(s);

   bool done = false;

   CHAR buffer[RECV_BUFFER_SIZE];

   WSABUF buf;

   buf.buf = buffer;
   buf.len = RECV_BUFFER_SIZE;

   DWORD bytesRecvd = 0;

   DWORD flags = 0;

   if (SOCKET_ERROR == ::WSARecv(
      s,
      &amp;amp;buf,
      1,
      &amp;amp;bytesRecvd,
      &amp;amp;flags,
      0,
      0))
   {
      ErrorExit("WSARecv");
   }

   g_packets++;

   StartTiming();

   int workValue = 0;&lt;/pre&gt;
We create a traditional blocking UDP socket, bind it to a port and set its receive buffer size to the maximum. We then create our receive buffer on the stack, set up our &lt;code&gt;WSABUF&lt;/code&gt; and call &lt;code&gt;WSARecv()&lt;/code&gt; for the first time. We make this call outside of our processing loop so that we can start timing when we get the first datagram. The code then proceeds into the processing loop, shown above, and processes datagrams until a datagram of an unexpected size is received, which signals that the test is complete.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The code for &lt;code&gt;CreateSocket()&lt;/code&gt;, &lt;code&gt;Bind()&lt;/code&gt; and &lt;code&gt;SetSocketRecvBufferToMaximum()&lt;/code&gt; can be found in &lt;code&gt;Shared.h&lt;/code&gt;. Remember that the use of globals isn't clever, it's simply convenient for some of the other example servers that use the shared code.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;After the processing loop&lt;/h2&gt;
&lt;div&gt;
Once the performance test completes we stop our timing and report the results.
&lt;pre class="brush: cpp gutter: false"&gt;   StopTiming();

   PrintTimings();

   return workValue;
}&lt;/pre&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The code for this example can be downloaded from &lt;a href="http://www.serverframework.com/zips/SimplePolledUDP.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'SimplePolledUDP.zip']);"&gt;here&lt;/a&gt;. This code requires Visual Studio 11, but would work with earlier compilers if you have a Windows SDK that supports RIO. Note that &lt;code&gt;Shared.h&lt;/code&gt; and &lt;code&gt;Constants.h&lt;/code&gt; contain helper functions and tuning constants for ALL of the examples and so there will be code in there that is not used by this example. You should be able to unzip each example into the same directory structure so that they all share the same headers. This allows you to tune all of the examples in the same way so that any performance comparisons make sense.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;h2 class="entry-body"&gt;Join in&lt;/h2&gt;
Comments and suggestions are more than welcome. I'm learning as I go here and I'm quite likely to have made some mistakes or come up with some erroneous conclusions, so feel free to put me straight and help make these examples better.
&lt;/div&gt;
    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---traditional-polled-udp-example-server.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O - Multi threaded RIO IOCP UDP Example Server - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/3TyycbzWSpE/windows-8-registered-io---multi-threaded-rio-iocp-udp-example-server.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1174</id>

    <published>2012-03-12T18:10:00Z</published>
    <updated>2012-05-03T08:54:18Z</updated>

    <summary> This article presents the fourth in my series of example servers using the Windows 8 Registered I/O Networking extensions, RIO. This example server, like the last example, uses the I/O Completion Port notification method to handle RIO completions, but...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
This article presents the fourth in my series of example servers using the Windows 8 Registered I/O Networking extensions, RIO. This example server, like &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-iocp-udp-example-server.html"&gt;the last example&lt;/a&gt;, uses the I/O Completion Port notification method to handle RIO completions, but where the last example used only a single thread to service the IOCP this one uses multiple threads to scale the load. I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together some simple UDP servers using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance. This series of blog posts describes each of the example servers in turn. You can find an index to all of the articles about the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;Windows 8 Registered I/O example servers here&lt;/a&gt;.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Using an I/O Completion Port for RIO completions&lt;/h2&gt;
&lt;div&gt;
As I mentioned back in October, there are &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;three ways to receive completion notifications from RIO&lt;/a&gt;: polling, event driven and via an I/O Completion Port. Using an IOCP for RIO completions allows you to easily scale your completion handling across multiple threads, as we do here. This is the first of my example servers that allows more than one thread to be used to process completions.
&lt;/div&gt;
        &lt;h2 class="entry-body"&gt;Creating an IOCP driven RIO completion queue&lt;/h2&gt;
&lt;div&gt;
We start by initialising things in the same way that we did with the earlier &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;example RIO servers&lt;/a&gt;. In fact, this initialisation is &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-iocp-udp-example-server.html"&gt;identical to the previous IOCP example&lt;/a&gt; except for one thing.  
&lt;pre class="brush: cpp gutter: false"&gt;int _tmain(int argc, _TCHAR* argv[])
{
   SetupTiming("RIO IOCP UDP");

   InitialiseWinsock();

   CreateRIOSocket();

   g_hIOCP = ::CreateIoCompletionPort(
      INVALID_HANDLE_VALUE,
      0,
      0,
      0);

   OVERLAPPED overlapped;

   RIO_NOTIFICATION_COMPLETION completionType;

   completionType.Type = RIO_IOCP_COMPLETION;
   completionType.Iocp.IocpHandle = g_hIOCP;
   completionType.Iocp.CompletionKey = (void*)1;
   completionType.Iocp.Overlapped = &amp;amp;overlapped;

   g_queue = g_rio.RIOCreateCompletionQueue(
      RIO_PENDING_RECVS,
      &amp;amp;completionType);

   if (g_queue == RIO_INVALID_CQ)
   {
      ErrorExit("RIOCreateCompletionQueue");
   }&lt;/pre&gt;
With the previous design we passed &lt;code&gt;0&lt;/code&gt; as the completion key. This server passes &lt;code&gt;1&lt;/code&gt;. This is an arbitrary change purely to allow us to post completions with a key of &lt;code&gt;0&lt;/code&gt; to cause all of the threads waiting on the I/O Completion Port to shut down. This is a common idiom with normal, non-RIO, IOCP designs, where the completion key is more usually a pointer to a "per handle" data structure and so a key of &lt;code&gt;0&lt;/code&gt; never identifies a real completion. A RIO design with multiple completion queues would likely use the completion key for "per queue" data.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Creating a RIO request queue&lt;/h2&gt;
&lt;div&gt;
Creating the request queue and posting our receives is identical to &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;the polled example&lt;/a&gt;. The only difference is how we handle the completions.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Starting our worker threads&lt;/h2&gt;
&lt;div&gt;
This example is the first to use more than just a single thread. Because of this we need to have a way to start and manage our worker threads and for them to communicate with each other and the main thread.
&lt;pre class="brush: cpp gutter: false"&gt;   CreateIOCPThreads(NUM_IOCP_THREADS);

   INT notifyResult = g_rio.RIONotify(g_queue);

   if (notifyResult != ERROR_SUCCESS)
   {
      ErrorExit("RIONotify", notifyResult);
   }

   WaitForProcessingStarted();&lt;/pre&gt;
First we call &lt;code&gt;CreateIOCPThreads()&lt;/code&gt;, which is shown below. This creates some events that the threads will use to communicate and then creates and starts the threads themselves. As with the earlier examples, we use globals for convenience and not as an example of good design.
&lt;pre class="brush: cpp gutter: false"&gt;inline void CreateIOCPThreads(
   const DWORD numThreads)
{
   g_hStartedEvent = ::CreateEvent(0, TRUE, FALSE, 0);

   if (0 == g_hStartedEvent)
   {
      ErrorExit("CreateEvent");
   }

   g_hStoppedEvent = ::CreateEvent(0, TRUE, FALSE, 0);

   if (0 == g_hStoppedEvent)
   {
      ErrorExit("CreateEvent");
   }

   // Start our worker threads

   for (DWORD i = 0; i &amp;lt; numThreads; ++i)
   {
      unsigned int notUsed;

      const uintptr_t result = ::_beginthreadex(
         0,
         0,
         ThreadFunction,
         0,
         0,
         &amp;amp;notUsed);

      if (result == 0)
      {
         ErrorExit("_beginthreadex", errno);
      }

      g_threads.push_back(reinterpret_cast&amp;lt;HANDLE&amp;gt;(result));
   }

   cout &amp;lt;&amp;lt; numThreads &amp;lt;&amp;lt; " threads running" &amp;lt;&amp;lt; endl;
}&lt;/pre&gt;
The main thread then calls &lt;code&gt;RIONotify()&lt;/code&gt; to enable notifications and then waits for the first datagram to be processed before it starts the timer.
&lt;pre class="brush: cpp gutter: false"&gt;inline void WaitForProcessingStarted()
{
   if (WAIT_OBJECT_0 != ::WaitForSingleObject(
      g_hStartedEvent,
      INFINITE))
   {
      ErrorExit("WaitForSingleObject");
   }

   StartTiming();
}&lt;/pre&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Calling RIODequeueCompletion() and processing results&lt;/h2&gt;
&lt;div&gt;
This example's processing loop is similar to the previous examples, especially the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-iocp-udp-example-server.html"&gt;single threaded IOCP example server&lt;/a&gt;. It's slightly more complicated due to the fact that it's being run on a separate thread.
&lt;pre class="brush: cpp gutter: false"&gt;unsigned int __stdcall ThreadFunction(
   void *pV)
{
   DWORD numberOfBytes = 0;

   ULONG_PTR completionKey = 0;

   OVERLAPPED *pOverlapped = 0;

   const DWORD recvFlags = 0;

   if (!::GetQueuedCompletionStatus(
      g_hIOCP,
      &amp;amp;numberOfBytes,
      &amp;amp;completionKey,
      &amp;amp;pOverlapped,
      INFINITE))
   {
      ErrorExit("GetQueuedCompletionStatus");
   }

   int workValue = 0;

   if (completionKey == 1)
   {
      RIORESULT results[RIO_MAX_RESULTS];

      bool done = false;

      ::SetEvent(g_hStartedEvent);

      ULONG numResults = g_rio.RIODequeueCompletion(
         g_queue,
         results,
         RIO_MAX_RESULTS);

      if (0 == numResults ||
          RIO_CORRUPT_CQ == numResults)
      {
         ErrorExit("RIODequeueCompletion");
      }

      INT notifyResult = g_rio.RIONotify(g_queue);

      if (notifyResult != ERROR_SUCCESS)
      {
         ErrorExit("RIONotify", notifyResult);
      }

      do
      {
         for (DWORD i = 0; i &amp;lt; numResults; ++i)
         {
            EXTENDED_RIO_BUF *pBuffer = reinterpret_cast&amp;lt;EXTENDED_RIO_BUF *&amp;gt;(results[i].RequestContext);

            if (results[i].BytesTransferred == EXPECTED_DATA_SIZE)
            {
               ::InterlockedIncrement(&amp;amp;g_packets);

               workValue += DoWork(g_workIterations);

               if (!g_rio.RIOReceive(
                  g_requestQueue,
                  pBuffer,
                  1,
                  recvFlags,
                  pBuffer))
               {
                  ErrorExit("RIOReceive");
               }

               done = false;
            }
            else
            {
               done = true;
            }
         }

         if (!done)
         {
            if (!::GetQueuedCompletionStatus(
               g_hIOCP,
               &amp;amp;numberOfBytes,
               &amp;amp;completionKey,
               &amp;amp;pOverlapped,
               INFINITE))
            {
               ErrorExit("GetQueuedCompletionStatus");
            }

            if (completionKey == 0)
            {
               done = true;
            }
            else
            {
               numResults = g_rio.RIODequeueCompletion(
                  g_queue,
                  results,
                  RIO_MAX_RESULTS);

               if (0 == numResults ||
                   RIO_CORRUPT_CQ == numResults)
               {
                  ErrorExit("RIODequeueCompletion");
               }

               INT notifyResult = g_rio.RIONotify(g_queue);

               if (notifyResult != ERROR_SUCCESS)
               {
                  ErrorExit("RIONotify", notifyResult);
               }
            }
         }
      }
      while (!done);
   }

   ::SetEvent(g_hStoppedEvent);

   return workValue;
}&lt;/pre&gt;
The first thing we do is wait for a completion. Once we have a completion we dequeue the results and then call &lt;code&gt;RIONotify()&lt;/code&gt; to allow more completions to occur. It's important to realise that until we call &lt;code&gt;RIONotify()&lt;/code&gt; no further completions will be posted to the I/O Completion Port and that this effectively acts as synchronisation around the calls to &lt;code&gt;RIODequeueCompletion()&lt;/code&gt;. With this design only one thread can ever be calling &lt;code&gt;RIODequeueCompletion()&lt;/code&gt; at a time, which is a good thing as &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/hh448845(v=vs.85).aspx" target="_blank"&gt;the documentation for &lt;code&gt;RIODequeueCompletion()&lt;/code&gt;&lt;/a&gt; states that this is a requirement for users of the API.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Remember that this example is made more complex due to the way we profile the servers. See the explanation in the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;completion handling section of the polled RIO server example&lt;/a&gt; for details of why this is.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Shutting down and displaying results&lt;/h2&gt;
&lt;div&gt;
Whilst our worker threads are processing datagrams our main thread is waiting for the performance test to end. 
&lt;pre class="brush: cpp gutter: false"&gt;   WaitForProcessingStopped();

   StopIOCPThreads();

   PrintTimings();

   return 0;
}&lt;/pre&gt;
Our thread function's main loop can exit in two ways: firstly when a datagram arrives that isn't of the expected size, signalling the end of the performance test, and secondly when &lt;code&gt;GetQueuedCompletionStatus()&lt;/code&gt; returns a completion key of &lt;code&gt;0&lt;/code&gt;, which means that the main thread has posted completions to request that we shut down. This means that when the first "shutdown" datagram arrives, the thread that processes it will shut down and set the &lt;code&gt;g_hStoppedEvent&lt;/code&gt; event. The main thread is waiting on this event; it wakes when the event is set and shuts the rest of the worker threads down. Once all of the threads have terminated the main thread displays details of the datagrams received and the test timings.
&lt;pre class="brush: cpp gutter: false"&gt;inline void WaitForProcessingStopped()
{
   if (WAIT_OBJECT_0 != ::WaitForSingleObject(
      g_hStoppedEvent,
      INFINITE))
   {
      ErrorExit("WaitForSingleObject");
   }

   StopTiming();
}

inline void StopIOCPThreads()
{
   // Tell all threads to exit

   for (Threads::const_iterator it = g_threads.begin(),
      end = g_threads.end();
      it != end;
      ++it)
   {
      if (0 == ::PostQueuedCompletionStatus(
         g_hIOCP,
         0,
         0,
         0))
      {
         ErrorExit("PostQueuedCompletionStatus");
      }
   }

   cout &amp;lt;&amp;lt; "Threads stopping" &amp;lt;&amp;lt; endl;

   // Wait for all threads to exit

   for (Threads::const_iterator it = g_threads.begin(),
      end = g_threads.end();
      it != end;
      ++it)
   {
      HANDLE hThread = *it;

      if (WAIT_OBJECT_0 != ::WaitForSingleObject(
         hThread,
         INFINITE))
      {
         ErrorExit("WaitForSingleObject");
      }

      ::CloseHandle(hThread);
   }   

   cout &amp;lt;&amp;lt; "Threads stopped" &amp;lt;&amp;lt; endl;
}&lt;/pre&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Unexpected performance issues...&lt;/h2&gt;
&lt;div&gt;
The slight problem with this design is that it's not actually as performant in some scenarios as we might like it to be. As it stands, the fact that we can scale out across multiple threads is a plus point but the fact that the operations that we have to perform to achieve that scaling are considerably more expensive is a problem. This is more of an issue when we're looking for a general purpose solution which works as well for low throughput and fast processing of each datagram as it does for high throughput and/or slow processing. Luckily there are a couple of things we can do to fix this, but we'll look at those once we've done some performance comparisons and seen the problems first hand.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The code for this example can be downloaded from &lt;a href="http://www.serverframework.com/zips/RIO-IOCPUDPMT.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-IOCPUDPMT.zip']);"&gt;here&lt;/a&gt;. This code requires Visual Studio 11, but would work with earlier compilers if you have a Windows SDK that supports RIO. Note that &lt;code&gt;Shared.h&lt;/code&gt; and &lt;code&gt;Constants.h&lt;/code&gt; contain helper functions and tuning constants for ALL of the examples and so there will be code in there that is not used by this example. You should be able to unzip each example into the same directory structure so that they all share the same headers. This allows you to tune all of the examples in the same way so that any performance comparisons make sense.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;h2 class="entry-body"&gt;Join in&lt;/h2&gt;
Comments and suggestions are more than welcome. I'm learning as I go here and I'm quite likely to have made some mistakes or come up with some erroneous conclusions, so feel free to put me straight and help make these examples better.
&lt;/div&gt;

    </content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---multi-threaded-rio-iocp-udp-example-server.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O - Single threaded RIO IOCP UDP Example Server - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/TeqJ_vsRtew/windows-8-registered-io---single-threaded-rio-iocp-udp-example-server.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1171</id>

    <published>2012-03-12T15:45:00Z</published>
    <updated>2012-03-13T08:34:24Z</updated>

    <summary> This article presents the third in my series of example servers using the Windows 8 Registered I/O Networking extensions, RIO. This example server uses the I/O Completion Port notification method to handle RIO completions, but only uses a single...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
This article presents the third in my series of example servers using the Windows 8 Registered I/O Networking extensions, RIO. This example server uses the I/O Completion Port notification method to handle RIO completions, but only uses a single thread to service the IOCP. I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together some simple UDP servers using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance. This series of blog posts describes each of the example servers in turn. You can find an index to all of the articles about the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;Windows 8 Registered I/O example servers here&lt;/a&gt;. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Using an I/O Completion Port for RIO completions&lt;/h2&gt;
&lt;div&gt;
As I mentioned back in October, there are &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;three ways to receive completion notifications from RIO&lt;/a&gt;: polling, event driven and via an I/O Completion Port. Using an IOCP for RIO completions allows you to easily scale your completion handling across multiple threads, though in this first IOCP example server we use a single thread so that we can compare the performance against the polled and event driven servers. The next example server will adapt this server for multiple threads and allow us to scale our completion processing across more CPUs.
&lt;/div&gt;

        &lt;h2 class="entry-body"&gt;Creating an IOCP driven RIO completion queue&lt;/h2&gt;
&lt;div&gt;
We start by initialising things in the same way that we did with the earlier &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;example RIO servers&lt;/a&gt;.  
&lt;pre class="brush: cpp gutter: false"&gt;int _tmain(int argc, _TCHAR* argv[])
{
   SetupTiming("RIO IOCP UDP");

   InitialiseWinsock();

   CreateRIOSocket();

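   g_hIOCP = ::CreateIoCompletionPort(
      INVALID_HANDLE_VALUE,
      0,
      0,
      0);

   if (0 == g_hIOCP)
   {
      ErrorExit("CreateIoCompletionPort");
   }

   // Zero-initialise the OVERLAPPED; this example only passes its address
   // through to the IOCP and never inspects its contents.
   OVERLAPPED overlapped = {0};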

   RIO_NOTIFICATION_COMPLETION completionType;

   completionType.Type = RIO_IOCP_COMPLETION;
   completionType.Iocp.IocpHandle = g_hIOCP;
   completionType.Iocp.CompletionKey = (void*)0;
   completionType.Iocp.Overlapped = &amp;amp;overlapped;

   g_queue = g_rio.RIOCreateCompletionQueue(
      RIO_PENDING_RECVS,
      &amp;amp;completionType);

   if (g_queue == RIO_INVALID_CQ)
   {
      ErrorExit("RIOCreateCompletionQueue");
   }&lt;/pre&gt;
Once that is done we create an I/O Completion Port and then create a RIO completion queue which uses the IOCP for notification. In this simple design we have no need for a completion key: we only have a single completion queue, so there's no need to differentiate between completion types. We also use a plain old &lt;code&gt;OVERLAPPED&lt;/code&gt; rather than extending it to carry more information. More complex designs could use either the completion key or an extended overlapped structure to pass queue specific information to the completion handler, in much the same way that we do with normal IOCP server designs.
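&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
As a rough sketch of the more complex design mentioned above, an extended overlapped structure could carry the completion queue that a notification relates to. Note that &lt;code&gt;QUEUE_OVERLAPPED&lt;/code&gt;, its &lt;code&gt;queue&lt;/code&gt; member and the two helper functions are hypothetical names that I've made up for illustration; they're not part of the example download. The sketch also assumes a Windows SDK that declares the RIO types in &lt;code&gt;mswsock.h&lt;/code&gt;.
&lt;pre class="brush: cpp gutter: false"&gt;#include &amp;lt;winsock2.h&amp;gt;
#include &amp;lt;mswsock.h&amp;gt;
#include &amp;lt;windows.h&amp;gt;

// Hypothetical extended OVERLAPPED, one instance per completion queue.
struct QUEUE_OVERLAPPED : public OVERLAPPED
{
   RIO_CQ queue;     // the completion queue that a notification is for
};

// Fill in the notification details before calling RIOCreateCompletionQueue().
inline void PrepareIocpNotification(
   RIO_NOTIFICATION_COMPLETION &amp;amp;completionType,
   HANDLE hIOCP,
   QUEUE_OVERLAPPED &amp;amp;overlapped)
{
   ::ZeroMemory(&amp;amp;overlapped, sizeof(overlapped));

   // The caller sets this once RIOCreateCompletionQueue() has returned.
   overlapped.queue = RIO_INVALID_CQ;

   completionType.Type = RIO_IOCP_COMPLETION;
   completionType.Iocp.IocpHandle = hIOCP;
   completionType.Iocp.CompletionKey = 0;
   completionType.Iocp.Overlapped = static_cast&amp;lt;OVERLAPPED *&amp;gt;(&amp;amp;overlapped);
}

// Recover the queue from the pointer returned by GetQueuedCompletionStatus().
inline RIO_CQ QueueFromOverlapped(
   OVERLAPPED *pOverlapped)
{
   return static_cast&amp;lt;QUEUE_OVERLAPPED *&amp;gt;(pOverlapped)-&amp;gt;queue;
}&lt;/pre&gt;
With something like this in place a single completion handler could service several completion queues through one IOCP, at the cost of having to keep each &lt;code&gt;QUEUE_OVERLAPPED&lt;/code&gt; alive for as long as its queue is in use.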
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Creating a RIO request queue&lt;/h2&gt;
&lt;div&gt;
Creating the request queue and posting our receives is identical to &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;the polled example&lt;/a&gt;. The only difference is how we handle the completions.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Calling RIODequeueCompletion() and processing results&lt;/h2&gt;
&lt;div&gt;
Processing completions is almost identical to &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-event-driven-udp-example-server.html"&gt;processing event driven completions&lt;/a&gt;. We simply replace the call to &lt;code&gt;WaitForSingleObject()&lt;/code&gt; that we were using in the event driven example with the following, which retrieves a completion notification from the IOCP.
&lt;pre class="brush: cpp gutter: false"&gt;   DWORD numberOfBytes = 0;

   ULONG_PTR completionKey = 0;

   OVERLAPPED *pOverlapped = 0;

   if (!::GetQueuedCompletionStatus(
      g_hIOCP,
      &amp;amp;numberOfBytes,
      &amp;amp;completionKey,
      &amp;amp;pOverlapped,
      INFINITE))
   {
      ErrorExit("GetQueuedCompletionStatus");
   }&lt;/pre&gt;
Everything else is identical. Things change somewhat when we switch to using multiple threads for our completion handling.
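&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To show how the pieces fit together, here's a minimal sketch of one "notify, wait, dequeue" step for the IOCP case. The &lt;code&gt;WaitForCompletions()&lt;/code&gt; name and its parameters are my own invention for this illustration and the example download structures things differently; the RIO and IOCP calls themselves are the ones used above.
&lt;pre class="brush: cpp gutter: false"&gt;#include &amp;lt;winsock2.h&amp;gt;
#include &amp;lt;mswsock.h&amp;gt;
#include &amp;lt;windows.h&amp;gt;

// Hypothetical helper: returns true if at least one completion was dequeued.
inline bool WaitForCompletions(
   const RIO_EXTENSION_FUNCTION_TABLE &amp;amp;rio,
   RIO_CQ queue,
   HANDLE hIOCP,
   RIORESULT *pResults,
   const ULONG maxResults,
   ULONG &amp;amp;numResults)
{
   numResults = 0;

   // Ask RIO to post to the IOCP when there are completions to process.
   if (rio.RIONotify(queue) != ERROR_SUCCESS)
   {
      return false;
   }

   DWORD numberOfBytes = 0;

   ULONG_PTR completionKey = 0;

   OVERLAPPED *pOverlapped = 0;

   // Block until the notification arrives.
   if (!::GetQueuedCompletionStatus(
      hIOCP,
      &amp;amp;numberOfBytes,
      &amp;amp;completionKey,
      &amp;amp;pOverlapped,
      INFINITE))
   {
      return false;
   }

   numResults = rio.RIODequeueCompletion(queue, pResults, maxResults);

   return numResults != 0 &amp;amp;&amp;amp; numResults != RIO_CORRUPT_CQ;
}&lt;/pre&gt;
A caller would loop, processing &lt;code&gt;numResults&lt;/code&gt; entries from the results array and reposting receives, until it sees the datagram that ends the test.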
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The code for this example can be downloaded from &lt;a href="http://www.serverframework.com/zips/RIO-IOCPUDP.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-IOCPUDP.zip']);"&gt;here&lt;/a&gt;. This code requires Visual Studio 11, but would work with earlier compilers if you have a Windows SDK that supports RIO. Note that &lt;code&gt;Shared.h&lt;/code&gt; and &lt;code&gt;Constants.h&lt;/code&gt; contain helper functions and tuning constants for ALL of the examples and so there will be code in there that is not used by this example. You should be able to unzip each example into the same directory structure so that they all share the same shared headers. This allows you to tune all of the examples the same so that any performance comparisons make sense.   
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;h2 class="entry-body"&gt;Join in&lt;/h2&gt;
Comments and suggestions are more than welcome. I'm learning as I go here and I'm quite likely to have made some mistakes or come up with some erroneous conclusions, so feel free to put me straight and help make these examples better.
&lt;/div&gt;
</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-iocp-udp-example-server.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O - Single threaded RIO Event Driven UDP Example Server - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/ZMFsIwiTQq0/windows-8-registered-io---single-threaded-rio-event-driven-udp-example-server.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1172</id>

    <published>2012-03-09T22:25:00Z</published>
    <updated>2012-03-14T17:11:18Z</updated>

    <summary> This article presents the second in my series of example servers using the Windows 8 Registered I/O Networking extensions, RIO. This example server uses the event driven notification method to handle RIO completions. I've been looking at the Windows...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
This article presents the second in my series of example servers using the Windows 8 Registered I/O Networking extensions, RIO. This example server uses the event driven notification method to handle RIO completions. I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together some simple UDP servers using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance. This series of blog posts describes each of the example servers in turn. You can find an index to all of the articles about the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;Windows 8 Registered I/O example servers here&lt;/a&gt;. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Using an event for RIO completions&lt;/h2&gt;
&lt;div&gt;
As I mentioned back in October, there are &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;three ways to receive completion notifications from RIO&lt;/a&gt;: polling, event driven and via an I/O Completion Port. Using the event driven approach is similar to using the polling approach that I described in &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;the previous article&lt;/a&gt; except that the server doesn't burn CPU in a tight polling loop.
&lt;/div&gt;
        &lt;h2 class="entry-body"&gt;Creating an event driven RIO completion queue&lt;/h2&gt;
&lt;div&gt;
We start by initialising things in the same way that we did with the earlier &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;example RIO servers&lt;/a&gt;.  
&lt;pre class="brush: cpp gutter: false"&gt;int _tmain(int argc, _TCHAR* argv[])
{
   SetupTiming("RIO Event Driven UDP");

   InitialiseWinsock();

   CreateRIOSocket();

   HANDLE hEvent = WSACreateEvent();

   if (hEvent == WSA_INVALID_EVENT)
   {
      ErrorExit("WSACreateEvent");
   }

   RIO_NOTIFICATION_COMPLETION completionType;

   completionType.Type = RIO_EVENT_COMPLETION;
   completionType.Event.EventHandle = hEvent;
   completionType.Event.NotifyReset = TRUE;

   g_queue = g_rio.RIOCreateCompletionQueue(
      RIO_PENDING_RECVS,
      &amp;amp;completionType);

   if (g_queue == RIO_INVALID_CQ)
   {
      ErrorExit("RIOCreateCompletionQueue");
   }&lt;/pre&gt;
Once that is done we create an event and then create a RIO completion queue which uses the event for notification. The event is signalled when there are completions to process and reset when we call &lt;code&gt;RIONotify()&lt;/code&gt;.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Creating a RIO request queue&lt;/h2&gt;
&lt;div&gt;
Creating the request queue and posting our receives is identical to &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;the polled example&lt;/a&gt;. The only difference is how we handle the completions.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Calling RIODequeueCompletion() and processing results&lt;/h2&gt;
&lt;div&gt;
The processing loop is, again, similar to the polled example. Unsurprisingly, rather than polling we wait on the event and dequeue the completions once the event is set. This reduces the amount of CPU used as there's no need to spin whilst waiting for new datagrams to process. The only complication is that we need to call &lt;code&gt;RIONotify()&lt;/code&gt; to indicate that we're ready to process more completions. Note that in a real server you would probably want to wait on both your 'completions available' event and a 'we're ready to shut down' event so that you can shut the server down cleanly.
&lt;pre class="brush: cpp gutter: false"&gt;   bool done = false;

   DWORD recvFlags = 0;

   RIORESULT results[RIO_MAX_RESULTS];

   const INT notifyResult = g_rio.RIONotify(g_queue);

   if (notifyResult != ERROR_SUCCESS)
   {
      ErrorExit("RIONotify");
   }

   const DWORD waitResult = WaitForSingleObject(
      hEvent,
      INFINITE);

   if (waitResult != WAIT_OBJECT_0)
   {
      ErrorExit("WaitForSingleObject");
   }

   ULONG numResults = g_rio.RIODequeueCompletion(
      g_queue,
      results,
      RIO_MAX_RESULTS);

   if (0 == numResults ||
       RIO_CORRUPT_CQ == numResults)
   {
      ErrorExit("RIODequeueCompletion");
   }

   StartTiming();

   int workValue = 0;

   bool running = true;

   do
   {
      for (DWORD i = 0; i &amp;lt; numResults; ++i)
      {
         EXTENDED_RIO_BUF *pBuffer = reinterpret_cast&amp;lt;EXTENDED_RIO_BUF *&amp;gt;(results[i].RequestContext);

         if (results[i].BytesTransferred == EXPECTED_DATA_SIZE)
         {
            g_packets++;

            workValue += DoWork(g_workIterations);

            if (!g_rio.RIOReceive(
               g_requestQueue,
               pBuffer,
               1,
               recvFlags,
               pBuffer))
            {
               ErrorExit("RIOReceive");
            }

            done = false;
         }
         else
         {
            done = true;
         }
      }

      if (!done)
      {
         const INT notifyResult = g_rio.RIONotify(g_queue);

         if (notifyResult != ERROR_SUCCESS)
         {
            ErrorExit("RIONotify");
         }

         const DWORD waitResult = WaitForSingleObject(
            hEvent,
            INFINITE);

         if (waitResult != WAIT_OBJECT_0)
         {
            ErrorExit("WaitForSingleObject");
         }

         numResults = g_rio.RIODequeueCompletion(
            g_queue,
            results,
            RIO_MAX_RESULTS);

         if (0 == numResults ||
             RIO_CORRUPT_CQ == numResults)
         {
            ErrorExit("RIODequeueCompletion");
         }
      }
   }
   while (!done);

   StopTiming();

   PrintTimings();

   return workValue;
}&lt;/pre&gt;
As &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;before&lt;/a&gt;, the structure of the processing loop is complicated somewhat by the fact that we want to start and stop the timing for the performance testing, and the &lt;code&gt;DoWork()&lt;/code&gt; function can be used to add 'processing overhead' to each datagram. This can be configured using &lt;code&gt;g_workIterations&lt;/code&gt;, which is defined in &lt;code&gt;Constants.h&lt;/code&gt;. With this set to 0 there is no overhead and we can compare how quickly each API can receive datagrams. Setting larger values will affect how the various multi-threaded examples perform and can be useful if you're unable to saturate the test machine's network interfaces.
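&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The suggestion that a real server should also wait on a shutdown event could be sketched roughly as follows. This is only an illustration: &lt;code&gt;WaitOutcome&lt;/code&gt;, &lt;code&gt;WaitForCompletionsOrShutdown()&lt;/code&gt; and the shutdown event itself are hypothetical and don't appear in the example download.
&lt;pre class="brush: cpp gutter: false"&gt;#include &amp;lt;windows.h&amp;gt;

enum WaitOutcome
{
   CompletionsAvailable,
   ShutdownRequested,
   WaitFailed
};

// Hypothetical replacement for the WaitForSingleObject() call in the loop.
inline WaitOutcome WaitForCompletionsOrShutdown(
   HANDLE hCompletionEvent,
   HANDLE hShutdownEvent)
{
   // The shutdown event goes first so that it wins if both are signalled.
   const HANDLE handles[] = { hShutdownEvent, hCompletionEvent };

   const DWORD waitResult = ::WaitForMultipleObjects(
      2,
      handles,
      FALSE,
      INFINITE);

   if (waitResult == WAIT_OBJECT_0)
   {
      return ShutdownRequested;
   }

   if (waitResult == WAIT_OBJECT_0 + 1)
   {
      return CompletionsAvailable;
   }

   return WaitFailed;
}&lt;/pre&gt;
The processing loop would then only call &lt;code&gt;RIODequeueCompletion()&lt;/code&gt; when the result is &lt;code&gt;CompletionsAvailable&lt;/code&gt; and would leave the loop on either of the other two results.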
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
This example can be optimised slightly so that we revert to straight polling as long as calling &lt;code&gt;RIODequeueCompletion()&lt;/code&gt; returns us at least one result. We'll look at this variation after we've studied the performance of the example shown here.
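&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
One way that such a hybrid might look is sketched below; treat it as an assumption about the shape of the optimisation rather than the variation itself. The &lt;code&gt;DequeueOrWait()&lt;/code&gt; helper is a hypothetical name and, for brevity, it reports every kind of failure as &lt;code&gt;RIO_CORRUPT_CQ&lt;/code&gt;.
&lt;pre class="brush: cpp gutter: false"&gt;#include &amp;lt;winsock2.h&amp;gt;
#include &amp;lt;mswsock.h&amp;gt;
#include &amp;lt;windows.h&amp;gt;

// Hypothetical helper: poll first and only block on the event if the queue is empty.
inline ULONG DequeueOrWait(
   const RIO_EXTENSION_FUNCTION_TABLE &amp;amp;rio,
   RIO_CQ queue,
   HANDLE hEvent,
   RIORESULT *pResults,
   const ULONG maxResults)
{
   ULONG numResults = rio.RIODequeueCompletion(queue, pResults, maxResults);

   if (0 == numResults)
   {
      // Nothing was ready, so fall back to the event driven path.
      if (rio.RIONotify(queue) != ERROR_SUCCESS)
      {
         return RIO_CORRUPT_CQ;
      }

      if (WAIT_OBJECT_0 != ::WaitForSingleObject(hEvent, INFINITE))
      {
         return RIO_CORRUPT_CQ;
      }

      numResults = rio.RIODequeueCompletion(queue, pResults, maxResults);
   }

   return numResults;
}&lt;/pre&gt;
Whilst datagrams are arriving faster than we can process them this never touches the event at all, and it only pays for the notification and the wait when the queue runs dry.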
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The code for this example can be downloaded from &lt;a href="http://www.serverframework.com/zips/RIO-EventDrivenUDP.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-EventDrivenUDP.zip']);"&gt;here&lt;/a&gt;. This code requires Visual Studio 11, but would work with earlier compilers if you have a Windows SDK that supports RIO. Note that &lt;code&gt;Shared.h&lt;/code&gt; and &lt;code&gt;Constants.h&lt;/code&gt; contain helper functions and tuning constants for ALL of the examples and so there will be code in there that is not used by this example. You should be able to unzip each example into the same directory structure so that they all share the same shared headers. This allows you to tune all of the examples the same so that any performance comparisons make sense.   
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;h2 class="entry-body"&gt;Join in&lt;/h2&gt;
Comments and suggestions are more than welcome. I'm learning as I go here and I'm quite likely to have made some mistakes or come up with some erroneous conclusions, so feel free to put me straight and help make these examples better.
&lt;/div&gt;

</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-event-driven-udp-example-server.html</feedburner:origLink></entry>

<entry>
    <title>Visual Studio 11, the UI changes don't matter... - Rambling Comments</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/ygQr3zO17Ks/visual-studio-11-the-ui-changes-dont-matter.html" />
    <id>tag:www.lenholgate.com,2012:/blog//12.1173</id>

    <published>2012-03-09T21:50:40Z</published>
    <updated>2012-03-10T18:16:02Z</updated>

    <summary> The best thing about Visual Studio 11 is that it doesn't matter if you like the new style IDE or not. The project files are, at last, backwards compatible, so you can load them in Visual Studio 2010 and...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Geek Speak" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        &lt;div&gt;
The best thing about Visual Studio 11 is that it doesn't matter if you like the new style IDE or not. The project files are, at last, backwards compatible, so you can load them in Visual Studio 2010 and build with the new tool chain even though you ignore the new IDE - if that's what you want to do.  
&lt;/div&gt; 
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
I don't &lt;i&gt;like&lt;/i&gt; the new icons, but I find I can work fine in the IDE as long as I don't think about it too much... Probably pretty much like how I felt about all previous versions when they were at the beta stage...
&lt;/div&gt; 
        
</content>
<feedburner:origLink>http://www.lenholgate.com/blog/2012/03/visual-studio-11-the-ui-changes-dont-matter.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O - Simple RIO Polled UDP Example Server - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/KiFFoHzoOBo/windows-8-registered-io---simple-rio-polled-udp-example-server.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1170</id>

    <published>2012-03-07T11:28:00Z</published>
    <updated>2012-03-13T18:46:54Z</updated>

    <summary> I've been looking at the Windows 8 Registered I/O Networking Extensions since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together some simple UDP servers using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance. This series of blog posts describes each of the example servers in turn. You can find an index to all of the articles about the &lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html"&gt;Windows 8 Registered I/O example servers here&lt;/a&gt;. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Polling RIO for completions&lt;/h2&gt;
&lt;div&gt;
As I mentioned back in October, there are &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;three ways to receive completion notifications from RIO&lt;/a&gt;: polling, event driven and via an I/O Completion Port. The first is the simplest, though it burns CPU time even when no datagrams are being received.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
At its simplest a polled RIO server obtains datagrams to process like this: 
&lt;pre class="brush: cpp gutter: false"&gt;   RIORESULT results[RIO_MAX_RESULTS];

   ULONG numResults = 0;

   do
   {
      numResults = g_rio.RIODequeueCompletion(
         g_queue,
         results,
         RIO_MAX_RESULTS);

      if (0 == numResults)
      {
         YieldProcessor();
      }
      else if (RIO_CORRUPT_CQ == numResults)
      {
         ErrorExit("RIODequeueCompletion");
      }
   }
   while (0 == numResults);&lt;/pre&gt;
You then loop over the results array and process each result in turn before looping back to dequeue more completions. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Getting to the point where you can call &lt;code&gt;RIODequeueCompletion()&lt;/code&gt; takes a bit of setting up though...
&lt;/div&gt;
        &lt;h2 class="entry-body"&gt;Creating a RIO completion queue&lt;/h2&gt;
&lt;div&gt;
The examples are each standalone but can share two common header files. The first, &lt;code&gt;Constants.h&lt;/code&gt;, contains all constants that are used to tune the examples. The second, &lt;code&gt;Shared.h&lt;/code&gt;, contains inline helper functions which hide some of the complexity and allow the individual example programs to focus on the area of the API that they're demonstrating. We use several of these helper functions as we prepare to create our RIO completion queue.
&lt;pre class="brush: cpp gutter: false"&gt;int _tmain(int argc, _TCHAR* argv[])
{
   SetupTiming("RIO polled UDP");

   InitialiseWinsock();

   CreateRIOSocket();

   g_queue = g_rio.RIOCreateCompletionQueue(RIO_PENDING_RECVS, 0);

   if (g_queue == RIO_INVALID_CQ)
   {
      ErrorExit("RIOCreateCompletionQueue");
   }&lt;/pre&gt;
So that we can compare the performance of these examples we first call &lt;code&gt;SetupTiming()&lt;/code&gt;, which prepares us for calling &lt;code&gt;StartTiming()&lt;/code&gt; and &lt;code&gt;StopTiming()&lt;/code&gt; later in the program. &lt;code&gt;SetupTiming()&lt;/code&gt; locks this thread to a single CPU, using &lt;code&gt;SetThreadAffinityMask()&lt;/code&gt; as &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms644904(v=vs.85).aspx" target="_blank"&gt;recommended by the help for &lt;code&gt;QueryPerformanceCounter()&lt;/code&gt;&lt;/a&gt;. Once this is done we call &lt;code&gt;QueryPerformanceFrequency()&lt;/code&gt; and store the resulting value for use by &lt;code&gt;StopTiming()&lt;/code&gt;. 
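&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The real timing helpers live in &lt;code&gt;Shared.h&lt;/code&gt; and aren't reproduced here, but a minimal sketch of the approach described above might look something like this. The function names with the &lt;code&gt;Sketch&lt;/code&gt; suffix and the two static variables are hypothetical, and choosing the first CPU for the affinity mask is simply an assumption for the illustration.
&lt;pre class="brush: cpp gutter: false"&gt;#include &amp;lt;windows.h&amp;gt;

static LARGE_INTEGER s_frequency;
static LARGE_INTEGER s_startCounter;

// Lock the calling thread to one CPU and cache the counter frequency.
inline bool SetupTimingSketch()
{
   const DWORD_PTR firstCpuOnly = 1;

   if (0 == ::SetThreadAffinityMask(::GetCurrentThread(), firstCpuOnly))
   {
      return false;
   }

   return FALSE != ::QueryPerformanceFrequency(&amp;amp;s_frequency);
}

inline void StartTimingSketch()
{
   ::QueryPerformanceCounter(&amp;amp;s_startCounter);
}

// Returns the elapsed time in milliseconds since StartTimingSketch().
inline double StopTimingSketch()
{
   LARGE_INTEGER stopCounter;

   ::QueryPerformanceCounter(&amp;amp;stopCounter);

   const double elapsedTicks = static_cast&amp;lt;double&amp;gt;(
      stopCounter.QuadPart - s_startCounter.QuadPart);

   return elapsedTicks * 1000.0 / static_cast&amp;lt;double&amp;gt;(s_frequency.QuadPart);
}&lt;/pre&gt;
Dividing the elapsed time by the number of datagrams processed then gives a per datagram figure that can be compared across the examples.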
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
With that done we initialise Winsock and then create our RIO socket. Note that these examples use quite a few global variables for convenience. This isn't how I would suggest you write production code, but it's convenient for these examples as it makes much of the code simpler and allows us to focus on the RIO API. For example, &lt;code&gt;CreateRIOSocket()&lt;/code&gt; creates a socket and assigns it to &lt;code&gt;g_s&lt;/code&gt;, binds the socket to &lt;code&gt;PORT&lt;/code&gt;, which is a constant that is defined in &lt;code&gt;Constants.h&lt;/code&gt;, and then initialises the RIO API function table so that we can use it through &lt;code&gt;g_rio&lt;/code&gt;.
&lt;pre class="brush: cpp gutter: false"&gt;inline void CreateRIOSocket()
{
   g_s = CreateSocket(WSA_FLAG_REGISTERED_IO);

   Bind(g_s, PORT);

   InitialiseRIO(g_s);
}

inline SOCKET CreateSocket(
   const DWORD flags = 0)
{
   g_s = ::WSASocket(
      AF_INET,
      SOCK_DGRAM,
      IPPROTO_UDP,
      NULL,
      0,
      flags);

   if (g_s == INVALID_SOCKET)
   {
      ErrorExit("WSASocket");
   }

   return g_s;
}

inline void InitialiseRIO(
   SOCKET s)
{
   GUID functionTableId = WSAID_MULTIPLE_RIO;

   DWORD dwBytes = 0;

   bool ok = true;

   if (0 != WSAIoctl(
      s,
      SIO_GET_MULTIPLE_EXTENSION_FUNCTION_POINTER,
      &amp;amp;functionTableId,
      sizeof(GUID),
      (void**)&amp;amp;g_rio,
      sizeof(g_rio),
      &amp;amp;dwBytes,
      0,
      0))
   {
      ErrorExit("WSAIoctl");
   }
}&lt;/pre&gt;
As you can see, we check all API calls for failure and report errors via our &lt;code&gt;ErrorExit()&lt;/code&gt; function. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Finally we can create the RIO completion queue. We use the tunable constant &lt;code&gt;RIO_PENDING_RECVS&lt;/code&gt; to specify how large the queue should be.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Creating a RIO request queue&lt;/h2&gt;
&lt;div&gt;
Setting up a RIO request queue seems to be one of the few places where RIO API changes between the Windows 8 Developer Preview and the Windows 8 Server Beta are visible to us. With the Developer Preview we could pass any value for the &lt;code&gt;maxReceiveDataBuffers&lt;/code&gt; and &lt;code&gt;maxSendDataBuffers&lt;/code&gt; parameters, whereas in the beta &lt;code&gt;RIOCreateRequestQueue()&lt;/code&gt; only accepts a value of 1. See &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;my earlier posting on RIO for more details&lt;/a&gt;; the documentation and the API are now in sync and the API currently doesn't support scatter/gather I/O.   
&lt;pre class="brush: cpp gutter: false"&gt;   ULONG maxOutstandingReceive = RIO_PENDING_RECVS;
   ULONG maxReceiveDataBuffers = 1;
   ULONG maxOutstandingSend = 0;
   ULONG maxSendDataBuffers = 1;

   void *pContext = 0;

   g_requestQueue = g_rio.RIOCreateRequestQueue(
      g_s,
      maxOutstandingReceive,
      maxReceiveDataBuffers,
      maxOutstandingSend,
      maxSendDataBuffers,
      g_queue,
      g_queue,
      pContext);

   if (g_requestQueue == RIO_INVALID_RQ)
   {
      ErrorExit("RIOCreateRequestQueue");
   }

   PostRIORecvs(RECV_BUFFER_SIZE, RIO_PENDING_RECVS);&lt;/pre&gt;
Once the request queue has been created we can post some read requests. Note that we specify the size of the buffers to use and the number of receives that we want to have pending; both of these values can be changed easily in &lt;code&gt;Constants.h&lt;/code&gt;.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Registering buffers and posting RIO read requests&lt;/h2&gt;
&lt;div&gt;
Before we can issue some read requests we need to register some I/O buffers. The &lt;code&gt;PostRIORecvs()&lt;/code&gt; function is complicated by the fact that we're trying to keep things simple (!) and by the fact that the example is tunable for different buffer sizes and the number of receives that we may want pending. We loop allocating and registering buffers and then slicing the buffer into buffer slices. We use an extended &lt;code&gt;RIO_BUF&lt;/code&gt; structure so that we can pass an "operation" code with each buffer slice. These examples don't use this operation, but a real server might need to pass additional information with each I/O request, especially if it's using a single completion queue for reads and writes. We deliberately leak our &lt;code&gt;EXTENDED_RIO_BUF&lt;/code&gt; structures but this isn't much of a problem in this example as they're in use from this point until the program exits.
&lt;pre class="brush: cpp gutter: false"&gt;inline void PostRIORecvs(
   const DWORD recvBufferSize,
   const DWORD pendingRecvs)
{
   DWORD totalBuffersAllocated = 0;

   while (totalBuffersAllocated &amp;lt; pendingRecvs)
   {
      DWORD bufferSize = 0;
   
      DWORD receiveBuffersAllocated = 0;

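      // Only ask for the receives that are still outstanding so that a second
      // pass through this loop doesn't allocate more than we need.
      char *pBuffer = AllocateBufferSpace(
         recvBufferSize,
         pendingRecvs - totalBuffersAllocated,
         bufferSize,
         receiveBuffersAllocated);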

      totalBuffersAllocated += receiveBuffersAllocated;

      RIO_BUFFERID id = g_rio.RIORegisterBuffer(
         pBuffer,
         static_cast&amp;lt;DWORD&amp;gt;(bufferSize));

      if (id == RIO_INVALID_BUFFERID)
      {
         ErrorExit("RIORegisterBuffer");
      }

      DWORD offset = 0;

      const DWORD recvFlags = 0;

      EXTENDED_RIO_BUF *pBufs = new EXTENDED_RIO_BUF[receiveBuffersAllocated];

      for (DWORD i = 0; i &amp;lt; receiveBuffersAllocated; ++i)
      {
         // now split into buffer slices and post our recvs

         EXTENDED_RIO_BUF *pBuffer = pBufs + i;

         pBuffer-&amp;gt;operation = 0;
         pBuffer-&amp;gt;BufferId = id;
         pBuffer-&amp;gt;Offset = offset;
         pBuffer-&amp;gt;Length = recvBufferSize;

         offset += recvBufferSize;

         if (!g_rio.RIOReceive(g_requestQueue, pBuffer, 1, recvFlags, pBuffer))
         {
            ErrorExit("RIOReceive");
         }
      }

      if (totalBuffersAllocated != pendingRecvs)
      {
         // Report progress when more than one allocation is required.
         cout &amp;lt;&amp;lt; totalBuffersAllocated &amp;lt;&amp;lt; " receives pending" &amp;lt;&amp;lt; endl;
      }
   }

   cout &amp;lt;&amp;lt; totalBuffersAllocated &amp;lt;&amp;lt; " total receives pending" &amp;lt;&amp;lt; endl;
}&lt;/pre&gt;
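&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The &lt;code&gt;EXTENDED_RIO_BUF&lt;/code&gt; structure is defined in &lt;code&gt;Shared.h&lt;/code&gt; and isn't shown in this article. Judging by how it's used above, where a pointer to it is passed to &lt;code&gt;RIOReceive()&lt;/code&gt; as both the buffer and the request context, a definition along these lines would fit. This is a sketch that assumes the structure derives from &lt;code&gt;RIO_BUF&lt;/code&gt;; the type of the &lt;code&gt;operation&lt;/code&gt; member and the operation codes are hypothetical.
&lt;pre class="brush: cpp gutter: false"&gt;#include &amp;lt;winsock2.h&amp;gt;
#include &amp;lt;mswsock.h&amp;gt;

// Hypothetical operation codes; the examples here always use 0.
enum
{
   OP_NONE = 0,
   OP_RECV = 1,
   OP_SEND = 2
};

// A RIO_BUF that carries an extra "operation" code with each buffer slice.
struct EXTENDED_RIO_BUF : public RIO_BUF
{
   DWORD operation;
};&lt;/pre&gt;
Because the structure is a &lt;code&gt;RIO_BUF&lt;/code&gt; it can be passed wherever the API expects one, and because the same pointer is supplied as the request context the completion handler can get back to the whole structure from &lt;code&gt;RequestContext&lt;/code&gt;.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;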
Before we can register our I/O buffer we need to allocate the memory that we will be using. As I mentioned back in October when I was first looking at the RIO API, it's important to allocate your I/O buffer memory in a particular way so that you &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-buffer-strategies.html"&gt;use memory efficiently for RIO's registered I/O buffers&lt;/a&gt;.   
&lt;pre class="brush: cpp gutter: false"&gt;inline char *AllocateBufferSpace(
   const DWORD recvBufferSize,
   const DWORD pendingRecvs,
   DWORD &amp;amp;bufferSize,
   DWORD &amp;amp;receiveBuffersAllocated)
{
   const DWORD preferredNumaNode = 0;

   const SIZE_T largePageMinimum = USE_LARGE_PAGES ? ::GetLargePageMinimum() : 0;

   SYSTEM_INFO systemInfo;

   ::GetSystemInfo(&amp;amp;systemInfo);

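   // Allocate in units of the large page minimum when large pages are in use,
   // otherwise in units of the system's allocation granularity.
   const unsigned __int64 granularity = (largePageMinimum == 0 ? systemInfo.dwAllocationGranularity : largePageMinimum);

   // Widen before multiplying so that the product cannot wrap around in 32 bits.
   const unsigned __int64 desiredSize = static_cast&amp;lt;unsigned __int64&amp;gt;(recvBufferSize) * pendingRecvs;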

   unsigned __int64 actualSize = RoundUp(desiredSize, granularity);

   if (actualSize &amp;gt; std::numeric_limits&amp;lt;DWORD&amp;gt;::max())
   {
      actualSize = (std::numeric_limits&amp;lt;DWORD&amp;gt;::max() / granularity) * granularity;
   }

   receiveBuffersAllocated = std::min&amp;lt;DWORD&amp;gt;(pendingRecvs, static_cast&amp;lt;DWORD&amp;gt;(actualSize / recvBufferSize));

   bufferSize = static_cast&amp;lt;DWORD&amp;gt;(actualSize);

   char *pBuffer = reinterpret_cast&amp;lt;char *&amp;gt;(VirtualAllocExNuma(
      GetCurrentProcess(),
      0,
      bufferSize,
      MEM_COMMIT |
      MEM_RESERVE  |
      (largePageMinimum != 0 ? MEM_LARGE_PAGES : 0),
      PAGE_READWRITE,
      preferredNumaNode));

   if (pBuffer == 0)
   {
      ErrorExit("VirtualAllocExNuma");
   }

   return pBuffer;
}&lt;/pre&gt;
Our allocation function is again slightly more complex than it need be, but that complexity allows us to explore various options by simply changing our configuration constants; you can ignore things like the &lt;code&gt;USE_LARGE_PAGES&lt;/code&gt; flag and the fact that we're allocating to a preferred NUMA node unless you're interested in the details and your hardware supports these features. The important thing is that we allocate in terms of the system's allocation granularity and that we use a variant of &lt;code&gt;VirtualAlloc()&lt;/code&gt; to do so. Once again, in the name of simplicity, we leak these buffers (which will be in use for the life of the program) and allow program exit to clean them up.
&lt;/div&gt;
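&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To make the rounding concrete, here is a minimal, portable sketch of the arithmetic. It is not the code from &lt;code&gt;Shared.h&lt;/code&gt;: this &lt;code&gt;RoundUp()&lt;/code&gt; is a hypothetical stand-in for the helper that the listing above calls, &lt;code&gt;SlackBytes()&lt;/code&gt; is a name invented for illustration, and the 64 KB granularity in the worked numbers is an assumed value rather than one read from &lt;code&gt;GetSystemInfo()&lt;/code&gt;.
&lt;/div&gt;

```cpp
// Hypothetical stand-in for the RoundUp() helper that the listing calls.
// Rounds value up to the next whole multiple (multiple must be non-zero).
inline unsigned long long RoundUp(
   const unsigned long long value,
   const unsigned long long multiple)
{
   return ((value + multiple - 1) / multiple) * multiple;
}

// Bytes left unused in the rounded allocation after carving out the
// requested number of buffers (hypothetical helper, for illustration only).
inline unsigned long long SlackBytes(
   const unsigned long long bufferSize,
   const unsigned long long bufferCount,
   const unsigned long long granularity)
{
   const unsigned long long desiredSize = bufferSize * bufferCount;

   return RoundUp(desiredSize, granularity) - desiredSize;
}

// Example: 2000 buffers of 1024 bytes need 2,048,000 bytes. With an assumed
// 64 KB (65,536 byte) granularity that rounds up to 2,097,152 bytes, leaving
// 49,152 bytes (room for 48 more buffers) allocated but unused.
```

&lt;div&gt;
Choosing a buffer count whose total size is already a multiple of the granularity brings the slack down to zero, which is one way to avoid paying for memory that the receives never use.
&lt;/div&gt;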
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Calling RIODequeueCompletion() and processing results&lt;/h2&gt;
&lt;div&gt;
We finally have queues created, buffers registered and reads pending. Processing these reads in our simple polled RIO server is fairly straightforward. First we enter a polling loop for completions and spin until completions are available. Once we have at least one completion we call &lt;code&gt;StartTiming()&lt;/code&gt; to start our performance timing. We then process the completion results. Our performance tests are simple: we send a number of datagrams of &lt;code&gt;EXPECTED_DATA_SIZE&lt;/code&gt; and then indicate that the test is complete by sending a series of datagrams of a different size. Once our servers receive a datagram of an unexpected size they consider the test to be complete and shut down. Thus our main completion loop is the &lt;code&gt;do/while&lt;/code&gt; loop below. We process datagrams, issue new reads and then dequeue more results. Once we're done we stop our timer and display details about the time taken and the number of datagrams that we processed.
&lt;pre class="brush: cpp gutter: false"&gt;bool done = false;

   DWORD recvFlags = 0;

   RIORESULT results[RIO_MAX_RESULTS];

   ULONG numResults = 0;

   do
   {
      numResults = g_rio.RIODequeueCompletion(
         g_queue,
         results,
         RIO_MAX_RESULTS);

      if (0 == numResults)
      {
         YieldProcessor();
      }
      else if (RIO_CORRUPT_CQ == numResults)
      {
         ErrorExit("RIODequeueCompletion");
      }
   }
   while (0 == numResults);

   StartTiming();

   int workValue = 0;

   do
   {
      for (DWORD i = 0; i &amp;lt; numResults; ++i)
      {
         EXTENDED_RIO_BUF *pBuffer = reinterpret_cast&amp;lt;EXTENDED_RIO_BUF *&amp;gt;(results[i].RequestContext);

         if (results[i].BytesTransferred == EXPECTED_DATA_SIZE)
         {
            g_packets++;

            workValue += DoWork(g_workIterations);

            if (!g_rio.RIOReceive(
               g_requestQueue,
               pBuffer,
               1,
               recvFlags,
               pBuffer))
            {
               ErrorExit("RIOReceive");
            }

            done = false;
         }
         else
         {
            done = true;
         }
      }

      if (!done)
      {
         do
         {
            numResults = g_rio.RIODequeueCompletion(
               g_queue,
               results,
               RIO_MAX_RESULTS);

            if (0 == numResults)
            {
               YieldProcessor();
            }
            else if (RIO_CORRUPT_CQ == numResults)
            {
               ErrorExit("RIODequeueCompletion");
            }
         }
         while (0 == numResults);
      }
   }
   while (!done);

   StopTiming();

   PrintTimings();

   return workValue;
}&lt;/pre&gt;
The &lt;code&gt;DoWork()&lt;/code&gt; function above can be used to add 'processing overhead' to each datagram. The amount of work is configured using &lt;code&gt;g_workIterations&lt;/code&gt;, which is defined in &lt;code&gt;Constants.h&lt;/code&gt;. With this set to 0 there is no overhead and we can compare how quickly each API can receive datagrams. Setting larger values will affect how the various multi-threaded examples perform and can be useful if you're unable to saturate the test machine's network interfaces.
&lt;/div&gt;
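&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The real &lt;code&gt;DoWork()&lt;/code&gt; lives in &lt;code&gt;Shared.h&lt;/code&gt; and isn't shown here. Purely as an illustration of the idea, a per-datagram 'work' function could look something like the sketch below; the name &lt;code&gt;DoWorkSketch()&lt;/code&gt;, its return type and the arithmetic in the loop are all assumptions and not the code in the download.
&lt;/div&gt;

```cpp
// Hypothetical stand-in for DoWork(); the real helper lives in Shared.h
// and may do something quite different.
inline unsigned int DoWorkSketch(const unsigned long iterations)
{
   unsigned int value = 0;

   // The loop body is trivial arithmetic; unsigned wrap-around is well defined.
   for (unsigned long i = 0; i != iterations; ++i)
   {
      value = value * 31u + 7u;
   }

   // Returning the value gives the caller something to consume, which makes
   // it harder for the optimiser to discard the loop entirely.
   return value;
}
```

&lt;div&gt;
With zero iterations the loop never runs, which matches the 'no overhead' setting described above; larger values add a cost to each datagram that grows roughly in line with the iteration count.
&lt;/div&gt;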
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The code for this example can be downloaded from &lt;a href="http://www.serverframework.com/zips/RIO-PolledUDP.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-PolledUDP.zip']);"&gt;here&lt;/a&gt;. This code requires Visual Studio 11, but would work with earlier compilers if you have a Windows SDK that supports RIO. Note that &lt;code&gt;Shared.h&lt;/code&gt; and &lt;code&gt;Constants.h&lt;/code&gt; contain helper functions and tuning constants for ALL of the examples and so there will be code in there that is not used by this example. You should be able to unzip each example into the same directory structure so that they all share the same shared headers. This allows you to tune all of the examples in the same way so that any performance comparisons make sense.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;h2 class="entry-body"&gt;Join in&lt;/h2&gt;
Comments and suggestions are more than welcome. I'm learning as I go here and I'm quite likely to have made some mistakes or come up with some erroneous conclusions; feel free to put me straight and help make these examples better.
&lt;/div&gt;
    &lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=KiFFoHzoOBo:bEmAcqyrkpk:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=KiFFoHzoOBo:bEmAcqyrkpk:ZC7T4KBF6Nw"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=ZC7T4KBF6Nw" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/LenHolgate/~4/KiFFoHzoOBo" height="1" width="1"/&gt;</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O Example UDP Servers - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/OX7vwnASdRo/windows-8-registered-io-example-udp-servers.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1169</id>

    <published>2012-03-07T11:27:00Z</published>
    <updated>2012-03-15T15:28:35Z</updated>

    <summary> I've been looking at the Windows 8 Registered I/O Networking Extensions since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
I've been looking at the &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;Windows 8 Registered I/O Networking Extensions&lt;/a&gt; since October when they first made an appearance as part of the Windows 8 Developer Preview. Whilst exploring and understanding the new API I spent some time putting together some simple UDP servers using the various notification styles that RIO provides. I then put together some equally simple UDP servers using the "traditional" APIs so that I could compare performance.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;RIO API demonstration&lt;/h2&gt;
&lt;div&gt;
The examples are simple in that they do the bare minimum to demonstrate the APIs in question but they are configurable so that you can tune them to the hardware on which you're running them. You can run them to compare the maximum speed at which you can pull UDP datagrams off the wire using each API and then adjust the examples so that they do a specific amount of "work" with each datagram to simulate a slightly more realistic scenario.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Simplified error handling&lt;/h2&gt;
&lt;div&gt;
Error handling is limited: we display an error and exit the program. We don't skip error checking, though; all API calls are checked for errors. The examples are each standalone but can share two common header files. The first, &lt;code&gt;Constants.h&lt;/code&gt;, contains all constants that are used to tune the examples. The second, &lt;code&gt;Shared.h&lt;/code&gt;, contains inline helper functions which hide some of the complexity and allow the individual example programs to focus on the area of the API that they're demonstrating.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;This is the index page&lt;/h2&gt;
&lt;div&gt;
I will be blogging about the construction of the various examples over the next few weeks and updating this entry as an index page for all of the examples. I've listed the examples that I'll be talking about and I'll link to each blog post as they go live. Once I've presented the RIO examples I'll present the more traditional examples and finally some performance comparisons.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;RIO server examples&lt;/h2&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---simple-rio-polled-udp-example-server.html"&gt;RIO Polled UDP&lt;/a&gt; - A server which uses a single thread and a tight loop to poll for RIO completions.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-event-driven-udp-example-server.html"&gt;RIO Event Driven UDP&lt;/a&gt; - A server which uses a single thread and event driven notifications to handle RIO completions.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---single-threaded-rio-iocp-udp-example-server.html"&gt;RIO IOCP UDP&lt;/a&gt; - A server which uses a single thread and I/O Completion Port notifications to handle RIO completions.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---multi-threaded-rio-iocp-udp-example-server.html"&gt;RIO IOCP MT UDP&lt;/a&gt; - A server which uses a configurable number of threads and I/O Completion Port notifications to handle RIO completions.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Traditional server examples&lt;/h2&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---traditional-polled-udp-example-server.html"&gt;Simple Polled UDP&lt;/a&gt; - A server which uses a single thread and a tight loop to poll &lt;code&gt;WSARecv()&lt;/code&gt; for datagrams.&lt;/li&gt;
&lt;li&gt;IOCP UDP - A server which uses a single thread and I/O Completion Port notifications to handle overlapped &lt;code&gt;WSARecv()&lt;/code&gt; completions.&lt;/li&gt;
&lt;li&gt;IOCP MT UDP - A server which uses a configurable number of threads and I/O Completion Port notifications to handle overlapped &lt;code&gt;WSARecv()&lt;/code&gt; completions.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;A simple UDP datagram traffic generator&lt;/h2&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io---generating-load-for-the-performance-tests.html"&gt;Simple UDP traffic generator&lt;/a&gt; - A client which uses a single thread and a tight loop send datagrams using &lt;code&gt;WSASendTo()&lt;/code&gt;, this easily saturates a 1000BASE-T, 1Gb ethernet connection.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Test scripts&lt;/h2&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/zips/RIO-TestScripts.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'RIO-TestScripts.zip']);"&gt;Test scripts&lt;/a&gt; - These simple scripts create performance counter logs and run the test servers.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Performance Test results&lt;/h2&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-performance.html"&gt;The first tests&lt;/a&gt; - Where we compare the simple polled traditional server with the polled RIO server.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;h2 class="entry-body"&gt;Join in&lt;/h2&gt;
Comments and suggestions are more than welcome. I'm learning as I go here and I'm quite likely to have made some mistakes or come up with some erroneous conclusions; feel free to put me straight and help make these examples better.
&lt;/div&gt;
 
        
    &lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=OX7vwnASdRo:IriY_5aB1N0:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=OX7vwnASdRo:IriY_5aB1N0:ZC7T4KBF6Nw"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=ZC7T4KBF6Nw" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/LenHolgate/~4/OX7vwnASdRo" height="1" width="1"/&gt;</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/03/windows-8-registered-io-example-udp-servers.html</feedburner:origLink></entry>

<entry>
    <title>New release of deadlock detection tools - Lock Explorer</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/U6oqTkJi8Kk/new-release-of-deadlock-detection-tools.html" />
    <id>tag:www.lockexplorer.com,2012:/blog//9.1166</id>

    <published>2012-02-20T14:49:07Z</published>
    <updated>2012-02-20T14:13:02Z</updated>

    <summary> We've released new versions of both LID and LIA today. These releases include many changes that we've been testing with customers over the last couple of months and mainly deal with fixing hangs during the shutdown of managed applications...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="General" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lockexplorer.com/blog/">
        &lt;div&gt;
We've released new versions of both LID and LIA today. These releases include many changes that we've been testing with customers over the last couple of months. They mainly fix hangs during the shutdown of managed applications that are run under the tools, and improve performance on target processes which use a large number of locks and create a large number of lock acquisition sequences.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
In addition we've fixed some bugs with &lt;code&gt;TryEnterCriticalSection()&lt;/code&gt; where we were incorrectly reporting lock inversions when a call to &lt;code&gt;TryEnterCriticalSection()&lt;/code&gt; returned &lt;code&gt;FALSE&lt;/code&gt; and added a few extra command line switches so that you can show progress during lock inversion detection after the target process completes. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
You can download the latest version of LID from &lt;a href="http://www.lockexplorer.com/download.html"&gt;here&lt;/a&gt; and all customers are being contacted via email with details of how to download the latest release of LIA.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Do continue to &lt;a href="http://www.lockexplorer.com/contact/contact.html"&gt;get in touch with comments and suggestions&lt;/a&gt; and any problems that you have.
&lt;/div&gt;
        
    &lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=U6oqTkJi8Kk:E5EqlOhZZRw:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=U6oqTkJi8Kk:E5EqlOhZZRw:ZC7T4KBF6Nw"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=ZC7T4KBF6Nw" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/LenHolgate/~4/U6oqTkJi8Kk" height="1" width="1"/&gt;</content>
<feedburner:origLink>http://www.lockexplorer.com/blog/2012/02/new-release-of-deadlock-detection-tools.html</feedburner:origLink></entry>

<entry>
    <title>Latest release of The Server Framework: 6.5.4 - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/I7M5ArjIws0/latest-release-of-the-server-framework-654.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1167</id>

    <published>2012-02-14T08:28:41Z</published>
    <updated>2012-02-15T10:38:13Z</updated>

    <summary> Version 6.5.4 of The Server Framework was released today. This release contains two important bug fixes and a selection of minor improvements. If you run your code on Windows Vista/Windows Server 2008 or later and you don't explicitly disable FILE_SKIP_COMPLETION_PORT_ON_SUCCESS...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Releases" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
Version 6.5.4 of The Server Framework was released today.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
This release contains two important bug fixes and a selection of minor improvements. If you run your code on Windows Vista/Windows Server 2008 or later and you don't explicitly disable &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; in your &lt;code&gt;Config.h&lt;/code&gt; then you should install this update.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;This release includes the following, see the release notes, &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/sockettoolsreleasenotes.html"&gt;here&lt;/a&gt;, for full details of all changes.&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;Bug fix. If &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; was enabled but &lt;code&gt;JetByteTools::Socket::CanEnableSkipCompletionPortOnSuccess()&lt;/code&gt;
returned false then the code that handled issuing read and write calls would fail if &lt;code&gt;ERROR_SUCCESS&lt;/code&gt; was returned.
It would assume that &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; was enabled and that it should
handle the completion directly, but a completion would also have been posted to the IOCP and so the completion would
get handled twice. We now correctly check whether we have actually enabled &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; rather
than just whether we want to enable it.&lt;/li&gt;
&lt;li&gt;Change to &lt;code&gt;JetByteTools::Socket::CConnectionMaintainingStreamSocketConnectionFilter&lt;/code&gt; so that we do not attempt to maintain a connection
if the reconnect delay is 0.&lt;/li&gt;
&lt;li&gt;Added the concept of being able to force a write request to go via the I/O pool even if marshalling is currently turned off.&lt;/li&gt;
&lt;li&gt;Added &lt;code&gt;JetByteTools::Socket::IManageStreamSocketConnectionFilters::TryRequestWrite()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Changed how &lt;code&gt;JetByteTools::Socket::CFlowControlStreamSocketConnectionFilter&lt;/code&gt; issues write requests and how it
deals with write failure due to socket closure. We now purge any queued data when we detect the socket has
been closed, rather than continuing to try and send more.&lt;/li&gt;
&lt;li&gt;Added &lt;code&gt;JetByteTools::Socket::IDatagramSendSocket&lt;/code&gt; which is a common base class for &lt;code&gt;JetByteTools::Socket::IDatagramSocket&lt;/code&gt;
and &lt;code&gt;JetByteTools::Socket::IDatagramServerSocket&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Added &lt;code&gt;JetByteTools::Socket::IFilterableStreamSocket::CanIssueFilteredWrite()&lt;/code&gt; which is now called instead of
&lt;code&gt;JetByteTools::Socket::IStreamSocketEx::CanWrite()&lt;/code&gt; by &lt;code&gt;JetByteTools::Socket::CFilteringStreamSocketConnectionManagerBase::TryRequestWrite()&lt;/code&gt;.
This removes a race condition during the shutdown of the write side of a socket in situations where filtering is being used and the
filter wishes to write to the socket after the application level code has requested that the write side of the socket be shut down. We always
tracked the outstanding write count before actually issuing the shutdown and the filter could manage this to allow it to be able to
send after a shutdown had been requested BUT the filtered send could still fail as the socket's write shutdown flag would be set. This
new function does not check the write shutdown flag and so allows the filter to write successfully.
&lt;/li&gt;&lt;li&gt;Changed &lt;code&gt;JetByteTools::IO::IAllocateBufferHandles::Flush()&lt;/code&gt; so that it returns a bool indicating
if buffers were active when the flush was done. This brings it in line with &lt;code&gt;JetByteTools::IO::IAllocateBuffers&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Changed the &lt;code&gt;JetByteTools::IO::IAsyncIOStream::Write()&lt;/code&gt; methods so that they take an optional bool
that enables you to force the write to go via the I/O pool even if I/O marshalling is turned off.
&lt;/li&gt;&lt;li&gt;Changed &lt;code&gt;JetByteTools::IO::CAsyncFileLog&lt;/code&gt; to monitor its own write thread to remove the chance that
it might hang during destruction if the thread has terminated due to an exception.&lt;/li&gt;
&lt;/ul&gt;

&lt;/div&gt;
        
    &lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=I7M5ArjIws0:USMIvh14VaI:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=I7M5ArjIws0:USMIvh14VaI:ZC7T4KBF6Nw"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=ZC7T4KBF6Nw" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/LenHolgate/~4/I7M5ArjIws0" height="1" width="1"/&gt;</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/02/latest-release-of-the-server-framework-654.html</feedburner:origLink></entry>

<entry>
    <title>The advantage of having lots of clients and clients with lots of clients - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/PT2qt0IHqRQ/the-advantage-of-having-lots-of-clients-and-clients-with-lots-of-clients.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1165</id>

    <published>2012-02-08T12:11:02Z</published>
    <updated>2012-02-08T12:38:42Z</updated>

    <summary> Our Secretive Online Game Company client uses The Server Framework for their custom application server for the games industry. They have thousands of users who run their server on a very diverse set of hardware. This is great for...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Bug fixes" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
Our &lt;a href="http://www.serverframework.com/clients/online-game-company.html" target="_blank"&gt;Secretive Online Game Company&lt;/a&gt; client uses The Server Framework for their custom application server for the games industry. They have thousands of users who run their server on a very diverse set of hardware. This is great for us as it really helps to shake down The Server Framework. There's nothing like running your multi-threaded code on lots of different hardware to help find all of the hidden race conditions and whatever. I'm pleased that we have so few bug reports coming in from our clients. Especially knowing that our Online Game Company client has the latest code out in the field or at least in use internally on their cloud system. Unfortunately one of their clients has recently exposed a latent bug in a rarely used corner of The Server Framework. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
One of the features of The Server Framework is that we track new features in various Windows operating systems so that you can take advantage of them simply by upgrading. Often you only need to adjust your &lt;code&gt;Config.h&lt;/code&gt; file to enable powerful new features. One of these features is &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt;; you can &lt;a href="http://www.lenholgate.com/cgi-bin/mt/mt-search.cgi?search=FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&amp;amp;IncludeBlogs=11%2C12&amp;amp;limit=20" target="_blank"&gt;read more about it over on Len's blog&lt;/a&gt;. This allows some optimisation in thread scheduling and context switching and generally improves performance on busy servers.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Back in April 2011 we updated our &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; support to include protection from incompatible networking providers, see &lt;a href="http://support.microsoft.com/kb/2568167" target="_blank"&gt;this Microsoft Knowledge Base article for details of the potential problem&lt;/a&gt;. In addition to the compile time support for &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; which can be used to turn the feature on and off in The Server Framework, we added a run time check to ensure that the machine on which the code was running did not have any incompatible networking providers installed.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Unfortunately this code wasn't tested as well as it could have been and it contained a bug which causes problems on systems that have an incompatible networking provider installed and &lt;code&gt;FILE_SKIP_COMPLETION_PORT_ON_SUCCESS&lt;/code&gt; support turned on. The completions for some operations are processed twice, which then causes reference counting (over-release) problems with the corresponding socket and buffer structures.
&lt;/div&gt;
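&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The shape of the problem can be sketched in a few lines of portable C++. This is not The Server Framework's code: the &lt;code&gt;SkipOnSuccessState&lt;/code&gt; structure and the &lt;code&gt;HandleCompletionInline()&lt;/code&gt; function are hypothetical names used only to show the difference between wanting the feature and actually having it enabled.
&lt;/div&gt;

```cpp
// Hypothetical names throughout; this is not The Server Framework's code.
struct SkipOnSuccessState
{
   bool requested;   // the compile time switch asks for the feature
   bool enabled;     // the run time provider check passed and the mode was set
};

// Returns true when the caller must process a synchronously completed
// operation itself because no completion packet will be queued to the IOCP.
inline bool HandleCompletionInline(
   const SkipOnSuccessState state,
   const bool completedSynchronously)
{
   if (!completedSynchronously)
   {
      return false;   // pending operations always complete via the IOCP
   }

   // Testing state.requested here would reproduce the bug: on a machine with
   // an incompatible provider the IOCP still delivers a packet, so the
   // completion would be processed twice.
   return state.enabled;
}
```

&lt;div&gt;
The decision depends only on whether the feature really is enabled on this machine, so a system where the feature was requested but could not be enabled always leaves the completion to the IOCP and each completion is processed exactly once.
&lt;/div&gt;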
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
We've fixed this issue and it will be included in a 6.5.4 release which is currently in test. If you think you're suffering from the problems caused by this and need the fix immediately then please get in touch.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The offending networking provider in this case was the "AVSDA" provider which, a quick web search suggests, is part of Avira Antivirus.
&lt;/div&gt;
        
    &lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=PT2qt0IHqRQ:3KV1BN7_HoU:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=PT2qt0IHqRQ:3KV1BN7_HoU:ZC7T4KBF6Nw"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=ZC7T4KBF6Nw" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/LenHolgate/~4/PT2qt0IHqRQ" height="1" width="1"/&gt;</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/02/the-advantage-of-having-lots-of-clients-and-clients-with-lots-of-clients.html</feedburner:origLink></entry>

<entry>
    <title>WASP download of XP versions now fixed  - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/TVFMDKbXBfg/wasp-download-of-xp-versions-now-fixed.html" />
    <id>tag:www.serverframework.com,2012:/asynchronousevents//2.1164</id>

    <published>2012-02-02T14:58:40Z</published>
    <updated>2012-02-02T15:04:19Z</updated>

    <summary> I've just noticed a problem with downloading the XP versions of WASP. This is now fixed. The XP versions can now be downloaded correctly again from here. Sorry for any inconvenience caused....</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="WASP" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
I've just noticed a problem with downloading the XP versions of WASP.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
This is now fixed. The XP versions can now be downloaded correctly again from &lt;a href="http://www.serverframework.com/products---download-wasp.html"&gt;here&lt;/a&gt;. Sorry for any inconvenience caused.
&lt;/div&gt;
        
    &lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=TVFMDKbXBfg:i8S4djspWzM:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=TVFMDKbXBfg:i8S4djspWzM:ZC7T4KBF6Nw"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=ZC7T4KBF6Nw" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/LenHolgate/~4/TVFMDKbXBfg" height="1" width="1"/&gt;</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2012/02/wasp-download-of-xp-versions-now-fixed.html</feedburner:origLink></entry>

<entry>
    <title>Happy New Year - Company News</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/4LRK8iRtwj8/happy-new-year.html" />
    <id>tag:www.jetbyte.com,2012:/news//14.1163</id>

    <published>2012-01-13T15:42:57Z</published>
    <updated>2012-01-13T16:40:42Z</updated>

    <summary> The year has kicked off to a very busy start for us with lots of work from our secretive Online Gaming Company. They're doing a lot of work to enhance the product that we've helped them build so that...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="C++" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="CLR Hosting" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Fixed-price" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Server Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Windows" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.jetbyte.com/news/">
        &lt;div&gt;
The year has kicked off to a very busy start for us with lots of work from our &lt;a href="http://www.serverframework.com/clients/online-game-company.html" target="_blank"&gt;secretive Online Gaming Company&lt;/a&gt;. They're enhancing the product that we've helped them build so that it can run well in cloud environments for their clients and also form the core of their cloud-based service. Much of our current work for them concerns server-to-server communications, so that they can build a scalable system that uses resources in the cloud to grow on demand. We've been working with them on aspects of this for some time now and it's all starting to come together nicely. Additionally, we've recently added &lt;a href="http://www.serverframework.com/products---the-websockets-option.html" target="_blank"&gt;WebSockets&lt;/a&gt; support to their server so that they can easily support a new breed of browser-based clients.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
We're also excited about working more closely with our &lt;a href="http://www.serverframework.com/clients/industrial-control-client.html" target="_blank"&gt;Industrial Control Client&lt;/a&gt; who are sponsoring us for &lt;a href="http://www.security-clearance.org.uk/" target="_blank"&gt;SC level security clearance&lt;/a&gt; so that we can get more involved with their systems; if we told you any more we'd have to kill you...
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
We've had more fixed-price work for our &lt;a href="http://www.jetbyte.com/news/2011/02/fixed-price-server-development-for-a-custom-electronics-manufacturer.html" target="_blank"&gt;Custom Electronics Manufacturer&lt;/a&gt;, adding &lt;a href="http://www.serverframework.com/products---the-ssltls-using-schannel-option.html" target="_blank"&gt;SChannel-based SSL support&lt;/a&gt; to their server. This server allows thousands of embedded devices to connect to their central servers to upload logs and download firmware.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
We've still been too busy to do much work on the &lt;a href="http://www.lockexplorer.com" target="_blank"&gt;LockExplorer&lt;/a&gt; website but there's an update to the tools scheduled for the end of the month to include all of the recent changes and fixes that have been applied whilst we've been '&lt;a href="http://en.wikipedia.org/wiki/Eating_your_own_dog_food" target="_blank"&gt;dog fooding&lt;/a&gt;' them with our clients and on &lt;a href="http://www.serverframework.com/" target="_blank"&gt;The Server Framework&lt;/a&gt;'s build machines.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Finally, we already have several releases planned for &lt;a href="http://www.serverframework.com/" target="_blank"&gt;The Server Framework&lt;/a&gt; this year; watch &lt;a href="http://www.serverframework.com/asynchronousevents/releases/" target="_blank"&gt;this&lt;/a&gt; space!
&lt;/div&gt;

        
</content>
<feedburner:origLink>http://www.jetbyte.com/news/2012/01/happy-new-year.html</feedburner:origLink></entry>

<entry>
    <title>Latest release of The Server Framework: 6.5.3 - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/TLzTWfdj-2U/latest-release-of-the-server-framework-653.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1162</id>

    <published>2011-12-14T08:47:43Z</published>
    <updated>2011-12-14T09:58:41Z</updated>

    <summary> Version 6.5.3 of The Server Framework was released today. This release updates the WebSockets Option pack to the final version of the protocol as detailed in RFC 6455 which was released yesterday. There is also a bug fix to...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Releases" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
Version 6.5.3 of The Server Framework was released today.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
This release updates the &lt;a href="http://www.serverframework.com/products---the-websockets-option.html"&gt;WebSockets Option pack&lt;/a&gt; to the final version of the protocol, as detailed in &lt;a href="http://tools.ietf.org/html/rfc6455" target="_blank"&gt;RFC 6455&lt;/a&gt;, which was released yesterday. There is also a bug fix to WebSocket status reason processing. If you have 6.5, 6.5.1 or 6.5.2 and you are NOT using WebSockets then you probably don't need this release.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;This release includes the following; see the release notes, &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/sockettoolsreleasenotes.html"&gt;here&lt;/a&gt;, for full details of all changes.&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;Updated to support &lt;a href="http://tools.ietf.org/html/rfc6455" target="_blank"&gt;RFC 6455&lt;/a&gt; - added close status codes 1011 and 1015.&lt;/li&gt;
&lt;li&gt;Fixed a bug in the handling of long status result messages; we now truncate them correctly.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

        
</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/12/latest-release-of-the-server-framework-653.html</feedburner:origLink></entry>

<entry>
    <title>RFC 6455: The WebSocket protocol - Rambling Comments</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/qFu3fvAa-Lw/rfc-6455-the-websocket-protocol.html" />
    <id>tag:www.lenholgate.com,2011:/blog//12.1161</id>

    <published>2011-12-14T08:46:47Z</published>
    <updated>2011-12-14T09:03:46Z</updated>

    <summary> I know I've said this before, but now it's really done... The WebSocket protocol is now an official RFC. There are a small number of changes between RFC 6455 and the draft WebSocket protocol version 17; the only important...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Geek Speak" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Socket Servers" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        &lt;div&gt;
I know &lt;a href="http://www.lenholgate.com/blog/2011/09/the-websocket-protocol-is-done.html"&gt;I've said this before&lt;/a&gt;, but now it's really done...
&lt;/div&gt;
&lt;div&gt;&lt;br/&gt;&lt;/div&gt;
&lt;div&gt;
The WebSocket protocol is now an official RFC. There are a small number of changes between RFC 6455 and the draft WebSocket protocol version 17; the only important change is the addition of two new close status codes. The rest is just a case of tidying up the draft.
&lt;/div&gt;
&lt;div&gt;&lt;br/&gt;&lt;/div&gt;
&lt;div&gt;
There will be a 6.5.3 release of &lt;a href="http://www.serverframework.com/products---the-websockets-option.html" target="_blank"&gt;The Server Framework&lt;/a&gt; to include these changes.
&lt;/div&gt;
        
</content>
<feedburner:origLink>http://www.lenholgate.com/blog/2011/12/rfc-6455-the-websocket-protocol.html</feedburner:origLink></entry>

<entry>
    <title>New client profile: Smart Moves Software Systems - Online gaming - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/Ru4Of8hZZrc/new-client-profile-smart-moves-software-systems-ltd---online-gaming.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1159</id>

    <published>2011-11-23T08:38:32Z</published>
    <updated>2011-11-23T08:45:03Z</updated>

    <summary><![CDATA[We have a new client profile available here for a new client who selected The&nbsp;Server&nbsp;Framework to help it expand its online gaming platform to incorporate a WebSockets interface....]]></summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="General" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        We have a new client profile available &lt;a href="http://www.serverframework.com/clients/smart-moves-software-systems-ltd.html"&gt;here&lt;/a&gt; for a new client who selected The&amp;nbsp;Server&amp;nbsp;Framework to help it expand its online gaming platform to incorporate &lt;a href="http://www.serverframework.com/products---the-websockets-option.html"&gt;a WebSockets interface&lt;/a&gt;.
        
</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/11/new-client-profile-smart-moves-software-systems-ltd---online-gaming.html</feedburner:origLink></entry>

<entry>
    <title>A new release of WASP, now with SSL/TLS support - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/1aiTb3RRxlE/a-new-release-of-wasp-now-with-ssltls-support.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1157</id>

    <published>2011-11-22T10:10:47Z</published>
    <updated>2011-11-22T10:48:41Z</updated>

    <summary> We've just released a new version of WASP, our pluggable application server platform. This release is built with release 6.5.2 of The Server Framework and includes support for secure TCP connections using SSL/TLS via our SChannel Option pack. Setting...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="WASP" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="WASP Tutorial" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
We've just released a new version of &lt;a href="http://www.serverframework.com/products---wasp.html"&gt;WASP, our pluggable application server platform&lt;/a&gt;. This release is built with release 6.5.2 of The Server Framework and includes support for secure TCP connections using SSL/TLS via our &lt;a href="http://www.serverframework.com/products---the-ssltls-using-schannel-option.html"&gt;SChannel Option pack&lt;/a&gt;.
&lt;/div&gt;
&lt;br /&gt;
&lt;div&gt;
Setting up a secure TCP endpoint with WASP is easy; simply add the &lt;b&gt;Secure&lt;/b&gt; configuration option to the &lt;b&gt;&amp;lt;EndPoint&amp;gt;&lt;/b&gt; node like this:
&lt;/div&gt;
&lt;pre class="brush: xml gutter: false"&gt;&amp;lt;?xml version="1.0" encoding="Windows-1252"?&amp;gt;
&amp;lt;Configuration&amp;gt;
  &amp;lt;WASP&amp;gt;
    &amp;lt;TCP&amp;gt;
      &amp;lt;Endpoints&amp;gt;
        &amp;lt;EndPoint
          Name="Echo Server"
          Port="5050"
          HandlerDLL="[CONFIG]\EchoServer.dll"
          Secure="true"&amp;gt;
        &amp;lt;/EndPoint&amp;gt;
      &amp;lt;/Endpoints&amp;gt;
    &amp;lt;/TCP&amp;gt;
  &amp;lt;/WASP&amp;gt;
&amp;lt;/Configuration&amp;gt;
&lt;/pre&gt;
&lt;div&gt;
This tells WASP to secure the endpoint using a default certificate called "Wasp" that is located in the "MY" certificate store. You can add a self-signed test certificate using the standard Microsoft "make cert" utility, makecert.exe. A simple script which creates and installs the correct type of certificate can be downloaded from &lt;a href="http://www.serverframework.com/WASP/Examples/MakeCert.zip" onclick="javascript: pageTracker._trackPageview('/downloads/WASP-MakeCert'); "&gt;here&lt;/a&gt;.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
If you do not want to use a certificate called "Wasp" in the "MY" certificate store then you can configure the certificate used by adding the &lt;b&gt;StoreName&lt;/b&gt;, &lt;b&gt;CertificateName&lt;/b&gt; and &lt;b&gt;UseMachineStore&lt;/b&gt; config values.
&lt;/div&gt;
&lt;pre class="brush: xml gutter: false"&gt;&amp;lt;?xml version="1.0" encoding="Windows-1252"?&amp;gt;
&amp;lt;Configuration&amp;gt;
  &amp;lt;WASP&amp;gt;
    &amp;lt;TCP&amp;gt;
      &amp;lt;Endpoints&amp;gt;
        &amp;lt;EndPoint
          Name="Echo Server"
          Port="5050"
          HandlerDLL="[CONFIG]\EchoServer.dll"
          Secure="true"
          StoreName="OurSpecialStore"
          CertificateName="OurCertificate"
          UseMachineStore="true"&amp;gt;
        &amp;lt;/EndPoint&amp;gt;
      &amp;lt;/Endpoints&amp;gt;
    &amp;lt;/TCP&amp;gt;
  &amp;lt;/WASP&amp;gt;
&amp;lt;/Configuration&amp;gt;
&lt;/pre&gt;
&lt;div&gt;You can test your new secure endpoint using either our OpenSSL server test or our SChannel server test. These are example clients that ship with 
&lt;a href="http://www.serverframework.com/products---the-server-framework.html"&gt;The Server Framework&lt;/a&gt; and that allow you to create thousands of concurrent connections and control how they send data to a server. This is an easy way to build a test system for your server: all of the complexity of managing and controlling the connections is done for you, and you simply have to adjust the messages that are generated and how the responses are validated. The default message is prefixed with an integer length in network byte order, so this program can be used to stress test &lt;a href="http://www.serverframework.com/products---wasp.html"&gt;WASP&lt;/a&gt; with either of the first two example plugins that were discussed in the tutorial.
&lt;/div&gt;
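&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
As a rough illustration of a network byte order, length prefixed message like the default one described above, here is how a 5 byte payload of "hello" might look on the wire. This is a sketch only: it assumes a 4 byte prefix that counts just the payload, neither of which is stated here, so check the test program's source for the real layout.
&lt;/div&gt;
&lt;pre&gt;
00 00 00 05   68 65 6C 6C 6F
length = 5    'h' 'e' 'l' 'l' 'o'
&lt;/pre&gt;
&lt;div&gt;
The most significant byte of the length goes first, which is what "network byte order" means, so a little-endian x86 or x64 host needs to swap the bytes, for example with &lt;code&gt;htonl()&lt;/code&gt;, before sending.
&lt;/div&gt;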
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;You can download the SChannelEchoServerTest program from &lt;a href="http://www.serverframework.com/WASP/Examples/SChannelEchoServerTest.zip" onclick="javascript: pageTracker._trackPageview('/downloads/WASP-SChannelEchoServerTest'); "&gt;here&lt;/a&gt;. See our &lt;a href="http://www.serverframework.com/asynchronousevents/2010/10/stress-testing-wasp-using-the-echoservertest-program.html"&gt;tutorial on testing WASP&lt;/a&gt; for details of how to run this tool.&lt;/div&gt;
        
</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/11/a-new-release-of-wasp-now-with-ssltls-support.html</feedburner:origLink></entry>

<entry>
    <title>Latest release of The Server Framework: 6.5.2 - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/iW8-lLCjlZg/latest-release-of-the-server-framework-652.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1156</id>

    <published>2011-11-21T08:25:00Z</published>
    <updated>2011-11-21T08:35:20Z</updated>

    <summary> Version 6.5.2 of The Server Framework was released today. This release adds some new functionality to the WebSockets Option pack and fixes some bugs in code that is only currently used by the WebSockets Option pack. If you have...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Releases" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
Version 6.5.2 of The Server Framework was released today.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
This release adds some new functionality to the &lt;a href="http://www.serverframework.com/products---the-websockets-option.html"&gt;WebSockets Option pack&lt;/a&gt; and fixes some bugs in code that is only currently used by the &lt;a href="http://www.serverframework.com/products---the-websockets-option.html"&gt;WebSockets Option pack&lt;/a&gt;. If you have 6.5 or 6.5.1 and you are not using WebSockets then you probably don't need this release.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;This release includes the following; see the release notes, &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/sockettoolsreleasenotes.html"&gt;here&lt;/a&gt;, for full details of all changes.&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;Bug fixes to &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_i_o_1_1_c_buffer_chain.html"&gt;JetByteTools::IO::CBufferChain&lt;/a&gt; and &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_i_o_1_1_c_sorted_buffer_chain.html"&gt;JetByteTools::IO::CSortedBufferChain&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Added&lt;/b&gt; &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_i_o_1_1_c_lockable_buffer_processor.html#d414c09190c1014e8fd88bd2ebc90786"&gt;JetByteTools::IO::CLockableBufferProcessor::PutBackAndLock()&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Adjusted how &lt;code&gt;JETBYTE_DUMP_STREAM_SOCKET_READ_TO_DEBUG_LOG&lt;/code&gt;, etc. work; you can now produce less debug data (faster) using &lt;code&gt;JETBYTE_TRACE_STREAM_SOCKET_READ_TO_DEBUG_LOG&lt;/code&gt;, etc.&lt;/li&gt;
&lt;li&gt;Removed duplication in WebSocket header processing.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Breaking Change&lt;/b&gt; &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_web_socket_1_1_i_protocol_handler.html#9b57c50697f8519bd7a23ff46aabcb45"&gt;JetByteTools::WebSocket::IProtocolHandler::OnConnectionClosed()&lt;/a&gt; has been added and should be called when the underlying socket connection is closed. The example servers have been updated so please compare these to your own servers to see the changes required.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Added&lt;/b&gt; overloads of &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_web_socket_1_1_i_web_socket.html#817dcfd9b0156da8fa9f6dfe96b090ab"&gt;JetByteTools::WebSocket::IWebSocket::WriteText()&lt;/a&gt; and &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_web_socket_1_1_i_web_socket.html#e1252df7aad51646a54cef13b9c0fda6"&gt;JetByteTools::WebSocket::IWebSocket::TryWriteText()&lt;/a&gt; which take &lt;code&gt;BYTE *&lt;/code&gt; of pre-formatted UTF8 data.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_web_socket_1_1_t_web_socket_base.html"&gt;JetByteTools::WebSocket::TWebSocketBase&lt;/a&gt; now holds a reference to the underlying stream socket until the connection closes. This is because we can't rely on there being a read pending to hold the stream socket open, even if the client thinks there is, as the websocket may have data buffered and so not issue a physical read.&lt;/li&gt;
&lt;li&gt;Adjusted &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_web_socket_1_1_c_auto_detect_protocol_handler.html"&gt;JetByteTools::WebSocket::CAutoDetectProtocolHandler&lt;/a&gt; so that it doesn't have to always issue a read when set to accept a connection. This allows for connections which may also carry non-websocket traffic.&lt;/li&gt;
&lt;li&gt;Adjusted how the HyBi protocol handler responds to errors. We now issue the 'correct' websocket layer close notification with the correct close code (we pass the &lt;a href="http://www.serverframework.com/ServerFramework/6.5.2/WebSockets/Autobahn-4.3/index.html"&gt;latest Autobahn tests (0.4.3)&lt;/a&gt; which include tests for appropriate close codes) and notify client code via two new callbacks on &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_web_socket_1_1_hy_bi_1_1_i_web_socket_server.html"&gt;JetByteTools::WebSocket::HyBi::IWebSocketServer&lt;/a&gt;, &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_web_socket_1_1_hy_bi_1_1_i_web_socket_server.html#89b79ff914e0693b9945ca64cc43b53d"&gt;JetByteTools::WebSocket::HyBi::IWebSocketServer::OnError()&lt;/a&gt; and &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_web_socket_1_1_hy_bi_1_1_i_web_socket_server.html#cc1cdd64fc1e0915ea3cb07f1504c193"&gt;JetByteTools::WebSocket::HyBi::IWebSocketServer::OnClosed()&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Bug fix to &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_web_socket_1_1_hy_bi_1_1_c_protocol_handler.html"&gt;JetByteTools::WebSocket::HyBi::CProtocolHandler&lt;/a&gt; which deals with a race condition during inbound data processing.&lt;/li&gt;
&lt;li&gt;Bug fix to &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_web_socket_1_1_hy_bi_1_1_c_protocol_handler.html"&gt;JetByteTools::WebSocket::HyBi::CProtocolHandler&lt;/a&gt; and &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_web_socket_1_1_hixie76_1_1_c_protocol_handler.html"&gt;JetByteTools::WebSocket::Hixie76::CProtocolHandler&lt;/a&gt; which deals with close handling; we now wait for the other side's close response correctly.&lt;/li&gt;
&lt;li&gt;Removed unnecessary size limitations in the &lt;code&gt;TCHAR *&lt;/code&gt; based socket write methods.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
        
</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/11/latest-release-of-the-server-framework-652.html</feedburner:origLink></entry>

<entry>
    <title>Dropping support for Visual Studio .Net 2002 and 2003 - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/9R8ckNcMsgA/dropping-support-for-visual-studio-net-2002-and-2003.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1155</id>

    <published>2011-11-09T11:25:24Z</published>
    <updated>2011-11-09T12:14:31Z</updated>

    <summary> We are dropping support for Visual Studio .Net 2002 from release 6.6 of The Server Framework which is due early next year. We don't expect that this will cause anyone any problems as this compiler became unsupported by Microsoft...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="General" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
We are dropping support for Visual Studio .Net 2002 from release 6.6 of The Server Framework which is due early next year. We don't expect that this will cause anyone any problems as this compiler &lt;a href="http://support.microsoft.com/lifecycle/search/?sort=PN&amp;amp;alpha=Visual+Studio" target="_blank"&gt;became unsupported by Microsoft in 2009&lt;/a&gt;, and most native C++ people seemed to skip this release entirely. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
We are also considering dropping support for Visual Studio .Net 2003. This compiler is still supported by Microsoft until 2013 but we expect that most people interested in native C++ will have switched to Visual Studio 2005 or 2008 at the earliest opportunity.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Feedback is essential here. If you DO still need Visual Studio .Net 2003 support then we need to know, otherwise we may bring forward its retirement date...
&lt;/div&gt;
        
</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/11/dropping-support-for-visual-studio-net-2002-and-2003.html</feedburner:origLink></entry>

<entry>
    <title>New client profile: Takion Technologies - Equity trading platform - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/c586ZkwYwO0/new-client-profile-takion-technologies---equity-trading-platform.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1149</id>

    <published>2011-11-04T09:11:03Z</published>
    <updated>2011-11-23T08:40:45Z</updated>

    <summary>We have a new client profile available here for a client that we've had since 2006 and who use The Server Framework in their equity trading systems....</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="General" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        We have a new client profile available &lt;a href="http://www.serverframework.com/clients/takion-technologies.html"&gt;here&lt;/a&gt; for a client that we've had since 2006 and who use The Server Framework in their equity trading systems.  
        
</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/11/new-client-profile-takion-technologies---equity-trading-platform.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O Buffer Strategies - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/ErS8EFWngKM/windows-8-registered-io-buffer-strategies.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1147</id>

    <published>2011-10-31T07:30:00Z</published>
    <updated>2012-03-13T08:35:07Z</updated>

    <summary> One of the things that allows the Windows 8 Registered I/O Networking Extensions, RIO, to perform better than normal Winsock calls is the fact that the memory used for I/O operations is pre-registered with the API. This allows RIO...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
One of the things that allows the Windows 8 Registered I/O Networking Extensions, RIO, to perform better than normal Winsock calls is the fact that the memory used for I/O operations is pre-registered with the API. This allows RIO to do all the necessary checks that the buffer memory is valid, etc. once, and then lock the buffer in memory until you de-register it. Compare this to normal Winsock networking where the memory needs to be checked and locked on each operation and already we have a whole load of work that simply isn't required for each I/O operation. As always, take a look at &lt;a href="http://channel9.msdn.com/Events/BUILD/BUILD2011/SAC-593T" target="_blank"&gt;this video from Microsoft's BUILD conference&lt;/a&gt; for more in-depth details.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;RIO buffers need to be registered before use&lt;/h2&gt;
&lt;div&gt;
The recommended way to use &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/hh437199(v=vs.85).aspx" target="_blank"&gt;&lt;code&gt;RIORegisterBuffer()&lt;/code&gt;&lt;/a&gt; is to register large buffers and then use smaller slices from these buffers in your I/O calls, rather than registering each individual I/O buffer separately. This reduces the book-keeping costs as each registered buffer requires some memory to track its registration. It's also sensible to use page-aligned memory for buffers that you register with &lt;code&gt;RIORegisterBuffer()&lt;/code&gt;. The operating system locks memory a page at a time, so a buffer that is not aligned on a page boundary locks, in full, every page that it touches. This is especially important given that there's &lt;a href="http://technet.microsoft.com/en-us/library/cc959494.aspx" target="_blank"&gt;a limit to the number of I/O pages that can be locked at one time&lt;/a&gt; and I would imagine that buffers registered with &lt;code&gt;RIORegisterBuffer()&lt;/code&gt; count against this limit.
&lt;/div&gt;
        &lt;div&gt;
Here we can see some common problems when you fail to take the locking granularity of the system into consideration when registering I/O buffers for RIO.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;a href="http://www.serverframework.com/asynchronousevents/assets_c/2011/10/RIO-Buffer-Alignment-301.html" onclick="window.open('http://www.serverframework.com/asynchronousevents/assets_c/2011/10/RIO-Buffer-Alignment-301.html','popup','width=850,height=317,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false"&gt;&lt;img src="http://www.serverframework.com/asynchronousevents/assets_c/2011/10/RIO-Buffer-Alignment-thumb-500x186-301.png" width="500" height="186" alt="RIO-Buffer-Alignment.png" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /&gt;&lt;/a&gt;
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To avoid locking more memory than you need to, always align your buffers by allocating with &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/aa366887(v=vs.85).aspx" target="_blank"&gt;&lt;code&gt;VirtualAlloc()&lt;/code&gt;&lt;/a&gt; or &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/aa366891(v=VS.85).aspx" target="_blank"&gt;&lt;code&gt;VirtualAllocExNuma()&lt;/code&gt;&lt;/a&gt;. Note that both work in terms of pages of memory, even though the allocation size is specified in bytes. Also note the alignment restrictions: the start of each allocated block will be on a boundary determined by the operating system's allocation granularity (see &lt;a href="http://blogs.msdn.com/b/oldnewthing/archive/2003/10/08/55239.aspx" target="_blank"&gt;here&lt;/a&gt; for why). You can obtain both the page size and the allocation granularity from a call to &lt;code&gt;GetSystemInfo()&lt;/code&gt;.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
So, to allocate buffers for RIO efficiently you should use &lt;code&gt;VirtualAlloc()&lt;/code&gt; to ensure alignment and you should allocate blocks which are multiples of the operating system's allocation granularity (or you'll be leaving unusable holes in your memory area). Some code like this might work:
&lt;/div&gt;
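&lt;pre class="brush: cpp gutter: false"&gt;SYSTEM_INFO systemInfo;

GetSystemInfo(&amp;amp;systemInfo);

// blocks returned by VirtualAlloc() start on an allocation granularity boundary
const DWORD gran = systemInfo.dwAllocationGranularity;

// RoundUp() is assumed to round requestedSize up to the next multiple of gran
const DWORD bufferSize = RoundUp(requestedSize, gran);

// reserve and commit the whole block in one call
char *pBuffer = reinterpret_cast&amp;lt;char *&amp;gt;(VirtualAlloc(
   0,
   bufferSize,
   MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE));

if (pBuffer)
{
   // rio is the RIO_EXTENSION_FUNCTION_TABLE obtained via WSAIoctl()
   RIO_BUFFERID id = rio.RIORegisterBuffer(pBuffer, bufferSize);

   if (id == RIO_INVALID_BUFFERID)
   {
      // handle error, and release the memory with VirtualFree()
   }
}
else
{
   // handle allocation failure
}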
&lt;/pre&gt;
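&lt;div&gt;
The &lt;code&gt;RoundUp()&lt;/code&gt; helper used above isn't a Windows API; it's something that you'd supply yourself. A minimal sketch might look like this, assuming that the granularity is never zero and that the rounded value fits in the type. The name and signature are hypothetical, chosen only to match the call above:
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;// Hypothetical helper: rounds value up to the next multiple of granularity.
// Assumes granularity is non-zero and that the result does not overflow.
unsigned long RoundUp(
   const unsigned long value,
   const unsigned long granularity)
{
   const unsigned long remainder = value % granularity;

   if (remainder == 0)
   {
      return value;   // already an exact multiple
   }

   return value + (granularity - remainder);
}
&lt;/pre&gt;
&lt;div&gt;
With a 64K allocation granularity a request for 1 byte or for 65536 bytes gives 65536, and a request for 65537 bytes gives 131072.
&lt;/div&gt;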
&lt;h2 class="entry-body"&gt;Buffer slices&lt;/h2&gt;
&lt;div&gt;
Once you've registered your buffer you then access it using &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/hh437219(v=vs.85).aspx" target="_blank"&gt;&lt;code&gt;RIO_BUF&lt;/code&gt;&lt;/a&gt;s. The &lt;code&gt;RIO_BUF&lt;/code&gt; is a handle to a "slice" of a RIO buffer and is very similar to the familiar &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms741542(v=vs.85).aspx" target="_blank"&gt;&lt;code&gt;WSABUF&lt;/code&gt;&lt;/a&gt; that's used in normal Winsock calls. It's simply a buffer id, a start offset and a length. You might decide that you'll allocate one block of memory for your I/O buffers and that would be 64K in size (since 64K is the allocation granularity on x64 and x86) and then perform I/O in terms of fixed sized buffers of 4096 bytes, or whatever. You could then have a series of &lt;code&gt;RIO_BUF&lt;/code&gt;s, the first points to offset 0 and has a length of 4096, the second points to offset 4096, etc.
&lt;/div&gt;
&lt;a href="http://www.serverframework.com/asynchronousevents/assets_c/2011/10/RIO-Buffer-Slices-304.html" onclick="window.open('http://www.serverframework.com/asynchronousevents/assets_c/2011/10/RIO-Buffer-Slices-304.html','popup','width=714,height=240,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false"&gt;&lt;img src="http://www.serverframework.com/asynchronousevents/assets_c/2011/10/RIO-Buffer-Slices-thumb-500x168-304.png" width="500" height="168" alt="RIO-Buffer-Slices.png" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /&gt;&lt;/a&gt;
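&lt;div&gt;
The arithmetic for carving a registered block into fixed sized slices is simple enough to sketch. The &lt;code&gt;Slice&lt;/code&gt; structure and the two functions below are hypothetical stand-ins for illustration only; a real &lt;code&gt;RIO_BUF&lt;/code&gt; also carries the buffer id returned by &lt;code&gt;RIORegisterBuffer()&lt;/code&gt;:
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;// Hypothetical stand-in for the offset and length parts of a RIO_BUF;
// the real structure also carries the RIO_BUFFERID of the registered buffer.
struct Slice
{
   unsigned long offset;
   unsigned long length;
};

// Number of whole fixed sized slices that fit in a registered block.
unsigned long SliceCount(
   const unsigned long blockSize,
   const unsigned long sliceSize)
{
   return blockSize / sliceSize;
}

// Describes the slice at the given zero based index.
Slice SliceAt(
   const unsigned long index,
   const unsigned long sliceSize)
{
   Slice slice;

   slice.offset = index * sliceSize;
   slice.length = sliceSize;

   return slice;
}
&lt;/pre&gt;
&lt;div&gt;
For a 64K block and 4096 byte slices this gives 16 slices, with the second starting at offset 4096 and the last at offset 61440.
&lt;/div&gt;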
&lt;div&gt;
Alternatively you might write a memory allocator that sits over your RIO registered buffers and allocates portions on demand for whatever size you need, although personally I favour using fixed sized I/O buffers. Many server designs may have messages which are naturally limited in size and work fine with single fixed sized buffers for send and receive operations. When your data is larger than a single buffer you can chain buffers together. For sending you can then send the chain using scatter/gather I/O and for receiving you can simply fill a buffer, allocate another and chain the buffers together so that you can process complete messages. We've been using fixed sized buffers in The&amp;nbsp;Server&amp;nbsp;Framework for over 10 years and they work just fine in general purpose situations as long as you can select a buffer size that makes sense for most operations. The one potential reason to want to use dynamic buffers for RIO is that, whilst the headers show that &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/hh437213(v=vs.85).aspx" target="_blank"&gt;&lt;code&gt;RIOSend()&lt;/code&gt;&lt;/a&gt; supports scatter/gather I/O, the documentation says that it doesn't and the Microsoft BUILD video implies that the extra parameters are 'reserved'. Even so, with RIO's strict limits on the number of outstanding operations (and the number of buffers used with those operations) using our current "send a chain of buffers" style of design may not work so well with RIO. We'll see. For now I'll be sticking to fixed sized buffers as I don't believe that placing a custom memory allocator over the top of the RIO buffer will perform well enough; it may well be that we end up with an allocator that can allocate a variety of fixed sizes from different pools...
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;An augmented RIO_BUF&lt;/h2&gt;
&lt;div&gt;
The &lt;code&gt;RIO_BUF&lt;/code&gt; is too simple a structure to manage complex buffering requirements. Since we'll likely be using I/O completion port based completion handling in our RIO code we'll need a way to manage the lifetime of the buffer slices that we're using. For example, the slice is active from the moment we allocate it to put outbound data into it, until the write completion occurs, at which time it could be returned to the allocator, or, reused directly. I've found that reference counting works well for this, it also allows for flexible server designs as, when reading, you can easily manage the extended lifetime of a buffer that you need to place in a chain, or pass off to another thread for processing. You can see the &lt;code&gt;OVERLAPPED&lt;/code&gt; based buffer interface that we use for IOCP servers in The&amp;nbsp;Server&amp;nbsp;Framework &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/class_jet_byte_tools_1_1_i_o_1_1_i_buffer.html" target="_blank"&gt;here&lt;/a&gt;, I imagine that the RIO buffer interface that I come up with will be similar, but somewhat simpler and without the &lt;code&gt;OVERLAPPED&lt;/code&gt;. So, we now have a RIO buffer slice that's managed by a separately allocated 'buffer' object.
&lt;/div&gt;
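&lt;div&gt;
To make the lifetime idea concrete, here's a rough sketch of a reference counted buffer object. Everything in it (&lt;code&gt;SliceBuffer&lt;/code&gt;, &lt;code&gt;Allocate()&lt;/code&gt;, &lt;code&gt;AddRef()&lt;/code&gt;, &lt;code&gt;Release()&lt;/code&gt;) is hypothetical and it is not the interface from The&amp;nbsp;Server&amp;nbsp;Framework. It also isn't thread safe as written; code driven by an I/O thread pool would need interlocked operations, or similar, on the count:
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;// Hypothetical buffer object that manages the lifetime of one slice.
// Not thread safe as written; real IOCP code would need interlocked
// operations (or similar) on the count.
struct SliceBuffer
{
   long refCount;
   bool inUse;

   // Called when the slice is handed out by the allocator.
   void Allocate()
   {
      refCount = 1;
      inUse = true;
   }

   // Take an extra reference, e.g. before issuing a write or
   // passing the buffer to another thread.
   void AddRef()
   {
      ++refCount;
   }

   // Drop a reference; the slice can be reused once the last one goes.
   void Release()
   {
      if (--refCount == 0)
      {
         inUse = false;   // a real allocator would return it to its pool here
      }
   }
};
&lt;/pre&gt;
&lt;div&gt;
The slice stays in use for as long as anything holds a reference, so a write completion and a processing thread can each release their own reference without needing to know about the other.
&lt;/div&gt;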
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Buffer slice management&lt;/h2&gt;
&lt;div&gt;
Since our buffer allocator will provide fixed sized buffers and work in terms of large blocks of memory for registering with RIO, it's probably also worth allocating a single large block of memory for all of the corresponding buffer objects that will manage the slices. We would then use placement new to allocate each of the objects in the single large block. We'd end up with two blocks of memory, like this:
&lt;/div&gt;
&lt;a href="http://www.serverframework.com/asynchronousevents/assets_c/2011/10/RIO-Buffers-307.html" onclick="window.open('http://www.serverframework.com/asynchronousevents/assets_c/2011/10/RIO-Buffers-307.html','popup','width=821,height=322,scrollbars=no,resizable=no,toolbar=no,directories=no,location=no,menubar=no,status=no,left=0,top=0'); return false"&gt;&lt;img src="http://www.serverframework.com/asynchronousevents/assets_c/2011/10/RIO-Buffers-thumb-500x196-307.png" width="500" height="196" alt="RIO-Buffers.png" class="mt-image-center" style="text-align: center; display: block; margin: 0 auto 20px;" /&gt;&lt;/a&gt;
&lt;div&gt;
The first would contain all of the buffer objects that manage the reference counts on the slices, hold the&amp;nbsp;&lt;code style="border-style: initial; border-color: initial; "&gt;RIO_BUF&lt;/code&gt;&amp;nbsp;structures and contain any book-keeping data we might need; the second would be registered with RIO and locked in memory. The allocator would then allocate new blocks of memory as it needs them and, if the buffer objects also manage a reference count on the allocator's data blocks, it could de-register and release buffers when they're no longer in use (as long as it's pooling enough buffers for later use...).
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
One thing to bear in mind is that my design here is naturally driven towards being flexible and general purpose. The reason for this is that that's the way The&amp;nbsp;Server&amp;nbsp;Framework works; yes, it's always possible to get better performance from code that's designed specifically for one single server, but with our framework you can get a working server up and running very quickly and, in most situations, you'll find that you don't need to tune the code much at all for your specific server. Also I find that it's easier to get started this way...
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
As an alternative to a buffer allocator which manages pools of RIO buffers we might instead want to allocate buffers per connection. After all, RIO limits the number of operations that can be pending at one time on a connection, so we could, in some situations, know exactly how many buffers we might need and assign them to the connection when it's established. Whilst this may perform better in some situations, per socket buffer pooling is something that can be added on top of a pooling buffer allocator (and this is currently being investigated in The&amp;nbsp;Server&amp;nbsp;Framework anyway), so we'll look at it later if necessary.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
So, that's the idea... More on this when I have some code.
&lt;/div&gt;

</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-buffer-strategies.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O and I/O Completion Ports - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/ka4-MkuQ7s8/windows-8-registered-io-and-io-completion-ports.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1146</id>

    <published>2011-10-25T17:56:24Z</published>
    <updated>2012-03-13T08:35:08Z</updated>

    <summary> In my last blog post I introduced the Windows 8 Registered I/O Networking Extensions, RIO. As I explained there are three ways to retrieve completions from RIO, polled, event driven and via an I/O Completion Port (IOCP). This makes...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
In my last blog post &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;I introduced the Windows 8 Registered I/O Networking Extensions&lt;/a&gt;, RIO. As I explained, there are three ways to retrieve completions from RIO: polled, event driven and via an I/O Completion Port (IOCP). This makes RIO pretty flexible and allows it to be used in many different designs of servers. The polled scenario is likely aimed at very high performance UDP or High Frequency Trading style situations where you may be happy to burn CPU so as to process inbound datagrams as fast as possible. The event driven style may also help here, allowing you to wait efficiently rather than spin, but it's the IOCP style that interests me most at present as this promises to provide increased performance to more general purpose networking code.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Please bear in mind the caveats from my last blog post, this stuff is new, I'm still finding my way, the docs aren't in sync with the headers in the SDK and much of this is based on assumption and intuition. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;How do RIO and IOCP work together?&lt;/h2&gt;
&lt;div&gt;
RIO's completions arrive via a completion queue, which is a fixed sized data structure that is shared between user space and kernel space (via locked memory?) and which does not require a kernel mode transition to dequeue from (see &lt;a href="http://channel9.msdn.com/Events/BUILD/BUILD2011/SAC-593T" target="_blank"&gt;this BUILD video for more details on RIO's internals&lt;/a&gt;). As I showed last time, you specify how you want to retrieve completions when you create the queue, either providing an event to be signalled, an IOCP to be posted to or nothing if you will simply poll the queue. When using an IOCP you get a notification sent to you when the completion queue is no longer empty after you have indicated that you want to receive completions by calling &lt;code&gt;RIONotify()&lt;/code&gt;.
&lt;/div&gt;  
        &lt;div&gt;
Simplified code for handling an IOCP driven RIO completion queue might look like this:
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;if (::GetQueuedCompletionStatus(
   hIOCP, 
   &amp;amp;numberOfBytes,
   &amp;amp;completionKey,
   &amp;amp;pOverlapped,
   INFINITE))
{
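   const DWORD numResults = 10;

   RIORESULT results[numResults];

   // dequeuing does not need a kernel mode transition
   // (production code should also check the result for RIO_CORRUPT_CQ)
   ULONG numCompletions = rio.RIODequeueCompletion(
      queue,
      results,
      numResults);

   // keep going until the completion queue is empty
   while (numCompletions)
   {
      for (ULONG i = 0; i &amp;lt; numCompletions; ++i)
      {
         // deal with request completion...
      }

      numCompletions = rio.RIODequeueCompletion(
         queue,
         results,
         numResults);
   }

   // request another IOCP notification for when the queue is next non-empty
   rio.RIONotify(queue);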
}&lt;/pre&gt;
&lt;div&gt;
Of course, in real code you'd likely use the &lt;code&gt;completionKey&lt;/code&gt; to pass details of the queue that's being operated on, a pointer to a structure that you can plug the queue into once you've created it, perhaps.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Anyway, the gist of it is that once the RIO completion queue is not empty you will get an IOCP completion if you have called &lt;code&gt;RIONotify()&lt;/code&gt; but you will not get another IOCP completion until you call &lt;code&gt;RIONotify()&lt;/code&gt; again. At first this seems a little strange. After all, the completion queue could send another IOCP completion when it is, or becomes, non-empty once you have called &lt;code&gt;RIODequeueCompletion()&lt;/code&gt; once. However, having to call &lt;code&gt;RIONotify()&lt;/code&gt; to explicitly request a new notification is probably a good thing. It places you in complete control of which threads are currently accessing the RIO completion queue and given the nature of RIO completion queues this also means that by using the pattern above you can be sure that only one thread is processing completions for a given socket at a given time. Of course if you are using separate RIO completion queues for send and receive operations then you may have one IOCP thread processing send completions and one processing receive completions at the same time. If you use just one RIO completion queue for both send and receive then you can be sure that only this thread is currently processing completions for a given socket. This is different to the normal IOCP model where all of the threads in your I/O pool could be processing completions for the same socket if enough operations have completed.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Why is this a good thing?&lt;/h2&gt;
&lt;div&gt;
This behaviour is good because it means that, hopefully, you need do nothing clever to retain sequencing in completions. Assuming reads and writes complete to the RIO completion queue in the expected order (and I'd be very surprised if they didn't) then the fact that you can guarantee that you only have one thread processing completions for a given socket means that you're guaranteed, at that point, at least, that the completions are in the correct order. With IOCP the completions are placed in the queue in order but the fact that one or more threads from your I/O pool could be processing completions for the same socket simultaneously means that you need to actively ensure that the completions are processed in order (if that matters to you, and, more often than not, it does). This is more important in RIO as (according to &lt;a href="http://channel9.msdn.com/Events/BUILD/BUILD2011/SAC-593T" target="_blank"&gt;the RIO BUILD video&lt;/a&gt;) your socket's send and receive buffers are not used and so you need to have ample receives pending to ensure that you don't stall your TCP connection or lose UDP datagrams. With TCP, multiple pending receives require sequencing to ensure the stream is processed in the correct order.  
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
It's also good because, quite frankly, there's no need to be notified again until you've drained the RIO completion queue and if completions keep arriving you get to stay on one thread and dequeue them. As I mentioned earlier, dequeuing completions doesn't involve a kernel mode transition and so we've suddenly switched to a polled design where we only need a kernel mode transition when we run out of work to do. 
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Why is this a bad thing?&lt;/h2&gt;
&lt;div&gt;
Unfortunately this behaviour may make it a little harder to get full utilisation of all your I/O threads. You need to make sure that you have enough RIO completion queues so that each of your I/O threads can do some work and you need to hope that you've spread your connections across your queues in such a way that one RIO completion queue (i.e. a subset of connections) doesn't have more work to do than others. I expect the way that a general purpose framework assigns new connections to RIO completion queues and how many queues and of what size will all be things that I'll work out over time with some experimentation. If you have more RIO completion queues than you do I/O threads then you also need to be careful that you don't stick your thread to one completion queue by looping for long periods on a single RIO completion queue; I expect that a configurable limit on how many RIO results to process per IOCP completion would do the trick.
&lt;/div&gt;
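&lt;div&gt;
A limit like that might be sketched as follows. This deliberately avoids the real RIO calls: &lt;code&gt;FakeCompletionQueue&lt;/code&gt;, &lt;code&gt;Dequeue()&lt;/code&gt; and &lt;code&gt;DrainWithLimit()&lt;/code&gt; are hypothetical stand-ins that only count results, and the limit is expressed as a maximum number of dequeue batches per IOCP completion (so at most &lt;code&gt;maxBatches&lt;/code&gt; times &lt;code&gt;batchSize&lt;/code&gt; results), which is just one of several ways you could choose to express it:
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;// Hypothetical stand-in for a RIO completion queue; it only counts results.
struct FakeCompletionQueue
{
   unsigned long pending;

   // Removes up to maxResults results and returns how many were removed.
   unsigned long Dequeue(const unsigned long maxResults)
   {
      unsigned long count = 0;

      while (count != maxResults)
      {
         if (pending == 0)
         {
            break;
         }

         --pending;
         ++count;
      }

      return count;
   }

   // Drains the queue in batches but gives up after maxBatches batches so
   // that the thread can request notification again and service other queues.
   unsigned long DrainWithLimit(
      const unsigned long batchSize,
      const unsigned long maxBatches)
   {
      unsigned long processed = 0;
      unsigned long batches = 0;

      while (batches != maxBatches)
      {
         const unsigned long count = Dequeue(batchSize);

         if (count == 0)
         {
            break;   // queue drained
         }

         processed += count;   // deal with the results here
         ++batches;
      }

      return processed;
   }
};
&lt;/pre&gt;
&lt;div&gt;
With 25 results pending, a batch size of 10 and a limit of 2 batches, the first call processes 20 results and leaves 5 for the next notification.
&lt;/div&gt;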
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Something else to think about is the fact that now if an I/O thread blocks before it has called &lt;code&gt;RIONotify()&lt;/code&gt; then you're blocking all of the sockets that are associated with the RIO completion queue that you're currently processing, not just a single connection.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Wrapping up&lt;/h2&gt;
&lt;div&gt;
It looks pretty easy to scale RIO completion processing using IOCP notification but the details are not going to be the same as you're used to with normal IOCP completions. Each IOCP completion represents a potentially infinite block of RIO results for a subset of your connections. Expect your RIO architectures to be familiar, yet different to what you're doing now with IOCP.
&lt;/div&gt;
</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-and-io-completion-ports.html</feedburner:origLink></entry>

<entry>
    <title>Inside the Windows 8 Registered I/O Extensions, RIO - Rambling Comments</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/Pm3wq3Kmhtg/inside-the-windows-8-registered-io-extensions-rio.html" />
    <id>tag:www.lenholgate.com,2011:/blog//12.1144</id>

    <published>2011-10-24T20:30:38Z</published>
    <updated>2012-03-16T21:47:03Z</updated>

    <summary> Before I started to look at RIO for inclusion in The Server Framework I did a quick check on the Microsoft BUILD site to see if there were any sessions that dealt with it specifically, I didn't find any....</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Socket Servers" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        &lt;div&gt;
Before I started to look at &lt;a href="http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html"&gt;RIO for inclusion in The Server Framework&lt;/a&gt; I did a quick check on the &lt;a href="http://www.microsoft.com/presspass/events/build/" target="_blank"&gt;Microsoft BUILD&lt;/a&gt; site to see if there were any sessions that dealt with it specifically; I didn't find any. Once I'd posted my blog post I did another check and &lt;a href="http://channel9.msdn.com/Events/BUILD/BUILD2011/SAC-593T" target="_blank"&gt;found this video that deals specifically with RIO&lt;/a&gt;. This gives some in-depth details of how RIO works and the kinds of performance improvements that Microsoft has witnessed in their labs. It's interesting and impressive.
&lt;/div&gt; 
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
One thing that I hadn't realised is that with RIO, &lt;code&gt;SO_SNDBUF&lt;/code&gt; and &lt;code&gt;SO_RCVBUF&lt;/code&gt; are no longer applicable; it's as if you're operating with them set to zero. You need to always have recvs pending if you don't want to drop datagrams...
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
More once I've watched it all...
&lt;/div&gt;
        
</content>
<feedburner:origLink>http://www.lenholgate.com/blog/2011/10/inside-the-windows-8-registered-io-extensions-rio.html</feedburner:origLink></entry>

<entry>
    <title>Windows 8 Registered I/O Networking Extensions - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/qx83mdo1voM/windows-8-registered-io-networking-extensions.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1145</id>

    <published>2011-10-24T15:50:35Z</published>
    <updated>2012-03-13T08:35:21Z</updated>

    <summary> Most of the buzz being generated around the Windows 8 Developer Previews at the moment seems to be centred on the new Metro user interface and the Windows Runtime. Whilst both Metro and WinRT are key components of the...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Development" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Winsock Registered I/O" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
Most of the buzz being generated around the Windows 8 Developer Previews at the moment seems to be centred on the new Metro user interface and the Windows Runtime. Whilst both Metro and WinRT are key components of the next Windows release I find the &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms740642(v=VS.85).aspx" target="_blank"&gt;Registered I/O Networking Extensions&lt;/a&gt; to be far more interesting, but then I guess I would...
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;What are the Registered I/O Networking Extensions?&lt;/h2&gt;
&lt;div&gt;
The Registered I/O Networking Extensions, RIO, is a new API that has been added to Winsock to support high-speed networking for increased networking performance with lower latency and jitter. These extensions are targeted primarily for server applications and use pre-registered data buffers and completion queues to increase performance. I assume that the increased performance comes from avoiding the need to lock memory pages and copy &lt;code&gt;OVERLAPPED&lt;/code&gt; structures into kernel space when individual requests are issued, instead relying on pre-locked buffers, fixed sized completion queues, optional event notification on completions and the ability to return multiple completions from kernel space to user space in one go.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The RIO API is pretty simple and straightforward but servers that currently use I/O Completion Port based designs will need to change somewhat to take advantage of it and probably not all server designs will benefit from changing. RIO relies on you registering the memory that you will use as data buffers and knowing in advance how many pending operations a given socket will have at any time. This allows it to lock the data buffers in memory once, rather than on each operation, and removes the whole concept of the &lt;code&gt;OVERLAPPED&lt;/code&gt; structure from the user space API. Since completion queue space is also of a fixed size you're also required to know how many sockets you will be allocating to a given queue and the maximum number of pending operations that these sockets will have. You can increase all of these limits after socket creation but, except for registering new data buffers, I expect that you're likely to take a performance hit for doing so.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
I've been looking at the pre-release documentation and the headers from the latest Windows SDK and experimenting with the new RIO API. Note that at present the documentation is out of sync with the headers and there's little more than reference documentation so much of what I have to say about RIO is based on assumptions and intuition based on the available information and my knowledge of how I/O Completion Port based networking currently works on pre Windows 8 operating systems. In other words, don't rely on all of this to be correct.
&lt;/div&gt;

        &lt;h2 class="entry-body"&gt;How do you access the RIO API?&lt;/h2&gt;
&lt;div&gt;
RIO is a Microsoft-specific extension to the Windows Sockets specification in the same way that &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms737524(v=vs.85).aspx" target="_blank"&gt;AcceptEx()&lt;/a&gt;, &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms737606(v=vs.85).aspx" target="_blank"&gt;ConnectEx()&lt;/a&gt;, etc. are and the API is accessed in a similar way. You don't link to the functions directly, you obtain them via a call to &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms741621(v=vs.85).aspx" target="_blank"&gt;WSAIoctl()&lt;/a&gt;. Since RIO presents an API rather than a single extension function, and that API is either available as a whole or not, you obtain a table of the API's function pointers rather than individual function pointers as with &lt;code&gt;AcceptEx()&lt;/code&gt; and &lt;code&gt;ConnectEx()&lt;/code&gt; etc. You do this by calling &lt;code&gt;WSAIoctl()&lt;/code&gt; with an opcode of &lt;code&gt;SIO_GET_MULTIPLE_EXTENSION_FUNCTION_POINTER&lt;/code&gt; and an id of &lt;code&gt;WSAID_MULTIPLE_RIO&lt;/code&gt;. The result is a populated &lt;code&gt;RIO_EXTENSION_FUNCTION_TABLE&lt;/code&gt;, see &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/hh437226(v=VS.85).aspx" target="_blank"&gt;here&lt;/a&gt; for more details.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
It took me a few attempts to get the &lt;code&gt;WSAIoctl()&lt;/code&gt; call to work as this is the first extension API and the first use of &lt;code&gt;SIO_GET_MULTIPLE_EXTENSION_FUNCTION_POINTER&lt;/code&gt; and I was unable to find any examples of its usage. Anyway, your call to &lt;code&gt;WSAIoctl()&lt;/code&gt; should look like this:
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;RIO_EXTENSION_FUNCTION_TABLE rio;

GUID functionTableId = WSAID_MULTIPLE_RIO;

DWORD dwBytes = 0;

if (0 != WSAIoctl(
   s,
   SIO_GET_MULTIPLE_EXTENSION_FUNCTION_POINTER,
   &amp;amp;functionTableId,
   sizeof(GUID),
   (void**)&amp;amp;rio,
   sizeof(rio),
   &amp;amp;dwBytes,
   0,
   0))
{
   const DWORD lastError = ::WSAGetLastError();

   // handle error...
}
else
{
   // all ok, we have access to RIO
}
&lt;/pre&gt;
&lt;div&gt;
Note that you can use the &lt;code&gt;cbSize&lt;/code&gt; member of the function table to detect additions to the API if it is changed in later versions of Windows.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;First impressions of the RIO API&lt;/h2&gt;
&lt;div&gt;
Looking at &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/hh437226(v=VS.85).aspx" target="_blank"&gt;the preliminary on-line documentation&lt;/a&gt; a couple of things immediately jumped out at me:
&lt;ul&gt;&lt;li&gt;None of &lt;code&gt;RIOReceive()&lt;/code&gt;, &lt;code&gt;RIOReceiveEx()&lt;/code&gt;, &lt;code&gt;RIOSend()&lt;/code&gt; and &lt;code&gt;RIOSendEx()&lt;/code&gt; support scatter/gather I/O. That is, they all take a single &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/hh437219(v=VS.85).aspx" target="_blank"&gt;&lt;code&gt;RIO_BUF&lt;/code&gt;&lt;/a&gt; structure rather than a chain of them. Standard Winsock send and recv functions take chains of &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms741542(v=vs.85).aspx" target="_blank"&gt;&lt;code&gt;WSABUF&lt;/code&gt;&lt;/a&gt; structures allowing for scatter/gather I/O.&lt;/li&gt;
&lt;li&gt;Completions are signalled via an event. Each completion queue can have its own event associated with it and the pattern for retrieving completions appears to be: issue operations, wait on the event, retrieve completions. Whilst this is most likely the most performant method for small numbers of connections, it leaves you to scale it yourself, which is likely to be non-trivial.&lt;/li&gt;
&lt;/ul&gt;
Luckily the header files are not consistent with the documentation. They DO include support for scatter/gather I/O and they allow completion notification either via an event or via IOCP, so I'm pretty sure that the code is right and the docs are wrong. Anyway, I'm getting ahead of myself...
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;How does RIO work?&lt;/h2&gt;
&lt;div&gt;
As I mentioned above, RIO provides increased performance by working with pre-locked buffers, fixed sized completion queues and reduced user mode to kernel mode transitions. You enable the RIO extensions on a socket by creating the socket with the &lt;code&gt;WSA_FLAG_REGISTERED_IO&lt;/code&gt; flag. It seems that this can be combined with &lt;code&gt;WSA_FLAG_OVERLAPPED&lt;/code&gt;, which is convenient as RIO provides no alternatives to &lt;code&gt;AcceptEx()&lt;/code&gt; and &lt;code&gt;ConnectEx()&lt;/code&gt;, and so it's likely that your sockets will require both &lt;code&gt;WSA_FLAG_REGISTERED_IO&lt;/code&gt; AND &lt;code&gt;WSA_FLAG_OVERLAPPED&lt;/code&gt;.
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;SOCKET s = ::WSASocket(
   AF_INET,
   SOCK_STREAM,
   IPPROTO_TCP,
   NULL,
   0,
   WSA_FLAG_REGISTERED_IO);
&lt;/pre&gt;
&lt;div&gt;
Once you have your socket you need to create a registered I/O socket descriptor on the socket. You do this by calling &lt;code&gt;RIOCreateRequestQueue()&lt;/code&gt;. The documentation for this function is out of sync with the headers; the call actually looks like this:
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;ULONG maxOutstandingReceive = 10;
ULONG maxReceiveDataBuffers = 1;
ULONG maxOutstandingSend = 10;
ULONG maxSendDataBuffers = 2;

void *pContext = 0;

RIO_RQ requestQueue = rio.RIOCreateRequestQueue(
   s,
   maxOutstandingReceive,
   maxReceiveDataBuffers,
   maxOutstandingSend,
   maxSendDataBuffers,
   recvQueue,
   sendQueue,
   pContext);
&lt;/pre&gt;
&lt;div&gt;
This is where you place limits on the number of outstanding requests (and the number of buffers that can be used with each request) and where you associate your per-connection context that will be returned to you with each completion. This context is the same as the "completion key" in regular IOCP designs. You also need a receive queue and a send queue (you can use one queue for both); these queues are created by a call to &lt;code&gt;RIOCreateCompletionQueue()&lt;/code&gt;. Again the documentation is out of sync; the call looks like this:
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;HANDLE hEvent = WSACreateEvent();

RIO_NOTIFICATION_COMPLETION type;

type.Type = RIO_EVENT_COMPLETION;
type.Event.EventHandle = hEvent;
type.Event.NotifyReset = TRUE;

RIO_CQ queue = rio.RIOCreateCompletionQueue(queueSize, &amp;amp;type);
&lt;/pre&gt;
&lt;div&gt;
This creates a completion queue of a specified size which uses an event to signal that it's no longer empty. When you create a request queue the number of outstanding operations is used to ensure that the queue size is suitable for all the sockets associated with it. You can resize both completion queues and request queues at a later time if you need to, but I would imagine that it's better not to base your design on doing so.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
As an alternative you can use an IOCP for completion notification.
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;HANDLE hIOCP = CreateIoCompletionPort(
   INVALID_HANDLE_VALUE,
   0,
   0,
   0);

OVERLAPPED overlapped;

RIO_NOTIFICATION_COMPLETION type;

type.Type = RIO_IOCP_COMPLETION;
type.Iocp.IocpHandle = hIOCP;
type.Iocp.CompletionKey = pCompletionKey;
type.Iocp.Overlapped = &amp;amp;overlapped;

RIO_CQ queue = rio.RIOCreateCompletionQueue(queueSize, &amp;amp;type);
&lt;/pre&gt;
&lt;div&gt;
This makes it easier to scale the use of RIO with a pool of IOCP threads processing completions from one or more queues. Of course the overlapped structure should likely be dynamically allocated so that it can last until the queue is closed.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The header file parameter annotations suggest that the completionType parameter is optional, thus there's a third way to create a completion queue.
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;RIO_CQ queue = rio.RIOCreateCompletionQueue(queueSize, 0);
&lt;/pre&gt;
&lt;div&gt;
Which seems to provide a polled interface...
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Receiving data using RIO&lt;/h2&gt;
&lt;div&gt;
Once the socket is connected using normal connection methods you can send and receive using RIO. The two receive functions available in the Windows Developer Preview SDK differ from the documentation in that they DO support scatter/gather I/O. The simplest recv call looks like this:
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;RIO_BUF buffer;

buffer.BufferId = id;
buffer.Offset = 0;
buffer.Length = buffer1Size;

DWORD flags = 0;

void *pOperationContext = 0;

rio.RIOReceive(requestQueue,
   &amp;amp;buffer,
   1,
   flags,
   pOperationContext);
&lt;/pre&gt;
&lt;div&gt;
The &lt;code&gt;RIO_BUF&lt;/code&gt; structure allows us to create a buffer slice from a registered data buffer. This lets us register large buffers, which is more efficient, and lets us slice them up into blocks that are more appropriate to use. A buffer is registered like this:
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;const DWORD bufferSize = 4096;

char *pBuffer = new char[bufferSize];

RIO_BUFFERID id = rio.RIORegisterBuffer(pBuffer, bufferSize);
&lt;/pre&gt;
&lt;div&gt;
Note that, of course, the buffer should outlive the buffer registration and that it would be better to allocate memory that is page aligned, using &lt;code&gt;VirtualAlloc()&lt;/code&gt;, as buffer registration locks the buffer in memory and the locking granularity is page sized. See &lt;a href="http://msdn.microsoft.com/en-us/library/windows/desktop/hh437199(v=VS.85).aspx" target="_blank"&gt;the documentation for &lt;code&gt;RIORegisterBuffer()&lt;/code&gt;&lt;/a&gt; for more details.
&lt;/div&gt;
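&lt;div&gt;
The arithmetic behind "register one large buffer and slice it up" can be sketched in portable C++. &lt;code&gt;BufferSlice&lt;/code&gt;, &lt;code&gt;RoundUpToPageSize()&lt;/code&gt; and &lt;code&gt;SliceBuffer()&lt;/code&gt; are hypothetical names used for illustration only; &lt;code&gt;BufferSlice&lt;/code&gt; simply mirrors the three &lt;code&gt;RIO_BUF&lt;/code&gt; fields used above and an &lt;code&gt;int&lt;/code&gt; stands in for the registered buffer id. The page size is passed in by the caller; on Windows it would normally be queried rather than assumed.
&lt;/div&gt;

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mirror of the three RIO_BUF fields used in the article;
// the real structure and RIO_BUFFERID come from the Windows SDK headers.
struct BufferSlice
{
   int bufferId;          // stands in for the id returned by registration
   unsigned long offset;  // start of the slice within the registered buffer
   unsigned long length;  // size of the slice in bytes
};

// Rounds size up to a whole number of pages so that a registration does
// not lock a page that is only partially used. pageSize must be non-zero.
inline std::size_t RoundUpToPageSize(
   const std::size_t size,
   const std::size_t pageSize)
{
   assert(pageSize != 0);

   return ((size + pageSize - 1) / pageSize) * pageSize;
}

// Carves one registered buffer into as many whole slices of sliceSize
// bytes as will fit. Any remainder at the end of the buffer is unused.
inline std::vector<BufferSlice> SliceBuffer(
   const int bufferId,
   const unsigned long bufferSize,
   const unsigned long sliceSize)
{
   assert(sliceSize != 0);

   std::vector<BufferSlice> slices;

   for (unsigned long offset = 0;
      offset + sliceSize <= bufferSize;
      offset += sliceSize)
   {
      slices.push_back(BufferSlice{ bufferId, offset, sliceSize });
   }

   return slices;
}
```

&lt;div&gt;
With a 64KB registration and 4KB slices this yields 16 slices, each of which could be handed to a separate read or write without any further registration calls.
&lt;/div&gt;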
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The operation context is your per-operation data; this is what you would have previously used your 'extended' &lt;code&gt;OVERLAPPED&lt;/code&gt; structure for. In a real design this is likely a pointer to a reference counted 'operation data' object which knows about the buffer slices being used and the operation, in this case a read, that is being executed.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
So, what happens when &lt;code&gt;RIOReceive()&lt;/code&gt; completes? Well, if we're dealing with the event based completion mechanism, then, at present, you need to call &lt;code&gt;RIONotify()&lt;/code&gt; to tell the RIO API that you want to receive notifications when completions occur (once again the docs for this function are out of sync with the implementation). You then wait on your event until it's signalled and then call &lt;code&gt;RIODequeueCompletion()&lt;/code&gt; to retrieve completion results. Like &lt;code&gt;GetQueuedCompletionStatusEx()&lt;/code&gt;, you can remove multiple completions in a single call, in this case by passing an array of &lt;code&gt;RIORESULT&lt;/code&gt; structures to the call.
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;RIORESULT result;

ULONG numResults = rio.RIODequeueCompletion(queue, &amp;amp;result, 1);
&lt;/pre&gt;
&lt;div&gt;
Here we request a single completion, but for better performance it's probably best to always work with arrays of completions. We can now wait for another completion but, at least at present, we need to call &lt;code&gt;RIONotify()&lt;/code&gt; again to request more notifications; we can't simply reset our event and wait again. It seems strange that we have to call &lt;code&gt;RIONotify()&lt;/code&gt; manually when &lt;code&gt;RIO_NOTIFICATION_COMPLETION&lt;/code&gt; has a boolean member called &lt;code&gt;NotifyReset&lt;/code&gt; but, at present, we do. I would expect that by setting &lt;code&gt;NotifyReset&lt;/code&gt; to true the act of dequeuing completions would reset the event AND enable further notifications, thus avoiding another potential kernel mode transition.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
There are some flags that you can specify in your RIOReceive() call. The most interesting is &lt;code&gt;RIO_MSG_WAITALL&lt;/code&gt; which causes the recv call to only complete when the buffer slice supplied is full, an error occurs, or the connection is terminated. This would be very useful for servers that work with messages which are of a fixed length, or are framed with a length prefix. By supplying a buffer of the appropriate size and specifying &lt;code&gt;RIO_MSG_WAITALL&lt;/code&gt; you'll get a single completion when the buffer is full. This is considerably better than getting a completion with a partial buffer and then needing to adjust the start position of the buffer and reissue the read so that you can read the rest of the message into the buffer. The reduced number of completions that need to be processed in this scenario, especially with large messages, would likely turn into a huge performance gain.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Note that currently, in a multi-buffer slice read operation, &lt;code&gt;RIO_MSG_WAITALL&lt;/code&gt; will cause a completion to occur when the first buffer slice is full, not when all buffer slices supplied in the call are full.
&lt;/div&gt;
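&lt;div&gt;
To show what &lt;code&gt;RIO_MSG_WAITALL&lt;/code&gt; saves you, here is a rough, portable sketch of the bookkeeping that is needed when a read can complete with only part of a fixed-length message. &lt;code&gt;PartialReadTracker&lt;/code&gt; is a hypothetical helper, not part of RIO, and it deliberately ignores errors and connection closure.
&lt;/div&gt;

```cpp
#include <cassert>

// Hypothetical bookkeeping for reading one fixed-length message into a
// buffer slice when reads may complete with only part of the data.
class PartialReadTracker
{
   public :

      // sliceOffset is where the message starts in the registered buffer
      // and messageLength is the number of bytes that the message needs.
      PartialReadTracker(
         const unsigned long sliceOffset,
         const unsigned long messageLength)
         :  m_sliceOffset(sliceOffset),
            m_messageLength(messageLength),
            m_received(0),
            m_completions(0)
      {
      }

      // Call once per read completion. Returns true once the whole
      // message has arrived; otherwise another read must be issued.
      bool OnReadCompleted(const unsigned long bytesTransferred)
      {
         assert(bytesTransferred <= m_messageLength - m_received);

         m_received += bytesTransferred;
         ++m_completions;

         return m_received == m_messageLength;
      }

      // Offset and length to use for the next read so that it fills the
      // part of the slice that is still empty.
      unsigned long NextOffset() const { return m_sliceOffset + m_received; }
      unsigned long NextLength() const { return m_messageLength - m_received; }

      // Number of completions processed so far for this message.
      unsigned long Completions() const { return m_completions; }

   private :

      const unsigned long m_sliceOffset;
      const unsigned long m_messageLength;
      unsigned long m_received;
      unsigned long m_completions;
};
```

&lt;div&gt;
A 1000 byte message that arrives as 400, 350 and 250 bytes costs three completions and two reissued reads with this approach; with &lt;code&gt;RIO_MSG_WAITALL&lt;/code&gt; and a single buffer slice it should be one completion and no tracking at all.
&lt;/div&gt;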
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Sending data using RIO&lt;/h2&gt;
&lt;div&gt;
Sending data is pretty much the same as receiving it; you pass an array of buffer slices to &lt;code&gt;RIOSend()&lt;/code&gt; and process the completion in the normal way.
&lt;/div&gt;
&lt;pre class="brush: cpp gutter: false"&gt;RIO_BUF sendBuffer;

sendBuffer.BufferId = id;
sendBuffer.Offset = 0;
sendBuffer.Length = bufferSize;

// memcpy your data into the buffer slice...

if (!rio.RIOSend(requestQueue, &amp;amp;sendBuffer, 1, flags, pOperationContext))
{
   // handle error
}
&lt;/pre&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Conclusions&lt;/h2&gt;
&lt;div&gt;
I've only scratched the surface of RIO here and the fact that the documentation is out of sync with the actual implementation means that this could all change before Windows 8 is released, but... Although RIO will likely mean that your design is more complicated than a "normal" IOCP design, I expect the performance gains will be worth it for certain types of networking applications. Being able to pre-lock all your buffers in memory and pre-assign your completion queues likely means that your server will be more robust, with failures due to &lt;a href="http://technet.microsoft.com/en-us/library/cc959494.aspx" target="_blank"&gt;the I/O page lock limit&lt;/a&gt; and non-paged pool exhaustion becoming a thing of the past. Processing completions on dedicated threads using the eventing version of the API is likely to result in higher performance for applications that suit that design, whilst using the IOCP based notification system will scale more easily. I can see two use cases for RIO and I'm sure there are many more. The first is for high performance, jitter free, low latency connections where you have a small number of connections to deal with and want the best performance possible. The second is for servers with many thousands of concurrent connections where the performance gains from streamlined send and recv APIs and the correspondingly reduced kernel transitions lead to higher throughput on all connections.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
We certainly intend to incorporate support for RIO into The Server Framework and right now we're investigating the best way to do this. I'll be blogging more about RIO over the next few weeks, so why not &lt;a href="http://www.serverframework.com/asynchronousevents/atom.xml" target="_blank"&gt;subscribe to our RSS feed&lt;/a&gt; or follow us on &lt;a href="http://twitter.com/ServerFramework" target="_blank"&gt;Twitter&lt;/a&gt;?
&lt;/div&gt;
    &lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=qx83mdo1voM:Fg-xDk0XPc0:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=qx83mdo1voM:Fg-xDk0XPc0:ZC7T4KBF6Nw"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=ZC7T4KBF6Nw" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/LenHolgate/~4/qx83mdo1voM" height="1" width="1"/&gt;</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/10/windows-8-registered-io-networking-extensions.html</feedburner:origLink></entry>

<entry>
    <title>Out of band data, TCP Urgent mode and overlapped I/O - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/WUqCqFiqc3E/out-of-band-data-and-overlapped-io.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1116</id>

    <published>2011-10-20T09:30:00Z</published>
    <updated>2011-10-24T13:58:42Z</updated>

    <summary> Some stream protocols have the concept of 'out of band' (OOB) data. This is a separate logical communication channel between the peers which enables data that is unrelated to the current data in the stream to be sent alongside...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="General" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="New features" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
Some stream protocols have the concept of 'out of band' (OOB) data. This is a separate logical communication channel between the peers which enables data that is unrelated to the current data in the stream to be sent alongside the normal data stream. This is often a way for some data to jump ahead of the normal stream and arrive faster than if it were delivered via the normal data stream.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Winsock supports out of band data in a protocol independent way, see &lt;a href="http://msdn.microsoft.com/en-us/library/ms740102(v=vs.85).aspx" target="_blank"&gt;here&lt;/a&gt;, but accessing it from networking code that uses overlapped I/O rather than the old-fashioned BSD API is somewhat under-documented. By default, out of band data does not appear in the normal data stream; you have to read it explicitly by setting &lt;code&gt;MSG_OOB&lt;/code&gt; in the flags of a call to &lt;a href="http://msdn.microsoft.com/en-us/library/ms741688(v=vs.85).aspx" target="_blank"&gt;WSARecv()&lt;/a&gt;. For non-overlapped I/O designs you can use &lt;a href="http://msdn.microsoft.com/en-us/library/ms741540(v=vs.85).aspx" target="_blank"&gt;WSAAsyncSelect()&lt;/a&gt; or &lt;a href="http://msdn.microsoft.com/en-us/library/ms740141(v=VS.85).aspx" target="_blank"&gt;select()&lt;/a&gt; to explicitly check for the presence of out of band data. With overlapped I/O your options appear limited. It seems that you should be able to use an overlapped &lt;a href="http://msdn.microsoft.com/en-us/library/ms741621(v=vs.85).aspx" target="_blank"&gt;WSAIoctl()&lt;/a&gt; call with &lt;code&gt;SIOCATMARK&lt;/code&gt;, but this returns immediately whether or not OOB data is present; it doesn't wait for OOB data to become available.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The solution is to post a separate, out of band, overlapped &lt;code&gt;WSARecv()&lt;/code&gt; passing the &lt;code&gt;MSG_OOB&lt;/code&gt; flag. This will only return on socket closure or when out of band data arrives. By using a distinct indicator in your 'per operation data' you can distinguish this read from normal reads and deal with it accordingly. Once you have processed the special out of band data read you can then post another to read subsequent out of band data.
&lt;/div&gt;
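&lt;div&gt;
The 'distinct indicator' idea can be sketched as follows. This is portable C++ with hypothetical names (&lt;code&gt;OperationType&lt;/code&gt;, &lt;code&gt;PerOperationData&lt;/code&gt;, &lt;code&gt;NextAction&lt;/code&gt; and &lt;code&gt;OnCompletion()&lt;/code&gt;); in a real server the indicator would live in the 'extended' &lt;code&gt;OVERLAPPED&lt;/code&gt; structure that is passed to &lt;code&gt;WSARecv()&lt;/code&gt;, and the actions would be real handlers rather than an enumeration.
&lt;/div&gt;

```cpp
#include <cassert>

// Hypothetical operation kinds; the out of band read gets its own value
// so that its completion can be told apart from a normal read.
enum class OperationType
{
   Read,
   OutOfBandRead,
   Write
};

// Hypothetical per-operation data carrying the distinct indicator.
struct PerOperationData
{
   OperationType type;
   unsigned long bytesTransferred;
};

// What the completion handler should do next.
enum class NextAction
{
   ProcessStreamData,
   ProcessOutOfBandDataAndRepost,
   ReleaseWriteBuffer
};

// Routes a completion based on the indicator stored when the operation
// was issued.
inline NextAction OnCompletion(const PerOperationData &operation)
{
   switch (operation.type)
   {
      case OperationType::Read :
         return NextAction::ProcessStreamData;

      case OperationType::OutOfBandRead :
         // After handling the data another MSG_OOB read is posted so
         // that subsequent out of band data can be received.
         return NextAction::ProcessOutOfBandDataAndRepost;

      case OperationType::Write :
         return NextAction::ReleaseWriteBuffer;
   }

   assert(false);    // unreachable if every OperationType is handled

   return NextAction::ProcessStreamData;
}
```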

        &lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;Inline Out Of Band Data&lt;/h2&gt;
&lt;div&gt;
Winsock also allows you to include any OOB data in the normal data stream; you can do this by setting the &lt;code&gt;SO_OOBINLINE&lt;/code&gt; socket option on the connection. When &lt;code&gt;SO_OOBINLINE&lt;/code&gt; is set the OOB data is included in the results of normal &lt;code&gt;WSARecv()&lt;/code&gt; calls without needing to set &lt;code&gt;MSG_OOB&lt;/code&gt; in the flags. However, with overlapped I/O it's then impossible to determine if OOB data exists and if it's present in the data buffer returned by any particular &lt;code&gt;WSARecv()&lt;/code&gt; call. When &lt;code&gt;SO_OOBINLINE&lt;/code&gt; is set, calls to &lt;code&gt;WSAIoctl()&lt;/code&gt; with &lt;code&gt;SIOCATMARK&lt;/code&gt; always return &lt;code&gt;true&lt;/code&gt; to indicate that no OOB data is waiting.
For most Winsock Providers this is probably fine. You most probably want to read your OOB data separately for X.25 or whatever and the last thing you want is for it to be mixed in with the normal data from the data stream.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
For TCP things are somewhat more complex. TCP doesn't actually have the concept of OOB, but the BSD API maps the TCP Urgent data concept onto OOB.  Due to differences in the TCP RFCs and historical issues there's complexity in using TCP Urgent data reliably if you don't know which RFC the implementation adheres to. This complexity makes it almost impossible to use OOB with overlapped TCP in a reliable manner, especially if you have no control over the platform upon which your clients run.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;h2 class="entry-body"&gt;TCP Urgent Data&lt;/h2&gt;
&lt;div&gt;
In TCP out of band data is implemented in terms of 'urgent data' using the URG bit and the Urgent Pointer, see &lt;a href="http://www.tcpipguide.com/free/t_TCPPriorityDataTransferUrgentFunction.htm"&gt;here&lt;/a&gt;. However, there are two conflicting descriptions of how this works: &lt;a href="http://tools.ietf.org/html/rfc0793#page-15" target="_blank"&gt;RFC 793&lt;/a&gt;, which details TCP, says that the Urgent Pointer indicates the byte that follows the urgent data, but &lt;a href="http://tools.ietf.org/html/rfc1122#page-84" target="_blank"&gt;RFC 1122&lt;/a&gt; corrects this and states that the Urgent Pointer indicates the final byte of urgent data. This leads to interoperability issues if one peer uses the RFC 793 definition and the other uses the RFC 1122 definition. The Windows documentation for the standard TCP Winsock Provider claims that it operates as BSD does; however, this can be changed using the &lt;code&gt;TCP_EXPEDITED_1122&lt;/code&gt; socket option if the Winsock provider supports it. There's more compatibility complexity in that Windows only supports a single byte of out of band data whereas RFC 1122 specifies that TCP MUST support sequences of urgent data bytes of any length. Windows also doesn't specify how or if it will buffer subsequent out of band data, so if you are slow in reading a byte of urgent data and another byte of urgent data arrives then one of the bytes may be lost, though our tests have shown that Windows does buffer urgent data. This all makes the use of out of band signalling using urgent data somewhat unreliable on Windows with TCP.
&lt;/div&gt;
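&lt;div&gt;
The off-by-one between the two RFCs is easier to see with numbers. The following is a small, portable sketch and not code from any TCP stack; &lt;code&gt;UrgentPointerSemantics&lt;/code&gt;, &lt;code&gt;LastUrgentByte()&lt;/code&gt; and &lt;code&gt;CheckInterpretationsDiffer()&lt;/code&gt; are hypothetical names. It assumes the usual arrangement where the Urgent Pointer field is an offset that is added to the segment's sequence number.
&lt;/div&gt;

```cpp
#include <cassert>
#include <cstdint>

enum class UrgentPointerSemantics
{
   Rfc793,   // pointer identifies the byte that follows the urgent data
   Rfc1122   // pointer identifies the final byte of urgent data
};

// Returns the sequence number of the last byte of urgent data for a
// segment with the given sequence number and Urgent Pointer field.
// Sequence numbers are 32-bit and wrap, so unsigned arithmetic is used.
inline std::uint32_t LastUrgentByte(
   const std::uint32_t segmentSequenceNumber,
   const std::uint16_t urgentPointer,
   const UrgentPointerSemantics semantics)
{
   const std::uint32_t pointsAt = segmentSequenceNumber + urgentPointer;

   return semantics == UrgentPointerSemantics::Rfc793 ? pointsAt - 1 : pointsAt;
}

// Worked example: for the same segment the two readings pick bytes that
// are exactly one apart.
inline void CheckInterpretationsDiffer()
{
   assert(LastUrgentByte(1000, 5, UrgentPointerSemantics::Rfc793) == 1004);
   assert(LastUrgentByte(1000, 5, UrgentPointerSemantics::Rfc1122) == 1005);
}
```

&lt;div&gt;
If the sender means byte 1005 and the receiver extracts byte 1004, a byte of normal data is removed from the stream as 'urgent' and the real urgent byte is left behind in the normal data.
&lt;/div&gt;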
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
The unreliability shows itself if you're using separate OOB reads using &lt;code&gt;MSG_OOB&lt;/code&gt;. In this instance you may get the 'wrong' byte of OOB data if you are dealing with a peer that implements the alternative RFC. This wouldn't be a problem if the byte hadn't been removed from the normal data stream, but it has, and so you end up with an incorrect byte in the normal data stream (the intended OOB byte) and a missing byte, which has been treated as OOB data. This doesn't cause problems for the two traditional users of TCP Urgent Data, Telnet and Rlogin, as they simply use the Urgent data notification to alert the server that urgent data is present in the data stream and then read and discard all of the normal data in the stream until they can read the urgent data. See &lt;a href="http://www.tcpipguide.com/free/t_TelnetInterruptHandlingUsingOutOfBandSignalingTheT.htm" target="_blank"&gt;the Telnet protocol's synch function&lt;/a&gt; for how this facility is used. It's practically impossible to implement this functionality with overlapped I/O as you don't want the urgent data removed from the data stream, so you need to operate in &lt;code&gt;SO_OOBINLINE&lt;/code&gt; mode and yet when in that mode you will never be notified that urgent data exists. To implement the Telnet model with overlapped I/O you pretty much need to always read all inbound data and buffer it in your server rather than allowing it to buffer in the TCP stack and allowing TCP flow control to prevent the client from sending more until you're ready.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
So, if you want to use out of band data with TCP with overlapped I/O you need to remember five things:
&lt;ul&gt;
&lt;li&gt;Out of band data in TCP on Windows when interoperating with other, non-Windows, operating systems is likely to be unreliable due to differences between RFC 793 and RFC 1122.&lt;/li&gt;
&lt;li&gt;Expecting to send more than a single byte of out of band data is likely not to work.&lt;/li&gt;
&lt;li&gt;Your out of band data may get "lost" if you send out of band data faster than the receiver is processing it.&lt;/li&gt;
&lt;li&gt;To read out of band data using overlapped I/O you need to post a special &lt;code&gt;WSARecv()&lt;/code&gt; with &lt;code&gt;MSG_OOB&lt;/code&gt; set in the flags.&lt;/li&gt;
&lt;li&gt;Real out of band communication using TCP is better achieved with a separate 'control' connection rather than using TCP's Urgent data function via &lt;code&gt;MSG_OOB&lt;/code&gt;.
&lt;/li&gt;&lt;/ul&gt;
Despite all of this, The Server Framework will support out of band data in the next major release. By default OOB data will be disabled and we'll abort connections that use it. You can also decide to enable OOB inline which effectively disables OOB data from the receiver's point of view and simply keeps the out of band data in the main data stream. Or, you can enable async OOB which will read OOB data using an optional special purpose overlapped read with &lt;code&gt;MSG_OOB&lt;/code&gt; set.  
&lt;/div&gt;
    &lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=WUqCqFiqc3E:gdybbhH-DYI:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=WUqCqFiqc3E:gdybbhH-DYI:ZC7T4KBF6Nw"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=ZC7T4KBF6Nw" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/LenHolgate/~4/WUqCqFiqc3E" height="1" width="1"/&gt;</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/10/out-of-band-data-and-overlapped-io.html</feedburner:origLink></entry>

<entry>
    <title>Latest release of The Server Framework: 6.5.1 - AsynchronousEvents</title>
    <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/LenHolgate/~3/i2s-V4jE-A4/latest-release-of-the-server-framework-651.html" />
    <id>tag:www.serverframework.com,2011:/asynchronousevents//2.1142</id>

    <published>2011-10-11T07:23:31Z</published>
    <updated>2011-10-12T07:48:16Z</updated>

    <summary> Version 6.5.1 of The Server Framework was released today. This is primarily a bug fix release, although we also add several new example clients and servers. This release includes the following, see the release notes, here, for full details...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Releases" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en" xml:base="http://www.serverframework.com/asynchronousevents/">
        &lt;div&gt;
Version 6.5.1 of The Server Framework was released today.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
This is primarily a bug fix release, although it also adds several new example clients and servers.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
This release includes the following, see the release notes, &lt;a href="http://www.serverframework.com/ServerFramework/latest/Docs/sockettoolsreleasenotes.html"&gt;here&lt;/a&gt;, for full details of all changes.
&lt;/div&gt;
&lt;div&gt;&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;Bug fixes to &lt;a href="http://www.serverframework.com/products---the-options.html"&gt;The Core Framework&lt;/a&gt; which affect the use of the newly added "Read Again" functionality.&lt;/li&gt;
&lt;li&gt;Fixes to the Hixie76 WebSockets protocol handler to improve interoperability.&lt;/li&gt;
&lt;li&gt;Added outbound connection establishment support to the Hixie76 protocol handler.&lt;/li&gt;
&lt;li&gt;Updated the WebSocket Echo Server Test and the Secure WebSocket Echo Server Test example clients to support the creation of both Hixie and HyBi connections.&lt;/li&gt;
&lt;li&gt;Fixed a race condition in the WebSocket example clients that could cause a connection to "stall" - note that this was an issue with how the client code used the WebSocket layer and not an issue in &lt;a href="http://www.serverframework.com/products---the-websockets-option.html"&gt;The WebSockets Option Pack&lt;/a&gt; itself.&lt;/li&gt;
&lt;li&gt;Added two new example servers; a secure, managed WebSocket server which hosts the CLR and routes complete WebSocket messages to managed code for processing and a version of this example which also hosts a simple HTTP server.&lt;/li&gt;
&lt;li&gt;Added a new HTTP client. This is designed to stress test HTTP and HTTPS servers by creating thousands of concurrent connections and requesting various resources at controllable rates.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
        
    &lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=i2s-V4jE-A4:MOVeUN2N7dA:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/LenHolgate?a=i2s-V4jE-A4:MOVeUN2N7dA:ZC7T4KBF6Nw"&gt;&lt;img src="http://feeds.feedburner.com/~ff/LenHolgate?d=ZC7T4KBF6Nw" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/LenHolgate/~4/i2s-V4jE-A4" height="1" width="1"/&gt;</content>
<feedburner:origLink>http://www.serverframework.com/asynchronousevents/2011/10/latest-release-of-the-server-framework-651.html</feedburner:origLink></entry>

</feed>

