<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Andrew</title>
	<atom:link href="http://floodyberry.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://floodyberry.wordpress.com</link>
	<description>Just another WordPress.com weblog</description>
	<lastBuildDate>Mon, 19 Jan 2009 08:09:56 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<image>
		<url>http://www.gravatar.com/blavatar/29d9a9ef79e2b61b37d1f1b6c2485bd6?s=96&#038;d=http://s.wordpress.com/i/buttonw-com.png</url>
		<title>Andrew</title>
		<link>http://floodyberry.wordpress.com</link>
	</image>
			<item>
		<title>Generating .DLL Wrappers</title>
		<link>http://floodyberry.wordpress.com/2008/09/08/generating-dll-wrappers/</link>
		<comments>http://floodyberry.wordpress.com/2008/09/08/generating-dll-wrappers/#comments</comments>
		<pubDate>Mon, 08 Sep 2008 10:25:49 +0000</pubDate>
		<dc:creator>floodyberry</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Win32]]></category>
		<category><![CDATA[cplusplus]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[dll]]></category>

		<guid isPermaLink="false">http://floodyberry.wordpress.com/?p=31</guid>
		<description><![CDATA[A while ago I came across Create your Proxy DLLs automatically by Michael Chourdakis. I thought it was a good idea, but had some room for improvement:

 Having to use an external .exe (dumpbin/tdump) was an unnecessary step, all the information you need is in the PE header!
He did not handle wrapping mangled names or [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=31&subd=floodyberry&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>A while ago I came across <a href="http://www.codeproject.com/KB/DLL/CreateYourProxyDLLs.aspx">Create your Proxy DLLs automatically</a> by <a href="http://www.turboirc.com/main2/">Michael Chourdakis</a>. I thought it was a good idea, but had some room for improvement:</p>
<ul>
<li> Having to use an external .exe (dumpbin/tdump) was an unnecessary step, all the information you need is in the PE header!</li>
<li>He did not handle wrapping <a href="http://en.wikipedia.org/wiki/Name_mangling">mangled names</a> or forwarding forwards.</li>
<li>Generating an actual project instead of a command line compile call would be a lot more useful considering you will want to do some actual coding instead of generating an empty wrapper.</li>
<li>His coding style was somewhat awkward and not easy to modify.</li>
</ul>
<p>With this in mind, I set about writing my own version. <span id="more-31"></span> Dumping the export information from the PE header was the first step and relatively straightforward as the PE header is not complicated. I did, however, need to steal GetPtrFromRVA and GetSectionFromRVA from <a href="http://www.wheaty.net/">Matt Pietrek</a> because the RVAs (Relative Virtual Addresses) are only relative when the sections are properly mapped out in memory, not while in the PE container! Those two functions also happen to be used for game hacks when the cheat-maker want to bypass any LoadLibrary &amp; EnumProcessModules hooks by manually mapping a .DLL in to memory.</p>
<p>Handling export forwards is tricky if you don&#8217;t know what they are and simple once you find out.  Since there are no flags to indicate if an export is a forward, and an export name can contain any non-zero character, the system needs a way to tell if &#8220;NTDLL.RtlAllocHeap&#8221; is a forward or not. The linker solves this by pointing the exported function address at the name of the export in the export table, so you merely need to check if the exported function address is within the export table or not.</p>
<p>Proxying name mangling is unfortunately not as simple. Say a .DLL exports</p>
<pre name="code" class="cpp">
void __stdcall SimpleExport( void ) {
}
</pre>
<p>as a C++ function, resulting in the mangled export of &#8220;?SimpleExport@@YGXXZ&#8221;. Now say your proxy .DLL implements a stub and attempts to forward it (the @1 at the end of the export is telling the linker which ordinal to assign the export to):</p>
<pre name="code" class="cpp">
.cpp file:

int __stdcall SimpleExport__YGXXZ() {
	...
}
</pre>
<pre name="code" class="cpp">
.def file

?SimpleExport@@YGXXZ=SimpleExport__YGXXZ @1
</pre>
<p>When you compile your .DLL, you&#8217;ll be surprised to find that it is exporting &#8220;SimpleExport&#8221;, not &#8220;?SimpleExport@@YGXXZ&#8221;! What is going on?! It turns out the Microsoft linker checks the target function for name mangling (an @ symbol), and if the target function name isn&#8217;t mangled, formats the exported name as a C function. The only way that I&#8217;ve found to export a mangled name is to <strong>point it at a mangled name</strong>:</p>
<pre name="code" class="cpp">
.def file v2

?SimpleExport@@YGXXZ=?SimpleExport__YGXXZ@@YGXH@Z @1
</pre>
<p>The good news is that this works great. The bad news is that if you alter your proxy function to match the source function, the name mangling changes and your .def file will need to be updated with the new name (not a trivial thing!). The &#8220;at least it works&#8221; solution I came up with is to have proxy functions for your proxy functions, i.e. functions whose name mangling will never change and which jmp to the real proxy function:</p>
<pre name="code" class="cpp">
.cpp file:

int __stdcall SimpleExport__YGXXZ() {
	...
}

naked void __stdcall decorated1() {
	__asm {
		jmp SimpleExport__YGXXZ
	}
}
</pre>
<pre name="code" class="cpp">
.def file:

?SimpleExport@@YGXXZ=?decorated1@YGXXZ @1
</pre>
<p>Now you can alter the calling conventions, parameters, and return type of the actual proxy function as much as you like and it won&#8217;t affect the linking.</p>
<p>A benefit the trouble mangled names produce is that, with the <a href="http://msdn.microsoft.com/en-us/library/ms681400(VS.85).aspx">UnDecorateSymbolName</a> DbgHelp API call, it is trivial to annotate the return type, calling conventions, and parameters of the proxied function; e.g. you can now see &#8220;?func1@a@@AAEXH@Z&#8221; actually means &#8220;private: void __thiscall a::func1(int)&#8221;.</p>
<h4>My Version</h4>
<p>There are a lot of ways to generate the wrapper and I thought Michael&#8217;s generated source also had room for improvement. For example, he doesn&#8217;t name any of the generated functions, and calling through to the original function requires a typedef (doesn&#8217;t work with intellisense), a non-obvious index in to the imported function array, and a cast. Extending the code so any function can call through to any original function would require the typedef to be global (which still won&#8217;t work with intellisense).</p>
<p>The main points I wanted to hit were:</p>
<ol>
<li>Allow the original functions to be easily callable from anywhere</li>
<li>Require as few changes as possible to if you want to convert a stub to a proxied function</li>
<li>Intellisense has to work!</li>
</ol>
<p>#1 meant creating dummy functions which jmp to their target proc. By creating them as:</p>
<pre name="code" class="cpp">

inline __declspec(naked) int call_AcceptEx() {
	__asm {
		jmp dword ptr [ mProcs + 0 * 4 ]
	}
}
</pre>
<p>I&#8217;m not only able to stuff them in the header (keeping the source file clean), but naked means you can alter their parameters and return type as much as you like and don&#8217;t need to worry about how to use the FARPROC entry or which index to use.</p>
<p>To make one of the original functions globally callable, you only need to change the function signature in two places: The header stub, and the proxy function in the .cpp file. I would have loved to only have a single signature to change but wasn&#8217;t able to come up with a clean method. It&#8217;s &#8220;possible&#8221; with some god-awful macro magic, but I don&#8217;t even have a need for a wrapper at this point and the macros really obfuscate the code.</p>
<p>That is about it! More work could be done, but it&#8217;s decently functional now and I think I&#8217;ve been sitting on it for long enough now. My version generates projects for both Visual Studio 2003 &amp; 2005 and attempts to load the original .DLL from the Windows system directory by default (fairly arbitrary, easy to change). There&#8217;s also a sample .DLL (dll.dll) with odd exports (a forwarded function, a mangled class function, a mangled C++ global function, and a __stdcall C function) so you can make sure any changes you make still compile fine.</p>
<p>Now I just need to find something to wrap with it..</p>
<h4>Downloads</h4>
<ul>
<li>Example output for sample DLL: [<a href="http://www.team5150.com/~andrew/blog/files/dll.cpp">.cpp</a>] [<a href="http://www.team5150.com/~andrew/blog/files/dll.h">.h</a>] [<a href="http://www.team5150.com/~andrew/blog/files/dll.def">.def</a>]</li>
<li><a href="http://www.team5150.com/~andrew/blog/files/genwrapper.rar">Download genwrapper.zip</a></li>
</ul>
<img alt="" border="0" src="http://feeds.wordpress.com/1.0/categories/floodyberry.wordpress.com/31/" /> <img alt="" border="0" src="http://feeds.wordpress.com/1.0/tags/floodyberry.wordpress.com/31/" /> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/floodyberry.wordpress.com/31/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/floodyberry.wordpress.com/31/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/floodyberry.wordpress.com/31/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/floodyberry.wordpress.com/31/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/floodyberry.wordpress.com/31/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/floodyberry.wordpress.com/31/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/floodyberry.wordpress.com/31/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/floodyberry.wordpress.com/31/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/floodyberry.wordpress.com/31/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/floodyberry.wordpress.com/31/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=31&subd=floodyberry&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://floodyberry.wordpress.com/2008/09/08/generating-dll-wrappers/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/06f8635e391015012b8eb663e26a0352?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">floodyberry</media:title>
		</media:content>
	</item>
		<item>
		<title>The id Tech 4 Script Compiler Sucks Hard!</title>
		<link>http://floodyberry.wordpress.com/2008/07/09/the-id-tech-4-script-compiler-sucks-hard/</link>
		<comments>http://floodyberry.wordpress.com/2008/07/09/the-id-tech-4-script-compiler-sucks-hard/#comments</comments>
		<pubDate>Wed, 09 Jul 2008 20:23:05 +0000</pubDate>
		<dc:creator>floodyberry</dc:creator>
				<category><![CDATA[Games]]></category>
		<category><![CDATA[Optimization]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[id]]></category>
		<category><![CDATA[compiler]]></category>
		<category><![CDATA[etqw]]></category>

		<guid isPermaLink="false">http://floodyberry.wordpress.com/?p=23</guid>
		<description><![CDATA[Whoever did most of the work on the id Tech 4 Script Compiler, I&#8217;m calling you out! I&#8217;ll grant that you managed to write a major component of a successful commercial engine, but&#8230; it&#8217;s just so bad. What confounds me even more is that all of the engines after DooM III did almost nothing (effective) [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=23&subd=floodyberry&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Whoever did most of the work on the <a href="http://en.wikipedia.org/wiki/Id_Tech_4">id Tech 4</a> Script Compiler, I&#8217;m calling you out! I&#8217;ll grant that you managed to write a major component of a successful commercial engine, but&#8230; it&#8217;s just so bad. What confounds me even more is that all of the engines after DooM III did almost nothing (effective) to try and fix it: If you check out the Quake IV, Prey, and ET:QW SDKs, they all have the same basic compiler with a couple things bolted on. The ET:QW guys did do a bit of work on it and tried to speed it up a bit, but &#8220;glacially slow&#8221; doesn&#8217;t seem like much of an improvement on &#8220;geologically slow&#8221;.</p>
<p>I first noticed how bad it was when I was doing the ET:QW -&gt; Tribes stuff and started playing around with the scripts. Being so use to Tribes style scripting, two things hit me right off:</p>
<ol>
<li>You have to exit the mission and recompile every script if you want to update a script you just edited. Ok, pretty annoying, but I&#8217;m just playing around so it should get easier.</li>
<li>On my AMD64 3200+, recompiling was &#8220;damn slow&#8221; (<strong>~20</strong> seconds) in Release mode and &#8220;I&#8217;m going to read a book&#8221; (<strong>~60</strong> seconds) in Debug. Issue #1 just got a lot more annoying.</li>
</ol>
<p>How did they manage to develop on this for longer than 10 minutes before going crazy, let alone create an entire game? Apparently I was the first to get fed up enough about it, so I went to check out the compiler and see if I could find any hot spots.</p>
<p><span id="more-23"></span></p>
<h4>Putting Out Dead Fires</h4>
<p>What is most depressing about looking at the source is that you can tell someone <em>tried</em> to optimize it, but failed miserably.</p>
<ul>
<li>There are static lists for some elements (strings, functions, globals, statements, objects). <strong>idList</strong> grows linearly, not exponentially, so I assume someone noticed adding XXXX elements was really slow, knew it must be <strong>idList</strong> resizing itself far too many times, but didn&#8217;t know of an actual solution. A static list also means the code will break if you add too many items.</li>
<li>A Hash <strong>is</strong> used for <strong>idVarDef</strong>s (a catch all struct to hold scope/info for variables, functions, objects, and constants), but it is based on name only, i.e. every time it looks up a variable named &#8220;i&#8221;, it gets a linked list of <strong>every</strong> variable named &#8220;i&#8221; and has to check every one to see if they match the scope of the &#8220;i&#8221; you want.
<p>Even better, the list of constants is kept in the <strong>idVarDef</strong> list named &#8220;&lt;Immediate&gt;&#8221;, which means a lookup to see if a constant value exists requires that you iterate over every string constant, numeric constant, stack return constant, etc.</li>
<li>Splash Damage did add a Hash for the <strong>idTypeDef</strong> list in ET:QW, but <strong>only</strong> for the user defined types i.e. objects. The default types were still checked for with a cascaded if statement.
<p>They also added blockAllocators for a couple elements, but allocations weren&#8217;t the main problem.</li>
</ul>
<h4>Active Volcanos</h4>
<p>So what were the (major) problems?</p>
<p><em>Script_Compiler.cpp</em>:</p>
<ul>
<li><strong>idCompiler::FindImmediate</strong> lumped all constants under a single name (&lt;Immediate&gt;), leading to a linear search every time one was looked up</li>
<li><strong>idCompiler::CheckType</strong> checked for default types (string, boolean, virtual, etc) with a cascaded if statement.</li>
<li><strong>idCompiler::GetExpression</strong> did a linear search through the opcode list</li>
<li><strong>idCompiler::ParseReturnStatement</strong> did a linear search through the opcode list to find the &#8220;=&#8221; opcode</li>
</ul>
<p><em>Script_Program.cpp</em></p>
<ul>
<li><strong>idVarDef</strong> was stored in a hash by name only. Each <strong>idVarDef</strong> is a unique pair of name &amp; scope, so the entire list of <strong>idVarDef</strong>s would have to be searched for the proper scope member</li>
<li>The <strong>idVarDef</strong>s were also stored in a weird linked list thing that required constant maintenance</li>
<li><strong>idProgram::FindType</strong> used a linear search even when the hash table was available</li>
<li><strong>idProgram::MatchesVirtualFunction</strong> used a linear search through the virtual functions</li>
</ul>
<p>The generous explanation is that nobody got around to storing things in hash tables properly because the <strong>idHashTable</strong> implementations suck. They&#8217;re hardcoded to be key&#8217;d on a string so you can&#8217;t construct custom keys (such as an object containing a name and a scope for <strong>idVarDef</strong>s).</p>
<p>What is especially ironic (maybe only to me) is that Carmack ran in to performance problems <a href="http://www.team5150.com/~andrew/carmack/johnc_plan_1996.html#d19960903">due to linear searches</a> before with qcc! I have no idea if he contributed at all to the compiler, but it still tickles me.</p>
<h4>Postmature Optimization</h4>
<p>The main fix for this type of problem should be obvious: more hash tables! It took me around a week and a half or so to figure out what exactly was going on, identify the hotspots, and fix them. This involved lookups for the compiler default types, the opcode table, and the global virtuals table, as well as creating an <strong>idScopeName</strong> object to key the list of <strong>idVarDef</strong>s on.</p>
<p>To cope with the bloat the hash tables might introduce, I also used an <strong>idStrPool</strong> for all the strings inside <em>idProgram.cpp</em>. This had the added bonus of allowing pointer hashing and comparisons on the <strong>idPoolStr*</strong> since they are guaranteed to be unique.</p>
<p>If you remember that I said the <strong>idHashTable</strong> implementations were awful, they are! Luckily the Splash Damage guys created <strong>sdHashMapGeneric</strong> so I didn&#8217;t have to make my own or figure out what exactly the weird id stuff is doing (I still don&#8217;t understand it).</p>
<p>There were some structural changes that had to be made to accommodate all of this, but nothing earth shattering. I chopped out some useless classes and created new ones such as <strong>idTypeDef_Static</strong> (<strong>idTypeDef</strong>s need an <strong>idPoolStr*</strong> name now, but you can&#8217;t statically create one), etc.</p>
<h4>Did It Get Faster?</h4>
<p>Yes! I don&#8217;t really have a wide assortment of test systems, but these are the improvements I saw:</p>
<pre name="code" class="cpp">
                           Default    Optimized     Speedup
FeaRog&#039;s Laptop(Debug):  ~180-300s         ~12s      15-25x
AMD64 3200+(Release):         ~20s          ~1s         20x
AMD64 3200+(Debug):           ~60s          ~4s         15x
E8400@3.6ghz(Release):       ~1.5s        ~0.3s          5x
E8400@3.6ghz(Debug):        ~32.5s        ~2.3s         14x
</pre>
<p>Memory usage did go up by ~0.5mb, but that&#8217;s really nothing when the compiler is using 6-7mb as it is. I also spiffed up <strong>idProgram::CompileStats</strong> to produce more detailed stats so you can see exactly where the memory is going (and the overhead on the various containers). I would have fixed <strong>idList</strong> to resize exponentially and gotten rid of the awful <strong>.setGranularity</strong> calls, but editing anything in <strong>idLib</strong> always forced nearly the entire solution to recompile so I let it go.</p>
<h4>Fin</h4>
<p><a href="http://www.team5150.com/~andrew/blog/files/etqwcompiler-1.5.zip">Download the source</a> (for ET:QW SDK 1.5) and try it out! Simply unzip it to your ET:QW SDK source directory and re-compile, it should work flawlessly unless you&#8217;ve made conflicting modifications to the 1.5 SDK.</p>
<img alt="" border="0" src="http://feeds.wordpress.com/1.0/categories/floodyberry.wordpress.com/23/" /> <img alt="" border="0" src="http://feeds.wordpress.com/1.0/tags/floodyberry.wordpress.com/23/" /> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/floodyberry.wordpress.com/23/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/floodyberry.wordpress.com/23/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/floodyberry.wordpress.com/23/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/floodyberry.wordpress.com/23/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/floodyberry.wordpress.com/23/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/floodyberry.wordpress.com/23/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/floodyberry.wordpress.com/23/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/floodyberry.wordpress.com/23/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/floodyberry.wordpress.com/23/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/floodyberry.wordpress.com/23/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=23&subd=floodyberry&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://floodyberry.wordpress.com/2008/07/09/the-id-tech-4-script-compiler-sucks-hard/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/06f8635e391015012b8eb663e26a0352?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">floodyberry</media:title>
		</media:content>
	</item>
		<item>
		<title>Tribes 1 Physics, Part Four: Explosions</title>
		<link>http://floodyberry.wordpress.com/2008/04/29/tribes-1-physics-part-four-explosions/</link>
		<comments>http://floodyberry.wordpress.com/2008/04/29/tribes-1-physics-part-four-explosions/#comments</comments>
		<pubDate>Tue, 29 Apr 2008 17:30:23 +0000</pubDate>
		<dc:creator>floodyberry</dc:creator>
				<category><![CDATA[Games]]></category>
		<category><![CDATA[Tribes]]></category>
		<category><![CDATA[tribes physics]]></category>

		<guid isPermaLink="false">http://floodyberry.wordpress.com/?p=21</guid>
		<description><![CDATA[Explosions (or more accurately, knockback), the final piece to the physics puzzle! If you implement all of the previous articles, make a disc launcher, plug in the apparently simple knockback force and radius from baseProjData.cs, and finally attempt to disc jump, you will be greeted with&#8230; a nice and wimpy boost. Playing around with the [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=21&subd=floodyberry&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Explosions (or more accurately, knockback), the final piece to the physics puzzle! If you implement all of the previous articles, make a disc launcher, plug in the apparently simple knockback force and radius from baseProjData.cs, and finally attempt to disc jump, you will be greeted with&#8230; a nice and wimpy boost. Playing around with the knockback force will only break explosions in new ways. Argh, you were so close! How could Tribes possibly screw this one up? <span id="more-21"></span></p>
<h4>Basic Knockback</h4>
<p>The basic idea for knockbacks from explosions is you want to take the position of the explosion and the object, scale the explosion force, and apply an impulse to the object in the appropriate direction. Generally this is implemented something like so:</p>
<pre name="code" class="cpp">
Player.onExplosion( Vector3 explosion, float radius, float knockback, float damage ) {
    Vector3 distance = ( hitbox.center - explosion ), direction = distance.Normalize
    if ( distance.Length &gt; radius )
        return

    float power = 1 - ( distance.Length / radius )
    ApplyDamage( power * damage )
    ApplyImpulse( knockback * ( power * direction ) / armor.Mass )
}
</pre>
<p><em>power</em> is where the magic happens and while it should be an exponential falloff, it is common to do linear because there is not much of a noticeable difference. It is also where Tribes deviates from the norm in a fairly big way.</p>
<h4>Tribes Knockback</h4>
<p>What is obvious when you implement the basic knockback is that it packs a much smaller punch than Tribes. What you might not notice is that the Tribes knockback up close, e.g. at your feet from a standstill, is actually weaker than from a bit of a distance, e.g. at your feet while you&#8217;re in the air after jumping. Whether this was intended as an aid to disc jumping or an accident of a botched formula is impossible to say, but it doubtless would have been harder to gain speed from disc jumps without this odd feature.</p>
<p>If you work out exactly what the Tribes knockback function does (not fun) and re-factor the formula (not fun and also confusing), you come out with a somewhat familiar power calculation. Instead of the basic ( 1 &#8211; ( d / r ) ), Tribes uses ( ( d / Max( d &#8211; minHitboxDimension, 1 ) ) &#8211; ( d / r ) ), i.e. instead of using a constant ( 100% &#8211; linear falloff ) for the power, Tribes does some weird calculations which I can only guess are trying to account for the hitbox in some way. This image illustrates the power falloff for Tribes, the power falloff for the basic formula, and the power falloff for the basic formula if it were boosted so players were able to disc jump properly:</p>
<p><img src="http://www.team5150.com/~andrew/blog/images/t1knockback.png" /></p>
<p>This shows why the linear falloff formula cannot be fixed by boosting the knockback force without getting wildly inaccurate up close. Here is the proper formula:</p>
<pre name="code" class="cpp">
Player.onExplosion( Vector3 explosion, float radius, float knockback, float damage ) {
    Vector3 distance = ( hitbox.Center - explosion ), direction = distance.Normalize
    float minbox = Min( Min( hitbox.Width.x, hitbox.Width.y ), hitbox.Width.z )
    float d = Max( distance.Length - minbox, MetersToUnits( 1 ) )
    if ( d &gt; radius )
        return

    float power = ( distance.Length / d ) - ( distance.Length / radius )
    ApplyDamage( damage * ( 1 - d / radius ) )
    ApplyImpulse( knockback * ( power * direction ) / armor.Mass )
}
</pre>
<p>For the Light Armor with a BOX_WIDTH of 0.5, BOX_DEPTH of 0.5, and BOX_HEIGHT of 2.3, minbox equates to Min( Min( BOX_WIDTH * 2, BOX_DEPTH * 2 ), BOX_HEIGHT ), or 1.0. The player origin is at the center of the player&#8217;s feet and the hitbox dimensions extend out from that, so the width and depth extend out in either direction and need to be doubled while the height goes from the feet to the head. Note that if you increase the smallest hitbox dimension, e.g. increase BOX_WIDTH and BOX_DEPTH to have a value of 2.3 when doubled, the linear ramp leading up to the downward falloff of the power will grow, presumably to account for greater surface contact area with the explosion.</p>
<h4>You&#8217;re Done!</h4>
<p>You should have all you need to get authentic Tribes 1 Physics going in your engine of choice! After this you should only need to create some dummy weapons, scale their velocities to whatever units you are using, add two flags, and you can get an honest game of LT going. </p>
<p>What? You don&#8217;t believe me that it works and you don&#8217;t want to waste your time implementing the physics only to find out that I lied? Hmmm, here&#8217;s something that should convince you. This is the full T1 physics running on Raindance in ET:QW. I did this for the ET:QW Tribes mod that was in the works, but unfortunately things kind of fizzled out. The map is authentic Raindance (every crevasse and triangle is there) with an 8192&#215;8192 command map texture overlaid on top with Arcanox&#8217;s bases and bridge. This is currently the closest thing to Tribes you will ever see in another engine.</p>
<p><span style="text-align:center; display: block;"><a href="http://floodyberry.wordpress.com/2008/04/29/tribes-1-physics-part-four-explosions/"><img src="http://img.youtube.com/vi/zOcwJX-nNok/2.jpg" alt="" /></a></span></p>
<p>You can also download the <a href="http://www.team5150.com/~andrew/blog/files/etqw-raindance-discjump.avi">hi-res version (15mb)</a>, although the burps are more apparent since my computer wasn&#8217;t quite up to spec to run QW. How I got the terrain in and textured will have to wait for another day.</p>
<h4>Tribes 1 Physics Series</h4>
<ul>
<li>Part One: <a href="http://floodyberry.wordpress.com/2008/02/20/tribes-1-physics-part-one-overview/">Overview</a></li>
<li>Part Two: <a href="http://floodyberry.wordpress.com/2008/02/24/tribes-1-physics-part-two-movement/">Movement</a></li>
<li>Part Three: <a href="http://floodyberry.wordpress.com/2008/04/11/tribes-1-physics-part-three-collision/">Collision</a></li>
<li>Part Four: <a href="http://floodyberry.wordpress.com/2008/04/29/tribes-1-physics-part-four-explosions/">Explosions</a></li>
</ul>
<img alt="" border="0" src="http://feeds.wordpress.com/1.0/categories/floodyberry.wordpress.com/21/" /> <img alt="" border="0" src="http://feeds.wordpress.com/1.0/tags/floodyberry.wordpress.com/21/" /> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/floodyberry.wordpress.com/21/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/floodyberry.wordpress.com/21/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/floodyberry.wordpress.com/21/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/floodyberry.wordpress.com/21/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/floodyberry.wordpress.com/21/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/floodyberry.wordpress.com/21/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/floodyberry.wordpress.com/21/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/floodyberry.wordpress.com/21/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/floodyberry.wordpress.com/21/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/floodyberry.wordpress.com/21/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=21&subd=floodyberry&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://floodyberry.wordpress.com/2008/04/29/tribes-1-physics-part-four-explosions/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
<enclosure url="http://www.team5150.com/~andrew/blog/files/etqw-raindance-discjump.avi" length="15779840" type="video/x-msvideo" />
	
		<media:content url="http://0.gravatar.com/avatar/06f8635e391015012b8eb663e26a0352?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">floodyberry</media:title>
		</media:content>

		<media:content url="http://www.team5150.com/~andrew/blog/images/t1knockback.png" medium="image" />

		<media:content url="http://img.youtube.com/vi/zOcwJX-nNok/2.jpg" medium="image" />
	</item>
		<item>
		<title>Tribes 1 Physics, Part Three: Collision</title>
		<link>http://floodyberry.wordpress.com/2008/04/11/tribes-1-physics-part-three-collision/</link>
		<comments>http://floodyberry.wordpress.com/2008/04/11/tribes-1-physics-part-three-collision/#comments</comments>
		<pubDate>Sat, 12 Apr 2008 00:47:59 +0000</pubDate>
		<dc:creator>floodyberry</dc:creator>
				<category><![CDATA[Games]]></category>
		<category><![CDATA[Tribes]]></category>
		<category><![CDATA[tribes physics]]></category>

		<guid isPermaLink="false">http://floodyberry.wordpress.com/?p=20</guid>
		<description><![CDATA[This article covers the collision physics of Tribes 1, i.e. attempting to actually move the player and what to do when the player runs in to something. This is the most convoluted part of the physics and requires a lot of little touches to get right. Tribes movement and collision handling actually gets a little [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=20&subd=floodyberry&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>This article covers the collision physics of Tribes 1, i.e. attempting to actually move the player and what to do when the player runs in to something. This is the most convoluted part of the physics and requires a lot of little touches to get right. Tribes movement and collision handling actually gets a little too low level, so I won&#8217;t be able to show <em>exactly</em> how it works (it gets down to dealing with the raw triangle lists which I don&#8217;t think all collision detection systems will let you get at), but it will be detailed enough so that there will be no major differences. <span id="more-20"></span></p>
<h4>Warning!</h4>
<p>Before I get in to the details, a little explanation is required. When Tribes attempts to move an object, it takes the maximum distance the object will cover in the remaining time (X), then divides the remaining time by ceil(X) and does ceil(X) collision detection loops. This is ostensibly to ensure that an object only moves 1m at a time, but for what reason, I don&#8217;t know. The engine is obviously capable of fairly arbitrary translations (or else Gambase::GetLosInfo would not work) so I can only imagine many short translations proved to be more efficient than a single large translation.</p>
<p>I am going to be using a single translation, and normally this would not make a difference, but Tribes does something a little weird on collisions which necessitates a rather odd fix-up when you are using a single translation versus slicing them up. Instead of adjusting the position based on velocity when there is no collision and adjusting the velocity &amp; setting the position to the collision point on a collision, Tribes adjusts the position based on the velocity <em>no matter what</em>. This means Tribes will attempt to move the player, hit a surface, adjust the velocity based on the collision, and then move the player anyway <em>from their original position</em> based on the new velocity, usually resulting in the player bouncing ever so slightly away from the contacted surface. There is actually a noticeable difference between the correct way and Tribes way of handling collisions, as the correct way feels slightly velcroish while Tribes feels more fluid. </p>
<p>Things get more complicated when you move the player in a single translation instead of sliced up translations as the weird adjustment on collisions is only done for the last fraction of the translation and not the entire thing. I worked around this by calculating some values which let me figure out how many &#8220;non-collision&#8221; slices have occurred and only do the weird handling on the last slice.</p>
<h4>Collision Code</h4>
<pre name="code" class="cpp">
Player.UpdatePosition( float tickLen ) {
    float decayFriction = currentFriction * Physics.FRICTIONDECAY
    float lastSurfDirection = lastJumpableNormal.Dot( Gravity.upNormal )
    float timeLeft = tickLen
    int maxBumps = 4, bumps

    currentFriction = 0
    collisionLastTick = false

    for ( bumps = 0; ( bumps &lt; maxBumps ) &amp;&amp; ( timeLeft &gt; 0 ); bumps++ ) {
        // slice fixup values
        Vector3 maxDistance = ( velocity * timeLeft )
        int iterations = ceil( UnitsToMeters( maxDistance.Length ) )
        float sliceTime = timeLeft / iterations

        // attempt to move through the world
        Vector3 originalPos = position, endPos = position + maxDistance
        moveFraction, finalPos, contactNormal = Physics.Translate( originalPos, endPos )

        // figure out how long we moved for and adjust the remaining time
        float duration = timeLeft * moveFraction
        timeLeft -= duration
        position = finalPos

        // did we move the entire distance safely?
        if ( !timeLeft )
            break

        // collisionLastFrame gets set even if we step up and don&#039;t have an actual collision
        collisionLastFrame = true
        float surfDirection = contactNormal.Dot( Gravity.upNormal )

        if ( surfDirection &lt; armor.JUMPSURFACE_MINDOT ) {
            // code to handle potentially stepping up sheer surfaces
            if ( steppedUp )
                continue
        }

        float impactDot = -velocity.Dot( contactNormal )

        // take damage if needed
        if ( UnitsToMeters( impactDot ) &gt; armor.MINDAMAGESPEED )
            OnDamage( ( UnitsToMeters( impactDot ) - armor.MINDAMAGESPEED ) * armor.DAMAGESCALE )

        // if we hit a jumpable surface, update the jumpable normal and reset the timestamp
        if ( surfDirection &gt;= armor.JUMPSURFACE_MINDOT ) {
            if ( ( lastJumpableNormalTimestamp &gt; ( Physics.TICKBASE * 1000 ) ) ||
                 ( surfDirection &lt; lastSurfDirection ) ) {
                lastSurfDirection = surfDirection
                lastJumpableNormalTimestamp = 0
                lastJumpableNormal = contactNormal
            }
        }

        // do some voodoo for tribes collision adjustments and timeslices
        int impactIterations = ceil( duration / sliceTime )
        float fullMotionTime = sliceTime * ( impactIterations - 1 )
        float fixupTime = duration - fullMotionTime
        position = originalPosition + ( velocity * fullMotionTime )

        // bounce
        Vector3 bounce = ( contactNormal * ( impactDot + MetersToUnits( Physics.ELASTICITY ) ) )
        velocity += bounce

        // readjust position based on bounced velocity
        position = Physics.Translate( position, position + ( velocity * fixupTime ) )

        // only update friction on upward facing surfaces
        if ( surfDirection &gt; 0 ) {
            currentFriction = surfDirection

            if ( crawledToStop &amp;&amp; ( velocity &lt; MetersToUnits( Physics.MINSPEED ) ) ) {
                velocity = Vector3( 0, 0, 0 )
                position = originalPosition
                break
            }
        }
    }

    if ( bumps &gt;= maxBumps ) {
        // Tribes sets the velocity to 0 here, this is where skibugs happen
    }

    if ( collisionLastFrame )
        currentFriction = Min( Max( currentFriction, decayFriction ), 1 )

    return ( collisionLastFrame )
}
</pre>
<h4>Notes</h4>
<p>Ski bugs are caused when the translation loop exceeds the maximum number of collisions. When this happens, Tribes zeros the player&#8217;s velocity as it is having trouble successfully moving. I think there is something in the velocity <em>bounce</em> that occasionally causes the translation loop to get stuck running in to the surface over and over with no change in velocity. When this happens, there is obviously no right answer as to what to do, but zeroing the velocity is fairly annoying answer. I&#8217;ve found that just ignoring the situation and hoping the next tick results in the player getting dislodged appears to work much better.</p>
<p>Also note that I keep forgetting to add constants (MINDAMAGESPEED and DAMAGESCALE in this post) and need to update the original post to include them. Since nobody is reading this and I am taking a while in getting it together, I do not think anyone will mind.</p>
<h4>Tribes 1 Physics Series</h4>
<ul>
<li>Part One: <a href="http://floodyberry.wordpress.com/2008/02/20/tribes-1-physics-part-one-overview/">Overview</a></li>
<li>Part Two: <a href="http://floodyberry.wordpress.com/2008/02/24/tribes-1-physics-part-two-movement/">Movement</a></li>
<li>Part Three: <a href="http://floodyberry.wordpress.com/2008/04/11/tribes-1-physics-part-three-collision/">Collision</a></li>
<li>Part Four: <a href="http://floodyberry.wordpress.com/2008/04/29/tribes-1-physics-part-four-explosions/">Explosions</a></li>
</ul>
<img alt="" border="0" src="http://feeds.wordpress.com/1.0/categories/floodyberry.wordpress.com/20/" /> <img alt="" border="0" src="http://feeds.wordpress.com/1.0/tags/floodyberry.wordpress.com/20/" /> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/floodyberry.wordpress.com/20/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/floodyberry.wordpress.com/20/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/floodyberry.wordpress.com/20/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/floodyberry.wordpress.com/20/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/floodyberry.wordpress.com/20/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/floodyberry.wordpress.com/20/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/floodyberry.wordpress.com/20/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/floodyberry.wordpress.com/20/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/floodyberry.wordpress.com/20/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/floodyberry.wordpress.com/20/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=20&subd=floodyberry&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://floodyberry.wordpress.com/2008/04/11/tribes-1-physics-part-three-collision/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/06f8635e391015012b8eb663e26a0352?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">floodyberry</media:title>
		</media:content>
	</item>
		<item>
		<title>Tribes 1 Physics, Part Two: Movement</title>
		<link>http://floodyberry.wordpress.com/2008/02/24/tribes-1-physics-part-two-movement/</link>
		<comments>http://floodyberry.wordpress.com/2008/02/24/tribes-1-physics-part-two-movement/#comments</comments>
		<pubDate>Sun, 24 Feb 2008 16:48:55 +0000</pubDate>
		<dc:creator>floodyberry</dc:creator>
				<category><![CDATA[Games]]></category>
		<category><![CDATA[Tribes]]></category>
		<category><![CDATA[tribes physics]]></category>

		<guid isPermaLink="false">http://floodyberry.wordpress.com/?p=19</guid>
		<description><![CDATA[This article covers the movement physics of Tribes 1, i.e. jumping, jetting, and ground movement/friction. These account for ~90% of the &#8220;Tribes&#8221; feeling, although even 90% will still feel wrong to anyone who knows the authentic feel well. There will be some variables which are only set in the Collision code, but there shouldn&#8217;t be [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=19&subd=floodyberry&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>This article covers the movement physics of Tribes 1, i.e. jumping, jetting, and ground movement/friction. These account for ~90% of the &#8220;Tribes&#8221; feeling, although even 90% will still feel wrong to anyone who knows the authentic feel well. There will be some variables which are only set in the Collision code, but there shouldn&#8217;t be any confusion as to what they do. <span id="more-19"></span></p>
<h4>Movement Code</h4>
<p>There are a few spots where Tribes took shortcuts with it&#8217;s math and assumed that gravity would always be &#8220;0 0 -X&#8221;, saving a couple multiplies on dots and such. I&#8217;ve replaced these spots to be gravity agnostic as I don&#8217;t think the CPU savings are that big and they make the code difficult to understand. It shouldn&#8217;t be hard to &#8220;re-optimize&#8221; if you feel the need. Dot products will be a .Dot method because overloaded multiplication operators confuse me.</p>
<h5>Player.Tick</h5>
<p>Player.Tick starts off by taking the user&#8217;s movement input (left, right, forward, back) and creating the speed vector for walking and direction vector for jumping and jetting. This is pretty basic and there is nothing out of the ordinary here.</p>
<pre name="code" class="cpp">
Player.Tick( Move move, int tickLenMs ) {
    float tickLen = ( 1000 / tickLenMs )

    float maxSpeed = MetersToUnits( armor.WALKSPEED )
    float forwardSpeed = MetersToUnits( armor.WALKSPEED ), sideSpeed = MetersToUnits( armor.WALKSPEED - 1 )

    move.speed = Vector3( ( move.right - move.left ) * forwardSpeed, ( move.forward - move.back ) * sideSpeed, 0 )
    if ( move.speed.Length &gt; maxSpeed )
        move.speed *= ( maxSpeed / move.speed.Length )

    move.speed = self.ToWorldSpace( move.speed )
    move.direction = move.speed.Normalize
}
</pre>
<p>After this, the Player energy is updated, we check lastJumpableNormalTimestamp to see if a jump should be allowed, and then do the jumping, jetting, and walking. Most of the real work takes place in Jump, Jet, and Friction, so the tick function is fairly sparse.</p>
<p>A few things to note:</p>
<ul>
<li>Tribes lets you jump up to 256ms after your last ground contact, allowing you to jump smoothly through little bumps and such where you technically leave the ground, but don&#8217;t really appear to.</li>
<li>Tribes handles jetpack energy a little differently, i.e. once you get to around 5% of your jets, they cut off and recharge a little causing the jets to stutter. I haven&#8217;t worked this part out yet, but may re-edit this section if I do.</li>
<li>Tribes applies gravity <em>constantly</em>, it is not hacked off when you are resting on a surface.</li>
<li>JETENERGY_CHARGE equals &#8220;8 + 3&#8243; with the &#8220;+ 3&#8243; being the recharge boost from an energy pack.</li>
</ul>
<pre name="code" class="cpp">
    bool isJetting = ( move.jetting &amp;&amp; ( energy &gt; 0 ) )
    bool isJumping = ( move.jumping &amp;&amp; ( lastJumpableNormalTimestamp &lt; Physics.MAXJUMPTICKS ) )

    // jump
    if ( isJumping )
        velocity += Jump( move.direction )
    crawlToStop = ( velocity.Length &lt; MetersToUnits( Physics.CRAWLTOSTOP ) )

    // jets and acceleration
    accel = Gravity.force
    energy += ( armor.JETENERGY_CHARGE * tickLen )
    if ( isJetting ) {
        accel += Jet( move.direction, tickLen )
        energy -= ( armor.JETENERGY_DRAIN * tickLen )
    }
    energy.Clamp( 0, armor.MAXENERGY )
    velocity += ( accel / tickLen )

    // walking and friction
    if ( collisionLastTick )
        velocity += Friction( move.speed, tickLen )

    // update jumpableNormal timestamp and try to move
    lastJumpableNormalTimestamp += tickLenMs
    /* position, velocity = UpdatePosition( tickLen ) */
}  
</pre>
<h5>Player.Jump</h5>
<pre name="code" class="cpp">
Player.Jump( Vector3 moveDirection ) {
    // need another ground contact before we can jump again
    lastJumpableNormalTimestamp = Physics.MAXJUMPTICKS

    // jump up
    float surfaceDirection = lastJumpableNormal.Dot( Gravity.upNormal )
    float impulse = MetersToUnits( armor.JUMPIMPULSE / armor.MASS )
    Vector3 jump = ( surfaceDirection * impulse ) * Gravity.upNormal 

    // if we&#039;re moving away from the surface, jump away
    float orientation = lastJumpableNormal.Dot( moveDirection )
    if ( orientation &gt; 0 )
        jump += ( impulse * orientation ) * moveDirection

    return ( jump )
}
</pre>
<h5>Player.Jet</h5>
<p>Side jets only kick in if the player is holding down a movement key and it&#8217;s been longer than MAXJUMPTICKS since the last ground contact. Since jumping sets lastJumpableNormalTimestamp to the limit, jumping and jetting results in side jets being enabled immediately, while simply holding down your jets on the ground will give a little startup time of full jets regardless of whether a direction key is down.</p>
<pre name="code" class="cpp">
Player.Jet( Vector3 moveDirection ) {
    float forwardVelocity = MetersToUnits( armor.JETFORWARD )
    float jetForce = MetersToUnits( armor.JETFORCE / armor.MASS )

    if ( ( lastJumpableNormalTimestamp &gt;= Physics.MAXJUMPTICKS ) &amp;&amp; ( moveDirection.Length != 0 ) ) {
        float sidePower
        float orientation = velocity.Dot( moveDirection )
        if ( orientation &gt; forwardVelocity )
            sidePower = 0
        else if ( orientation &lt; 0 )
            sidePower = armor.JETSIDEFORCE
        else
            sidePower = ( 1 - ( orientation / forwardVelocity ) )

        sidePower = Min( sidePower, armor.JETSIDEFORCE )
        Vector3 sideForce = ( sidePower * jetForce ) * moveDirection
        Vector3 upForce = ( ( 1 - sidePower ) * jetForce ) * Gravity.upNormal
        return ( upForce + sideForce )
    } else {
        // straight up, full jets
        return ( jetForce * Gravity.upNormal )
    }
}
</pre>
<h5>Vector.ProjectOntoPlane</h5>
<p>This is needed for the gravity agnostic Friction function. I don&#8217;t know how common it is.</p>
<pre name="code" class="cpp">
Vector3.ProjectOntoPlane( Vector3 normal ) {
    this -= ( this.Dot( normal ) * normal ) )
}
</pre>
<h5>Player.Friction</h5>
<p>Multiplying GROUNDTRACTION by currentFriction is unfortunately not the magical &#8220;Friction = 0&#8243; that supposedly causes skiing. currentFriction is decayed every tick when there hasn&#8217;t been a ground contact, resulting in the ability to jet and slide against walls and ceilings without being slowed down by the contact friction.</p>
<p>ProjectOntoPlane is probably not needed since the player should always be oriented so the move will never include a component not in the gravity plane, but it doesn&#8217;t hurt to include it.</p>
<pre name="code" class="cpp">
Player.Friction( Vector3 moveSpeed, float tickLen ) {
    Vector3 dampen = ( moveSpeed - velocity )
    dampen.ProjectOntoPlane( Gravity.downNormal )

    float traction = Min( currentFriction * armor.GROUNDTRACTION, 1 )
    float force = ( MetersToUnits( armor.GROUNDFORCE / armor.MASS ) * traction * tickLen )
    if ( dampen.Length &gt; force )
        dampen *= ( force / dampen.Length )
    else
        crawlToStop = true

    return ( dampen )
}
</pre>
<h4>Tribes 1 Physics Series</h4>
<ul>
<li>Part One: <a href="http://floodyberry.wordpress.com/2008/02/20/tribes-1-physics-part-one-overview/">Overview</a></li>
<li>Part Two: <a href="http://floodyberry.wordpress.com/2008/02/24/tribes-1-physics-part-two-movement/">Movement</a></li>
<li>Part Three: <a href="http://floodyberry.wordpress.com/2008/04/11/tribes-1-physics-part-three-collision/">Collision</a></li>
<li>Part Four: <a href="http://floodyberry.wordpress.com/2008/04/29/tribes-1-physics-part-four-explosions/">Explosions</a></li>
</ul>
<img alt="" border="0" src="http://feeds.wordpress.com/1.0/categories/floodyberry.wordpress.com/19/" /> <img alt="" border="0" src="http://feeds.wordpress.com/1.0/tags/floodyberry.wordpress.com/19/" /> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/floodyberry.wordpress.com/19/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/floodyberry.wordpress.com/19/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/floodyberry.wordpress.com/19/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/floodyberry.wordpress.com/19/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/floodyberry.wordpress.com/19/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/floodyberry.wordpress.com/19/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/floodyberry.wordpress.com/19/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/floodyberry.wordpress.com/19/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/floodyberry.wordpress.com/19/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/floodyberry.wordpress.com/19/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=19&subd=floodyberry&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://floodyberry.wordpress.com/2008/02/24/tribes-1-physics-part-two-movement/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/06f8635e391015012b8eb663e26a0352?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">floodyberry</media:title>
		</media:content>
	</item>
		<item>
		<title>Tribes 1 Physics, Part One: Overview</title>
		<link>http://floodyberry.wordpress.com/2008/02/20/tribes-1-physics-part-one-overview/</link>
		<comments>http://floodyberry.wordpress.com/2008/02/20/tribes-1-physics-part-one-overview/#comments</comments>
		<pubDate>Thu, 21 Feb 2008 04:04:35 +0000</pubDate>
		<dc:creator>floodyberry</dc:creator>
				<category><![CDATA[Games]]></category>
		<category><![CDATA[Tribes]]></category>
		<category><![CDATA[tribes physics]]></category>

		<guid isPermaLink="false">http://floodyberry.wordpress.com/?p=18</guid>
		<description><![CDATA[Games have been trying replicate the feel of Tribes 1 for almost as long as it&#8217;s been out (Tribes 2 started development in mid-1999) and every single one of them has failed, usually miserably.

Tribes 2 physics are an abomination, although in hindsight it should have come as no surprise after the Base+ mod Dynamix play-tested [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=18&subd=floodyberry&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Games have been trying replicate the feel of Tribes 1 for almost as long as it&#8217;s been out (Tribes 2 started development in mid-1999) and every single one of them has failed, usually miserably.</p>
<ul>
<li>Tribes 2 physics are an abomination, although in hindsight it should have come as no surprise after the <a href="http://www.gamers.org/pub/idgames2/planetquake/planetstarsiege/tribes/mods/plus.zip">Base+ mod</a> Dynamix play-tested in Tribes 1 to massive disdain. It&#8217;s as if they were trying to build on the success of Tribes 1 when it was 2 weeks old, not 2 years. Unfortunately for the majority of the community not in the beta, we didn&#8217;t find this out until after the game was paid for. Further mods such as Base++, Team Rabbit 2, and Classic attempted to rectify the situation yet were still only a pale ghost of Tribes 1.</li>
<li><a href="http://legendsthegame.net/">Legends</a> has always felt wrong despite how often they tweaked and twiddled the physics and <a href="http://www.tribalwar.com/forums/showthread.php?p=9349485#post9349485">boasted of having the original physics source code</a>. If they did have the source, they either didn&#8217;t know how to implement it or didn&#8217;t have enough of the source to properly replicate all of the required physics.</li>
<li>Tribes Vengeance is so far removed from the feel of Tribes that it shouldn&#8217;t even enter the discussion. Jetting is wrong, air movement shouldn&#8217;t even exist, collisions are wrong, and skiing is a sick joke. I can only hope KineticPoet remembered what used to be and silently winced every time he sat down to work on the game.</li>
<li>The yet-to-be-released <a href="http://pc.ign.com/objects/142/14228914.html">Fallen Empire: Legions</a> appears to be following the &#8220;We&#8217;re not a Tribes 1 clone, so let&#8217;s make wacky changes to ram that home&#8221; mantra. Ideas like 6 way jetting (jetting down while in the air and laterally while on the ground), non-friction sliding a-la T:V, jetpack overdrive, a charge up sniper rifle, etc., sound like they could easily alter the game beyond good taste.</li>
</ul>
<p>What all of these games ostensibly want is to appeal to Tribes 1 players, yet they attempt to accomplish this by using a different and/or completely arbitrary physics system, adding something that resembles a jetpack and skiing and hoping everyone likes it. While I don&#8217;t know if a carbon copy of Tribes 1 on a modern engine would be a success, I do know that almost any Tribes 1 veteran will be unsatisfied with any Tribes style game that does not replicate the feel of Tribes 1 regardless of how often the developers hide behind the claim of &#8220;not Tribes 1&#8243;. </p>
<p>These articles will solve that problem. I will provide everything needed for a 99% re-creation of the Tribes 1 physics on any 3d engine. There are a few pieces to the puzzle so I&#8217;ll be breaking the topics in to separate articles for movement, collision, and explosions. This article will go over the basics of the engine and document the structures and constants I&#8217;ll be using.<span id="more-18"></span></p>
<h4>Basics</h4>
<p>To start off, all units will be in <strong>meters</strong>, the default unit of Tribes. You will need to provide a separate function to convert the Tribes meters to your engine&#8217;s native units, e.g. if 1 unit in your engine of choice is roughly equal to an inch, you will need to convert from Tribes meters to inches, i.e. multiply ~39.37. This will ensure that the authentic Tribes 1 constants are being used and that nothing has been messed up to due to being mis-scaled by hand. I will use <strong>MetersToUnits</strong> and <strong>UnitsToMeters</strong> to indicate when a conversion needs to be made.</p>
<p>Tribes runs at 31.25 ticks a second, or 0.032s per tick. Altering the ticks per second <strong>will</strong> result in slightly different physics as the gravity is accumulated based on the tick length while jumping is a singular impulse, so as the tick length grows, gravity slowly subsumes the jump impulse. There are other less obvious calculations based on the tick length which will be altered as well; some of these can be accounted for, some not.</p>
<p>The coordinate system is Z up, and Vectors are assumed to have overloaded operators.</p>
<p>All code examples will be in a pseudo C-like language and should be easy to translate in to any engine.</p>
<h5>Physics Workflow (details omitted)</h5>
<pre name="code" class="cpp">
Tick( Move move, float ticklen ) {
    if ( jumping and canJump )
        velocity += Jump( move.direction )

    accel = Gravity.force
    if ( jetting )
        accel += Jet( move.direction )

    velocity += ( accel / ticklen )
    if ( collisionLastTick )
        velocity += Friction( move.speed, ticklen )

    position, velocity = UpdatePosition( ticklen )
}
</pre>
<p>There is nothing special in Tribes 1 physics, skiing is not achieved by &#8220;Friction = 0&#8243;, and the only feature an engine needs is the ability to properly detect collisions. There are a lot of nuances required to achieve a simulation that doesn&#8217;t feel subtly wrong, but none are of the earth-shatteringly unique variety that many claim only the Tribes/Torque engine can replicate. They are, however, nuances that you are highly unlikely to match with pure guesswork.</p>
<h5>Structures</h5>
<p>These are the structures and values I will be using for detailed explanations of the physics workflow. Don&#8217;t worry if you don&#8217;t know what a value does just yet.</p>
<pre name="code" class="cpp">
Gravity {
    Vector3 force = ( 0, 0, -20 )
    Vector3 upNormal = -force.Normalize
    Vector3 downNormal = force.Normalize
}

Physics {
    float TICKBASE = 0.032
    float ELASTICITY = 0.001
    float CRAWLTOSTOP = 0.1
    float MINSPEED = 0.75
    float FRICTIONDECAY = 0.6
    int MAXJUMPTICKS = 256
}

LightArmor : Armor {
    float MASS = 9
    float GROUNDFORCE = 9 * 40
    float GROUNDTRACTION = 3
    float WALKSPEED = 11
    float JUMPIMPULSE = 75
    float JUMPSURFACE_MINDOT = 0.2

    float MINDAMAGESPEED = 25
    float DAMAGESCALE = 0.005

    float JETFORCE = 236
    float JETSIDEFORCE = 0.8
    float JETFORWARD = 22
    float MAXENERGY = 60
    float JETENERGY_DRAIN = 25
    float JETENERGY_CHARGE = 8 + 3

    float BOX_WIDTH = 0.5
    float BOX_DEPTH = 0.5
    float BOX_HEIGHT = 2.3
}

Player {
    Armor armor
    Vector3 position, velocity
    float energy

    bool crawledToStop
    bool collisionLastTick
    Vector3 lastJumpableNormal
    int lastJumpableNormalTimestamp
    float currentFriction
}
</pre>
<h4>Tribes 1 Physics Series</h4>
<ul>
<li>Part One: <a href="http://floodyberry.wordpress.com/2008/02/20/tribes-1-physics-part-one-overview/">Overview</a></li>
<li>Part Two: <a href="http://floodyberry.wordpress.com/2008/02/24/tribes-1-physics-part-two-movement/">Movement</a></li>
<li>Part Three: <a href="http://floodyberry.wordpress.com/2008/04/11/tribes-1-physics-part-three-collision/">Collision</a></li>
<li>Part Four: <a href="http://floodyberry.wordpress.com/2008/04/29/tribes-1-physics-part-four-explosions/">Explosions</a></li>
</ul>
<img alt="" border="0" src="http://feeds.wordpress.com/1.0/categories/floodyberry.wordpress.com/18/" /> <img alt="" border="0" src="http://feeds.wordpress.com/1.0/tags/floodyberry.wordpress.com/18/" /> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/floodyberry.wordpress.com/18/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/floodyberry.wordpress.com/18/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/floodyberry.wordpress.com/18/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/floodyberry.wordpress.com/18/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/floodyberry.wordpress.com/18/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/floodyberry.wordpress.com/18/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/floodyberry.wordpress.com/18/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/floodyberry.wordpress.com/18/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/floodyberry.wordpress.com/18/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/floodyberry.wordpress.com/18/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=18&subd=floodyberry&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://floodyberry.wordpress.com/2008/02/20/tribes-1-physics-part-one-overview/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/06f8635e391015012b8eb663e26a0352?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">floodyberry</media:title>
		</media:content>
	</item>
		<item>
		<title>Writing a (Tribes 1) Master Server</title>
		<link>http://floodyberry.wordpress.com/2008/02/15/writing-a-tribes-1-master-server/</link>
		<comments>http://floodyberry.wordpress.com/2008/02/15/writing-a-tribes-1-master-server/#comments</comments>
		<pubDate>Fri, 15 Feb 2008 15:59:10 +0000</pubDate>
		<dc:creator>floodyberry</dc:creator>
				<category><![CDATA[Games]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Tribes]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[master server]]></category>

		<guid isPermaLink="false">http://floodyberry.wordpress.com/?p=16</guid>
		<description><![CDATA[While I wrote this in September 2007, for various reasons I did not get around to putting the finishing touches on it. Please pretend you&#8217;re reading it then and not now!
After the hubbub over Sierra&#8217;s announcement that they were ceasing multiplayer support for Tribes 1 and the resulting scramble to locate a replacement master server, [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=16&subd=floodyberry&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><em>While I wrote this in September 2007, for various reasons I did not get around to putting the finishing touches on it. Please pretend you&#8217;re reading it then and not now!</em></p>
<p>After the hubbub over Sierra&#8217;s <a href="http://www.sierra.com/en/home/news/product_news/071607_-_sierra_heritage.html">announcement</a> that they were ceasing multiplayer support for Tribes 1 and the resulting scramble to locate a replacement master server, I decided to give a shot at writing one. The required feature set appeared simple enough to only take a week or so to implement but with enough gotchas to keep it suitably interesting. While I only had a vague idea of what was required, I got a jump start on proper design by finding <a href="http://www.gamasutra.com/features/20000511/bernier_01.htm">Half-Life and Team Fortress Networking: Closing the Loop on Scalable Network Gaming Backend Services</a> by Yahn W. Bernier, an article detailing the design, implementation, and potential problems of the Half Life master server. Even though some of the topics did not apply to the Tribes 1 requirements, e.g. I can&#8217;t alter the client&#8217;s behavior to auto rate-limit the server list transmission, the article was still quite valuable and an interesting read even if you aren&#8217;t implementing a master server. <span id="more-16"></span></p>
<h4>Getting Started</h4>
<p>My initial server was as basic as you can get: a hash table of servers behind a udp socket which wakes up periodically to time out servers. Tribes servers do not send a &#8220;going offline&#8221; packet, but as their heartbeat period is every 2 minutes the server will not remain in the list for long. </p>
<p>If the intended environment was a closed setting with a limited number of servers to track, this would be more than sufficient, but in the real world it is vulnerable to many issues and attacks, intentional or otherwise.</p>
<h4>Issues and Attacks</h4>
<p><strong>Bogus Servers</strong>: The ability to add bogus servers is probably the most damaging problem due to the ability to inflate the server list which makes it possible to easily saturate the master server&#8217;s upload by requesting the server list repeatedly. The two ways of adding bogus servers are either by IP spoofing or simply sending thousands of keepalives from your IP with different ports. The Half Life Networking article goes in to this problem in depth and offers the solution of a challenge-response system where the master server, on receiving a heartbeat, sends a random value to the server and only allows the heartbeat to be registered if the server sends the correct value back. </p>
<p>While the challenge/response system solves the problem of IP spoofing completely, it is still trivial to listen and respond to the challenges on your real IP. To defeat this attack, it is necessary to limit each IP to a certain number of registered servers. This way, even if you can intercept and respond to 20,000 challenges, only X will be accepted and the server list will be no worse for the wear. To be honest, limiting an IP to a low number of servers, say 3-5, should be in place regardless of the chance for spam. The only reason to run multiple servers on a single IP is either NAT or that the additional servers are backups/rarely used. Server CPU lag is one of the worst experiences to have online, although many mods run so poorly they tend to anticipate this fact and over-compensate with ridiculous weapons and items which leave skill completely out of the equation. That is a topic for another day though.</p>
<p>Of note is the fact that while Tribes network protocol does include a sequence number which can be used as a challenge, it is only 16 bits wide. To prevent brute force attacks (however unlikely), the challenge-response value should have a timeout of a few seconds at most.</p>
<p><strong>Upload Starvation</strong>: If an IP is spamming list requests to the master server, it might be possible to saturate the master server&#8217;s upload even with a relatively small server list. I wanted to be able to address the problem of spamming requests without requiring human intervention, but at the same time not penalize legitimate users who may click the refresh button a few times too often or have multiple players behind NAT. </p>
<p>The solution, ironically, came from Tribes. I implemented a penalty per IP system similar to the in-game chat spam penalty. Each request accrues a certain amount of penalty, and when the penalty reaches a certain limit the master server stops responding to that IP. The penalty is decreased by 1 for each second that passes, allowing the client to eventually access the master again. I also added a maximum penalty cap (currently at 10 seconds) so that bans will be over fairly quickly once the IP ceases to spam. </p>
<p><strong>Unintentional flooding</strong>: Because Tribes has no auto-rate limiting mechanism to control the server list download speed, it&#8217;s possible the master server could either upload the server list faster than the client can handle or saturate it&#8217;s own upload by firing off too many packets at once when multiple requests come in. The obvious solution is for the master server to throttle it&#8217;s upload, i.e. queuing up packets instead of sending them off instantly. While a simple rate limited FIFO queue would keep the server&#8217;s upload from becoming saturated, it would also mean that after a certain point the packets on the tail end of a large queue would be timed out by the client before they had a chance to arrive. This means rate limiting needs to be done <em>per-connection</em> and not globally.</p>
<p>Tribes, however, throws a wrench in to how low you can rate limit a connection. When the first packet of a master server response arrives, Tribes creates X pending &#8220;ping&#8221; responses to wait for, where X is the number of packets left in the master server response. Because both the master server list request and server ping packets are the same (\x10\x03), Tribes piggybacks the master server list request in the server pending ping list and simply re-pings any packet in the list which times out. Unfortunately, the timestamps for a particular set of master server response packets are <em>never updated</em> when succeeding packets come in, meaning that if Tribes does not receive <em>every</em> master server packet within $pref::pingTimeoutTime milliseconds (default 900ms), it sends a new request for each remaining packet in the list.</p>
<p>So lets say the server list contains 3000 servers, and the master server batches 64 servers per packet. This would result in 47 packets at around ~450 bytes each, or 21k of data. If the master server uploads at 4k/s to the client, only 4k*0.9s = 3.6k of data will make it to the client before the remaining 38 packets are timed out. Tribes would then see 38 &#8220;pings&#8221; that timed out and re-ping each one, updating their packet keys and timestamps so that the packets still coming in from the previous request would be discarded. If Tribes hits $pref::pingRetryCount on any single master server packet, it will decide it can&#8217;t contact a master server and either move on to the next master or pop up an error box. This unfortunately means the per-client rate limit has to be set high enough to make sure the server list can be transmitted within a couple seconds at most, or 900ms to be safe. Assuming a maximum of 500 servers, or ~4k of data, 5k/s per client should be a decent limit. </p>
<p>Due to a poor choice of data structures, Tribes itself limits how large the list can grow. The server list and pending ping list are stored in vectors which grow by 5 items at a time!, leading to tons of potential copying on vector resizes and painful O(N) lookups. When the &#8220;to ping&#8221; list reaches ~6000 servers, Tribes locks up for longer than I care to wait. This is exacerbated when you raise $pref::pingTimeoutTime to accommodate larger server lists, leading to a realistic ceiling of around 4000 servers regardless of the timeout bug.</p>
<h4>Unresolved Attacks</h4>
<p>Because Tribes has no challenge-response mechanism available for server list requests, there is nothing to be done about Upload Starvation by IP spoofing. Please try not to annoy anyone who has the ability to spoof and the desire to spam your master server!</p>
<h4>Mirrors</h4>
<p>I thought about doing something with mirrors, but there really is no good reason to bother. </p>
<p>1. CPU load is a non-issue. Even if there were 50,000 live servers sending heartbeats every 2 minutes, that&#8217;s only 415 heartbeats a second. A heartbeat requires 1 hashtable lookup for the IP penalty, 1 hashtable lookup for the server table, and 3 hashtable lookups for the pending servers table if the server doesn&#8217;t exist. A worst case of 5 hash table lookups * 415 = 2075 hashtable lookups a second which would not even register on a load monitor.</p>
<p>Using a separate process to spam my master with heartbeats, I can get up to ~50,000 heartbeats a second without sending a challenge packet and ~35,000 heartbeats sending the challenge (and using ~1000kb/s up in the process). Using a separate machine to spam heartbeats peaks at around ~14,000 a second with around ~30% CPU load for the master server.</p>
<p>A &#8220;realistic&#8221; test of 18,000 servers, 22 server list requests a second, and 450 heartbeats a second resulted in ~3-10% CPU load and 2,800 kb/s upload. Memory usage peaked at around 40mb, largely due to the 530 concurrent 120k server lists being served from memory at 5kb/s. Reducing the server count to 1,000 while keeping the requests and heartbeats the same results in no measurable CPU load and 900k peak memory usage.</p>
<p>2. Bandwidth, while an issue with a community supporting 50,000 servers, is not an issue with the Tribes community. </p>
<ul>
<li>There are currently 121 active servers in the current master list. </li>
<li>According to archive.org&#8217;s snapshot of gamespy&#8217;s <a href="http://web.archive.org/web/20040612064055/http://archive.gamespy.com/stats/">stats page</a>, Tribes only had ~250 servers up in July 2004. </li>
<li>Going back even farther is one of Tim Sweeney&#8217;s <a href="http://www.team5150.com/~andrew/sweeney/tims_news_1999.html#d19991002">old news posts</a> from October 1999, showing 589 servers during Tribes&#8217; heyday. </li>
</ul>
<p>Lets do the math:</p>
<pre name="code" class="cpp">
        Servers    Bytes / List     Req/s 40kb up   160kb up
1999        589            4323              9.47      37.90
2004        250            1950             21.01      84.02
2007        121            1047             39.12     156.49
</pre>
<p>Even a crappy cable modem can handle 39 reqs/s with the current list, and a T1 could do just as well with the server count from 1999. Obviously it makes more sense to find a host that will stay up than to make a tangled net of mirrors. There are also other problems with unorganized mirrors such as all of the mirrors a server reports to going down and the fact that the Tribes client queries mirrors sequentially instead of randomly.</p>
<p>If a host can&#8217;t be found that is guaranteed to be up, then finding 5 such hosts and needlessly complicating the master server isn&#8217;t going to make the system more robust.</p>
<h4>Code!</h4>
<p>The only thing I don&#8217;t really like about the current code is the packet queuing. While the current server can obviously scale quite high, it should be possible to construct packets on the fly instead of batching them up in memory. The problem to be solved is how to keep multiple iterators in a hash table which can have elements added/removed at any time, and since the master has to tell Tribes how many packets it will be sending, to not send any additional servers which are added after the time of the list request. I have some ideas, but I should probably post this first instead of continuing to put it off.</p>
<p>The code is cross-platform, although I don&#8217;t have anything other than Linux to test on so you will probably need to jump through some hoops to compile on BSD or OSX. </p>
<p>All of the tunable parameters are at the top of t1master.cpp and should be fairly self explanatory. I didn&#8217;t want any external dependencies so the MOTD is &#8220;hardcoded&#8221;, although you can pass it on the command line. I also didn&#8217;t see a point in backing up the server list at any point since the list will fully regenerate itself in 2 minutes. Note that I do randomize the hash seed so even if a spoofer floods the server with bogus IPs, he won&#8217;t be able to logjam the hash tables with collisions! </p>
<p>t1spam.cpp is the program I used for stress testing by hammering the server with heartbeats and listreqs. You should see the lines to comment/uncomment in t1master.cpp(masterserver::process) and optionally in t1master.cpp(main) to simulate random request sources to get realistic loads for the hash tables and penalties. </p>
<ul>
<li>Download <a href="http://www.team5150.com/~andrew/blog/files/t1master.zip">t1master.zip</a></li>
</ul>
<img alt="" border="0" src="http://feeds.wordpress.com/1.0/categories/floodyberry.wordpress.com/16/" /> <img alt="" border="0" src="http://feeds.wordpress.com/1.0/tags/floodyberry.wordpress.com/16/" /> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/floodyberry.wordpress.com/16/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/floodyberry.wordpress.com/16/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/floodyberry.wordpress.com/16/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/floodyberry.wordpress.com/16/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/floodyberry.wordpress.com/16/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/floodyberry.wordpress.com/16/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/floodyberry.wordpress.com/16/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/floodyberry.wordpress.com/16/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/floodyberry.wordpress.com/16/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/floodyberry.wordpress.com/16/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=16&subd=floodyberry&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://floodyberry.wordpress.com/2008/02/15/writing-a-tribes-1-master-server/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/06f8635e391015012b8eb663e26a0352?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">floodyberry</media:title>
		</media:content>
	</item>
		<item>
		<title>C++ Templates and Class Inheritance</title>
		<link>http://floodyberry.wordpress.com/2007/05/17/c-templates-and-class-inheritance/</link>
		<comments>http://floodyberry.wordpress.com/2007/05/17/c-templates-and-class-inheritance/#comments</comments>
		<pubDate>Thu, 17 May 2007 06:03:31 +0000</pubDate>
		<dc:creator>floodyberry</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[cplusplus]]></category>
		<category><![CDATA[templates]]></category>

		<guid isPermaLink="false">http://floodyberry.wordpress.com/?p=14</guid>
		<description><![CDATA[The following code is not legal C++:

[sourcecode language='cpp']
template 
struct A {
	void f() {}
	type mX;
};

template 
struct B : public A {
	void g() { mY = ( mX ); f(); }
	type mY;
};
[/sourcecode]

The best part is that unless you know the obscure reason why it is not legal, it appears legal and might even compile and run perfectly depending on which compiler you're using. Not surprisingly, that is exactly how I ran in to it. I was doing templated class inheritance and thought I was in the clear because everything ran fine with MSVC7.1 and ICC 9, but when I belatedly tried to compile with g++ 3.4.4, I ran in to the following errors..<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=14&subd=floodyberry&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>The following code is not legal C++:</p>
<pre name="code" class="cpp">

template &lt; class type &gt;
struct A {
	void f() {}
	type mX;
};

template &lt; class type &gt;
struct B : public A&lt;type&gt; {
	void g() { mY = ( mX ); f(); }
	type mY;
};
</pre>
<p>The best part is that unless you know the obscure reason why it is not legal, it appears legal and might even compile and run perfectly depending on which compiler you&#8217;re using. Not surprisingly, that is exactly how I ran in to it. I was doing templated class inheritance and thought I was in the clear because everything ran fine with MSVC7.1 and ICC 9, but when I belatedly tried to compile with g++ 3.4.4, I ran in to the following errors: <span id="more-14"></span></p>
<blockquote>
<div>tmpl.cpp: In member function `void B&lt;type&gt;::g()&#8217;:<br />
tmpl.cpp:9: error: `mX&#8217; undeclared (first use this function)<br />
tmpl.cpp:9: error: (Each undeclared identifier is reported only once for each function it appears in.)<br />
tmpl.cpp:9: error: there are no arguments to `f&#8217; that depend on a template parameter, so a declaration of `f&#8217; must be available<br />
tmpl.cpp:9: error: (if you use `-fpermissive&#8217;, G++ will accept your code, but allowing the use of an undeclared name is deprecated)</div>
</blockquote>
<p>What the who? Why can&#8217;t it find mX or f? How are you supposed to inherit templated classes in g++? Why does it work on some compilers and not others? Is this a bug? The answer is C++ Standards and in this case, gcc 3.4. It didn&#8217;t take me long to find a response to this exact same problem on the gcc mailing list from 2004:</p>
<p><a href="http://gcc.gnu.org/ml/gcc-help/2004-04/msg00382.html">re: gcc 3.4 template problem</a></p>
<blockquote>
<div>This happens because gcc 3.4 now implements two-phase name lookup, see <a href="http://gcc.gnu.org/gcc-3.4/changes.html">http://gcc.gnu.org/gcc-3.4/changes.html</a> and page down until you see the 2nd example in the C++ section, which is much like yours.</p>
<p>Also see DR 213 <a href="http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/cwg_defects.html#213">http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/cwg_defects.html#213</a> whose resolution adds:</p>
<blockquote>
<div>In the definition of a class template or a member of a class template, <strong>if a base class of the class template depends on a template-parameter, the base class scope is not examined during unqualified name lookup</strong> either at the point of definition of the class template or member or during an instantiation of the class template or member.</div>
</blockquote>
<p>to the standard.</p>
<p><strong>You can also use &#8216;this-&gt;x&#8217; to make x dependent, making the implementation to look into the base class.</strong></div>
</blockquote>
<p>The C++ FAQ Lite has some further information on the subject with the understandable disclaimer &#8220;This might hurt your head; better if you sit down&#8221;: <a href="http://www.parashift.com/c++-faq-lite/templates.html#faq-35.19">[35.19] Why am I getting errors when my template-derived-class uses a member it inherits from its template-base-class?</a>.</p>
<p>While I can understand what the standard is saying and the consequences of it, I can&#8217;t figure why some compilers still allow the now illegal behavior (leading to portability nightmares if you don&#8217;t know what the heck just broke), and why the intuitive interpretation was made illegal in the first place. Unfortunately, supporting the intuitive version when the standard says the opposite only opens up more opportunities to create non-portable code. While I dislike &#8220;this-&gt;&#8221; litter in my classes, it looks like the cleanest portable solution that doesn&#8217;t open up other issues (such as explicit A&lt;type&gt;:: prefixing which would break virtual functions).</p>
<p>If you haven&#8217;t had enough, try making sense of this thread in comp.lang.c++.moderated: <a href="http://groups.google.com/group/comp.lang.c++.moderated/browse_frm/thread/eb1a641f1807aad0/533e7e39fd887ab4?tvc=1#533e7e39fd887ab4"> Dependent names in templates, or are they?</a> What a horrid little rule.</p>
<img alt="" border="0" src="http://feeds.wordpress.com/1.0/categories/floodyberry.wordpress.com/14/" /> <img alt="" border="0" src="http://feeds.wordpress.com/1.0/tags/floodyberry.wordpress.com/14/" /> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/floodyberry.wordpress.com/14/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/floodyberry.wordpress.com/14/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/floodyberry.wordpress.com/14/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/floodyberry.wordpress.com/14/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/floodyberry.wordpress.com/14/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/floodyberry.wordpress.com/14/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/floodyberry.wordpress.com/14/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/floodyberry.wordpress.com/14/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/floodyberry.wordpress.com/14/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/floodyberry.wordpress.com/14/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=14&subd=floodyberry&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://floodyberry.wordpress.com/2007/05/17/c-templates-and-class-inheritance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/06f8635e391015012b8eb663e26a0352?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">floodyberry</media:title>
		</media:content>
	</item>
		<item>
		<title>UTF-8 Conversion Tricks</title>
		<link>http://floodyberry.wordpress.com/2007/04/14/utf-8-conversion-tricks/</link>
		<comments>http://floodyberry.wordpress.com/2007/04/14/utf-8-conversion-tricks/#comments</comments>
		<pubDate>Sat, 14 Apr 2007 08:04:03 +0000</pubDate>
		<dc:creator>floodyberry</dc:creator>
				<category><![CDATA[Optimization]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[cplusplus]]></category>
		<category><![CDATA[optimization]]></category>
		<category><![CDATA[unicode]]></category>
		<category><![CDATA[utf-8]]></category>

		<guid isPermaLink="false">http://floodyberry.wordpress.com/?p=11</guid>
		<description><![CDATA[UTF-8 is a wonderfully simple encoding format with some very nice properties, but the juggling required to convert to UTF-16, and UTF-32 can be a little tricky and fairly easy to do poorly. This is further compounded by the various error conditions you must keep an eye out for, such as overlong encodings, reserved ranges, [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=11&subd=floodyberry&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><a href="http://en.wikipedia.org/wiki/UTF-8">UTF-8</a> is a wonderfully simple encoding format with some very nice properties, but the juggling required to convert to UTF-16, and UTF-32 can be a little tricky and fairly easy to do poorly. This is further compounded by the various error conditions you must keep an eye out for, such as overlong encodings, reserved ranges, surrogate markers, incomplete sequences, and so on.</p>
<p>These are a couple tricks you can employ to hopefully keep the conversion fast <em>and</em> robust.</p>
<p><span id="more-11"></span></p>
<h4>Tail Length Lookup</h4>
<p>Our first trick is to use a lookup table for the initial byte. This allows you to both a) tell whether the byte is valid (80 to bf and fe to ff are invalid leading bytes, as well as f5 to fd if you don&#8217;t want to handle 5 and 6 byte sequences) and b) determine the number of trailing bytes in the expected sequence. We will also need the length of the sequence to quickly ensure there are enough bytes in left in the input as well as for other upcoming tricks, so this actually results in multiple wins.</p>
<p>If you want to cut down on the table size, you could use 128 values and take (c&lt;&lt;1), or 64 values and take ((c-0&#215;80)&lt;&lt;1), although you&#8217;ll need an extra check for 80-bf with 64 values.</p>
<pre name="code" class="cpp">

const UTF32 Replacement = ( 0xfffd );

const unsigned char UTF8TailLengths[256] = {
	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
	0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
	1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
	1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
	2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
	3,3,3,3,3,3,3,3,4,4,4,4,5,5,0,0
};
</pre>
<pre name="code" class="cpp">

UTF32 utf8_to_utf32( UTF8 *&amp;s, const UTF8 *end ) {
	UTF32 c = ( *s++ );
	if ( c &lt; 0x80 )
		return ( c );
	unsigned int tail = ( UTF8TailLengths[ c ] );
	if ( !tail ) || ( s + tail &gt; end )
		return ( Replacement );
</pre>
<h4>Overlong Encodings and Magic Subtraction</h4>
<p>Once we know the length of the expected sequence, we can attempt to decode it.The basic decoding loop is something like:</p>
<pre name="code" class="cpp">

c &amp;= ( 0x3f &gt;&gt; tail );

unsigned int i;
for (  i = 0; i &lt; tail; ++i ) {
	if ( ( s[i] &amp; 0xc0 ) != 0x80 )
		break;

	c = ( ( c &lt;&lt; 6 ) + ( s[i] &amp; 0x3f ) );
}

s += i;
if ( i != tail )
	return ( Replacement );
</pre>
<p>At the end of decoding, we will still be faced with a problem: How do you tell if it was an overlong encoding? To keep the mapping of UTF-8 to UTF-32 one to one, we are required to reject any encoding that uses more bytes than it requires. Markus Kuhn&#8217;s <a href="http://www.cl.cam.ac.uk/~mgk25/ucs/utf8_check.c">utf8_check.c</a> has a jungle of conditionals to detect the specific lead and tail byte encodings that would indicate an overlong encoding, but this is not something we want to do in our innerloop!</p>
<p>This is where our Overlong Encoding and Magic Subtraction lookup comes in. Since we know the length of the tail, we can create a lookup of the <em>minimum</em> value a sequence with <em>tail</em> bytes needs to be.</p>
<p>Magic Subtraction is a side bonus to knowing the length of the tail. With Magic Subtraction, we can skip masking off the lead byte as well as eliminating the &amp;0&#215;3f mask in the innerloop! Magic Subtraction works by accumulating the value of the masked off bits into a single value, and subtracting that value at the end. Because we&#8217;re making sure each byte is well formed, we can be sure that the masked off bits will add up to a constant value. I got this trick from <a href="http://www.unicode.org/Public/PROGRAMS/CVTUTF/ConvertUTF.c">ConvertUTF.c</a> by Mark E. Davis.</p>
<p>If you want to double check the magic subtraction values, you can calculate them yourself like so: Find the constant value for the lead byte of each sequence, then for each byte in the sequence, shift the value over 6 bits and add 80.</p>
<ul>
<li>1 tail byte: (c0&lt;&lt;6)+80</li>
<li>2 tail bytes: (((e0&lt;&lt;6)+80)&lt;&lt;6)+80</li>
<li>3 tail bytes: (((((f0&lt;&lt;6)+80)&lt;&lt;6)+80)&lt;&lt;6)+80</li>
<li>etc.</li>
</ul>
<pre name="code" class="cpp">

struct UTF8Lookup {
	UTF32 mOverlongMinimum, mMagicSubtraction;
} const UTF8Lookups[ 6 ] = {
	{ 0x00000000, 0x00000000 },
	{ 0x00000080, 0x00003080 },
	{ 0x00000800, 0x000E2080 },
	{ 0x00010000, 0x03C82080 },
	{ 0x00200000, 0xFA082080 },
	{ 0x04000000, 0x82082080 },
};
</pre>
<pre name="code" class="cpp">

unsigned int i;
for ( i = 0; i &lt; tail; ++i ) {
	if ( ( s[i] &amp; 0xc0 ) != 0x80 )
		break;

	c = ( c &lt;&lt; 6 ) + s[i];
}

s += i;
if ( i != tail )
	return ( Replacement );

const UTF8Lookup &amp;lookup = UTF8Lookups[ tail ];
c -= ( lookup.mMagicSubtraction );
if ( c &lt; lookup.mOverlongMinimum )
	return ( Replacement );
</pre>
<h4>Tail Byte Error Bits</h4>
<p>You may have noticed that we are checking every single tail byte to see if it is well formed ( *s &amp; 0xc0 != 0&#215;80 ). Even if we used a switch on <em>tail</em> to unroll our loop, we would still need to have all the conditionals. The Tail Byte Error Bits trick is what I came up with to remove the checks.</p>
<p>If we have a lookup table that has 1 for invalid tail bytes and 0 for valid bytes, we can accumulate these values in a mask variable which will be non-zero at the end of our decoding loop if any of the tail bytes were invalid. Further more, if we accumulate them so that ( mask = ( mask &lt;&lt; 1 ) | UTFInvalidTailBytes[ s[i] ] ), we can also tell which was the first invalid tail byte by looking at which bits in <em>mask</em> are set. This allows us to back the source pointer up to the last valid byte.</p>
<p><em>Feb. 27, 2008</em>: Re-looking at this, I realized you can shrink the table to just 4 values since only the top 2 bits are being checked. This leaves us with ( mask = ( mask &lt;&lt; 1 ) | UTFInvalidTailBits[ s[i]&gt;&gt;6 ] ).</p>
<pre name="code" class="cpp">

const unsigned char UTF8InvalidTailBits[4] = {
	1,1,0,1,
};

const unsigned int UTF8InvalidOffset[32] = {
	0,1,2,2,3,3,3,3,4,4,4,4,4,4,4,4,
	5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,5,
};
</pre>
<pre name="code" class="cpp">

unsigned int mask = ( 0 );
for ( unsigned int i = 0; i &lt; tail; ++i ) {
	c = ( c &lt;&lt; 6 ) + s[i];
	mask = ( mask &lt;&lt; 1 ) | UTF8InvalidTailBits[ s[i]&gt;&gt;6 ];
}

s += tail;
if ( mask ) {
	s -= UTF8InvalidOffset[ mask ];
	return ( Replacement );
}

const UTF8Lookup &amp;lookup = UTF8Lookups[ tail ];
c -= ( lookup.mMagicSubtraction );
if ( c &lt; lookup.mOverlongMinimum )
	return ( Replacement );
</pre>
<h4>Finishing Up</h4>
<p>If we made it this far, we can be sure that our UTF-8 sequence is valid. Well, <em>almost</em> valid. There are still certain UTF-32 values that, even when properly encoded in UTF-8, are still illegal, and must be checked for if you want a robust converter.</p>
<p>The offending values:</p>
<div>
<blockquote>
<ul>
<li><strong>d800-dfff</strong>: UTF-16 uses d800-dfff to encode it&#8217;s surrogate pairs, i.e. values that don&#8217;t fit in to 16 bits. This means UTF-8/UTF-32 are not allowed to encode these values.</li>
<li><strong>fdd0-fdef</strong>: This range was added to make your life more difficult.</li>
<li><strong>xfffe-xffff</strong>: x ranges from 0 to 10 (hex), so the values to check for are fffe-ffff, 1fffe-1ffff, etc.</li>
<li><strong>&gt; 10ffff</strong>: 10ffff is the highest Unicode codepoint.</li>
</ul>
</blockquote>
</div>
<p>The first two ranges can be checked using a subtraction and a compare instead of two compares, a trick I got from uClibc&#8217;s <a href="http://www.uclibc.org/cgi-bin/viewcvs.cgi/trunk/uClibc/libc/misc/wchar/wchar.c">wchar.c</a>. The third range can be checked with a simple and.. and compare. The last range (10ffff) is a simple compare.</p>
<pre name="code" class="cpp">

bool InRange( UTF32 c, UTF32 lo, UTF32 hi ) { return ( (UTF32 )( c - lo ) &lt; ( hi - lo + 1 ) ); }

bool IsSurrogate( UTF32 c ) { return ( InRange( c, 0xd800, 0xdfff ) ); }
bool IsNoncharacter( UTF32 c ) { return ( InRange( c, 0xfdd0, 0xfdef ) ); }
bool IsReserved( UTF32 c ) { return ( ( c &amp; 0xfffe ) == 0xfffe ); }
bool IsOutOfRange( UTF32 c ) { return ( c &gt; 0x10ffff ); }
</pre>
<p>This can be further simplified if you want to make use of a rather excessive lookup table. If you initialize a 32k entry lookup table like so, you can check any value under 10000 with a simple lookup.</p>
<pre name="code" class="cpp">

for ( UTF32 c = 0; c &lt; 0x10000; ++c )
	BMPInvalid[ c &gt;&gt; 1 ] = ( IsSurrogate( c ) || IsNonCharacter( c ) || IsReserved( c ) );

bool IsInvalidBMP( UTF32 c ) { return ( BMPInvalid[ c &gt;&gt; 1 ] ); }
</pre>
<h4>Cheating</h4>
<p>Ok, all we have to do is paste this all together and we&#8217;ll have a lightning fast UTF-8 to UTF-32 converter, right? Well.. consider the following example: You have a 1000k UTF-8 file which is 90% ASCII that you want converted to UTF-32. Would you rather convert it with a function that takes 30 cycles for an ASCII character and 50 cycles for a multibyte character, or a function that takes 10 cycles for an ASCII character and 90 cycles for a multibyte character?</p>
<p>If you have an input agnostic converter and/or unlucky compiler optimizations, your conversion function can end up looking something like:</p>
<pre name="code" class="cpp">

mov esi, [ start ]
mov edi, [ dest ]
cmp esi, [ end ]
jge finished
loop:
	movzx eax, byte ptr [ esi ]
	inc esi
	cmp eax, 80
	jle ascii

	...130 bytes of code to handle multi-byte encodings...

ascii:
	mov [ edi ], eax
	add edi, 4
	cmp esi, [ end ]
	jl loop
finished:
</pre>
<p>Yuck! Instead of a nice tight loop for the easy ASCII case, this will most likely crap all over the cache and slow you down. If you&#8217;re especially unlucky when trusting your compiler to handle templates and functions that should <em>obviously</em> be inlined, you&#8217;ll even end up with a call or two <strong>per character</strong>. Doing a little optimization and tweaking can result in code like:</p>
<pre name="code" class="cpp">

while ( start &lt; end ) {
	while ( ( *start &lt; 0x80 ) &amp;&amp; ( start &lt; end ) ) {
		*to++ = *start++;
	}
	if ( start &lt; end ) {
	...
	}
}
</pre>
<pre name="code" class="cpp">

mov esi, [ start ]
mov edi, [ dest ]
cmp esi, [ end ]
jge finished
loop:
	movzx eax, byte ptr [ esi ]
	inc esi
	cmp eax, 80
	jge notascii
	mov [ edi ], eax
	add edi, 4
	cmp esi, [ end ]
	jl loop
	jmp finished

notascii:

	...130 bytes of code to handle multi-byte encodings...

	mov [ edi ], eax
	add edi, 4
	cmp esi, [ end ]
	jl loop
finished:
</pre>
<p>This function should cut through ASCII oriented UTF-8 like butter, even if the multi-byte handling is a little slower than a more optimized converter. This code re-working may have little to no gain if your input is highly varied, but if you have a good idea what you&#8217;ll be facing, it may be worth it to tweak your functions to the data.</p>
<h4>5 and 6 byte sequences</h4>
<p>The original UTF-8 specification allowed for 5 and 6 byte sequences (up to 31 bits of data), however, only up to 4 byte sequences are valid under <a href="http://www.faqs.org/rfcs/rfc3629.html">RFC 3629</a>. So what do you do with 5 and 6 byte sequences? You can interpret the entire sequence and dump it as a single invalid character, or dump an invalid character for every byte in the sequence. Since the lead byte for 5 and 6 byte sequences (f5-fd) will never appear in <strong>any</strong> 4 byte or shorter sequence, interpreting the sequences as a single (invalid) character appears to make the most sense:</p>
<ul>
<li>If you are not actually processing UTF-8 text, or your input is corrupted, it won&#8217;t matter how you interpret them as any interpretation will produce garbage</li>
<li>If you <em>are</em> processing valid UTF-8 text, they can only appear due to an intentional 5 to 6 byte sequence. While illegal, it still represents a single character, not 5 to 6 invalid characters.</li>
</ul>
<h4>Fin</h4>
<p>dreamprojections&#8217; wonderful <a href="http://code.google.com/p/syntaxhighlighter/">Syntax Highlighter</a> was a contributing factor to the length of this post.</p>
<img alt="" border="0" src="http://feeds.wordpress.com/1.0/categories/floodyberry.wordpress.com/11/" /> <img alt="" border="0" src="http://feeds.wordpress.com/1.0/tags/floodyberry.wordpress.com/11/" /> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/floodyberry.wordpress.com/11/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/floodyberry.wordpress.com/11/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/floodyberry.wordpress.com/11/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/floodyberry.wordpress.com/11/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/floodyberry.wordpress.com/11/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/floodyberry.wordpress.com/11/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/floodyberry.wordpress.com/11/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/floodyberry.wordpress.com/11/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/floodyberry.wordpress.com/11/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/floodyberry.wordpress.com/11/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=11&subd=floodyberry&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://floodyberry.wordpress.com/2007/04/14/utf-8-conversion-tricks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/06f8635e391015012b8eb663e26a0352?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">floodyberry</media:title>
		</media:content>
	</item>
		<item>
		<title>Breaking SuperFastHash</title>
		<link>http://floodyberry.wordpress.com/2007/03/29/breaking-superfasthash/</link>
		<comments>http://floodyberry.wordpress.com/2007/03/29/breaking-superfasthash/#comments</comments>
		<pubDate>Thu, 29 Mar 2007 08:31:40 +0000</pubDate>
		<dc:creator>floodyberry</dc:creator>
				<category><![CDATA[Hashing]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[hash]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://floodyberry.wordpress.com/?p=7</guid>
		<description><![CDATA[After the problems SuperFastHash had in Hash Algorithm Attacks, I decided to try and break it completely, i.e. generate collisions algorithmically instead of brute forcing them. The attempt was more successful than I had anticipated, although Paul is obviously aware of the weak mixing in the final bits as evidenced by his comment in the [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=7&subd=floodyberry&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>After the problems <a href="http://www.azillionmonkeys.com/qed/hash.html">SuperFastHash</a> had in <a href="http://www.team5150.com/~andrew/blog/2007/03/hash_algorithm_attacks.html">Hash Algorithm Attacks</a>, I decided to try and break it completely, i.e. generate collisions algorithmically instead of brute forcing them. The attempt was more successful than I had anticipated, although Paul is obviously aware of the weak mixing in the final bits as evidenced by his comment in the source code, &#8220;Force &#8216;avalanching&#8217; of final 127 bits&#8221;. My favorite collisions encountered would have to be &#8220;10/4 &lt; 3&#8243;, &#8220;10/5 = 2&#8243;, and &#8220;10/6 &gt; 1&#8243;, which have the property of hashing to the same value while being mathematically correct!</p>
<p>As I was writing this, I came up with a way to attack Bob Jenkins&#8217; <a href="http://www.burtleburtle.net/bob/hash/doobs.html">lookup3</a> as well. Unlike SuperFastHash, the lookup3 attack is due to the way the input bytes are being read in and does not indicate a deficiency in the mixing itself. If you are using lookup3 with hash tables, the core function will still be quite safe; it will only need to be modified if you are using it to generate unique 64bit identifiers <em>and</em> the input data could be altered for a malicious purpose.</p>
<p>With that said, let&#8217;s look at the attacks:</p>
<p><span id="more-7"></span></p>
<h4>SuperFastHash</h4>
<p>To begin with, I separated the innerloop in to its 4 steps:</p>
<pre name="code" class="cpp">
1:	hash  += get16bits (data);
2:	tmp    = (get16bits (data+2) &amp;lt;&amp;lt; 11) ^ hash;
3:	hash   = (hash &amp;lt;&amp;lt; 16) ^ tmp;
4:	hash  += hash &amp;gt;&amp;gt; 11;
</pre>
<p>and then ran a couple of known duplicate strings through and printed out the value at each step (input bytes are only consumed in steps 1 and 2):</p>
<blockquote class="fixed-width"><div>
&#8220;aaaaaaaa&#8221; vs &#8220;aaadadaf&#8221;</p>
<p><strong>aa</strong> 1: 00006169 <strong>aa</strong> 2: 030b6969 3: 62626969 4: 626eb5b6   <strong>aa</strong> 1: 626f<strong>1717</strong> <strong>aa</strong> 2: <strong>61641f17</strong> 3: 76731f17 4: 7681ed7a</p>
<p><strong>aa</strong> 1: 00006169 <strong>ad</strong> 2: 03236969 3: 624a6969 4: 6256b2b6   <strong>ad</strong> 1: 6257<strong>1717</strong> <strong>af</strong> 2: <strong>61641f17</strong> 3: 76731f17 4: 7681ed7a</p>
<p>&#8220;ifkzihfe&#8221; vs &#8220;igdqhtfp&#8221;</p>
<p><strong>if</strong> 1: 00006671 <strong>kz</strong> 2: 03d33e71 3: 65a23e71 4: 65aef2b8   <strong>ih</strong> 1: 65af<strong>5b21</strong> <strong>fe</strong> 2: <strong>66846b21</strong> 3: 3da56b21 4: 3dad1fce</p>
<p><strong>ig</strong> 1: 00006771 <strong>dq</strong> 2: 038b4771 3: 64fa4771 4: 6506e6b9   <strong>ht</strong> 1: 6507<strong>5b21</strong> <strong>fp</strong> 2: <strong>66846b21</strong> 3: 3da56b21 4: 3dad1fce</p>
</div>
</blockquote>
<p>We can see that after round 2 step 1 (2:1) the lower 16 bits of <em>hash</em> are the same, and that after (2:2) all 32 bits of <em>tmp</em> are the same. Looking at step 3 reveals how this creates a collision: <em>hash = (hash &lt;&lt; 16) ^ tmp;</em>. The upper 16 bits of hash are thrown away, leaving the lower 16 bits (which are the same in the observed collisions) and 32 bits of tmp (which is the same in the observed collisions). Thus if after an initial 4 bytes we can find 2 bytes that create the same lower 16 bits in <em>hash</em> at (2:1), and then find the remaining 2 bytes that create the same 32 bits in <em>tmp</em> at (2:2), we have generated a colliding 8 byte string.</p>
<p>This can be further generalized to any string that&#8217;s a multiple of 8 bytes. If, from an initial hash value of X (the length of the string for SuperFastHash), you can find 8 bytes that collide with hash Y, then you simply use Y as the initial hash value and re-work the problem for the next 8 bytes, and so on. This works because SuperFastHash has very poor mixing for last 8 bytes that have been hashed and does not use it&#8217;s avalanching fix-up until it has reached the end of the input.</p>
<h5>Attack</h5>
<p>The attack was fairly straightforward once I identified what needed to be done. After a couple of revisions and refinements, I came up with:</p>
<ol>
<li>Take a source string S that has a length which is a multiple of 8 bytes</li>
<li>Hash S and for each 8 byte sequence, find the values of <em>hash</em> and <em>tmp</em> at steps (2:1) and (2:2)</li>
<li>Hash every permutation of the first 4 bytes of your attack string:
<ul>
<li>Find bytes 5 and 6 for each permutation which generate the lower 16 bits for the constant in (2:1).</li>
<li>Finally find bytes 7 and 8 for which generate the constant in (2:2). If this succeeds, either print the hit or process another 8 bytes until you reach the target length for your source string</li>
</ul>
</li>
</ol>
<p>This will probably not uncover every possible duplicate, especially for keys longer than 8 bytes, but it ran fast enough and generated enough duplicates that I did not need to refine it further.</p>
<h5>Results</h5>
<p>When using an 8 byte source string with all possible characters as input ( bytes 0&#215;00 to 0xff ), <strong>~130,000,000</strong> colliding strings can be found in around 3 minutes. When you restrict the character set to printable charactersI (symbols, numbers, uppercase, and lowercase), anywhere from 10,000 to 200,000 (possibly higher) can be found in a few seconds.</p>
<p>Using a random initial value (keying the hash) reduces the collisions, but does not alleviate them entirely. For example, 108,600 strings were generated that hash to the same value as &#8220;zzzzzzzz&#8221;. When run through a hashtable insertion test with 131072 buckets using &#8220;0xdeadbeef + length&#8221; as the initial value instead of just &#8220;length&#8221;, there were still 3,374,795 compares due to full hash collisions, and the largest bucket had 376 links. By comparison, Bob Jenkins&#8217; lookup3 had 1 compare and 8 links in it&#8217;s largest bucket, and the x31 variant of the Torek/DJB hash had 0 compares and 6 links in it&#8217;s largest bucket. A small win is that increasing the key length does reduce the collisions with a keyed hash: 100,000 strings hashing to the same value as &#8220;zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz&#8221; resulted in only 500,000 compares with the largest bucket containing 158 links.</p>
<h4>lookup3</h4>
<p>The lookup3 attack is very simple and I actually feel dense for not having seen it until now. This is the basic 32bit little endian algorithm (32bit aligned reads) with input lengths assumed to be 24 or larger and a multiple of 12:</p>
<pre name="code" class="cpp">
1:	a = b = c = ( 0xdeadbeef + len )

2:	for ( ; len &amp;gt; 12; len -= 12, key += 12 ) {
		a += ( key[0] )
		b += ( key[4] )
		c += ( key[8] )
		mix( a, b, c )
	}

3:	c += ( key[8] )
	b += ( key[4] )
	a += ( key[0] )

4:	final( a, b, c )

	return ( c );
</pre>
<p>mix() and final() mix every bit in a, b, and c very thoroughly, so trying to attack lookup3 algorithmically would be quite futile (for me). However, because every 12 bytes, or 96 bits, of input directly alters every bit of a, b, and c, there is a shortcut: </p>
<ol>
<li>Take any input string and run it through lookup3; record the values of a, b, and c after Step 3</li>
<li>Take your attack string and pad it to be a multiple of 12 bytes long, then add an additional 12 bytes</li>
<li>Run your attack string through lookup3 and stop after Step 2. We should now have the additional 12 bytes remaining to hash</li>
<li>Construct the final 12 bytes so that key[ 8 ] = ( target.c &#8211; c ), key[ 4 ] = ( target.b &#8211; b ), and key[ 0 ] = ( target.a &#8211; a ). The internal state will now match that of the target string before the call to final(), guaranteeing a collision</li>
</ol>
<p>Now, this attack relies completely on knowing what a, b, and c are initialized to at the start of the hash. If you are using lookup3 for hash tables, you should already be initializing a, b, and c to random values to defeat bucket attacks, i.e. attacks searching for keys where the lower 15 to 20 bits match. Even the best algorithm is vulnerable to bucket attacks, so choosing a random initial value should be mandatory no matter what.</p>
<p>However, if you&#8217;re using lookup3 for something like a 64bit unique id or a file checksum, your initial state will need to be static and thus open to attack with this method. I&#8217;m not sure what you could do to get around this safely while retaining lookup3&#8217;s high speed; adding another 32 bits of state that wasn&#8217;t directly altered by the input may help, but where to stick it and how to mix it in isn&#8217;t something I couldn&#8217;t guess at. I&#8217;ve emailed Bob, but I don&#8217;t know if deliberate attacks like this are something he is concerned with. The lookup3 source does state <em>Use for hash table lookup, or anything where one collision in 2^^32 is acceptable.  Do NOT </em>use for cryptographic purposes., so we&#8217;ll see.</p>
<h4>Conclusions</h4>
<p>I already knew SuperFastHash had some peculiar results from my previous tests, and the outcome of this experiment drove the point home. While the offending collisions will not be common (what pathological input is?), the fact that they exist, were so readily obtained, and were still somewhat evident when changing the initial value of the hash suggests that it is probably best to start over from scratch. I should have noticed a problem sooner when SuperFastHash was running in to light collisions in <a href="http://www.team5150.com/~andrew/blog/2007/03/when_bad_hashing_means_good_caching.html">When bad hashing means good caching</a>, and that wasn&#8217;t even an intentional attack. </p>
<p>As far as lookup3, the trivial collisions are disturbing, but they are <em>only</em> a problem if an attacker can craft input for the hash function, and then <em>only</em> when you are using it to generate unique ids. It would be nice to see the issue addressed, though, as lookup3 is still unrivaled in terms of mixing and actually faster than SuperFastHash in some cases.</p>
<h4>Source</h4>
<p>You may download the <a href="http://www.team5150.com/~andrew/blog/files/superfasthash_attack.cpp">SuperFastHash attack source</a> and try it out for yourself. It comes preloaded with attacks against &#8220;********&#8221;, yielding 194 collisions, and &#8220;10/5 = 2&#8243;, yielding my favorite set of collisions. If you want to run it with all possible characters as input, make sure to turn the debug printf off or redirect output to a file; it can take a while to end process a program that has a lot of \x07 bells queued up.</p>
<img alt="" border="0" src="http://feeds.wordpress.com/1.0/categories/floodyberry.wordpress.com/7/" /> <img alt="" border="0" src="http://feeds.wordpress.com/1.0/tags/floodyberry.wordpress.com/7/" /> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/floodyberry.wordpress.com/7/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/floodyberry.wordpress.com/7/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/floodyberry.wordpress.com/7/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/floodyberry.wordpress.com/7/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/floodyberry.wordpress.com/7/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/floodyberry.wordpress.com/7/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/floodyberry.wordpress.com/7/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/floodyberry.wordpress.com/7/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/floodyberry.wordpress.com/7/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/floodyberry.wordpress.com/7/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=floodyberry.wordpress.com&blog=4164432&post=7&subd=floodyberry&ref=&feed=1" /></div>]]></content:encoded>
			<wfw:commentRss>http://floodyberry.wordpress.com/2007/03/29/breaking-superfasthash/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/06f8635e391015012b8eb663e26a0352?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">floodyberry</media:title>
		</media:content>
	</item>
	</channel>
</rss>