evilrix

Bloom Filters

evilrix — Tue, 17 Oct 2017 16:31:02 +0000

In general, the worst case scenario when searching through a data set is when the datum being searched for doesn’t exist. In this case, the complete data storage needs to be searched before it’s possible to conclude that the datum cannot be found. If only there were a way to eliminate the need to perform unnecessary searches when we know the data won’t be found. Fortunately, there is a data structure that allows you to do just that. This data structure is known as a “Bloom Filter”.

So, how does this witchcraft work, I hear you ask? Well, a Bloom Filter is a probabilistic data structure that can tell you if an item definitely doesn’t exist in a data set. What it can’t tell you, with any degree of certainty, is if an item does exist. This clever data structure was invented by a chap called Burton Howard Bloom circa 1970. The power of a Bloom Filter is that checking it for the existence of a datum is significantly faster than checking a complete data store. For those who are into Big O notation, the time complexity for performing a check on a Bloom Filter is O(k), where k is the number of hash functions used and bits set (see below).

The fact we can know for definite that an item doesn’t exist in a data store means we don’t have to waste our time searching for something we’re definitely not going to find. Unfortunately, we can only know for sure that data doesn’t exist. We can get false positive hits, which means the data might exist and in these cases a search of the data store is still necessary. With careful usage of a Bloom Filter, we can avoid performing expensive searches if we know the data won’t be found. Neat, eh?

The principle of how a Bloom Filter works is quite simple. When an item is “added” to the Bloom Filter, a statistically unique “bit pattern” is generated using a series of hash functions (one hash function for each bit), and this pattern is then written into a single bit vector (the same bit vector is used to store all bit patterns, hence the possibility of a false positive). When checking to see if an item has been added to the Bloom Filter, we check to see if the same bit pattern exists in the bit vector. If it doesn’t, then we know the item was never added to the filter and so won’t be found in the data store. If we find a matching bit pattern, the item may exist in the data store and so a full search is required.

The reason we can’t know for sure that an item was added is that, over time, as more items are added to the filter, there is a chance of a collision on the specific bit pattern for a particular item, because one or more other items may have set the same bits (either singly or as a group). This means we cannot say for sure that an item does exist, only that it doesn’t: if any bit in the pattern for a particular item isn’t set then it cannot exist.

Let’s consider an example. Let’s assume we have a very simple Bloom Filter that is using a 16 bit filter (normally we’d use many more bits than this). We’re going to add three numbers, 21, 34 and 57, and each number will set 3 bits:


+------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| bits | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
+------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|  21  |   |   | X |   |   |   |   | X | X |   |   |   |   |   |   |   |
+------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|  34  |   |   |   |   | X |   |   |   |   | X |   |   | X |   |   |   |
+------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|  57  |   |   |   |   |   |   |   |   | X |   |   | X |   |   |   | X |
+------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Now, let’s assume we want to see if number 85 is in the set. This generates the following bit pattern:


+------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| bits | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
+------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|  85  |   |   | X |   |   |   |   |   |   |   |   |   | X |   | X |   |
+------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Bit 2 is set in the bit vector (by 21) and bit 12 (C) is set (by 34), but bit 14 (E) is not set and so we know for certain 85 was never added; we can be sure it doesn’t exist in the data set.

Now, let’s do the same for 91.


+------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| bits | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
+------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|  91  |   |   |   |   |   |   |   | X |   |   |   | X |   |   |   | X |
+------+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

In this case, we have a collision of bits that just happen to match with numbers 21 and 57, which means 91 might exist in the data set and so a full search of the data store is necessary. Notice that we’ve never actually added 91, but because we have a collision caused by bits set when adding 21 (bit 7) and 57 (bits 11 and 15, shown as B and F) we cannot rule out that 91 might exist, hence we have no choice but to search the data store to see if this number exists or not.
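The worked example above can be reproduced in a few lines. This is only a sketch: the bit positions are hard-coded from the tables rather than produced by real hash functions, and the names tiny_filter, add, might_contain and make_example_filter are made up for illustration.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>

// A toy 16 bit filter. Each item is represented by the three bit
// positions shown in the tables above (hard-coded, not hashed).
using pattern = std::array<std::size_t, 3>;

class tiny_filter
{
public:
   void add(pattern const & p)
   {
      for(auto bit : p) { bits_.set(bit); }
   }

   // false: definitely never added. true: possibly added.
   bool might_contain(pattern const & p) const
   {
      for(auto bit : p)
      {
         if(!bits_.test(bit)) { return false; }
      }

      return true;
   }

private:
   std::bitset<16> bits_;
};

// Builds the filter from the example, with 21, 34 and 57 added.
inline tiny_filter make_example_filter()
{
   tiny_filter f;
   f.add({{0x2, 0x7, 0x8}}); // 21
   f.add({{0x4, 0x9, 0xC}}); // 34
   f.add({{0x8, 0xB, 0xF}}); // 57
   return f;
}
```

With that filter, the pattern for 85 (bits 2, C and E) is rejected, whereas the pattern for 91 (bits 7, B and F) is reported as possibly present even though 91 was never added.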

This is why a Bloom Filter is probabilistic. We can say an item definitely doesn’t exist or it might exist. This doesn’t help us in the case where data might exist, but it can save us performing unnecessary expensive searches when we know the data definitely doesn’t exist. That’s the job of a Bloom Filter, to allow us to identify those cases where data definitely doesn’t exist, hopefully saving us from performing an expensive search.

Now, for a Bloom Filter to be useful, we want to avoid as many collisions as we can and there is a trade off between the number of bits we set when adding a value (a Bloom Filter can work with any data, not just numbers), the size of the bit vector and the number of values we add. Up to a point, the more bits you set per item the less chance of a false positive; however, every extra bit also fills the vector a little faster, and the smaller your bit vector the quicker the space will become polluted and so the more chance of a false positive.

There are standard approximations for sizing a filter, which assume well-behaved hash functions: for n items and a target false positive rate p, a bit vector of about m = -n ln(p) / (ln 2)^2 bits with about k = (m / n) ln 2 hash functions is a good starting point. Real data and real hash functions rarely behave ideally, though, so some trial and error is still necessary. Experiment with a representative sample of your data set to try and find the ideal number of hashes vs. the size of your bit vector. As a rule of thumb, the more data you inject into the filter and the more bits you set, the larger the bit field needs to be to avoid collisions.
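As a starting point for that experimentation, the textbook sizing approximations (m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions) can be sketched as below. The names filter_size and suggest_size are hypothetical, and the results assume independent, evenly distributed hash functions, so treat them as a first guess rather than an answer.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

struct filter_size
{
   std::size_t bits;   // m: size of the bit vector
   std::size_t hashes; // k: number of hash functions
};

// expected_items (n, must be > 0) and false_positive_rate (p, 0 < p < 1)
// are values you choose. The formulas are the standard approximations
// and assume well-behaved, independent hash functions.
inline filter_size suggest_size(std::size_t expected_items,
                                double false_positive_rate)
{
   double const ln2 = std::log(2.0);
   double const n = static_cast<double>(expected_items);

   // m = -n * ln(p) / (ln 2)^2, rounded up to a whole bit
   double const m =
      std::ceil(-n * std::log(false_positive_rate) / (ln2 * ln2));

   // k = (m / n) * ln 2, rounded to the nearest whole hash (at least 1)
   double const k = std::max(1.0, std::round(m / n * ln2));

   return { static_cast<std::size_t>(m), static_cast<std::size_t>(k) };
}
```

For 1,000 items and a 1% false positive target this suggests a little under 9,600 bits and 7 hash functions, which is a lot more than the 16 bits used in the example above.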

To set the bits when adding a new item to the Bloom Filter, it is necessary to use different hash functions, one for each bit. Each hash function generates its own statistically unique hash value for the datum, and the bit position is then obtained by taking that hash modulo the number of bits in the bit vector. Each hash function should generate a different hash value, thus (usually) setting a different bit. For example, by using three different hash functions we can set up to three different bits.

Implementing a Bloom Filter in C++ is pretty simple, with the tricky part being deciding which hash functions to use. This is a decision that needs to be made during the implementation of the Bloom Filter and it’s a good idea to test a number of different hash functions to ensure they give an even distribution of bits within the bit vector for the type of data you plan to add to the filter. You can filter any type of data you like; the only requirement is that the data is suitable for hashing.

Once you’ve identified suitable hash functions, it’s just a case of deciding how large your bit vector needs to be and then storing the bit patterns for each datum added into the bit vector. When performing a lookup in the filter we just need to re-generate the bit pattern and see if it exists in the bit vector. If it doesn’t, then the datum does not exist in our storage; if it does, then the datum might exist in the storage and so a full search will be necessary.

The full code for a very simple Bloom Filter can be found below. Although there is quite a lot of code, most of this is just the implementation of a few simple hash functions. The main guts of the Bloom Filter (see the bloom_filter class) is actually very straightforward. In this implementation, the hash functions are passed into the Bloom Filter’s constructor, which means the filter can use any number of hash functions. The size of the bit vector is fixed at 128, but this can be any size and can even be decided at run-time. The only requirement is that it’s bigger than the number of hash functions. The actual size is dependent on how much data you plan to add to the filter.

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
// Original hash function implementations and descriptions can be found here:
// http://www.eternallyconfuzzled.com/tuts/algorithms/jsw_tut_hashing.aspx
// =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

class hash
{
public:
   virtual ~hash() = default;

   virtual uint32_t operator() (void const * key, size_t len) const = 0;

   uint32_t operator() (std::string const & s) const
   {
      return (*this)(s.c_str(), s.size());
   }
};

// =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

class djb_hash : public hash
{
public:
   uint32_t operator() (void const * key, size_t len) const override
   {
      auto p = reinterpret_cast<unsigned char const *>(key);
      auto h = uint32_t(0);

      for(auto i = size_t(0); i < len; i++)
      {
         h = 33 * h + p[i];
      }

      return h;
   }

   using hash::operator();
};

// =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

class sax_hash : public hash
{
public:
   uint32_t operator() (void const * key, size_t len) const override
   {
      auto p = reinterpret_cast<unsigned char const *>(key);
      auto h = uint32_t(0);

      for(auto i = size_t(0); i < len; i++)
      {
         h ^= (h << 5) + (h >> 2) + p[i];
      }

      return h;
   }

   using hash::operator();
};

// =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

class fnv_hash : public hash
{
public:
   uint32_t operator() (void const * key, size_t len) const override
   {
      auto p = reinterpret_cast<unsigned char const *>(key);
      uint32_t h = 2166136261;

      for(auto i = size_t(0); i < len; i++)
      {
         h = (h * 16777619) ^ p[i];
      }

      return h;
   }

   using hash::operator();
};

// =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

class oat_hash : public hash
{
public:
   uint32_t operator() (void const * key, size_t len) const override
   {
      auto p = reinterpret_cast<unsigned char const *>(key);
      auto h = uint32_t(0);

      for(auto i = size_t(0); i < len; i++)
      {
         h += p[i];
         h += (h << 10);
         h ^= (h >> 6);
      }

      h += (h << 3);
      h ^= (h >> 11);
      h += (h << 15);

      return h;
   }

   using hash::operator();
};

// =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
// The actual Bloom Filter
class bloom_filter
{
public:
   static size_t const bit_count = 128;

   bloom_filter(std::vector<std::shared_ptr<hash>> && hashers)
      : hashers_(std::move(hashers))
      , bits_(bit_count)
   {
   }

   void add(std::string const & s)
   {
      for(auto const & hasher : hashers_) // by reference: avoids copying each shared_ptr
      {
         size_t idx = (*hasher)(s) % bit_count;
         bits_[idx] = true;
      }
   }

   // false means "definitely not added"; true means "possibly added"
   bool exists(std::string const & s) const
   {
      for(auto const & hasher : hashers_)
      {
         size_t idx = (*hasher)(s) % bit_count;
         if(!bits_[idx])
         {
            return false;
         }
      }

      return true;
   }

private:
   std::vector<std::shared_ptr<hash>> hashers_;
   std::vector<bool> bits_;

};

// =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

int main()
{
   auto && bfilt = bloom_filter{
      std::vector<std::shared_ptr<hash>>{
         std::make_shared<djb_hash>(),
         std::make_shared<sax_hash>(),
         std::make_shared<fnv_hash>(),
         std::make_shared<oat_hash>(),
      }
   };

   bfilt.add("hello");
   bfilt.add("world");

   // these should exist
   assert(bfilt.exists("hello"));
   assert(bfilt.exists("world"));

   // these should not exist
   assert(!bfilt.exists("foobar"));
   assert(!bfilt.exists("eggplant"));
}

When is Unicode not Unicode? When Microsoft gets involved!

evilrix — Sat, 09 May 2015 18:15:48 +0000

Windows programmers of the C/C++ variety, how many of you realise that since Windows 9x Microsoft has been lying to you about what constitutes Unicode? They will have you believe that Unicode requires you to use a WCHAR (wide) character type and that Unicode cannot be represented by a CHAR (narrow) character type. In fact, both of these statements are completely and utterly false. Microsoft has misled you in the most egregious way.

Before we go any further, I need to clarify some terminology that is often confused. This is especially true of Windows programmers who, quite often, mistakenly believe that using a wide character type means they are using Unicode:

Character Set: This is a complete set of characters recognized by the computer hardware and software.

Character Encoding: This is a way of encoding a character set, generally to fit within the boundaries of a particular data type. ASCII, ANSI and UTFx are all examples of character encodings.

Character Type: This is a fundamental data type used to represent a character.

These three things are intrinsically related. The character type chosen to represent a character set will have a direct impact on the character encoding used. In C++, the normal fundamental character types are either wchar_t (wide) or char (narrow). The sizes of the narrow and wide types are platform dependent, although C++11 has introduced fixed-size character types (char16_t and char32_t). For the purposes of this discussion, it being Windows centric, we will assume wide is 16 bit and narrow is 8 bit.

Unicode code points are, for all practical purposes, 32 bit: the code space currently stops at U+10FFFF, which needs 21 bits, but the smallest character type that can hold that is 32 bits wide. That’s it, the end. If you want to work with raw Unicode code points you have no choice but to use a 32 bit character type. That said, some very clever people who work for The Unicode Consortium realised that the majority of the western world uses the Latin alphabet and most of this can be represented using just 8 bits. The majority of the rest of the world uses characters that can be represented by 16 bits. The remainder of the world requires 32 bits. On that basis, forcing the world to adopt 32 bit character types would be, for most of us, completely insane. Of course, the same could be said for 16 bits… eh, Microsoft?

Those clever people went on to invent a number of Unicode Transformation Formats (character encodings) that would allow Unicode to be encoded using character types smaller than 32 bits. The most common of these are UTF16 and UTF8, although other less common encodings do exist. These are encoding formats that allow 32 bit code points to be represented as sequences of one or more 16 or 8 bit units. Of the two, UTF8 is by far the most efficient for the majority of cases and has the advantage of being directly backwards compatible with systems designed to only use ASCII (meaning all old programs will just work).
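To make the idea of a transformation format concrete, here is a minimal sketch of how a single code point becomes one to four UTF8 bytes. The function name encode_utf8 is made up for illustration, and the sketch assumes it is handed a valid code point (no validation of surrogates or out-of-range values is attempted).

```cpp
#include <cassert>
#include <string>

// Encodes one Unicode code point as UTF-8. Assumes cp is a valid scalar
// value (at most U+10FFFF and not a surrogate); no validation is done.
inline std::string encode_utf8(char32_t cp)
{
   std::string out;

   if(cp < 0x80) // 1 byte: plain ASCII, unchanged
   {
      out += static_cast<char>(cp);
   }
   else if(cp < 0x800) // 2 bytes: 110xxxxx 10xxxxxx
   {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
   }
   else if(cp < 0x10000) // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
   {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
   }
   else // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
   {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
   }

   return out;
}
```

Note that an ASCII character such as 'A' comes out as the same single byte it always was, which is exactly why old ASCII-only programs keep working, while the euro sign (U+20AC) becomes the three bytes E2 82 AC.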

Unfortunately, Microsoft decided to jump on the Unicode bandwagon without really thinking things through and, in their infinite wisdom, decided to adopt UTF16 as the standard encoding format for Unicode on the Windows platform. Frankly, this couldn’t have been a worse decision, and it is one that has plagued Windows programmers the world over ever since. The rest of the sane world realised that UTF16 was just stupid and decided to use UTF8. Amazingly enough, the rest of the world has no significant problems writing programs that will work with Unicode in a portable fashion. Windows on the other hand… um… no!

The reason UTF16 makes no sense is because not only is it very wasteful for the majority of us who just use plain old ASCII most of the time, it’s also a real pain to use, especially if you want to be able to generate data that is portable and can be used cross-platform. You see, each character in a UTF16 encoding is larger than a byte and so the storage and retrieval of text encoded in this format requires that the reader be able to identify and (if necessary) convert the encoding format to the correct endianness for the platform.

Also, most legacy programs are written using narrow character types and so to “port” these over to use Unicode means making major changes to the code base to use wide character types. Now, this might sound like a simple “search and replace” but it’s really not. In a language such as C or C++, where the programmer, and not the compiler, is completely responsible for preserving the integrity of the memory, introducing a larger data type without reviewing each and every change to make sure it doesn’t bust data boundaries is coding suicide. Basically, this decision to use UTF16 meant that all existing code had to be broken and then fixed to be international friendly. That is a huge cost to business, and so most just didn’t (and don’t) bother!

Further, regardless of what Microsoft would have you believe, UTF16 is still a multi-byte format because you can’t represent the full 32 bit code-point set using 16 bit types. Sure, the majority of usable code points will fit into 16 bits, but it is a lie to say that Unicode can be represented by single wide character types. It’s just impossible. 32 bit into 16 bits does not fit! A quart does not fit into a pint pot! The use of “Unicode” in Windows is a broken promise that just makes life oh so unnecessarily hard for the software engineer.
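The point is easy to demonstrate. The sketch below (encode_utf16 is a made-up name, and valid input is assumed) shows that any code point above U+FFFF has to be split across two 16 bit units, a so-called surrogate pair.

```cpp
#include <cassert>
#include <string>

// Encodes one code point as UTF-16 code units. Assumes cp is a valid
// scalar value (at most U+10FFFF and not itself a surrogate).
inline std::u16string encode_utf16(char32_t cp)
{
   std::u16string out;

   if(cp < 0x10000) // fits in a single 16 bit unit
   {
      out += static_cast<char16_t>(cp);
   }
   else // needs a surrogate pair
   {
      cp -= 0x10000;
      out += static_cast<char16_t>(0xD800 + (cp >> 10));   // high surrogate
      out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF)); // low surrogate
   }

   return out;
}
```

For example, U+1F600 comes out as the two units D83D and DE00, so code that assumes one wide character per code point is simply wrong.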

By contrast, other platforms (Linux, for example) use UTF8 natively. This means that all data can be stored and retrieved using narrow types. Because UTF8 is a byte level encoding format it has no sense of endianness and so is easy to port between different platforms. It is also way more efficient than UTF16 because, in the general case of only using the standard ASCII character set, each character requires only 1 byte (rather than 2) to be represented. Even for text that strays outside ASCII it’s normally still more efficient, because only the non-ASCII characters require more than one byte of encoding. UTF8 is a highly efficient encoding format, UTF16 is just not!

When Microsoft talks of Unicode please don’t be confused. They are NOT referring to Unicode, they are referring to the UTF16 encoding format. They do this because they want the world to believe that only 16 bits and UTF16 can be used to represent Unicode. They do this because they don’t want you to know how stupid they were to decide to use this pointless encoding format. Yes, this is a way to represent Unicode, but it is not the only way and from a software engineering point of view it is quite probably the stupidest way.

Further, whereas the rest of the sensible software engineering world uses UTF8 as a narrow character encoding format to represent Unicode, Microsoft insists on sticking with ANSI Code Pages. Unlike UTF8, these cannot represent the full range of the Unicode character set and, worse, unless you know the original code-page you have absolutely no idea what the encoding format actually represents. You may as well be working with a file of random binary, because that’s about as useful as an ANSI format file with no code-page information would be. It wouldn’t be so bad if Microsoft offered UTF8 as a native encoding format, but to date, this isn’t the case. It’s UTF16, ANSI or nothing!

So, Windows programmers, when you start talking about your project being “Unicode” please remember that to the rest of the sane world this phrase is meaningless. All you are saying is that your project uses wide rather than narrow data types for representing characters and you just so happen to have been fooled into using UTF16 when you could just as easily have used UTF8. That’s right, you don’t have to use UTF16, even in Windows, to be Unicode friendly; you can use UTF8. There, I said it. The secret is out! I always code all my projects using narrow character types and, internally, I work with UTF8. I only convert (on Windows) to UTF16 when I absolutely have to (at the system API boundary).

But why do this? Doesn’t that make life hard? Good question. Yes and no. Yes, because it means at some point I still have to convert to UTF16. No, because C++11 now provides nice, efficient tools to do this conversion process so it is pretty painless. What it does mean is that my code will work on any platform. By using a platform agnostic character encoding, which UTF8 is, it means my code will run just as well on Windows as it will on Linux or OS-X.

For more reading on why we should all be using UTF8, why forcing us to use UTF16 is just silly and why Microsoft owes us all a very large apology for the mess they have made of “Unicode” on Windows, I highly recommend taking a look at the excellent UTF8 Everywhere website.

C / C++ main function prototypes

evilrix — Sat, 09 May 2015 18:14:07 +0000

There seems to be some fundamental misunderstanding about the function prototype for the “main” function in C and C++. More specifically what type this function should return. I see so many programmers use void as the return type. People, I’m sorry to tell you but that’s just plain wrong!

The C/C++ standards are very (VERY) clear about the prototype for the main function. For portable code it should be one of the following two (and only the following two) formats; anything else is, at best, an implementation-specific extension:

int main(void) { /*...*/ }

and

int main(int argc, char *argv[]) { /*...*/ }

Any other prototype is ill-defined and, unless your particular implementation documents it, will result in undefined behaviour. Don’t be fooled into thinking that it must be okay to have void as the return type because, if it wasn’t, the compiler would chuck an error. Actually, the C / C++ standard does not require the compiler to do this. All the standard states regarding this matter is:

“If the main function executes a return that specifies no value, the termination status returned to the host environment is undefined.”

Now, the chances are that you’ll never see the effect of your mistake directly. By the time the brown stuff hits the fan, your program has likely ended. No, it won’t be you who gets caught out, it’ll be the user of your program who suffers at your hands.

You see, all processes return an exit code, which just happens to be the value returned from main. The actual value of this code is completely arbitrary but, nevertheless, the OS will expect it and the C/C++ runtime will deliver it – even if you haven’t provided it with one!

By de-facto standard, zero is returned for success and any other value means failure. If you define your function to return void, what will be returned will normally be the last value stored in the accumulator register. This is because the accumulator is normally used to store the return value of a function. The chances are that in our case this value will not be zero. In fact, it’ll be whatever was left in the accumulator when the program exited.

Anyone attempting to write a script to use your program is going to have a head-scratchingly hard time trying to work out why your program randomly seems to fail when they test the result code of your process. They will discover that it appears to be an arbitrary value. Guess what, it is! Your decision to completely ignore a simple standard will have properly ruined someone’s day.
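Doing it properly costs almost nothing. The sketch below keeps the program logic in a hypothetical run function (the name and its "fail when there are no arguments" rule are invented purely for illustration) and has main do nothing except hand the resulting status back to the OS.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical program logic, kept separate from main so it can be tested.
// Returns the status that main should hand back to the OS.
inline int run(std::vector<std::string> const & args)
{
   if(args.empty())
   {
      return EXIT_FAILURE; // nothing to do: report failure to the caller
   }

   return EXIT_SUCCESS; // conventionally zero
}

// main would then simply forward the status:
//
//    int main(int argc, char * argv[])
//    {
//       return run(std::vector<std::string>(argv + 1, argv + argc));
//    }
```

A script calling this program can now rely on the exit code meaning something, rather than it being whatever happened to be lying around in a register.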

Interestingly, whilst the main function MUST be defined to return an int, in C++ you don’t have to actually return anything from main. The main function is treated as a special case whereby, if you omit a return value, the C++ runtime will automatically return zero for you. In other words, the following is about the smallest and, yet, still perfectly valid (if pointless) C++ program one can possibly write.

int main() {}

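NB. This does NOT apply to the original C89/C90 standard, where you MUST always return a value from main. C99 and later adopted the same special case as C++, but if your code has to build with older C compilers you should still return a value explicitly.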

Whether you consider it good practice to have a function defined as returning an int that doesn’t actually return a specific value is entirely up to you, but in this one specific case you are allowed to violate what is normally a fundamental rule: functions defined as returning values MUST return values!

C++ Catching exceptions in constructors

evilrix — Sun, 03 May 2015 15:15:46 +0000

When deciding whether to catch an exception, especially one thrown during construction, ask yourself a very simple question: if you do catch it, can you still leave the class in a usable state? If the answer is yes, then by all means catch it, deal with it and allow construction to continue. If the answer is no then do not catch it! Why? Simple! If you catch and swallow it, how is the code that was attempting to construct your class to know that the class is unusable (or, at least, only partially constructed)?

You could add a method that returns a boolean value to identify that the class failed to construct, but that will only lead to more complications and will just make life hard for everyone: you and the user of your class. You’ll be giving them the ability to work with a partially constructed object and the chances are that isn’t a sane thing to do.

For example, what happens if the user tries to call a function on your partially constructed and, almost certainly, not-so-sane class? What if failure to construct has left the class with an unassigned member pointer and calling any or all of the member functions requires that pointer to be dereferenced? Bad things will happen. It will end in tears. Kaboom!

Your solution? You’ll now have to add even more code in each public function to check for this kind of issue to make sure the program doesn’t crash. Every new function you add will need this check. You’ll then have to decide how to deal with the function call if you can’t honour it due to the class being in an insane state. A much more sensible option is to just throw an exception and let it emit from the class, giving the user of your class the option to catch it and deal with the case that construction failed.

Okay, so you decide to let the exception emit, but how does that help? It’s pretty simple really when you think about it. The stack will unwind until it hits a catch handler. The object you tried to construct is now out of scope. This is because try blocks and catch blocks are separate scopes. At this point the object of your class cannot be used.

By simply throwing and allowing an exception to emit from the constructor of your class you are making it clear to the end user that construction has failed and the class must not be used. In fact, better: because emitting the exception causes the stack to unwind, all access to your class will now be out of scope, so the user can’t attempt to use the partially constructed object, even if they wanted to.

It should be noted that whilst you can catch and swallow an exception within the constructor of a class by creating a try/catch inside the function block, the same is not true if you use a function try/catch block. This is a special type of try/catch where the actual function body itself is a try/catch block. These are especially useful in constructor functions because, unlike try/catch blocks inside the constructor function body, these can catch exceptions that are thrown by class members whilst they are being initialised in the class’s constructor initialiser list.

In the case of this type of try/catch, even if you don’t explicitly re-throw a caught exception the compiler will do it for you. In other words, the compiler is telling you that you really shouldn’t be writing constructors that can fail silently. Constructors should be as light as possible and do the bare minimum to create a sane object. Any additional set-up should be the remit of an “initialise” function.
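Here is a minimal sketch of a constructor function try/catch block, using made-up Member and Widget classes. The handler sees the exception thrown from the initialiser list, but even though it contains no throw statement the exception still reaches the code that tried to construct the object.

```cpp
#include <cassert>
#include <stdexcept>

// A member whose construction can fail on demand.
struct Member
{
   explicit Member(bool fail)
   {
      if(fail) { throw std::runtime_error("member failed"); }
   }
};

class Widget
{
public:
   // Function try block: the try covers the initialiser list too.
   explicit Widget(bool fail)
   try
      : member_(fail)
   {
   }
   catch(std::runtime_error const &)
   {
      ++caught_in_ctor;
      // No explicit throw here, yet the exception is rethrown
      // automatically when this handler finishes.
   }

   static int caught_in_ctor; // counts visits to the handler above

private:
   Member member_;
};

int Widget::caught_in_ctor = 0;

// Returns true if the exception escaped the constructor.
inline bool construction_escapes(bool fail)
{
   try
   {
      Widget w(fail);
      return false;
   }
   catch(std::runtime_error const &)
   {
      return true;
   }
}
```

Calling construction_escapes(true) visits the handler inside the constructor once and then still ends up in the outer catch, which is the automatic re-throw in action.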

The difference between the constructor and the initialise function is that the constructor’s job is just to create a class instance that isn’t completely insane. For example, consider a bank account class. The constructor might initialise the balance to zero (not just let it take on some arbitrary value), whereas, the initialise function would then take arguments to actually set the balance of the account for the specific context of use.

This is a trivial example and probably not the best argument for using an initialise function, but it makes the point; constructors are just to create a sane object whilst initialisation functions are to then set up that sane object for actual use.

So far so good, but now here’s another question for you: if a class constructor emits an exception what happens when its destructor gets executed? For example, let’s say it throws half way through and there are still a number of pointers that are to have memory allocated to them that the destructor is responsible for releasing. How does the destructor know what to do?

Here’s the kicker – it doesn’t because there is no way it could! More importantly, because it doesn’t know it never even bothers trying. What? Yes, it’s true, if a class constructor throws and emits an exception the class destructor is never called. Think about it, how can the compiler destruct a class that was never actually constructed? It just wouldn’t make any sense.

Of course, this leads to a bit of a dilemma. What happens to the resources that were allocated before the exception was thrown? What releases them? Answer: nothing does! Or, more specifically, nothing can because the destructor never gets executed and by the time you know the class has failed to construct it has already gone out of scope. Basically, you have yourself a big fat resource leak. Life sucks, eh?

The solution is to use the RAII design pattern. It’s a fancy acronym (that stands for “Resource Acquisition Is Initialisation”) that basically means use smart pointers (or, at least use management classes for your resources). Although the destructor on the class that emits the exception is not called, any members that have been constructed up until that point do have their own destructors called. This means that as long as your class members are smart pointer type objects that are able to clean up after themselves upon destruction they will take care of cleaning up the partially constructed class for you.
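A small sketch makes this visible. The Resource and Owner classes and the two counters are invented for the demonstration; the constructor deliberately throws after its smart pointer member has been initialised.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

static int live_resources = 0;   // resources currently held
static int owner_dtor_calls = 0; // how often ~Owner() actually ran

struct Resource
{
   Resource() { ++live_resources; }
   ~Resource() { --live_resources; }
};

class Owner
{
public:
   Owner()
      : res_(new Resource) // acquired before the failure below
   {
      throw std::runtime_error("construction failed");
   }

   ~Owner() { ++owner_dtor_calls; } // never runs for a failed construction

private:
   // Fully constructed member, so its destructor runs during unwinding
   // and releases the Resource even though ~Owner() does not.
   std::unique_ptr<Resource> res_;
};

// Returns true only if an Owner was successfully constructed.
inline bool try_construct()
{
   try
   {
      Owner o;
      return true;
   }
   catch(std::runtime_error const &)
   {
      return false;
   }
}
```

After try_construct() returns false, live_resources is back to zero and owner_dtor_calls is still zero: the smart pointer cleaned up, the class destructor never ran. Swap the unique_ptr for a raw pointer and the Resource would simply leak.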

See, life isn’t so bad after all. Time for a nice cup of tea, I think!

C++ Throwing exceptions from destructors

evilrix — Sun, 03 May 2015 15:09:43 +0000

You’re implementing a beautiful class. It’s just wonderful – the most perfect code you’ve ever written, except… for some reason it keeps crashing and you don’t know why. You’ve debugged the code and discovered it happens when your class goes out of scope and needs to throw an exception from the destructor. Tested in isolation, this works fine but integrated into the main code base it causes a crash. You know there are exception handlers that should deal with this and yet it still crashes. This doesn’t make sense. Why is it crashing?

As part of the development of this class it was necessary to write some complex “clean-up” code in the class destructor, but there is the possibility that something could fail during the clean up. For example, maybe our use case for the class is an object that represents a database connection and the destructor is finalising any outstanding transactional data commits before the object goes out of scope.

Your class can’t just leave these commits unfinalised as that could leave the database in an inconsistent state, but the process can fail. What do you do? You have to handle these commits but they can go wrong and so you have no choice but to throw an exception of course, right? Wrong! This really is about the most dangerous thing you could possibly do! Also, if your class is written correctly it should be completely unnecessary.

Here’s a question for you. Let’s say the destructor of your class is being executed because the stack is unwinding due to another exception that was already thrown. The destructor of your class encounters its own error and so throws its own exception. Question: what happens next?

If your answer was that the new exception gets thrown in place of the original please go to the back of the class, stand in the corner and face the wall. You’re wrong. Very wrong. Badly, dangerously, seriously wrong! The C++ Standard document is very clear on what happens next: your program is terminated, abruptly! Yes, you read that correctly. This is not a mis-print or even me messing with you. Your program is just terminated. Good bye dear.

Don’t believe me? Try it!

#include <iostream>
#include <stdexcept>

class StupidClass
{
   public:
      ~StupidClass()
      {
         // this is just asking for trouble!
         throw std::runtime_error("this is stupid");
      }
};

int main()
{
   try
   {
      auto && sc = StupidClass();
      throw std::runtime_error("something wicked this way comes");
   }
   catch(...) // catch all just used for demonstration purposes!
   {
      // I can't help you - code never gets here!
      std::cerr << "handled error!" << std::endl;
   }
}

Now, to be clear (in case I haven’t been so far), termination is abrupt. It is not a nice and friendly request to exit, it is an abrupt and immediate termination. No further stack unwinding takes place, no more destructors are called and there is no opportunity to purge data and close files. Your program is given no chance to perform any further clean up.

In the C++03 standard, this only happens if your destructor emits an exception while the stack is already unwinding, but the C++11 standard goes further: even if the stack isn’t currently unwinding due to another exception, the result is still immediate termination of your program. This is because, in C++11, destructors are implicitly noexcept by default (unless explicitly declared noexcept(false), or a base or member has a destructor that may throw), so std::terminate is called if an exception is allowed to emanate from the destructor.

So, the very blunt point of this article: DO NOT let exceptions emit from the destructors of C++ classes. Ever!!!

Okay, so what do we do if our destructor needs to handle a failure? Easy, you don’t put such code into the destructor.
Instead, you should put such clean-up code in a separate function that can be called before the class goes out of scope. In the case of our example database connection you might add a “flush” method that the user is expected to call before the class goes out of scope.

Ensuring you’ve added this function provides the user the option of finalising usage of the class before the destructor is called. This avoids the situation of the destructor needing to do the work that may fail. Given our example use case, you’ve now given the user the ability to ensure the object is flushed before the class goes out of scope. This affords them the opportunity to deal with any errors that may happen during the clean-up process; the user can catch and deal with them.

You expect the user to have already called this clean-up function by the time the class is destructed; however, for the sake of ensuring your class is a good citizen (who always clean up after themselves) the destructor should still call the clean-up method if it wasn’t called by the user, but it MUST swallow any exceptions that are emitted if the clean-up fails. Unfortunately, this means the user of your class has no idea of the failure, but that’s their fault for not following the correct usage instructions for your class!

My advice would be to add a debug assert to your class so that, in a debug build, it fires if the clean-up has not been performed by the time the destructor is called. At least in this way the developer will know s/he has used your class wrongly.

Time for an example:

#include <cassert>
#include <iostream>
#include <stdexcept>

class StupidClass
{
   public:

      void CleanUp()
      {
         // I'm a stupid function that always throws - duh!
         dirty_ = false;
         throw std::runtime_error("this is stupid");
      }

      ~StupidClass()
      {
         // if this triggers class is being used wrong!
         assert("your class is still dirty" && !dirty_);

         try
         {
            // just in case the user of this class was dumb and didn't read
            // the instructions on how to safely and correctly destroy me.
            if(dirty_)
            {
               // if it fails all we can do is ignore (maybe log to log file)
               CleanUp();
            }
         }
         catch(...)
         {
            // Eeek, dragons! Sadly, we've had to ignore them.
            assert("clean-up failed" && false);
         }
      }

   private:
      bool dirty_ = true; // if set the class needs cleaning
};

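int main()
{
   try
   {
      auto && sc = StupidClass();
      sc.CleanUp(); // errors but at least destructor won't terminate now
   }
   catch(std::exception const & e)
   {
      // the user of the class gets the chance to deal with the failure
      std::cerr << "clean-up failed: " << e.what() << std::endl;
   }
}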

As you can see, there is a specific function provided to allow clean-up and the user of the class is expected to call it before the class is destructed. As a fail-safe the destructor will call the clean-up method if the class is still dirty; however, this is a last resort and if there are any errors they’ll slip by silently (unless logged). In a debug build an “assert” will trigger if the class is still dirty upon destruction and also if the clean-up emits an error. In a release build this will be silently ignored.

C++ Passing by value vs. passing by reference

evilrix — Sun, 03 May 2015 15:04:18 +0000

In C++ all arguments are passed to functions by value. In other words, a copy of the value is taken and that is what the function has to work with. If you want to modify the original value you must either pass a pointer or reference to the original value you wish to modify. Of course, even the pointer or reference are still passed by value. Pointers and references are just another C++ type and when they are passed around you do so by value.

Now, what about the case where we don’t want to modify the value? There’s no need to pass by reference or pointer, right? Wrong! Now, in the case of fundamental types it probably doesn’t make a huge deal of difference since these are unlikely to be any more complicated than a pointer or reference type; however, in the case of fully blown class objects, such as string, vector, map, set or any other (including your own class objects) it can make a huge difference. You see, passing by value can be very expensive both in terms of memory usage and performance (time / space complexity).

Classes have copy constructors, which are defined to facilitate the correct copying semantics for a class. Now in C++ there are two types of copy, shallow and deep. A shallow copy is where all the values of the class are copied but pointers are not followed. A deep copy is where pointers are followed and all the objects that they point to are also copied, thus creating a copy of all the “deep” objects, too. Any class that owns other objects via pointers should (unless there is a very good reason not to) provide both a copy constructor and a copy assignment operator such that the class is always copied deeply.

Consider the std::vector class. This class contains an internal buffer of memory that is managed by the vector. In reality, we can assume that the vector contains a pointer that points to memory allocated on the heap. The vector class implements a copy constructor that will perform a deep copy on a vector object if a copy is taken. This is the only sane thing to do, otherwise we have two objects referencing the same memory and then we have issues of ownership. In other words, which of the vectors is now responsible for managing and freeing that memory and what happens to the other vector if that memory is released? Of course, it’ll be left with a dangling pointer that is referencing invalid memory! Bad mojo for all!!!

Now, imagine we have a vector class that contains thousands of items. If we pass this object to a function by value the whole of the internal buffer will be copied. Not only is this really very inefficient in terms of the time it will take to allocate the memory and copy the values from the original vector to the copy it also increases memory usage greatly and, as a side effect, the risk of memory fragmentation. Imagine if this same vector is copied around again and again (maybe in a loop); it should be pretty clear just how inefficient this is.

The solution is to pass things around by const reference (or pointer to const in C). The cost of passing things around by const reference is trivial and about as expensive as passing around an int value. Not only is it so much more efficient to pass objects in this way, but the semantics of your function become way clearer. Just looking at the function prototype tells us that the value being passed is never meant to be modified by this function. You are helping to enforce your object’s interface contract.

Let’s see a trivial example.

#include <chrono>
#include <iostream>
#include <vector>

// note: int elements and nanosecond resolution are assumed in this listing

void foo(std::vector<int> byValue)
{
// do nothing
}

void bar(std::vector<int> const & byRef)
{
// do nothing
}

int main()
{
   auto && v = std::vector<int>(0x7FFFFFF);

   auto && x1 = std::chrono::steady_clock::now();
   foo(v);
   auto && x2 = std::chrono::steady_clock::now();
   bar(v);
   auto && x3 = std::chrono::steady_clock::now();

   auto && d1 = std::chrono::duration_cast<std::chrono::nanoseconds>(x2 - x1);
   auto && d2 = std::chrono::duration_cast<std::chrono::nanoseconds>(x3 - x2);

   std::cout
         << "Time to call foo: " << d1.count() << std::endl
         << "Time to call bar: " << d2.count() << std::endl;
}

When running this on my Windows 7 laptop, built with Visual Studio 2013 and executed as a Release build, the call by value takes approximately 1 second whilst the call by reference takes less than a nanosecond. That makes the pass by value a billion times slower! Of course, this is a contrived example and on different machines with different compilers YMMV, but hopefully it serves to demonstrate just how slow passing by value can be when compared to passing by reference!

In the case of passing by value the cost in terms of both time and space complexity is O(N), where N is the number of bytes to be copied. Passing by reference will cost O(1), which is a significant improvement. Okay, the pedants amongst you may wish to argue that even for a reference it’s O(N), because a reference is composed of bytes. True, but the big (massive) difference is that the size of a reference is always constant and will be in the order of a few bytes (4 on a 32 bit machine, 8 on a 64 bit machine) and not hundreds, thousands, millions or even billions in the case of non-fundamental objects.

Note that some compilers may optimise out the calls to the functions foo and bar due to the fact they don’t do anything. This is most likely to happen if you have aggressive optimisation enabled on your compiler. You can either disable this or add some code to these functions to make use of the passed arguments. Whilst disabling optimisation may skew the results in an absolute sense, the relative comparison should still hold up because what we’re truly interested in here is the asymptotic behaviour (Big O) rather than wall clock time!

Further reading:

Passing arguments by value
Passing arguments by reference
Passing arguments by address (pointer)

Should I learn C before learning C++? Answer – NO!

evilrix — Fri, 24 Apr 2015 02:55:09 +0000

I often hear those who really should know better giving the advice that before you learn to code in C++ you should first learn to code in C. At face value this would seem like reasonable advice; after all C++ is a superset of C and so by learning C you’ll be learning some of C++. Unfortunately, this advice overlooks some fundamental but very important differences between C and C++ that may very well damage the learning curve of the student.

The main problem is that to say C++ is a superset of C greatly overstates the relationship. It is a superset but only insofar as the core syntax of the languages is very similar. As programming models go the two languages could hardly be further apart. It’s like arguing the case for learning to ride a push-bike before learning to drive a truck because both have wheels. In fact, if you’re going to go on the basis of the languages having similar syntax one could also argue a case for learning C# or even Java before learning C++. Frankly, either of those two languages would still be a better stepping stone than starting with plain old C programming!

The thing is that C and C++ differ greatly in their approaches to software development. The C programming language is a procedural language whose main focus is on being very small and very fast. The code is very linear and has a start, a middle and an end. This is not how you write C++ code (at least, not if you are writing it properly). C is a very powerful, small and fast language, but also very unforgiving. It is very easy to write really very bad C code because the language offers little in the way of protection for the unwary developer.

A stringy mess

The most frequently cited example of things that catch out the unwary is the simple “string” data type as used in C. The C programming language has no real concept of a “string” type. What it considers to be strings are really nothing more than arrays of “char” types, with the very last item in the array being set to the null character ('\0'). In this way, the C runtime knows when it’s reached the end of a string by virtue of the fact it has discovered a null character. Unfortunately, not only does this make coding with C-style strings very messy, since we need to use special stand-alone functions to perform even simple string manipulation, it’s also incredibly dangerous.

Consider what happens if we accidentally overwrite the terminating null character with a non-null value. Suddenly our simple string is now of an arbitrary length. The string functions that will be looking for the null terminator will only stop when (if!) they hit the next arbitrary zero byte in memory. The result is undefined, but you can bet it isn’t going to be pretty. This has been the source of many a “buffer overrun” in badly-written C programs. Such defects can often lead to exploits that can be used to compromise systems.

In general, anything to do with “string” manipulation in C is considered to be (certainly by anyone who isn’t a hard-core C programmer) unsafe. It is far too easy for something to go wrong and the consequences are, altogether, far too dangerous. The question you have to ask yourself is why would you recommend a newbie, who is wanting to learn programming, subject himself to such a dangerous and unnecessary environment?

By contrast, C++ has a proper string type. It’s a first class object that can be passed around and it has full string type semantics. No need to call upon dark functions of witchcraft to do simple things like concatenate two strings. No need to perform random acts of memory allocation to ensure we don’t cause buffer overruns. No need to free these additional allocations (or, worse, forget to free them), because the string object does it all for you.

Scott Meyers (very famous author of C++ books) once gave a speech in which he argued that there is just no need to teach C++ programmers about C-style strings nor C-style arrays. He went on to say that so many defects that exist in C++ code could be avoided if C++ programmers just unlearned (or never learned in the first place) about the existence of unsafe C programming types. C++ has proper object types, provided in the Standard Template Library, that replace these, providing safe and reusable components that just don’t suffer from the serious issues of their C-style equivalents.

I’m not suggesting programmers should not learn of the dangers of things such as “buffer overruns”! Of course they should. What they don’t need to learn (at least in the early days) is how to create them. We don’t give junior doctors scalpels and set them loose in the ER. They have to build up to the scary stuff; learn the best practices first. Only once they have that mastered do they learn the gory things.

Coding structures

As mentioned before, C is a procedural programming language. The basic structural type of a C program is a “function”. Functions contain units of reusable code. They (normally) take arguments as input parameters and (normally) return results. The problem with this is that functions are not first class objects in C. They contain no state (other than static local variables, which are not really the same thing) and they cannot be passed around as units of functionality.

It is possible to pass around function pointers, but this is not the same thing either. A pointer to a function is nothing more than an alias for it. It’s still not a first class function type. The problem with this model is that it doesn’t really make for reusable code. It is a long way from being either “functional” or “object oriented”.

By contrast, C++ is a full-blown object oriented programming language. Actually, more specifically, it is a multi-paradigm programming language. Unlike C, C++ can support many different styles of programming. For example, it has a concept of function objects (functors), which are first class types. This means it’s possible to write “functional” C++ code should one desire.

Of course, more than that, it also has support for proper Object Oriented Programming (OOP). This means that rather than writing your code to be long and linear, you build your code out of reusable objects that model the problem you are trying to solve. It’s a completely different framework and one that makes code so much more robust.

Now the question is, why force a newbie to learn to code procedurally when he will eventually be jumping into OOP? The two styles are so very different and jumping from one to the other can be really quite tricky. Why not just learn OOP from the start? Not only does it end up teaching him bad programming practices from an OOP point of view, but it also teaches him bad habits that are really very hard to break!

I’ve seen so much badly written C++ code that was implemented by C programmers who decided to cross over but had no real clue what object orientation is about. They ended up implementing poor object models, classes that were not cohesive and interfaces that were not loosely coupled. This makes for very brittle object oriented code and is not the way to write C++.

Strong vs. loose typing

C++ is a strongly typed language. The compiler knows all the types (both inbuilt and user defined) and it is able to use its type system to do cool things, such as support function overloading. By contrast, C is not a strongly typed language (at least not in the same sense); it is a loosely typed language. This means that whilst it does have a type system, the compiler doesn’t really make much use of it beyond performing some basic static compile time checks. The compiler is not able to use the type system to do (amongst other things) overloaded function resolution. This means that even doing simple things like outputting stuff to the console requires knowledge of witchcraft and the black-art of format specifiers.

For example, to output anything more complex than a C-style string requires using a function such as printf. This function has to be told via a “format specifier” what types it is being asked to output. If the types it is told do not match the types it is given, the result is undefined (and that is never good). By contrast, C++ has streams, and these streams are type safe. You don’t need to tell a stream what type something is when you send it to be output to the console because the C++ type system already knows. There is no format specifier to get wrong because the compiler selects the correct output operator for you.

Again, I’ve seen so much poor C++ code that contains a mixture of some C++ and a mixture of unsafe C code, where the programmer has held on to his use of printf (and scanf) with his last dying C programming breath. The results are not only very hard to read and maintain, but they are a disaster waiting to happen. Like most C functions that work with strings, these functions are also subject to the same problems of buffer overrun as most of the others.

In fact, these functions are worse because they also have the added complication of format specifiers. The point is that nearly all things in C (apart from the core shared syntax) are semantically different from C++. Forget C, it is a completely different programming language. Jump right in and just learn C++ (and learn it properly, not a half-baked C hybrid of it).

What’s the alternative?

Learn the semantics and forget the syntax

If you are advising someone who is just starting to learn programming and he wants to know what language to learn to benefit him when learning C++ (often incorrectly perceived as not being a good language for beginners) recommend something like Python. Sure, it won’t teach him the syntax of C++ but who cares? Syntax is syntax; semantics are what count. By learning Python he will learn how to write object oriented code in a safe programming environment with a language that will hold his hand. Once he has the concepts, then he is ready to learn the syntax of C++ and to take the good programming skills he learned in Python and apply them to the power of C++.

For example, one could learn to speak Spanish (assuming you don’t already) to a level that would be perfectly acceptable without really needing to worry about the syntax. Does knowing the syntax help? Sure it does. Does not knowing it prevent you from learning to speak the language? No, of course it doesn’t. Let me put it even more simply. Hands up who knows the difference between a transitive verb and an intransitive verb. If your hand is up, well done you! If it’s not, please don’t worry as I promise you’ll still be able to continue speaking English (or Spanish) without ever knowing the difference.

Learn the semantics of object oriented programming and how that applies to C++ and then, when you are ready, figure out the dark corners of the language that are shared with C. As far as the core language goes, you only need to learn those bits that work with C++; you don’t need to learn nor should you care about all the stuff that is in C++ just to make it backwards compatible with C. That stuff was only left in there to make porting C code over to C++ a lot easier. You’re not learning to port code, you’re learning to code in C++ so forget about C and all its weirdness! Learning C is not a short cut to learning C++; rather, it is a hindrance.

In defense of the C programming language

Finally, I just want to note that if this article comes across as berating the C programming language it wasn’t meant to. The simple fact is C and C++ are two very different programming languages that just happen to share some similar syntax. The goals of the languages are very different and the C programming language excels at what it was originally designed for: to write small, fast and very tight code that requires very little resource.

By contrast, C++ is a bit of a bloated beast and is not the language of choice if you are looking, for example, to write code for an embedded device (choose C for that). The only point this article tries to make is that C is not C++ and C++ is not C. They are very different, both have their pros and cons, and learning one is highly unlikely to make learning the other that much easier. Don’t waste your time; if you want to learn C++ then learn C++. C is not C++, never was and never will be!

C99 is not ANSI C

It should also be noted that this discussion was mainly aimed at true C (ANSI C) and doesn’t really consider the C99 standard. This is a greatly enhanced version of C and does add a lot of the nice things that C++ provides. Unfortunately, the semantics and syntax of C99 are still very different from C++ and so the same advice still applies: if you want to learn C++ just learn C++.

Objective-C is…!?

Oh, and don’t even get me started on Objective-C — that is a topic for another article I think!

Functional programming using bind

evilrix — Sun, 16 Feb 2014 19:44:12 +0000

Since the introduction of the STL (Standard Template Library) the use of functors has been a prevalent part of writing C++. Most of the STL algorithms require the use of a functor. For example, std::transform requires a function object that, given the value at the current position, returns a new value that will be written to the output in its place.

This is really nice as it separates the mechanics of the algorithm from the “smarts”. In other words, you don’t have to know nor care how std::transform works, you just need to know what it does and how the functor it uses works. Equally, because std::transform relies on a functor for its “smarts” it can be used over and over for different reasons. This is the very epitome of generic code!

The general term for implementing reusable packets of functionality is “functional” programming. It’s a very simple paradigm but it’s also very powerful. By treating functions as objects you can create functions that perform little units of work and then pass them around, preloaded with behaviour, to be used elsewhere in the code.

The place the function is used doesn’t need to know what the functor does or how it works, just that it provides the correct interface. In the case of a functor, the interface spec is its prototype; what arguments it expects and what, if anything, it will return. Using function objects in this way is an example of the “Command” design pattern.

Another nice thing with functional programming is you can do what is called “function composition”. Basically, this means you can create a complicated functor out of the composition of smaller and less complicated functors. It’s a little bit like building functionality out of Lego.

However, do not confuse this with normal OOP (Object Oriented Programming). The former is just about binding together function objects, whereas the latter is about creating objects that represent either abstract or concrete concepts. They have properties and interface methods to allow manipulation of those properties.

In functional programming the “objects” are functions; little packets of functionality. They do not have “properties” or “methods” and they do not represent concepts, abstract or otherwise. The closest you get to an interface is the function prototype (i.e. the type and number of arguments the function expects when it is called). The only thing an OOP object and a function object have in common is that they can both be passed around and used as “first-class” types.

What’s slightly confusing in C++ is that function objects are implemented using classes that have an implementation of the function operator; allowing the class to be called like a function. This is a syntax thing of the language and doesn’t mean function objects and OOP objects should be thought of as the same thing, at least not semantically.

This is an example of how a simple function object is constructed in C++. Again, yes it is implemented in terms of a class but do not think of it as a class type object. This is just a C++ syntax thing; the semantics of a function object are very different from a class object:

#include <functional>
#include <iostream>
#include <string>

// remember a struct is just a class whose members are, by default public!
struct hello_message
{
   // function operator of type std::string(std::string const &) const
   std::string operator () (
         std::string const & name
      ) const
   {
      return "hello, " + name;
   }
};

// this expects a functor to execute
void execute_functor(
   std::function<std::string(std::string const &)> const & functor
   )
{
   auto && msg = functor("evilrix");
   std::cout << msg << std::endl;
}

int main()
{
   // create the function object
   auto && hellomsg = hello_message();

   // pass to the executor
   execute_functor(hellomsg);
}

Actually, I lied slightly when I said that functors can’t have properties. Of course, because a functor is just implemented as a class it can have class members (both member functions and variables). This means that the functor can actually be pre-loaded with data upon construction.

Below is the same functor as above but instead of passing the name when the function is called, it is passed when the function is constructed. This would allow it to be passed to another function and executed later. The other function doesn’t need to know what data the function object is preloaded with, it will just call it and output the message:

#include <functional>
#include <iostream>
#include <string>

// remember a struct is just a class whose members are, by default public!
struct hello_message
{
   // now we need a constructor to "pre-load" the functor with data
   hello_message (std::string const & name)
      : name_(name)
   {
   }

   // function operator of type std::string() const
   std::string operator () () const
   {
      return "hello, " + name_;
   }

   private:
      std::string name_;
};

// this expects a functor to execute, it has no idea what it will output!
void execute_functor(std::function<std::string()> const & functor)
{
   auto && msg = functor();
   std::cout << msg << std::endl;
}

int main()
{
   // create the function object
   auto && hellomsg = hello_message("evilrix");

   // pass to the executor
   execute_functor(hellomsg);
}

An example of where one might want to do something like this is if you were building a generic named dispatcher mechanism. For example, to allow a scripting language to invoke C++ functions. A simple dispatcher might be a std::map that stores, as its key, the name of the function and, as its value, a function object.

To ensure consistency when invoking the dispatcher methods, you might want to ensure that each function in the map always performs certain pre-condition checks and handles invalid post-conditions. Rather than imposing this as a restriction of each dispatch function, you can implement a functor object that can call the dispatch function, by proxy, and perform any pre or post condition checks.

For example, you might want to make sure that pointers aren’t null before using them or you might want to take the return status of a function and convert it into an exception that represents the error condition. What you really want is a generic “executor” functor that has no direct coupling to the dispatch functions.

This is quite simple to do using functional programming. We just create our executor function and have it accept, as an argument, another function object. You can then bind the dispatch function to the generic executor to create a new executor type specific to the dispatch function.

But hold on, I hear you ask…

Firstly, this means you are creating a new function object for each dispatch function so what do you save? You’re having to write a new functor for each dispatch function.
Secondly, what on earth do you mean by “bind the dispatch function to the executor”?

If you didn’t ask these questions I’m going to assume you did anyway, just for the sake of having the opportunity to expand on both points.

Okay, yes we do need to create a new and unique object for each dispatch function but not in the way you might think. We don’t actually need to write any code to do this, we can use a special C++11 function, called std::bind, to get the compiler to do that for us. This not only saves us from writing code it also saves us the possibility of introducing additional defects.

So, let’s look at this bind and see what it is and what it can do for us…

In programming parlance, to bind means to attach one object to another or to create a co-dependency between them. If you are programming using classes you make use of binding all the time and don’t even realise it. When you call a member function on a class you are using “name binding” to bind that function to the class instance.

If the function is virtual C++ uses a slight variation, called “late binding” to ensure the function behaves polymorphically. The difference is that the former is resolved at compile time whilst the latter at runtime.
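
To make the difference concrete, here’s a minimal sketch. The Base and Derived types and the two helper functions are made up purely for illustration; the point is only that the non-virtual call is picked from the static type whilst the virtual call is picked from the dynamic type.

```cpp
#include <string>

struct Base
{
   // Non-virtual: resolved from the static type at compile time.
   std::string early() const { return "Base"; }

   // Virtual: resolved from the dynamic type at runtime.
   virtual std::string late() const { return "Base"; }

   virtual ~Base() {}
};

struct Derived : Base
{
   // Hides Base::early, but does not override it.
   std::string early() const { return "Derived"; }

   // Overrides Base::late.
   std::string late() const override { return "Derived"; }
};

// Both helpers only ever see the object through a Base reference,
// so the static type is Base regardless of what is passed in.
std::string call_early(Base const & b) { return b.early(); }
std::string call_late(Base const & b) { return b.late(); }
```

Pass a Derived to both helpers and call_early should give you "Base" whilst call_late gives you "Derived", even though the calling code is identical.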

In both early and late binding, all that’s happening is the correct “this” pointer (either the static or dynamic type) is being bound to the function’s first argument.

“Eh? But my member functions don’t have such an argument”, I can hear you whisper. Strap yourself in whilst I reveal the best kept secret in C++… your compiler lies to you. It lies to you all the time. The code you write is NOT the code it generates when it compiles your source!

Have you ever stopped to think how a class member function works? How does it know which instance of the class is being operated on? How does it know which instance it should use when you access member functions or member data? Simple, each class instance has its own member function… right? Wrong!!!

In fact, (non-static) class member functions are, at least in a semantic sense, just the same as free standing functions. They have no special affinity to the class instance context or, frankly, even the class they are defined within. There is only one of each, but there is something slightly special about them though: they all have a hidden first argument, which is the “this” pointer.

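Normally, the “this” pointer is a pointer to the class in which the function is defined. With virtual functions, the function that gets called (and so the “this” pointer it receives) is chosen from the dynamic rather than the static type; in other words, the call may be made through the base class where the function was first declared virtual, but it’s the most derived override that actually runs.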

When you call a class member such as:

myClass.foo(1);

This is really nothing more than syntactic sugar for:

foo(&myClass, 1);

Although your class’s prototype, as you’ve defined it, looks like this:

void MyClass::foo(int);

The compiler actually sees it as this:

void MyClass::foo(MyClass * this, int);

It adds a hidden first argument, which is a pointer to the class instance that the function is to operate on. So when you call a class member you are, in actual fact, binding that class instance to that class member function. In other words, you don’t see it but the address of the class instance is passed into the function via a pointer as the first argument.
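
You can’t write foo(&myClass, 1) directly, but the standard library lets you get surprisingly close. The following is a small sketch using std::mem_fn, which wraps a member function in a callable that takes the object as an explicit first argument. The total member and the call_both_ways function are hypothetical additions, there only so the effect of each call can be observed.

```cpp
#include <functional>

struct MyClass
{
   int total = 0;

   // A member function; "this" is passed implicitly.
   void foo(int n) { total += n; }
};

int call_both_ways()
{
   MyClass myClass;

   // The usual member call syntax, with the instance bound implicitly.
   myClass.foo(1);

   // The same function, wrapped so that the instance is passed
   // explicitly as the first argument.
   auto foo = std::mem_fn(&MyClass::foo);
   foo(&myClass, 1);

   // Both calls operated on the same instance.
   return myClass.total;
}
```

Both calls end up running the same function against the same instance, which is why call_both_ways returns 2.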

OK, but what does this have to do with our generic dispatch executor?

Okay, we have a generic executor and, as its first argument, it expects to receive a functor to execute. It might also expect other arguments and that is just fine. What we do is bind the dispatch function to the first argument of the executor function so that, when the executor function is invoked, the dispatch function is always passed implicitly as the first argument. Don’t worry if that sounds a little confusing as there is a nice simple example coming up!

To achieve this bit of magic we’ll employ the help of a new feature in C++11 called std::bind. This clever function, which was originally (and still is) part of Boost, allows you to compose compound function objects from existing functions, binding values (which includes other function objects) to the arguments of the function. You don’t need to write any code for this; the compiler does it all for you!

The result of this composition is a new function object that need only be called with those values that can’t be or shouldn’t be resolved until the point of evaluation. This is very similar to the example above, where we changed the hello_message functor to bind the name of the user at the point it was constructed rather than needing to pass it at the point it is called.

Before we move on with our dispatch engine, let’s have a very quick look at a simple use of bind. Let’s say we have a function called multiply that takes two int arguments and returns the product of these two arguments.

int multiply (int x, int y) { return x * y; }

Now, let’s suppose that we always wanted to multiply any number by a factor of 10. We could do this by always passing in 10 as the first argument, but we want to create a new generic unit of functionality that we can then pass to std::transform to multiply all the numbers in a vector by 10.

Clearly, we have a problem here because std::transform doesn’t know that this function will always need two arguments and that the first always has to be 10. No, std::transform expects the functor to take only one argument and the output should be that argument “transformed” into the new value. We could write a new wrapper function, which would work:

int multiplyby10 (int y)
{
   return multiply(10, y);
}

void somefunc(std::vector<int> & v)
{
   std::transform(
      v.begin(),
      v.end(),
      v.begin(),
      multiplyby10
      );
}

Of course, that works but it’s not very elegant is it? I mean, it now means we have to create a new function just for, what could very well be, a localised requirement. There has to be a better solution… and there is! The solution is to use std::bind.

void somefunc(std::vector<int> & v)
{
   std::transform(
      v.begin(),
      v.end(),
      v.begin(),
      std::bind(multiply, 10, _1)
      );
}

What we’re doing here is creating a new function object, which only exists for the duration of that one statement, that will call multiply, passing 10 and the functor’s first argument. The result will be a functor that std::transform can call like this:

// where N is any integer value
int result = new_functor(N);

Notice the weird _1 (underscore one) as the 3rd argument to bind? It’s called a placeholder and is just that: a placeholder for the argument that the functor still needs. It tells bind that whatever value is passed as the new functor’s first argument should be passed as the 2nd argument to multiply.

We know it’s the 2nd argument because, if you look at bind, you can think of the function name as the 0th argument, the 10 as the 1st argument and the _1 as the 2nd argument. If multiply took more arguments, you could also use _2 … _N as placeholders.
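
Here’s a compact sketch that pulls the multiply example together and shows a second placeholder in action. The subtract function and the two wrapper functions (times_ten and swapped_subtract) are invented for this illustration and aren’t part of the original example.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

int multiply(int x, int y) { return x * y; }
int subtract(int x, int y) { return x - y; }

// Multiplies every element by 10 using a bound functor.
std::vector<int> times_ten(std::vector<int> v)
{
   using namespace std::placeholders; // for _1
   std::transform(
      v.begin(),
      v.end(),
      v.begin(),
      std::bind(multiply, 10, _1)
      );
   return v;
}

// Placeholders can also reorder arguments: here the functor's 2nd
// argument becomes subtract's 1st and vice versa.
int swapped_subtract(int a, int b)
{
   using namespace std::placeholders; // for _1 and _2
   auto swapped = std::bind(subtract, _2, _1);
   return swapped(a, b); // calls subtract(b, a)
}
```

So times_ten applied to a vector holding 1, 2 and 3 gives back 10, 20 and 30, whilst swapped_subtract(1, 10) evaluates subtract(10, 1).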

So, back to our dispatcher. We need to bind the dispatch function to the executor functor. Now we understand bind a little better let’s see how we can do that:

/**
 * @brief Simple dispatch mechanism using std::bind
 */

#include <functional>
#include <map>
#include <stdexcept>
#include <string>

#include <iostream>

using namespace std::placeholders; // for _1 .. _N

// Our dispatch map, that maps names to functions with a prototype of void(int)
using dispatch_map = std::map<
   std::string,
   std::function<void(int)>
   >;

// A test dispatch function
bool dispatch_function(int testval)
{
   // just for testing, return true if > 0, else false
   return testval > 0;
}

// Generic dispatch function executor
void executor(std::function<bool(int)> dispfunc, int testval)
{
   // pre-condition: testval must NOT be zero!
   if (0 == testval)
   {
      throw std::runtime_error(
         "dispatch function error: testval cannot be zero");
   }

   // execute dispatch function
   auto ok = dispfunc(testval);

   // post-condition: dispfunc must return true, else error!
   if (!ok)
   {
      throw std::runtime_error(
         "dispatch function error: " + std::to_string(testval));
   }

   // If we get here we've passed all pre and post conditions!
   std::cout << "OK: " + std::to_string(testval) << std::endl;
}

As you can see, we’re creating a new dispatch function by binding the real one to the executor and using that to initialise the dispatch map. The executor takes care of all error handling for us. Clearly, this is a very simplified example and is not meant to demonstrate the ideal way to implement a dispatcher. It’s a bare bones example so you can focus on how the bind mechanism works to create compound function objects.
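
For completeness, here is one way the binding and map initialisation might look when put together. Treat it as a sketch rather than the definitive listing: the "test" key and the make_dispatch_map and try_dispatch helpers are assumptions made for this illustration, and the executor is a trimmed copy of the one above with the console output removed.

```cpp
#include <exception>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

using namespace std::placeholders; // for _1

using dispatch_map = std::map<std::string, std::function<void(int)>>;

// A test dispatch function, as above.
bool dispatch_function(int testval) { return testval > 0; }

// Generic executor: pre and post condition checks around the call.
void executor(std::function<bool(int)> dispfunc, int testval)
{
   if (0 == testval)
   {
      throw std::runtime_error(
         "dispatch function error: testval cannot be zero");
   }

   if (!dispfunc(testval))
   {
      throw std::runtime_error(
         "dispatch function error: " + std::to_string(testval));
   }
}

// Hypothetical helper: bind the real dispatch function to the executor's
// first argument and store the result under a (made up) name.
dispatch_map make_dispatch_map()
{
   return dispatch_map{
      { "test", std::bind(executor, dispatch_function, _1) }
   };
}

// Hypothetical helper: true if the call passed every check, else false.
bool try_dispatch(dispatch_map const & dm, std::string const & name, int testval)
{
   try
   {
      dm.at(name)(testval);
      return true;
   }
   catch (std::exception const &)
   {
      return false;
   }
}
```

Notice that the map only ever sees a void(int) functor; the fact that there is an executor wrapped around a bool(int) function is entirely hidden by the bind.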
The bind mechanism is a great tool if you just want to create simple composite functors but it’s a little inflexible. For example, if you want to use the output of bind as the input to another bind object, so as to create a nested bind, you’ll run into problems. This is because, unlike Boost bind, std::bind doesn’t provide a mechanism to protect against early evaluation of the bind object.
When you pass the output of one bind object as the input to another std::bind call, the nested bind object will be evaluated (run and the result obtained) and that result is what will be passed to the function bound by the outer bind. Sometimes this is exactly what you want, but other times you actually want the nested bind object to be passed in without being evaluated, so that it can be evaluated later by the function that receives it. You can do this using “boost::protect” with Boost bind, but not with std::bind.
I’m aware that this might sound a little like gobbledygook, but a little example should make it very clear how the nested bind situation works.
Consider:
#include <boost/bind.hpp>

using namespace boost;

bind(f, bind(g, _1))(x);

Is the same as:
f(g(x));

Whereas:
#include <boost/bind.hpp>
#include <boost/bind/protect.hpp>  // not found in C++11

using namespace boost;

bind(f, protect(bind(g, _1)))(x);

Is the same as:
f(bind(g, _1));

So, essentially, boost::protect actually “protects” the nested bind from being evaluated before calling f, so the nested bind object, rather than the result of calling it (which would be the result of g(x)), is passed to function ‘f’.
So, hold on a minute, you can’t do this in C++11? Why did they remove this useful feature?
Yes, I agree, at face value it is slightly frustrating as lazy evaluation of nested bind objects is very handy; however, do not despair… C++11 has a different trick up its sleeve, as we’ll see in my next article where we will take a look at lambda expressions.
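
In the meantime, here’s a small standard-library-only sketch of the eager behaviour described above. It also shows one commonly used workaround, which is to wrap the nested bind in a std::function so that the outer bind no longer recognises it as a bind expression. To be clear, that workaround is just one option and not the trick alluded to above; the functions f, g and call_with_five are made up for the illustration.

```cpp
#include <functional>

int g(int x) { return x + 1; }
int f(int y) { return y * 2; }

// Takes a callable, rather than a value, and calls it with 5.
int call_with_five(std::function<int(int)> fn) { return fn(5); }

// The nested bind is evaluated first, so this is f(g(x)).
int nested_eager(int x)
{
   using namespace std::placeholders; // for _1
   return std::bind(f, std::bind(g, _1))(x);
}

// Wrapping the nested bind in a std::function hides the fact that it is
// a bind expression, so it is passed along unevaluated.
int nested_deferred()
{
   using namespace std::placeholders; // for _1
   std::function<int(int)> wrapped = std::bind(g, _1);
   return std::bind(call_with_five, wrapped)(); // call_with_five(wrapped)
}
```

With these definitions nested_eager(3) works out as f(g(3)), whereas nested_deferred hands the still-callable functor to call_with_five, which then evaluates g(5) itself.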

Lowest Common Ancestor (non-BST)

evilrix — Tue, 12 Nov 2013 23:36:19 +0000

In the earlier article, Lowest Common Ancestor (BST), I discussed how you can use the special ordering of a Binary Search Tree to quickly and easily identify the Lowest Common Ancestor of two nodes. Of course, not all trees are BSTs and so in this article we’ll look at a way of finding the LCA in a non-BST.

So, just to keep things simple I am actually going to use the same code as before, but with a modified find_lca function. This means the tree is actually a BST; however, it is important to note that this is an irrelevancy since we’re not making use of the BST’s ordering properties and that this algorithm would work on any binary tree.

There are a number of ways this can be done. One naive way is to perform a recursive Depth First Search (DFS) to ensure that the two nodes we’re looking for are both either on the left or the right of the current node. If they’re on the left we perform a recursive search on the left child. If they’re both on the right we perform a recursive search on the right child. If they fall either side we’ve found the LCA.
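
As a rough sketch, the naive approach might look something like the following. The SimpleNode type and the make_node helper are stand-ins invented for this illustration (they are not the Node and Tree used later on), and the code assumes both values really are somewhere in the tree.

```cpp
#include <memory>

// A cut-down node type, purely for this sketch.
struct SimpleNode
{
   int data;
   std::shared_ptr<SimpleNode> plhs;
   std::shared_ptr<SimpleNode> prhs;
};

using SimplePNode = std::shared_ptr<SimpleNode>;

SimplePNode make_node(int data, SimplePNode lhs = nullptr, SimplePNode rhs = nullptr)
{
   return std::make_shared<SimpleNode>(SimpleNode{ data, lhs, rhs });
}

// Recursive DFS: is 'value' anywhere in the sub-tree rooted at 'node'?
bool contains(SimplePNode const & node, int value)
{
   return node && (node->data == value ||
      contains(node->plhs, value) || contains(node->prhs, value));
}

// Naive LCA: assumes both x and y exist somewhere in the tree.
SimplePNode naive_lca(SimplePNode node, int x, int y)
{
   while (node)
   {
      if (contains(node->plhs, x) && contains(node->plhs, y))
      {
         node = node->plhs; // both on the left, drop down a level
      }
      else if (contains(node->prhs, x) && contains(node->prhs, y))
      {
         node = node->prhs; // both on the right, drop down a level
      }
      else
      {
         break; // they fall either side (or one is this node)
      }
   }

   return node;
}
```

Note how every step down the tree triggers up to four fresh searches of the sub-trees below it.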

The problem with this approach is the search time is quadratic, order O(n^2). Why? Because we have to keep searching the same nodes over and over, only dropping down one level in the tree each time, until we find the LCA. Can we do better? Well, yes we can, as long as we’re prepared to trade a little space for a little time.

Since the DFS is generally performed recursively (although it can be performed iteratively using a real stack) we can, as the stack unwinds, store the node at each level. This will, effectively, allow us to trace out the route from the node in question back to the root node.

If we do this for both nodes we’ll have two lists. If we then compare those lists side-by-side we’ll see that the nodes, up to a certain point, match. The last item that matches in both lists is the point of divergence and, thus, the LCA.

Consider the following tree:

From the tree, above, we can note the following:

If we were to dispatch a search for [1] we’d end up with a list of 8->3->1.
If we were to dispatch a search for [7] we’d end up with a list of 8->3->6->7.
If we compare these two lists we see that 3 is the point of divergence and the LCA!
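
The comparison step on its own can be sketched in a few lines. This version works on plain vectors of values listed from the root downwards, as in the lists above, and the last_common name and the -1 “nothing in common” sentinel are choices made just for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Returns the last value the two root-to-node routes have in common, or
// -1 (an arbitrary sentinel for this sketch) if they share nothing.
int last_common(std::vector<int> const & a, std::vector<int> const & b)
{
   int result = -1;

   // Walk both routes in step for as long as they agree.
   for (std::size_t i = 0; i < a.size() && i < b.size() && a[i] == b[i]; ++i)
   {
      result = a[i];
   }

   return result;
}
```

Feeding it the two routes above, 8->3->1 and 8->3->6->7, yields 3.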

We’ve effectively traded a little linear space, order O(n), to get rid of the quadratic time complexity. The overall time complexity of this is now, also, linear, corresponding to the number of nodes in the tree. I’d say that’s a pretty reasonable trade-off.

Here’s the code.

/**
* Lowest Common Ancestor (non-BST)
*
*/

#include <iostream>
#include <memory>
#include <string>
#include <vector>

namespace evilrix {
   namespace mostlycoding {
      /**
      * @brief A node object for our tree, below.
      *
      * @tparam T Generic type parameter representing our data.
      */

      template <typename T>
      struct Node
      {
         using Data = T;    ///< The data
         using PNode = std::shared_ptr<Node>;    ///< The node

         /**
         * @brief Initializes a new instance of the main class.
         *
         * @param data (Optional) the data.
         */

         Node(Data const & data = 0) : data(data) {}

         PNode plhs; ///< The plhs
         PNode prhs; ///< The prhs
         Data data;  ///< The data
      };

      /**
      * @brief A tree, implemented as a BST.
      *
      * @tparam T T Generic type parameter representing our data.
      */

      template <typename T>
      class Tree
      {
      public:
         using Data = T;    ///< The data
         using PNode = std::shared_ptr<Node<T>>;    ///< The node

         /**
         * @brief Initializes a new instance of the main class.
         */

         Tree() {}

         /**
         * @brief Inserts the given data.
         *
         * @param data The data.
         */

         void insert(Data const & data)
         {
            PNode * pproot = &proot_;

            // see my note in the "find_lca" function as to why I am using an
            // iterative rather than recursive traversal approach.
            while (*pproot)
            {
               pproot = data < (*pproot)->data ?
                  &(*pproot)->plhs : &(*pproot)->prhs;
            }

            (*pproot) = PNode(new Node<T>(data));
         }

         /**
          * @brief Uses a DFS to find a route to the node with data value 'd'
          *
          * @param d              The Data node to find.
          * @param [in,out] route The route.
          * @param ppnode         The start node.
          *
          * @return true if it succeeds, false if it fails.
          */

         bool find_route(Data const & d, std::vector<PNode> & route, PNode const * ppnode) const
         {
            if (!ppnode || !(*ppnode))
            {
               return false;
            }

            if ((*ppnode)->data == d)
            {
               route.push_back(*ppnode);
               return true;
            }

            if (find_route(d, route, &(*ppnode)->plhs))
            {
               route.push_back(*ppnode);
               return true;
            }

            if (find_route(d, route, &(*ppnode)->prhs))
            {
               route.push_back(*ppnode);
               return true;
            }

            return false;
         }

         /**
          * @brief Searches for the first lca.
          *
          * @param x The Data find.
          * @param y The Data find.
          *
          * @return The found lca.
          */

         PNode find_lca(Data const & x, Data const & y) const
         {
            std::vector<PNode> routex;
            find_route(x, routex, &proot_);

            std::vector<PNode> routey;
            find_route(y, routey, &proot_);

            PNode result;

            while (
               !routex.empty() &&
               !routey.empty() &&
               routex.back()->data == routey.back()->data
               )
            {
               result = routex.back();
               routex.pop_back();
               routey.pop_back();
            }

            return result;
         }

         PNode proot_;
      };
   }
}

using namespace evilrix::mostlycoding;

/**
* @brief Main entry-point for this application.
*
* @return Exit-code for the process - 0 for success, else an error code.
*/

int main(/* int argc, char * argv[] */)
{
   /*
   *         8
   *        / \
   *      (3)   9
   *      / \
   *   [1]   6
   *        / \
   *       4   [7]
   */

   Tree<int> tree;
   tree.insert(8);
   tree.insert(3);
   tree.insert(1);
   tree.insert(6);
   tree.insert(4);
   tree.insert(7);
   tree.insert(9);

   Tree<int>::PNode pnode1 = tree.find_lca(1, 7);
   std::cout << (pnode1 ? std::to_string(pnode1->data) : "(null)") << std::endl;
}

That’s all folks

Lowest Common Ancestor (BST)

evilrix — Sun, 03 Nov 2013 19:23:31 +0000

The Binary Search Tree (BST) is a tree-like data structure that allows for the quick lookup of data. It works by storing inserted data in a tree-like structure that uses the “divide and conquer” approach to locate data. Each node in the tree has, at most, two children. The left hand side (lhs) node will always contain a value that is less than its parent. The right hand side (rhs) node will always contain a value that is greater than or equal to its parent.

Using this property we can very quickly traverse the tree to find a value, with the number of look-ups required being no greater than the maximum height of the tree. If we assume that the tree is well balanced this gives us a time complexity of order O(log n). In other words, the look-up time complexity is logarithmic. It’s important to know, however, that this only holds true if the tree is balanced. If it’s not, the worst case asymptotic time complexity is order O(n), or linear.

Why is this? Simply because in the worst case an unbalanced tree is just a linked list. Or, put another way, a linked list can be thought of as a special case tree; one that has just one branch and no divergences. Imagine a situation where we have a BST that isn’t self balancing and we insert data into it in ascending order. Each value inserted will be put to the right of the last, which is just another way of saying we’ve created a list of sorted numbers!
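
If you’d like to see that degeneration for yourself, the following sketch builds a plain (non self balancing) BST and measures its height. The BstNode type and the helper functions are invented for this illustration; the insert logic follows the same “less goes left, otherwise right” rule described above.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// A minimal node type, purely for this sketch.
struct BstNode
{
   explicit BstNode(int d) : data(d) {}

   int data;
   std::unique_ptr<BstNode> plhs;
   std::unique_ptr<BstNode> prhs;
};

// Plain BST insert with no re-balancing.
void bst_insert(std::unique_ptr<BstNode> & root, int data)
{
   std::unique_ptr<BstNode> * pp = &root;

   while (*pp)
   {
      pp = data < (*pp)->data ? &(*pp)->plhs : &(*pp)->prhs;
   }

   pp->reset(new BstNode(data));
}

// Number of levels in the tree (0 for an empty tree).
int bst_height(std::unique_ptr<BstNode> const & node)
{
   return node ?
      1 + std::max(bst_height(node->plhs), bst_height(node->prhs)) : 0;
}

// Inserts the values in the order given and reports the resulting height.
int height_after_inserting(std::vector<int> const & values)
{
   std::unique_ptr<BstNode> root;

   for (int v : values)
   {
      bst_insert(root, v);
   }

   return bst_height(root);
}
```

Inserting 1 through 7 in ascending order produces a tree that is 7 levels deep (a list in all but name), whereas inserting the same values in the order 4, 2, 6, 1, 3, 5, 7 produces a tree just 3 levels deep.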

So, from all this we can conclude that for a BST to be useful it really needs to be balanced. It just so happens that there are a number of BST implementations that will take care of this, automatically, when data is inserted. They have a “self balancing” property. Examples of such trees are the AVL Tree (named after the initials of the inventors) and the Red/Black tree (often the basis of std::set and std::map).

Of course, this article isn’t specifically about BST implementations but it is important to have a basic understanding of how a BST works before moving on to the topic in hand; namely, how to find the Lowest Common Ancestor of two values in our BST?!

I guess the first thing we should do is actually define what we mean by Lowest Common Ancestor. Put simply, the Lowest Common Ancestor of any two nodes in a BST is the node where the traversals to each of those nodes diverge down different branches. This might sound complicated but it is anything but.

Let’s look at a simple BST.

In this example, the Lowest Common Ancestor of 1 and 7 is 3. This is because node 3 is the node where traversal in locating these two nodes, 1 and 7, diverges into separate branches. Our task is to figure out a simple algorithm for finding the Lowest Common Ancestor that will work for any BST of any size. Sounds complicated, right? Actually, it’s about as simple an operation as things get when it comes to messing with trees. Here’s how it works…

We start at the root node and check to see if 1 and 7 are less than its value. If they are we haven’t found the LCA but we do know it’s somewhere down the left hand side branch so we can now move down to that node. At the next node we perform the same check to see if 1 and 7 are less than its value. This time around only one of them is less. At this point we can stop since we’ve now found the LCA.

We know this because at this point one number can be found to the Lhs of the current node and the other can be found on the Rhs of the current node. In other words at this point the traversal will diverge. Let’s think about it a little more abstractly by introducing a little pseudo code…

Let N be the current node.
Let X be one child node and Y be the other.

while not done
do
   if X < N and Y < N
   then
      N = Lhs
   else
   if X > N and Y > N
   then
      N = Rhs
   else
      done = True
   end
repeat

This loop will continue to traverse the tree one node at a time, either going left because both values can be found to the left of the current node or going right because both values can be found to the right of the current node. As soon as neither of these statements holds we have found the Lowest Common Ancestor.

And now, the real code…

/**
 * @brief Lowest Common Ancestor
 */

#include <iostream>
#include <memory>
#include <string>

namespace evilrix {
   namespace mostlycoding {

      /**
       * @brief A node object for our tree, below.
       *
       * @tparam T Generic type parameter representing our data.
       */

      template <typename T>
      struct Node
      {
         using Data = T;	///< The data
         using PNode = std::shared_ptr<Node>;	///< The node

         /**
          * @brief Initializes a new instance of the main class.
          *
          * @param data (Optional) the data.
          */

         Node(Data const & data = 0) : data(data) {}

         PNode plhs; ///< The plhs
         PNode prhs; ///< The prhs
         Data data;  ///< The data
      };

      /**
       * @brief A tree, implemented as a BST.
       *
       * @tparam T T Generic type parameter representing our data.
       */

      template <typename T>
      class Tree
      {
      public:
         using Data = T;	///< The data
         using PNode = std::shared_ptr<Node<T>>;	///< The node

         /**
          * @brief Initializes a new instance of the main class.
          */

         Tree() {}

         /**
          * @brief Inserts the given data.
          *
          * @param data The data.
          */

         void insert(Data const & data)
         {
            PNode * pproot = &proot_;

            // see my note in the "find_lca" function as to why I am using an
            // iterative rather than recursive traversal approach.
            while (*pproot)
            {
               pproot = data < (*pproot)->data ?
                  &(*pproot)->plhs : &(*pproot)->prhs;
            }

            (*pproot) = PNode(new Node<T>(data));
         }

         /**
          * @brief Searches for the first lca.
          *
          * @param x The first data item.
          * @param y The second data item.
          *
          * @return The found lca data item.
          */

         PNode find_lca(Data const & x, Data const & y) const
         {
            PNode const * pproot = &proot_;

            // Whilst tree traversal is normally done using recursion I've
            // opted for iterative just because it's easier to visualise what's
            // happening. Also, unless your compiler supports "tail recursion"
            // there is always a danger that the recursive approach could blow
            // up. Even if it doesn't, recursion has a O(n) space complexity
            // whereas iteration has a O(1) space complexity. Both have a time
            // complexity of O(n), assuming no branching is required.
            while (*pproot)
            {
               // traverse down the left hand side?
               if (x < (*pproot)->data && y < (*pproot)->data)
               {
                  pproot = &(*pproot)->plhs;
               }
               else
               // traverse down the right hand side?
               if (x >(*pproot)->data && y >(*pproot)->data)
               {
                  pproot = &(*pproot)->prhs;
               }
               else
               {
                  // we've reached the divisor so this node is the LCA
                  break;
               }
            }

            return (*pproot);
         }

         PNode proot_;
      };
   }
}

using namespace evilrix::mostlycoding;

/**
 * @brief Main entry-point for this application.
 *
 * @return Exit-code for the process - 0 for success, else an error code.
 */

int main(/* int argc, char * argv[] */)
{
   /*
    *         8
    *        / \
    *      (3)   9
    *      / \
    *   [1]   6
    *        / \
    *       4   [7]
    */

   Tree<int> tree;
   tree.insert(8);
   tree.insert(3);
   tree.insert(1);
   tree.insert(6);
   tree.insert(4);
   tree.insert(7);
   tree.insert(9);

   Tree<int>::PNode pnode = tree.find_lca(1, 7);

   std::cout << (pnode ? std::to_string(pnode->data) : "(null)") << std::endl;
}

Hopefully, this post has shown that finding the Lowest Common Ancestor in a BST is really not that hard. Like most algorithms, once you know and understand the logic behind how it works it’s pretty simple stuff. Of course, knowing is 90% of the battle!

Condition Variables

evilrix — Sun, 27 Oct 2013 22:22:53 +0000

You work in a bar, pouring pints for the locals. One of your regulars comes in; he’s looking pretty grumpy today. “Whiskey” he snaps. You put down a glass and pour. You finish pouring and he necks back the drink. “Again”, he snaps. Again, you pour and as soon as you finish he necks it. This repeats two or three more times before the grumpy man slams down the money for his tab and leaves. Congratulations, you have just taken part in a “Producer/Consumer” exchange.

The “Producer/Consumer” (P/C) is one of the most well known and useful design patterns. It has a plethora of uses and, yet, its premise is very, very simple. You have a “Producer”; an entity that provides something, and you have a “Consumer”; an entity that uses that resource. What makes the P/C pattern special is that access to the resource is mutually exclusive. When the Producer is producing the resource the Consumer has to wait. Just like, when you are pouring the drink the man has to wait. Likewise, when the Consumer is consuming the resource the Producer has to wait; try filling the man’s glass whilst he drinks if it’s not clear why this is so!

This article isn’t about the P/C design pattern, it’s about one of the fundamental thread synchronization objects used to successfully implement it. This article is about a simple but powerful entity called the “Condition Variable” (CV), so named because it is a variable that is shared between threads and used to allow one (or both) to notify the other of a certain condition. Never let it be said that programmers aren’t pragmatic when it comes to naming the tools of their trade!

The basic way a CV works is that it allows thread A to wait until such time that thread B signals that it can move on. You can sort of think of a CV as a smarter mutex, where the thread in question can explicitly wait for another thread to tell it when it can go. This differs from a standard mutex. You can think of a standard mutex as just a gate that is either locked or not. If it’s locked you wait and if it’s not you proceed.

A CV is more like a gate with a red light. When you arrive at the gate you press a button and the red light comes on. You then wait until the red light goes back out, indicating the person on the other side of the gate has acknowledged you and is now ready for you to enter. The gate may or may not be locked… it doesn’t matter since you don’t enter until the light has gone from on to off.

It will probably not surprise you, then, to find out that the CV is actually implemented with the help of a mutex (it’s actually implemented indirectly via the unique_lock object, but this is just to ensure any locks owned are automatically released in the face of an exception). To use a CV thread A will apply a lock to a mutex and then pass the lock into the CV’s wait method. The mutex, up until this point, will be owned by thread A.

Once the wait method is called ownership of the mutex is released. Meanwhile, thread B will be waiting for ownership of the mutex. Once thread A is waiting, thread B is released. It does what it must (pour the drink!?) and then it calls the notify method.

Once the notify is called thread A is released and can continue on its merry way. At this point, both threads are now released. In more complex situations it may be necessary to have B then wait on A and A then wait on B and so on, but when this happens it’s just a repeat of the initial basic steps discussed (although when B wants to wait on A the steps are, of course, swapped so that B starts first).

This is one of those code-flows that is harder to explain than it is to write, so rather than trying to explain things in even more detail and getting everyone (including me) confused, let’s have a quick look at a simple example.

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

using namespace std;

int main()
{
   mutex m;
   condition_variable cv;

   // THREAD PROC LAMBDA >>>
   auto tp = [&m, &cv]
   {
      // notify
      cout << "1" << endl;
      cv.notify_one();

      // wait
      {
         unique_lock<mutex> l(m);
         cv.wait(l);
      }

      // notify
      cout << "3" << endl;
      cv.notify_one();

      // wait
      {
         unique_lock<mutex> l(m);
         cv.wait(l);
      }

      // notify
      cout << "5" << endl;
      cv.notify_one();
   };
   // THREAD PROC LAMBDA <<<

   auto t = thread(tp);

   // wait
   {
      unique_lock<mutex> l(m);
      cv.wait(l);
   }

   // =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
   // notify
   cout << "2" << endl;
   cv.notify_one();

   // wait
   {
      unique_lock<mutex> l(m);
      cv.wait(l);
   }

   // notify
   cout << "4" << endl;
   cv.notify_one();

   // wait
   {
      unique_lock<mutex> l(m);
      cv.wait(l);
   }

   t.join();

   cout << "6" << endl;
}

Notice how we have a very simple pattern being repeated?

Thread A: get scoped mutex lock
Thread A: wait on CV using mutex

Thread B: do stuff
Thread B: notify
Thread B: get scoped mutex lock
Thread B: wait on CV using mutex

Thread A: do stuff
Thread A: notify
Thread A: get scoped mutex lock
Thread A: wait on CV using mutex

Thread B: do stuff
Thread B: notify
Thread B: get scoped mutex lock
Thread B: wait on CV using mutex

Thread A: do stuff
Thread A: notify
Thread A: get scoped mutex lock
Thread A: wait on CV using mutex

Thread B: do stuff
Thread B: notify
Thread B: thread exits

Thread A: wait for thread B to join

To summarise

For thread A to wait on thread B it needs to obtain a scoped unique_lock, and then use that with the CV’s wait method.
For thread B to tell thread A when it’s ready it calls notify_one.
For thread B to wait on thread A it needs to obtain a scoped unique_lock, and then use that with the CV’s wait method.
For thread A to tell thread B when it’s ready it calls notify_one.

That’s it, it’s as simple as that. Using this very simple pattern of locking, waiting and notifying, two threads can dance around each other in a sort of coroutine tango.
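
One caveat is worth a quick sketch. A notification sent when nobody is waiting is simply lost, and a waiting thread is allowed to wake up spuriously, so in practice a bare wait is usually paired with a shared piece of state and the predicate overload of wait. The following is a minimal illustration of that idea and not a drop-in replacement for the example above; the ping_pong function, the turn counter and the log string are all invented for the sketch.

```cpp
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>

// Two threads take strict turns appending the numbers 1 to 6 to a string.
std::string ping_pong()
{
   std::mutex m;
   std::condition_variable cv;
   int turn = 1;     // which step may run next (guarded by m)
   std::string log;  // the steps in the order they ran (guarded by m)

   // Wait until it is step n's turn, do the work, then hand over.
   auto step = [&](int n)
   {
      std::unique_lock<std::mutex> l(m);

      // The predicate is re-checked on every wake up, so lost or
      // spurious notifications cannot cause a step to run early or hang.
      cv.wait(l, [&] { return turn == n; });

      log += std::to_string(n);
      ++turn;
      cv.notify_one();
   };

   // The odd steps run on the new thread, the even steps on this one.
   std::thread t([&] { step(1); step(3); step(5); });
   step(2);
   step(4);
   step(6);
   t.join();

   return log;
}
```

Because each step only proceeds when the shared turn counter says so, the result is "123456" regardless of how the two threads happen to be scheduled.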

But wait, I can hear you screaming… what does this have to do with the P/C pattern? Well, here’s some pseudo code to clear that up.

drunk (consumer):
do
   wait for bartender to pour
   get glass
   drink whiskey
   put glass
repeat until passed out or no more whiskey

bartender (producer):
do
   get glass
   pour whiskey
   put glass
   wait for drunk to drink
repeat until drunk has passed out or no more whiskey
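
Translated into C++, the pseudo code above might be sketched along the following lines. The serve_drinks function, the glass_full flag and the fixed number of rounds are assumptions made for this illustration; the flag stands in for the glass and is what each side waits on.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// The bartender pours 'rounds' drinks and the drunk necks each one.
// Returns the number of drinks actually consumed.
int serve_drinks(int rounds)
{
   std::mutex m;
   std::condition_variable cv;
   bool glass_full = false; // the shared resource (guarded by m)
   int drinks_drunk = 0;    // guarded by m

   // drunk (consumer)
   std::thread drunk([&]
   {
      for (int i = 0; i < rounds; ++i)
      {
         std::unique_lock<std::mutex> l(m);
         cv.wait(l, [&] { return glass_full; }); // wait for the pour
         glass_full = false;                     // drink whiskey
         ++drinks_drunk;
         cv.notify_one();                        // put glass
      }
   });

   // bartender (producer)
   for (int i = 0; i < rounds; ++i)
   {
      std::unique_lock<std::mutex> l(m);
      cv.wait(l, [&] { return !glass_full; });   // wait for an empty glass
      glass_full = true;                         // pour whiskey
      cv.notify_one();                           // put glass
   }

   drunk.join();
   return drinks_drunk;
}
```

Each side only touches the glass whilst holding the lock, and each waits for the glass to be in the state it needs, which is the mutually exclusive hand-off described at the start of the article.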

Whilst the CV is useful for many reasons when it comes to thread synchronization, the P/C pattern is by far the most obvious place for it to be used. It’s hard to see how this simple but incredibly useful pattern could be implemented (so easily) without this flexible and effective thread primitive.

Variadic Functions the C++11 way

evilrix — Sat, 12 Oct 2013 22:20:48 +0000

The C++11 standard introduced so called Variadic Templates. These have many uses, one of which is the ability to write functions that take any number of arguments without having to mess around with C-style non-type safe “var-args” and printf like format specifiers.

It has always been possible to emulate variadic functions in C++, without using C style var-args, by using function objects and chained function operator calls. This, however, was never a satisfactory solution for a number of reasons:

The calling syntax is unnatural (it doesn’t look like a standard function call)
You have to write boiler-plate code in the guise of a function object (functor)
You have to cache arguments if you can’t handle them discretely (one at a time)
In which case arguments generally need converting to a single type (eg. string)
It required multiple function calls, which are not cost free in terms of performance
Overall, it was a clumsy solution to a problem

The standard istream and ostream are a good example of emulating variadic arguments. The istream is basically a functor object that does the same job as scanf and the ostream is a functor that does the same thing as printf. In each case, you use the stream operator to invoke the input or output functionality of the stream and you can chain these together to emulate variadic arguments.

Consider the following:

#include <cstdio>
#include <iostream>

using namespace std;

int main()
{
	printf("%s%d%c%.2f\n", "hello", 1, 'c', 56.23f);
	cout << "hello" << 1 << 'c' << 56.23f << std::endl;
}

Notice that line 9 is actually a number of chained calls to the streaming operator on the cout stream object? This is semantically identical to the variadic arguments of the printf function call before it. Of course, this is a contrived example but it serves to demonstrate the point. There is, of course, nothing to stop the determined C++03 programmer from creating their own function object class that supports a similar chaining syntax, either by overloading the streaming operator or, as is often the case, the function operator.

#include <cstdio>
#include <iostream>

using namespace std;

struct mystream_t
{
	template <typename T>
	mystream_t & operator ()(T const & t)
	{
		// cout used here by way of demonstration
		cout << t;
		return *this;
	}
};

mystream_t mystream;

int main()
{
	printf("%s%d%c%.2f\n", "hello", 1, 'c', 56.23f);
	mystream("hello")(1)('c')(56.23f)('\n');
}

As you can see this is a bit of a poor substitute for real variadic functions, so let’s see how we’d do this in C++11.

#include <cstdio>
#include <iostream>

using namespace std;

// This is the terminating template
template <typename Arg>
void variadic_func(Arg const & arg)
{
	std::cout << arg;
}

// This is the variadic args handling template
template <typename Arg, typename ... Args>
void variadic_func(Arg const & arg, Args const & ... args)
{
	variadic_func(arg);
	variadic_func(args...);
}

int main()
{
	printf("%s%d%c%.2f\n", "hello", 1, 'c', 56.23f);
	variadic_func("hello", 1, 'c', 56.23f, '\n');
}

At this point you are probably looking at this and wondering how on earth this is any better than the previous example? Well, it is. It has many advantages over the chained function call:

The calling syntax is natural (it is just a standard function call)
No boiler-plate code, just your single arg function and the args wrapper version
No need to cache arguments as the “parameter pack” handles that for you
No need to convert arguments into a single type (such as a string)
It’s properly type safe, since each argument is handled as the correct type
It’s (mostly) simple; once you understand variadic templates it is easy to follow

Ok, so how does it work?

Firstly, you’ll notice there are actually two versions of the function, one that takes two arguments and one that takes only one argument. In fact, the one that seems to take two arguments takes two or more arguments. You see, the second argument is a so called “parameter pack”. When you call the function with just a single argument it will call the first form. When you call the function with multiple arguments it will call the second form. The second form receives one argument and a parameter pack, which is a special argument that contains all the other arguments as a single entity.

Now, you can’t access these other arguments directly so to resolve them we use a little bit of recursion trickery. Firstly, we handle the single argument by calling the single argument form of the function, we then call the multiple argument form of the function and only pass it the parameter pack. The function is called again and this time the next item in the parameter pack is picked off as the single argument and the rest form the now smaller parameter pack. This repeats until there is only one argument left in the parameter pack, which means the recursion ends because the multiple argument form of the function is no longer called.
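The same peel-one-off recursion can be sketched with a function that returns a value, which makes the unwinding a little easier to observe. The name sum is an illustrative assumption and, to keep things simple, this sketch assumes all the arguments share the type of the first one:

```cpp
// Terminating overload: a single argument is its own sum.
template <typename T>
T sum(T const & value)
{
    return value;
}

// Variadic overload: peel off the first argument and recurse on the rest.
// The result is returned as the type of the first argument.
template <typename T, typename ... Rest>
T sum(T const & first, Rest const & ... rest)
{
    return first + sum(rest...);
}
```

Calling sum(1, 2, 3, 4) instantiates the variadic overload three times (with four, three and two arguments) before the final call binds to the single argument overload.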

This is easier to understand if we walk through what is happening (and once you get it I promise it makes perfect sense and is very easy to understand). Let’s go through it step by step:

variadic_func is called with "hello", 1, 'c', 56.23f, '\n'
	arg = "hello"
	args = { 1, 'c', 56.23f, '\n' }
	variadic_func is called with "hello"
	variadic_func is called with 1, 'c', 56.23f, '\n'
		arg = 1
		args = { 'c', 56.23f, '\n' }
		variadic_func is called with 1
		variadic_func is called with 'c', 56.23f, '\n'
			arg = 'c'
			args = { 56.23f, '\n' }
			variadic_func is called with 'c'
			variadic_func is called with 56.23f, '\n'
				arg = 56.23f
				args = { '\n' }
				variadic_func is called with 56.23f
				variadic_func is called with '\n'

So, as you can see, the variadic_func function just keeps calling itself until it runs out of arguments in the parameter pack, at which point the final call is to the overload that only takes a single argument and at that point the recursion ends. Of course, we still have the problem of multiple function calls but because this is all unravelled by the compiler when the templates are instantiated, rather than at runtime, the chances are the resultant code will be inlined.
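If the chain of recursive calls is a concern, C++11 also lets you expand a parameter pack inside a braced initializer list, which visits each argument in order without any recursion. This is a sketch of that alternative technique rather than the approach used above; write_all, to_text and the std::ostream parameter are illustrative choices:

```cpp
#include <ostream>
#include <sstream>
#include <string>

template <typename ... Args>
void write_all(std::ostream & out, Args const & ... args)
{
    // Each element of the expansion streams one argument and yields 0.
    // The elements of a braced initializer list are evaluated left to right.
    int const dummy[] = { 0, ((out << args), 0)... };
    (void) dummy;  // silence unused-variable warnings
}

// Convenience wrapper that collects the output in a string.
template <typename ... Args>
std::string to_text(Args const & ... args)
{
    std::ostringstream oss;
    write_all(oss, args...);
    return oss.str();
}
```

The leading 0 in the array keeps the initializer valid when the pack is empty.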

Variadic templates are not just available to functions, they can also be used with classes, too. In fact, the C++11 std::tuple is a great example. In a future article I’ll explore how you can use variadic templates with classes.

Down-casting re-visited

evilrix — Wed, 18 Sep 2013 22:24:24 +0000

In my previous article I discussed the difference between up-cast and down-cast and explained why down-casting is rarely a good idea (or even necessary). That said, there are a few times when down-casting is valid and so this article shows how to do so, safely, in C++.

The first thing to ask yourself when considering a down-cast is, “do I really need to do this?”. With a few exceptions (the curiously recurring template pattern being one of them) the need to down-cast is often a sign that your design is either wrong or your inheritance model needs some additional layers of abstraction. If, however, you are confident that the need to down-cast is valid you have a number of options open to you in C++; some are safer than others.

Runtime Down-casting

Static Down-cast

The most basic mechanism for down-casting is the simple static cast. Let’s look at an example:

#include <iostream>
using namespace std;

class Base
{
};

class Derived : public Base
{
    public:
        void foo() const
        {
            cout << "Hello, world" << endl;
        }
};

int main()
{
    Base && base = Derived();
    static_cast<Derived &>(base).foo();
}

In this example, we are creating an instance of Derived and assigning it to an r-value reference of the Base type. We then use the static cast mechanism to down-cast this reference to a reference of the Derived class such that we can call the foo function. This is an inexpensive down-cast in so far as it has no runtime overhead.

The problem, however, is that the object being referenced might not actually be a Derived (it could be some other class that derives from Base) and if that happens the result of this cast is undefined. Undefined behaviour is the C++ standard’s way of saying your code is defective! Unfortunately, undefined behaviour doesn’t mean the application will crash. In fact, you may see no initial ill effects at all but that doesn’t mean all is right and, eventually, the result of an incorrect cast will lead to tears.

Dynamic down-cast

Fortunately, C++ provides a mechanism for making this cast safe. This mechanism is the dynamic cast. One thing to note is that dynamic cast only works if the inheritance model is polymorphic. This means that at least one of the functions in the inheritance tree has to be virtual otherwise the compiler will spit out a compile time error.

Let’s look at a modified version of the previous code that makes use of dynamic cast:

#include <iostream>
#include <typeinfo>
using namespace std;

class Base
{
    public:
        virtual ~Base() {};
};

class Derived : public Base
{
    public:
        void foo() const
        {
            cout << "Hello, world" << endl;
        }
};

int main()
{
    Base && base = Derived();
    try
    {
        // Casting a reference: throws bad_cast if base is not a Derived
        dynamic_cast<Derived &>(base).foo();
    }
    catch(bad_cast const & e)
    {
        cerr << "cast failed: " << e.what() << endl;
    }
}

How does this differ from the previous example? There are two changes; the base class now implements a virtual destructor to make the inheritance model polymorphic and the cast is now performed using the dynamic cast. You’ll note that the cast is now wrapped by a try/catch block.

The way dynamic cast works is to use RTTI (Run Time Type Information) to determine if the cast is safe. If it is the cast is performed. If it isn’t then a “bad_cast” exception is thrown. Of course, your code still has to deal with the fact the cast has failed but at least we’re not into the weeds of undefined behaviour and, in the worst case, your program can fail gracefully.

It should be noted that the exception is only thrown when attempting to incorrectly cast a reference type. In the case of a pointer type the result of the cast will be a nullptr rather than the throwing of an exception. The following is an example using pointers rather than references:

#include <iostream>
using namespace std;

class Base
{
    public:
        virtual ~Base() {};
};

class Derived : public Base
{
    public:
        void foo() const
        {
            cout << "Hello, world" << endl;
        }
};

int main()
{
    Base * base = new Derived();
    if(Derived * derived = dynamic_cast<Derived *>(base))
    {
        derived->foo();
    }
    else
    {
        cerr << "cast failed" << endl;
    }
    delete base;
}

It should be pretty obvious what’s happening here. If the cast works the foo function is called, else an error message is sent to stderr. Of course, this is just an example and so your code would need to take the necessary action in the else clause to gracefully handle the situation of the cast failing.
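To see the failure paths in action you need a cast that really is invalid. The following sketch adds a hypothetical sibling class, Other, which is not part of the examples above, and wraps each form of the cast in a small helper so the two behaviours can be compared side by side:

```cpp
#include <typeinfo>

struct Base { virtual ~Base() {} };
struct Derived : Base {};
struct Other : Base {};  // hypothetical sibling, used to force a failure

// Pointer form: a failed cast yields nullptr.
bool pointer_cast_ok(Base * base)
{
    return dynamic_cast<Derived *>(base) != nullptr;
}

// Reference form: a failed cast throws std::bad_cast.
bool reference_cast_ok(Base & base)
{
    try
    {
        Derived & derived = dynamic_cast<Derived &>(base);
        (void) derived;  // only the success of the cast matters here
        return true;
    }
    catch (std::bad_cast const &)
    {
        return false;
    }
}
```

Passing a Derived to either helper reports success; passing an Other reports failure, via nullptr in the first case and via the exception in the second.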

Notice something interesting about this code? Notice how we are able to define a new variable as part of the test expression in the ‘if’ statement and then use that instance in the code-block (in this case to call foo)? This syntax was specially added to the C++ standard to allow exactly this idiom of casting and allowing some specific action if the cast works whilst restricting the scope of the temporary pointer to the place where it’s being used.

So, we now have a safe mechanism for implementing down-casting… that’s it, right? Well, actually, no it’s not. The problem here is that the invalid down-cast is detected at runtime, which means your code must be written to deal with the failure and by then it’s too late (normally) to do anything sensible other than fail gracefully. The dynamic cast is also hugely costly in terms of runtime performance. You do not want to do this in code that is called frequently!

Can we do better? You bet we can!

Compile-time Down-casting

What we really want is a type safe way to perform this down-cast such that if it’s going to fail we’ll get a compile time error. In this way, we know that if our code compiles the cast must be valid and it will never fail at runtime.

But, wait… dynamic casting is a runtime thing isn’t it? Static casting is compile time but that won’t trap errors at all and will just result in a badly behaved app if the cast is not valid. That’s it then isn’t it? All our options are used up? Actually, no – we have one more trick up our sleeve; down-casting using the Visitor Pattern.

Visitor Pattern Down-casting

The Visitor Pattern was originally conceived as a way of divorcing an algorithm from the object to which that algorithm is being applied. Normally, when you implement an object you implement methods for that object that manipulate it. If you need to change the algorithms that perform this manipulation you have to change the object.

The visitor pattern keeps the two things separate such that you can modify the algorithms without needing to modify the objects that they are applied to. This means you can add new algorithms to be applied to an object without needing to make any changes to the object.

The Visitor Pattern works using “Double Dispatch“. Ordinarily, when you call a virtual function on an object the function called depends on the dynamic type of the object to which the function belongs. No account is taken of the dynamic type(s) being passed into the function. This is called “Single Dispatch“.

In case it isn’t clear, the static type is the type a reference (or pointer) is declared with and the dynamic type is the concrete type of the object it actually refers to. In the example code shown so far, base has the static type of Base and the dynamic type of Derived since that is the type that it actually references.
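One way to observe the difference between the two is the typeid operator which, when applied to a reference to a polymorphic type, reports the dynamic type. A small sketch, where the helper name refers_to_derived is made up for illustration:

```cpp
#include <typeinfo>

struct Base { virtual ~Base() {} };
struct Derived : Base {};

// The static type of `base` is always Base. Because Base is polymorphic,
// typeid reports the dynamic type of the object actually referenced.
bool refers_to_derived(Base const & base)
{
    return typeid(base) == typeid(Derived);
}
```

Passing a Derived object gives true and passing a plain Base object gives false, even though the parameter has the same static type in both calls.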

In the Visitor Pattern both the dynamic type of the object to which the function belongs and the dynamic type of the object(s) being passed to the function determine which function override gets called. Some programming languages implement this natively; C++ does not. The Visitor Pattern allows us to emulate this by making use of the C++ type system.

The following is the same example modified to use the Visitor Pattern:

#include <iostream>
using namespace std;

class Derived;

class Visitor
{
    public:
        void visit(Derived & derived);
};

class Base
{
    public:
        virtual ~Base() {};
        virtual void accept(Visitor && visitor) = 0;
};

class Derived : public Base
{
    public:
        void accept(Visitor && visitor)
        {
            visitor.visit(*this);
        }

        void foo() const
        {
            cout << "Hello, world" << endl;
        }
};

void Visitor::visit(Derived & derived)
{
    derived.foo();
}

int main()
{
    Base && base = Derived();
    base.accept(Visitor());
}

Ok, I’ll be the first to admit that the visitor pattern isn’t pretty when compared to the elegance of a dynamic cast. It is, however, a significant improvement in terms of type safety. You see, as long as this compiles we know that it will work just fine at runtime. But, how does it work?

As you can see Base declares a pure virtual function called accept, which is then implemented in the derived class. The purpose of this function is to “accept” the visitor object. The visitor is passed into the accept function, which will be called in the context of the dynamic type. In this case, it’ll be called in the context of Derived.

Once in this context the “visit” method on the visitor object is called and into that we pass a reference (or pointer) to the dynamic type of the object accepting the visitor. This means that the visitor is passed the correct context of the dynamic type, so that it can now treat it as a static type. In other words, although we started out with a reference to Base the visitor ends up with a reference to Derived and, as such, can safely deal with it in that context. The down-cast has been completed.

But wait… what if we were to invoke this Visitor-cast on an invalid type? Well, it should be pretty obvious! There is no method on the visitor for accepting any type other than Derived and so we’d just get a compile time error. In other words, if the cast is invalid the code won’t build.

To add support for different types that can be safely down-cast from Base you just need to add additional overloaded visit functions to the visitor. In each case the visit function will be called in the correct context of the dynamic type and, as such, you can perform whatever action is valid for that type.

Let’s look at an example for multiple derived types:

#include <iostream>
using namespace std;

class Derived1;
class Derived2;

class Visitor
{
    public:
        void visit(Derived1 & derived);
        void visit(Derived2 & derived);
};

class Base
{
    public:
        virtual ~Base() {};
        virtual void accept(Visitor && visitor) = 0;
};

class Derived1 : public Base
{
    public:
        void accept(Visitor && visitor)
        {
            visitor.visit(*this);
        }

        void foo() const
        {
            cout << "Hello, Derived1::foo" << endl;
        }
};

class Derived2 : public Base
{
    public:
        void accept(Visitor && visitor)
        {
            visitor.visit(*this);
        }

        void bar() const
        {
            cout << "Hello, Derived2::bar" << endl;
        }
};

void Visitor::visit(Derived1 & derived)
{
    derived.foo();
}

void Visitor::visit(Derived2 & derived)
{
    derived.bar();
}

int main()
{
    Base && base1 = Derived1();
    base1.accept(Visitor());

    Base && base2 = Derived2();
    base2.accept(Visitor());
}

Here we have two different concrete types that both derive from Base. If the Base reference refers to a Derived1 we want to call its member function foo(). If it refers to a Derived2 we want to call its member function bar(). Since these two functions are not common to Base we need to perform a down-cast to the appropriate type. The visitor takes care of that for us.
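The visitor in these examples is a single concrete class. A common variation, sketched here as an assumption rather than as part of the design above, makes the visit functions virtual so that new algorithms can be added as new visitor classes without touching Derived1 or Derived2. The NameVisitor class and the name_of helper are hypothetical:

```cpp
#include <string>

class Derived1;
class Derived2;

// Abstract visitor: one overload per concrete type.
class Visitor
{
public:
    virtual ~Visitor() {}
    virtual void visit(Derived1 & derived) = 0;
    virtual void visit(Derived2 & derived) = 0;
};

class Base
{
public:
    virtual ~Base() {}
    virtual void accept(Visitor & visitor) = 0;
};

class Derived1 : public Base
{
public:
    void accept(Visitor & visitor) override { visitor.visit(*this); }
};

class Derived2 : public Base
{
public:
    void accept(Visitor & visitor) override { visitor.visit(*this); }
};

// One algorithm among potentially many: record which type was visited.
class NameVisitor : public Visitor
{
public:
    void visit(Derived1 &) override { name = "Derived1"; }
    void visit(Derived2 &) override { name = "Derived2"; }
    std::string name;
};

// Helper showing the double dispatch starting from a Base reference.
std::string name_of(Base & base)
{
    NameVisitor visitor;
    base.accept(visitor);
    return visitor.name;
}
```

The trade-off is an extra virtual call per visit in exchange for being able to add further visitors later without changing the visited classes.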

Of course, the Visitor Pattern isn’t perfect. It necessitates that your objects be modified to support it and the resultant code can be a little convoluted. That said, the times you need to down-cast should be few and far between and, in this author’s humble opinion, the ability to ensure your down-casts are safe at compile time far outweighs the cons.

Bloodlines and casting

evilrix — Mon, 26 Aug 2013 20:03:04 +0000

When working with objects that have an inheritance model you basically have an inverted tree that represents your object hierarchy. Unlike normal everyday trees, an inheritance tree has its root at the top. In other words, the root of the tree represents the base class and anything below it represents a more derived class.

If we move from a more specialised class to a less specialised class we are moving up the inheritance tree. Likewise, if we move from a less specialised class to a more specialised class we are moving down the inheritance tree. These two methods of traversing the inheritance tree are called up-casting and down-casting, respectively.

In Object Oriented Programming parlance, casting is a process of starting off with a concrete type and then referencing it via another type in its object inheritance hierarchy. As long as said reference is any type from the original concrete type back up through its blood-line to the ultimate base class this is a perfectly safe thing to do. Of course, if you try and cast outside of the blood-line you are asking for trouble.

Up-casting is going from a derived class to a base class. This is always safe because we are moving from an object that has more detail to an object that has less detail. It’s like a slim person putting on large clothes; they may be baggy but you can always get inside them.

The contrary is not true. When down-casting, going from a base class to a derived class, we are starting with an object that has less detail and trying to cast to an object that has more detail. It’s like being a large person trying to put on small clothes. You might be able to get them on but there is a real danger they might split and leave you exposed!

There is always a one-to-one relationship when up-casting, since a child only has one parent. This means we know we can safely cast from derived to base because we always know its parent. The same isn’t true when down-casting. A parent might have many children. How do we know that the one we’re about to down-cast to is the child we started off with? If we down-cast to the wrong child it is likely to have a tantrum!

Let’s consider an example to put this into some context. Suppose we have a base class called Animal. From that we derive two classes, one called Mammal and another called Reptile. From Mammal we derive Dog and Cat and from Reptile we derive Snake and Lizard. Our class hierarchy now looks like this:

                         +---------+
                         | Animal  |
                         +----+----+
                              |
       +---------+            |            +---------+
       | Mammal  |<-----------+----------->| Reptile |
       +----+----+                         +----+----+
            |                                   |
+-------+   |   +-------+         +---------+   |   +----------+
|  Dog  |<--+-->|  Cat  |         |  Snake  |<--+-->|  Lizard  |
+-------+       +-------+         +---------+       +----------+

So, looking at this we can see that there are four clear blood-lines:

Dog >> Mammal >> Animal
Cat >> Mammal >> Animal
Snake >> Reptile >> Animal
Lizard >> Reptile >> Animal

If we start off with a Dog we can safely represent it via a Mammal reference or an Animal reference. Put into Object Oriented Programming parlance, we can say that a dog IS_A mammal and that a mammal IS_A(n) animal. The IS_A directive basically means that there is an inheritance relationship that models the concept that a derived class is just a more specialised base class.

In this case a dog is just a special mammal; it’s a mammal that just happens to have the additional characteristics of a dog. For that reason we can safely cast from a dog to a mammal because we’re just going to ignore the additional doggy type characteristics. Of course, because we started off with a dog and then cast to a mammal, we can safely cast back to a dog.

But what if we didn’t know this was the case? I mean, what if we don’t know that the original concrete type was a dog? Consider, I have a reference to a mammal but I have no idea whether it actually refers to a concrete type that is a dog or a concrete type that is a cat. Can I now safely cast it to a dog? What if it actually started out life as a cat? If I try and cast it to a dog it’s likely to end up with multiple personality disorder!

This is why down-casting can be so problematic. With an up-cast you always know what you’re getting because you know a derived type IS_A base type (it just has extra bells and whistles). The converse; however, is not true. A Dog may very well be a mammal but a mammal doesn’t necessarily have to be a dog.
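In C++ terms, part of the hierarchy above might be sketched as follows. The helper names as_mammal and is_dog are assumptions made for illustration, and the virtual destructor is only there so that the runtime check is possible:

```cpp
struct Animal { virtual ~Animal() {} };
struct Mammal : Animal {};
struct Dog : Mammal {};
struct Cat : Mammal {};

// Up-cast: implicit and always valid, because a Dog IS_A Mammal.
Mammal & as_mammal(Dog & dog)
{
    return dog;
}

// Down-cast: only valid if the mammal really started life as a Dog.
// dynamic_cast on a pointer yields nullptr when it did not.
bool is_dog(Mammal & mammal)
{
    return dynamic_cast<Dog *>(&mammal) != nullptr;
}
```

A Dog viewed as a Mammal passes the is_dog check, whereas a Cat viewed as a Mammal does not, which is exactly the ambiguity described above.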

So, the moral of this story is to avoid down-casting. It’s fraught with danger and is almost certainly going to end in tears unless you’re really (REALLY) careful! C++ does give you ways to do this safely (to be discussed in my next article) but just because you can do something doesn’t mean you should!

As it happens, there should rarely be a need to consider down-casting. If your class inheritance hierarchy has been designed properly you should never even need to care about down-casting. In fact, if you do find yourself needing to down-cast the chances are your class hierarchy is wrong and needs to be reviewed. The need to down-cast is nearly always a sign that your inheritance model is wrong.

Ask yourself, why do I need to do this down-cast? If it’s to get access to a member that isn’t in the current class but does exist in a more derived class then either you need to be using a more derived reference or you probably need to consider putting the member into the base class. If the latter doesn’t make sense it’s possible that what you need is an additional level of abstraction.

For example, we could have just had Animal as one class and then Dog, Cat, Snake and Lizard as derived classes from that but what if we then wanted to add a “lactate” method? With a few minor exceptions, only mammals lactate (as far as I know; I am not a zoologist!) so we could not put this in the base class Animal. We can; however, add this as a method in the Mammal class and then specialise it for Cat and Dog. We can then handle mammals differently from reptiles quite easily.
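A minimal sketch of that extra layer of abstraction might look like this. The return values and the feed_young helper are invented purely so there is something to observe:

```cpp
#include <string>

struct Animal { virtual ~Animal() {} };

// The additional level of abstraction: behaviour common to mammals lives here.
struct Mammal : Animal
{
    virtual std::string lactate() const = 0;
};

struct Reptile : Animal {};

struct Dog : Mammal
{
    std::string lactate() const override { return "dog milk"; }
};

struct Cat : Mammal
{
    std::string lactate() const override { return "cat milk"; }
};

struct Snake : Reptile {};

// Works for any mammal without needing a down-cast to Dog or Cat.
std::string feed_young(Mammal const & mammal)
{
    return mammal.lactate();
}
```

Since Snake does not derive from Mammal, trying to pass one to feed_young simply won’t compile, so no cast is ever needed to tell the two branches apart.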

In my next article I’ll focus more specifically on C++ where I’ll explore how, for those very rare cases where it might be necessary, it is possible to safely down-cast. I’ll be looking at two methods: runtime type checking via dynamic_cast and compile time type checking via the visitor pattern.

LRU Cache Implementation

evilrix — Sat, 03 Aug 2013 18:38:14 +0000

One of the problems any developer will eventually have to resolve is one of latency; specifically, being able to retrieve and process data in a timely fashion. This issue can come in many guises but they generally manifest as needing to read data from a backing store that cannot deliver the high performance needed by the application. This can be a tricky problem to solve but the general method is to implement some form of caching. The remainder of this article will discuss one caching mechanism, called the LRU Cache.

Firstly, let’s define what we mean by “cache”. Put simply, a cache is a bounded storage mechanism that allows low latency access to data. In other words, it’s a temporary storage area that can hold a sub-set of the data you’re working with to provide faster access to that data than you’d otherwise expect if retrieving the data directly from the backing store. When I use the term “backing store” I am referring to any mechanism that is able to store data for retrieval, such as local disk, remote database, web service or, in fact, anything that is capable of serving data to your application.

So, what is an LRU Cache? It turns out that most of us have used an LRU without even realizing it. If you’ve ever used Microsoft Word, or, in fact, any application that allows you to open up your recent documents, you are actually making use of an LRU. LRU stands for Least Recently Used and it refers to the eviction strategy employed by the cache. As noted before, a cache is bounded. This means that it has an upper limit on the amount of data it can store. Once this upper limit is met it is necessary to evict some data. With an LRU the least recently used data is purged. In other words, if it’s not been used for a long time the chances are we don’t need it anymore.

There are plenty of strategies for eviction but the LRU is probably the most common as it provides a good all-round strategy for most use-cases and requires no specialized knowledge of the problem domain. In other words, it uses a simple, naive but generally quite sensible strategy that is based upon usage of the data rather than the data itself. Other, more complex caches require specialized knowledge either about the data or the way the data will be used.

One of the most common uses of an LRU is as a Page Replacement Algorithm (PRA). All modern Operating Systems make use of Virtual Memory. This allows the OS to run far more applications than would otherwise fit into memory. It works by using the disk as a temporary memory store; writing memory “pages” to disk when they are not needed. If the physical memory is becoming full the PRA will look for pages that can be paged to disk. This is often done by purging those pages that haven’t been needed for a while. It is reasonable to assume that if they’ve not been used for a while they are likely to be less important than a recently used page.

Whilst this strategy isn’t perfect and there are times where it can backfire (who hasn’t had a computer that has started “thrashing”?), for the most part it is a reasonable strategy that will fit well with most latency problems. In general, an LRU is a good place to start and you should only start implementing more complex strategies if performance profiling with “representative” data suggests otherwise.

Ok, so how complicated is it to implement one of these LRU Caches then? As it turns out it’s actually not that difficult at all. You can think of an LRU Cache as a bounded key value store. Of course, you need a key because you need to be able to find your data once it’s put in the cache. The value is, of course, the data you’re caching. We can implement a simple LRU Cache using nothing more than std::map (or std::unordered_map) and std::list.

The list is used as a queue and represents the LRU. The order of the queue determines the priority and order for eviction. The size of the list is bounded, with recent items being pushed into one end and the least recent items popping out the other. Let’s assume we decide to bound the list to 10 items. We can push 10 items into it no problem. What happens when we push item number 11? Simple, the first item we pushed in (item 1) is popped off the end of the queue to make space for item 11.
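The bounded queue on its own can be sketched in a few lines. The helper push_recent is hypothetical, and the choice to treat the front of the list as the most recent end is an assumption of this sketch:

```cpp
#include <cstddef>
#include <list>

// Hypothetical helper: push `item` as the most recent entry and evict the
// least recent entries if the list grows beyond `capacity`.
template <typename T>
void push_recent(std::list<T> & queue, T const & item, std::size_t capacity)
{
    queue.push_front(item);      // most recent goes in at the front
    while (queue.size() > capacity)
    {
        queue.pop_back();        // least recent falls off the back
    }
}
```

Pushing items 1 through 11 into a list bounded to 10 leaves items 2 through 11 in the list, with item 11 at the front and item 2 next in line for eviction.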

Simple eh? Not quite. You see, we have one special case we need to take care of. What if item 11 is actually item 5? In other words, we’re using something again that we used previously that is still in our list. Well, that’s also quite simple. We just remove the original item 5 from the list and re-queue. Ah, but there’s now another problem. How do we find the item in the queue? Sure, we could do a linear search for it and in a list of only 10 items that’s probably going to work out just fine. What if the list has 5 million items in it? A linear search is not going to work too well; it’ll be way too slow.

The answer is to use an index, which tracks the items in the list so we can find them quickly. We do this by using a map (or unordered_map), that has the item key as its key and an iterator to the item in the list as its value. In other words, we use our index to map the key to the iterator that represents it in the list. The std::list has a nice property that neither appending nor deleting items will invalidate iterators (other than the iterator of the deleted item). This means that if we need to handle an item we’ve seen again we can use the map to find the item in the list, erase the existing item and re-queue it again.
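Stripped down to bare keys, the index idea might be sketched like this. The typedefs and the touch helper are illustrative assumptions, and eviction is deliberately left out to keep the focus on the lookup:

```cpp
#include <list>
#include <map>

typedef std::list<int> queue_type;
typedef std::map<int, queue_type::iterator> index_type;

// Hypothetical helper: mark `key` as the most recently used item.
void touch(queue_type & queue, index_type & index, int key)
{
    index_type::iterator found = index.find(key);
    if (found != index.end())
    {
        // The stored iterator takes us straight to the list node,
        // so no linear search of the list is needed.
        queue.erase(found->second);
    }
    queue.push_front(key);       // re-queue as the most recent item
    index[key] = queue.begin();  // remember where it now lives
}
```

Touching keys 1 to 10 and then key 5 again leaves ten items in the list with 5 at the front, and the only searching done is the lookup in the map.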

So, that’s the theory; now let’s look at the code!

/*vim: set ft=cpp ts=3 sw=3 tw=0 sts=0 et:*/
/**
* @file lru.hpp
* @brief Least recently used cache
* @author Ricky Cormier
* @version See version.h (N/A if it doesn't exist)
* @date 2013-07-29
*/

#pragma once

#include <initializer_list>
#include <list>
#include <map>

namespace amp { namespace container {

   template <
      typename K,
      typename V,
      template <typename ...> class M = std::map>
   class lru
   {
   public:
      typedef K key_type;
      typedef V mapped_type;
      typedef size_t size_type;
      typedef std::pair<key_type, mapped_type> value_type;

   private:
      typedef std::list<value_type> cache_type;
      typedef M<key_type, typename cache_type::iterator> index_type;

   public:
      typedef typename index_type::iterator iterator;
      typedef typename index_type::const_iterator const_iterator;

   public:
      /**
      * @brief construct a new lru
      *
      * @param n : is the initial capacity
      */
      lru(size_type const n = 0)
         : capacity_(n)
      {
      }

      /**
      * @brief construct a new lru
      *
      * @param list : initialization list
      */
      lru(std::initializer_list<value_type> list)
         : capacity_(list.size())
      {
         for (auto const & item : list)
         {
            insert(item);
         }
      }

      /**
      * @brief Move constructor
      *
      * @param other lru
      */
      lru(lru && other)
         : index_(std::move(other.index_))
         , cache_(std::move(other.cache_))
         , capacity_(std::move(other.capacity_))
      {
      }

      /**
      * @brief get beginning and end of lru container
      *
      * @return [const_]iterator
      */
      const_iterator begin() const
      {
         return index_.begin();
      }

      const_iterator end() const
      {
         return index_.end();
      }

      iterator begin()
      {
         return index_.begin();
      }

      iterator end()
      {
         return index_.end();
      }

      /**
      * @brief find an item in the lru
      *
      * @param k : key for wanted item
      *
      * @return [const_]iterator to item (or end() if not found)
      */
      const_iterator find(key_type const & k) const
      {
         return index_.find(k);
      }

      iterator find(key_type const & k)
      {
         return index_.find(k);
      }

      /**
      * @brief erase item from lru using key
      *
      * @param k : key for item that is to be removed
      */
      void erase(key_type const & k)
      {
         erase(index_.find(k));
      }

      /**
      * @brief erase item from lru using iterator
      *
      * @param itr : iterator to item that is to be removed
      */
      void erase(iterator itr)
      {
         if (itr != index_.end())
         {
            cache_.erase(itr->second);
            index_.erase(itr);
         }
      }

      /**
      * @brief clear lru (capacity is unchanged)
      */
      void clear()
      {
         index_.clear();
         cache_.clear();
      }

      /**
      * @brief get the number of items in the lru
      *
      * @return the number of items currently in the lru
      */
      size_type count() const
      {
         return size();
      }

      size_type size() const
      {
         return index_.size();
      }

      /**
      * @brief set the capacity for this lru
      *
      * @param n : the capacity to be set
      */
      void capacity(size_type const n)
      {
         capacity_ = n;

         while (cache_.size() > capacity_)
         {
            erase(cache_.back().first);
         }
      }

      /**
      * @brief get capacity of this lru
      *
      * @return the capacity
      */
      size_type capacity() const
      {
         return capacity_;
      }

      /**
      * @brief swap another lru's contents with this lru
      *
      * @param rhs : the other lru
      */
      void swap(lru & rhs)
      {
         index_.swap(rhs.index_);
         cache_.swap(rhs.cache_);
         std::swap(capacity_, rhs.capacity_); // keep each capacity with the contents it was sized for
      }

      /**
      * @brief index operator
      *
      * @param k : the key
      *
      * @return the item that maps to the key value
      */
      mapped_type & operator[](key_type const & k)
      {
         auto itr = find(k);

         if (itr == end())
         {
            itr = insert(std::make_pair(k, mapped_type())).first;
         }

         return itr->second->second;
      }

      /**
      * @brief insert a new item into the lru
      *
      * @param kv : the key/value pair to insert
      *
      * @return pair:
      * first - iterator to the inserted item
      * second - true if new item or false if item already exists
      */
      std::pair<iterator, bool> insert(value_type const & kv)
      {
         cache_.push_front(kv);
         auto itr = find(kv.first);

         if (itr == index_.end())
         {
            if (cache_.size() > capacity_) { erase(cache_.back().first); }
            itr = index_.insert(std::make_pair(kv.first, cache_.begin())).first;
            return std::make_pair(iterator(itr), true);
         }

         cache_.erase(itr->second);
         itr->second = cache_.begin();
         return std::make_pair(iterator(itr), false);
      }

   private:
      index_type index_;
      cache_type cache_;
      size_type capacity_;
   };

}}

As you can see, it’s a pretty simple data structure to implement, with the code not really doing much more than marshalling between the list and the index to ensure the eviction policy is implemented correctly.

The complete code for this, and other useful classes, can be found in my Amp library, which is hosted on GitHub.

Geo-indexing problem

evilrix — Mon, 15 Jul 2013 21:30:45 +0000

Imagine you have a map and on that map you define a bunch of geo-locations: polygons, which are defined by their vertices as latitude and longitude co-ordinates. These geo-locations may overlap and may either be very big or very small (or in-between). The problem is to figure out, for any point on the map, which of these geo-locations bound it.

Finding whether a polygon bounds a point is actually pretty simple. The problem here is that the number of polygons I have to work with is huge (millions and millions) and the answer has to be delivered super-duper fast; in the order of a couple of milliseconds, max! Of course, it is possible to brute-force a solution by checking the point against each and every polygon but that takes far too long.
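
To show just how simple the single-polygon test is, here is a minimal sketch of the classic even-odd (ray casting) check. The names point and point_in_polygon are my own for illustration, and the sketch assumes the co-ordinates can be treated as a flat plane (it ignores things like polygons that cross the antimeridian).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical vertex type: x is longitude, y is latitude.
struct point
{
   double x;
   double y;
};

// Even-odd (ray casting) test: count how many polygon edges a horizontal
// ray starting at p crosses. An odd count means p is inside the polygon.
bool point_in_polygon(point const & p, std::vector<point> const & poly)
{
   bool inside = false;
   std::size_t const n = poly.size();

   for (std::size_t i = 0, j = n - 1; i < n; j = i++)
   {
      point const & a = poly[i];
      point const & b = poly[j];

      // Edge a-b must straddle the ray's y value and the crossing point
      // must lie to the right of p for the edge to count.
      if ((a.y > p.y) != (b.y > p.y) &&
          p.x < (b.x - a.x) * (p.y - a.y) / (b.y - a.y) + a.x)
      {
         inside = !inside;
      }
   }

   return inside;
}
```

The cost is linear in the number of vertices, which is fine for one polygon and hopeless when repeated across millions of them.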

What’s needed is a way of producing a “candidate” list; a small sub-set of the complete list that can be checked very quickly.

It turns out there are a number of possible ways of doing this. One way is to use a “quad tree” to section off the map into smaller and smaller segments. It’s, effectively, like performing a 2D binary search; drilling down on the map until you get to a small enough quadrant that you have a candidate set.

This approach does have its downsides!

Depending on the distribution of your polygons in 2D space you could end up with a greatly unbalanced tree, which could be very costly in terms of performance. Also, it’s necessary to test every candidate found and where there are lots of overlapping polygons this could end up being a large number.

I need a better, faster and more efficient solution!

After giving this problem a lot of thought I’ve come up with an approach that, at least on paper (because I’ve yet to code it up), should provide a way of identifying the bounding polygons in near constant time. Unfortunately, it’s slightly complex and a little difficult to explain and, I suspect, will be even harder to code but if it works it will be quite a nice and elegant solution. I decided to share my idea in the hopes of getting constructive feedback.

Here goes…

This is a proposed framework to build a pre-computed point-in-polygon index, which can be used to perform constant time O(1) translation from Lat/Long to geo-location polygon items. Unfortunately, the process to create this index is a little convoluted; however, it makes use of standard algorithms and so it should be possible to use pre-written libraries for the more complex parts of the process.

The solution works by creating a collection of virtual polygons that represent all the intersections of real polygons such that the virtual polygons never overlap. A good way to think of this is to imagine three glass polygons overlapping. If you focus on the edges of each polygon you’ll be able to see that you can actually visualise a larger set of polygons made up of the edges and where they intersect.

The above diagram demonstrates a simple example of 3 ‘glass’ polygons (rectangles for the sake of keeping things simple) overlapping. As can be seen, the edges of these polygons intersect and these intersection points can be assigned vertices, thus creating a collection of virtual polygons.

What’s interesting about these virtual polygons is that they do not overlap; they overlap real polygons but not each other. We can use this property to our advantage such that we can assume a one to many mapping of virtual polygons to the real polygons that they overlay. For example, VP7 overlays RP1, RP2 and RP3 but it does not overlay anything else. Using this we can create a simple virtual polygon to real polygon index.

VP1 = RP1
VP2 = RP2
VP3 = RP3
VP4 = RP1, RP2
VP5 = RP1, RP3
VP6 = RP2, RP3
VP7 = RP1, RP2, RP3
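
In code, the index above could be as simple as a map from a virtual polygon identifier to the list of real polygons it overlays. This is only a sketch: the string identifiers, the vp_index alias and both function names are made up for illustration.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical alias: virtual polygon id -> ids of the real polygons it overlays.
typedef std::map<std::string, std::vector<std::string>> vp_index;

// Builds the seven-entry index from the three-rectangle example.
vp_index make_example_index()
{
   return vp_index{
      {"VP1", {"RP1"}},
      {"VP2", {"RP2"}},
      {"VP3", {"RP3"}},
      {"VP4", {"RP1", "RP2"}},
      {"VP5", {"RP1", "RP3"}},
      {"VP6", {"RP2", "RP3"}},
      {"VP7", {"RP1", "RP2", "RP3"}}};
}

// Returns the real polygons for a virtual polygon, or an empty list if
// the virtual polygon is not in the index.
std::vector<std::string> real_polygons_for(vp_index const & index, std::string const & vp)
{
   auto const itr = index.find(vp);
   return itr == index.end() ? std::vector<std::string>() : itr->second;
}
```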

Obviously, figuring out the virtual polygons is by no means a simple task but it turns out that this is actually quite a common thing to want to do and is a staple of most image processing packages. The process is called “clipping” and there are libraries for this that support set operations on the polygons. I am hopeful that one of these libraries might be used to pre-compute the intersection vertices.

Of course, we still need a way to figure out which of the virtual polygons our point of interest resides in. To do this we can use a geohash that, effectively, divides the map into a grid with each sector in the grid being a bucket containing the details of the virtual polygons and which real polygons they overlay within that sector.

The example, above, has a point in the 4,5 sector. In this sector we find two virtual polygons, VP4 and VP7. The point is not in VP4; it is in VP7. Our index will map VP7 to RP1, RP2 and RP3 and so these are the real polygons our point resides within. Sectors that don’t contain any polygons do not need to be indexed.

When we are performing a lookup on a specific geo-point we use the same geohash to figure out what bucket it should map to and extract the corresponding virtual polygon candidates. Once we have these it’s a simple case of iterating the list until we find one our point lives within. The candidate list should be very small because virtual polygons don’t overlap.

Since we know none of these virtual polygons overlap we can stop searching as soon as we find a match. Unless we happen to have a grid location that happens to have an intersection of numerous polygons of acute angles there is a fair chance that most of these buckets will contain a small (4-6) number of virtual polygons to test.
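
The lookup described above might be sketched as follows. Bear in mind this is untested thinking-out-loud code, not a finished design: a plain fixed-size grid stands in for a real geohash, and every name here (sector, sector_for, virtual_polygon, lookup and so on) is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct point { double x; double y; }; // x = longitude, y = latitude

// A virtual polygon plus the real polygons it overlays.
struct virtual_polygon
{
   std::vector<point> vertices;
   std::vector<std::string> real_polygons;
};

typedef std::pair<long, long> sector; // grid column and row
typedef std::map<sector, std::vector<virtual_polygon>> sector_index;

// Stand-in for a geohash: map a point to the grid sector that contains it.
sector sector_for(point const & p, double const cell_size)
{
   return sector(static_cast<long>(std::floor(p.x / cell_size)),
                 static_cast<long>(std::floor(p.y / cell_size)));
}

// Even-odd point-in-polygon test (planar co-ordinates assumed).
bool contains(std::vector<point> const & poly, point const & p)
{
   bool inside = false;
   std::size_t const n = poly.size();

   for (std::size_t i = 0, j = n - 1; i < n; j = i++)
   {
      if ((poly[i].y > p.y) != (poly[j].y > p.y) &&
          p.x < (poly[j].x - poly[i].x) * (p.y - poly[i].y) /
                (poly[j].y - poly[i].y) + poly[i].x)
      {
         inside = !inside;
      }
   }

   return inside;
}

// Find the bucket for p, then test only the candidates in that bucket.
std::vector<std::string> lookup(sector_index const & index, point const & p, double const cell_size)
{
   auto const bucket = index.find(sector_for(p, cell_size));

   if (bucket != index.end())
   {
      for (auto const & vp : bucket->second)
      {
         // Virtual polygons never overlap, so the first hit is the only hit.
         if (contains(vp.vertices, p)) { return vp.real_polygons; }
      }
   }

   return std::vector<std::string>(); // sector not indexed or no polygon hit
}
```

A std::map is used purely to keep the sketch short; a hash-based container keyed on the geohash would be the more natural choice if the bucket lookup itself needs to be constant time.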

This being the case the amortised time complexity for performing a lookup in the computed index is approximately O(1). It’s actually O( n ) where n is the number of virtual polygons in that bucket but as this is likely to be a tiny number in proportion to the complete sample space we can assume, for all intents and purposes, that the amortised time is constant.

Of course, it is quite likely the final index will be quite large; however, this can be tuned by changing the granularity of the geohash. The greater the geohash granularity the bigger the index but the fewer virtual polygons that need testing. Whilst this may prove to be yet another problem this is one that is relatively simple to solve as there are more than enough solutions to work around this.

For example, if the index was too large to be kept in memory it could easily be stored in a persisted key/value document store (something like Kyoto Cabinet). Disk and memory are cheap and so the trade-off for the ability to perform geo-location lookups in almost constant time is likely to be worth it.

Doubt and uncertainty!

evilrix — Thu, 09 May 2013 22:03:46 +0000

The C and C++ standards documents can be a bit of a beast to trawl through and quite often you’ll find yourself reading the same sentence a number of times trying to fathom out what it is actually saying. It’s just like when you read the EULA for a software product; lots of big words and long sentences that don’t actually seem to make a lot of sense.

Of course, they do make sense but the language used is necessarily long winded because it has to cover all bases. After all, the standards documents are the final word in how the language is used and define what compiler vendors need to do to make sure their product behaves correctly and as expected.

Unfortunately, even with all the long sentences and big words the standards documents cannot hope to capture all cases. There are an infinite number of ways the C and C++ programming languages can be used to develop programs and there are a plethora of Operating Systems and hardware platforms that need to be considered.

To this end the standards have a get out of jail card. Put simply, anything that is not specifically defined by the documents as having an explicit behaviour is, by its very definition, implicitly undefined. Basically, if the standard doesn’t guarantee something you cannot safely write code that will depend on it, regardless of what your favourite compiler might do.

Actually, the standards documents do go one step further than this by quite often stating what the expectations should be for something that it doesn’t explicitly define. The standards documents use two different phrases to set the level of expectation and they both have very precise meaning as far as they are concerned:

undefined behaviour
unspecified behaviour

At face value these two phrases look pretty much the same. They both make it clear that the standards don’t provide any guarantees on the behaviour. They are, however, very different in terms of the semantics assigned to them by the standards. Let’s look at the formal wording for both of these as defined by both the C11 and C++11 standards.

Undefined behaviour

The C++11 standard

Behaviour for which this International Standard imposes no requirements. Undefined behaviour may be expected when this International Standard omits any explicit definition of behaviour or when a program uses an erroneous construct or erroneous data. Permissible undefined behaviour ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

The C11 standard

Behaviour, upon use of a non-portable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements. Possible undefined behaviour ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). An example of undefined behaviour is the behaviour on integer overflow.

Unspecified behaviour

The C++11 standard

Behaviour, for a well-formed program construct and correct data, that depends on the implementation. The implementation is not required to document which behaviour occurs. The range of possible behaviours is usually delineated by this International Standard.

The C11 standard

Use of an unspecified value, or other behaviour where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance. An example of unspecified behaviour is the order in which the arguments to a function are evaluated.

As you can see, in both cases the meanings are very well defined but to be sure what they mean is understood let’s try and put them into more every-day language.

Undefined behaviour has no sensible outcome as far as the standards documents are concerned. Things might work as you’d expect (hoped?) but then again they might not. If things do work they are working more by luck and chance than anything else.

If you write code that contains constructs that would result in undefined behaviour your code should be considered defective. That’s not to say things won’t work. They actually might but just because they work today doesn’t mean they’ll work tomorrow. Something that works by luck rather than design cannot really be considered working, can it?

To put this into a little perspective, let’s consider code that contains a buffer overrun that trashes a function’s stack frame. It might just be that the result of this exhibits no noticeable ill effects. At some point you add a new variable to the function, which is held on the stack, and as such changes all the off-sets for that stack-frame.

Bingo! The result is that this code may very well now behave in a completely different and random way, the result of which is… undefined!

Unlike undefined behaviour, code that relies on unspecified behaviour isn’t considered erroneous. On the contrary, it is well formed (assuming no other issues). In this case, however, the behaviour will be determined by the compiler, OS and/or hardware. As long as none of these change the behaviour will usually be consistent, although the standards don’t strictly promise even that. What certainly isn’t consistent is the behaviour when one or more of these is changed. In other words, the code is non-portable, or otherwise platform specific.

In other words, how the code behaves is dependent on the environment it is born and runs within. It will generally behave the same as long as the environment isn’t changed but the standards documents take no view on what this behaviour will be. It is necessary to refer to compiler, platform and/or hardware documentation to determine how the code will behave, bearing in mind that an implementation is not required to document its choice (where the standards do require this, they use the term implementation-defined behaviour).
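
The C11 example of unspecified behaviour, the order in which function arguments are evaluated, is easy to observe. The following is a small sketch (all the names are mine, purely for illustration) that records which of two argument expressions ran first. Either order is conforming, so no particular order is promised here.

```cpp
#include <vector>

// Records the order in which the two argument expressions were evaluated.
std::vector<int> call_order;

int first()  { call_order.push_back(1); return 1; }
int second() { call_order.push_back(2); return 2; }

int add(int const a, int const b) { return a + b; }

int sum_with_unspecified_order()
{
   call_order.clear();

   // Whether first() or second() runs first is unspecified. Both orders
   // are valid and the sum is 3 either way, so this code is well formed;
   // only code that depends on call_order's contents would be non-portable.
   return add(first(), second());
}
```

After calling sum_with_unspecified_order(), call_order holds either 1 then 2 or 2 then 1, depending on the compiler (and possibly its settings).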

Writing code that has unspecified behaviour is perfectly reasonable if you are targeting a particular platform using a particular compiler and the code is designed to run on a particular Operating System. The problem is that, because the standards make no promises about how the code will behave, if any of these environmental conditions change the behaviour of the code could be broken and it will no longer function as you’ve come to expect.

This means that it will be necessary to undergo intensive testing, for example, each time you upgrade your compiler, install a service pack onto the target platform or upgrade hardware. By contrast, code that is not platform dependent is guaranteed to always behave the same way as described by the standards documentation (assuming the compiler is standards compliant and contains, itself, no defects).

Writing code that has “undefined behaviour” is a recipe for disaster. It is almost certainly going to end in tears. If your code doesn’t have behaviour that is explicitly defined by the relevant standards document you should assume it is defective and you should fix it. Likewise, if the standard explicitly states the behaviour is undefined then it is defective and not fixing it is really the coding equivalent of playing Russian Roulette.

If your code contains constructs that are defined as having “unspecified behaviour” at least you can bask in the fact it is not defective; however, don’t be too complacent. It works today and will behave as you expect but you need to have your wits about you. If your environment changes then, so too, might the behaviour of your code.

Avoid relying on unspecified behaviour. Avoid it at all costs. If you need to do something that is platform specific consider using a quality library, such as Boost, that provides an abstraction between your code and the task you are looking to achieve. In this way your code remains robust and the problem of ensuring unspecified behaviour works as expected is up to the library publisher.

C++11 r-value references

evilrix — Mon, 18 Feb 2013 01:35:48 +0000

The C++03 standard treats temporary types as r-values (types only meant to go on the right hand side of an assignment expression). As such, it is only possible to bind a temporary to a const reference type. This is a somewhat arbitrary and, often, frustrating rule. The original idea was that there would be no good reason to modify a temporary; however, it turns out that there are plenty of good reasons for doing so and this arbitrary restriction was just a nuisance that served no good cause.

The C++11 standard recognises that there is no good reason to force this restriction on developers and so lifts it with the introduction of the r-value reference. The syntax is slightly different from a standard l-value reference in that it requires the use of two and not one ampersand, which is quite probably to make sure r-value references are not declared accidentally. In fact, the new standard does go to some lengths to ensure you don’t accidentally use an r-value reference. For example, a named variable is never an r-value reference candidate and so to create one from it the std::move() function template must be used.

Whilst not quite as seamless as it could be this is a useful change that will avoid, for example, the need to create stack based variables just to fulfil the arguments of a function call. I’ve lost track of the number of times I am calling a function that takes a non-const reference to a variable that it is going to change when I really don’t care about the outcome. The fact I still have to create a stack based object just to take this result is frustrating and makes for unclear code.

A useful side-effect of this new reference type is the fact it has allowed for the support of constructors with “move” rather than “copy” semantics. Prior to C++11, if you created an object in a local stack frame that you wanted to return to the caller you had to return it by copy. This is, clearly, inefficient but necessary because the local object will go out of scope.

A lot of compilers avoid this copy by implementing an optimisation called N/RVO ([named] return value optimisation). Unfortunately, although the C++ standard permits this optimisation it doesn’t require it and so you can’t rely on it. Also, compilers (and different versions of the same compiler) differ in how good they are at implementing this optimisation and so benefits can be inconsistent. Some implement full N/RVO, others only implement RVO and some implement neither!

The C++11 standard now recognises that a constructor that takes an r-value reference of the class type shall represent a constructor designed to implement move semantics. This basically means that instead of having to take a copy of all the contents of the old class to set up the new class you can just move the contents and that can be as simple as moving a pointer (assuming your class holds a pointer that points to all its data).

Consider, you are implementing a new type of vector container class. Before C++11 you would need to implement a copy-constructor that performed a deep-copy of that class’s data (assuming you didn’t disable copying by hiding the copy-constructor). With a move constructor you can just copy the pointer that references the class’s data and then set the pointer of the old class to nullptr (or NULL for those of us used to C++03) to indicate your new class has taken ownership of the resource.

In the case of returning something that is defined on the local stack frame, your move constructor will be called and all that does is copy the pointer to the resource and not the resource itself. If the resource is a huge blob (technical term!) of memory this is the difference between copying a pointer and copying all that memory.
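
To make the pointer-stealing idea concrete, here is a minimal sketch of a class that owns a blob of memory. The buffer class is hypothetical and deliberately stripped down (assignment is disabled to keep it short); it only exists to contrast the cost of the two constructors.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical resource-owning class, for illustration only.
class buffer
{
public:
   explicit buffer(std::size_t const n)
      : data_(new char[n]()), size_(n)
   {
   }

   // Copy constructor: allocates and deep-copies, so the cost grows
   // with the size of the blob.
   buffer(buffer const & other)
      : data_(new char[other.size_]), size_(other.size_)
   {
      std::copy(other.data_, other.data_ + other.size_, data_);
   }

   // Move constructor: takes over the pointer and leaves the source
   // empty, so the cost is the same however large the blob is.
   buffer(buffer && other) noexcept
      : data_(other.data_), size_(other.size_)
   {
      other.data_ = nullptr;
      other.size_ = 0;
   }

   ~buffer() { delete[] data_; } // deleting nullptr is a harmless no-op

   buffer & operator=(buffer const &) = delete; // omitted for brevity

   char * data() const { return data_; }
   std::size_t size() const { return size_; }

private:
   char * data_;
   std::size_t size_;
};
```

Constructing one buffer from std::move() of another leaves the new object holding the original pointer and the old one holding nullptr, whereas a plain copy ends up with two separate allocations.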

In C++, any type returned by value generates a temporary. In C++11 this will be treated as an r-value and as such the move constructor, if defined, will be called; otherwise the copy-constructor is called. In C++03, where there is no such type as an r-value reference, the standard behaviour of calling the copy-constructor will prevail.

In other words, to benefit from the more efficient move behaviour in C++11 there is no need to change any existing code other than to add a move constructor to your class. Once done C++11 compiled code will benefit from move semantics whilst C++03 compiled code will continue to behave the way it always has; implementing copy semantics.

Below are some examples of r-value references in action.

#include <iostream>
#include <utility>

using namespace std;

struct object
{
    object()
    {
        cout << "default constructor" << endl;
    }

    object(object const & o)
    {
        cout << "copy constructor" << endl;
    }

    object(object && o)
    {
        cout << "move constructor" << endl;
    }
};

object foo()
{
    object o; // default constructor
    return o; // move constructor in C++11 (unless elided by N/RVO) and copy constructor in C++03
}

void bar(object & o)
{
   cout << "bar by l-value reference" << endl;
}

void bar(object && o)
{
   cout << "bar by r-value reference" << endl;
}

int main()
{
    object o = foo();              // constructs a new object via factory
    bar(o);                        // l-value reference
    bar(object());                 // r-value reference
    object & lvro = o;             // l-value reference
    object && rvro = std::move(o); // r-value reference from named type
}

Technical Debt

evilrix — Mon, 11 Feb 2013 00:59:57 +0000

Regular readers (do I actually have any, I wonder?) of my blog may be wondering why I’ve not posted any new content for the last few weeks. First off, let me apologise for this. Secondly, let me explain why: I’ve been busy… in my new job! That’s right, ladies and gentlemen, everyone’s favourite evil one has finally found himself employment again.

I am now Head of Engineering & Information Technology at a mobile media software company, which specialises in real-time bidding for mobile media advertisements. I actually interviewed for a Senior Software Engineer position but, somehow, bagged the managerial role. I’m still not quite sure how one turned into the other but, basically, the company needed to fill both roles and I just happened to have the right level and type of practical experience in setting up and implementing engineering best practice and process so they offered me the choice of either role. Of course, I took the head role and I am not complaining.

As it happens, I’ve landed myself a right sweet little number. The company is still quite small and has been trading for only a few years so there is lots of room for growth (both for the company and, I hope, me too). Also (and the main reason I accepted the role), the people there are absolutely lovely. A right crazy bunch; but then it is in the media advertising business space so I guess that is to be expected. This is a new industry for me (in terms of the vertical market the business operates in) but it’s a pretty exciting one and the overall sense for me is that it’s probably a good place to be right now in terms of future potential and prospects.

As far as my role goes, I do have my work cut out for me. Whilst the Engineering team have done a sterling job to get the product to the place where it is, it’s a typical story of a company that is (or, rather, was) still very much in start-up mode. Basically, it’s a case of a JFDI (Just F…ing Do It) approach to delivery schedule management with best practice in terms of engineering protocols and processes taking second place to the wants and needs of the commercial side of the business. Of course, this is quite understandable since this is what pays the bills but this is also an unsustainable way of operating a growing Engineering team; it’s a house of cards just waiting to collapse.

The term for this is “Technical Debt”. Basically, an engineering team, unless allowed to proactively keep on top of maintenance and development schedules, will fall into debt with their workload. Actually, I am slightly abusing the term as it has a slightly more refined definition than that; however, the semantics apply here in the sense that if a team is overworked and understaffed (or just under-organised) it will build up a backlog of work that it struggles to get done. The backlog will just grow and grow (like debt); meanwhile, the team often ends up “borrowing from Peter to pay Paul” by making promises to deliver one item at the cost of failing to meet other deadlines. This trick only holds for so long before all your creditors call in their loans.

My job, initially, is to negotiate better terms for the debt incurred. This isn’t always easy since there are still a number of legacy issues that demand immediate attention; however, a lot of the time handling this comes down to expectation management. In other words, when people are used to getting engineering to drop everything for them because they have shouted loudly it can be quite a culture shock when they suddenly discover that shouting, no matter how loudly, no longer works. The trick is to try and demonstrate that a little bit of planning on both sides of the fence goes much further in getting things done.

Of course, you are asking for something (less shouting and more time) and so in return you must give something. You must give your word that your new processes and delivery schedules will meet the reset expectation and then, of course, you must deliver on your promise. Failure to do so will not only make you look utterly stupid it will also ensure your word no longer has any meaning. You only get one chance at this so… don’t blow it!

The trick to this is to ensure that your team are isolated from all of this as much as possible. Let them focus on the deliverables whilst you, the team lead, absorb all the shouting and fallout from the angry clients. A distracted team is a non-productive team; let them focus on getting the debt down whilst you handle the bailiffs. Fortunately, I am thick skinned, quite large (both in personality and sheer bulk) and not easily intimidated. These are all quite useful attributes when telling an angry sales manager that, “no we will not drop everything just for you because you wrote a cheque to a customer that my team cannot cash”.

In the end, I view this situation as being like paying off a repayment mortgage. Initially, most of your payments go towards the interest but slowly but surely, month by month, you pay off less interest and more capital until, one day, you are debt free. This is the same with technical debt. Initially, you do a lot of hard work for very little payback; there will continue to be lots of angry conversations with sales managers and mountains of work that still needs doing. You still end up being quite a lot more reactive than you’d like but as long as you’re chipping away at the technical debt you eventually get to a point where you can be more and more proactive, leading to less and less reactive confrontation.

It’s fun but, boy oh boy, is it exhausting. Still, the challenge is to work hard now so you can work smart going forward; eventually getting to a point where we’ve proactively done everything that needed doing so you and your team can pop down the pub every Friday lunchtime for the rest of the day! Cheers!

The basics of Spam detection

evilrix — Tue, 15 Jan 2013 00:07:45 +0000

During my numerous years as a software engineer I have spent many an occasion developing solutions to combat Spam. This article introduces the origins of spam and then looks at a number of ways it can be detected.

It’s important to note that nothing I write here is new. A lot of this information can be found if you are prepared to do enough Google searching. The point of this article is that it coalesces these ideas into one article.

It should also be noted that whilst I have made every effort to ensure these details are correct it is inevitable that it will contain errors and/or omissions. If you do find something that doesn’t appear to be correct please let me know and I will update the article, accordingly.

So, with that out of the way let us begin our journey into the wonderful world of Spam detection…

Spam, spam spam spam…

What is Spam?

It would seem fitting to start with a basic definition of Spam.

The classical definition of Spam is Unsolicited Bulk Email (UBE). This definition, by today’s standards, is no longer adequate and it can be more appropriately defined as “flooding the Internet with many copies of the same message, in an attempt to force the message on people who would not otherwise choose to receive it”. Pretty much anywhere on The Internet that allows for the propagation or creation of user generated content can and will be targeted by spammers.

Spam. It’s a problem. In fact, it’s a huge problem. There is probably not a single person on the planet who has access to The Internet who has not been affected by Spam in some way. Most (over 80%) of today’s Spam either originates from or is at least facilitated by organised Spam gangs and most of these either fund or are funded by organised crime.

Why Spam?

The point of Spam (normally) is to convince a victim to part with their hard earned cash. This can be directly, such as trying to get them to purchase something (normally worthless) or indirectly by tricking them into parting with information or signing up to something that has hidden costs they are not aware of at the time. More often than not spam is the virtual version of a con and spammers are the confidence tricksters.

Often, a spammer will try and convince you to purchase something. The kind of vile things spammers will happily peddle are: pharmaceuticals that are at best placebo and at worst poisonous, pornography (including kiddie porn) and various scams (such as 419 or stock scams).

Other times the point of the spam is not to sell but to trick you into giving away personal information (phishing) and/or installing a Trojan program. The malware can be anything from spyware and keyboard loggers to a stealth program that turns the victim’s computer into part of a Spam botnet so as to percolate even more spam.

It’s hard to understand the psyche of a spammer but one thing is very clear: to them Spam equals £££ (or $$$). Given that it costs next to nothing to propagate Spam and given The Internet is now so pervasive, the ratio of cost to number of messages sent means that even if only 0.1% of a spammer’s victims are taken in it is worth their while. Put simply, Spam is big business and if there is a way a spammer can exploit a system to get their message across they will.

The Spam War

In the “good ol’ days” email was a very insecure protocol. Mail Transfer Agents would happily forward on anything they were sent. The Internet was a young and trusting place. Then the spammers arrived and they soon realised they could abuse this trust by churning out vast amounts of unsolicited email through the Open Relays. So began the downfall of email and the start of the war on spam!

For years the spammers abused the email system, literally to the point where it practically became useless for anything because it was so saturated with Spam. Over time the good guys learned to fight back. They closed the open relays and invented techniques such as Sender Policy Framework (SPF) to make life for the spammers hard.

The spammers fought back and started setting up their own relays that were dedicated to forwarding spam. Once again the good guys retaliated by setting up Real-time Blackhole Lists (RBLs). Of course, the spammers then fought back by spreading malware that turned victims’ computers into botnets that allowed them to propagate their vile wares indirectly. The Anti-Malware community countered by adding detection for these Trojans into their products.

The war on email Spam is being won. The combination of improvements in anti-spam filter techniques, various techniques for blocking spam at the source (such as reputation services) and the pervasive introduction of Webmail means we’re in a position where the amount of email Spam is finally declining.

Spammers have tried to get past these filters by using techniques such as image Spam and Tag Soup spam using malformed HTML. Detection techniques, however, have reached a point where spammers can rarely penetrate competent filters.

A New Frontier

And so here we are, 2013. The arms race has been won, right? The end of Spam is nigh, yes? No. Sadly, not even close! The recent boom of social networking services on The Internet has given the spammers a whole new vehicle to peddle their vile wares.

One of the reasons for the decline in email Spam is that email just isn’t as pervasive as it once was. Although most people do have an email account the amount of time they spend interacting with their account is generally far less than the amount of time they will spend on a social network.

For example, on an average day how long do you spend logged into your Facebook account compared to monitoring your (personal) email? Combine the decline in the usage of email with the boom of social networking and it doesn’t take much to realise the spammers now have a new agenda.

Nowadays, spammers are more often than not targeting social networking sites. The reasons should be obvious. They have a captive audience. Most sites allow users to interact with user developed content and apps.

Anyone can get an account. The whole point of a social networking site is to make as many connections as you can, which is something the spammers rely on. There is not a day that goes by where most Facebook users are not blighted with spam and often they don’t even realise it!

For example, consider the various applications that people sign up to on Facebook, often indiscriminately. A lot of these require you to enter personal information or click on links that take you to websites with advertising where you must sign up.

Although a lot of these are reputable, a lot are rogue-ware taking advantage of social engineering techniques. They will fool unsuspecting victims into granting permission to access personal information on their Facebook account so they can then target them and their friends with more direct spam by sending private messages or posting on their or their friends’ walls without their consent.

It’s not just social networking sites that are overrun with spam. Spammers have also realised that any site that accepts user generated content is fair game. For example, do you have a blog? Chances are that unless it is protected by an anti-spam service (such as Akismet) the user comments section will be riddled with spam. Often, these are disguised as legitimate feedback, praising

In short, sites that allow user generated content are a veritable breeding ground for Spam. Spammers can easily create an account (depending on where they sign up from it is almost trivial to create a new account) and once in they can post content pretty much anywhere on the site and send private messages to whoever they like.

Fighting Back

For the purposes of simplifying the rest of this document the term spam will be used to generically refer to unwanted user generated content that may be for commercial gain, offensive material or just vandalism. Each of these has one common attribute: we can automatically detect and prevent it to a high degree of statistical accuracy.

Welcome to the era of anti-spam. A dream world where spam is no more. But is this really just a dream? Probably but what about if we could eradicate 98% of spam? Would you at least want to consider the possibility? If you’ve just answered, “yes”, well done; that was the correct answer. Read on!

It’s time we took control. It’s time we fought back against these spammers (and vandals); the minority that ruin The Internet for the majority. It’s time we started implementing techniques to automatically control and manage the sites user generated content.

The rest of this document will present different spam filtering techniques and discuss how they could be utilised to tackle the ever-growing problem of spam on The Internet. Each technique, individually, could help automatically reduce spam but if used together, in a blended approach, the detection rates should be incredibly accurate.

Statistical Filtering

Statistical filtering is based on the idea that certain words will appear more frequently in spam than in ham (non-spam) and vice versa. A statistical filter will analyse the word content and then using a database of previously generated metrics it will calculate the probability that the email can be classified as either ham or spam.

Spammer’s Weakness

Question: what is a spammer’s weakness? Answer: his message.

That’s right. A spammer can do many things to evade detection. They can obfuscate their message, they can try by-passing reputation systems with botnets and they can write bot programs that will spam a site, but the one thing they cannot do is remove the content that is ultimately destined to be read by the end-user.

This gives us, the anti-spammers, a significant advantage. We can turn this weakness against them. We can use the very content of the spam itself to our advantage.

Words Are Unique

The written word has a unique fingerprint. No two people write in the same way. Cyber-crime scientists have recently come up with a way of detecting who the author of an email is with an 80% to 90% certainty by processing just 10 examples of all candidates and using statistical analysis to figure out who the original author was.

Of course, to defeat spam we don’t need to go into that level of analysis. All we need is a way to be sure to a reasonable degree that a message is actually spam. The higher the degree of certainty the more emphatic we can be in the action we take; action that would be automatic and require no human intervention!

For example, let’s assume we have a black box application that can parse a message and give it a probability grade from 0 to 10, where 0 is absolutely not spam and 10 is absolutely spam. Anything that scores 5 or less we could assume is not spam. Anything scoring above 5 but below 10 we could assume might be spam and take action (such as slow-tracking or blocking if the user sending the message is sending more than a couple in any period of time). If we get a score of 10 we know it must be spam (or, at least, we can be sure to a high degree of probability) and so we can take decisive action like blocking the message and, if the user tries again, we can (temporarily at least) mute or even block their account automatically.
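To make the grading idea a little more concrete, here is a minimal Python sketch. The function name and the action labels are invented for this example; only the thresholds come from the paragraph above.

```python
def action_for_score(score: float) -> str:
    """Map a spam grade (0 = definitely ham, 10 = definitely spam) to an action."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score <= 5:
        return "allow"        # assumed not to be spam
    if score < 10:
        return "slow-track"   # might be spam: throttle or hold for review
    return "block"            # spam to a high degree of probability


for grade in (2, 7.5, 10):
    print(grade, action_for_score(grade))
```

In a real system the middle band would probably also look at how many messages the user has sent recently before deciding what to do.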

Fine. “Sounds wonderful”, I can hear you scream, “but where do we get such a black box?”. That’s simple. They already exist. They are called Bayesian Spam Filters and they use statistical filtering techniques that are well known in the anti-spam industry.

A Plan For Spam

Bayes

The idea of using statistics for filtering spam was popularised by Paul Graham in his 2002 “essay” entitled A Plan For Spam. The techniques discussed were later improved upon and published in his 2003 “essay” called Better Bayesian Filtering. Since the publication of these articles many well known anti-spam products such as SpamAssassin and SpamBayes have implemented variations of the techniques Paul discusses.

Since then many improvements have been made to the techniques but they all follow the basic principle that given a collection of examples of both good and bad email a classifier will be able to analyse a candidate email and estimate the probability that it is either good or bad. Since an email is nothing more than text (normally) and since statistical filtering will work with any text it follows that the same techniques could be used to detect (nearly) any spam.
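The basic principle can be sketched in a few lines of Python. This is a plain naive Bayes classifier with add-one smoothing, not the exact formula from A Plan For Spam; the class name, the tokeniser and the tiny training set are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenise(text):
    """Very naive tokeniser: lower-cased runs of letters, digits and apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesFilter:
    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add an example message to the 'spam' or 'ham' database."""
        self.counts[label].update(tokenise(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Estimate the probability (0.0 to 1.0) that text is spam."""
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"]))
        spam_total = sum(self.counts["spam"].values()) + vocab + 1
        ham_total = sum(self.counts["ham"].values()) + vocab + 1
        # Start from the (smoothed) prior odds, then add evidence per token.
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for token in tokenise(text):
            p_spam = (self.counts["spam"][token] + 1) / spam_total
            p_ham = (self.counts["ham"][token] + 1) / ham_total
            log_odds += math.log(p_spam / p_ham)
        log_odds = max(-700.0, min(700.0, log_odds))  # keep exp() in range
        return 1.0 / (1.0 + math.exp(-log_odds))


f = NaiveBayesFilter()
f.train("buy cheap pills now", "spam")
f.train("cheap pills online", "spam")
f.train("meeting at noon tomorrow", "ham")
f.train("lunch tomorrow at noon", "ham")
print(f.spam_probability("cheap pills"))
print(f.spam_probability("lunch at noon"))
```

With so little training data the numbers mean very little, but the first message should come out well above 0.5 and the second well below it.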

Training

With a well trained database of both good and bad text a statistical filter is capable of incredible accuracy (in the order of 98% or greater). But, herein lies the problem. A Bayesian classifier needs to be trained. That’s right, it needs to have a database that is populated with enough representative examples of both good and bad to be able to make a classification.

This has the potential to take a lot of time and effort. On top of that a classifier isn’t a static entity that, once trained, will work forever detecting spam. On the contrary, spam changes over time. Spammers learn new detection evasion techniques so a good classifier will need to be re-trained on a regular or on-going basis.

Self-learning

On-going? That’s right… on-going. A classifier can learn from itself. Providing it has a reasonable database to start with it can then learn from its mistakes and get better. The basic premise is that if the classifier scores very high (say 90% probability) it will automatically add the message to its database of bad messages. Likewise, if the message scores very low (say 10% probability) it will automatically add the message to its database of good messages. Over time, the databases get more and more accurate and, if the format of good or bad messages changes over time, the classifier will learn and keep up with these changes.
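The self-learning premise could be sketched like this in Python. The 90% and 10% cut-offs are the ones suggested above; the function names and the in-memory corpus are placeholders for whatever database a real classifier would use.

```python
def auto_train_label(spam_probability, high=0.9, low=0.1):
    """Decide which database (if any) a scored message should be added to."""
    if spam_probability >= high:
        return "spam"
    if spam_probability <= low:
        return "ham"
    return None  # not confident enough to learn from this message


# Stand-in for the classifier's databases of bad and good messages.
corpus = {"spam": [], "ham": []}


def learn(message, spam_probability):
    """Store the message for retraining when the score is decisive."""
    label = auto_train_label(spam_probability)
    if label is not None:
        corpus[label].append(message)
    return label
```

Messages in the uncertain middle band are deliberately left alone; those are the ones worth a human look during the regular check.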

On top of this a regular check should be made on how the classifier is doing. Any false positives or false negatives should be used to train the classifier so that it learns from its mistakes. Unfortunately, this does require some manual intervention but providing it is something that is done on a regular basis it shouldn’t end up being too big of a chore and should be considered general good house-keeping.

Poisoning

Great. A solution to our training problems! Well, not quite. You see the very fact a classifier can learn can be used against it by an unscrupulous spammer using a technique called Bayesian Poisoning.

Put simply, the spammer will include a load of random words (known as word salad) or paragraphs from a novel (Shakespeare is a favourite) in their email. The hope is that there are enough non-spammy words to trick the classifier into making an incorrect classification.

In very simple terms, if a spammer sends spam that contains lots of (often random) non-spammy words it might fool a statistical filter into classifying it as good and, given a convincing enough score, that email may get added to the good database, thus increasing the likelihood of false negatives (marked as not spam when it is) for future spam containing the same spammy words.

The other (possibly worse) scenario is that the email is classified as bad and gets added to the bad database. The innocent words in that spam would then start contributing to the spam score of non-spammy emails, thus creating false positives (marked as spam when it isn’t). In the world of anti-spam this is the worst possible outcome as it means real (and possibly important) messages get blocked.

In reality, this isn’t actually as bad as it sounds and to some extent should be considered a strawman argument against using statistical classification, as it fails to take into consideration the fact that containing poison is, in itself, a spammy trait. Consider: how many real emails contain word salad? Not many!

A good classifier will not only take on board all of the words in an email but it will also consider tokenising phrases (one of the reasons a good tokenising strategy is important) and preserving other traits such as word count, word order and even grammar and semantics. The very action of attempting to poison a classifier can work against the spammer since they have actually created a very unique statistical fingerprint for their spam that is easier to detect.

Heuristics

Sometimes, a message may score low on a statistical filter but may still be spammy. Whilst statistical filtering can be very accurate its Achilles’ heel is that it is a token based classifier and if there are not enough tokens in the candidate message the classifier may be unable to make a determination. Since false positives are the worst-case scenario for any spam detector (falsely identifying a message as spam when it is not) the normal thing to do is treat an unknown result as not being spam.

Spammers are not stupid (well, at least technically) and so they have a number of tricks up their sleeves to try and by-pass statistical classifiers. Here are some examples:

Content only contains an embedded image
Content has been purposefully obfuscated or malformed
Content contains only a (normally obfuscated) URL

Rules of engagement

So, what is heuristic filtering? Simply put, it is a way of detecting spam using a set of “rule of thumb” detection techniques. Put another way we are roughly saying, “if a message contains trait X, Y or Z or any combination of them there is a good probability it is spammy”. The more traits that match the higher the probability.

When other filtering/classification techniques fail we have the option to fall back on heuristic filtering. Unlike a statistical filter, a heuristic filter doesn’t rely on any one specific approach to classify content; it will use a collection of “fuzzy rules” to make a distinction.

When content is scanned by a heuristic filter it will apply each of the rules and those that “trigger” will contribute towards a final spamminess score. If that score crosses a certain threshold it can be classified as spam. Clearly, the higher the score is the more confident we can be about this classification.
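Here is a minimal Python sketch of that idea. The three rules, their weights and the threshold are all invented for illustration; a real filter would have many more rules and tuned weights.

```python
import re

URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)

# Each rule is (name, predicate, weight). All values here are hypothetical.
RULES = [
    ("contains_url", lambda t: bool(URL_RE.search(t)), 2.0),
    ("url_only", lambda t: bool(t.strip()) and not URL_RE.sub("", t).strip(), 4.0),
    ("many_exclamations", lambda t: t.count("!") >= 3, 1.5),
]


def heuristic_score(text):
    """Return (total score, names of the rules that triggered)."""
    fired = [(name, weight) for name, predicate, weight in RULES if predicate(text)]
    return sum(weight for _, weight in fired), [name for name, _ in fired]


def is_spam(text, threshold=5.0):
    """Classify as spam once the combined rule score reaches the threshold."""
    score, _ = heuristic_score(text)
    return score >= threshold


print(heuristic_score("http://example.com/x"))
print(heuristic_score("See you at lunch"))
```

Returning the names of the rules that fired, as well as the score, makes it much easier to explain (and debug) why a message was classified the way it was.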

Of course, statistical classification and heuristic (or any other type of) filtering are not mutually exclusive. We can aggregate the results from different classifiers and use that to draw a conclusion on how likely a message is to be spammy. The more techniques we use the more confidence we can have in the final classification. It’s really just like being a CSI, where we are analysing all the available evidence (damning or not) to try and establish if content is likely to be spam.

Heuristic in action

Let’s have a look at some of the more obvious techniques spammers might use to evade spam filters and how heuristics might be used to combat them.

Mark-up Obfuscation

This technique requires the message format to support some kind of mark-up language. Generally, that will be HTML in the case of email spam but for user generated content (such as on blogs or bulletin boards) it may be BBCode.

This kind of technique relies on a spam filter working on the raw mark-up code. The spammer will include loads of mark-up tags to separate the letters that make up words. With HTML this is pretty simple; spammers will use “Faux-HTML”. These are artificial HTML tags that the message client won’t render, so they stay invisible to the reader whilst breaking up words to obscure their meaning from the filter.

In the case of BBCode it’s not so straightforward for a spammer since unknown tags are generally rendered as part of the parsed output. It is, however, still perfectly viable for the spammer to include real BBCode tags providing they leave the rendered text human readable once it’s been parsed.

Let’s look at a simple example.

[url="http://bit.ly/mm8KEk"][u]t[/u][b]h[/b][i]i[/i]s [b]i[/b][u]s[/u][i] [/i]s[b]p[/b][i]a[/i][u]m[/u][/url]

When rendered, the BBCode above says, “this is spam”. Further the text will be rendered as a link to [1]. As you can see, we’ve used BBCode to obfuscate the text that will, ultimately, be presented to the end user. Of course, this is a trivial example and it’s pretty obvious to the human eye what is going on here but it’s not so obvious to a filter.

In reality, this is a pretty simple problem for a non-heuristic filter to get round as long as it knows how to decode the mark-up to get at the human readable version, which can be processed by your filter of choice. Also, the mark-up itself is a very telling sign this is probably spam. So much so that if our self-learning statistical filter tokeniser included mark-up tags there is a very good chance that eventually those tags would start to score high, indicating probable spam.
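Decoding the mark-up doesn’t need a full parser. As a rough Python sketch (the regular expression only copes with simple tags and is an assumption, not a complete BBCode grammar), stripping the tags from the example above recovers the text the end user would see:

```python
import re

# Matches simple BBCode tags such as [b], [/b] or [url="..."].
BBCODE_TAG = re.compile(r"\[/?[a-z*]+(?:=[^\]]*)?\]", re.IGNORECASE)


def strip_bbcode(text):
    """Remove BBCode tags, leaving only the human readable text."""
    return BBCODE_TAG.sub("", text)


sample = ('[url="http://bit.ly/mm8KEk"][u]t[/u][b]h[/b][i]i[/i]s '
          '[b]i[/b][u]s[/u][i] [/i]s[b]p[/b][i]a[/i][u]m[/u][/url]')
print(strip_bbcode(sample))  # this is spam
```

The stripped text can then be handed to the statistical filter as normal.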

From a heuristic point of view, mark-up obfuscation is a relatively simple thing to detect. Of course, the very fact that the message contains such a high ratio of tags to message text is also a very telling sign that this message is probably spam. So is the fact that the number of letters between each tag is so small, suggesting an attempt to obfuscate words.

Our heuristic scanner can simply generate a score based upon the number of tags vs. the number of text characters. The higher the ratio the higher the probability of spam, especially if one or more of those tags is a “url” tag. To allow this filter to “self learn” it could keep track of the average ratio of tags to letters for good vs. bad content and use these as a benchmark when generating a probability score.
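A tag-to-text ratio is easy to compute. The following Python sketch assumes the same simplified BBCode pattern as before and ignores whitespace when counting visible characters; both choices are mine rather than anything definitive.

```python
import re

# Simplified BBCode tag pattern (an assumption, not a full grammar).
BBCODE_TAG = re.compile(r"\[/?[a-z*]+(?:=[^\]]*)?\]", re.IGNORECASE)


def tag_ratio(text):
    """Number of mark-up tags per visible (non-whitespace) character."""
    tags = len(BBCODE_TAG.findall(text))
    visible = sum(not ch.isspace() for ch in BBCODE_TAG.sub("", text))
    return tags / max(visible, 1)  # avoid dividing by zero for tag-only content


print(tag_ratio("[b]h[/b][i]i[/i]"))          # 2.0
print(tag_ratio("A perfectly normal reply"))  # 0.0
```

The heuristic rule would then compare this ratio against the running averages for known good and known bad content.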

CAPITAL LETTERS

Spammers love to SHOUT about their wares. For this reason you will often find spam has a high ratio of upper case to lower case letters. A statistical filter that considers case will probably end up scoring quite high for upper case letters. Unfortunately, by considering case in a statistical filter we dilute the value of the word semantics.

For example, SpAm and sPaM will not be treated as the same token and so the spammer could use random combinations of upper and lower case letters to by-pass a statistical filter. On the other hand, if they do this enough a self learning statistical filter will come to realise that these different variations of words are likely to be spammy and more so than the same words that are just all one (lower) case. So, it’s swings and roundabouts; in the short term the statistical filter will probably be poor at detecting such spam but after a while its accuracy will improve and it will actually be more accurate than a filter that ignores case.

Heuristically, we can do a similar thing to what we did when detecting Mark-up Obfuscation. If we examine the ratio of upper to lower case text in non-spam it’ll generally be much lower than that of spammy content. For this reason, we can assume that the higher the ratio the greater the likelihood the message is spam. Further, we can track the average ratios of good vs. bad to allow the heuristic scanner to be self learning.
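As a small Python sketch of the measurement itself (here expressed as the fraction of letters that are upper case, which is my choice of definition; where to set the threshold is left open):

```python
def uppercase_ratio(text):
    """Fraction of alphabetic characters in text that are upper case."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0  # no letters at all, so nothing to shout about
    return sum(ch.isupper() for ch in letters) / len(letters)


print(uppercase_ratio("BUY NOW"))  # 1.0
print(uppercase_ratio("Hello"))    # 0.2
```

Digits, punctuation and whitespace are ignored so that a message full of numbers doesn’t skew the result.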

Obfuscation

To try and avoid simple filtering systems spammers will often transpose letters in words or even miss out the vowels. The human brain is amazing. It can read text even when it’s seriously obfuscated. Let’s look at an example:

Take a look at this paragraph. Can you read what it says? All the letters have been jumbled (mixed). Only the first and last letter of each word is in the right place:

“Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr theltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at therghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.”

Another technique is one that uses symbols to represent letters. For example, l33t is very popular on The Internet and most if not all people can read and understand it. Of course, it’s just one more way to obfuscate text in an attempt to make it hard to classify the content.

Ok, so no one is claiming spammers will send out text quite that obfuscated but they only have to change a few letters around or miss out a few vowels and a rules based classifier is likely to fail to detect certain key words.

As well as purposeful obfuscation, spam is often riddled with poor grammar and spelling. This is because most spam originates from countries where English is not the first language. Spammers want to target the widest audience possible and whilst it would be wrong to assume all spam is in English it is probably not wrong to assume a majority of it will be.

How can we detect this then? Simple, we don’t. Eh? That’s right we don’t. At least we don’t do anything special. There is no need. We let the statistical filter learn from the grammar and spelling mistakes made by spammers so that it can use them to help classify the content.

URL redirection

A URL redirect is where a URL doesn’t point to the final content but, instead, directs you to a service (this includes URL shortening services) that will then redirect you to the content (or maybe even another service). Spammers love these because it means they can frequently change their URLs by changing between redirect services. In effect it is the URL equivalent of Money Laundering. The redirection “cleanses” the URL; at least, that’s what the spammers hope!

One way to handle this is to perform an HTTP header lookup and try to resolve the redirect chain. Unfortunately, this is quite an expensive thing to do and for a real-time detection mechanism it’s certainly not feasible (at least, not in real time). Another way to handle this is to maintain a table of known redirect services, flag those that are reputable (for example, TinyURL is very active in blocking spammers) and then immediately block any that are not flagged as reputable.

The alternative is to “slow-track” known redirects. In essence, add them to a queue for a separate service to investigate and then make a classification. Meanwhile, if we are seeing a lot of the same redirect URL it will get bumped up the queue. This won’t catch the initial postings but it will eventually contribute towards classification and is likely to stop a spam campaign pretty quickly.
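Both approaches can be sketched in Python. The host names in the table are placeholders (not real services) and the class and function names are made up for the example.

```python
from collections import Counter
from urllib.parse import urlsplit

# Known redirect services mapped to a "reputable" flag. Hypothetical entries.
REDIRECT_SERVICES = {
    "good-shortener.example": True,
    "shady-shortener.example": False,
}


def redirect_verdict(url):
    """Return 'allow', 'block' or 'unknown' based on the redirect table."""
    host = urlsplit(url).hostname or ""
    if host not in REDIRECT_SERVICES:
        return "unknown"
    return "allow" if REDIRECT_SERVICES[host] else "block"


class SlowTrackQueue:
    """Queue of redirect URLs awaiting investigation, most frequently seen first."""

    def __init__(self):
        self.seen = Counter()

    def report(self, url):
        self.seen[url] += 1  # every sighting bumps the URL up the queue

    def next_to_investigate(self):
        if not self.seen:
            return None
        url, _ = self.seen.most_common(1)[0]
        del self.seen[url]
        return url
```

A separate worker would pull URLs from the queue, resolve the redirect chain at its leisure and feed the result back into the classification.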

Meanwhile, the URBL services will also be looking to blacklist any redirect URLs so there is a good chance that redirects will quickly end up on real-time blacklists.

URL Obfuscation

Spammers will often try and obfuscate URLs to prevent rules based detection. For example, they may encode the URL or add unnecessary parameters. None of these are really that hard to deal with. There are simple rules that can be used to get a canonical URL (see http://en.wikipedia.org/wiki/URL_normalization). Once these rules are applied all the various forms of a URL will be normalised to one canonical form.
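A few of the normalisation rules can be sketched with nothing more than Python’s standard urllib.parse module. This is only a subset (lower-casing, dropping default ports and fragments, decoding unnecessary escapes and sorting query parameters) and the function name is my own; it makes no attempt to decide which parameters are unnecessary.

```python
from urllib.parse import parse_qsl, quote, unquote, urlencode, urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def canonical_url(url):
    """Reduce a URL to one canonical form using a handful of simple rules."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it differs from the scheme's default.
    netloc = host if port in (None, DEFAULT_PORTS.get(scheme)) else f"{host}:{port}"
    # Decode needless percent-escapes, then re-encode consistently.
    path = quote(unquote(parts.path), safe="/") or "/"
    # Sort the query parameters so their order no longer matters.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit((scheme, netloc, path, query, ""))  # fragment dropped


print(canonical_url("HTTP://Example.COM:80/%41bc?b=2&a=1#frag"))
```

Once every URL has been through a function like this, rules based detection and blacklist lookups only ever need to deal with one spelling of each address.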

Challenge-Response

Spammers rely on the fact they can pump out hundreds, if not thousands, of messages as quickly as possible. A spammer’s most precious resource is time. If they can’t bang out messages unhindered they are likely to give up and move on.

A Challenge-Response (C-R) system is designed to inconvenience the spammer whilst minimising the impact on a legitimate user. A common C-R mechanism is to send an email to a user the first time they post, with a link to a page where they have to enter a unique code. This relies on the fact that spammers generally don’t have valid email addresses so will never get the C-R request. Generally, this only needs to be done once; however, if a user is showing an unusual pattern of sending messages it could be repeated (for example, if they send more than 10 messages in 24 hours).
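The decision of when to issue a challenge might look something like the following Python sketch. The class and method names are invented, timestamps are plain seconds supplied by the caller, and the default limit of 10 messages in 24 hours is the example figure from above.

```python
from collections import deque


class ChallengePolicy:
    """Decide when a user should be sent a challenge before their post is accepted."""

    def __init__(self, limit=10, window_seconds=24 * 60 * 60):
        self.limit = limit
        self.window = window_seconds
        self.verified = set()  # users who have already answered a challenge
        self.history = {}      # user -> timestamps of recent posts

    def mark_verified(self, user):
        self.verified.add(user)

    def needs_challenge(self, user, now):
        """Record a post at time `now` (in seconds) and say if a challenge is due."""
        times = self.history.setdefault(user, deque())
        times.append(now)
        # Forget posts that have fallen outside the sliding window.
        while times and now - times[0] >= self.window:
            times.popleft()
        if user not in self.verified:
            return True                # first-time posters are always challenged
        return len(times) > self.limit  # verified users only when posting unusually fast
```

Passing the time in as a parameter, rather than reading the clock inside the class, keeps the policy easy to test.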

Of course, another C-R that is popular these days is CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart). These can be very effective but they are also generally disliked by legitimate users as they can be difficult to complete and are not a great user experience for partially sighted users. The CAPTCHA definitely has its place in fighting spam but it is a blunt instrument and should be used only when other, more user friendly, options have failed.

Reputation

A reputation service is one that grades how much a certain entity can be trusted. The greater their “reputation” the more we can trust them. Using simple rules it’s possible to award or remove kudos points from an entity and, thus, build a profile of just how trust-worthy that entity is over a period of time.

User

User reputation can be measured using a range of metrics:

How long have they been a member?
When is the last time the account was active?
Does the account’s registered email address map to another account?
If it does is the other account trustworthy?
Is the user’s IP address known to us (for the wrong reasons)?
Has the user been issued with any warnings?
How frequently do they post comments?
What is their aggregate spam score for their previous postings?
Do they post lots of URLs?
Have they previously been banned?

…and so on.

None of these attributes are specifically spammy but over time and in combination with spam detection we can balance a user’s reputation against the spam score of a posting to add more weight to the final classification.
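To illustrate how reputation might be balanced against a spam score, here is a Python sketch that uses just four of the metrics listed above. Every weight in it is an arbitrary guess chosen for the example and would need tuning against real data.

```python
from dataclasses import dataclass


@dataclass
class UserProfile:
    days_as_member: int
    warnings: int
    previously_banned: bool
    average_spam_score: float  # 0.0 (clean) to 1.0 (spammy) over past postings


def reputation(user):
    """Return a reputation between 0.0 (untrusted) and 1.0 (trusted)."""
    score = 0.5                                            # neutral starting point
    score += min(user.days_as_member, 365) / 365 * 0.3     # long-standing members gain
    score -= 0.1 * user.warnings
    score -= 0.3 if user.previously_banned else 0.0
    score -= 0.3 * user.average_spam_score
    return max(0.0, min(1.0, score))


def adjusted_spam_score(spam_score, user):
    """Dampen the spam score for trusted users and amplify it for untrusted ones."""
    return max(0.0, min(1.0, spam_score * (1.5 - reputation(user))))


veteran = UserProfile(days_as_member=730, warnings=0, previously_banned=False, average_spam_score=0.0)
newcomer = UserProfile(days_as_member=0, warnings=2, previously_banned=True, average_spam_score=0.5)
print(adjusted_spam_score(0.6, veteran), adjusted_spam_score(0.6, newcomer))
```

The same borderline posting ends up with a lower score for the long-standing member than for the previously banned newcomer, which is exactly the extra weight we are after.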

RBL (Real-time Black List)

Using Real Time Blacklists (RBLs) we can see if a message contains any content that has a bad reputation. For example, SURBL provides a URBL (Url Real-time Black List) that can be used to check the reputation of a URL.

Unfortunately, most RBLs are geared towards email content and not website content (so called Comment Spam); however, there are a number of services (some free, some subscription) that specialise in website content.

It’s not clear yet just how useful these will be so some analysis of example data will be necessary to decide if it is worth putting effort into developing an interface for such services.

Rule Based Filtering

There are going to be key words or phrases that are an immediate indicator of unwanted content (which may or may not also be classified as spam).

For example, content that contains (excess?) profanity or racially extremist content. For detecting stuff we consider to be more or less black and white, a simple rules based pattern detection mechanism (using regular expressions) is a very simple way to filter out unwanted content.

Of course, some rules will be more emphatic than others, so for that reason each rule should be given a score and only if a score threshold is reached should the message be considered undesirable content. For example, very offensive swear words may have a very high score whereas words like Viagra may have a medium score and words like crap or idiot may have a very low score. The combination of the score of all rules that fire will be the overall score.
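A scored rule set is straightforward to sketch with Python’s re module. The patterns, scores and threshold below are purely illustrative; they simply echo the medium and low examples given above.

```python
import re

# (pattern, score) pairs. The values are illustrative, not recommendations.
RULES = [
    (re.compile(r"\bviagra\b", re.IGNORECASE), 5),          # medium score
    (re.compile(r"\b(?:crap|idiot)\b", re.IGNORECASE), 1),  # very low score
]


def rule_score(text):
    """Sum the scores of every rule whose pattern appears in text."""
    return sum(score for pattern, score in RULES if pattern.search(text))


def is_undesirable(text, threshold=5):
    """Flag content once the combined rule score reaches the threshold."""
    return rule_score(text) >= threshold


print(rule_score("Cheap VIAGRA, you idiot"))  # 6
print(is_undesirable("What a load of crap"))  # False
```

Note that each rule contributes at most once per message here; whether repeated matches should count again is a design decision for the real thing.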

Conclusion

Spam is a real problem. Probably the best single way to handle it is using a statistical classifier. Unfortunately, statistical classification is not a panacea. There is a lot of up-front investment in both implementing the system and then training it. There is also some on-going effort required to monitor the classifier’s activities and to aid in the classifier’s self-learning. A better way is to use a blended approach to spam detection; a combination of a number of well known and proven techniques to detect and eradicate spam.

You can also read this article on the Experts Exchange technical blog.

This is an adaptation of an internal article I wrote whilst working for Last FM.

Building Boost on Windows

evilrix — Sat, 12 Jan 2013 00:54:15 +0000

In case you’ve never heard of it before, Boost is a set of peer reviewed libraries for C++. They provide a lot of features that are sorely missing from the standard C++ libraries and are probably the closest C++ developers have to a standard development toolkit. In fact, Boost is so useful that a number of the projects were included in the C++11 standard.

To install the official upstream version of boost you need to build it using bjam. This isn’t that hard but it’s not as simple as just installing a package. The good news is that most Linux distributions include native package installers. Windows developers are covered by the wonderful installer packages that are maintained by BoostPro.

The downside to using these packages is that you are tied in to the release cycle of the package maintainer. Often, Linux packages can be 2, 3, 4 or even more releases behind the official upstream. BoostPro does a pretty good job of keeping packages up-to-date but it’s still normally a release behind. Also, BoostPro may not have a package for your compiler or version.

The answer, of course, is to build Boost yourself but this can be a daunting prospect if you’ve never done it before. This article will cover building Boost for Windows for use with Visual Studio (it may be followed up by a similar article explaining how to build Boost for Linux).

Why am I focusing on Windows? A few reasons:

I actually need a Windows installation so now is as good a time as any to write this guide.
BoostPro doesn’t currently have an installer for Visual Studio 2012.
My experience is that building Boost on Windows is generally more problematic than doing so on Linux.
Linux developers are generally more used to messing around with the lower-levels of compilers, linkers and make files and so are less likely to find building Boost from scratch a daunting task.

So, here is a simple step-by-step guide to building Boost for Windows.

Download boost and extract it somewhere.
Open the Visual Studio command prompt (NOT cmd.exe).
Change directory to the root folder of Boost.
Run “bootstrap”, which will prepare the Boost Build engine.
Run “.\b2”, which will start building the Boost libraries.
Go and have a few cups of tea (you’ll be waiting quite a while).
Add Boost root folder to the compiler’s standard include path.
Add the stage/lib folder (under the Boost root folder) to the compiler’s standard library path.
If all went well you should now be able to build against Boost.

NB. this was tested using Boost 1.52; later versions may require different steps. Please consult the official Boost installation documentation for more details.

Meta Template Programming – where to start?

evilrix — Thu, 06 Dec 2012 15:23:52 +0000

A good friend of mine asked me how to get started in meta-template programming. Of course, the first thing is to know C++ and know it well. Other than that, I think my best advice is to ensure you completely understand how the C++ template generation process works. For example, if you don’t know what SFINAE stands for you’re probably not ready to start writing meta-templates (of course, that doesn’t mean you are not ready to start learning).

There are three books that, in my opinion, you need to read before you do anything else. By read I mean to have read and understood. They are:

C++ Templates: The Complete Guide

This is basically the bible as far as template programming is concerned. If you’ve not read this you don’t know C++ templates!

Modern C++ Design: Generic Programming and Design Patterns Applied

Basically, everything you’ll ever want or need to know to write meta-template code.

C++ Template Metaprogramming: Concepts, Tools, and Techniques from Boost and Beyond

Really advanced stuff. Shows you how to use the Boost MPL (Metaprogramming Library) to do really, really clever things. Don’t even think about reading this until you’ve at least read the first book, otherwise your head may just explode.

Finally, I am now taking requests for articles on my blog, so if there is anything specific you want to know just make a request on there and I’d be more than happy to try and write a nice article for you.

I hope this helps.

Set union problem

evilrix — Wed, 05 Dec 2012 21:51:13 +0000

Today I had the privilege of a job interview with one of the leading companies in the online streaming music space. I’d like to think the interview went well, although I was incredibly nervous and my brain decided it was going to operate in a way that suggested it was wading through treacle; but I digress. During the interview I was asked an algorithmic question and I have to admit I was initially quite flummoxed. This post is about that question.

I spend quite a lot of time trying to solve algorithmic programming puzzles. Those that I find interesting I blog about. For whatever reason I’d never come across this problem before. It was a great question but the solution was not immediately obvious to me. Anyway, with the help of some (lots of!) prompting from the interviewer I think we got there in the end.

I can’t say this was my finest hour and I left the interview feeling slightly miffed with myself. Still, as amazing as I am I don’t know everything and so, to make sure I never forget how to solve this problem, I now present it here, along with the solution we eventually derived during the interview.

The problem

Abstract

It’s a little tricky to explain (the interviewer had to explain it to me about 4 times before I understood it – which probably says more about me than anything else), but here goes…

You have a collection of strings. Implement an algorithm that will union any string with any other string that has any matching character. Continue this process until there are no more matches to be found and what remains is a smaller collection of strings that contain unique characters. Any strings that can’t be unioned with any other string shall remain in the collection untouched.

Detail

To make that clearer let’s look at an example.

Suppose we have the following vector V that contains the strings S1 through Sn.

V = [
   S1{ "abc" },
   S2{ "def" },
   S3{ "cf" },
   S4{ "ghi" },
   S6{ "jkl" },
   S7{ "mno" },
   S8{ "pqtl" }
]

Following the transformation applied by the algorithm, this should be the result.

V = [ S1{ "abcdef" }, S4{ "ghi" }, S6{ "jklpqt" }, S7{"mno"} ]

We get to this point because S1 shares a common character with S3 (‘c’), which in turn shares a common character with S2 (‘f’), and so all three are merged. Likewise, S6 shares a common character with S8 (‘l’) and so they are merged. The remaining strings share no common characters so just stay as they are in the collection.

The solution

Initial attempt

My initial thoughts on how to solve this were to use recursion (you can solve anything with recursion, right?). The idea being that you pass the vector and current string position to a function. It then checks the rest of the vector for matches with the string whose position you passed in. If it finds a match it calls itself with the vector and the position of that string. If it finds no other match it just returns the string at the original position found.

In this way, as the stack unwinds you then just merge the current string with the one coming down the stack and then return the newly merged string. Once the stack has finished unwinding you’ll have all the valid merges for the target string and you can then move on and start checking the next one.

In fact, this probably wouldn’t work very well for a number of reasons:

To prevent infinite recursion (well, until your stack blows up), you’d have to remove the current string from the vector before making the recursive call. For example, from S1 I match S3, I make the recursive call and I now match S2, I make a recursive call and I now match S3, I make a recursive call and I match S2 and so on… I think you can probably see where this is heading.
Removing items from the vector isn’t necessarily that simple. You’re already in the process of enumerating the vector and there is a good chance that if you’re not careful you’ll invalidate it and you’ll end up iterating up your own tail pipe.
Even if you manage to resolve the first two problems you still have the problem that, unless you come up with a fancy tail-recursion implementation, your recursive solution will quite probably blow the stack if the number of strings is large.

So, in summary, I suspect it is quite possible to implement a recursive solution for this but it doesn’t take too long thinking about it to realise that it’s probably not really the way to go.

A better attempt

On this occasion, the best solution is iterative. The idea is to focus on one target string at a time and keep looking for matches until no more can be found and then move on to the next. Let’s walk it through.

We need 3 loops. The outer loop is a control loop, that will keep repeating until we find there are no more candidates for unification. The next loop will iterate the strings in the vector and, likewise, so will the inner loop. We should have something like this.

done = false
while not done
   done = true
   for outer in vector
      for inner in vector
         if inner is not outer and they share a character then
            merge inner into outer and remove inner
            done = false
         end
      end
   end
end

So, upon initialisation outer is S1 and inner is S1. There is no point in comparing these as they are the same thing so we move inner on by one. Next, inner is S2 and there is no match so we move on again. Now inner is S3 and there is a match so we union S1 (outer) and S3 (inner) and store the result in S1 (as this is currently the target). We now remove S3 because we’ve dealt with it. We try S4 through S8 and find no more matches so this stage is done.

Now we have to process S1 again because during the last stage we merged new stuff into it so we might now be able to find another match. As it happens, S2 will now match because S1 now contains the contents of S3, which means it contains the letter ‘f’. Again, we merge S2 into S1 and then trip on through the remaining strings. We should find no more matches.

Again, we’ve updated the target string so once again we need to iterate through and make sure nothing else matches. This time there are no more matches so S1 is now complete. We can now move on to the next remaining string, which is S4. We now repeat this process again and keep going until we’ve exhausted all the target strings (ie. there are no more matches to be found). At this point we should have a much smaller set of strings, all with unique letters in them.

I hope that makes sense. As you can see, it’s not that easy to explain. Hopefully, some code will help. Today I’ve chosen C++ as the language. Whilst I normally try and present algorithms in Python I wanted to do this in C++ as it was a new problem for me and I tend to think new things through better when coding them in C++, purely because that is the language I know best. At some point I might also add a Python version.

#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

using namespace std;

class merger
{
public:
   merger(vector< string > const & vv)
      : vv_(vv)
   {
      process();
   }

   vector< string > const & get() const
   {
      return vv_;
   }

private:
   void process()
   {
      vector< string >::iterator oitr = vv_.begin();
      string tmp;

      bool more = true;
      do
      {
         more = false;

         while(oitr != vv_.end())
         {
            vector< string >::iterator iitr = oitr + 1;
            bool changed = false;

            while(iitr != vv_.end())
            {
               if(find_first_of(
                  oitr->begin(), oitr->end(),
                  iitr->begin(), iitr->end()) != oitr->end())
               {
                  sort(oitr->begin(), oitr->end());
                  sort(iitr->begin(), iitr->end());

                  tmp.resize(oitr->size() + iitr->size(), char());

                  string::const_iterator itr = set_union (
                     oitr->begin(), oitr->end(),
                     iitr->begin(), iitr->end(),
                     tmp.begin());

                  tmp.erase(itr, tmp.end());

                  oitr->swap(tmp);

                  iitr = vv_.erase(iitr);
                  changed = more = true;
               }
               else
               {
                  ++iitr;
               }
            }

            if (! changed)
               ++oitr;
         }
      }
      while(more);
   }

private:
   vector< string > vv_;
};

int main()
{
   vector< string > vv;
   vv.push_back("abc");
   vv.push_back("def");
   vv.push_back("cf");
   vv.push_back("ghi");
   vv.push_back("jkl");
   vv.push_back("mno");
   vv.push_back("pqtl");

   merger m(vv);

   copy(vv.begin(), vv.end(), ostream_iterator< string >(cout, " "));
   cout << endl;

   copy(m.get().begin(), m.get().end(), ostream_iterator< string >(cout, " "));
   cout << endl;
}
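
For comparison, here is a minimal Python sketch of the same iterative idea as the C++ code above. It is only an illustration: the function name merge_strings is a hypothetical choice, it works on character sets rather than sorted strings (so repeated characters collapse), and, as in the C++ version, merged strings come back with their characters sorted.

```python
def merge_strings(strings):
    """Repeatedly union any strings that share at least one character."""
    texts = list(strings)                 # the strings we will return
    chars = [set(s) for s in strings]     # the character set of each string
    i = 0
    while i < len(chars):
        changed = False
        j = i + 1
        while j < len(chars):
            if chars[i] & chars[j]:       # any character in common?
                chars[i] |= chars[j]      # union the match into the target
                texts[i] = ''.join(sorted(chars[i]))
                del chars[j], texts[j]    # the merged string is dealt with
                changed = True
            else:
                j += 1
        if not changed:
            i += 1                        # target is complete, move on
    return texts

print(merge_strings(["abc", "def", "cf", "ghi", "jkl", "mno", "pqtl"]))
```

As in the walkthrough, a target that has just absorbed another string is scanned again before moving on, because its new characters may match strings that were skipped earlier.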

And we’re done

So, there you have it. I hope you found this as interesting as I did. If you have your own solution for this problem I’d welcome you posting in the comments below.

Thinking about efficiency

evilrix — Tue, 04 Dec 2012 00:58:41 +0000

This article is going to cover a typical interview test question, which asks you to find the missing number in an array. The array should contain all the numbers from 1 to N. The numbers can be in any order and will never repeat, but one of them is missing and your task is to find the missing number as efficiently as possible. There are a number of different ways we could tackle this problem, which we’re going to explore. The focus of this article isn’t so much about how to solve this problem and more about the (in)efficiency of different algorithms we might use.

Asymptotic analysis

To compare the efficiency of each algorithm we’re going to attempt to calculate the asymptotic time complexity. This is a big and fancy term that just means we’ll be evaluating the relative efficiency of each algorithm based on the amount of data input. For example, if we input N items and the algorithm has linear time complexity the time taken will be directly proportional to N. In other words, if N is 10 we know it will take 10 units of time.

The actual units are abstract and all we’re doing is comparing like for like. The unit is just the measure of cost in terms of time but it does not represent seconds, or minutes or hours or, in fact, any specific time at all. The units have no specific meaning other than being comparable with each other.

For example, let’s say I have another algorithm that has a quadratic time complexity and N is 10. The total units of time will be 10^2 (quadratic basically means to the power of two) so I know that for 10 input data it’ll take 100 units of time. Compared to the linear algorithm for the same number of inputs I know that it’ll take 90 more units of time or, put another way, it’s 10 times slower.
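
To make that comparison concrete, here is a tiny sketch that tabulates the abstract units for a linear and a quadratic algorithm. The function names are hypothetical and the "units" are just the counts described above, not measurements of real time.

```python
def linear_units(n):
    """Abstract cost of a linear algorithm for n inputs."""
    return n

def quadratic_units(n):
    """Abstract cost of a quadratic algorithm for n inputs."""
    return n ** 2

for n in (10, 100, 1000):
    lin = linear_units(n)
    quad = quadratic_units(n)
    print('N={0}: linear={1}, quadratic={2}, difference={3}, ratio={4}'.format(
        n, lin, quad, quad - lin, quad // lin))
```

Notice that the ratio between the two is itself N, so the gap keeps widening as the input grows.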

There is a good reason the units have no specific meaning and that’s because complexity isn’t just about time. As well as an algorithm having a time complexity it also has a space complexity. For example, if an algorithm takes N inputs and needs to store each of them it has a linear space complexity, where the amount of memory used is directly proportional to the number of inputs. Again, the units are irrelevant; what matters is that we can compare like for like when comparing time and space complexities.

We measure both time and space complexities because we care about both when we’re considering how efficient an algorithm is. In general, a bad algorithm will cost a lot of time and space. A reasonable algorithm will only cost us space or time (or a reasonable trade-off of each) and a good algorithm costs little of either. As hinted, this is often referred to as the Space/Time trade-off. Generally speaking you can make things faster by using more space. Put another way, we can often improve how long something takes to run at the cost of the amount of memory it uses.

Big O

When comparing time and space complexities we need to use a notation that is both simple and consistent. For this we’re going to use the Big-O notation. This is a very simple notation that expresses time and space complexity as a function of the input size n, written inside O(). Using this notation linear time is represented as O(n) and quadratic time as O(n^2).

Amortized time

When calculating the time complexity we’ll be considering the amortized time complexity rather than the worst case time complexity. Although the two are mostly the same, the former is more helpful because the worst case isn’t necessarily going to happen and, even if it does, it’s still not going to be an accurate reflection of the typical cost.

For example, when pushing back data into a C++ vector we might reach a point where more memory needs to be allocated. Although the C++ Standard doesn’t prescribe the allocation strategy to be used it is often just a simple case of the allocated memory being doubled. Without going into the details, this results in a linear time operation that involves allocating a memory block twice the size of the existing one, copying the existing data from the old block to the new block and then freeing the old block. This means that on some occasions the complexity to push back isn’t constant; however, because this reallocation of memory happens only occasionally (and assuming geometric memory reallocation) we can ignore these occasional anomalies and just treat all push backs as constant.
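
A small simulation can make the "occasional anomaly" argument more tangible. The sketch below is not how any particular vector is implemented; it simply assumes a buffer that starts with room for one element and doubles when full (the growth factor of 2 is an assumption, as the Standard doesn’t prescribe one) and counts how many element copies the reallocations cause.

```python
def count_copies(pushes, growth=2):
    """Simulate a growable array; return the element copies caused by reallocation."""
    capacity, size, copies = 1, 0, 0
    for _ in range(pushes):
        if size == capacity:      # buffer full: allocate a bigger one
            capacity *= growth
            copies += size        # every existing element is copied across
        size += 1
    return copies

for n in (10, 1000, 100000):
    total = count_copies(n)
    print('{0} pushes caused {1} copies ({2:.2f} per push)'.format(
        n, total, total / float(n)))
```

With doubling, the total number of copies stays below twice the number of pushes, which is why the cost per push can be treated as constant on average.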

Brute Force

Starting from the number 1, iterate through the array and look to see if the number can be found. If we reach the end of the array without finding it we know that is the missing number. If we find the number in the array we start the process again, this time looking for the number 2. We repeat this process, looking for each number in turn, until we are unable to locate the number in the array. At that point we’ve identified the missing number.

How efficient is this?

We start with 1 and we have to search the array to see if it’s there. On average, we will have to search about 50% of the array before we find the number so we can assume our average time complexity for searching for just one number is O(n/2). Having found 1, we now repeat the process for the number 2. We do this until we hit the number that’s missing.

What is the total time complexity? Well, on average we’ll have to search for about 50% of the numbers before we reach the one that is missing. For each number we look for we need to iterate through, on average, about 50% of the array. Therefore the time complexity for this is going to be O(n/2) x O(n/2).

We can simplify that to O((n/2)^2) and, since Big-O ignores constant factors, we can simplify further to O(n^2). This is called quadratic time and it isn’t really what one could call efficient. Just adding one extra element to the array means we may have to search the whole array one more time.
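
The brute force approach described above could be sketched like this. The function name is hypothetical, and the sketch assumes the array holds the numbers 1 to n with exactly one of them missing.

```python
def find_missing_brute_force(a):
    """Look for 1, then 2, then 3... until a number cannot be found."""
    n = len(a) + 1                    # the full range is 1..n, one is absent
    for candidate in range(1, n + 1):
        found = False
        for value in a:               # a linear scan for every candidate
            if value == candidate:
                found = True
                break
        if not found:
            return candidate
    return None                       # only reached if nothing is missing

print('The missing number is {0}'.format(
    find_missing_brute_force([3, 1, 9, 6, 2, 8, 4, 7])))
```

The nested loops are what give the quadratic behaviour: the inner scan runs once for every candidate the outer loop tries.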

We can do better than this!

Sorting

Sort the array and then iterate through it. Since we know that each and every number from 1 to N must be there as soon as we find a gap we’ve found what would be the missing number. For example, if the current number is 5 and the next is 7 then it is clear the missing number is 6.

How efficient is this?

Assume we use a sorting algorithm that gives us O(n log n) time complexity (for example, quick sort in the average case). We then need to iterate through the array, which is O(n), so our total time is O(n log n) + O(n). Since only the dominant term matters as n grows we can simplify that further to be just O(n log n). Put another way, we have to first of all wait for the array to be sorted and then we can start looking for the missing number.
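
Here is a minimal sketch of the sorting approach, again with a hypothetical function name. Rather than comparing neighbouring values it compares each value with the position it should occupy, which is the same gap test but also copes with the first or last number being the missing one.

```python
def find_missing_sorted(a):
    """Sort, then find the first position where the run 1, 2, 3... breaks."""
    ordered = sorted(a)                       # O(n log n)
    for position, value in enumerate(ordered, start=1):
        if value != position:                 # a gap: position is missing
            return position
    return len(a) + 1                         # no gap, so the last is missing

print('The missing number is {0}'.format(
    find_missing_sorted([3, 1, 9, 6, 2, 8, 4, 7])))
```

Note that sorted() builds a new list, so this particular sketch also uses linear extra space; sorting in place with a.sort() would avoid that at the cost of modifying the input.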

We can do better than this!

Bit field

Create a bit field, where we have one bit for each element in the original array. Iterate the original array and for each number we find set the corresponding bit in the bit field. For example, if we find number 4 we set the 4th bit in the bit field. Once we’re finished iterating the array we’ll have a bit field where all but one bit is set. The bit that is not set corresponds to the missing number.

How efficient is this?

We’re iterating the array and we only need to do this once. That has a time complexity of O(n). For each iteration of the array we need to set a bit, which has a constant time complexity O(1). We can simplify that to O(n x 1) and further to O(n). In other words, we now have a linear time algorithm. W00t! But, hold on. Efficiency isn’t just about performance. We now have linear time complexity but we also have linear space complexity.

In other words, the size of the space we require to execute this algorithm is directly proportional to the size of the data we’re processing. If we have a 500 element array we need 500 bits. But, what if this is a large amount of data? What if the size of the array was huge? Do we really want an algorithm where the memory requirements scale up in direct proportion to the data we’re processing?
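
A sketch of the bit field approach follows. It uses a plain Python integer as the bit field, which is an assumption made for brevity (Python integers grow as needed, so the memory used still scales with n); the function name is hypothetical.

```python
def find_missing_bitfield(a):
    """Set one bit per number seen, then look for the bit left unset."""
    n = len(a) + 1
    bits = 0                               # an int used as a growable bit field
    for value in a:
        bits |= 1 << value                 # set bit number `value`
    for candidate in range(1, n + 1):
        if not (bits >> candidate) & 1:    # this bit was never set
            return candidate
    return None

print('The missing number is {0}'.format(
    find_missing_bitfield([3, 1, 9, 6, 2, 8, 4, 7])))
```

The array is walked once to set the bits and the bit field is walked once to find the hole, so the number of steps is linear rather than quadratic.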

We can do better than this!

Triangles

The solution lies with triangles, or more specifically triangular numbers. You see, if we picture all the numbers from 1 through 9 as a number of dots representing that number we can imagine them laid out such that they look like a triangle.

1:         *
2:        * *
3:       * * *
4:      * * * *
5:     * * * * *
6:    * * * * * *
7:   * * * * * * *
8:  * * * * * * * *
9: * * * * * * * * *

Right, so far so good but how does this help us solve our original problem? Well, if you look you’ll see that the result is actually an equilateral triangle. Each side has the same number of dots. It just so happens that there is a nice simple mathematical formula we can use to calculate how many dots there would be given the number of dots on one side.

T(n) = n(n + 1) / 2

If we plumb the numbers into this formula we get this.

T(9) = 9(9 + 1) /2

T(9) = 9(10) /2

T(9) = 90 /2

T(9) = 45.

So, for an array of 9 elements the sum of all the numbers should be 45. Let’s see if that’s right.

1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 = 45

Yay! It works. Um… how does that help?

Ok, let’s perform that last sum again, but let’s remove the number 5.

1 + 2 + 3 + 4 + 6 + 7 + 8 + 9 = 40

Notice the relationship between the number removed and the result? Of course, we removed 5 so the result is 5 less, or put another way 45 – 5. That’s right, we can find the missing number simply by calculating how many dots should be in the triangle and then subtracting from it the number of dots that are actually in the triangle. Put another way, we can figure out the sum of all the items that should be in the array and then subtract the sum of the items that are actually in the array. The result is the number we are looking for.

Let’s write some code. As always, I’m going to use Python for this as it’s a nice simple language, which will allow us to focus on the problem and not the syntax of the language. I’ve annotated the code with liberal comments so it should be pretty easy to follow.

#-------------------------------------------------------------------------------
# Name:        missing
# Purpose:     find the missing number in an array
#
# Author:      ricky
#
# Created:     01/12/2012
# Copyright:   (c) ricky 2012
# Licence:     MIT
#-------------------------------------------------------------------------------

# T(n) = n(n + 1) / 2 (integer division keeps the result a whole number)
def t(n):
    return n * (n + 1) // 2

# This array has the number 5 missing
a = [1,2,3,4,6,7,8,9]

# Size of the array plus one to account for the one number that is missing
n = len(a) + 1

# Calculate the triangle number for n
x = t(n)

# Get the sum of the items actually in the array
s = sum(a)

# The difference between T(n) and sum(a) is the missing number
r = x - s

# Display the result
print('The missing number is {0}'.format(r))

How efficient is this?

The solution has to iterate through the array to sum up all the values, which is linear O(n) time complexity. Calculating the triangle number takes constant time O(1). The memory requirements are also O(1) because no matter how big the array we never need more than a handful of variables to store the result of the math. In fact, using variables is just a convenience; we don’t actually need to use any really.

So, is this efficient? It’s about as efficient as it’s going to get. I am not aware of a way of doing this in less than O(n) time and as far as I know there is no better way of doing this. If you know better please do post a comment and let me know your secret sauce.

Conclusion

We’ve seen here that there is often more than one way to do something but we’ve also seen that not all algorithms are equal. On face value the problem posed is trivial and yet when looking at the various ways of implementing a solution we’ve seen that the most trivial way to solve it is probably not the way to go. In fact, the best solution turns out to require a little bit of lateral thinking and some simple algebra.

Not all simple problems have simple solutions and my advice would be that Google is your friend. There is rarely a programming problem you’ll face that hasn’t already been solved before. Google for this question and you’ll find plenty of solutions. Some good and some poor. Generally speaking, the one I’ve demonstrated here is the most popular.

Encore

For a bit of fun I decided to take the things discussed in this article and write a small program that does a little magic. Actually, it doesn’t do any magic but I just thought it would be fun to do something semi-practical with the final solution.

#-------------------------------------------------------------------------------
# Name:        card trick
# Purpose:     magic card trick using python
#
# Author:      ricky
#
# Created:     01/12/2012
# Copyright:   (c) ricky 2012
# Licence:     MIT
#-------------------------------------------------------------------------------

# for shuffling
import random

# all the cards in a full deck ([A]ce, [T]en, [J]ack, [Q]ueen, [K]ing of
# [H]earts, [D]iamonds, [S]pades, [C]lubs)
cards = [
    'AH', '2H', '3H', '4H', '5H', '6H', '7H', '8H', '9H', 'TH', 'JH', 'QH', 'KH',
    'AD', '2D', '3D', '4D', '5D', '6D', '7D', '8D', '9D', 'TD', 'JD', 'QD', 'KD',
    'AS', '2S', '3S', '4S', '5S', '6S', '7S', '8S', '9S', 'TS', 'JS', 'QS', 'KS',
    'AC', '2C', '3C', '4C', '5C', '6C', '7C', '8C', '9C', 'TC', 'JC', 'QC', 'KC',
    ]

# names of rank and suit
names = {
    'A':'Ace', 'H':'Hearts', 'D':'Diamonds', 'S':'Spades', 'C':'Clubs',
    '2':'two', '3':'three', '4':'four', '5':'five',
    '6':'six', '7':'seven', '8':'eight', '9':'nine', 'T':'ten',
    'J':'Jack', 'Q':'Queen', 'K':'King'
    }

# create a deck of cards: one number per card, 1 to 52
deck = list(range(1, len(cards) + 1))

# and now let's make some magic...
print('choose a card, any card')
card = random.choice(deck) - 1
rank = names[cards[card][0]]
suit = names[cards[card][1]]

print('Show the audience... but not me')
print('* the audience sees you have chosen the {0} of {1} *'.format(rank, suit))

print('Mark your card and put it back in the deck')
deck[card] = 0

print('Finally, shuffle the deck')
random.shuffle(deck)

print('I will now try and guess your card')
siz = len(deck)
exp = siz * (siz + 1) // 2
val = sum(deck)
xcard = (exp - val) - 1
xrank = names[cards[xcard][0]]
xsuit = names[cards[xcard][1]]

print('Was your card the {0} of {1}?'.format(xrank, xsuit))
print('* the audience goes wild! ... applause ... cheer! *')

Virtual Function Defaults

evilrix — Sun, 02 Dec 2012 17:31:27 +0000

Virtual functions and default parameter arguments are a staple for all C++ programmers, but you might get more than you bargained for if you decide to mix and match them. Try this little quiz and see if your coder’s intuition is correct.

Question: Without using a compiler, what is the value of X?

struct Foo
{
   virtual size_t func(size_t st = 0) const = 0;
};

struct Bar : Foo
{
   virtual size_t func(size_t st = 999) const
   {
      return st;
   }
};

int main()
{
   Foo const & foo = Bar();
   size_t const X = foo.func(); // What value does X have?
}

Answer: It will be set to 0 and not, as you might expect, 999.

Why?

The reason for this is that default arguments are always bound to the static type and NOT the dynamic type. Here the call is made through a reference to Foo, so Foo’s default of 0 is used even though it is Bar’s override that runs. This can be a very hard to track down source of defects so be very careful when mixing and matching default parameters with virtual functions as you might just get more than you bargained for.

More info: Guru of the week: Overriding Virtual Functions

Template syntax mind warp!

evilrix — Sun, 02 Dec 2012 17:25:07 +0000

Templates are a hugely powerful feature of C++. They allow you to do so many different and cool things. Unfortunately, templates do have a bit of a reputation for having rather nasty syntax and for the most part this reputation is quite well deserved. This little quiz shows an example of some template syntax that you’ll only rarely come across but if you don’t know about it you could literally be left scratching your head in disbelief, convinced you’ve uncovered a compiler bug.

Question: This code will fail to build on most compilers, why?

struct Bar
{
   template <typename U>
   void func() const { }

   template <typename U>
   static void sfunc() { }
};

template <typename T>
void Foo1(T const & t)
{
   t.func<int>();
   T::sfunc<int>();
}

template <typename T>
void Foo2(T const * pT)
{
   pT->func<int>();
}

int main()
{
   Bar bar;
   Foo1(bar);
   Foo2(&bar);
}

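Answer: Because t, T and pT all depend on the template parameter, the compiler can’t figure out what they are during the first pass of template processing, so it can’t know that func and sfunc are member templates. You need to tell it that what follows is a template function. You do this by writing the template keyword after the ->, . or :: operator, giving ->template, .template and ::template.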

struct Bar
{
   template <typename U>
   void func() const { }

   template <typename U>
   static void sfunc() { }
};

template <typename T>
void Foo1(T const & t)
{
   t.template func<int>();     //<--- The compiler doesn't know what t is, so you need to tell it
   T::template sfunc<int>();   //<--- The compiler doesn't know what T is, so you need to tell it
}

template <typename T>
void Foo2(T const * pT)
{
   pT->template func<int>();   //<--- The compiler doesn't know what pT is, so you need to tell it
}
 
int main()
{
   Bar bar;
   Foo1(bar);
   Foo2(&bar);
}

Reversing a linked list

evilrix — Sun, 02 Dec 2012 17:15:02 +0000

Linked lists are an interviewer’s favourite subject matter. Whilst they are pretty easy to understand, at least in principle, they do require a little bit of brain warping to get your head around what’s going on under the hood. Of the common questions about linked lists I’ve come across, this quiz tackles the most common: how to reverse a linked list.

Question: Given the following function signature, write a function to reverse a linked list.

void reverse_single_linked_list(struct node** headRef);

Answer: Ideally, something like the following…

void reverse_single_linked_list(struct node** headRef)
{
   struct node* result = NULL;
   struct node* current = *headRef;
   struct node* next;

   while (current != NULL)
   {
       next = current->next; // tricky: note the next node
       current->next = result; // move the node onto the result
       result = current;
       current = next;
   }

   *headRef = result;
}

Starting from the head of the list, for each iteration we make the next pointer of the ‘current’ node point to the previous node. We have to keep track of the ‘next’ node so as not to lose it, as well as the ‘result’, which will end up being the new head once the complete list has been reversed.

Let’s see how that works…

KEY

-------------------

H : Head

N : Node

C : Current

X : Next

R : Result

0 : Null terminator

Iteration 0

X ---> ?

C ---> H

R ---> 0

H ---> N1 ---> N2 ---> 0

Iteration 1

X ---> N1

C ---> N1

R ---> H

0 <--- H   N1 ---> N2 ---> 0

Iteration 2

X ---> N2

C ---> N2

R ---> N1

0 <--- H <--- N1   N2 ---> 0

Iteration 3

X ---> 0

C ---> 0

R ---> N2

0 <--- H <--- N1 <--- N2

NB. Using this technique to reverse a list you can find out if a linked list is self referential (i.e. it contains a cycle) in linear O(N) time. I’ll leave it as an exercise for the reader to figure out how this works but try repeating the steps I show above but have an additional N3 node that references back to N1 and it should be pretty obvious why. Of course, there are better ways to do this.
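
For readers who would rather check their answer to that exercise, here is a hedged Python sketch of the idea. The Node class and the function names are hypothetical, and the sketch relies on one observation: if the list loops back on itself, the reversal walks back down the pointers it has already flipped and finishes at the node it started from.

```python
class Node(object):
    def __init__(self, value):
        self.value = value
        self.next = None

def reverse(head):
    """Same algorithm as the C function above; returns the new head."""
    result = None
    current = head
    while current is not None:
        nxt = current.next        # note the next node before we overwrite it
        current.next = result     # move the node onto the result
        result = current
        current = nxt
    return result

def has_cycle(head):
    """True if the list is self referential; the list is restored afterwards."""
    if head is None or head.next is None:
        return False              # zero nodes, or one node with no link
    new_head = reverse(head)
    looped = new_head is head     # only a cycle brings us back to the start
    reverse(new_head)             # a second reversal undoes the first
    return looped

# H -> N1 -> N2 -> N3, with and without N3 pointing back to N1
h, n1, n2, n3 = Node('H'), Node('N1'), Node('N2'), Node('N3')
h.next, n1.next, n2.next = n1, n2, n3
print(has_cycle(h))
n3.next = n1
print(has_cycle(h))
```

The check temporarily modifies the list, which is one reason the other techniques alluded to above are usually preferred.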

Which STL Container?

evilrix — Sun, 02 Dec 2012 16:44:34 +0000

Included as part of the C++ Standard Template Library (STL) is a collection of generic containers. Each of these containers serves a different purpose and has different pros and cons. It is often difficult to decide which container to use and when to use it.

This article is a guided tour of the STL containers. It is not intended to teach you how to use each of these containers; rather, it will help you decide when and where to use each of them. It will also show you a few tips, tricks and snippets of information that are not normally documented elsewhere but may come in handy.

The target audience for this article is intermediate and above. I don’t necessarily discuss every new concept I introduce but every time a new and important concept is introduced I will be sure to provide a link to a reference where you can read more if you need to.

This article discusses the following STL containers:

vector
deque
list
set
map
multiset
multimap
bitset

Vector

The vector is a sequence container that represents an abstraction of a dynamic one dimensional array. It manages an internal buffer which can automatically grow to accommodate the contents. The growth strategy for the internal buffer is not defined by the C++ Standard; it is chosen by the library implementation, while the memory itself is obtained through the vector’s allocator. Unless a user defined allocator is provided the vector will default to the standard allocator, std::allocator. Typically, an implementation will just double the size of the internal buffer when current capacity is reached; however, different compilers may use different strategies so you are advised to consult your compiler’s documentation.

The internal buffer of a vector is guaranteed to be binary compatible with a standard C array. This means it can safely be used with legacy code. There is no specific operator to provide a C array compatible pointer; rather, you take the address of the internal buffer as though you were taking the address of the first element in an array.

An example of accessing the internal buffer of a vector

vector<int> v(10, 0); // create a vector with 10 ints, each set to 0
int * p = &v[0]; // Get a pointer to the internal dynamic array

This buffer is guaranteed to be perfectly safe to use and 100% compatible with a C style array as long as no mutating methods are called on the instance the buffer belongs to. In much the same way that mutating methods on a vector may invalidate iterators so, too, can they invalidate a pointer to the internal buffer. Why is this? Well, put simply, the internal buffer may need to be reallocated to allow for extra capacity. When this happens the vector creates a whole new buffer, which will be bigger than the existing one, all data from the old buffer are copied to the new buffer and the old buffer is then destroyed. It should be clear that if this happens your C style pointer will be pointing to invalid memory.

A specific exception to be aware of to the C array guarantee is vector<bool>, which the C++98 version of the C++ Standard specifies should be specialised so that each Boolean element can be packed into a single bit. There are other considerations with this specialisation too. For example, the subscript operator does not return a reference to a bool within the vector; it returns a temporary proxy object, constructed when the operator is called, which represents the Boolean state of the bit it refers to.

The vector class is very efficient in terms of memory usage. As well as the size of the allocated buffer it has a very small overhead in terms of additional members of the class to manage its internal state. Typically an empty vector will utilise between 16 and 24 bytes (depending upon how it’s implemented, which is not defined by the C++ Standard).

The size of the buffer may very well be greater than the size of the data held within the vector. This is because the capacity of the vector (the actual size of the internal buffer) will always be at least as big as the size (the amount of the capacity in use) but will usually be greater.

When a vector needs to grow it must allocate a new buffer and copy the old to the new before destroying the old. This means that during mutating function calls it is possible that the amount of memory consumed by the vector may, for a short period, be at least twice as much as the previous capacity.

The internal buffer of a vector may not shrink when the container is cleared. This can be a problem if you have released a large quantity of data and want to reclaim the memory being used. There is a simple trick, called The Self Swapping Idiom, that can be used. Basically, swap the vector to be cleared with a temporary empty one. When you use the swap method on a vector (this does not, necessarily, apply when using std::swap) all that happens is that the pointers to each of the buffers are swapped, making a swap a very efficient process. When the other vector is destroyed all the memory it now has goes with it. Your original vector will now contain an empty buffer.

The vector is most efficient when appending items (pushing back). If you think of the vector’s internal buffer as a stack of plates, with the top plate being its back, it’s easy to add new plates; you just put them on top. What if we want to insert a plate in the middle or at the bottom? Well, now it’s a little more complex. We have to lift up the plates above to make space to insert a new one. It’s the same with a vector. If we want to insert a new item every item after it must move back to make space. The nearer the front you insert the longer it takes as the more items there are to move. Likewise, deleting anywhere other than the back of the vector is equally inefficient.

When appending to or removing from the back of a vector the time complexity is O(1). In other words, constant time as long as the buffer does not need to be reallocated.

If the number of elements being inserted into or removed from a vector is known beforehand the time complexity is O(N+M) otherwise the time complexity is O(NM), where N is the number of items being inserted or removed and M is the number of items that need to be moved.

In all cases of adding to a vector, the time complexities do not take into account the fact that the vector may also need to reallocate the internal buffer. This has a time complexity of O(N+M), where N is the original buffer size and M is the new buffer size. Where possible, you should use the reserve method on vector to pre-allocate the internal buffer to minimise the requirement to reallocate the internal buffer. Although vector will grow dynamically it is most optimal to use when you know in advance how big the buffer needs to be.

The vector is random access, meaning you can read any element of a vector in constant time, or O(1). This makes vector ideal for implementing constructs that require low latency reads.

Consider using a vector if:

you need to store data when you have a rough idea, in advance, of the number of items
data can be either added all in one go or can be appended to the existing data
if you want to be able to access the contents in any order

Avoid using a vector if:

you need to do frequent inserts or deletes to anywhere other than the back of the vector
you do not know, in advance, roughly how much data you plan to put in it

A simple example of using vector

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

int main()
{
	std::vector<int> c;
	c.reserve(10);

	for(int x = 0 ; x < 10 ; ++x)
	{
		c.push_back(x);
	}

	std::copy(c.begin(), c.end(), std::ostream_iterator<int>(std::cout, "\n"));
}

Deque

The deque is a sequence container that represents an abstraction of a dynamic one dimensional array. It manages an internal buffer which can automatically grow to accommodate the contents. The default allocation strategy used to allocate the internal buffer is not defined by the C++ Standard. Unless a user defined allocator is provided it will default to using a vendor specific allocator, which will provide a general allocation strategy for best case general use.

The size of the buffer may possibly be greater than the size of the data held within the deque. This is because the capacity of the deque (the actual size of the internal buffer) will always be at least as big as the size (the amount of the capacity in use) but may be greater.

The actual internal implementation of the deque buffer is completely implementation dependent so we cannot draw any assumptions about how costly it is to grow the buffer; however, it is reasonable to assume that the worst case is going to be linear, or more specifically O(N) where N represents the size of the new buffer.

Unlike vector, deque provides no mechanism to reserve a buffer. However, this isn’t actually such a big deal since deque doesn’t have to comply with the contiguous memory requirements of vector (which it needs to remain backwards compatible with C arrays). Since this is the case compiler vendors are able to implement memory allocation strategies that are far more Operating System friendly. Or, to put it another way, it does not suffer from the same reallocation bottleneck of vector.

The deque gets its name from a contraction of “double ended queue” and is most efficient when appending items to the front or back (vector is only efficient when appending to the back). If you think of the deque’s internal buffer as a line of cups it’s easy to add a new cup at either end. What if we want to insert a cup in the middle? Well, now it’s a little more complex. We have to move some of the cups to the left or the right to make space to insert a new one. It’s the same with a deque. If we want to insert a new item every item before or after it must move backwards or forwards to make space. Likewise, deleting anywhere other than the back or front of the deque is equally inefficient.

When appending to or removing from the back and front of a deque the time complexity is O(1). In other words, constant time as long as the buffer does not need to be reallocated.

If the number of elements being inserted into or removed from a deque is known beforehand the time complexity is O(N+M) otherwise the time complexity is O(NM), where N is the number of items being inserted or removed and M is the distance or number of items that need to be moved.

In all cases of adding to a deque, the time complexities do not take into account the fact that the deque may also need to reallocate the internal buffer.

The deque is random access, meaning you can read any value of a deque in constant time, or O(1). This makes deque ideal for implementing constructs that require low latency reads.

Consider using a deque if:

you need to store data that is likely to be appended to at either end
you want to be able to access the contents in any order

Avoid using a deque if:

you need to do frequent inserts or deletes to anywhere other than the back or front of the deque
you need to have compatibility with C style arrays (use vector)

A simple example of using deque

#include <algorithm>
#include <deque>
#include <iostream>
#include <iterator>

int main()
{
	std::deque<int> c;

	for(int x = 0 ; x < 10 ; ++x)
	{
		if(0 == (x%2))
		{
			c.push_back(x);
		}
		else
		{
			c.push_front(x);
		}

	}

	std::copy(c.begin(), c.end(), std::ostream_iterator<int>(std::cout, "\n"));
}

List

The list is a sequence container that represents an abstraction of a doubly linked list. The list consists of nodes, which are allocated on an as needed basis. The memory footprint of an empty list is very small; however, the memory footprint of each node is actually quite large and can, for example, exceed the size of the data being stored in the node.

For example, if we store a 32 bit int in the node on a 32 bit platform the overhead for each node will probably be at least 64 bits (the next and prev pointers), making the whole node three times the size of the data it holds! This means that list is, generally, a poor choice if you are looking to store small data items, especially if you plan to store lots of them. Put another way, if you were storing 1 GB of 32 bit int values in the list it would have a memory footprint of at least 3 GB.

Although the cost of a node, in terms of memory, is quite high the list makes up for this by virtue of the fact that memory is only allocated when a node is created and can be released when the node is destroyed. This means that it’s a good candidate for storing an unpredictable number of items, especially if the number of items is likely to fluctuate.

The downside is that unless the allocator used takes care to avoid it, rapid allocation and de-allocation of nodes in a list can lead to bad memory fragmentation.

Most STL implementations try to get around this by allocating memory in chunks and using this to allocate the nodes. When a node is released the memory it used in the chunk becomes free. This is not a C++ Standard requirement so you should check your compiler’s documentation to see if this behaviour is supported.

Since a list is, essentially, a sequential chain of nodes where one node points to the next and that in turn points back, it follows that if a node is removed from or inserted into the list it will not invalidate the rest. This means that it is perfectly safe to iterate through a list whilst it is being modified; only the iterator to the modified node becomes invalid in the case of a node being removed.

It also follows that since a list is just a sequential chain of nodes access to them is also sequential and not direct as in the case of, for example, the items in a vector. In fact, access to a node within a list takes linear time, or more specifically it has a time complexity of O(N) where N is the distance of the node being accessed from the point we started from. This makes list unsuitable for certain algorithms such as a binary search.

On the other hand, inserting, moving (even between type identical lists) or removing a node requires nothing more than the modification of the next and previous pointers in the three nodes affected (the previous, the current and the next) and so takes constant time, or more specifically it has a time complexity of O(1).

Consider using a list if:

you need to store many items but the amount cannot be predicted
you need to perform lots of inserts or deletes that are not at the start or end of the sequence of data

Avoid using a list if:

the size of your data items is small, especially if you need to store many of them
you need constant time random access to the items

A simple example of using list

#include <algorithm>
#include <iostream>
#include <iterator>
#include <list>

int main()
{
	std::list<int> c;

	for(int x = 0 ; x < 10 ; ++x)
	{
		if(0 == (x%2))
		{
			c.push_back(x);
		}
		else
		{
			c.push_front(x);
		}

	}

	std::copy(c.begin(), c.end(), std::ostream_iterator<int>(std::cout, "\n"));
}

Set

The set is a sorted associative container that represents an abstraction of an ordered unique key. Although the exact implementation of the data structures that make up a set are not specifically defined in the C++ Standard they are typically implemented as a Binary Search Tree (more specifically, a common implementation is a Red/Black tree).

Just like list, set is a node based container meaning iterators are not invalidated when items are added or removed with the exception of the iterator of the specific item being removed.

The keys in a set must be unique; attempting to add the same key more than once leaves the existing item untouched, with the end result being nothing more than wasted time.

Accessing, adding and removing items in a set can be done directly via the key value although, unlike vector and deque, the access does not take place in constant time. In fact time taken is generally logarithmic, or more specifically it has a time complexity of O(log N), where N is the number of items in the set.

Just like list, one needs to consider the possible size overhead of the framework that makes up a node versus the size of the data being stored. If the size of the data is small, say the size of a 32 bit int on a 32 bit platform, it is likely that the size of the node will be at least 3 times that of the original data item.

Since set is a sorted associative container based on a Binary Search Tree it follows that keys are stored in a predetermined order. By default the order of the keys is determined by the std::less comparison predicate. You can override this behaviour by providing your own comparison predicate as one of the template parameters.

As an alternative to set you might want to consider Google’s sparse_hash_set and dense_hash_set. Both of these are unsorted associative containers that are implemented using hash tables rather than Binary Search Trees. They have several advantages over set, the main ones are constant time lookup (best case, worst case can be linear!) and much lower cost in terms of data to framework overhead for each data item stored. Unlike set, the data is not sorted, making a hash table a poor choice if you need to access data in sorted order.

Consider using a set if:

you need to store data items in a sorted order and you need this sorting to be enforced in real time
you need to filter out duplicates from a collection and wish to retain that list for further use
you cannot use a 3rd party hash set
the nature of your data means it doesn’t hash well, resulting in many collisions

Avoid using a set if:

the size of your data items is very small especially if you need to store lots of them
you cannot afford to accept the fact that access is not done in constant time
you can get away with using a hash set

A simple example of using set

#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <set>

int main()
{
	std::set<int> c;

	srand((int)time(0));

	for(int x = 0 ; x < 10 ; ++x)
	{
		c.insert(rand());
	}

	std::copy(c.begin(), c.end(), std::ostream_iterator<int>(std::cout, "\n"));
}

Map

The map is a sorted associative container that represents an abstraction of an ordered unique key, with a related value. Although the exact implementation of the data structures that make up a map are not specifically defined in the C++ Standard they are typically implemented as a Binary Search Tree (more specifically, a common implementation is a Red/Black tree).

Just like list, map is a node based container meaning iterators are not invalidated when items are added or removed with the exception of the iterator of the specific item being removed.

The keys in a map must be unique (although the related values do not have the same restriction). Attempting to add the same key more than once using insert leaves the existing item and its value untouched. Assigning via the subscript operator, on the other hand, will modify the value of the key already in the map if the new value is different.

Accessing, adding and removing items in a map can be done directly via the key value although, unlike vector, the access does not take place in constant time. In fact time taken is generally logarithmic, or more specifically it has a time complexity of O(log N), where N is the number of items in the map.

Just like list, one needs to consider the possible size overhead of the framework that makes up a node versus the size of the data being stored. If the size of the data is small, say the key is the size of a 32 bit int and, likewise, the value, on a 32 bit platform it is likely that the size of the node will be at least 2 times that of the original data item.

Since map is a sorted associative container based on a Binary Search Tree it follows that keys are stored in a predetermined order. By default the order of the keys is determined by the std::less comparison predicate. You can override this behaviour by providing your own comparison predicate as one of the template parameters.

As an alternative to map you might want to consider Google’s sparse_hash_map and dense_hash_map. Both of these are unsorted associative containers that are implemented using hash tables rather than Binary Search Trees. They have several advantages over map, the main ones are constant time lookup (best case, worst case can be linear!) and much lower cost in terms of data to framework overhead for each data item stored. Unlike map, the data is not sorted, making a hash table a poor choice if you need to access data in sorted order.

Consider using a map if:

you need to store data items in a sorted order of the key and you need this sorting to be enforced in real time
you need to filter out duplicates from a list and wish to retain that list for further use
you cannot use a 3rd party hash map
the nature of your data means it doesn’t hash well, resulting in many collisions

Avoid using a map if:

the size of your data items is very small especially if you need to store lots of them
you cannot afford to accept the fact that access is not done in constant time
you can get away with using a hash map

A simple example of using map

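#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <map>

// NB. The Standard does not sanction adding overloads to namespace std; it
// is done here only so that ostream_iterator can find the operator.
namespace std
{
	ostream & operator << (ostream & os, pair<int const, int> const & val)
	{
		return (os << val.first << "=>" << val.second);
	}
}

int main()
{
	std::map<int, int> c;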

	srand((int)time(0));

	for(int x = 0 ; x < 10 ; ++x)
	{
		c[rand()] = x;
	}

	std::copy(c.begin(), c.end(), std::ostream_iterator<std::pair<int const, int> >(std::cout, "\n"));
}

Multiset

The multiset is a sorted associative container that represents an abstraction of an ordered non-unique key. The only difference between set and multiset is that a multiset is allowed to have duplicate keys.

A simple example of using multiset

#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <set>

int main()
{
	std::multiset<int> c;

	srand((int)time(0));

	for(int x = 0 ; x < 10 ; ++x)
	{
		c.insert(rand());
	}

	std::copy(c.begin(), c.end(), std::ostream_iterator<int>(std::cout, "\n"));
}

Multimap

The multimap is a sorted associative container that represents an abstraction of an ordered non-unique key, with a related value. The only difference between map and multimap is that a multimap is allowed to have duplicate keys.

A simple example of using multimap

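#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <iterator>
#include <map>

// NB. The Standard does not sanction adding overloads to namespace std; it
// is done here only so that ostream_iterator can find the operator.
namespace std
{
	ostream & operator << (ostream & os, pair<int const, int> const & val)
	{
		return (os << val.first << "=>" << val.second);
	}
}

int main()
{
	std::multimap<int, int> c;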

	srand((int)time(0));

	for(int x = 0 ; x < 10 ; ++x)
	{
		c.insert(std::pair<int const, int>(rand(), x));
	}

	std::copy(c.begin(), c.end(), std::ostream_iterator<std::pair<int const, int> >(std::cout, "\n"));
}

Bitset

The bitset isn’t really a container since it doesn’t expose any iterators; however, I list it here for completeness. The bitset is basically a fixed size bitfield, where the size is defined at compile time as a template parameter. It is useful if you need a bitfield that is of a size that doesn’t match a standard intrinsic integer type. It’s also useful as it can automatically convert a string of an appropriate format to a bitfield value and vice versa.

A simple example of using bitset

#include <bitset>
#include <iostream>

int main()
{
	std::bitset<8> c(0xA7);
	std::cout << c.to_string() << std::endl;
}

Final word

That concludes our guided tour of the basic STL containers. Now that you have a good grounding in STL containers you might want to read An Introduction to STL Algorithms. This excellent article, by “w00te”, goes into some depth about STL Algorithms and how they can be used with containers.

If you wish to find out more about STL containers please refer to the following excellent guide.

C++ Smart pointers

evilrix — Sun, 02 Dec 2012 16:23:37 +0000

This article is a discussion on smart pointers, what they are and why they are important to C++ programmers. Following the primary discussion I present a simple implementation of a reference counted smart pointer and show a simple example of using it. Although this article does not go into detail about how to develop a reference counted smart pointer the example code at the end is very well commented and that should be enough to aid understanding.

This article is targeted at an intermediate level C++ programmer; however, anyone who develops using C++ (even as a student) would benefit from reading this article and taking to heart the principles it discusses even if you don’t follow all of the technical concepts introduced. Not all the terms I use are necessarily explained in this article (I’ve kept it focused on the core subject); however, anytime a new term is introduced it will be linked to a reference where you can find out more.

Please note, all the code shared in this article is my own; however, during the development of my smart pointer I used the Boost shared_ptr as a basis for the interface to ensure I had captured all the necessary ingredients to provide a fully working smart pointer. If you have access to Boost then, please, do use the range of high quality, peer reviewed smart pointers provided in preference to the one I discuss here. My example, although fully working (and bug free I hope), is meant for educational purposes rather than use in production code and doesn’t, for example, implement support for thread safety or intrusive reference counting.

What is a smart pointer? That is a very good question… I’m glad you asked. First, though, allow me to introduce you to my cleaner RAII (pronounced Rye). Her job is to clean up after me. I’m pretty messy, I often get things out and forget to put them back but good old RAII follows me around and when I’m done with something she puts it away for me. RAII can be your cleaner too if you ask her. Isn’t she nice? Would you like me to introduce you to her? Yeah? Okay, say hello to RAII or Resource Acquisition is Initialisation to give her her full and rather grand title.

Okay I admit it, RAII isn’t a real person; rather, she’s a C++ idiom also, sometimes, referred to as a design pattern. Basically, anytime you allocate a resource you immediately initialise a RAII object with this resource and for the lifetime of the RAII object your resource is accessible but as soon as the RAII object is destroyed it automatically cleans up your resource too. So what does this have to do with smart pointers? Well a smart pointer is just a specialised RAII pattern. It specialises in managing the lifetime of heap allocated memory and will dispose of it when it is destroyed.

So, how does that work? Good question, I am so glad you asked! Let’s take a look. Firstly, the following is a very simple example of allocating heap memory in a function.

void foo()
{
   int * pi = new int;
   // Do some work
}

Not much going on there, except did you spot the defect? That’s right, the memory isn’t deleted when the function ends. See how easy it is to forget? What about this example?

void foo()
{
   int * pi = new int;
   // Do some stuff
   delete pi;
}

Great, memory is deleted no defects there right? Wrong! What if “Do some stuff” throws an exception? When an exception is thrown unless there is a catch handler to handle it the function it is thrown from will immediately exit, the function that called it will also immediately exit if that doesn’t have a catch handler and so on until an appropriate catch handler is found or the application terminates. This is called Stack Unwinding. Great, makes sense right? You’d hope so since this is ingrained within the C++ standard!

So, can you see the problem (and I’m not talking about the exception causing the program to terminate, let’s assume for the sake of this discussion somewhere further down the stack the exception is caught and handled)? That’s right, who deletes the memory allocated in the foo() function? Answer: no one does! Once again we have a memory leak. So how do we handle this? The obvious solution is to catch the exception, delete the memory and re-throw it, right? Ok, let’s try that.

void foo()
{
   int * pi = new int;
   try
   {
      // Do some stuff
   }
   catch(...)
   {
      delete pi;
      throw;
   }
   delete pi;
}

Great, leak plugged… except… well it’s a bit messy isn’t it? We now need to have the same pointer deleted twice in one block of code. The general rule of thumb is don’t duplicate code as it just makes for extra maintenance. Is there a better way? Well, yes… again we turn to the C++ standard and we find those nice people who provide the language (The C++ Standards Committee) have thoughtfully provided a smart pointer called std::auto_ptr (it lives in the <memory> header file).

A smart pointer will automatically delete the memory it is managing once it goes out of scope. How? Well, remember how destructors of classes are automatically called when the class is being destroyed? Well, all that happens is we pass the smart pointer a real pointer that is pointing to heap allocated memory (normally via its constructor, hence the idiom RAII) to manage and when it goes out of scope (and, thus, is destroyed) its destructor deletes the memory for us. Neat eh? So, does this help? Let’s see.

#include <memory>
void foo()
{
   // Note, the constructor of auto_ptr is explicit so you MUST use explicit
   // construction (std::auto_ptr<int> pi = new int; will cause a compilation error)

   std::auto_ptr<int> pi(new int);

   // Do some stuff
}

Fantastic, problem solved! No need for catch handlers, no duplicate code and the auto_ptr will automatically delete the memory allocated to it when it goes out of scope, when the foo() function ends. Great, time for a cup of tea and feet up, right? Um, no. You see auto_ptr has a couple of unfortunate problems that can catch out the unwary programmer. Let’s take a look.

Problem one: The auto_ptr type can only be used with scalar heap allocations

That’s right, when you allocate memory using new and delete you have to use different syntax for scalars vs. arrays. Let’s take a look.

int * pi = new int; // Allocate a scalar
delete pi; // De-allocate a scalar

int * pi = new int[10]; // Allocate an array
delete [] pi; // De-allocate an array

You can’t mix up new and delete with new [] and delete []. You must pair them off correctly, otherwise the result is undefined (assume this is bad!). Now auto_ptr is designed specifically to delete scalars.

NB. The C++ Standard provides a better solution for allocating dynamic arrays, it’s called a vector.

Problem two: The auto_ptr type has the concept of ownership, also known as move semantics. Let’s take a look.

#include <memory>

void bar(std::auto_ptr<int> p); // defined elsewhere

void foo()
{
   std::auto_ptr<int> pi(new int);

   bar(pi); // this function accepts a std::auto_ptr by value

   // Do some stuff using pi
}

The moment you call the bar() function and pass it pi, the ownership of the pointer is passed from the original pi to the one that is within the stack frame of the bar() function. When this function returns ownership is not transferred back to the original pi auto_ptr. What does this mean in simple words? When you assign pi to another auto_ptr ownership is transferred to the new auto_ptr and the original auto_ptr no longer contains a pointer to the memory it was managing, instead it now points to NULL and any attempt to dereference it after that will result in undefined behaviour (assume this is bad!). This is like giving your mate the money from your wallet and then going to the shops: you have nothing to pay with so it’ll end in tears!

In fact, this is one of the more obvious examples, where it’s clear to see what’s happening. Imagine the auto_ptr was a member of another class and this was passed by value to another function; unless you’ve written your own copy-constructor and/or assignment operator (note, you should always implement them in pairs) to perform a safe deep-copy you’ll hit the same problem. The auto_ptr member will move ownership to the new copy and the current auto_ptr member will no longer point to valid memory. It’s fair to say that auto_ptr can be very dangerous indeed!

Ok, so what do we do now? It’s clearly too dangerous to use heap allocation in C++ so we’ll all just become .Net managed code programmers right? Eeeek, no! Arrrrr! Quick… it’s time for me to introduce our saviour, the reference counted smart pointer. Now, straight off the bat let me say that there is no default implementation of a reference counted smart pointer in C++ (yet) but the one that comes with Boost is ubiquitously used and looks to become part of the next C++ Standard, C++0X.

So what is a reference counted smart pointer? Well, unlike auto_ptr a reference counted smart pointer doesn’t transfer ownership. You can copy it as many times as you like and each smart pointer will contain the same pointer to the same object.

How does it work? Well, as well as containing a pointer to the memory it’s managing a reference counted smart pointer also contains a pointer to a counter and when you copy it the counter is incremented. The copy will point to the same object being managed and the same counter and when this goes out of scope it will decrement the counter but NOT delete the memory being managed unless the counter indicates this is the last reference. Let me say that again. Every time a copy of the smart pointer is made the counter is incremented and every time a copy goes out of scope the counter is decremented.

When the counter reaches 0 that means the current copy going out of scope is the last one to reference the pointer being managed so it can safely delete the memory pointed to by the pointer without fear that another smart pointer will try to use it. It’s like a bus driver who counts all the people getting on the bus and all the people getting off and when the bus is empty he can park up and have a nice cup of tea.

Still not convinced for the case for using reference counted smart pointers over raw pointers? Think you’re too good for them? You never forget to release memory, right? Wow, you’re a tough audience! Okay, what’s wrong with this then?

class foo
{
public:
	foo() : pi1_(new int), pi2_(new int) {}
	~foo() { delete pi1_; delete pi2_; }
private:
   int * pi1_;
   int * pi2_;
};

Nothing wrong there, right? Wrong! What if the allocation of pi2_ fails? That’s ok foo’s destructor will delete pi1_ right? Um no! You see destructors aren’t called if the construction fails due to exception percolation. Ok, shall we try again? Sure, how about this?

class foo
{
public:
	foo()
	{
		try
		{
			pi1_ = new int;
			pi2_ = new int;
		}
		catch(...)
		{
			delete pi1_;
			delete pi2_;
			throw;
		}
	}
	~foo() { delete pi1_; delete pi2_; }

private:
   int * pi1_;
   int * pi2_;
};

Great, no more leaks right? Right! Except… um, now if the allocation for pi1_ throws you’ll try and delete pi1_ and pi2_ anyway and since they currently contain uninitialised values we’ll try and delete invalid pointers and corrupt the heap (this is really bad!). Ok, so we can get around this by initialising both pointers to be NULL in the constructor initialisation list (it’s safe to delete NULL) but look what a mess we now have. Imagine if the class was more complex than just these 2 pointers!?

class foo
{
public:
	foo() : pi1_(0), pi2_(0)
	{
		try
		{
			pi1_ = new int;
			pi2_ = new int;
		}
		catch(...)
		{
			delete pi1_;
			delete pi2_;
			throw;
		}
	}
	~foo() { delete pi1_; delete pi2_; }

private:
   int * pi1_;
   int * pi2_;
};

Ok, this solves the problem but what a mess? Also, what happens if the re-throw was accidentally omitted from the catch block? Well, construction wouldn’t fail (since we blocked its failure) and the destructor will still be called. Meanwhile you’d probably (but not necessarily, sometimes we want to block failure) then try to use a class that wasn’t correctly constructed and make a right old jolly mess!

Did someone scream function try block at me? Okay, I’ll humour you, let’s try again.

class foo
{
public:
	foo()
	try: pi1_(new int), pi2_(new int)
	{
	}
	catch(...)
	{
		delete pi1_;
		delete pi2_;
	}

	~foo() { delete pi1_; delete pi2_; }
private:
   int * pi1_;
   int * pi2_;
};

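Fantastic, a solution that doesn’t use smart pointers! A few things though. First, how many C++ programmers even know about function try blocks? Actually not very many! The syntax can be quite confusing for someone who isn’t well versed in them. Second, by the time the handler runs the members no longer exist as far as the language is concerned; the C++ standard says that referring to a non-static member in the handler of a constructor’s function try block is undefined behaviour, so those deletes are not actually safe. Third, the catch block in this example never re-throws (just like I discussed above) so this constructor can’t fail then right? Oh if only the C++ standard were that simple and consistent. The answer is (come on you knew I was going to say this right?) yes it most certainly can!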

You see, with a constructor try block, if an exception is thrown during construction it will always (yes, that’s right, always) percolate, even if you don’t re-throw it yourself. Actually, that’s a good thing normally, but not always. You might have a valid case to not re-throw; the exception might not necessarily mean construction failure. The only way to solve this is to nest yet another try block. Wow, what a tangled web we weave.
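To see the automatic re-throw for yourself, here is a minimal sketch. The names thrower, widget and construction_failed are made up for this illustration; the handler only touches a static counter, since referring to non-static members inside it would be undefined behaviour.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical member type whose construction always fails.
struct thrower
{
   thrower() { throw std::runtime_error("member construction failed"); }
};

struct widget
{
   widget()
   try : member_()
   {
   }
   catch(...)
   {
      // No explicit re-throw here. Only a static is touched because
      // non-static members must not be referred to in this handler.
      ++handler_runs;
   }

   static int handler_runs;
   thrower member_;
};

int widget::handler_runs = 0;

// Returns true if the exception still escaped the constructor.
bool construction_failed()
{
   try
   {
      widget w;
      (void)w;
      return false;
   }
   catch(std::runtime_error const &)
   {
      return true;
   }
}
```

The handler runs once and the exception is still re-thrown when control reaches the end of the handler, so the caller sees the failure.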

Ok, let’s look at the very simple way to solve this using smart pointers.

class foo
{
public:
	foo() : pi1_(new int), pi2_(new int) {}

private:
	// These are reference counted (soon to be discussed below)
   smart_ptr<int> pi1_;
   smart_ptr<int> pi2_;
};

Wow, now that is simple. Each smart pointer is given some memory to manage and when the class goes out of scope each will delete that memory. If construction fails, any memory already handed to a smart pointer will be correctly deleted, and a smart pointer that was never constructed (because the member constructed before it threw an exception) will never try to delete memory it was never given in the first place. Simple eh?
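Here is a small sketch of that clean-up in action. It assumes a C++11 compiler and uses std::shared_ptr as a stand-in for a reference counted smart pointer; tracked, exploding and holder are hypothetical names invented for the demonstration.

```cpp
#include <cassert>
#include <memory>
#include <new>

// Counts how many instances are currently alive.
struct tracked
{
   static int live;
   tracked() { ++live; }
   ~tracked() { --live; }
};

int tracked::live = 0;

// Simulates a failure while the second member is being constructed.
struct exploding
{
   exploding() { throw std::bad_alloc(); }
};

class holder
{
public:
   holder() : first_(new tracked), second_(new exploding) {}

private:
   std::shared_ptr<tracked> first_;
   std::shared_ptr<exploding> second_;
};

// Returns true if construction failed with the expected exception.
bool construct_and_report()
{
   try
   {
      holder h;
      (void)h;
   }
   catch(std::bad_alloc const &)
   {
      return true;
   }

   return false;
}
```

When the second member fails to construct, the already constructed first_ is destroyed during unwinding and releases its object, so tracked::live drops back to zero without a single try block in holder.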

What I’d like to do now is to introduce you to my very own hand crafted reference counted smart pointer. Now it’s important to note that although the following code is fully functional, it is really for educational purposes only. It has not been tested in a production environment (I wrote it just for this article) and if you do decide to heed the advice and use smart pointers please do refer to the Boost implementation as your first port of call.

So, without further ado, here she is in all her majestic glory (*cough*)…

// =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
// Basic smart pointer for scalar and array types
// Note: this class is not thread safe
// evilrix 2009

namespace devtools {

   // =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
   // This functor will delete a pointer to a scalar object
   template <typename ptrT>
   struct delete_scalar_policy
   {
      void operator()(ptrT * & px) { delete px; px = 0; }
   };

   // =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
   // This functor will delete a pointer to an array of objects
   template <typename ptrT>
   struct delete_array_policy
   {
      void operator()(ptrT * & px) { delete [] px; px = 0; }
   };

   // =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
   // This is the smart pointer template, which takes pointer type and
   // a destruct policy, which it uses to destruct object(s) pointed to
   // when the reference counter for the object becomes zero.
   template <
      typename ptrT,
      template <
      typename // unnamed, so it cannot shadow ptrT above
      > class destruct_policy
   >
   class smart_ptr
   {
   private:
      typedef destruct_policy<ptrT> delete_policy_;
      typedef void (smart_ptr::*safe_bool_t)();
      typedef int refcnt_t;

   public:
      //=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      // Make a nice typedef for the pointer type
      typedef ptrT ptr_type;

      //=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      // dc_tor, c_tor and cc_tor
      smart_ptr() : px_(0), pn_(0) {}

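      explicit smart_ptr(ptr_type * px) :
      px_(px), pn_(0) { pn_ = new refcnt_t(1); }

      // Copying a null smart_ptr leaves the counter null, so only
      // increment the counter when there is one.
      smart_ptr(smart_ptr const & o) : 
      px_(o.px_), pn_(o.pn_) { if(pn_) ++*pn_; }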

      //=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=	
      // d_tor, deletes the pointer using the destruct policy when the
      // reference counter for the object reaches zero
      ~smart_ptr()
      {
         try
         {
            if(pn_ && 0 == --*pn_)
            {
               delete_policy_()(px_);
               delete pn_;
            }
         }
         catch(...) 
         {
            // Ignored. Prevent percolation during stack unwinding.
         }
      }

      //=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      // Assignment operator shares ownership of the object managed by
      // 'o' and releases the current one. Copy-and-swap lets the cc_tor
      // and d_tor adjust the reference counters, so no counter is leaked.
      smart_ptr & operator = (smart_ptr const & o)
      {
         if(&o != this && px_ != o.px_)
         {
            smart_ptr tmp(o);
            swap(tmp);
         }

         return *this;
      }

      //=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=	
      // Performs a safe swap of two smart pointers.
      void swap(smart_ptr & o)
      {
         refcnt_t * pn = pn_;
         ptr_type * px = px_;

         pn_ = o.pn_;
         px_ = o.px_;

         o.pn_ = pn;
         o.px_ = px;
      }

      //=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=	
      // Resets the current smart pointer. If a new pointer is provided
      // The reference counter will be set to one and the pointer will
      // be stored, if no pointer is provided the reference counter and
      // pointer will be set to 0, setting this as a null pointer.
      void reset(ptr_type * px = 0)
      {
         smart_ptr o(px);
         swap(o);
      }

      //=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      // Returns a reference to the object pointed to
      ptr_type & operator * () const { return *px_; }

      //=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      // Invokes the -> operator on the pointer pointed to
      // NB. When you call the -> operator, the compiler automatically
      //     calls the -> on the entity returned. This is a special
      //     case, done to preserve normal indirection semantics.
      ptr_type * operator -> () const { return px_; }

      //=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=	
      // Get the pointer being managed
      ptr_type * get() const { return px_; }

      //=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      // Conversion to bool operator to facilitate logical pointer tests.
      // Returns a value that will logically be true if get != 0 else
      // a value that is logically false. We don't return a real
      // bool to prevent un-wanted automatic implicit conversion for
      // instances where it would make no semantic sense, rather we
      // return a pointer to a member function as this will always
      // implicitly convert to true or false when used in a boolean
      // context but will not convert, for example, to an int type.
      operator safe_bool_t () const { return px_ ? &smart_ptr::true_ : 0; }

   private:
      //=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      // A dummy member function used to represent a logically true
      // boolean value, used by the conversion to bool operator.
      void true_(){};

   private:
      //=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
      // Pointers to the object being managed and the reference counter
      ptr_type * px_;
      refcnt_t * pn_;

      //=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
   };

   // =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
   // Facility class to simplify the creation of a smart pointer that
   // implements a 'delete scalar policy'.
   template <typename ptrT>
   struct scalar_smart_ptr
   { 
      typedef smart_ptr<ptrT, delete_scalar_policy> type;
      private: scalar_smart_ptr();
   };

   // =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
   // Facility class to simplify the creation of a smart pointer that
   // implements a 'delete array policy'.
   template <typename ptrT>
   struct array_smart_ptr
   { 
      typedef smart_ptr<ptrT, delete_array_policy> type;
      private: array_smart_ptr();
   };

   // =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

}

You’ll see that I have fully documented the code with in-line comments so I won’t be repeating myself here regarding the specifics of how she’s implemented. Instead, let’s take a look at her in action.

#include 

using namespace devtools;

int main()
{
	// This smart pointer will manage a pointer to a scalar
	scalar_smart_ptr<int>::type pi_scalar(new int);

	// This smart pointer will manage a pointer to an array
	array_smart_ptr<int>::type pi_array(new int[10]);
}

Pretty simple eh? In fact, it’s as simple to use as auto_ptr except it solves both problems one and two discussed above, it can handle pointers to scalars and arrays and it can be copied as many times as you like without fear of the current pointer losing ownership.
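If you’d rather see the same ideas with an off-the-shelf pointer, here is a sketch that assumes a C++11 compiler and std::shared_ptr (which is not what the class above uses). The function names are invented for the example, and std::default_delete<int[]> plays the role of the delete array policy.

```cpp
#include <cassert>
#include <memory>

// Copying a reference counted pointer shares ownership and bumps
// the counter rather than transferring the resource.
long shared_owner_count()
{
   std::shared_ptr<int> a(new int(42));
   std::shared_ptr<int> b(a); // a and b now own the same int

   return a.use_count();
}

// An array needs array delete, supplied here as a custom deleter.
int first_array_element()
{
   std::shared_ptr<int> arr(new int[10](), std::default_delete<int[]>());
   arr.get()[0] = 5;

   return arr.get()[0];
}
```

Both owners see a count of two while they are alive, and the array is released with delete [] when its last owner goes away.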

So, what have we learnt? Well, heap memory management isn’t as simple in C++ as one would hope and the tools provided by the (current) C++ standard are wholly inadequate. Trying to write code without using smart pointers if you are allocating heap memory is a recipe for disaster (or at the very least a debugging nightmare waiting to happen).

Finally, we discovered there is a solution in the form of the reference counted smart pointer. The current C++ standard has no such concept right now, although it is coming. Meanwhile, you can use the excellent ones provided by Boost or roll your own; it’s not really that complicated.

Exceptions to exceptions

evilrix — Sun, 02 Dec 2012 16:01:56 +0000

Sometimes, just because you can do something doesn’t mean you should. Unfortunately, the C++ standards committee missed that memo when they ratified exception specifiers. To find out why, try this little quiz.

Question: This code contains a fatal flaw (literally), but can you see why?

void bar(); // declaration

void foo() throw()
{
   bar();
}

Answer: The call to bar() could cause the application to terminate.

Why?

The function foo() has an empty throw exception specifier, meaning it guarantees it will not percolate any exceptions. So far so good. Now, look at bar(). That doesn’t have a throw exception specifier, meaning it makes no guarantees about whether it will or won’t percolate any exceptions. So, what happens if bar() does percolate an exception?

Well, under normal circumstances the behaviour is very well defined; your application will be immediately terminated. That’s right, terminate! No passing go, no collecting your £200. It’s goodnight Vienna!

You see, a throw specifier doesn’t actually guarantee a function won’t percolate an exception. All it does is indicate that the function shouldn’t throw an exception. If an exception does percolate from the function the C++ runtime calls the unexpected() function. By default this function just calls the terminate() function, and it should be pretty obvious what that will do.

Now as it happens you can change this behaviour by making a call to the set_unexpected() function to register your own callback, but what exactly are you going to do? There is no way to just continue unwinding the stack as if nothing happened so you’re pretty stuck with the only sensible thing being to terminate the application (hence, this being the default behaviour).

Considering all this, why oh why bother using exception specifiers? They don’t actually guarantee anything. All they do is make your code unsafe as you are basically subscribing to a time-bomb waiting to happen!
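If a function really must not let anything escape, the only way to make that true is to catch inside it. The sketch below assumes a C++11 compiler, where noexcept is the successor to the empty throw() specifier; bar is given a hypothetical body that always throws, and safe_foo is a made-up name.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical implementation of bar() that always fails.
void bar()
{
   throw std::runtime_error("bar failed");
}

// The promise not to throw is kept by the try/catch, not by the
// specifier. Returns false if bar() failed.
bool safe_foo() noexcept
{
   try
   {
      bar();
      return true;
   }
   catch(...)
   {
      return false; // swallow and report, nothing percolates
   }
}
```

Without the try block the exception would hit the specifier and the program would be terminated; with it, the caller simply gets false back.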

NB. Some compiler vendors (Microsoft, I am looking at you) don’t actually implement the correct behaviour for exception specifiers. Basically, they just ignore them. Now, one might argue that’s probably not a bad thing but I disagree. Regardless of how stupid exception specifiers are, the fact is that at least the behaviour is well defined by the C++ Standard. A compiler doing its own thing means you don’t know what’s actually going to happen and this is worse! As far as I’m concerned, if a compiler isn’t standards compliant it is broken, end of!

Storing pointers in STL containers

evilrix — Sun, 02 Dec 2012 15:28:08 +0000

Need to store objects in an STL container? Need polymorphic behaviour meaning you’ll need to store pointers to a base class? Want to use a smart pointer to avoid memory leaks? Planning on using auto_ptr because it’s available as part of C++? Before you go any further, try this little quiz.

Question: I should use auto_ptr if I want to store pointers to heap allocated objects in STL containers, right?

Answer: No… no no no!

Why?

The C++ standard explicitly states that an element stored in an STL container must be “copy-constructible” and “assignable”. It also explicitly states that the behaviour of storing an auto_ptr in an STL container is undefined. Ok, fair enough… but why?

The auto_ptr has move semantics and not copy or reference semantics. This means that if you assign it to another auto_ptr, the new auto_ptr takes ownership of the resource being managed and the old auto_ptr has its internal pointer set to NULL. Things in an STL container have a habit of getting copied about (either explicitly or implicitly) and so the auto_ptrs you store in there end up being arbitrarily set to NULL.

If you need a good reference counted smart pointer that is safe to store in containers, take a look at shared_ptr, which is part of Boost.
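As a rough sketch of what that looks like, here is a container of reference counted pointers to a polymorphic base. It assumes a C++11 compiler and uses std::shared_ptr, which has the same interface as the Boost one for this purpose; shape, triangle, square and total_sides are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct shape
{
   virtual ~shape() {}
   virtual int sides() const = 0;
};

struct triangle : shape { int sides() const { return 3; } };
struct square : shape { int sides() const { return 4; } };

int total_sides()
{
   typedef std::vector<std::shared_ptr<shape> > shapes_t;

   shapes_t shapes;
   shapes.push_back(std::shared_ptr<shape>(new triangle));
   shapes.push_back(std::shared_ptr<shape>(new square));

   // Copying the container copies the pointers; the originals keep
   // their ownership instead of being reset to NULL.
   shapes_t copy(shapes);

   int total = 0;
   for(std::size_t i = 0; i < shapes.size(); ++i)
   {
      total += shapes[i]->sides();
   }

   return total;
}
```

Every element of the original vector is still valid after the copy, which is exactly what auto_ptr cannot promise.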

Unsafe use of smart pointers

evilrix — Sun, 02 Dec 2012 13:24:56 +0000

Ubiquitous use of smart pointers can prevent memory leaks and make for much easier to read and understand code. Unfortunately, as with most things C++, there are some caveats you need to be aware of otherwise your attempts to write robust code could very well come back to bite you. This little quiz shows how careless misuse of auto_ptr could open up a big can of worms.

Question: What’s wrong with this?

std::auto_ptr<int> pInt(new int[10]);

Answer: The result of auto_ptr deleting the memory allocated to it is undefined.

Why?

The auto_ptr calls non-array delete on memory allocated using array new. The C++ standard defines the result of this as undefined (even for intrinsic types, contrary to popular belief). It is important to match each allocator with its counterpart. Scalar new should always be paired with scalar delete and array new should always be paired with array delete. Since there is no array version of auto_ptr you cannot use it to manage arrays allocated from the heap.

Of course, the question is why would you even bother? The C++ standard provides you with the vector type, which is basically a managed array. The C++ Standard even goes as far as to explicitly guarantee that its internal memory layout (of its data buffer) is compatible with the C-style array.

NB. As of C++11 the same is true for the string type; however, prior to C++11 there were no guarantees placed on the internal memory layout of the string type.
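A quick sketch of using a vector where you might otherwise have reached for array new. The function sum_c_array stands in for any C-style API that expects a pointer and a count; both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a C-style function that expects a raw array.
int sum_c_array(int const * data, std::size_t count)
{
   int total = 0;
   for(std::size_t i = 0; i < count; ++i)
   {
      total += data[i];
   }

   return total;
}

int sum_with_vector()
{
   std::vector<int> v(10, 1); // ten ints, all set to 1

   // &v[0] points at a contiguous buffer laid out like int[10].
   // The vector must not be empty when you do this.
   return sum_c_array(&v[0], v.size());
}
```

The vector releases its buffer correctly when it goes out of scope, so there is no delete to get wrong.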

The dangers of iterators

evilrix — Sun, 02 Dec 2012 12:34:49 +0000

When working with STL containers we generally use iterators to access, manipulate and enumerate the contents. This almost becomes second nature and it’s very easy to go on auto-pilot and end up coding an innocuous looking bit of code that can contain a rather nasty surprise. This little quiz shows just one example of how such a surprise might come back to bite you.

Question: What’s wrong with this code?

typedef std::vector<int> vec_t;
vec_t v(100);
vec_t::iterator i = v.begin() + 50;
v.push_back(5);
(*i) = 5;

Answer: We’re trying to dereference an (almost certainly) invalid iterator.

Why?

When we create vector v it is pre-sized with 100 elements. Line 3 gets an iterator to the 51st element (begin + 50). We then add something to the vector and after that use the iterator to change the 51st element to have the value 5. So, what’s the problem?

The problem is that when we add a new item to the vector there is a chance its internal memory will need to be reallocated. This is because when we originally created the vector we asked for 100 items and by adding another item the vector will need space for 101 items. The only way it can make this space is by allocating more memory.

So far so good. The problem is that the C++ Standard guarantees the internal memory layout of a vector has to be compatible with the memory layout of a C style array. This means that when a vector needs to allocate more memory it needs to completely replace its internal buffer with a completely new one. Our iterator references the original buffer, which is semantically the same as having a dangling pointer – oops!

To avoid this memory reallocation, we could have reserved more memory for the vector before starting to manipulate it. Providing the manipulations don’t cause the internal memory to be reallocated the iterator will not be invalidated.

typedef std::vector<int> vec_t;
vec_t v;
v.reserve(200); // reserve enough memory for 200 items
v.resize(100); // pre-size vector to contain and use the memory for 100 items
vec_t::iterator i = v.begin() + 50;
v.push_back(5); // now we're using memory for 101 items out of 200
(*i) = 5; // since there was no memory reallocation necessary this is now safe
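To convince yourself that no reallocation took place, you can compare the address of the buffer before and after the push_back. The sketch below does that, and also shows an alternative that sidesteps the issue by remembering an index instead of an iterator. Both function names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the push_back left the buffer where it was and the
// write through the iterator landed on element 50.
bool push_back_kept_buffer()
{
   std::vector<int> v;
   v.reserve(200);
   v.resize(100);

   int const * before = &v[0];
   std::vector<int>::iterator i = v.begin() + 50;

   v.push_back(5); // capacity is at least 200, so no reallocation
   *i = 5;

   return before == &v[0] && v[50] == 5;
}

// An index stays meaningful even if the buffer does move.
int write_via_index()
{
   std::vector<int> v(100);
   std::size_t const idx = 50;

   v.push_back(5); // may reallocate, the index does not care
   v[idx] = 5;

   return v[idx];
}
```

Reserving up front keeps iterators alive only while you stay within the reserved capacity; the index approach works regardless.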

How to add properties to standard C++ classes

evilrix — Sun, 02 Dec 2012 06:18:29 +0000

One feature missing from standard C++ that you will find in many other Object Oriented Programming languages is something called a Property. These are like data members except they can have preconditions imposed on them prior to getting or setting their value.

In C++ the general way to implement property like behaviour is to have a getter and/or setter member function. For the most part this suffices but there is an issue with this approach: you lose the syntax and semantics of a data member and, instead, have to deal with the syntax and semantics of a member function.

What do we mean by this? Let’s take a very simple example class called “account” that contains an int called “balance_”.

class account
{
public:
   int balance_;
};

As it currently stands “balance_” is a public data member. Although this gives us access to “balance_” it’s uncontrolled – no preconditions can be imposed. This is bad OOP design. It means “account” has no control over the value of “balance_” and so cannot guarantee the value is sane. In other words we could set balance_ to any rogue value that may or may not be appropriate for what it represents. Let’s make a change to ensure this is no longer the case.

class account
{
private:
   int balance_;
};

That’s it, now “balance_” is private so only “account” can change it. Of course, this isn’t much use if we do want the outside world to change the value of “balance_”. What we need is a way to get and set the value but in a way that “account” can ensure things are sane. Enter the getter and setter function.

class account
{
public:
   int get() const
   {
      return balance_;
   }
   
   void set(int balance)
   {
      balance_ = balance;
   }
private:
   int balance_;
};

Now we have functions getting and setting “balance_”, which means we can put additional code in there to ensure, for example, that when we set “balance_” it cannot be negative (no one wants one of those!). Let’s do just that.

class account
{
public:
   int get() const
   {
      return balance_;
   }
   
   void set(int balance)
   {
      if(balance < 0)         
      {
         // ERROR!
      }
         
      balance_ = balance;
   }
private:
   int balance_;
};

Great, finally a class that contains a member that we can get and set but in a controlled way. It’s all good right? Well, yes and no. You see we are now stuck with using function syntax and semantics.

This presents two issues:

Syntax: When writing generic code we need to rely on a class implementing a get or set method; if it doesn’t the code won’t build.
Semantics: You can’t freely use get and set functions in an expression.

Let’s deal with syntax first. We are going to write a generic function that takes an object that models the pair concept and sets both their values (first and second) to a value.

template <typename pairT>
void func(pairT & mypair)
{
   mypair.first = 320;
   mypair.second = 240;
}

This will work very well with std::pair.

std::pair<int, int> mypair;
func(mypair);

Let’s say we wanted to implement our own pair object so we can implement some sanity checks. We could do this by aggregating std::pair.

class foo_pair
{
public:
   int get_first() const
   {
      return mypair_.first;
   }
   
   int get_second() const
   {
      return mypair_.second;
   }

   void set_first(int val)
   {
      if(val < 1 || val > 100)
         throw std::invalid_argument(
            "value must be between 1 and 100"
         );
         
      mypair_.first = val;
   }

   void set_second(int val)
   {
      if(val < 1 || val > 100)
         throw std::invalid_argument(
            "value must be between 1 and 100"
         );
         
      mypair_.second = val;
   }
private:
   std::pair<int, int> mypair_;
};

Now, let’s try using this code with our function…

foo_pair mypair;
func(mypair);

What’s happened here? Simple. It won’t compile because the function doesn’t know it has to call set_first and set_second. Major fail!

Now, let’s take a look at the issue of semantics. Consider this small code expression…

int x;
int y;
int z = x = y = 10;

If x and y were objects that represented special ints (a bounded int maybe) and implemented a set method, would this still work? Of course not.

foo x;
foo y;
int z = x.set(10) = y.set(10); // this makes no sense

Let’s see what we must do to make it work.

foo x;
foo y;
x.set(10);
y.set(10);
int z = 10;

Notice how we are forced to use functions, which breaks the nice free semantics of using assignment operators? Even if we make set return the value it has just set this still doesn’t work very well.

foo x;
foo y;
int z = x.set(y.set(10));

It’s just ugly and non-intuitive.

I hope that I’ve managed to demonstrate that although the getter and setter functions do serve a purpose they are a poor replacement for direct access to a variable. At this point I hope you are wondering, “great, how would a property help us”? Let’s find out!

First, to give us a starting point, let’s take a look at how a property could be defined using a language that has native support. We will use C# and re-implement the foo_pair class.

class foo_pair
{
   public int first
   {
      get
      {
         return first_;
      }
      
      set
      {
         if(value < 1 || value > 100)         
         {
            // ERROR!
         }
            
         first_ = value;
      }
   }
   
   public int second
   {
      get
      {
         return second_;
      }
      
      set
      {
         if(value < 1 || value > 100)         
         {
            // ERROR!
         }
            
         second_ = value;
      }
   }
   
   private int first_;
   private int second_;
}

How neat is that? This implements a “first” and “second” property. Notice how each property has a get and set clause? You can implement either or both of these to make a property read/write, read-only or write-only. In this case we’ve implemented both so the property is read/write. Implementing only get makes it read-only and implementing only set makes it write only. As you can see the set clause can perform validation.

Notice there is no function syntax so there is no actual value passed in? Instead C# provides access to a special implicit variable called “value”, which will contain the actual value being set. The property allows for the sanity checks whilst preserving data member syntax and semantics.

So, the question is, can we do this in C++? The answer is sort of. The fact is there is no specific native support for properties in standard C++ (although some compilers do have vendor specific extensions) but the language does give us the tools we need to implement our own. Of course, we will never get the nice neat solution like we have in C# because properties are not part of the C++ language but we can get a reasonable facsimile.

Let’s look at the code that will allow us to implement properties in C++.

#define property_(TYPE, OWNR, NAME, IMPL) \
   private: \
      class NAME ## __ \
      { \
      friend class OWNR; \
      public: \
         typedef NAME ## __ this_type; \
         typedef TYPE value_type; \
         NAME ## __ () {} \
         explicit NAME ## __ (value_type const & value) \
            : NAME(value) {} \
         IMPL \
      private: \
         value_type NAME; \
      }; \
   public: \
      NAME ## __ NAME;

#define get_ \
   operator value_type const & () const \
   { \
      return get(); \
   } \
   value_type const & get() const

#define set_ \
   this_type & operator = (value_type const & value) \
   { \
      set(value); \
      return *this; \
   } \
   void set(value_type const & value)

#define xprop_(NAME) \
   NAME . NAME

Pretty scary eh? That’s ok – it’s macros and they normally are! This article isn’t going to teach you how to read or write macros but it’s not that hard and a good tutorial will explain all you need to know to follow this code.

So, how does this work? We have four macros to facilitate adding a property class to a host class. All we do is use these macros to both define and implement a property.

Let’s take a look at what each macro does:

property_ : this macro defines a property, which is really nothing more than a local class with the cast and assignment operators defined to call ‘get’ and ‘set’ respectively

This macro takes the following parameters:

* TYPE – this is the type of the property (for example int)

* OWNR – the host (parent) class name, required to be made a friend of the property allowing unprotected access

* NAME – this is the name of the property (for example “first” or “second” as per the example above)

* IMPL – this is the code that defines get and/or set; add either or both of the get_ and set_ macros here

get_ : this macro defines the (const) get clause – you need to add your own logic to specialise its behaviour between a { and a }

set_ : this macro defines the set clause – you need to add your own logic to specialise its behaviour between a { and a }

xprop_ : this macro allows the parent class to access the raw, unchecked value, bypassing the get and set methods
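If macros make your eyes glaze over, it may help to see roughly what a single property amounts to when written out by hand. This is only a sketch of the idea, not the exact expansion: bounded_pair and first_property are hypothetical names (the macro would generate first__), the default constructor here initialises the value to 1, and the setter throws rather than leaving the error handling as a comment.

```cpp
#include <cassert>
#include <stdexcept>

class bounded_pair // hypothetical host class
{
private:
   class first_property
   {
      friend class bounded_pair;

   public:
      typedef first_property this_type;
      typedef int value_type;

      first_property() : first(1) {}
      explicit first_property(value_type const & value) : first(value) {}

      // What get_ provides: a conversion operator that calls get
      operator value_type const & () const { return get(); }
      value_type const & get() const { return first; }

      // What set_ provides: an assignment operator that calls set
      this_type & operator = (value_type const & value)
      {
         set(value);
         return *this;
      }

      void set(value_type const & value)
      {
         if(value < 1 || value > 100)
            throw std::out_of_range("value must be between 1 and 100");

         first = value;
      }

   private:
      value_type first; // the raw value, reachable only by the host
   };

public:
   first_property first;
};
```

With this in place `int z = p.first = 42;` compiles and validates, because the assignment goes through set and the read goes through get.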

The astute reader may note that some operator semantics such as ++ and += will not work (my thanks to EE’s Dan Rollins for pointing that out). These can be added after the get and set macros in the property definition, as required. When you add them make sure they call set to ensure proper value validation. We could trivialise this by defining additional macros for the operators we wish to support. I’ll leave that as an exercise for the reader.

The solution presented here tries to find a balance between simplicity and safety to provide property semantics – without proper language support there will always be a compromise we will need to make.

That’s it – pretty simple really. Here is the canonical form for using these macros to add a property to a class.

#include 

class foo_pair
{
public:
   foo_pair() :
      first(9), second(5) // initialising properties (optional, if not used they will default construct)
      {}

   // Implement get, set and ++ operators
   property_(

      int, foo_pair, first, // int foo_pair::first

      get_
      {
         return first;
      }

      set_
      {
         if(value < 1 || value > 100)         
         {
            // ERROR!
         }

         first = value;
      }

      this_type & operator ++() { set(first +1); return *this; }

      this_type operator ++(int)
      {
         this_type tmp(first);
         set(first +1);
         return tmp;
      }

   );

   // Implement get, set and += operator
   property_(

      int, foo_pair, second, // int foo_pair::second

      get_
      {
         return second;
      }

      set_
      {
         if(value < 1 || value > 100)         
         {
            // ERROR!
         }

         second = value;
      }

      this_type operator += ( int rhs )
      {
         set(second + rhs);
         return *this;
      }

   );

private:
   void bypass() // for internal use by foo_pair only!
   {
      // accessing property members, circumventing set and get
      xprop_(first) = 0;
      xprop_(second) = 100;
   }
};

template <typename pairT>
void func(pairT & mypair)
{
   mypair.first = 320;
   mypair.second = 240;
}

int main()
{
   foo_pair mypair;

   func(mypair); // Look, we can now call the generic function

   int z = mypair.first = mypair.second; // Look, we can now perform expressions using variable and not function syntax

   // Some additional operator manipulation
   mypair.first++;
   ++mypair.first;
   mypair.second+=10;
}

So, that’s it. With a few simple macros we’ve managed to add properties to C++. Ok, strictly speaking we’re just using some macro trickery to add member classes with some specialised behaviour, but this is more or less what a property is anyway. These macros just simplify the process of adding the necessary boilerplate to the class to implement the property.

Is there a cost to using this code? Not really. Everything is passed around by reference and the size of the property class should be no larger than the variable it represents since it has no other members. As there is no dynamic polymorphism involved the compiler should be able to inline most (if not all) of this and, thus, optimise away the function calls involved. Of course, if the get or set are too complex this may not be the case but, then, this would be no different from implementing get or set functions anyway.

Even if you decide using this code isn’t for you, I do hope reading this article has, at least, given you an insight into another aspect of OOP development and design – one that is, sadly, missing from the current version (C++03) of C++.

IRC (Internet Relay Chat) for absolute beginners

evilrix — Sun, 02 Dec 2012 05:58:57 +0000

Introduction

This article is aimed at someone who has never used IRC before. It covers the very basics of setting up and configuring a client and a little bit about starting a channel and the basics of being a channel operator. What it doesn’t do is review or promote any of the various IRC clients available. Nor does it cover power users; however, suitable reference material is suggested at the end for eager readers.

Also note, there are many IRC clients available and all are different. The details given in this article are completely generic, providing commands that should operate on all IRC clients.

So, firstly, what is IRC and why would you want to use it? Well, put simply IRC is the daddy of the various Instant Messaging (IM) services we use today. It’s been around for almost as long as The Internet. The main difference between IM and IRC is that IM is mainly focused on one to one conversations, like a telephone call, whereas IRC is mainly focused on group conversation, just like a meeting.

The main benefit of IRC over IM is the fact it is collaborative. This makes it ideal for companies to implement as a communications medium for teams that may be located on different sites. Each team can join their own specific channel. If teams need to discuss sensitive information these channels can be password protected so no one else can join without being invited to do so.

With IRC you join a channel and when you do you are able to talk on that channel with everyone else who is on it. A channel is basically like a communal room, where people can pop in and have a chat. There are infinite channels; all you have to do is come up with any name you like as long as it starts with a ‘#’ and join it. If it doesn’t already exist, it will be created for you.

Unlike IM, you don’t have to be invited to a channel to join in; anyone (with the correct password if it’s key protected) can pop in. You can, if you wish, start a one to one conversation with another IRC user, but normally you join a channel first. It’s also possible to send files to each other over IRC, along with some other interesting client to client actions, using a special protocol called DCC, but that will not be discussed here.

Getting Started

So, how do you use IRC? Well, I’m glad you asked or this would be a very short article! To get started you need two things. First, you need an IRC server to connect to. The fact you are reading this article suggests you already have this. Second, you need an IRC client. There are plenty available for all Operating Systems so a good start would be to check Google. The right IRC client is a personal choice so I suggest trying a few before you settle on one.

So, assuming you now have your IRC client and are ready to connect to the server, let’s begin. When you start your IRC client for the very first time there are a few things you’ll need to configure.

Server/host: You will need to provide the IP address or the host name of your IRC server

Port: IRC Servers normally serve IRC from port 6667 and most clients will default to using this unless you change the setting. If your server uses a different port you will need to change the standard configuration of your client.

Ident: Ident is a protocol that allows a server you are connecting with to query your machine to find out your true identity. IRC servers often use this to confirm you are not trying to log in as a privileged user (such as root, it’s a Unix thing!) or as someone who doesn’t have permission to use the server. This is mainly a historical option, rarely used anymore, yet some IRC servers will refuse connection unless the service is available. Others will retry a number of times before letting you in anyway, but this can make connecting very slow. This being the case, most IRC clients have the ability to act as an ident server. Generally, if your IRC client has an ident setting, just enter your normal username. Once this is set you can generally forget about it.

Username: Every IRC user needs a unique username. You can make up anything you want but, generally, the name can be no longer than 9 standard ASCII characters. If a username is already taken you will be told by your client to choose a different one before you can connect to the IRC server.

Those are the minimum settings you’ll need to set to get connected. Other settings you might have will probably pertain to firewalls, such as SOCKS settings, or be client specific. These will not be discussed here as they are IRC client specific.

So, get connected! On most GUI clients there is a button you click to connect. If not, use the /connect command to connect to the server.

/connect <server>[:port]

Piff… pufff. pooof… and if all has gone well you will be presented with connection details and then the MOTD!!!

Message Of The Day

OK, so you configured your client and connected to your server. Now what? Well, the first thing you should do is read the Message Of The Day (MOTD). This will provide you with all the details you need to know about using this server, including what is and isn’t allowed and who to contact if there are any problems. In case you missed the MOTD you can request it again using the /motd command.

/motd

Note that if you fail to comply with the restrictions of using the server your host will probably be k-lined. Basically, this is a setting an IRC Oper (an IRC Server Operator) uses to block you from connecting. IRC Opers tend to be pretty touchy, think BOFH and you know where I am coming from!

Changing Your ‘Nick’

Are you happy with your username, or ‘nick’ as it’s known in IRC parlance? Well, if not you can change it using the /nick command.

/nick <nickname>

Let’s change our username to ‘expert’.

/nick expert

You should now see your name change. If not, it’s probably because it’s already in use and you’ll get an error message telling you so. You’ll have to choose something else. Remember, most IRC servers restrict usernames to 9 characters and only printable 7 bit ASCII values – this is because IRC was conceived back in the days when no other type of character set existed.

Note that when you change your username you are only changing the nickname that other users refer to you as. You are not changing your host or ident details, so the server will still, for example, enforce any bans against you on a channel. The only way to change your host is to connect from a different hostname or IP address.

Private Messaging

The main reason to use IRC is to collaborate on a channel but before we get onto that let me just show you how simple it is to start a private discussion with another user. To do this you just use the /msg command.

/msg <nickname> {message}

So, let’s send our friend sue a hello greeting.

/msg sue Hello sue, I’m now online.

When you issue this command Sue will see your message (assuming she is online). It’s up to the IRC client to decide how this message is delivered but nearly all GUI clients will open a new window that she can then type into to respond to you. Likewise, if you are using a GUI client your client will probably open a new window too for you to type into so that you and Sue can have a one-2-one chat. This chat is private (but not secure unless you are communicating over a secure protocol such as SSL) and cannot be seen by anyone else.

Joining A Channel

Right, so MOTD read and we’ve said hi to Sue. Maybe we should think about joining a channel? There would be little use in using IRC if we didn’t.

To join a channel we need to issue a client to server command. There are a number of basic commands you will need to learn when using IRC. Although most of these commands can be carried out via the GUI of your client, each client does it differently and it’s far easier to remember the few simple commands. There aren’t many so it’s not that big of a deal. All IRC commands start with a forward slash ‘/’ and the one we’re interested in right now is the ‘/join’ command. The syntax is as follows:

/join <channel>{,<channel>} [<key>{,<key>}]

The channel parameter is the name of the channel you are going to join. All channels in IRC start with a hash (or pound as it’s sometimes called — I’m from the UK so it is definitely a hash!) ‘#’. You’ll also see an optional password can be provided. This is for when a channel requires a password (or key) to join. Let’s try and join a channel. For example, we might try and join the #ee channel.

/join #ee

Now, on execution of this command one of three things will happen:

1. The channel currently doesn’t exist so you have just created it and become a channel operator (more on that in a bit)

If you are the first person to join the channel you will create the channel and become a channel operator. As a channel operator you are in control of the channel. This includes being able to set the channel topic, set a password to prevent unauthorised access, kicking out others who are being a nuisance and banning people who you no longer wish to have access to the channel. These various privileges will be discussed shortly.

2. The channel exists and you have joined it

If there are already people in this channel you join it as another normal user. Providing the channel hasn’t been assigned special settings, such as users requiring a “voice”, you will be able to chat to the other members.

3. The channel exists but it requires a password to join (mode +k, more on this shortly)

Unless you know the password, which you append to the end of the join command, you cannot join a password protected channel without an invite from a member already in that channel.

Channel And User Modes

Channels and users can be assigned certain privileges, known as modes. Below is the list as specified by the IRC standard.

Assigning a channel mode

/mode <channel> {[+|-]|o|p|s|i|t|n|b|v} [<limit>] [<user>] [<ban mask>]

           o – give/take channel operator privileges;

           p – private channel flag;

           s – secret channel flag;

           i – invite-only channel flag;

           t – topic settable by channel operator only flag;

           n – no messages to channel from clients on the outside;

           m – moderated channel;

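           l – set the user limit to channel;

           b – set a ban mask to keep users out;

           v – give/take the ability to speak on a moderated channel;

           k – set a channel key (password).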

Assigning a user mode

/mode <nickname> {[+|-]|i|w|s|o}

           i – marks a user as invisible;

           s – marks a user for receipt of server notices;

           w – user receives wallops;

           o – operator flag.

As you can see there are a plethora of modes that can be set for channels and users; however, only a few are really of interest to the beginner but you can find more information by looking at the technical specification, RFC1459, for IRC. Some of these will be covered in the rest of this article.

Channel Operators

So, you connected to an IRC server, joined a channel and discovered you’re the channel operator! Arrrrg! OK, don’t panic. Breathe, relax, let me guide you…

If you have become a channel operator (either because you started the channel or another user assigned you the privileges) there are a number of additional things you can do above and beyond a normal channel member. These will now be discussed briefly.

As discussed above, channels and users can be assigned ‘modes’. These are single letter characters that assign properties to that user or channel. Users who are channel operators are added to the channel’s mode ‘o’ list, which is a list of users for that channel who are operators. The operator flag is channel specific so being an operator on one channel does not automatically make you an operator on another channel. When you are a channel operator you are, basically, in control of the channel. There are a number of things a channel operator can do but what follows are the most basic and useful actions:

Op a user.

You can assign other trusted users the channel operator privileges. You do this using the mode command, applying a +o flag to the channel for their username. So, let’s say the user jack is someone we trust. Let’s op him on channel #ee.

/mode #ee +o jack

Most IRC clients implement a shortcut to this in the form of the /op command.

/op <nickname> [channel]

Deop a user.

You can also deop someone by removing the o flag.

/mode #ee -o jack

Notice how we use a + prefix to add a flag and a - prefix to remove a flag?

Most IRC clients implement a shortcut to this in the form of the /deop command.

/deop <nickname> [channel]

Kick a user.

If a user is disrupting the channel you can kick them out using the /kick command.

/kick <channel> <nickname>

Ban a user.

To ban a user you add a mask for their host to the channel’s ban list. Users are banned by mask to prevent them from trying to rejoin by just changing their username. The idea is you take their IRC host string and use wildcards to build a mask. You can get their host name by using the /whois command.

/whois <nickname>

You set a ban using the mode command, applying a +b flag to the channel for their host mask. So, let’s say the user lamer is logged in from IRC client someclient@somedomain.com. Their full host name would be lamer!someclient@somedomain.com, so a suitable host mask would be something like *!*@somedomain.com. Let’s ban them from channel #ee.

/mode #ee +b *!*@somedomain.com [reason]

Most IRC clients implement a shortcut to this in the form of the /ban command.

/ban <nickname> [channel] [reason]

If that command is run on the channel the user will normally be kicked and banned.
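For the programmers reading, the mask-building step described above is simple enough to sketch in a few lines of C++. This is only an illustration: banMaskFor is a made-up helper, not part of any IRC client, and it assumes the host string has the nick!user@host shape used in the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: turns "nick!user@host" into "*!*@host".
// Returns an empty string if there is no '@' to split on.
std::string banMaskFor(std::string const & hostString)
{
   std::string::size_type const at = hostString.find('@');
   if (at == std::string::npos)
   {
      return std::string();
   }

   // Wildcard the nick and user parts, keep everything from the '@' onwards
   return "*!*" + hostString.substr(at);
}
```

Wildcarding both the nick and the user part is the broadest choice; a narrower mask would keep more of the original string.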

Losing your voice

Channels can be moderated so only users with a ‘voice’ can speak. Users that do not have this flag are unable to send messages to the channel. To set the channel as moderated we need to add the +m flag to the channel’s list of flags.

/mode <channel> +m

Again, you can remove this flag using the same command but with a ‘-’ rather than a ‘+’.

To give the user a voice assign them a +v flag for that channel.

/mode <channel> +v <nickname>

Set a channel password

You may wish to restrict access to your channel. You can do this by assigning a password (or key as it’s actually known). We do this by assigning a +k flag to the channel. Let’s password protect #ee with the password “askme”.

/mode #ee +k askme

It’s A Security Risk Though Right?

IRC has a pretty bad reputation for being an insecure protocol and to some extent this is justified. However, this is like blaming a hammer for hitting your thumb. Ultimately, as long as you are sensible and follow common sense when using IRC you are no less secure than, for example, using IM or e-mail. So, never accept any files being sent to you over IRC unless they are being sent from a trusted source and you have previously agreed to receive them. Always make sure you are running an up-to-date anti-virus product and always ALWAYS scan any files before opening them.

Even this won’t guarantee you won’t fall victim to malware so if possible just never accept files! Ensure you have a reputable firewall set up. Don’t just rely on your router’s hardware firewall. This will protect you from any external nasties but if just one of your machines gets infected with a worm it’ll travel around your internal network in seconds, infecting all your machines! Never give out your personal or private details to anyone over IRC, including credit card details. As mentioned before, IRC is not a secure protocol and everything is sent in plain text. Anyone Packet Sniffing your network may see everything you type and send to the server.

The Gentleman’s Guide To Complaining

evilrix — Sun, 02 Dec 2012 05:53:45 +0000

Complaining. It’s easy right? Anyone can do it. You just raise your voice and talk loudly, or maybe even shout at the object of your frustration until your problem gets sorted. If that is all it takes, then that’s it, end of article. Wow, that was easy!

OK, if you read the first few lines and agree then this article is for you. Trust me, you will get nowhere fast if that’s your tactic.

A few things I should say first. I do not promise if you complain in the way I suggest here that you will always get your own way. You probably won’t but you will stand a better chance and, more importantly, you’ll lose less hair (and self respect). Also, I draw some references to UK law, I’m sorry but I don’t know how your domestic law operates, so you’ll have to go figure that bit out yourself. For most of the article I use a situation in a restaurant as an example, but the same rules apply to any complaining you need to do. Finally, I am not a management consultant guru type (can you tell?) so all I describe here is what I’ve found works for me so I am willing to share. Your mileage may vary!

How should we complain? Well, let’s take a step back first and ask ourselves a simple question. Do I have the right to complain? I mean, say you buy a £10 pair of running shoes and after 3 months they have fallen to bits. Do you think you got value for money? What about if you spent £300 and they fell to bits after 2 weeks? Do you think you then have a right to complain?

The point is be reasonable. If you’re going to complain about something be sure that your original expectations were realistic. If you’re going to demand something be sure that what you’re demanding isn’t unreasonable. If you order food in a restaurant and you ate it don’t expect to get the food for free when you, afterward, complain it wasn’t up to scratch. That’s just not reasonable. You ate it, you pay for it. Simple!

We’ve established we have a valid reason to complain. We go and hurl abuse at the appropriate person and they fix it, right? OK, let’s try it. You’re in a restaurant (yeah, you’re hungry again!) and the waiter brings you food that isn’t quite up to scratch. Yell at him, go on… it’ll be fun to humiliate him. After all, it’s his fault he deserves it. There you go, off he trots back to the kitchen and within 5 minutes he’s back. Your new plate of food looks yummy, I bet you won’t even taste where he spat in it because he thinks you’re a jerk!

OK, how about we try a different approach? Firstly, the waiter is just doing his job. He didn’t cook it, he’s just bringing it to you. Sure if it looks crap maybe he could have told chef but, come on, you’ve seen Gordon Ramsay (right?), would you want to tell him he’s serving crap food if you worked for him? No, of course you wouldn’t (well, I wouldn’t!). Also, this poor waiter’s probably been on his feet for about 6 hours now without a break and he’s still smiling and being attentive to you.

So how about this? How about we smile back at him, thank him for bringing you the food but point out to him that, for whatever reason, you can’t accept it. Explain to him why, ask him to return it to the kitchen for a replacement. Thank him. If there is an issue, if the waiter says he can’t return it that’s fine. He’s still just doing his job. It’s not his fault, he’ll get shouted at by chef if he returns it! Remember, don’t blame the messenger if they bring you bad news! Ask to speak with the manager. At this point, you’re talking to someone who doesn’t have the authority to make a decision so take it to someone who can.

The waiter wanders off and gets the manager. Now we have the right person to yell at. Sure, we could but let’s not forget the manager also has the authority to just throw us out. We’re hungry, we don’t want that. When the manager comes up to the table, again, explain to him (or her) why you wish to return the food to the kitchen. Be polite, keep smiling but… and this bit is important… be firm. Make it clear you are not prepared to accept this plate of food but you are willing to accept a replacement, after all you’re not being unreasonable you just want what you’re paying for.

At this point, as long as you’ve been polite you’re going to get new food. And since you’ve been so nice about it I promise it will be spit free. The manager is not going to argue over a plate of food, as long as you’ve been reasonable, at the risk of causing a scene. Of course if he does, ask yourself, “do I really want to eat in this restaurant when there are so many others I could eat in?”. Get up and politely inform them you are leaving and you are not paying. In the UK you only have to pay for what you eat (assuming you are rejecting the food due to a problem), so just go elsewhere that will appreciate your custom (and good manners). If your law differs you may have to find a different resolution… but I’d be very surprised if it does.

What else do we need to know about complaining? Well, as well as making sure what you’re complaining about is reasonable it also helps to know what your rights are. For example, a lot of shops will try and tell you that once you’ve purchased something any problems must be taken up with the manufacturer. Well, this may or may not be the case where you live but in the UK your contract is with the vendor so it is down to them to repair or replace your faulty item. It is down to them to send it away for repair and it is down to them to bear the full cost of doing so.

What about if you return something and it’s faulty? Well, again, this depends upon your domestic law but in the UK for the first 6 months after purchase it is up to the shop to prove the item wasn’t faulty when you purchased it and not down to you to prove it was. Make sure you tell them you know this, most shop assistants either hope you don’t or don’t even realize it themselves.

Again, be polite but be firm and if necessary demand to speak with the manager. I once refused to leave a shop where I was returning a faulty mobile phone because I knew my legal rights and I knew they had no choice but to replace it. The manager and I were at loggerheads for nearly three hours and all I did was politely but firmly repeat my legal rights to him over and over. Eventually, I ground him down and I got the replacement phone and some free gifts too. It pays to be nice.

Successful complaining can be summarized as three key things.

Be reasonable, don’t set your expectations higher than you should
Be nice (smile!), never swear or be rude — but do be assertive
Know your rights before you complain otherwise you will probably just be fobbed off

The art of complaining is all about winning over the person you are complaining to. It’s almost like a sales pitch, you need to convince them that you are right and that your expectations are reasonable and, thus, they should be met. If you are instantly dislikeable due to a poor attitude you will not win, even if you should (especially when being rude to waiters).

Peace.

Definition or a declaration?

evilrix — Sun, 02 Dec 2012 05:47:57 +0000

The C++ language is a context sensitive language, which means a compiler cannot always decide the semantics of a line of code in isolation. Sometimes, though, it is impossible for the compiler to make up its mind so it just guesses. Yup, that’s right, it guesses. To find out more try this little quiz.

Question: Is the following a definition or a declaration?

Foo f(Bar());

Answer: It could be either!

Why?

More specifically, it could be either a function declaration or an object definition:

A declaration of a function f that returns type Foo and takes one parameter: a (pointer to a) function that takes no arguments and returns type Bar
A definition of an object f of type Foo, which has a constructor that takes type Bar.

The problem is the syntax for both is identical so to resolve this problem the C++ standard states that a compiler must prefer function declarations to object definitions where it is unable to make a distinction! This can make for some rather entertaining compile time errors when you think you are creating an instance of an object and the compiler is convinced you are declaring a function.
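A small sketch may make this more concrete. The Foo and Bar types below, along with makeBar, viaFunction and viaObject, are invented purely for illustration. The interesting part is that the quiz line really is accepted as a function declaration, which is why the function definition that follows it matches, and that an extra pair of parentheses is one common way to force the object definition instead.

```cpp
#include <cassert>

struct Bar {};

struct Foo
{
   explicit Foo(Bar const &) : value(42) {}
   int value;
};

Bar makeBar() { return Bar(); }

// The line from the quiz: read by the compiler as a function declaration
Foo f(Bar());

// ...so this is the definition of that very same function. The parameter
// "function returning Bar" is adjusted to "pointer to function returning Bar".
Foo f(Bar (*make)())
{
   return Foo(make());
}

int viaFunction()
{
   return f(makeBar).value; // f is called like any other function
}

int viaObject()
{
   // The extra parentheses can't be part of a parameter declaration,
   // so this can only be the definition of an object called g
   Foo g((Bar()));
   return g.value;
}
```

If you meant to create an object, the parenthesised form used in viaObject() is the one you want.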

Order of initialization

evilrix — Sun, 02 Dec 2012 05:34:16 +0000

In general, it’s pretty obvious what order the compiler will initialize variables: it’s the order in which they appear in the translation unit. What happens, though, when you have a global variable in one translation unit depending on the initialization of a global variable in another translation unit? This little quiz explores just that.

Question: What is the result of the following code?

// Translation unit "foo.cpp"
int const FOO = 1;

// Translation unit "bar.cpp"
#include "foo.h" // declares int FOO

int const BAR = FOO;

Answer: The result is undefined.

Why?

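The order in which global objects defined in different translation units are initialized is not specified by the standard. If BAR in bar.cpp is initialized before FOO in foo.cpp then the value of BAR will be undefined. In fairness, a compile-time constant such as the 1 used here is set up before any run-time initialization takes place, so this problem really bites when the initializer has to execute code, such as a function call or a class constructor.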

Can we resolve this? Well, yes!

There are many ways to solve this issue but by far the simplest is to use the Meyers Singleton pattern (named after Scott Meyers, the acclaimed author of Effective C++). This relies on the fact that although we can’t control the order of initialization of global objects across translation units, we are guaranteed that a static variable declared inside a function is initialized the first time that function is called. If the function then returns that static variable, even to a caller in a different translation unit, the object being referenced will be fully constructed.

This is easier to understand with a simple code example:

// Translation unit "foo.cpp"
int const FOO() { static int const FOO_ = 1; return FOO_; }

// Translation unit "bar.cpp"

#include "foo.h" // declares int FOO()

int const BAR = FOO();

Of course this is a very simplistic example using an int, but this principle works with all types including user defined objects (class instances).
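To show the same idea with a class instance, here is a minimal sketch. The Settings class, the settings() accessor and the GREETING global are all hypothetical names, and everything is squeezed into one file for brevity; in real code the accessor and the global that uses it would live in different translation units.

```cpp
#include <cassert>
#include <string>

// Hypothetical user defined type with a non-trivial constructor
class Settings
{
public:
   Settings() : name_("default") {}
   std::string const & name() const { return name_; }

private:
   std::string name_;
};

// Imagine this in "settings.cpp": the static is constructed the first
// time the function is called, no matter who calls it or when
Settings & settings()
{
   static Settings instance;
   return instance;
}

// Imagine this in another translation unit: a global whose initializer
// needs a fully constructed Settings object
std::string const GREETING = "hello " + settings().name();
```

Returning a reference means every caller shares the one instance, which is what makes this a singleton.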

When simple arithmetic isn’t so simple!

evilrix — Sun, 02 Dec 2012 05:25:25 +0000

What could be as simple as incrementing a variable by one? Ignoring overflow, what else could possibly go wrong? As it turns out, quite a lot as this little quiz demonstrates.

Question: What is the value of a and i after this code runs?

int i = 0;
char a[2] = { 0 };
a[i++] = ++i;

Answer: The behaviour of the code is undefined.

Why?

Under the C++03 rules, a variable cannot be modified more than once in an expression unless the modifications are separated by a sequence point. A sequence point (of which there are 6 in standard C++) guarantees that all the side effects of the previous expression are complete and no side effects from subsequent expressions have been performed. Since i is modified twice in this expression, with no sequence point in between, the behaviour is undefined.
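The cure is to give each modification a statement of its own. The sketch below picks just one possible ordering (increment, store, increment); the original line doesn’t tell us which ordering was intended, so treat incrementInSteps as an assumption made for the sake of illustration.

```cpp
#include <cassert>

// Hypothetical rewrite of the quiz code with every side effect
// separated by the end of a statement (a sequence point)
int incrementInSteps(char (&a)[2])
{
   int i = 0;
   ++i;                          // first modification: i == 1
   a[i] = static_cast<char>(i);  // i is only read here: a[1] == 1
   ++i;                          // second modification: i == 2
   return i;
}
```

It is more typing, but there is now only one way for the compiler (and the reader) to interpret it.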

The dangers of casting pointers

evilrix — Sun, 02 Dec 2012 05:18:57 +0000

There are various dangers when casting pointers to different types but as a general rule, casting to a void pointer and back to the original pointer is considered safe. Unfortunately, this is not always the case as this little quiz demonstrates.

Question: What is the result of this code?

struct Foo
{
   void func(){}
};

typedef void (Foo::*func_t)();

int main()
{
   func_t fp1 = func_t(&Foo::func);
   void * p = (void*) fp1;
   func_t fp2 = (func_t) p;

   Foo foo;
   (foo.*fp2)();
}

Answer: The result is undefined.

Why?

A pointer to a member is not a pointer to an object or a pointer to a function and the rules for conversions of such pointers do not apply to pointers to members. In particular, a pointer to a member cannot be converted to a void pointer.

The same is also true for function pointers, which cannot be safely cast to a void pointer.
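If you really do need to squeeze a member function pointer through a void pointer (for example, to pass it via a C-style callback), one approach is to leave the member pointer fully typed inside a small struct and pass the address of that struct instead. The Callback struct and invoke function below are hypothetical names for this sketch, and Foo has gained a counter so there is something to observe.

```cpp
#include <cassert>

struct Foo
{
   Foo() : calls(0) {}
   void func() { ++calls; }
   int calls;
};

typedef void (Foo::*func_t)();

// Hypothetical bundle: the member pointer is never cast to anything else
struct Callback
{
   Foo * target;
   func_t method;
};

// A C-style entry point that only ever sees a void pointer
void invoke(void * context)
{
   // Casting an object pointer to void * and back to its original type is safe
   Callback * cb = static_cast<Callback *>(context);
   (cb->target->*(cb->method))();
}
```

The only thing converted to void * is a pointer to an object (the Callback), which is exactly the round trip that is considered safe.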

More info: C++ FAQ Lite (member function pointers)
More info: C++ FAQ Lite (function pointers)

String literals and pointers

evilrix — Sun, 02 Dec 2012 05:10:50 +0000

Do you know how to access c-style literal strings? Try this two-part quiz and see if you are able to unravel the different semantics of character pointers and arrays.

Question:

a) What is wrong, if anything, with the following code?

char * p = "foobar";

b) How do these 2 lines of code differ?

char const * p = "foobar";
char const s[] = "foobar";

Answer:

a) Since a literal string decays to a pointer type of char const * and NOT char *, this is not strictly speaking legal C++; however, to maintain backwards compatibility with C, C++03 compilers will allow this, but they should produce a warning since this construct is deprecated. The conversion was removed altogether in C++11.

b) Line one is a pointer to a literal string, which is immutable and attempting to modify the string will result in undefined behaviour. Line two is an array that is initialised with a string value, which means you are allowed to remove the const specifier and modify the array if you so wish (since you own the memory that represents the array).
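Here is a small sketch of the difference in practice. The function name modifiedCopy is made up for this example, and the const has been dropped from the array to show that modifying it is legitimate.

```cpp
#include <cassert>
#include <string>

std::string modifiedCopy()
{
   char const * p = "foobar"; // points at the literal itself, which is read-only
   char s[] = "foobar";       // our own 7 byte array (6 characters plus '\0')

   // p[0] = 'F';  // would not compile: p points to const char
   s[0] = 'F';     // fine: we own the memory behind s

   assert(sizeof(s) == 7);
   assert(p[0] == 'f');       // the literal is untouched

   return std::string(s);
}
```

Note that sizeof(s) gives the size of the whole array, whereas sizeof(p) would only give the size of a pointer.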

Exceptions to the rule

evilrix — Sun, 02 Dec 2012 05:01:00 +0000

The C++ Standard is a pretty large and complex document; however, it is the bible as far as writing C++ code is concerned. The standard is full of exceptions that prove the rule, and this quiz demonstrates just one trivial example.

Question: What, if anything, is wrong with the following code?

int main(){}

Answer: Nothing, it is perfectly valid

Why?

In C++ the return is an optional statement in (and only in) the main function, with 0 being implicitly returned if the return statement is omitted. This is a special case that only applies to the main function.

Comparing structs

evilrix — Sun, 02 Dec 2012 04:45:25 +0000

This little quiz explores the pitfalls of trying to compare structs. Do you know the right way to check if two structs are the same?

Question: What is the value of bSame and why?

#include <cstring>

struct S
{
   float f;
   char c;
   int i;
};

int main()
{
   S s1 = { 1.1f, 'a', 99 };
   S s2 = { 1.1f, 'a', 99 };

   bool bSame = memcmp(&s1, &s2, sizeof(S)) == 0;
}

Answer: The value of bSame is undefined.

Why?

The reason is that compilers are allowed to put padding into struct and class types to preserve word boundary alignment and, for efficiency reasons, they are not obliged to initialize this padding to any specific value. This being the case the result will be compiler specific and, therefore, undefined.

This code is not portable and although it might work on your compiler or even your specific version of the build (for example, in a debug build in Visual Studio the compiler does null out structures to facilitate debugging) there is no guarantee this will work on other platforms or other compilers.

You can mitigate this by using memset to nullify the memory first but this should be avoided on non POD types as it can have unexpected side effects (such as obliterating the v-table of a class with virtual functions). In short, the only safe way to compare structures is to perform a member by member comparison (preferably by adding the comparison operators).
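A minimal sketch of the member by member approach, using the same struct S as the quiz, might look like this. The operators are written as free functions here; that is just one reasonable choice.

```cpp
#include <cassert>

struct S
{
   float f;
   char c;
   int i;
};

// Compare only the members, so padding bytes never take part
inline bool operator==(S const & lhs, S const & rhs)
{
   return lhs.f == rhs.f && lhs.c == rhs.c && lhs.i == rhs.i;
}

inline bool operator!=(S const & lhs, S const & rhs)
{
   return !(lhs == rhs);
}
```

Bear in mind this compares the float member for exact equality, which is fine for values initialised from the same literal but may need a tolerance if the values are the result of calculations.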

Copying to stdout using STL

evilrix — Sun, 02 Dec 2012 04:37:49 +0000

The STL (Standard Template Library) is a collection of generic algorithms and data structures. This little quiz demonstrates one of the many useful things one can achieve with just a few lines of code when utilizing the power of this library.

Question: What is the output of the following code?

#include <algorithm>
#include <iostream>
#include <iterator>

struct g
{
	g():n(0){}
	int operator()() { return n++; }
	int n;
};

int main()
{
	int a[10];
	std::generate(a, a+10, g());
	std::copy(a, a+10, std::ostream_iterator<int>(std::cout, " "));
}

Answer: 0 1 2 3 4 5 6 7 8 9

Why?

The function main() uses the generate algorithm to initialise the int array using functor g. This functor will be called for every element within the array, with the result of the call used to initialise that element. Once initialised, the whole array is copied to the standard output stream using the copy algorithm.

Tip: Most of STL makes good use of functors and there are some neat reusable generic algorithms, which can be used to make code simple and easier to maintain.

Separating C++ template declaration and implementation

evilrix — Sun, 02 Dec 2012 03:37:13 +0000

The following question was the inspiration for this short article: “Splitting a template and class into definition and declaration”. In this question the asker asks, “I have the code below, which is all well and good but I’d like to move the definition of the setListener method to the cpp file, yet I seem to be having difficulty doing this as I get complaints about the template needing arguments?”.

template<typename TEventHandlerClass>
class CEventRaiser
{
public:
   typedef void (TEventHandlerClass::*TEventHandlerMethod)();

   void setListener(TEventHandlerClass *aEventHandlerClass, TEventHandlerMethod aEventHandlerMethod)
   {
       eventHandlerClass=aEventHandlerClass;
       eventHandlerMethod=aEventHandlerMethod;
   }

private:
   // Members assumed here; the original question did not show them
   TEventHandlerClass *eventHandlerClass;
   TEventHandlerMethod eventHandlerMethod;
};

It’s a fair question but, unfortunately, the answer isn’t straightforward. Let’s see if we can unravel this mystery.

One of the things that often confuses an inexperienced C++ programmer, when first using templates, is why they can’t put the declarations in the header file and the implementation in the .cpp file, just like they can with normal function or class definitions.

When C++ programs are compiled they are normally made up of a number of .cpp files with additional code included via header files. The generic term for a .cpp file and all of the headers it includes is “translation unit”. Roughly speaking, a compiler translates the translation unit directly into an object file, hence the term translation unit.

Once all the translation units have been turned into object files it is the job of the linker to join all these object files together into one executable (or dynamic library). Part of the linking process is to resolve all symbols to ensure, for example, that if an object file requires a function, that it is available in one of the object files being linked and that it doesn’t exist more than once (it should only be defined by one object file). If a symbol can’t be resolved by the linker a linking error will result. Up until the point of linking each translation unit and resultant object file are completely agnostic, knowing nothing about each other.

So what does this have to do with templates? Well to answer this we need to know how the template instantiation process works. It turns out that templates are parsed, not once, but twice. This process is explicitly defined in the C++ standard and although some compilers do ignore this, they are, in effect, non-compliant and may behave differently to what this article describes. This article describes how template instantiation works according to the current C++03 standard. Let’s take a look at what each of these passes does:

1. Point of Declaration (PoD)

During the first parse, called the Point of Declaration, the template compiler checks the syntax of the template but does not consider the dependent types (the template parameters that form the template’s types within the template). It is like checking the grammar of a paragraph without checking the meaning of the words (the semantics). Grammatically the paragraph can be correct but the arrangement of words may have no useful meaning. During the grammar checking phase we don’t care about the meaning of the words, only that the paragraph is syntactically correct.

So consider the following template code…

template <typename T>
void foo(T const & t)
{
   t.bar();
}

This is syntactically sound; however, at this point we have no idea what type the dependent type T is so we just assume that in all cases of T it is correct to call member bar() on it. Of course, if type T doesn’t have this member then we have a problem but until we know what type T is we don’t know if there is a problem so this code is ok for the 1st pass.

2. Point of instantiation (PoI)

This is the point where we actually define a concrete type of our template. So consider these 2 concrete instantiations of the template defined above…

// this will fail the 2nd pass because an int (1 is an int) does not have a member function called bar()
foo(1);

// Assuming b has a member function called bar this instantiation is fine
foo(b);

NB. It is perfectly legal to define a template that won’t be correct under all circumstances of instantiation. Since code for a template is not generated unless it is instantiated, the compiler will not complain unless you try to instantiate it.

Now both the syntax and the semantics of the template are checked against the known dependent type to make sure that the generated code will be correct. To do this the compiler must be able to see the full definition of the template. If the template is defined in a different translation unit from the one where it is being instantiated, the compiler has no way to perform this check, so the template will not be instantiated.

Remember that each translation unit is agnostic; the compiler can only see and process one at a time. Now, if the template is only used in one translation unit and the template is defined in that translation unit this is not a problem. Of course, the whole point of a template is that it is generic code so there is a very good chance it will be used in more than one place.

So, let’s recap where we are so far. If the template definition is in translation unit A and you try to instantiate it in translation unit B the template compiler will not be able to instantiate the template because it can’t see the full definition, so it will result in linker errors (undefined symbols). If everything is in one place then it will work, but it is not a good way to write templates. Sooner or later you’ll probably end up using the template in other translation units because it is highly unlikely (although not impossible) that you’d go to all the effort of creating a generic template class/function that you’ll only ever use in one place.

So how do we structure our code so that the compiler can see the definition of the template in all translation units where it is instantiated? The solution is really quite simple: put the template’s definition somewhere that is visible to all PoIs and that is, of course, in a header. The header file can be included in both translation unit A and translation unit B so it will be completely visible to the template compiler in both.

It’s interesting to note that the C++ standard does define the “export” keyword to try and resolve this issue. The idea is that you prefix the declaration of the template with the export keyword, which will tell the template parser to remember the definition for later reuse. It was introduced as a last-minute addition to the standard and has yet to be adopted by any mainstream compiler.

From a style point of view, if you want to preserve demarcation between declaration and definition with template classes you can still separate the class body and the member functions all in the same header.

First class declaration

// First class declaration
template <typename T>
struct CTimer
{
   void foo();
};

// Followed by function definitions
template < typename T>
void CTimer <T>::foo()
{
}

On the rare occasion that your template class/function is only going to be used in one translation unit then the declaration and definition should go in there together in an unnamed namespace. This will prevent you, later, from trying to use the template somewhere else and scratching your head trying to figure out why you have linker errors about unresolved symbols. Putting the template fully in the translation unit means it won’t even compile if you try to reference it from elsewhere, and the reason for that will be far more obvious.

// My .cpp with a template I never plan to use elsewhere
namespace
{
   template <typename T>
   struct LocalUseOnly
   {
      void foo();
   };

   template < typename T>
   void LocalUseOnly<T>::foo()
   {
   }
}

Now, as usual with C++, things are not as straightforward as they could be because there is an exception to this rule about putting template code in headers. The exception is specializations. Since full specializations, unlike templates themselves, are concrete entities (not templates that describe to the compiler how to instantiate a concrete entity implicitly) they have associated linker symbols, so they must go into the .cpp file (or be explicitly declared inline), otherwise they’ll breach the “One Definition Rule”.

Also, specializations of template class member functions must be outside the class, they cannot be implicitly inline within the class body. Unfortunately, Visual Studio doesn’t enforce this… it is wrong, the C++03 standard clearly states they must go outside the body.

As a final note, there are other ways this issue of template declaration/definition separation can be resolved (such as putting the template definition in a .cpp file, that you then include when needed); however, none of these are as simple or straightforward as just leaving the code definition in the header, which is the generally accepted best practice.

I hope this article has helped demystify why templates are generally defined in header files, contrary to normal good coding practice.

For more on C++ templates I recommend the C++ Templates FAQ

Function pointers vs. Functors

evilrix — Sun, 02 Dec 2012 01:18:34 +0000

Often, when implementing a feature, you won’t know how certain events should be handled at the point where they occur and you’d rather defer to the user of your function or class. For example, an XML parser will extract a tag from the source code; what should it do now that it has this tag?

The simplest way to handle this would be to invoke a callback function that knows how to handle tags. This is exactly how SAX (Simple API for XML) style XML parsers work. There are, of course, other reasons for invoking a callback function: implementing logging, enumerating windows and many more. All these are examples of event-driven programming: when you encounter an event you call an event handler to handle it.

In C, if you want to provide a callback mechanism you must implement a callback function and then pass the address of the callback function to the invoker (the code that will call your callback when it’s needed). Unfortunately, C style function pointers have a number of drawbacks:

1. A function contains no instance state, mainly because there is no such thing as multiple instances of a function; there will only ever be one instance and that is global. Sure, it is possible to declare a local static within the function that can retain state between calls, but since a function only has one instance there can only be one instance of the static variable and that must be shared between function calls.

A function with local static state

MyDataClass & foo()
{
   static MyDataClass data;
   // Do some work
   return data;
}

2. If you try to maintain state in the function by using a local static variable the function will not be reentrant, so it cannot be safely called on multiple threads without the additional overhead of thread synchronisation to ensure access to the local static data has mutual exclusion semantics. This effectively means the function can only allow one thread into it at any one time and that will create quite a bottleneck of thread contention (multiple threads all fighting for access to a single resource). Furthermore, if access is required to the static state local variable after the function has finished, the caller must continue to block access until the state has either been copied for use or is no longer required; otherwise it’ll be read by one thread whilst another is potentially trying to modify it, resulting in a race condition.

A function with local static state using a mutex to attempt to make the function reentrant

MyDataClass & foo(Mutex & m) // Take the mutex by reference so all callers share the same lock
{
   ScopedLock sl(m); // Ensure no other thread can get in here

   static MyDataClass data;
   // Do some work
   return data;

   // Note once the scoped lock ends here another thread could enter
   // and modify data before the caller has a chance to copy it
   // so even this isn't a very good solution, really the mutex should
   // be locked by the caller.
 }

3. Function pointers do not work very well with templates. If only one signature of the function exists then things will work; otherwise the compiler will complain of template instantiation ambiguities that can only be resolved through ugly function pointer casting.

A function pointer casting to resolve template instantiation ambiguities

void callback_func(void *){}
void callback_func(char *){}

template <typename funcT>
void consumer(funcT callback_func)
{
}

int main()
{
   consumer(callback_func); // ERROR: Which one?

   consumer((void (*)(void*))callback_func);
   consumer((void (*)(char*))callback_func);
}

4. What if your invoker expects a function that takes only one parameter and you want to use a 3rd party function as the callback, to which you have no source code, and this has a completely different signature to that expected by the invoker? Well, you can wrap the function with another function that adapts the interface but this wrapper will need to have the additional required parameters hard coded into it. What if you want the flexibility of changing what the parameter values are for arbitrary calls to the invoker? Well, for that you will need to write an adaptor for each and every permutation and even then it’s still fixed at compile time, so it’s not easily extensible and becomes quite messy with multiple functions that need to be maintained.

An adaptor using a function

bool third_party_func(int, char, float){ return true; }

template <typename funcT>
void invoker(funcT callback_func)
{
   callback_func(int());
}

// C style adaptors
void adaptor_func1(int n)
{
   third_party_func(n, 'c', 1.1f); // Hard coded bindings, cannot be changed at runtime
}

void adaptor_func2(int n)
{
   third_party_func(n, 'b', 5.9f); // Hard coded bindings, cannot be changed at runtime
}

int main()
{
   // C style function has hard coded bindings
   invoker(adaptor_func1);
   invoker(adaptor_func2);
}

So, is there a better way? Well, now you come to mention it, yes there is! Enter the functor.

What is a functor? Well, simply put it is a function object, and in C++ we model a functor using a normal class object. What makes the object a functor is the provision of a function operator, which gives the class object function semantics. The function operator, in its simple canonical form, looks like this…

The basic canonical form of a functor

class Functor
{
   public:
   R operator()(T1, ..., Tn)
   {
      return R();
   }
};

Where R is the return type, T is a parameter type and just like any function the number of parameters is arbitrary. So a more concrete type of a functor that takes 2 int parameters and returns a bool would be as follows…

A concrete example of a simple functor

class LessThanFunctor
{
public:
   bool operator()(int lhs, int rhs)
   {
      return lhs < rhs;
   }
};

It’s pretty clear to see that this simple functor will compare two integers and if the left-hand-side is less than the right-hand-side it will return true, else it will return false.

How do we use a functor and how does it differ from a function? Well, as already stated a functor is just a class with function semantics; of course it is still just a class and like all classes it can contain data and function members and instances can be created. What this means is that each instance of the functor object can contain and maintain its own internal state. This state can either be set during construction of the functor or after construction. This means the functor can be primed with state before use and it can set its own state during use, which can be extracted after.

Let’s look at a functor in action. First of all, let’s revisit the issue of binding parameters to 3rd party functions to facilitate using them with an invoker that expects a different signature. With a functor the additional parameters can be loaded into the functor instance when it is created, which can then be bound to the third party call at run-time and not compile time. In fact this is exactly what the standard library functions bind1st and bind2nd do.

Look at the example below, notice how much neater the solution is, no need for multiple functions to provide multiple bindings and also note that the bindings provided are passed into the constructor of the functor rather than being hard-coded into it, thus allowing these bindings to be changed at run-time.

An adaptor using a functor

bool third_party_func(int, char, float) { return true; }

template <typename funcT>
void invoker(funcT callback_func)
{
   callback_func(int());
}

// C++ style adaptor
class adaptor_functor
{
public:
   // Initialize runtime bindings
   adaptor_functor(char cb, float fb) : cb_(cb), fb_(fb){}

   void operator()(int n)
   {
      third_party_func(n, cb_, fb_); // Arguments in the order (int, char, float) the function expects
   }

private:
   char cb_;
   float fb_;
};

int main()
{
   // C++ functor has bindings that can be set at runtime via the functor's constructor
   invoker(adaptor_functor('a', 2.3f));
   invoker(adaptor_functor('z', 0.0f));
}

How about another example? A common usage of a callback function is to provide a user defined logging mechanism for a 3rd party library. The library will callback to your logging callback function, provide it with some details and it is up to you to design a function that will do something useful with those details, like write them to a log file. Furthermore we must record how many times we logged something and how many of these were errors so at the end of the call to the third-party library we can append a count of entries to the log file. Implementing this using a standard C function callback mechanism would require quite a lot of effort, however, using a functor it’s pretty simple.

A logging functor in action

#include <ctime>
#include <iostream>
#include <sstream>
#include <string>

// Note that this logging class assumes single threading, additional code would be required
// to provide mutual exclusion semantics, which are outside the scope of this article
class LoggingFunctor
{
public:

   // Constructor allows user defined output stream
   LoggingFunctor(std::ostream & os) :
   os_(os), nErrCnt_(0), nLogCnt_(0) {}

   // Overload for std::string
   void operator()(std::string const & s, bool bErr)
   {
      // Hand off to overload for char const *
      (*this)(s.c_str(), bErr);
   }

   // The main logging function
   void operator()(char const * szMsg, bool bErr)
   {
      // Count log item
      ++ nLogCnt_;

      // Display date & time
      time_t t = time(0);
      char tbuf[80];
      strftime (tbuf,80,"%x %X ",localtime(&t));
      os_ << tbuf;

      // Is this an error message?
      if(bErr)
      {
         // Count error and display error prefix
         ++ nErrCnt_;
         os_ << "ERROR: ";
      }

      // Now log it
      os_ << szMsg << std::endl;
   }

   // Accessors to the log and error count
   int GetErrCnt() const { return nErrCnt_; }
   int GetLogCnt() const { return nLogCnt_; }

private:
   // Non-copyable semantics to prevent accidental copy or assignment
   LoggingFunctor(LoggingFunctor const &);
   LoggingFunctor & operator=(LoggingFunctor const &);

private:
   std::ostream & os_;
   int nErrCnt_;
   int nLogCnt_;
};

template <typename LoggingFunctorT>
void Mock3rdPartyCall(LoggingFunctorT & logger)
{
   for(int i = 0 ; i < 25; ++i)
   {
      // Build a log message
      std::stringstream ss;
      ss << "Log entry " << i;

      // Log it, treat every 3rd iteration as an error
      logger(ss.str(), i%3 == 0);
   }
}

int main()
{
   // Log to stdout for this example
   LoggingFunctor loggingFunctor(std::cout);

   // Call the mock 3rd party function
   Mock3rdPartyCall(loggingFunctor);

   std::cout
      << std::endl
      << loggingFunctor.GetLogCnt() << " items logged, "
      << loggingFunctor.GetErrCnt() << " of which were errors." << std::endl;
}

Okay, we’ve had a couple of concrete examples of using functors but I can hear you screaming, “are there any drawbacks?”. Well, yes. For a start functors are not C API friendly and cannot really be made to work with an existing function that already expects an old C style function pointer. Other drawbacks? Well, unlike functions, functors have to be instantiated and like any object a functor can throw on construction so additional consideration must be given to ensure code is exception safe. Other than these few issues the use of a C++ functor has few downsides and many benefits.

So how do you start making use of functors? When writing C++ code that uses callbacks it is always a good idea to implement support for functors as well as function pointers. This is exactly what the Standard Template Library does. The code to support both functors and function pointers is quite simple, requiring only the use of a simple template parameter rather than an explicit function pointer type. Since the calling semantics of a functor and a function are identical, the invoker works just as well with either. In template meta-programming parlance we say that a functor models a function concept and, therefore, either a function or a functor can be passed as a template parameter where that parameter represents a function concept. Let’s go back to the LessThan functor to see this.

How to write an invoker to use either a functor or a function

#include <iostream>

class LessThanFunctor
{
public:
   bool operator()(int lhs, int rhs)
   {
      return lhs < rhs;
   }
};

bool LessThanFunction(int lhs, int rhs)
{
   return lhs < rhs;
}

// To make this work with a function or functor we just use a template parameter
template <typename functorT>
bool Invoker(functorT func, int x, int y)
{
   return func(x,y);
}

int main()
{
   std::cout
      << "Functor: " << Invoker(LessThanFunctor(), 5, 6)
      << std::endl
      << "Function: " << Invoker(LessThanFunction, 5, 6)
      << std::endl
      << "Functor: " << Invoker(LessThanFunctor(), 7, 6)
      << std::endl
      << "Function: " << Invoker(LessThanFunction, 7, 6)
      << std::endl;
}

You’ll find the functor concept is used ubiquitously by the C++ STL (Standard Template Library) as well as the Boost libraries. Knowing how to write and use functors is a key success factor in writing generic and reusable code and being able to make use of advanced features of the STL and Boost. They are a tool that should be in any C++ programmer’s toolkit!

Further reading: The Function Pointer Tutorials – Introduction to the basics of C and C++ Function Pointers, Callbacks and Functors.

Return Value Optimization

evilrix — Sun, 02 Dec 2012 01:06:54 +0000

In days of old, returning something by value from a function in C++ was necessarily avoided because it would, invariably, involve one or even two copies of the object being created and potentially costly calls to a copy-constructor and destructor. Advances in compiler optimizations have all but eliminated this concern thanks to a clever set of optimizations implemented by most modern compilers.

The C++ standard allows the omission of the call to the copy constructor and, thus, allows the compiler to create a return value in the stack-frame of the calling function. This has the effect of allowing the compiler to treat both objects (in the caller and the callee) as the same entity, thus eliminating the need to take a copy.

There are two versions of this optimization available, Named Return Value Optimization (NRVO) and Return Value Optimization (RVO). Although the end result is the same, the syntax and semantics of each is slightly different:

RVO: Return Value Optimization is carried out when an object is constructed in-line within the return statement of a function, which would normally result in a temporary object being created on the stack, which is then copied into the calling function’s stack-frame. When RVO is performed the object is created within the stack-frame of the calling function, thus avoiding the creation and destruction of an unnecessary temporary and the invocation of a copy constructor.

Bar Foo()
{
   return Bar();
}

Without RVO

Items constructed: 2
Items destructed: 1
Copies taken : 1

With RVO

Items constructed: 1
Items destructed: 0
Copies taken : 0

NRVO: Named Return Value Optimization is carried out when an object is created with a name within the called function and is then returned by name, which would normally result in the named object being copied into a temporary on the stack, which is then copied into the calling function’s stack-frame. When NRVO is performed the named object is created within the stack-frame of the calling function, thus avoiding the creation and destruction of an unnecessary temporary and the invocation of, potentially, two copy constructors.

Bar Foo()
{
   Bar bar;
   return bar;
}

Without NRVO

Items constructed: 3
Items destructed: 2
Copies taken : 2

With NRVO

Items constructed: 1
Items destructed: 0
Copies taken : 0

It should be obvious by now that when RVO or NRVO are used the copy-constructor on the returned object may not be called. For this reason it is very important that you do not write code that relies on the calling of a copy-constructor (such as instance counting, for example) since it may or may not be called depending upon the compiler, the optimization level and the way the function is written.

Each compiler implements support for RVO and NRVO to varying degrees so it is important to refer to your favourite compiler’s documentation to establish how well supported these two optimizations are.

It is not always possible for a compiler to carry out NRVO; code must be written to facilitate it. Again, this does vary from compiler to compiler but if there are multiple return paths you can be pretty sure NRVO will not take place.

// Example of N/RVO

#include <iostream>

struct MyClass
{
   MyClass()
   {
      std::cout << "MyClass::c_tor()" << std::endl;
   }

   MyClass(MyClass const &)
   {
      std::cout << "MyClass::cc_tor()" << std::endl;
   }

   ~MyClass()
   {
      std::cout << "MyClass::d_tor()" << std::endl;
   }
};

MyClass NRVO()
{
   std::cout << "Named Return value Optimization" << std::endl;

   MyClass myClass;
   return myClass;
}

MyClass RVO()
{
   std::cout << "Return value Optimization" << std::endl;

   return MyClass();
}

MyClass NoNRVO()
{
   std::cout << "** NO *** Named Return value Optimization -- this is unlikely to optimize" << std::endl;

   if(0)
   {
      MyClass myClass;
      return myClass;
   }
   else
   {
      MyClass myClass;
      return myClass;
   }
}

MyClass NoRVO()
{
   std::cout << "** NO *** Return value Optimization ??? -- this should still optimize" << std::endl;

   if(0)
   {
      return MyClass();
   }
   else
   {
      return MyClass();
   }
}

int main(void)
{
   std::cout << ">>> START >>>" << std::endl;

   MyClass myClass1 = NRVO();
   MyClass myClass2 = RVO();

   MyClass myClass3 = NoNRVO();
   MyClass myClass4 = NoRVO();
   std::cout << "<<< END <<<" << std::endl;
}

Sealing a C++ Class

evilrix — Sun, 02 Dec 2012 00:59:24 +0000

Unlike C#, C++ doesn’t have native support for sealing classes (so they cannot be sub-classed). At the cost of a virtual base class pointer it is possible to implement a pseudo sealing mechanism.

The trick is to virtually inherit from a base class where the constructor is private and the sub-class is declared a friend in said base class. If you then try to sub-class the pseudo sealed class the compiler will not be able to synthesize a callable constructor for the virtual base class so instantiation will fail.

It works by making the default constructor of the sealer class private, which means nothing can construct it. We then make the class we want to seal a friend of the sealer class and subclass it with virtual inheritance. As the subclass is a friend of the sealer class it can call the private constructor so we are able to instantiate instances of it. Since we virtually inherited the sealer class, and since in C++ the most derived class of an inheritance tree always calls the virtual base class’s constructor directly, any further sub-class would have to call that constructor itself; the fact that it is inaccessible means the compiler will produce an error. Voila, we have sealed the class to prevent it being sub-classed.

The following code example uses a macro called SEALED, which takes care of creating a virtual base class and making the real class virtually derive from it.

#define SEALED(className) \
	className ## Sealer \
		{ \
			private: className ## Sealer(){}; \
			friend class className; \
		}; \
		class className : virtual private className ## Sealer

class SEALED(MyClass) {};

class MyClassDisallowed : public MyClass {};

int main()
{
	// Perfectly legal construction
	MyClass myClass;

	// Illegal construction, super-class is sealed
	MyClassDisallowed myClassDisallowed;
}

Static assertions in C++

evilrix — Sun, 02 Dec 2012 00:56:55 +0000

Errors will happen. It is a fact of life for the programmer. How and when errors are detected have a great impact on quality and cost of a product. It is better to detect errors at compile time, when possible and practical. Errors that make their way to become runtime problems are harder to detect and may go unchecked until such time that the code reaches a customer. The later the defect is identified the more costly it is in terms of time and money.

A static assertion is similar to a runtime assertion, in so far as it allows the programmer to assert that an expression must be true, or an error message must be raised. The difference is that static assertions are triggered at compile time rather than runtime. How is this useful? Well, how about a situation where we’ve had to make assumptions about the capacity of a specific type, say int?
The C++ Standard makes no specific claims about how big an int will be other than it “has the natural size suggested by the architecture of the execution environment.”

What does this actually mean? Well, in reality, it means the size of an int could be any size. The same is true of other types; a long must be at least as big as an int and an int at least as big as a short. Clearly, making an assumption about type size is non-portable and potentially dangerous, and should only be done in controlled cases, such as writing an OS kernel for a particular processor with a particular compiler. As soon as the programmer loses the ability to specify the compiler and the target architecture, type size assumptions go out the window. So, how does a static assertion help? Good question, let’s see…

Wouldn’t it be nice if we could assert our assumption about the size of a type at compile time so that if this assumption breaks, we are alerted immediately? For example, the code is built by a different compiler than the author used on a platform he did not anticipate. Enter static assertions. The trick is to turn to templates and, specifically, make use of the fact that a template variant, that is not instantiated, will not cause a compile time failure even if it contains erroneous code.

Create a template function that takes a bool template value parameter (not a function parameter). Within the function create a char array and use the value of the bool to determine the size. In the case where the bool template value is true, a char array of one element will be created. Since this will never be used the compiler should happily optimize this away. In the case where the bool template value is false, the compiler will try to generate a template function that creates a char array of zero elements. Since this is invalid C++, a compiler error will ensue.

We can now use a simple macro to facilitate the use of this template and then use a sizeof() to assert the sizes of the types we are assuming. Of course, any compile time constant can be asserted with a static assert, this is just one example of usage.

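template <bool B>
inline void STATIC_ASSERT_IMPL()
{
	// B will be true or false, which will implicitly convert to 1 or 0
	char STATIC_ASSERT_FAILURE[B] = {0};
}

// The parentheses stop a '>' inside the asserted expression from closing the template argument list
#define STATIC_ASSERT(B) STATIC_ASSERT_IMPL <(B)>()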

int main()
{
	// On a Windows 32 bit platform with Visual Studio 2005 this will not fail
	STATIC_ASSERT(sizeof(int) == 4);

	// On a Windows 32 bit platform with Visual Studio 2005 this *will* fail
	STATIC_ASSERT(sizeof(int) == 3);
}

Determining if a C++ type is convertible to another at compile time

evilrix — Sun, 02 Dec 2012 00:51:43 +0000

When writing generic code, using template meta-programming techniques, it is sometimes useful to know if a type is convertible to another type. A good example of when this might be is if you are writing diagnostic instrumentation for code to generate a log or trace file for debugging purposes. The relationship of the types may have significance.

This relationship can be determined with a little bit of meta-template trickery along with a special function prototype that takes an ellipsis … as its formal parameter list.

A little background for those who don’t know: the … ellipsis in C and C++ parlance is a special function parameter that means the function will accept any type and any number of parameters. It is often used in C along with the va_arg macros to create functions that take variable arguments, such as printf() or scanf().

An important factor in this technique is that, in C++, a function can be overloaded with an ellipsis version and it will be called if and only if no other function of the same name can be found to match the calling parameter list. We take advantage of this by declaring (but not defining) two overloads with the same function name; one takes a reference to the type we want to convert to and the other takes the … ellipsis.

The trick is to have the ellipsis version return a type that is a different size to that of the more specific function. At compile time the compiler will use static polymorphism to decide which function to call and we can then use the sizeof operator on the function call to get the size of the function’s return type that the compiler decided matched the calling parameter. If the types are convertible then the return type size will be that of the specific function taking a reference to the convertible type, otherwise the size will be that of the generic function that has the ellipsis parameter.

Note, neither of these functions actually needs to be defined — only declared — because neither of them is actually ever called; there is no runtime cost to this technique. This can all be wrapped up in a simple little template meta-function to simplify usage.

Below is a contrived example…

#include <iostream>

// Some types
struct A{};
struct B:A{};
struct C{};

template <typename T1, typename T2>
struct is_convertible
{
private:
   struct True_ { char x[2]; };
   struct False_ { };

   static True_ helper(T2 const &);
   static False_ helper(...);

public:
   static bool const YES = (
      sizeof(True_) == sizeof(is_convertible::helper(T1()))
      );
}; 

template <typename T1, typename T2>
void foo(T1 const & t1, T2 const & t2)
{
   if(is_convertible<T1, T2>::YES)
   {
      std::cout << "Type t1 is convertible to t2" << std::endl;
   }
   else
   {
      std::cout << "Type t1 is not convertible to t2" << std::endl;
   }
} 

int main(void)
{
   struct A a;
   struct B b;
   struct C c;

   foo(b,a);
   foo(c,a);
}