<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

 <title>MWLabs. | My dev pills.</title>
 <link type="application/atom+xml" rel="self" href="http://moonwave99.github.com/atom.xml"/>
 <link type="text/html" rel="alternate" href="http://moonwave99.github.com/"/>
 <updated>2012-10-21T02:28:59+02:00</updated>
 <id>http://moonwave99.github.com/</id>
 <author>
   <name>Diego Caponera</name>
 </author>

 
 <entry>
   <title>Taglib Unicode Madness.</title>
   <link href="http://moonwave99.github.com/2012-09-21/taglib-unicode-madness.html"/>
   <updated>2012-09-21T21:00:03+02:00</updated>
   <id>http://moonwave99.github.com/2012-09-21/taglib-unicode-madness</id>
   <content type="html">&lt;p&gt;Lately I&amp;#8217;ve been working on a Cocoa MP3 tagger/renamer app: it gathers features from &lt;a href='http://deadbeatsw.com/thetagger/'&gt;various&lt;/a&gt; &lt;a href='http://peippo.eu/musorg/'&gt;useful&lt;/a&gt; &lt;a href='http://www.publicspace.net/ABetterFinderRename/'&gt;programs&lt;/a&gt;, none of which made the cut on its own because none offers them all. It was all fun and games until I ran into Unicode weirdness when saving tags [via &lt;a href='http://taglib.github.com/'&gt;TagLib&lt;/a&gt;].&lt;/p&gt;
&lt;!--more--&gt;
&lt;p&gt;The scenario is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;fetch data from the &lt;a href='http://www.discogs.com/developers/'&gt;Discogs API&lt;/a&gt; into the &lt;code&gt;NSString&lt;/code&gt; fields of a model;&lt;/li&gt;

&lt;li&gt;convert those strings into C strings [&lt;code&gt;const char*&lt;/code&gt;] via the &lt;code&gt;UTF8String&lt;/code&gt; method of &lt;code&gt;NSString&lt;/code&gt;;&lt;/li&gt;

&lt;li&gt;set them on the &lt;code&gt;TagLib::Tag&lt;/code&gt; of each file.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything was fine for plain English releases [English being &lt;a href='http://en.wikipedia.org/wiki/Diacritic#English'&gt;almost free from diacritical marks&lt;/a&gt;], then I stumbled upon an Italian record. Data fetching went flawlessly, but when I persisted it to files and checked the Xcode console I met &lt;strong&gt;Mr. √®&lt;/strong&gt; [the UTF-8 bytes of &lt;em&gt;è&lt;/em&gt; displayed as MacRoman; &lt;em&gt;è&lt;/em&gt; is Italian for the third-person singular of &lt;em&gt;to be&lt;/em&gt;]. Shouldn&amp;#8217;t &lt;code&gt;UTF8String&lt;/code&gt; take care of encoding non-ASCII characters?&lt;/p&gt;
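
&lt;p&gt;The garbled pair is consistent with the two UTF-8 bytes of &lt;em&gt;è&lt;/em&gt; being read back one byte at a time as MacRoman. A minimal sketch of that round trip follows [written in Python rather than Objective-C so it runs without Xcode; &lt;code&gt;as_mac_roman&lt;/code&gt; is a hypothetical helper, not a Cocoa or TagLib call]:&lt;/p&gt;

```python
def as_mac_roman(text):
    """Encode text as UTF-8, then misread those bytes as MacRoman."""
    return text.encode("utf-8").decode("mac_roman")


# U+00E8 takes two bytes in UTF-8.
print("è".encode("utf-8").hex())    # c3a8

# Each byte then maps to its own MacRoman glyph.
print(as_mac_roman("è") == "√®")    # True
```

&lt;p&gt;Pure ASCII text survives this misreading unchanged, which would explain why the English releases looked fine.&lt;/p&gt;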

&lt;p&gt;At first glance I suspected a library issue, but even the minimal &lt;code&gt;NSLog(@&amp;quot;%s&amp;quot;, [@&amp;quot;è&amp;quot; UTF8String])&lt;/code&gt; example was broken! I then tried messing with TagLib parameters, but I had no clue at all; after a googling session, I understood I needed &lt;a href='http://en.wikipedia.org/wiki/Wide_character'&gt;wide characters&lt;/a&gt;, which are compatible with &lt;a href='http://taglib.github.com/api/classTagLib_1_1String.html#a49bc90194a70c0d0b4fbea62cf865511'&gt;this&lt;/a&gt; &lt;code&gt;TagLib::String&lt;/code&gt; constructor.&lt;/p&gt;

&lt;p&gt;How do you convert an &lt;code&gt;NSString&lt;/code&gt; to a &lt;code&gt;wstring&lt;/code&gt;?&lt;/p&gt;

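&lt;pre&gt;&lt;code&gt;// Encode the NSString as UTF-32 in native byte order [assumes a 4-byte wchar_t].
NSData* asData = [string dataUsingEncoding:kEncoding_wchar_t];

// Build the wide string from the raw bytes; the length is counted in wchar_t units.
TagLib::wstring ws = TagLib::wstring(
	(const wchar_t*)[asData bytes],
	[asData length] / sizeof(wchar_t)
);&lt;/code&gt;&lt;/pre&gt;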

&lt;p&gt;where &lt;code&gt;wstring&lt;/code&gt; is TagLib&amp;#8217;s own definition of &lt;code&gt;std::wstring&lt;/code&gt; [which is not defined on all systems, &lt;a href='http://taglib.github.com/api/namespaceTagLib.html#a2c62885ba4d6f17b273a66ca63b4641b'&gt;as stated here&lt;/a&gt;], and &lt;code&gt;kEncoding_wchar_t&lt;/code&gt; is defined as follows:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#if TARGET_RT_BIG_ENDIAN
	const NSStringEncoding kEncoding_wchar_t =
	CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingUTF32BE);
#else
	const NSStringEncoding kEncoding_wchar_t =
	CFStringConvertEncodingToNSStringEncoding(kCFStringEncodingUTF32LE);
#endif&lt;/code&gt;&lt;/pre&gt;
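
&lt;p&gt;To see what a native-endian UTF-32 encoding hands to the &lt;code&gt;wstring&lt;/code&gt; constructor, here is a small Python sketch of the same idea. It assumes a 4-byte &lt;code&gt;wchar_t&lt;/code&gt;, as on OS X; &lt;code&gt;to_wchar_units&lt;/code&gt; and &lt;code&gt;WCHAR_SIZE&lt;/code&gt; are hypothetical names used only for illustration:&lt;/p&gt;

```python
import sys

WCHAR_SIZE = 4  # assumption: wchar_t is 4 bytes wide, as on OS X


def to_wchar_units(text):
    """Return the UTF-32 code units of text in native byte order."""
    # Same choice as the TARGET_RT_BIG_ENDIAN switch above.
    encoding = "utf-32-be" if sys.byteorder == "big" else "utf-32-le"
    data = text.encode(encoding)  # explicit endianness, so no BOM is prepended
    count = len(data) // WCHAR_SIZE  # mirrors [asData length] / sizeof(wchar_t)
    return [
        int.from_bytes(data[i * WCHAR_SIZE:(i + 1) * WCHAR_SIZE], sys.byteorder)
        for i in range(count)
    ]


print([hex(unit) for unit in to_wchar_units("è")])  # ['0xe8']
```

&lt;p&gt;Each character becomes exactly one unit holding its code point, so &lt;em&gt;è&lt;/em&gt; arrives as the single value 0xE8 instead of two separate bytes.&lt;/p&gt;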

&lt;p&gt;Finally I could build the desired string:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// tagString is an illustrative variable name.
TagLib::String tagString(ws, TagLib::String::Type::Latin1);&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and set it on each file&amp;#8217;s tag; the tags are correctly saved and rendered by the program itself, by QuickLook and by external media players.&lt;/p&gt;

&lt;p&gt;One mystery remains: I had to use the &lt;code&gt;TagLib::String::Type::Latin1&lt;/code&gt; encoding flag, not the expected &lt;code&gt;TagLib::String::Type::UTF8&lt;/code&gt;, which threw a &amp;#8220;Unicode conversion error&amp;#8221; exception. Maybe I will ask Stack Overflow later.&lt;/p&gt;</content>
   <author>
    <name>Diego</name>
    <email>diego.caponera@gmail.com</email>
   </author>
 </entry>
 
 <entry>
   <title>Hello World!</title>
   <link href="http://moonwave99.github.com/2012-09-12/hello-world.html"/>
   <updated>2012-09-12T09:00:03+02:00</updated>
   <id>http://moonwave99.github.com/2012-09-12/hello-world</id>
   <content type="html">&lt;p&gt;Sorry for the trivial title - I&amp;#8217;ve just jumped on the &lt;a href='https://github.com/mojombo/jekyll'&gt;Jekyll&lt;/a&gt; bandwagon because I needed a clean place to post my development thoughts and advice. As &lt;a href='http://www.diegocaponera.com'&gt;my main blog&lt;/a&gt; is more focused on music and prose, I felt better separating concerns and delegating all my programming world to the place where I store &lt;a href='https://www.github.com/moonwave99'&gt;my repos&lt;/a&gt; as well.&lt;/p&gt;

&lt;p&gt;I have to thank &lt;a href='http://jimbarraud.com/'&gt;Jim Barraud&lt;/a&gt; for the minimal WordPress theme he designed, which I tweaked a bit to get the current style, &lt;a href='http://disqus.com/'&gt;Disqus&lt;/a&gt; for the comment facilities, and of course &lt;a href='https://www.github.com'&gt;GitHub&lt;/a&gt; itself for making my code feel &lt;em&gt;alive&lt;/em&gt; [it told me so, really].&lt;/p&gt;

&lt;p&gt;Hope you have a nice time around!&lt;/p&gt;</content>
   <author>
    <name>Diego</name>
    <email>diego.caponera@gmail.com</email>
   </author>
 </entry>
 

</feed>
