<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
    <title>Smart Software</title>
    
    <link rel="alternate" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/" />
    <link rel="service.post" type="application/atom+xml" href="http://www.typepad.com/t/atom/weblog/blog_id=7693" title="Smart Software" /> 
    <id>tag:typepad.com,2003:weblog-7693</id>
    <updated>2013-03-14T08:12:45Z</updated>
    <subtitle>Musings on Technology, Entrepreneurship and Life</subtitle>
    <generator uri="http://www.typepad.com/">TypePad</generator>
    <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/smartsoftware" /><feedburner:info uri="smartsoftware" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:browserFriendly>This is an XML content feed. It is intended to be viewed in a newsreader or syndicated to another site, subject to copyright and fair use.</feedburner:browserFriendly><entry>
        <title>Immutable Collections Critique</title>
        <link rel="alternate" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2013/03/immutable-collections-critique.html" />
        <link rel="service.edit" type="application/atom+xml" href="http://www.typepad.com/t/atom/weblog/blog_id=7693/entry_id=6a00d8345242f069e2017ee94bae42970d" title="Immutable Collections Critique" />
        <link rel="replies" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2013/03/immutable-collections-critique.html" thr:count="6" thr:when="2013-03-21T16:19:31Z" />
        <id>tag:typepad.com,2003:post-6a00d8345242f069e2017ee94bae42970d</id>
        <published>2013-03-14T01:12:45-07:00</published>
        <updated>2013-03-16T00:15:08Z</updated>
        <summary>Microsoft released a preliminary version of immutable collections with mutable performance. It uses many of the same performance tricks that I used to build functional collections, such as supporting freezable binary tree data structures. The algorithm used is similar to AVL trees in that it minimizes the difference in heights of two child trees. Red-black trees are probably not used because the number of rotations would still cause O(lg(n)) allocations in an immutable setting, and these allocations are a function of the height of the whole tree. However, one major difference is the lack of support for constant-time equality comparison and fast set operations, each of which will limit the usage scenarios for the collection. Set operations in the...</summary>
        <author>
            <name>Wes</name>
        </author>
        
        
<content type="xhtml" xml:lang="en-us" xml:base="http://wesnerm.blogs.com/net_undocumented/">
<div xmlns="http://www.w3.org/1999/xhtml"><p>Microsoft released a preliminary version of <a href="http://blogs.msdn.com/b/andrewarnottms/archive/2011/08/30/immutable-collections-with-mutable-performance.aspx">immutable collections with mutable performance</a>. It uses many of the same performance tricks that I used to build functional collections, such as supporting freezable binary tree data structures. The algorithm used is similar to AVL trees in that it minimizes the difference in heights of two child trees. Red-black trees are probably not used because the number of rotations would still cause O(lg(n)) allocations in an immutable setting, and these allocations are a function of the height of the whole tree.</p>  <p>However, one major difference is the lack of support for constant-time equality comparison and fast set operations, each of which will limit the usage scenarios for the collection. Set operations in the Microsoft library are O(m lg(m+n)), but can be implemented in O(m lg(n/m)), and even faster, in O(d lg(n/d)), with constant-time equality comparisons, where d is the minimum of m and the number of differences between the two sets. A common problem that this helps with is quickly determining all the changes between two different versions of the same collection, though I often use another solution that runs in O(d). This is important for using immutable data structures pervasively, such as in a layout engine, where there are potentially many items and the interface needs to remain responsive.</p>  <p>At some point in the future, I plan to open-source my functional collections and other data structures, as they are typically the fastest implementations available. In addition, multiple strategies can be used to optimize for different types of usage such as full persistence, back-tracking, and ephemerality.</p>  <p>UPDATE: I just pulled the performance numbers from my head from a while back, but I think I made a mistake. They look wrong to me. 
I wrote quickly, and, on reflection, the Microsoft libraries should probably be O(m lg(n+m)), not O(n+m), for the union, intersection, and other set operations. I also need to verify the bounds for mine. I’ll probably post a correction with a detailed analysis in the near future.</p></div>
</content>


    </entry>
    <entry>
        <title>Mobile Innovations</title>
        <link rel="alternate" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2013/03/mobile-innovations.html" />
        <link rel="service.edit" type="application/atom+xml" href="http://www.typepad.com/t/atom/weblog/blog_id=7693/entry_id=6a00d8345242f069e2017ee94b7097970d" title="Mobile Innovations" />
        <link rel="replies" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2013/03/mobile-innovations.html" thr:count="0" />
        <id>tag:typepad.com,2003:post-6a00d8345242f069e2017ee94b7097970d</id>
        <published>2013-03-14T00:41:00-07:00</published>
        <updated>2013-03-14T11:51:24Z</updated>
        <summary>2013 will be the year of real hardware and software mobile innovation following many years of incremental spec bumps in smartphone technology. A number of smartphone manufacturers, namely Apple, Samsung, Nokia, and Motorola, are set to release highly differentiated smartphones this year that have been in development for many years now. The new technologies we should expect this year: 3D operating system and hardware Tactile touchscreens Full HD displays Integrated displays Eye-tracking, hand gestures Flexible, bendable displays Nontraditional hybrid phone-tablets (e.g., foldable or stretchable) Solar panels Edge-to-edge and button-less displays Samsung is launching the new flagship Galaxy S4 smartphone with technology breakthroughs that “rival color television.” Leaked photos of a new Samsung device appear to show an updated Galaxy S3...</summary>
        <author>
            <name>Wes</name>
        </author>
        
        
<content type="xhtml" xml:lang="en-us" xml:base="http://wesnerm.blogs.com/net_undocumented/">
<div xmlns="http://www.w3.org/1999/xhtml"><p>2013 will be the year of real hardware and software mobile innovation following many years of incremental spec bumps in smartphone technology. A number of smartphone manufacturers, namely Apple, Samsung, Nokia, and Motorola, are set to release highly differentiated smartphones this year that have been in development for many years now.</p>  <p>The new technologies we should expect this year:</p>  <ul>   <li>3D operating system and hardware </li>    <li>Tactile touchscreens </li>    <li>Full HD displays </li>    <li>Integrated displays </li>    <li>Eye-tracking, hand gestures </li>    <li>Flexible, bendable displays </li>    <li>Nontraditional hybrid phone-tablets (e.g., <a href="http://www.androidauthority.com/samsung-files-tri-fold-flexible-display-patent-smartphones-127455/">foldable</a> or <a href="http://www.dailytech.com/New+Samsung+Flexible+Display+Patent+Detailed/article24184.htm">stretchable</a>) </li>    <li>Solar panels </li>    <li>Edge-to-edge and button-less displays </li> </ul>  <p>Samsung is launching the new flagship Galaxy S4 smartphone with technology breakthroughs that “rival color television.” Leaked photos of a new Samsung device appear to show an updated Galaxy S3 that will ship alongside the S4. Strong evidence indicates that the Galaxy S4 will include 3D technology, hand gestures, eye-tracking technology and a flexible display. A flexible display is a necessary prerequisite for a tactile touchscreen. The best indications come from Samsung TVs, which already ship with most of these technologies. Both Samsung and Nokia representatives stated in 2011 that 3D phones would emerge in 2012, and many rumors even predicted that the Galaxy S3 would sport a 3D display. 
Samsung has a few 3D smartphone patents; the announcement teaser also hints at 3D with a three-dimensional 4 logo.</p>  <p>Apple has recently been criticized for a lack of innovation, somewhat rightly, since many of the improvements in the OS and hardware since the original iPhone have been lackluster. Apple will likely release major hardware and software innovations that roll up a multiyear development effort. Based on patents, Apple has apparently branched off its OS efforts to support a complete redesign based on 3D and other advanced input technologies, which are covered by many patents. Almost all new features of the past six iterations lie outside the core user interface, as Apple has chosen not to invest in a moribund interface. As Steve Jobs said at the launch of the iPhone:</p>  <div>   <blockquote>     <p>There's an old Wayne Gretzky quote that I love. 'I skate to where the puck is going to be, not where it has been.' And we've always tried to do that at Apple. Since the very very beginning. And we always will.</p>   </blockquote>    <p>Game-changing technologies require a long incubation period to develop, which is not helped by the annual shutdown and re-launch of a yearly product cycle. According to Apple’s designer Ive, a proper design involves multiple iterations to correct flaws and incorporate feedback from the whole process, so that new features are not simply tacked on but fit into a whole gestalt. For instance, eye-tracking, hover gestures and tactile touchscreens make the whole 3D experience work, as objects angled away from the screen are confusing to touch. <a href="http://www.extremetech.com/electronics/139179-future-of-input-eye-tracking">Eye-tracking is a high-speed, maximum-accuracy input mechanism</a> that works seamlessly on a 3D display.</p>    <p>Apple’s patents reveal interest in a featureless front phone surface, almost indistinguishable from the back. 
The display becomes the device and subsumes the functionality previously provided by discrete components such as <a href="http://appleinsider.com/articles/09/01/08/apple_files_patent_for_camera_hidden_behind_display">cameras</a>, <a href="http://www.patentlyapple.com/patently-apple/2013/03/apple-invents-a-speaker-system-with-sound-radiating-surface.html">speakers</a>, <a href="http://www.patentlyapple.com/patently-apple/2011/02/a-future-optical-display-could-turn-your-iphone-into-a-scanner.html">optical sensors</a>, <a href="http://www.patentlyapple.com/patently-apple/2013/02/apple-wins-a-shocker-with-2008-touch-based-solar-panel-patent.html">solar panels</a>, <a href="http://www.patentlyapple.com/patently-apple/2012/05/apple-reveals-wildly-intelligent-multi-tiered-haptics-system.html">multi-tiered haptics</a>, and <a href="http://gadgets.ndtv.com/mobiles/news/apple-granted-patent-for-iphone-5s-integrated-touch-display-340011">touch</a>. One patent allows the home button to disappear and reappear with specialized hardware. In this way, the front panel becomes the whole display.</p>    <p>Recent leaks indicate that two new iPhones are at or near production for release in the next couple of months: a new iPhone 5S in the iPhone 5 form factor but with an “advanced Retina+ display,” and a large-screen “budget” iPhone with a plastic case and no home button. The advanced display is likely the revolutionary new integrated display mentioned earlier. The plastic casing may be a placeholder for an all-new case made from LiquidMetal, a metallic glass, via an <a href="http://www.patentlyapple.com/patently-apple/2013/01/apple-reveals-new-machinery-for-creating-liquidmetal-forms.html">Apple-patented process</a>.</p>    <p>Motorola and Nokia will also introduce revolutionary technologies, but they will likely be thwarted in market share gains by new technologies from Apple and Samsung. 
Motorola is working on a new X-phone that will feature a longer battery life, an unbreakable screen, and an unprecedented new technology. My research indicates early work on solar panels. The use of a flexible screen indicates a tactile display. Nokia is likely to introduce new 3D technology, a bendable phone, and a hybrid tablet-phone.</p> </div></div>
</content>


    </entry>
    <entry>
        <title>Mainstreaming of Functional Programming</title>
        <link rel="alternate" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2012/08/mainstreaming-of-functional-programming.html" />
        <link rel="service.edit" type="application/atom+xml" href="http://www.typepad.com/t/atom/weblog/blog_id=7693/entry_id=6a00d8345242f069e2017c317da091970b" title="Mainstreaming of Functional Programming" />
        <link rel="replies" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2012/08/mainstreaming-of-functional-programming.html" thr:count="4" thr:when="2012-08-30T19:22:33Z" />
        <id>tag:typepad.com,2003:post-6a00d8345242f069e2017c317da091970b</id>
        <published>2012-08-26T20:42:49-07:00</published>
        <updated>2012-08-27T03:42:49Z</updated>
        <summary>I started out on my projects using imperative techniques, but have since rewritten all my code to be based on the functional paradigm. It was the only way that I could eliminate the escalating complexity in achieving my ambitious goals. Although the first programming language I was formally taught in a university was Lisp, much of my early programming background was in C++. I used to be an early advocate of object-oriented programming, but I soon noticed its limitations for programming in a more mathematical way, particularly programs involving functions or manipulating symbolic expressions algebraically. This meant that translating ideas into code often resulted in programs that are overly complex with little resemblance to the underlying ideas. Also, the types...</summary>
        <author>
            <name>Wes</name>
        </author>
        
        
<content type="xhtml" xml:lang="en-us" xml:base="http://wesnerm.blogs.com/net_undocumented/">
<div xmlns="http://www.w3.org/1999/xhtml"><p>I started out on my projects using imperative techniques, but have since rewritten all my code to be based on the functional paradigm. It was the only way that I could eliminate the escalating complexity in achieving my ambitious goals. </p>  <p>Although the first programming language I was formally taught in a university was Lisp, much of my early programming background was in C++. I used to be an early advocate of object-oriented programming, but I soon noticed its limitations for programming in a more mathematical way, particularly programs involving functions or manipulating symbolic expressions algebraically. This meant that translating ideas into code often resulted in programs that are overly complex with little resemblance to the underlying ideas. Also, the types of problems that I have solved in functional languages were considerably higher level than those in traditional languages, using techniques such as pattern matching and unification. I have seen programs in natural language and AI written in C++, where the programmers recreated the Lisp runtime complete with garbage collection and linked data structures.</p>  <p>While there are some formalisms such as sigma calculus, which feels a bit contrived, OO programming does not seem to have a strong mathematical basis. The various principles and practices that accompany object-oriented training resemble a discipline still in the “art” phase rather than “science.” My general feeling is that object-oriented programming is the current “fad” of our times and won’t be as relevant in the middle of the century. 
FP has a stronger mathematical basis and scales both up and down to different types of computing, such as computing over a network.</p>  <p>I am reminded of a post by Wes Dyer, a Microsoft C# developer, on the conceptual simplicity of functional programming compared to its alternative.</p>  <blockquote>   <p>Imperative programming is sometimes reminiscent of a <a href="http://en.wikipedia.org/wiki/Rube_Goldberg_machine">Rube Goldberg machine</a>. Both require meticulous thought to ensure that a process works correctly despite a myriad of state transitions and interdependencies. It is amazing these complicated programs work at all. </p>    <p>Dijkstra <a href="http://www.cs.utexas.edu/users/EWD/transcriptions/EWD10xx/EWD1036.html">pointed out</a> that <a href="http://www.cs.utexas.edu/~EWD/transcriptions/EWD03xx/EWD303.html">too many</a> programmers rely on executing a program in order to understand it. The reason is imperative programs lack sufficient underlying formalisms to make guarantees about any but the most trivial of programs. As much as I love a debugger, it is disheartening to need to use it to understand my code.</p> </blockquote>  <p>Languages like Clojure and Scala have incorporated functional programming at the core. Rich Hickey, inventor of Clojure, gave a presentation “<a href="http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey">Are We There Yet</a>?” in which, drawing on his 20 years of software experience, he identified mutation as the source of complexity. He also has another good presentation “<a href="http://www.infoq.com/presentations/Simple-Made-Easy">Simple Made Easy</a>.”</p>  <p>CMU <a href="http://existentialtype.wordpress.com/2012/08/17/intro-curriculum-update/">revised its computer science curriculum</a> to <a href="http://existentialtype.wordpress.com/2011/03/15/teaching-fp-to-freshmen/">emphasize functional programming over object-oriented programming</a>. 
Object-oriented programming was eliminated from the introductory curriculum, because it is “anti-modular and anti-parallel by its very nature, and hence unsuitable for a modern CS curriculum.” Also, the “new data structures course emphasizes parallel algorithms as the general case, and places equal emphasis on persistent, as well as ephemeral, data structures.” When I first learned about functional data structures, I felt that they were just as important as any other data structures taught in my Algorithms course but that the university unwittingly steered us into an imperative programming paradigm.</p>  <p>One <a href="http://existentialtype.wordpress.com/2011/03/15/teaching-fp-to-freshmen/#comment-132">comment</a> on the post elaborates that object-oriented programming is anti-concurrent because it’s about state, which is shared liberally. It’s also anti-modular because of dependencies. Functional data structures don’t have these characteristics because they are immutable and acyclic. “Clean code” thinking, such as the SOLID principles, has emerged to compensate for deficiencies in OO programming, though “to not much avail.” I would add that mutation also loses information from prior versions of data structures, which can result in unnecessary and awkward limitations in software.</p>  <p>It’s good to see some activity at Microsoft, which in my mind used to be the epitome of stateful programming. Joe Duffy, a Microsoft architect and concurrency expert, wrote in his blog on the <a href="http://www.bluebytesoftware.com/blog/2010/07/12/ThoughtsOnImmutabilityAndConcurrency.aspx">benefits of immutability</a>.</p>  <blockquote>   <p>What about concurrency? Immutable data structures facilitate sharing data amongst otherwise isolated tasks in an efficient zero-copy manner. No synchronization necessary. This is the real payoff.</p>    <p>For example, say we’ve got a document-editor and would like to launch a background task that does spellchecking in parallel. 
How will the spellchecker concurrently access the document, given that the user may continue editing it simultaneously? Likely we will use an immutable data structure to hold some interesting document state, such as storing text in a piece-table. OneNote, Visual Studio, and many other document-editors use this technique. This is zero-cost snapshot isolation.</p>    <p>Not having immutability in this particular scenario is immensely painful. Isolation won’t work very well. You could model the document as a task, and require the spellchecker to interact with it using messages.... Those kinds of message-passing races are non-trivial to deal with. Synchronization won’t work well either. Clearly we don’t want to lock the user out of editing his or her document just because spellchecking is occurring. Such a boneheaded design is what leads to spinning donuts, bleached-white screens, and “(Not Responding)” title bars. But clearly we don’t want to acquire a lock and then make a full copy of the entire document. Perhaps we’d try to copy just what is visible on the screen. This is a dangerous game to play.</p>    <p>Immutability does not solve all of the problems in this scenario, however. Snapshots of any kind lead to a subtle issue that is familiar to those with experience doing multimaster, in which multiple parties have conflicting views on what “the” data ought to be, and in which these views must be reconciled.</p>    <p>In this particular case, the spellchecker sends the results back to the task which spawned it, and presumably owns the document, when it has finished checking some portion of the document. Because the spellchecker was working with an immutable snapshot, however, its answer may now be out-of-date. We have turned the need to deal with message-level interleaving – as described above – into the need to deal with all of the messages that may have interleaved within a window of time. 
This is where multimaster techniques, such as diffing and merging come into play. Other techniques can be used, of course, like cancelling and ignoring out-of-date results. But it is clear something intentional must be done.</p> </blockquote>  <p>Joe seems to be working on an <a href="http://www.bluebytesoftware.com/blog/2010/09/18/WeAreHiring.aspx">experimental operating system</a> based on functional programming ideas.</p>  <p>What fascinates me in this post is how similar much of my thinking is to Microsoft’s regarding running background tasks concurrently with editing and using multimaster synchronization techniques.</p></div>
</content>


    </entry>
    <entry>
        <title>Apple &amp; Xerox</title>
        <link rel="alternate" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2012/08/apple-xerox.html" />
        <link rel="service.edit" type="application/atom+xml" href="http://www.typepad.com/t/atom/weblog/blog_id=7693/entry_id=6a00d8345242f069e20177445a3f7c970d" title="Apple &amp; Xerox" />
        <link rel="replies" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2012/08/apple-xerox.html" thr:count="0" />
        <id>tag:typepad.com,2003:post-6a00d8345242f069e20177445a3f7c970d</id>
        <published>2012-08-26T16:00:50-07:00</published>
        <updated>2012-08-26T23:00:50Z</updated>
        <summary>With Apple’s outright win in the trial Apple vs Samsung over trade dress (“cloning”) and patent issues, various people have argued for the importance of copying in the development of user interfaces, citing Apple’s copying of the Xerox Alto user interface. I came across two articles, explaining that there were substantial differences between Xerox and Apple’s graphical interfaces. The story of Apple’s copying appears to be in many ways a well-known myth. The Xerox Parc Visit On Xerox, Apple and Progress Apple employees had visited the research labs twice, which is a very limited amount of exposure time for substantial copying to take place. Xerox accepted shares of Apple in exchange for the visits. Apple was already at work on bitmapped...</summary>
        <author>
            <name>Wes</name>
        </author>
        
        
<content type="xhtml" xml:lang="en-us" xml:base="http://wesnerm.blogs.com/net_undocumented/">
<div xmlns="http://www.w3.org/1999/xhtml"><p>With Apple’s outright win in the trial <a href="http://mashable.com/2012/08/24/apple-samsung-verdict/">Apple vs Samsung</a> over trade dress (“cloning”) and patent issues, various people have argued for the importance of copying in the development of user interfaces, citing Apple’s copying of the Xerox Alto user interface.</p>  <p>I came across two articles, explaining that there were substantial differences between Xerox and Apple’s graphical interfaces. The story of Apple’s copying appears to be in many ways a well-known myth.</p>  <ul>   <li><a href="http://www-sul.stanford.edu/mac/parc.html">The Xerox Parc Visit</a></li>    <li><a href="http://www.folklore.org/StoryView.py?project=Macintosh&amp;story=On_Xerox,_Apple_and_Progress.txt&amp;topic=Origins&amp;sortOrder=Sort%20by%20Date&amp;detail=medium">On Xerox, Apple and Progress</a></li> </ul>  <p>Apple employees had visited the research labs twice, which is a very limited amount of exposure time for substantial copying to take place. Xerox accepted shares of Apple in exchange for the visits. Apple was already at work on bitmapped user interfaces including multiple fonts and graphical capabilities. The Mac and Lisa projects actually predated the visits. Engineers Bill Atkinson and Jef Raskin had already known of the work at Xerox or had interned there.</p>  <p>In addition, Apple’s graphical interfaces sported a number of improvements.</p>  <ul>   <li>Direct manipulation. A click on an object resulted in a menu appearing.</li>    <li>Drag and Drop</li>    <li>Overlapping windows</li>    <li>Automatic repaints.  Alto required a click for the window to redraw itself.</li>    <li>Finder</li> </ul>  <p>Some inventions such as the mouse were already known prior to the visit and underwent substantial improvements before commercialization.</p></div>
</content>


    </entry>
    <entry>
        <title>Continuing Education</title>
        <link rel="alternate" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2012/05/continuing-education.html" />
        <link rel="service.edit" type="application/atom+xml" href="http://www.typepad.com/t/atom/weblog/blog_id=7693/entry_id=6a00d8345242f069e20163055a4508970d" title="Continuing Education" />
        <link rel="replies" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2012/05/continuing-education.html" thr:count="3" thr:when="2012-09-19T08:29:24Z" />
        <id>tag:typepad.com,2003:post-6a00d8345242f069e20163055a4508970d</id>
        <published>2012-05-08T06:08:43-07:00</published>
        <updated>2012-05-08T16:09:52Z</updated>
        <summary>A large fraction of my time these days is spent viewing online courses. Coursera (featuring courses from Stanford, Berkeley, Michigan, Penn and Princeton) and edX (featuring MIT &amp; Harvard) are spearheading free online access to high-quality top university courses. Udacity, founded by former Stanford professor Sebastian Thrun, is also noteworthy, though the content is not as valuable as the other two. I’ve taken all but one of the Coursera classes. These courses have mostly been computer-science related, though the variety of the subjects taught will soon expand to the humanities and other subjects. For me, the most beneficial part is the overview refresher to the field of artificial intelligence, taught by the leading professors in the field: AI. This was...</summary>
        <author>
            <name>Wes</name>
        </author>
        
        
<content type="xhtml" xml:lang="en-us" xml:base="http://wesnerm.blogs.com/net_undocumented/">
<div xmlns="http://www.w3.org/1999/xhtml"><p>A large fraction of my time these days is spent viewing online courses. Coursera (featuring courses from Stanford, Berkeley, Michigan, Penn and Princeton) and edX (featuring MIT &amp; Harvard) are spearheading free online access to high-quality top university courses. Udacity, founded by former Stanford professor Sebastian Thrun, is also noteworthy, though the content is not as valuable as the other two. </p>  <p>I’ve taken all but one of the Coursera classes. These courses have mostly been computer-science related, though the variety of the subjects taught will soon expand to the humanities and other subjects.</p>  <p>For me, the most beneficial part is the overview refresher to the field of artificial intelligence, taught by the leading professors in the field:</p>  <ul>   <li>AI. This was taught by Norvig (95% of AI courses use his text) and Thrun </li>    <li>Machine Learning </li>    <li>Natural Language Processing by Manning and Jurafsky </li>    <li>Probabilistic Graphical Models </li>    <li>Game Theory </li>    <li>Computer Vision </li>    <li>Programming a Robotic Car</li>    <li>Introduction to Robotics </li>    <li>Data Mining</li>    <li>Computational Neuroscience </li> </ul>  <p>Some are courses I have already taken before, but, since it’s been nearly two decades, there have been considerable advances in technology that warrant a second look.   </p>  <p>Another site that I spend a lot of time on is <a href="http://pluralsight-training.net">PluralSight Training (pluralsight-training.net)</a>, which has over 200 hardcore developer videos on a range of technology, mostly targeting the Microsoft platform, but also including other popular topics such as mobile and cloud computing. 
At the current rate that I am consuming these PluralSight courses, by the end of this year or next, I should be thoroughly familiar with nearly every aspect of the latest Microsoft technologies as well as the most common mobile, web, and cloud APIs and services. PluralSight has an annual subscription package costing from $299 for web and mobile access to course videos to $499 (which includes assessments and certificates).</p>  <p>I have attempted various study programs over the past decade. For instance, I collected numerous online textbooks to read, but I find that many texts are difficult to go through without the foundation of an introductory course. Online course lectures require less conscious effort and there is also less chance that I will skip over the boring parts.</p>  <p>I have marveled at how some students acquire university knowledge at an accelerated pace and have sought to replicate their high rate of knowledge acquisition. For instance, Scott Young is attempting an <a href="http://www.scotthyoung.com/blog/mit-challenge/">MIT challenge</a>, which is to learn four years of MIT OpenCourseWare material in a period of 12 months. With fourteen courses already completed so far this year, I am well on my way to acquiring another bachelor’s degree worth of knowledge (typically requiring four years) by the end of this year, a feat that I will likely repeat each upcoming year. I am so excited.</p></div>
</content>


    </entry>
    <entry>
        <title>Rise of Big Data, Machine Learning and Data Mining</title>
        <link rel="alternate" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2012/01/rise-of-big-data-machine-learning-and-data-mining.html" />
        <link rel="service.edit" type="application/atom+xml" href="http://www.typepad.com/t/atom/weblog/blog_id=7693/entry_id=6a00d8345242f069e20167611dae07970b" title="Rise of Big Data, Machine Learning and Data Mining" />
        <link rel="replies" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2012/01/rise-of-big-data-machine-learning-and-data-mining.html" thr:count="0" />
        <id>tag:typepad.com,2003:post-6a00d8345242f069e20167611dae07970b</id>
        <published>2012-01-26T07:16:30-08:00</published>
        <updated>2012-01-26T15:17:15Z</updated>
        <summary>My approach in artificial intelligence has primarily been symbolic, and, in prior posts on AI, I indicated my skepticism of machine learning and other statistical techniques as a valid long-term approach to solving problems. With supervised learning techniques, it was possible to construct a function from inputs to output by learning from data. However, in many cases, particularly neural networks, the function remains a black box from which no model can be extracted to perform more complicated types of reasoning. This is not entirely true. In reality, neural networks involve a set of matrix calculations, which can be explored, and some techniques such as Bayesian models do offer multi-directional, not just bidirectional, inference in which the...</summary>
        <author>
            <name>Wes</name>
        </author>
        
        
<content type="xhtml" xml:lang="en-us" xml:base="http://wesnerm.blogs.com/net_undocumented/">
<div xmlns="http://www.w3.org/1999/xhtml"><p>My approach in artificial intelligence has primarily been symbolic, and, in prior posts on <a href="http://wesnerm.blogs.com/net_undocumented/2004/06/ai.html">AI</a>, I indicated my skepticism of machine learning and other statistical techniques as a valid long-term approach to solving problems. With supervised learning techniques, it was possible to construct a function from inputs to output by learning from data. However, in many cases, particularly neural networks, the function remains a black box from which no model can be extracted to perform more complicated types of reasoning. This is not entirely true. In reality, neural networks involve a set of matrix calculations, which can be explored, and some techniques such as Bayesian models do offer multi-directional, not just bidirectional, inference in which the sought probabilities of any node in the graph may be conditioned on any other nodes.</p>  <p>I spoke with a former Harvard classmate of mine, who pursued a PhD in Natural Language Processing at Harvard under the tutelage of Professor Stuart Shieber, who also interested me in natural language. My classmate went into Microsoft Research after obtaining his degree, only to leave the field of NLP for a director of program management position in the product groups, because he felt that we still don't really understand natural language. Given that natural language processing is the basis of some of my work and that I have developed effective approaches to incorporating natural language understanding in the products that I develop, the comment was somewhat disheartening. Later, after reviewing his CV, I discovered that his entire body of work in natural language processing was focused on statistical techniques, which to me offer easy heuristics but very little of the explanatory power that only a real model could provide. 
Also, my focus has been more on natural language manipulation, which is more tractable than inference, and on watching for any emergent intelligence properties that could reduce the need for the searches that inference would entail.</p>  <p>My gradual warming to machine learning techniques is the result of taking Andrew Ng's online courses on Machine Learning. I have read about neural networks independently and encountered many of the techniques multiple times in my applied math and management coursework--Bayesian modeling, Markov models, decision trees, regression, etc.--and even recognized their potential in programs by including some of these algorithms in my AI libraries; however, I never fully appreciated their power.</p>  <p>My warming also mirrors the gradual acceptance of these techniques by industry over the 1990s. Neural networks were initially discredited by a paper in the 1970s by a well-known researcher in AI; the limitations on the expressiveness of neural networks were later overcome and the field exploded. In economics, the term data mining was once looked upon with disdain and not regarded as serious research, but mathematical rigor combined with the growing volume of data of the digital age turned it into one of the hottest subject areas in the discipline. Machine learning reduces the need to discover models yet yields good approximate results.</p>  <p>Peter Norvig, co-author of <i>Artificial Intelligence: A Modern Approach</i>, the leading AI text with 95% market share, recently gave a presentation on the rise of big data and machine learning. He is currently the director of Research at Google, where he applies AI techniques to make sense of the vast amounts of web data crawled by the search engine. Peter Norvig also followed the transition from symbolic AI with his books. 
His first text on AI, written in 1992, was <i>Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp</i>, incorporating only symbolic approaches; the second text, mentioned earlier, consists mostly of non-symbolic approaches. </p>  <p>His work at Google led him to write about the rise of data in the famous paper, <a href="http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/35179.pdf">The Unreasonable Effectiveness of Data</a>. Statistical approaches have automated and revolutionized natural language parsing and machine translation. In many cases, these proved superior to more expensive, human-involved efforts. For instance, Chinese machine translation was automated without a single developer knowing the Chinese language.</p>  <p>In the lecture "<a href="http://www.youtube.com/watch?v=HT540VrCDwg&amp;feature=youtu.be">Innovation in Search and Artificial Intelligence</a>," Peter Norvig describes the rationale behind the movement from previous approaches to automated statistical approaches.</p>  <p>Below, I have included some of his remarks.</p>  <blockquote>   <p>First I want to talk about the way we understand the world and make models of the world and try to get them to our computers and make sense. This is the process of theory formation. Here's a guy. We call him Isaac and he makes some observations of the world. 
Then he gets an idea and decides to formulate the idea into the form of a theory or model.</p>    <p><a href="http://wesnerm.blogs.com/.a/6a00d8345242f069e2016300289dc8970d-pi"><img style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="Image [4]" border="0" alt="Image [4]" src="http://wesnerm.blogs.com/.a/6a00d8345242f069e2016300289dd2970d-pi" width="508" height="329" /></a></p>    <p>Then you can apply the model to make predictions of the future.</p>    <p><a href="http://wesnerm.blogs.com/.a/6a00d8345242f069e20168e61f20f0970c-pi"><img style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="Image [2]" border="0" alt="Image [2]" src="http://wesnerm.blogs.com/.a/6a00d8345242f069e2016300289e1b970d-pi" width="507" height="332" /></a></p>    <p>It's great that approach works. But, of course, it could be thousands of years before we got someone who was smart enough to come up with a model like that. We need a process where we can iterate a lot faster--a more agile theory-making process to get those kinds of advances.</p>    <p>One of the problems of this approach of formulating theories like that is that essentially all models are wrong, but some are useful.  They all make approximations somehow. They don't model the world completely, but some of them are very useful, like the ones Isaac was using. 
So if you are going to be wrong anyways, the question is "is there some shortcut so that you can trade off development time to advance much faster, but that may be a little more wrong, but can still be more useful?"</p>    <p>Initially, computer programs were taught to behave in this manner:</p>    <p><a href="http://wesnerm.blogs.com/.a/6a00d8345242f069e20168e61f2166970c-pi"><img style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="Image [3]" border="0" alt="Image [3]" src="http://wesnerm.blogs.com/.a/6a00d8345242f069e20168e61f216f970c-pi" width="509" height="328" /></a></p>    <p>There's input, output and data, but computer science was this stuff in the middle. In the past few decades, processing power of computers has increased dramatically.</p> </blockquote> <p>He uses this example in many of his lectures. Traditionally, programs were the focus of artificial intelligence, but now the red circle has shifted to data. The program is no longer a custom-written component, but a generic learning algorithm (like a neural network) that takes data to learn from in order to produce the appropriate output for each input. The function is effectively determined by training data.</p>   <blockquote>   <p><a href="http://wesnerm.blogs.com/.a/6a00d8345242f069e2016300289e92970d-pi"><img style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="Image" border="0" alt="Image" src="http://wesnerm.blogs.com/.a/6a00d8345242f069e20167611dae00970b-pi" width="507" height="328" /></a></p> </blockquote>  <p>As if to emphasize the point, Norvig mentions how it was once believed that certain algorithms were inherently better than others. Algorithms were improved by tweaking them to incorporate more advanced models or additional variables. 
However, an interesting phenomenon occurs when more training data is fed to each of the algorithms. As the size of the data increases by factors of 10 from sample sizes of thousands to billions, the performance rankings of the algorithms change positions. At some point, the performance of each algorithm approaches an asymptote, whereby additional data really doesn't add much more information. The simpler algorithms often outperform the more advanced ones.</p></div>
</content>


    </entry>
    <entry>
        <title>Microsoft AI Initiatives</title>
        <link rel="alternate" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2012/01/microsoft-ai-initiatives.html" />
        <link rel="service.edit" type="application/atom+xml" href="http://www.typepad.com/t/atom/weblog/blog_id=7693/entry_id=6a00d8345242f069e20167610b2a5a970b" title="Microsoft AI Initiatives" />
        <link rel="replies" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2012/01/microsoft-ai-initiatives.html" thr:count="0" />
        <id>tag:typepad.com,2003:post-6a00d8345242f069e20167610b2a5a970b</id>
        <published>2012-01-25T01:42:31-08:00</published>
        <updated>2012-01-25T09:43:03Z</updated>
        <summary>Several computer science classes focus on algorithms. These include classes in data structures, artificial intelligence, computer graphics and numerical computing. Some of these data structures are quite involved and I have felt that they should be incorporated inside system libraries. Many of the classical data structures became staples of standard libraries in the 1990s, such as the Standard Template Library of C++, and of the frameworks included with the Java and .NET runtimes. However, libraries for numerical computing (manipulating matrices and performing statistics), handling artificial intelligence, or doing computational geometry have still not become first-class citizens in modern APIs, although 3D graphics do have some presence. There has been some recent activity in developing consumable AI...</summary>
        <author>
            <name>Wes</name>
        </author>
        <category scheme="http://www.sixapart.com/ns/types#category" term=".NET" />
        
        
<content type="xhtml" xml:lang="en-us" xml:base="http://wesnerm.blogs.com/net_undocumented/">
<div xmlns="http://www.w3.org/1999/xhtml"><p>Several computer science classes focus on algorithms. These include classes in data structures, artificial intelligence, computer graphics and numerical computing. </p>  <p>Some of these data structures are quite involved and I have felt that they should be incorporated inside system libraries. Many of the classical data structures became staples of standard libraries in the 1990s, such as the Standard Template Library of C++, and of the frameworks included with the Java and .NET runtimes. However, libraries for numerical computing (manipulating matrices and performing statistics), handling artificial intelligence, or doing computational geometry have still not become first-class citizens in modern APIs, although 3D graphics do have some presence.</p>  <p>There has been some recent activity in developing consumable AI libraries in the past few years at Microsoft.</p>  <p>With SQL Server 2005, Microsoft <a href="http://msdn.microsoft.com/en-us/library/ms345131(v=sql.90).aspx">incorporated various AI and data mining packages</a>: decision trees, association rules, naïve Bayes, sequence clustering, time series, neural nets, and text mining. A few years ago, Microsoft developed the <a href="http://msdn.microsoft.com/en-us/devlabs/hh145003">Microsoft Solver Foundation</a> libraries that include optimization, solvers, and latent term-rewriting functionality. A <a href="http://www.microsoft.com/mscorp/execmail/2010/05-17HPC.mspx">Technical Computing Initiative</a> was launched, but some of the players involved have left the company and the output from the initiative remains to be seen. The goals of this initiative are also unclear.</p>  <p>Microsoft has long made available a Speech API, but its recognition capabilities are somewhat weak and frustrating. 
There is still no general-purpose Natural Language API; this is somewhat complicated by the need to support multiple languages.</p>  <p>Recent developer events have introduced new libraries from research: </p>  <ul>   <li><a href="http://research.microsoft.com/en-us/um/cambridge/projects/infernet/">Infer.NET</a> supports probabilistic inference. The application of this library, though, is quite limited. </li>    <li>A more promising library called <a href="http://www.microsoftpdc.com/2009/SVR32">Semantic Engine</a> includes a range of technologies from Machine Learning, Computer Vision, Natural Language and others.</li> </ul>  <p><a href="http://wesnerm.blogs.com/.a/6a00d8345242f069e2016300162aef970d-pi"><img style="background-image: none; border-bottom: 0px; border-left: 0px; padding-left: 0px; padding-right: 0px; display: inline; border-top: 0px; border-right: 0px; padding-top: 0px" title="mse-slide-architecture" border="0" alt="mse-slide-architecture" src="http://wesnerm.blogs.com/.a/6a00d8345242f069e2016300162af9970d-pi" width="487" height="370" /></a></p>  <p>There are some downsides to most of these new libraries. They are based on managed code and currently have restrictions that prohibit non-internal commercial use. </p></div>
</content>


    </entry>
    <entry>
        <title>Leverage in the Software Business</title>
        <link rel="alternate" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2012/01/leverage-in-the-software-business.html" />
        <link rel="service.edit" type="application/atom+xml" href="http://www.typepad.com/t/atom/weblog/blog_id=7693/entry_id=6a00d8345242f069e2016760f91fed970b" title="Leverage in the Software Business" />
        <link rel="replies" type="text/html" href="http://wesnerm.blogs.com/net_undocumented/2012/01/leverage-in-the-software-business.html" thr:count="0" />
        <id>tag:typepad.com,2003:post-6a00d8345242f069e2016760f91fed970b</id>
        <published>2012-01-23T16:17:15-08:00</published>
        <updated>2012-01-24T00:18:26Z</updated>
        <summary>It’s a great time to be in the software business, because there are many levers available to quickly produce products. Open Source. In recent years, open source has become a true phenomenon. One can find libraries for advanced technologies that are competitive with research offerings from the likes of Google and Microsoft. Even Google relies heavily on open source, which may be a key reason it iterates faster than Microsoft, which develops most of its software in-house. For instance, Chrome, itself based on the WebKit open source project, uses over 80 other open-source libraries credited in its About box. From machine translation to text-to-speech to optical character recognition to computer vision to numerical computing to video processing to GIS, the range...</summary>
        <author>
            <name>Wes</name>
        </author>
        <category scheme="http://www.sixapart.com/ns/types#category" term="Entrepreneurship" />
        
        
<content type="xhtml" xml:lang="en-us" xml:base="http://wesnerm.blogs.com/net_undocumented/">
<div xmlns="http://www.w3.org/1999/xhtml"><p>It’s a great time to be in the software business, because there are many levers available to quickly produce products.</p>  <p><strong>Open Source</strong>. </p>  <p>In recent years, open source has become a true phenomenon. One can find libraries for advanced technologies that are competitive with research offerings from the likes of Google and Microsoft. Even Google relies heavily on open source, which may be a key reason it iterates faster than Microsoft, which develops most of its software in-house. For instance, Chrome, itself based on the WebKit open source project, uses over 80 other open-source libraries credited in its About box. From machine translation to text-to-speech to optical character recognition to computer vision to numerical computing to video processing to GIS, the range of competencies that open source offers to the new startup is breathtaking. In addition to the traditional source code repositories like SourceForge and CodeProject, many platform and book samples as well as course code offer ready-to-use technology.</p>  <p><strong>Cross-platform languages</strong>. </p>  <p>Several cross-platform solutions have emerged (C#/Mono, Qt, AIR, HTML and Java) that allow products to be built on one platform such as Windows and quickly migrated to others such as mobile devices and the Mac.</p>  <p><strong>Open Data</strong>.</p>  <p>Besides source code, data (both raw numbers and media files) is available freely from the government, universities and elsewhere. Natural language information is available from the Linguistic Data Consortium. Data for mapping, demographics and nutrition is freely available from the government. Websites like infochimps.com serve as portals for these types of data files.</p>  <p><strong>Component Libraries</strong>.</p>  <p>For hard-to-obtain source code and data, there are companies that offer access for small sums. User interface libraries are pervasive. 
Nuance licenses its speech recognition technology for other companies to use within their products.</p>  <p><strong>Web Services</strong>.</p>  <p>Web APIs potentially offer instant access to valuable services on the Web, though they tend to be less stable than OS-specific APIs. Nick Bradbury wrote of the <a href="http://nick.typepad.com/blog/2011/11/the-long-term-failure-of-web-apis.html">long-term failure of Web APIs</a>, because web APIs have to be maintained continuously and any software that relies on them will need to be updated over time and could potentially break in the future.</p>  <p>A software company could provide its own gateway web service to ameliorate this situation, so that the client application does not have to change. Another advantage of this approach is that the company may use GPL code that would otherwise not be commercially viable.</p></div>
</content>


    </entry>
 
</feed>
