<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" version="2.0">

<channel>
	<title>Musings</title>
	
	<link>http://cbeer.info/blog</link>
	<description />
	<lastBuildDate>Tue, 21 Jun 2011 14:18:52 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3-aortic-dissection</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/cbeer" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="cbeer" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Open Repositories Developer Challenge: Microservices</title>
		<link>http://cbeer.info/blog/2011/open-repositories-developer-challenge-microservices/</link>
		<comments>http://cbeer.info/blog/2011/open-repositories-developer-challenge-microservices/#comments</comments>
		<pubDate>Tue, 21 Jun 2011 14:18:52 +0000</pubDate>
		<dc:creator>chris</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://cbeer.info/blog/?p=537</guid>
		<description><![CDATA[As part of the Developer Challenge at Open Repositories 2011, Jessie Keck, Michael Klein, Bess Sadler and I submitted the emergent community for Ruby-based curation microservices. While I had written some initial code in late 2010, I only intended to &#8230; <a href="http://cbeer.info/blog/2011/open-repositories-developer-challenge-microservices/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>As part of the <a href="http://devcsi.ukoln.ac.uk/blog/dev-challenge-or11/">Developer Challenge</a> at Open Repositories 2011, <a href="https://github.com/jkeck">Jessie Keck</a>, <a href="https://github.com/mbklein">Michael Klein</a>, <a href="http://github.com/bess/">Bess Sadler</a> and I submitted the emergent community for <a href="https://github.com/microservices">Ruby-based curation microservices</a>. While I had written some initial code in late 2010, I only intended to experiment with the <a href="http://www.cdlib.org/services/uc3/curation/index.html">California Digital Library microservices</a> and explore how the microservices model could be used within an application, so it was never intended to be "production" ready.</p>
<p>Taking inspiration from Jim Jagielski's opening keynote "<a href="http://people.apache.org/~jim/presos/OR2011/Open_Source_NotJust.pdf">Open Source: It’s just not for IT anymore! (pdf)</a>", we wanted to help foster a community around the microservices, so we took a number of initial steps to turn the various implementations of Ruby microservices into a more community-driven, collaborative project:</p>
<ol>
<li>Created a <a href="https://github.com/microservices">microservices</a> "organization" on GitHub to hold the community-driven source code repositories. Previously, the projects were held under a personal account alongside a variety of other projects in various states of use and support. By creating a topic-driven organization, we hope to attract contributors and make these projects easier to discover.</li>
<li>Created a <a href="http://groups.google.com/group/ruby-microservices">mailing list</a> to record decisions, answer questions, and collaborate.</li>
<li>Agreed to <a href="http://groups.google.com/group/ruby-microservices/browse_thread/thread/1654356b2bb89006">a set of standards and practices</a> for microservices projects to ensure consistency and quality across these projects:
<ol>
<li>Basic "meta" files -- like README, TODO, LICENSE, etc. -- should be present and contain enough information to help people get started using and contributing to the projects.</li>
<li>Clarified source code licenses, and standardized on the Apache License 2.0 for each project.</li>
<li>Vastly improved the source code testing and documentation coverage, and standardized around <a href="http://relishapp.com/rspec">RSpec</a> and <a href="http://yardoc.org">YARD</a>. Projects are now subject to <a href="http://hudson.projecthydra.org/">continuous integration</a> to ensure tests pass, documentation is built, and test coverage remains high.</li>
</ol>
</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://cbeer.info/blog/2011/open-repositories-developer-challenge-microservices/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Open Repositories '11 presentation</title>
		<link>http://cbeer.info/blog/2011/or11/</link>
		<comments>http://cbeer.info/blog/2011/or11/#comments</comments>
		<pubDate>Fri, 10 Jun 2011 19:50:00 +0000</pubDate>
		<dc:creator>chris</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://cbeer.info/blog/?p=532</guid>
		<description><![CDATA[slides (pdf) Managing digital media content adds different challenges to file management than traditional text and images. The content is time based, and therefore more complex. Even the metadata needed to describe all aspects of the content to support better &#8230; <a href="http://cbeer.info/blog/2011/or11/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><span style="font-weight: bold; float: right"><a href="http://cbeer.info/~chris/or11.pdf">slides (pdf)</a></span></p>
<p>Managing digital media content poses different file management challenges than traditional text and images. The content is time-based, and therefore more complex. Even the metadata needed to describe all aspects of the content to support better access is more complicated. Even after media materials have been cataloged, digitized, and stored in a repository or database, scholars and archivists lack the tools to manage and expose the data to the world. Significant workflow challenges exist in going from large, preservation-quality digital files to media appropriate for delivery across the Internet.</p>
<p>The WGBH Media Library and Archive department (MLA) manages a collection of over 750,000 items dating back to the late 1940s. As an educational foundation and the creator of a valuable collection of media resources, WGBH has embraced new developments in online media in its efforts to bring its archived materials to a broader audience and to serve the needs of the academic community. WGBH is successful in exposing content for the public through national production websites such as American Experience, FRONTLINE and NOVA, whose customized and carefully constructed features and services create added value for end users by encouraging the dissemination and use of WGBH-owned content. As at many similar institutions, this has been supported by the deployment of many ad-hoc, siloed content management systems on a project-by-project basis, with each portal maintaining unique metadata and media assets, making it difficult to create new, innovative interfaces and services with the underlying content.</p>
<p>In 2000, in partnership with a vendor, WGBH developed a DAM architecture for media access and <a href="http://daminfo.wgbh.org">published reference architecture documentation</a> for other media organizations to replicate the work. The preservation DAM system is based on a proprietary system from the publishing and creative industries with limitations for metadata structure and interface. The vendor tended to develop the system toward what they saw as market trends and viable business sales. WGBH has found that although the system works, it is not flexible to the changing needs of the media industry, and the vendor is unable to tailor the software to our particular user needs without significant additional investment. In addition, upgrades are costly and time consuming, and all of the site-specific customizations built around the software need simultaneous upgrading by internal teams (e.g. extensive customizations to support media ingestions of large video files requiring limited technical knowledge). The customization links often break and need to be rewritten with every upgrade.</p>
<h2>Access</h2>
<p>To address the need to expose archival content in a sustainable manner, for a variety of audiences, and to encourage innovation within media archives, WGBH created Open Vault, which provides a digital access portal into a cross-section of material from the WGBH Media Library and Archives. Although designed as an access portal, a secondary objective in creating Open Vault was to explore the potential for the system to fit within the multifaceted content management ecosystem for both access and preservation use.</p>
<p>WGBH Open Vault is built using Blacklight, Solr and the Fedora repository. Beyond the Open Vault user interface, we exposed a number of APIs, either for internal use or to support existing data exchange projects, including Atom/RSS feeds, unAPI, oEmbed, and OAI-PMH. By taking advantage of existing open-source solutions as much as possible, we were able to focus our efforts on domain-relevant issues. This has proven a reliable platform, and we have since deployed similar technology for a couple of cross-institutional, data-intensive projects.</p>
<p>In 2006, WGBH launched Open Vault, an access repository based on CWIS. This site combined clips of media assets from four different series (three of which had separate finding aid websites created earlier).</p>
<p>In 2008/9, WGBH MLA and Interactive completed an Andrew W. Mellon Foundation funded project which allowed us to work closely with humanities scholars, researching their needs and habits in using digital media in their work. We developed a prototype, dubbed "Open Vault Research", using Fedora and a PHP front-end. One discovery was that scholars lack tools for working with media, while traditional scholarship is still focused on citing textual resources. To address this, we created a number of tools for working with media material:<br />
- aligned transcripts, which allow the user to rapidly scan the transcript of an interview, and seek immediately to a section of interest;<br />
- annotations + tags, which allow the user to segment and describe media fragments and refer back to those notes later;<br />
- fragment addressing, which allows the savvy user to deep-link into a particular point in an object.</p>
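<p>As an illustration of fragment addressing, here is a minimal sketch of reading a start time from a URL fragment. It assumes the temporal syntax of the W3C Media Fragments URI recommendation (<i>#t=90</i> or <i>#t=1:30</i>); the function name <i>parseStartTime</i> is hypothetical and is not taken from the Open Vault code, which may use a different scheme.</p>
<pre name="code" class="javascript">
// Hypothetical helper: read a start offset (in seconds) from a URL fragment
// such as "#t=90", "#t=1:30" or "#t=0:01:30,0:02:00" (start,end).
function parseStartTime(hash) {
  var match = /^#t=(?:npt:)?([0-9:.]+)/.exec(hash || "");
  if (match === null) {
    return null; // no temporal fragment present
  }
  // "1:02:03" becomes 1, 2, 3; each colon multiplies the running total by 60
  var seconds = match[1].split(":").reduce(function (total, part) {
    return total * 60 + parseFloat(part);
  }, 0);
  return isNaN(seconds) ? null : seconds;
}
</pre>
<p>A page script could pass <i>window.location.hash</i> to this function on load and, if the result is not null, seek the player to that offset.</p>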
<p>Taking these user needs into account, we developed Open Vault v2 using Blacklight and the Fedora repository.</p>
<p>Finally, we are about to deploy a new iteration of Open Vault using Blacklight 3.0 (and, as a footnote, although our application has significantly different behavior, the customizations are only about 3500 lines of code, more than half of which are HTML templates). Although the Hydra framework has matured significantly since the beginning of the project, because the management of the media and metadata is still performed in external systems, we continue to access the Fedora APIs directly.</p>
<p>In this redesign, we looked at usage patterns over the collection and re-organized and re-prioritized elements of the user experience.<br />
 - The majority of our users entered the website at a record page from an external search engine (with about a 50% bounce rate). However, if a user stayed and watched a video, they would often navigate the website to "related content" (exposed using Solr's MoreLikeThis feature).</p>
<p>- Subject browse was used more frequently than expected to give an overview of the materials in the collection.</p>
<h2>Re-use</h2>
<p><a href="http://www.digitalcommonwealth.org/browse/?q=wgbh&#038;go.x=0&#038;go.y=0">Digital Commonwealth</a> is an on-going project, to which we began contributing material from our first iteration of Open Vault using OAI-PMH.</p>
<p><a href="http://projectvietnam.ccnmtl.columbia.edu">Project Vietnam</a> was a collaboration with the Columbia Center for New Media, Teaching and Learning, that embedded material from Vietnam: A Television History that we exposed on Open Vault. We spent a significant amount of time figuring out how to exchange media and metadata with CCNMTL and settled on a handful of open standards.</p>
<p>The Mozilla Foundation/WebMadeMovies also wanted to work with us around HTML5-based media experiments. To develop a <a href="http://code.chirls.com/cpb/wgbh/#org.wgbh.mla:ecede706a8e277ab3b80052325707476364ab69e\0">quick demonstrator</a>, Mozilla, with little assistance or guidance from the Open Vault team, was able to successfully build a JavaScript-based discovery interface using our OpenSearch API, and integrate both our video content and TEI-encoded transcripts into their popcorn.js environment.</p>
<h2>Technology</h2>
<p>For our media player environment, we needed a technology that supports several requirements:</p>
<ul>
<li>the ability to jump into any point of an item, which is especially important when serving hour-long raw interviews (which excludes standard delivery (over HTTP, or otherwise) of the content),</li>
<li>an open source, or low cost, delivery platform,</li>
<li>a robust JavaScript API that allows us, at a minimum, to programmatically adjust the playhead (which we use to provide media/transcript synchronization, "deep linking" into a video, and annotation of media fragments).</li>
</ul>
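<p>To make the media/transcript synchronization requirement concrete, here is a minimal sketch of the lookup it implies: given the playhead position reported by a player, find the transcript segment to highlight. The segment shape and the function name <i>findSegment</i> are assumptions for illustration; the player API itself is not shown, and this is not the Open Vault implementation.</p>
<pre name="code" class="javascript">
// Hypothetical transcript data: segments sorted by start time (in seconds).
var segments = [
  { start: 0, text: "Introduction" },
  { start: 42.5, text: "First question" },
  { start: 130, text: "Follow-up" }
];

// Binary search: return the index of the last segment that starts at or
// before `time`, or -1 if the playhead is before the first segment.
function findSegment(list, time) {
  var lo = 0;
  var hi = list.length - 1;
  var found = -1;
  while (hi >= lo) {
    var mid = Math.floor((lo + hi) / 2);
    if (time >= list[mid].start) {
      found = mid; // candidate; keep looking for a later segment
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}
</pre>
<p>A time-update callback from the player could call <i>findSegment</i> and highlight the result; in the other direction, clicking a segment would set the playhead to that segment's <i>start</i>.</p>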
<p>We finally settled on a Flash-based player (which provides a more consistent user experience) with an HTML5-based fallback (to support iOS devices and others).</p>
<p>For delivery, we're using H.264 pseudostreaming, which is delivered over HTTP (and is fully compatible with traditional HTTP delivery for clients that don't support H.264 pseudostreaming, which makes alternate uses easier).</p>
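<p>A rough sketch of what a pseudostreaming seek looks like from the client side: the player requests the same file again with a start offset, and a client that omits the offset simply gets the whole file over plain HTTP. The query parameter name <i>start</i> and the function name <i>pseudostreamUrl</i> are assumptions here; the parameter name depends on the server module in use, which is not specified above.</p>
<pre name="code" class="javascript">
// Hypothetical helper: build the URL a player would request when seeking.
// The "start" parameter name is an assumption, not a documented Open Vault API.
function pseudostreamUrl(mediaUrl, startSeconds) {
  if (!(startSeconds > 0)) {
    return mediaUrl; // plain HTTP delivery from the beginning of the file
  }
  var url = new URL(mediaUrl);
  url.searchParams.set("start", String(startSeconds));
  return url.toString();
}
</pre>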
<p>To support ease of reuse, we adopted the principle that the “Website is the API”, and in so doing, were able to bake in discoverable standards and approaches that are replicable to other holdings and implementations. This approach included semantic markup, adding additional contextual information (as alternative link relations), and exposing user state information (e.g. the location of the playhead) within the page content.</p>
<p>To support advanced usage, we also expose a number of auto-discoverable APIs which expose structured information to recompose page elements without needing to parse web page HTML.</p>
<h3>OAI-PMH</h3>
<p>To support "traditional" aggregation, like the Digital Commonwealth project, we have an OAI-PMH endpoint. (see also why OAI-PMH should die)</p>
<h3>OpenSearch (blacklight)</h3>
<p>For other aggregation efforts, we provide an OpenSearch endpoint that allows simple machine-to-machine discovery in a standard way.</p>
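<p>For a sense of what "machine-to-machine discovery" involves, here is a minimal sketch of filling in an OpenSearch 1.1 URL template, the string a client reads from an OpenSearch description document. The template URL and the function name <i>fillTemplate</i> are hypothetical; only the <i>{searchTerms}</i> and <i>{startPage?}</i> placeholder syntax comes from the OpenSearch specification.</p>
<pre name="code" class="javascript">
// Fill an OpenSearch 1.1 URL template. Parameters written as {name?} are
// optional and become empty strings when no value is supplied.
function fillTemplate(template, values) {
  return template.replace(/\{([A-Za-z:]+)(\??)\}/g, function (whole, name, optional) {
    if (Object.prototype.hasOwnProperty.call(values, name)) {
      return encodeURIComponent(values[name]);
    }
    if (optional === "?") {
      return "";
    }
    throw new Error("Missing required OpenSearch parameter: " + name);
  });
}

// Hypothetical template, as it might appear in a description document.
var exampleTemplate = "http://example.org/opensearch/{searchTerms}?page={startPage?}";
var exampleUrl = fillTemplate(exampleTemplate, { searchTerms: "vietnam war", startPage: 2 });
</pre>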
<h3>Atom/RSS (blacklight)</h3>
<p>All search results expose a discoverable Atom/RSS feed. Blacklight also provides functionality through the Document Extension Framework that allows clients to request specific representations of objects as part of the content of the feed.</p>
<h3>unAPI (blacklight plugin)</h3>
<p>The unAPI endpoint allows applications to discover structured information based on an identifier and a content type.</p>
<h3>oEmbed (blacklight plugin)</h3>
<p>Rather than forcing implementors to discover media assets (through page scraping or unAPI), oEmbed allows a client to discover the embeddable properties for an asset (and construct a player) in a standard way. It provides an easily parseable set of metadata required for embedding and, possibly, a pre-generated player implementation.</p>
<h3>HTML &lt;meta&gt; tags</h3>
<p>While encouraging re-use of materials, we documented possible improvements to make ad-hoc innovation and mash-up creation significantly easier, including:</p>
<ul>
<li>oEmbed: rather than forcing implementors to discover media assets (through page scraping or unAPI), introspect the assets for technical metadata, and then construct a player, oEmbed provides an easily parseable set of metadata required for embedding and, possibly, a pre-generated player implementation;</li>
<li>additional information in the Atom/RSS feeds, in particular ensuring the data contained within the feed representations is comparable to the normal user interface;</li>
<li>and, exposing additional information on the page for developer-use, which, in the case of technical or rights metadata, is less relevant to our primary audience, but may be essential to building third-party interfaces to content.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://cbeer.info/blog/2011/or11/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>jQuery UI Autocomplete and LiquidMetal</title>
		<link>http://cbeer.info/blog/2011/jquery-ui-autocomplete-and-liquidmetal/</link>
		<comments>http://cbeer.info/blog/2011/jquery-ui-autocomplete-and-liquidmetal/#comments</comments>
		<pubDate>Fri, 08 Apr 2011 18:41:25 +0000</pubDate>
		<dc:creator>chris</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://cbeer.info/blog/?p=515</guid>
		<description><![CDATA[By default, the jQuery UI Autocomplete widget filters the source data using a very basic regular expression match: filter: function(array, term) { var matcher = new RegExp( $.ui.autocomplete.escapeRegex(term), "i" ); return $.grep( array, function(value) { return matcher.test( value.label &#124;&#124; value.value &#8230; <a href="http://cbeer.info/blog/2011/jquery-ui-autocomplete-and-liquidmetal/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>By default, the <a href="http://jqueryui.com/demos/autocomplete/">jQuery UI Autocomplete</a> widget filters the source data using a very basic regular expression match:</p>
<pre name="code" class="javascript">
	filter: function(array, term) {
		var matcher = new RegExp( $.ui.autocomplete.escapeRegex(term), "i" );
		return $.grep( array, function(value) {
			return matcher.test( value.label || value.value || value );
		});
	}
</pre>
<div style="font-size: 70%; text-align: right;">
<a href="https://github.com/jquery/jquery-ui/blob/master/ui/jquery.ui.autocomplete.js#L431">source</a></div>
<p>While this works, it doesn't provide relevancy ranking or near-matches, both of which are important when selecting from long lists of values that contain a significant number of obscure or little-known items.</p>
<p>To address this, I added a custom data source to the Autocomplete widget that uses the <a href="https://github.com/rmm5t/liquidmetal/">LiquidMetal</a> library, which is a refinement of the Quicksilver scoring algorithm.</p>
<pre name="code" class="javascript">
		source: function(request, response) {
			// An empty search term matches everything
			if (request.term === "") {
				return response(data);
			}

			// Score each value once and drop weak matches
			var arr = $.map(data, function(value) {
				var score = LiquidMetal.score(value, request.term);
				if (score < 0.5) {
					return null; // jQuery.map omits null values
				}
				return { value: value, score: score };
			});

			// Highest score first; the comparator must return a number, not a boolean
			arr.sort(function(a, b) { return b.score - a.score; });
			return response($.map(arr, function(item) { return item.value; }));
		}
</pre>
<div style="font-size: 70%; text-align: right;">
<a href="http://cbeer.info/~chris/liquidmetal-demo.html">demo</a></div>
<p>Surprisingly easy.</p>
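<p>The score, filter and sort steps can also be tried outside the widget. The sketch below is a standalone version that needs neither jQuery nor LiquidMetal: <i>rankMatches</i> is a hypothetical name, and <i>simpleScore</i> is a deliberately crude stand-in scorer, not the LiquidMetal algorithm. One detail worth noting is that <i>Array.prototype.sort</i> expects its comparator to return a negative, zero or positive number.</p>
<pre name="code" class="javascript">
// Stand-in scorer (NOT the LiquidMetal algorithm): 1 for a prefix match,
// 0.6 for a match elsewhere in the string, 0 otherwise.
function simpleScore(value, term) {
  var index = value.toLowerCase().indexOf(term.toLowerCase());
  if (index === 0) {
    return 1;
  }
  return index > 0 ? 0.6 : 0;
}

// Score every value, drop weak matches, and return the values best-first.
function rankMatches(values, term, scoreFn, threshold) {
  return values
    .map(function (value) {
      return { value: value, score: scoreFn(value, term) };
    })
    .filter(function (entry) {
      return entry.score >= threshold;
    })
    .sort(function (a, b) {
      return b.score - a.score; // numeric comparator: highest score first
    })
    .map(function (entry) {
      return entry.value;
    });
}
</pre>
<p>In the widget itself, <i>LiquidMetal.score</i> would be passed in place of <i>simpleScore</i>.</p>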
]]></content:encoded>
			<wfw:commentRss>http://cbeer.info/blog/2011/jquery-ui-autocomplete-and-liquidmetal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Blacklight OAI Demonstrator</title>
		<link>http://cbeer.info/blog/2011/blacklight-oai-demonstrator/</link>
		<comments>http://cbeer.info/blog/2011/blacklight-oai-demonstrator/#comments</comments>
		<pubDate>Tue, 29 Mar 2011 23:01:36 +0000</pubDate>
		<dc:creator>chris</dc:creator>
				<category><![CDATA[Experiments]]></category>

		<guid isPermaLink="false">http://cbeer.info/blog/?p=504</guid>
		<description><![CDATA[I recently put together a simple Blacklight-based OAI-PMH harvester (https://github.com/cbeer/blacklight-oai-demo). Created primarily as an experiment, it was prompted by Ed Corrado's Code4Lib-L thread " Simple Web-based Dublin Core search engine?" and some recent inquiries to the Blacklight community about using &#8230; <a href="http://cbeer.info/blog/2011/blacklight-oai-demonstrator/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I recently put together a simple <a href="http://projectblacklight.org">Blacklight-based</a> OAI-PMH harvester (<a href="https://github.com/cbeer/blacklight-oai-demo">https://github.com/cbeer/blacklight-oai-demo</a>). Created primarily as an experiment, it was prompted by Ed Corrado's Code4Lib-L thread "<a href="https://listserv.nd.edu/cgi-bin/wa?A2=ind1103&#038;L=CODE4LIB&#038;T=0&#038;F=&#038;S=&#038;P=73892">Simple Web-based Dublin Core search engine?</a>" and some recent inquiries to the Blacklight community about using Blacklight with non-MARC metadata. The whole experiment was surprisingly easy, thanks to <a href="http://github.com/edsu/ruby-oai">ruby-oai</a>. In its current form, you can configure OAI providers (with metadata formats and sets, using XSL transforms to convert the OAI-PMH record into Solr-ingestable XML), set up harvesting schedules, and use the standard Blacklight discovery framework. Finally, there is a minimal test suite (using <a href="https://github.com/myronmarston/vcr">VCR</a> to mock OAI-PMH requests to the <a href="http://memory.loc.gov/ammem/oamh/oai_request.html">Library of Congress</a>).</p>
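<p>For readers unfamiliar with the protocol, here is a minimal sketch of the request a harvester issues for each configured provider, metadata format and set. It only builds an OAI-PMH <i>ListRecords</i> URL; it is not how ruby-oai is invoked, and the function name <i>listRecordsUrl</i> and the example base URL are hypothetical.</p>
<pre name="code" class="javascript">
// Build an OAI-PMH ListRecords request URL.
// options: metadataPrefix, set, from, resumptionToken (all optional here).
function listRecordsUrl(baseUrl, options) {
  var params = new URLSearchParams();
  params.set("verb", "ListRecords");
  if (options.resumptionToken) {
    // A resumption token is an exclusive argument: it replaces the others.
    params.set("resumptionToken", options.resumptionToken);
  } else {
    params.set("metadataPrefix", options.metadataPrefix || "oai_dc");
    if (options.set) {
      params.set("set", options.set);
    }
    if (options.from) {
      params.set("from", options.from);
    }
  }
  return baseUrl + "?" + params.toString();
}

// Hypothetical provider and set name.
var firstPage = listRecordsUrl("http://example.org/oai", { set: "photos", from: "2011-03-01" });
</pre>
<p>A harvester would fetch <i>firstPage</i>, transform each record, and repeat with the <i>resumptionToken</i> from the response until none is returned.</p>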

<a href='http://cbeer.info/blog/2011/blacklight-oai-demonstrator/screen-shot-2011-03-29-at-6-42-43-pm/' title='Blacklight Search Results'><img width="150" height="150" src="http://cbeer.info/blog/wp-content/uploads/2011/03/Screen-shot-2011-03-29-at-6.42.43-PM-150x150.png" class="attachment-thumbnail" alt="Blacklight Search Results" title="Blacklight Search Results" /></a>
<a href='http://cbeer.info/blog/2011/blacklight-oai-demonstrator/screen-shot-2011-03-29-at-6-46-04-pm/' title='OAI Provider Configuration'><img width="150" height="150" src="http://cbeer.info/blog/wp-content/uploads/2011/03/Screen-shot-2011-03-29-at-6.46.04-PM-150x150.png" class="attachment-thumbnail" alt="OAI Provider Configuration" title="OAI Provider Configuration" /></a>

]]></content:encoded>
			<wfw:commentRss>http://cbeer.info/blog/2011/blacklight-oai-demonstrator/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Useful Standards for Public Media Projects: Linkbacks</title>
		<link>http://cbeer.info/blog/2010/useful-standards-for-public-media-projects-linkbacks/</link>
		<comments>http://cbeer.info/blog/2010/useful-standards-for-public-media-projects-linkbacks/#comments</comments>
		<pubDate>Fri, 12 Nov 2010 01:28:54 +0000</pubDate>
		<dc:creator>chris</dc:creator>
				<category><![CDATA[Public Media]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://cbeer.info/blog/?p=487</guid>
		<description><![CDATA[In the age of real-time web crawling, third-party comment services (e.g. Disqus), and a relatively standardized set of "engagement" platforms (Twitter, Facebook), the Linkback standards are probably less relevant than they used to be, but I believe they are still &#8230; <a href="http://cbeer.info/blog/2010/useful-standards-for-public-media-projects-linkbacks/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>In the age of real-time web crawling, third-party comment services (e.g. <a href="http://disqus.com/">Disqus</a>), and a relatively standardized set of "engagement" platforms (Twitter, Facebook), the <a href="http://en.wikipedia.org/wiki/Linkback">Linkback</a> standards are probably less relevant than they used to be, but I believe they are still an easy way to add a meaningful layer of serious communication across a variety of platforms. For all the architectural and social flaws behind the standards, collecting link data is trivial to implement and gives you control of some of the most important pieces of information one can collect: how people are discovering, discussing or re-using your content. One possible advantage is that both Trackback and Pingback are more-or-less opt-in standards, giving the participants some control over how broadly they want to advertise their discussion. As the public media community keeps pushing user engagement, this is just another tool in the toolbox (and already present in most established content management systems).</p>
<p>I've heard a rumor that trackback urls used to be a standard part of the PBS web site infrastructure years ago -- I'd be very interested to know what, if anything, was learned from collecting that data.</p>
]]></content:encoded>
			<wfw:commentRss>http://cbeer.info/blog/2010/useful-standards-for-public-media-projects-linkbacks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Useful Standards for Public Media Projects: unAPI</title>
		<link>http://cbeer.info/blog/2010/useful-standards-for-public-media-projects-unapi/</link>
		<comments>http://cbeer.info/blog/2010/useful-standards-for-public-media-projects-unapi/#comments</comments>
		<pubDate>Tue, 26 Oct 2010 12:30:49 +0000</pubDate>
		<dc:creator>chris</dc:creator>
				<category><![CDATA[Public Media]]></category>

		<guid isPermaLink="false">http://cbeer.info/blog/?p=471</guid>
		<description><![CDATA[The oEmbed standard is great for exposing embeddable content in a way that, most of the time, the average user never has to think about it -- it just works. However, with more complex data, including multiple content objects or underlying &#8230; <a href="http://cbeer.info/blog/2010/useful-standards-for-public-media-projects-unapi/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>The oEmbed standard is great for exposing embeddable content in a way that, most of the time, the average user never has to think about it -- it just works. However, with more complex data, including multiple content objects or underlying metadata formats, something less opinionated is necessary. This is the niche where <a href="http://unapi.info/">unAPI</a> plays a simple, but powerful, role. Like oEmbed and other standards, unAPI defines a service endpoint with a handful of basic operations and a discovery mechanism, and does so in a plain and obvious manner that makes it easy for tech-inclined folk to work with.</p>
<p>For an archives project at work, we implemented unAPI as a simple way to segment the page content and expose underlying metadata formats, in order to offer our partners a quick (and content agnostic) way to pull elements into existing tools. The API endpoint is exposed within the page header:</p>
<pre name="code" class="html">
&lt;link rel="unapi-server" type="application/xml" title="unAPI" href="http://openvault.wgbh.org/api/unapi/" /&gt;
</pre>
<div style="font-size: 70%; margin-top: -1em;">Source: <a href="http://openvault.wgbh.org/catalog/org.wgbh.mla:0119be4cad49d0c0f47e9eca1d343e0464539a4c">http://openvault.wgbh.org/catalog/org.wgbh.mla:0119be4cad49d0c0f47e9eca1d343e0464539a4c</a></div>
<p>Within the page, any number of unAPI IDs are embedded:</p>
<pre name="code" class="html">
&lt;abbr class="unapi-id" style="display: none" title="org.wgbh.mla:0119be4cad49d0c0f47e9eca1d343e0464539a4c"&gt;&lt;/abbr&gt;
</pre>
<div style="font-size: 70%; margin-top: -1em;">Source: <a href="http://openvault.wgbh.org/catalog/org.wgbh.mla:0119be4cad49d0c0f47e9eca1d343e0464539a4c">http://openvault.wgbh.org/catalog/org.wgbh.mla:0119be4cad49d0c0f47e9eca1d343e0464539a4c</a></div>
<p>An unAPI client can request a list of formats from the service:</p>
<pre name="code" class="xml">
&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;formats id="org.wgbh.mla:0119be4cad49d0c0f47e9eca1d343e0464539a4c"&gt;
  &lt;format type="application/xml" docs="http://www.openarchives.org/OAI/2.0/oai_dc.xsd" name="oai_dc"/&gt;
  &lt;format type="application/xml" docs="" name="pbcore"/&gt;
  &lt;format type="image/jpeg" name="jpeg"/&gt;
&lt;/formats&gt;
</pre>
<div style="font-size: 70%; margin-top: -1em;">Source: <a href="http://openvault.wgbh.org/api/unapi?id=org.wgbh.mla:0119be4cad49d0c0f47e9eca1d343e0464539a4c">http://openvault.wgbh.org/api/unapi?id=org.wgbh.mla:0119be4cad49d0c0f47e9eca1d343e0464539a4c</a></div>
<p>These formats could be used to share any type of data -- different flavors of metadata, content outside the application context. As with oEmbed, this provides a basic way to provide federated and aggregated data within a common framework. In addition to being a convenient service to share content among applications, there is also support for unAPI within the <a href="http://www.zotero.org/">Zotero citation management tool</a>.</p>
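<p>The unAPI specification defines three requests against the service: no parameters (formats the server supports in general), <i>id</i> alone (formats for one object), and <i>id</i> plus <i>format</i> (the object itself in that format). A minimal client-side sketch of building those URLs follows; the function name <i>unapiUrl</i> and the example identifier are hypothetical.</p>
<pre name="code" class="javascript">
// Build one of the three unAPI request URLs:
//   unapiUrl(server)             lists formats the server supports
//   unapiUrl(server, id)         lists formats available for one identifier
//   unapiUrl(server, id, format) retrieves the object in that format
function unapiUrl(server, id, format) {
  var params = new URLSearchParams();
  if (id) {
    params.set("id", id);
    if (format) {
      params.set("format", format); // format is only meaningful with an id
    }
  }
  var query = params.toString();
  return query === "" ? server : server + "?" + query;
}

// Hypothetical identifier; a real client would read it from an
// element with class "unapi-id" on the page.
var formatsUrl = unapiUrl("http://example.org/unapi", "example:123");
var recordUrl = unapiUrl("http://example.org/unapi", "example:123", "oai_dc");
</pre>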
<p>What could this look like in order to access NPR.org content? Instead of a <a href="http://www.npr.org/api/index">rich API</a>, by simply feeding a story url into the unAPI service, an application could retrieve the different content elements -- a <i>text/html</i> or <i>text/plain</i> representation of the story, the <i>audio/mp3</i> from the broadcast, an <i>image/jpeg</i> feature image, and perhaps an <i>application/rss+xml</i> feed of the series, comments, or category. It lacks the power behind the stand-alone API, but provides data in a form that makes it a little easier to craft new ways of highlighting this content on station websites.</p>
]]></content:encoded>
			<wfw:commentRss>http://cbeer.info/blog/2010/useful-standards-for-public-media-projects-unapi/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Useful Standards for Public Media Projects: oEmbed</title>
		<link>http://cbeer.info/blog/2010/useful-standards-for-public-media-projects-oembed/</link>
		<comments>http://cbeer.info/blog/2010/useful-standards-for-public-media-projects-oembed/#comments</comments>
		<pubDate>Sun, 24 Oct 2010 15:36:01 +0000</pubDate>
		<dc:creator>chris</dc:creator>
				<category><![CDATA[Public Media]]></category>

		<guid isPermaLink="false">http://cbeer.info/blog/?p=450</guid>
		<description><![CDATA[oEmbed is a format for allowing an embedded representation of a URL on third party sites. The simple API allows a website to display embedded content (such as photos or videos) when a user posts a link to that resource, &#8230; <a href="http://cbeer.info/blog/2010/useful-standards-for-public-media-projects-oembed/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<blockquote><p>
oEmbed is a format for allowing an embedded representation of a URL on third party sites. The simple API allows a website to display embedded content (such as photos or videos) when a user posts a link to that resource, without having to parse the resource directly.
</p></blockquote>
<p>oEmbed allows the content provider to expose content from a web page (along with a basic set of metadata to support that), allowing an application to embed content on behalf of a user without assuming the user knows what to do with raw HTML code.</p>
<p>Here's example JSON output from YouTube:</p>
<pre name="code" class="javascript">
{"provider_url": "http:\/\/www.youtube.com\/",
"title": "NOVA | Emergency Mine Rescue",
"html": "&lt;object width=\"480\" height=\"295\"&gt;[...]&lt;\/object&gt;",
"author_name": "NOVAonline",
"height": 295,
"thumbnail_width": 480,
"width": 480,
"version": "1.0",
"author_url": "http:\/\/www.youtube.com\/user\/NOVAonline",
"provider_name": "YouTube",
"thumbnail_url": "http:\/\/i3.ytimg.com\/vi\/NUrLEKfHB_0\/hqdefault.jpg",
"type": "video",
"thumbnail_height": 360}
</pre>
<div style="font-size: 70%; text-align: right;">Source: <a href="http://www.youtube.com/oembed?url=http://www.youtube.com/watch?v=NUrLEKfHB_0">http://www.youtube.com/oembed?url=http://www.youtube.com/watch?v=NUrLEKfHB_0</a></div>
<p>When given a URL for which an oEmbed endpoint is defined or discoverable, an application can query the oEmbed service to retrieve the embed code and automatically insert it into the page. The great thing about this standard is that aggregating media from any compliant source is now as easy as writing text, with all the heavy lifting done in the background.</p>
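<p>A rough sketch of the two halves of that exchange: building the request URL and pulling the embed code out of the JSON response. The YouTube endpoint comes from the source link above; the function names are my own, and the network fetch itself is left out.</p>

```python
import json
from urllib.parse import urlencode

# Endpoint taken from the source link above; other providers publish their own.
YOUTUBE_OEMBED = "http://www.youtube.com/oembed"


def oembed_url(page_url, endpoint=YOUTUBE_OEMBED, maxwidth=None):
    """Build the oEmbed request URL for a page, asking for JSON."""
    params = {"url": page_url, "format": "json"}
    if maxwidth is not None:
        params["maxwidth"] = maxwidth
    return endpoint + "?" + urlencode(params)


def embed_code(response_text):
    """Return the embeddable HTML from an oEmbed JSON response, or None.

    Only "video" and "rich" responses carry an "html" field; "photo" and
    "link" responses do not, so callers need a fallback for those.
    """
    data = json.loads(response_text)
    if data.get("type") in ("video", "rich"):
        return data.get("html")
    return None


print(oembed_url("http://www.youtube.com/watch?v=NUrLEKfHB_0"))
```

<p>Passing the body of a response like the one above to <i>embed_code</i> would hand back the contents of its "html" field, ready to drop into the page.</p>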
<p>---</p>
<p><del datetime="2010-10-25T00:17:03+00:00">In preparing this post, I noticed <a href="http://video.pbs.org">PBS Video</a> is offering the oEmbed discovery endpoint; however, the offered URL returns a 404 error rather than embed content. So close.</del> (It looks like oEmbed works on some videos and not others.)</p>
]]></content:encoded>
			<wfw:commentRss>http://cbeer.info/blog/2010/useful-standards-for-public-media-projects-oembed/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Useful Standards for Public Media Projects: FOAF</title>
		<link>http://cbeer.info/blog/2010/5-useful-standards-for-public-media-projects/</link>
		<comments>http://cbeer.info/blog/2010/5-useful-standards-for-public-media-projects/#comments</comments>
		<pubDate>Sun, 17 Oct 2010 15:35:29 +0000</pubDate>
		<dc:creator>chris</dc:creator>
				<category><![CDATA[Public Media]]></category>

		<guid isPermaLink="false">http://cbeer.info/blog/?p=442</guid>
		<description><![CDATA[This series of posts was inspired by Barrett Golding's Hacks &#38; Hackers digital projects round-up, which highlights some high-level initiatives coming out of public media, some of which may be developing or adopting standards for content distribution, aggregation, or preservation. &#8230; <a href="http://cbeer.info/blog/2010/5-useful-standards-for-public-media-projects/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>This series of posts was inspired by Barrett Golding's Hacks &amp; Hackers <a href="http://hackshackers.com/2010/10/08/public-media-is-investing-in-major-digital-projects/">digital projects round-up</a>, which highlights some high-level initiatives coming out of public media, some of which may be developing or adopting standards for content distribution, aggregation, or preservation. The challenge with many standards in public media is that they depend on infrastructure, systems, or getting a large enough group within the community to commit to supporting them.</p>
<p>I'm hoping this is a practical set of easy-to-implement standards for public media digital projects, especially as organizations are thinking about ways of building communities, becoming better neighbors, talking about aggregation and decentralization, and more. The standards I'm discussing are used in contexts much larger than public media, are inherently useful, and are systems-agnostic.</p>
<h3><a href="http://www.foaf-project.org/">Friend of a Friend (FOAF)</a></h3>
<p>There is currently no centralized, accessible database of public media organizations, and no group that is willing to take on the headache of populating, maintaining, and administering such a creation. Fortunately, most organizations (and likely all the salient ones) have some web presence, and we can use the already-decentralized web to model our decentralized organizational structure. From the project website:</p>
<blockquote><p>
The Friend of a Friend (FOAF) project is creating a Web of machine-readable pages describing people, the links between them and the things they create and do; it is a contribution to the linked information system known as the Web. FOAF defines an open, decentralized technology for connecting social Web sites, and the people they describe.
</p></blockquote>
<p>Within the context of public broadcasting, what can FOAF do? As a gross simplification: if every organization published an authoritative FOAF document containing whatever information each station thought relevant, we could link, aggregate, and query the decentralized data set to begin answering any number of questions programmatically (Where is the closest NPR station? What is the URL for streaming audio for station XYZ? What is the pledge phone number for every station in Wisconsin?).</p>
<p>Here's a quick demonstration document for a public media station:</p>
<pre>
&lt;rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
      xmlns:admin="http://webns.net/mvcb/"&gt;
&lt;foaf:PersonalProfileDocument rdf:about=""&gt;
  &lt;foaf:maker rdf:resource="#wkar"/&gt;
  &lt;foaf:primaryTopic rdf:resource="#wkar"/&gt;
&lt;/foaf:PersonalProfileDocument&gt;
&lt;foaf:Organization rdf:ID="wkar"&gt;
&lt;foaf:name&gt;WKAR&lt;/foaf:name&gt;
&lt;foaf:age&gt;78&lt;/foaf:age&gt;
&lt;foaf:mbox rdf:resource="mailto:webmaster@wkar.org"/&gt;
&lt;foaf:phone rdf:resource="tel:+15174329527" /&gt;
&lt;foaf:homepage rdf:resource="http://wkar.org"/&gt;
&lt;foaf:weblog rdf:resource="http://www.publicbroadcasting.net/wkar/news.newsmain" /&gt;
&lt;foaf:tipjar rdf:resource="http://wkar.org/give/" /&gt;
&lt;foaf:tipjar rdf:resource="tel:5174323120x371" /&gt;
&lt;foaf:isPrimaryTopicOf rdf:resource="http://en.wikipedia.org/wiki/WKAR_(AM)"/&gt;
&lt;foaf:isPrimaryTopicOf rdf:resource="http://en.wikipedia.org/wiki/WKAR-FM"/&gt;
&lt;foaf:isPrimaryTopicOf rdf:resource="http://en.wikipedia.org/wiki/WKAR-TV"/&gt;
&lt;foaf:depiction rdf:resource="http://wkar.org/images/wkar-w-140x50.gif"/&gt;
&lt;foaf:logo rdf:resource="http://wkar.org/images/wkar-w-140x50.gif"/&gt;
&lt;foaf:member rdf:resource="http://www.mprn.org/foaf.rdf#mprn" /&gt;
&lt;foaf:member rdf:resource="http://www.pbs.org/foaf.rdf#pbs" /&gt;
&lt;foaf:member rdf:resource="http://www.npr.org/foaf.rdf#npr" /&gt;
&lt;/foaf:Organization&gt;
&lt;/rdf:RDF&gt;
</pre>
<p><span style="font-size: 0.7em">I'm making no claims about the accuracy, correctness, or well-formedness of this document; I'm just offering it as an example of what could be done.</span></p>
<p>Because this format is just an RDF document, it is trivial (and encouraged) to extend the FOAF vocabulary with domain-specific information from other sources, e.g. <a href="http://purl.org/ontology/po/">the BBC Programmes ontology</a>.</p>
<p>FOAF has rules and methods for FOAF document discovery. By creating a mesh of organizations, we could assemble a full, real-time picture of public media organizations without the overhead of centralization, and contribute a tiny piece to the larger web of knowledge without significant work on the part of any single individual.</p>
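<p>To make the "query the data set" idea a little more concrete, here is a minimal sketch that reads a document shaped like the WKAR demonstration and pulls out each organization's name, homepage, and pledge (tipjar) addresses. The function names are mine, and it walks the XML tree directly rather than treating the document as RDF, so it is only an illustration of the idea.</p>

```python
import xml.etree.ElementTree as ET

# Namespace prefixes in ElementTree's "{uri}name" notation.
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
FOAF = "{http://xmlns.com/foaf/0.1/}"


def resource(element):
    """Return the rdf:resource attribute of an element, or None."""
    if element is None:
        return None
    return element.get(RDF + "resource")


def station_summary(foaf_xml):
    """Summarize each foaf:Organization found in an RDF/XML string.

    This only understands documents laid out like the demonstration
    above; a real aggregator would use a proper RDF parser instead.
    """
    root = ET.fromstring(foaf_xml)
    stations = []
    for org in root.iter(FOAF + "Organization"):
        stations.append({
            "name": org.findtext(FOAF + "name"),
            "homepage": resource(org.find(FOAF + "homepage")),
            "tipjars": [resource(t) for t in org.findall(FOAF + "tipjar")],
        })
    return stations
```

<p>Run over the FOAF documents of every station in a state, a loop like this would be enough to answer the "pledge phone number" question from earlier in the post.</p>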
]]></content:encoded>
			<wfw:commentRss>http://cbeer.info/blog/2010/5-useful-standards-for-public-media-projects/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Public Media Links</title>
		<link>http://cbeer.info/blog/2010/public-media-links/</link>
		<comments>http://cbeer.info/blog/2010/public-media-links/#comments</comments>
		<pubDate>Sun, 10 Oct 2010 14:15:26 +0000</pubDate>
		<dc:creator>chris</dc:creator>
				<category><![CDATA[Public Media]]></category>

		<guid isPermaLink="false">http://cbeer.info/blog/?p=386</guid>
		<description><![CDATA[Zeitgeist - the most shared BBC links on Twitter : Earlier this summer, the BBC R&#38;D group created the Zeitgeist dashboard. Recently, they released the code as open source on GitHub. It seems like there's a ways to go before it &#8230; <a href="http://cbeer.info/blog/2010/public-media-links/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<ul>
<li><a href="http://www.bbc.co.uk/blogs/researchanddevelopment/2010/07/zeitgeist-the-most-shared-bbc.shtml">Zeitgeist - the most shared BBC links on Twitter</a>: Earlier this summer, the BBC R&amp;D group created the Zeitgeist dashboard. Recently, they released the code as open source <a href="http://github.com/bbcrd/zeitgeist">on GitHub</a>. It seems like there's a ways to go before it is easily replicable, but it is very interesting nevertheless.</li>
<li><a href="http://area51.stackexchange.com/proposals/20591/backstage">Backstage</a>: Proposed Q&#038;A site for those involved in the technical aspects of television and radio broadcasting, including the publication and consumption of related metadata and APIs, and content delivery over the Internet.</li>
<li><a href="http://www.openvideoconference.org/">Open Video Conference</a>: There was a sizable public media contingent at OVC this year and a lot of ideas about the direction and future for public media. </li>
<li><a href="http://praegnanz.de/html5video/index.php">HTML5 Video Player Comparison</a>: Following on OVC, this is a great list of HTML5 video players, which often also include a JavaScript API wrapper for fallback players. I need to spend time updating my video utilities to use the new HTML5 APIs.</li>
<li><a href="http://pubcamp10.eventbrite.com/">PublicMediaCamp '10</a>: Public Media Camp 2010 has been scheduled for November 20th and 21st in Washington, D.C. </li>
<li><a href="http://www.slideshare.net/edsonm/michael-edson-the-smithsonian-web-and-new-media-strategy-what-it-is-how-we-made-it-and-why-it-makes-a-difference-3656578">Michael Edson: Fast, Open, and Transparent: Developing the Smithsonian's Web and New Media Strategy</a>: Created earlier this year, it resurfaced recently. The presentation articulates many of the same issues faced by public media (decentralization, unexpected rivals, brand identity, relevance, and "thermocline" issues are the "pain points" Michael Edson identifies).</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://cbeer.info/blog/2010/public-media-links/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Gallery of Station Websites: Progress</title>
		<link>http://cbeer.info/blog/2010/gallery-of-station-websites-progress/</link>
		<comments>http://cbeer.info/blog/2010/gallery-of-station-websites-progress/#comments</comments>
		<pubDate>Mon, 13 Sep 2010 00:56:41 +0000</pubDate>
		<dc:creator>chris</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://cbeer.info/blog/?p=432</guid>
		<description><![CDATA[After two weeks of use, the station gallery has had a fair amount of traffic and feedback (bolstered by a write-up in Current and a couple of station resource sites). While I've been revising the site all along, I rolled out &#8230; <a href="http://cbeer.info/blog/2010/gallery-of-station-websites-progress/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>After two weeks of use, the <a href="http://stations.publicmediatech.com/">station gallery</a> has had a fair amount of traffic and feedback (bolstered by a write-up in <a href="http://currentpublicmedia.blogspot.com/2010/09/did-your-station-site-make-beers-list.html">Current</a> and a couple of station resource sites). While I've been revising the site all along, I rolled out some more substantial changes this weekend, which should improve the functionality and performance of the gallery.</p>
<p>First, and most obvious, I did a little bit of styling work to take it from barely functional to ugly-but-usable. If anyone has design ideas for the site, please leave a comment. Other web galleries usually crop their images severely; I'm attracted to full-page screenshots, but that comes with a whole range of other design challenges.</p>
<p>As part of the design work, I've moved the "user-generated" content forms to the gallery view (and eliminated the station view entirely), which I hope will encourage some conversation. I toyed with the idea of integrating with services like Twitter and Delicious to provide comments and tagging, respectively, but I couldn't come up with a good way to distinguish between conversation and criticism.</p>
<p>I've been working on a couple of ideas for a proper home page for the gallery to help give the site some context besides these blog posts. I'll try to make some more progress on this as long as people believe this could be a useful resource for the public media system.</p>
<p>Shortly after the first prototype, I started migrating data into Solr (using the <a href="http://outoftime.github.com/sunspot/">Ruby Sunspot</a> library), and earlier this week I added full-text search for page content (which may or may not match the screenshot content). I'm still playing with approaches to crawling station websites with <a href="http://anemone.rubyforge.org/">Anemone</a> to extract different types of pages (schedules, contact, news stories, features, etc.); hopefully I will have something interesting in the next couple of weeks.</p>
<p>I'm still working out ways to automate (and schedule recurring) screenshot updates, which is complicated by the lack of decent cross-platform tools. On Mac OS X, I've been using <a href="http://www.paulhammond.org/webkit2png/">webkit2png</a>, which has been great, but this server is running Debian Linux and the only comparable utility I've found is <a href="http://code.google.com/p/wkhtmltopdf/">wkhtmltopdf</a>, which requires a patched version of Qt. Message queues or workflow engines seem like overkill, so in the meantime I'll do manual updates occasionally.</p>
<p>As always, the source code to the site is available at <a href="http://github.com/cbeer/publicmediatech-stations">http://github.com/cbeer/publicmediatech-stations</a> for anyone interested in hacking on the gallery or just seeing how it was put together.</p>
]]></content:encoded>
			<wfw:commentRss>http://cbeer.info/blog/2010/gallery-of-station-websites-progress/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

