<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Canadensys</title>
	
	<link>http://www.canadensys.net</link>
	<description>A network of Canadian biological collections</description>
	<lastBuildDate>Fri, 11 May 2012 20:13:03 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.2</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/canadensys" /><feedburner:info uri="canadensys" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:emailServiceId>canadensys</feedburner:emailServiceId><feedburner:feedburnerHostname>http://feedburner.google.com</feedburner:feedburnerHostname><item>
		<title>Why we should publish our data under Creative Commons Zero (CC0)</title>
		<link>http://feedproxy.google.com/~r/canadensys/~3/kuPqewi06Rk/why-we-should-publish-our-data-under-cc0</link>
		<comments>http://www.canadensys.net/2012/why-we-should-publish-our-data-under-cc0#comments</comments>
		<pubDate>Fri, 27 Jan 2012 14:26:32 +0000</pubDate>
		<dc:creator>Peter Desmet</dc:creator>
				<category><![CDATA[Open source & Commons]]></category>

		<guid isPermaLink="false">http://www.canadensys.net/?p=1637</guid>
		<description><![CDATA[With the first datasets getting published and more coming soon, the issue comes up under what license we &#8211; the Canadensys community and the individual collections &#8211; will publish our data. Dealing with the legal stuff can be tedious, which is why we have looked into this issue with the Canadensys Steering Committee &#038; Science [...]]]></description>
			<content:encoded><![CDATA[<p>With the first datasets getting <a href="http://data.canadensys.net/ipt">published</a> and more coming soon, the issue comes up under what license we &#8211; the Canadensys community and the individual collections &#8211; will publish our data. Dealing with the legal stuff can be tedious, which is why we have looked into this issue with the <a href="http://www.canadensys.net/people">Canadensys Steering Committee &#038; Science and Technology Advisory Board</a> before opening the discussion to the whole community.</p>
<p class="note">By <strong>data</strong> we mean specimen, observation or checklist datasets published as a Darwin Core Archive and any derivatives. To keep the discussion focused, this does not include pictures or software code.</p>
<p class="note">2012.01.30 &#8211; Update to post: technically CC0 is not a license, but a waiver (see <a href="#comment-424477904">comment below</a>).</p>

<h2>What we hope to achieve</h2>
<ol>
	<li><strong>One license for the whole Canadensys community</strong>, which is easier for aggregation and sends a strong message as one community.</li>
	<li><strong>An existing license</strong>, because we don&#8217;t want to write our own legal documents.</li>
	<li><strong>An open license</strong>, allowing our data to be really used.</li>
	<li><strong>A clear license</strong>, so users can focus on doing great research with the data, instead of figuring out the fine print.</li>
	<li><strong>Giving credit where credit is due</strong>.</li>
</ol>

<h2>Our recommendation</h2>
<p><a href="http://creativecommons.org/publicdomain/zero/1.0/"><img src="http://www.canadensys.net/wp-content/uploads/why-cc0-cc-zero.png" alt="cc-zero" class="alignright" /></a> We recommend that Canadensys participants publish their data under <strong><a href="http://creativecommons.org/publicdomain/zero/1.0/">Creative Commons Zero (CC0)</a></strong>. With CC0 you waive any copyright you might have over the data(set) and dedicate it to the public domain. Users can copy, use, modify and distribute the data without asking your permission. You cannot be held liable for any (mis)use of the data either.</p>
<p><a href="http://wiki.creativecommons.org/CC0_use_for_data">CC0 is recommended for data and databases</a> and is used by hundreds of organizations. It is especially recommended for scientific data and thus encouraged by <a href="http://www.pensoft.net/">Pensoft</a> (see <a href="http://www.pensoft.net/J_FILES/Pensoft_Data_Publishing_Policies_and_Guidelines.pdf">their guidelines for biodiversity data papers</a>) and <a href="http://www.nature.com/">Nature</a> (see <a href="http://dx.doi.org/10.1038/461171a">this opinion piece</a>). Although CC0 doesn&#8217;t legally require users of the data to cite the source, it does not take away the moral responsibility to give attribution, as is common in scientific research (more about that below).</p>

<h2>Why would I waive my copyright?</h2>
<p>For starters, there’s very little copyright to be had in our data, datasets and databases. Copyright only applies to creative content and 99% of our data are facts, which cannot be copyrighted. We do hold copyright over some text in remarks fields, the data format or database model we chose/created, and pictures. If we consider a Darwin Core Archive (which is how we are publishing our data) the creative content is even further reduced: the data format is a standard and we only provide a link to pictures, not the pictures themselves.</p>
<p>Figuring out where the facts stop and where the (copyrightable) creative content begins can already be difficult for the content owner, so imagine what a legal nightmare it can become for the user. On top of that different rules are used in different countries. Publishing our data under CC0 removes any ambiguity and red tape. We waive any copyright we might have had over the creative content and our data gets the legal status of public domain. It can no longer be copyrighted by anyone.</p>

<h2>Can&#8217;t we use another license?</h2>
<p>Let’s go over the options. Keep in mind that these licenses only apply to the creative aspect of the dataset, not the facts. But as pointed out above, figuring this out can be difficult or impossible for the user. So much so in fact, that the user may decide not to use the data at all, especially if they think they might not meet the conditions of the license.</p>

<h3>All rights reserved</h3>
<p><img src="http://www.canadensys.net/wp-content/uploads/why-cc0-copyright.png" alt="copyright" class="alignright" /> The user cannot use the data(set) without the permission of the owner.</p>
<p>Conclusion: Not good.</p>

<h3>Open Data Commons Public Domain Dedication and License (<a href="http://opendatacommons.org/licenses/pddl/summary/">PDDL</a>)</h3>
<p>There are no restrictions on how to use the data. This license is very similar to CC0.</p>
<p>Conclusion: Perfect, in fact this license was a precursor of CC0, but&#8230; it is less well known and maybe not as legally thorough as CC0. CC0 made a huge effort to cover legislation in almost all countries and the Creative Commons community is working hard to improve this even further. Therefore, if you have to choose, CC0 is probably better.</p>

<h3>Creative Commons Attribution-NoDerivs (<a href="http://creativecommons.org/licenses/by-nd/3.0/">CC BY-ND</a>)</h3>
<p><a href="http://creativecommons.org/licenses/by-nd/3.0/"><img src="http://www.canadensys.net/wp-content/uploads/why-cc0-by-nd.png" alt="by-nd" class="alignright" /></a> The user cannot build upon the data(set), which is what most data use involves.</p>
<p>Conclusion: Not good, and sadly used by <a href="http://www.theplantlist.org/terms/">theplantlist.org</a>. <a href="http://www.linkedin.com/in/rdmpage">Roderic Page</a> pointed this out by <a href="http://iphylo.blogspot.com/2011/01/why-won-plant-list-won-let-me-do-this.html">showing what cool things he can NOT do with the data</a>.</p>

<h3>Creative Commons Attribution-NonCommercial (<a href="http://creativecommons.org/licenses/by-nc/3.0/">CC BY-NC</a>)</h3>
<p><a href="http://creativecommons.org/licenses/by-nc/3.0/"><img src="http://www.canadensys.net/wp-content/uploads/why-cc0-by-nc.png" alt="by-nc" class="alignright" /></a> The user cannot use the data(set) for commercial purposes. This seems fine from an academic viewpoint, but the license is a lot more restrictive than intuitively thought. See: Hagedorn, G. et al. ZooKeys 150 (2011). <a href="http://dx.doi.org/10.3897/zookeys.150.2189">Creative Commons licenses and the non-commercial condition: Implications for the re-use of biodiversity information</a>.</p>
<p>Conclusion: Not good.</p>

<h3>Creative Commons Attribution-ShareAlike (<a href="http://creativecommons.org/licenses/by-sa/3.0/">CC BY-SA</a>) or Open Data Commons Open Database License (<a href="http://opendatacommons.org/licenses/odbl/summary/">ODbL</a>)</h3>
<p><a href="http://creativecommons.org/licenses/by-sa/3.0/"><img src="http://www.canadensys.net/wp-content/uploads/why-cc0-by-sa.png" alt="by-sa" class="alignright" /></a> The user has to share any work based upon the data(set) under a license that is identical or similar to the one used.</p>
<p>Conclusion: Good, but&#8230; this can lead to some problems for an aggregator like <a href="http://www.canadensys.net">Canadensys</a> or <a href="http://www.gbif.org">GBIF</a>: if they are mixing and merging data with different SA licenses, which one do they choose? They might be incompatible.</p>

<h3>Creative Commons Attribution (<a href="http://creativecommons.org/licenses/by/3.0/">CC BY</a>) or Open Data Commons Attribution License (<a href="http://opendatacommons.org/licenses/by/summary/">ODC-By</a>)</h3>
<p><a href="http://creativecommons.org/licenses/by/3.0/"><img src="http://www.canadensys.net/wp-content/uploads/why-cc0-by.png" alt="by" class="alignright" /></a> The user has to attribute the data(set) in the manner specified by the owner. This condition is also present in the three licenses above.</p>
<p>Conclusion: Good, but&#8230; this can lead to impractical &#8220;attribution stacking&#8221;. If an aggregator or a user of that aggregator is using and integrating different datasets provided under a BY license, they legally have to cite the owner for each and every one of those in the manner specified by these owners (again, for the potential creative content in the data). See <a href="http://sciencecommons.org/projects/publishing/open-access-data-protocol/">point 5.3 at the bottom of this Creative Commons page</a> for a better explanation and <a href="http://wir.okfn.org/2012/01/27/attribution-stacking-as-a-barrier-to-reuse/">this blog post</a> for an example.</p>

<h2>But giving credit is a good thing!</h2>
<p>Absolutely, but legally enforcing it can lead to the opposite effect: a user may decide not to use the data out of fear of not completely complying with the license (see paragraph above). As hinted at the beginning of this post, CC0 removes the drastic legally enforceable requirement to give attribution, but it does not remove the moral obligation to give attribution. In fact, this has been the common practice in scientific research for many decades: legally, you don&#8217;t have to cite the research/data you&#8217;re using, but not doing so could be considered plagiarism, which would compromise your reputation and the credibility of your work.</p>
<p>To encourage users to give credit where credit is due, we propose to create <a href="http://www.canadensys.net/norms">Canadensys <strong>norms</strong></a>. Norms are not a legal document (see <a href="http://opendatacommons.org/norms/odc-by-sa/">an example here</a>), but a &#8220;code of conduct&#8221; where we declare how we would like users to use, share and cite our data, and how they can participate. We can explain how one could cite an individual specimen, a collection, a dataset or an aggregated &#8220;Canadensys&#8221; download. We can point out that our data are constantly being corrected or added to, so it is useful to keep coming back to the original repository and not to a secondary repository that may not have been updated. In addition to that, we can build tools to monitor downloads or automatically create an adequate citation. And with the arrival of <a href="http://dx.doi.org/10.1186/1471-2105-12-S15-S2">data papers</a> &#8211; drafts of which can now be <a href="http://www.gbif.org/communications/news-and-events/showsingle/article/new-incentive-for-biodiversity-data-publishing">automatically generated from IPT</a> &#8211; data(sets) are really brought into the realm of traditional publishing and the associated scientific recognition.</p>

<h2>Conclusion</h2>
<p>All this to say that there are mechanisms where both users and data owners can benefit, without the legal burden. CC0 + norms guarantees that our data can be used now and in the future. I for one will update the license for our <a href="http://data.canadensys.net/ipt">Université de Montréal Biodiversity Centre datasets</a>. We hope you will join us!</p>
<p>Thanks to <a href="http://www.linkedin.com/in/gregorhagedorn">Gregor Hagedorn</a> for his valuable advice on all the intricacies of data licensing.</p><img src="http://feeds.feedburner.com/~r/canadensys/~4/kuPqewi06Rk" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.canadensys.net/2012/why-we-should-publish-our-data-under-cc0/feed</wfw:commentRss>
		<slash:comments>10</slash:comments>
		<feedburner:origLink>http://www.canadensys.net/2012/why-we-should-publish-our-data-under-cc0</feedburner:origLink></item>
		<item>
		<title>New terms in Darwin Core</title>
		<link>http://feedproxy.google.com/~r/canadensys/~3/O8ofnh3-Ibw/new-terms-in-darwin-core</link>
		<comments>http://www.canadensys.net/2011/new-terms-in-darwin-core#comments</comments>
		<pubDate>Thu, 22 Dec 2011 20:17:53 +0000</pubDate>
		<dc:creator>Peter Desmet</dc:creator>
				<category><![CDATA[Darwin Core]]></category>

		<guid isPermaLink="false">http://www.canadensys.net/?p=1566</guid>
		<description><![CDATA[For the first time since its ratification in October 2009, Darwin Core has been revised! But don&#8217;t panic: it&#8217;s probably not going to affect your data publication. Darwin Core is a community standard, so anyone can submit ideas, suggestions or corrections via the Darwin Core issue tracker (guidelines to do so can be found here). [...]]]></description>
			<content:encoded><![CDATA[<p>For the first time since its ratification in October 2009, Darwin Core has been revised! But don&#8217;t panic: it&#8217;s probably not going to affect your data publication.</p>

<p><a href="http://www.canadensys.net/darwin-core">Darwin Core</a> is a community standard, so anyone can submit ideas, suggestions or corrections via the <a href="http://code.google.com/p/darwincore/issues/list">Darwin Core issue tracker</a> (guidelines to do so can be found <a href="http://rs.tdwg.org/dwc/terms/namespace/index.htm">here</a>). Minor error corrections can get accepted without peer review, but for more serious changes enough people have to agree (via open commentary and review) before the suggestions are included in the standard. The TDWG Executive Committee has now officially <a href="http://rs.tdwg.org/dwc/terms/history/decisions/index.htm">approved a bunch of suggestions</a> (not all) on 16 October 2011 and the standard has been updated in November.</p>

<p>So, what are the most important changes? Well, we have <a href="http://www.imdb.com/title/tt0109831/">Four Weddings and a Funeral</a>:</p>

<h2>4 new terms</h2>
<p class="note">The term explanations are mine, specific for specimen data publication. For the official definitions, click on the term.</p>
<ul>
	<li><a href="http://rs.tdwg.org/dwc/terms/index.htm#dcterms:references">dcterms:references</a>: A term borrowed from Dublin Core: a URL for the most original or complete information about the record, e.g. the <a href="http://herbie.zoology.ubc.ca/~botany/herbarium/details.php?db=vwsp.fp7&#038;layout=vwsp_web_details&#038;recid=4525&#038;ass_num=V189371">online specimen database record</a>.</li>
	<li><a href="http://rs.tdwg.org/dwc/terms/index.htm#namePublishedInYear">namePublishedInYear</a>: Mainly for checklists: the year in which the scientificName was published.</li>
	<li><a href="http://rs.tdwg.org/dwc/terms/index.htm#georeferencedDate">georeferencedDate</a>: Erroneously not included in the first version of Darwin Core: the date on which the location was georeferenced.</li>
	<li><a href="http://rs.tdwg.org/dwc/terms/index.htm#identificationVerificationStatus">identificationVerificationStatus</a>: A number from 0-4 to indicate the extent to which the taxonomic identification has been verified to be correct. The <a href="http://plantnet.rbgsyd.nsw.gov.au/HISCOM/HISPID/HISPID3/H3_verify.html">codes can be found here</a>.</li>
</ul>

<h2>1 deprecated term</h2>
<ul>
	<li><a href="http://rs.tdwg.org/dwc/terms/history/index.htm#occurrenceDetails-2009-04-24">occurrenceDetails</a>: This term could only be used for specimens and observations. It has now been replaced by the more general dcterms:references (see above).</li>
</ul>

<p>This brings the total number of core terms to 159. We have updated <a href="https://spreadsheets.google.com/spreadsheet/pub?hl=en&#038;key=0Apk_TuFGIiOPdDBQbWxkVmFHZlZnLUNLRVByUXhUQmc&#038;output=html" class="doc_google">our list</a>; please do the same with your local files or visit the official <a href="http://rs.tdwg.org/dwc/terms/index.htm">Darwin Core website</a>.</p><img src="http://feeds.feedburner.com/~r/canadensys/~4/O8ofnh3-Ibw" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.canadensys.net/2011/new-terms-in-darwin-core/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.canadensys.net/2011/new-terms-in-darwin-core</feedburner:origLink></item>
		<item>
		<title>VASCAN portal and database are now open source</title>
		<link>http://feedproxy.google.com/~r/canadensys/~3/eUSsF99E8X4/vascan-portal-and-database-open-source</link>
		<comments>http://www.canadensys.net/2011/vascan-portal-and-database-open-source#comments</comments>
		<pubDate>Mon, 12 Dec 2011 20:17:13 +0000</pubDate>
		<dc:creator>Peter Desmet</dc:creator>
				<category><![CDATA[Open source & Commons]]></category>
		<category><![CDATA[VASCAN]]></category>

		<guid isPermaLink="false">http://www.canadensys.net/?p=1354</guid>
		<description><![CDATA[The Database of Vascular Plants of Canada (VASCAN) web portal and its database are now open source. The code is hosted on our Google Code site and can be used by anyone to create their own checklist portal. By doing so, we hope that the time and effort that went into developing VASCAN can benefit [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://data.canadensys.net/vascan">Database of Vascular Plants of Canada (VASCAN) web portal</a> and its database are now open source. The code is hosted on <a href="http://code.google.com/p/canadensys/">our Google Code site</a> and can be used by anyone to create their own checklist portal. By doing so, we hope that the time and effort that went into developing VASCAN can benefit other developers. We also hope to encourage collaborative development, as anyone can now review the code or develop additional modules. The repository also serves as a backup of our code and its development history.</p>

<h2>What is VASCAN?</h2>
<p>You don&#8217;t know yet? No problem, you can read all about it here:</p>

<ul>
	<li><a href="http://data.canadensys.net/vascan/about">VASCAN about page</a></li>
	<li><a href="http://www.canadensys.net/vascan">Brand new page describing VASCAN as software</a></li>
	<li><a href="http://www.canadensys.net/2011/vascan-functionalities" class="doc_blog">Blog post summarizing VASCAN&#8217;s functionalities</a></li>
</ul>

<h2>Where can I get the code?</h2>
<ul>
	<li><a href="http://code.google.com/p/canadensys/source/browse/#svn%2Fvascan-portal%2Ftrunk" class="doc_code">Source code for the webportal</a></li>
	<li><a href="http://code.google.com/p/canadensys/source/browse/#svn%2Fvascan-database%2Fmysql" class="doc_data">SQL dump of the MySQL database</a></li>
	<li><a href="http://code.google.com/p/canadensys/wiki/VascanPortalInstallation">Installation guidelines</a></li>
</ul>

<h2>Challenges in making VASCAN open source</h2>
<p>Even though we always wanted to open-source our code, it is of course developed to fit our needs and setup first. So, our configuration is entangled with the functionality and we have some idiosyncratic code here and there. The challenge was to adapt the current code to support different setups, while keeping most of its original structure. This is what we did:</p>
<ul>
 	<li>All configurations (database connection, data folder, base url, etc.) are now stored in a separate file</li>
 	<li>All generated data (images, Darwin Core Archives, indexes, sitemap, etc.) are now stored outside the webapp</li>
 	<li>Automatic generation of a &#8220;sitemap.xml&#8221; for improved indexing by search engines like Google. Access can now be controlled with &#8220;robots.txt&#8221;.</li>
	<li>Developed an installation function to build the application on your own server. You only need to edit 3 files and building the application is as simple as typing the command &#8220;ant war&#8221; to create a Java war file.</li>
	<li><a href="http://www.canadensys.net/vascan">Improved documentation</a> and <a href="http://code.google.com/p/canadensys/wiki/VascanPortalInstallation">installation guidelines</a>.</li>
</ul><img src="http://feeds.feedburner.com/~r/canadensys/~4/eUSsF99E8X4" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.canadensys.net/2011/vascan-portal-and-database-open-source/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.canadensys.net/2011/vascan-portal-and-database-open-source</feedburner:origLink></item>
		<item>
		<title>Updating a customized IPT</title>
		<link>http://feedproxy.google.com/~r/canadensys/~3/Zmnz-fliI0I/updating-a-customized-ipt</link>
		<comments>http://www.canadensys.net/2011/updating-a-customized-ipt#comments</comments>
		<pubDate>Thu, 08 Dec 2011 15:40:47 +0000</pubDate>
		<dc:creator>Peter Desmet</dc:creator>
				<category><![CDATA[IPT]]></category>

		<guid isPermaLink="false">http://www.canadensys.net/?p=1531</guid>
		<description><![CDATA[This is a follow-up of the post Customizing the IPT. As mentioned at the very end of my post about customizing the IPT, I face a problem when I want to install a new version of the GBIF Integrated Publishing Toolkit: installing it will overwrite all my customized files! Luckily Tim Robertson gave me a [...]]]></description>
			<content:encoded><![CDATA[<p class="note">This is a follow-up of the post <a href="http://www.canadensys.net/2011/customizing-the-ipt">Customizing the IPT</a>.</p>

<p>As mentioned at the very end of my post about <a href="http://www.canadensys.net/2011/customizing-the-ipt">customizing the IPT</a>, I face a problem when I want to install a new version of the <a href="http://www.canadensys.net/ipt">GBIF Integrated Publishing Toolkit</a>: installing it will overwrite all my customized files! Luckily Tim Robertson gave me a hint on how to solve this: a shell script to reapply my customization.</p>
<p>Here&#8217;s how it works (for Mac and Linux systems only):</p>

<h2>Comparing the customized files with the default files</h2>

<p>First of all, I need to compare my customized files with the files from the new IPT. They might have changed to include new functionalities or fix bugs. So, I installed the newest version of IPT on my <em>localhost</em>, opened the default files and compared them with my files. Although there are tools to compare files, I mostly did this manually. The biggest change in version 2.0.3 was the addition of localization, for which I&#8217;m using a different UI, so I had to tweak some things here and there. It took me about 3 hours until I was satisfied with the new customized IPT version on my <em>localhost</em>.</p>
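
<p>For readers who would rather script this comparison than do it by hand, here is a minimal sketch using the standard <em>cmp</em> and <em>diff</em> tools. It is not the procedure used above: the function names <em>report_diff</em> and <em>compare_with_default</em> are hypothetical, and the file list and paths are assumptions based on the customized files discussed in this post.</p>

```shell
#!/bin/sh
# Hypothetical helpers (not part of IPT): report which customized files
# differ from the default files of a freshly installed IPT.

# Print "same" or "differs" for one pair of files; show a unified diff if they differ.
report_diff() {
    custom_file="$1"
    default_file="$2"
    if cmp -s "$custom_file" "$default_file"; then
        echo "same: $custom_file"
    else
        echo "differs: $custom_file"
        # diff exits with a non-zero status when files differ, so ignore its status
        diff -u "$default_file" "$custom_file" || true
    fi
}

# Compare every customized file with its counterpart in an IPT installation.
# $1 = folder with the customized files, $2 = root folder of the IPT webapp
# (both are assumptions about the layout on your machine).
compare_with_default() {
    custom_dir="$1"
    ipt_dir="$2"
    for f in footer.ftl header_setup.ftl header.ftl menu.ftl; do
        report_diff "$custom_dir/$f" "$ipt_dir/WEB-INF/pages/inc/$f"
    done
    report_diff "$custom_dir/main.css" "$ipt_dir/styles/main.css"
}
```

<p>Assuming the folder layout used on our server, this could be called as <em>compare_with_default . ../webapps/ipt</em> from inside the folder holding the customized files. Files reported as "differs" are the ones worth reviewing by hand.</p>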

<p>I also subscribed to the RSS of the <a href="http://code.google.com/p/gbif-providertoolkit/">IPT Google Code website</a>, to be notified of any changes in the code of &#8220;my&#8221; files, but I was just using this as a heads-up for coming changes. It is more efficient to change everything at once, when a stable version of IPT is out.</p>
<ul>
	<li><a href="http://code.google.com/feeds/p/gbif-providertoolkit/svnchanges/basic?path=/trunk/gbif-ipt/src/main/webapp/WEB-INF/pages/inc">RSS subscription</a> for any changes in <a href="http://code.google.com/p/gbif-providertoolkit/source/browse/#svn%2Ftrunk%2Fgbif-ipt%2Fsrc%2Fmain%2Fwebapp%2FWEB-INF%2Fpages%2Finc">/webapp/WEB-INF/pages/inc</a>, which contains most of my customized files</li>
	<li><a href="http://code.google.com/feeds/p/gbif-providertoolkit/svnchanges/basic?path=/trunk/gbif-ipt/src/main/webapp/styles/main.css">RSS subscription</a> for any changes in <a href="http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/webapp/styles/main.css">/webapp/styles/main.css</a>, where I&#8217;m commenting out a lot of stuff so my CSS can kick in.</li>
</ul>

<h2>Setting up a file structure</h2>

<p>This is how we&#8217;ve organized the files on our server. I&#8217;ve created a folder called <em>ipt-customization</em>, which contains all my customized files. That way, they can never be overwritten by a new IPT installation, which gets deployed in <em>webapps</em>. The folder also contains a script to apply the customization and a folder to back up the default files currently used by IPT.</p>

<ul class="custom_list">
	<li class="doc_folder">ipt-data</li>
	<li class="doc_folder">webapps
	<ul class="custom_list">
		<li class="doc_folder">ipt</li>
	</ul>
	</li>
	<li class="doc_folder">ipt-customization
	<ul class="custom_list">
		<li class="doc_folder">backup-default</li>
		<li class="doc_code">apply-customization.sh</li>
		<li class="doc_code">revert-customization.sh</li>
		<li class="doc_code">header.ftl</li>
		<li class="doc_code">header_setup.ftl</li>
		<li class="doc_code">menu.ftl</li>
		<li class="doc_code">footer.ftl</li>
		<li class="doc_code">main.css</li>
		<li class="doc_code">custom.js</li>
	</ul>
	</li>
</ul>

<h2>Creating the shell script</h2>

<p>The <strong><em>apply-customization.sh</em></strong> script works in two steps:</p>
<ol>
	<li>Back up the default files, by copying them from IPT to the folder <em>backup-default</em>. The script will ask if I want to overwrite any previously backed-up files. That prompt is important if I&#8217;m running the script several times: in that case I do not want to overwrite the backups with the already customized files.</li>
	<li>Overwrite the files currently used by IPT with the customized files, by copying them from my <em>ipt-customization</em> folder to the correct folder in IPT.</li>
</ol>

<code>
# backup files of new IPT installation<br />
cp -i ../webapps/ipt/WEB-INF/pages/inc/footer.ftl ../ipt-customization/backup-default/<br />
cp -i ../webapps/ipt/WEB-INF/pages/inc/header_setup.ftl ../ipt-customization/backup-default/<br />
cp -i ../webapps/ipt/WEB-INF/pages/inc/header.ftl ../ipt-customization/backup-default/<br />
cp -i ../webapps/ipt/WEB-INF/pages/inc/menu.ftl ../ipt-customization/backup-default/<br />
cp -i ../webapps/ipt/styles/main.css ../ipt-customization/backup-default/<br />
<br />
# apply customization<br />
cp footer.ftl ../webapps/ipt/WEB-INF/pages/inc/<br />
cp header_setup.ftl ../webapps/ipt/WEB-INF/pages/inc/<br />
cp header.ftl ../webapps/ipt/WEB-INF/pages/inc/<br />
cp menu.ftl ../webapps/ipt/WEB-INF/pages/inc/<br />
cp main.css ../webapps/ipt/styles/<br />
cp custom.js ../webapps/ipt/js/<br />
</code>
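
<p>The interactive <em>cp -i</em> prompt above protects the backups only if I answer it correctly each time. As an alternative sketch, the backup step could skip any file that already has a backup. The function name <em>backup_once</em> is hypothetical and is not part of the script shown above.</p>

```shell
#!/bin/sh
# Hypothetical helper: back up a default file only if no backup exists yet,
# so a second run never replaces the backup with an already customized file.
# $1 = file to back up, $2 = existing backup folder
backup_once() {
    src="$1"
    backup_dir="$2"
    target="$backup_dir/$(basename "$src")"
    if [ -e "$target" ]; then
        echo "keeping existing backup: $target"
    else
        cp "$src" "$target"
        echo "backed up: $src"
    fi
}
```

<p>With the folder layout above, a call such as <em>backup_once ../webapps/ipt/styles/main.css backup-default</em> would stand in for the matching <em>cp -i</em> line. The trade-off is that after installing a new IPT version, the old backups have to be removed by hand before new ones are taken.</p>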

<p>I also created a script <strong><em>revert-customization.sh</em></strong>, to revert the customization to the default IPT, in case something is broken. It copies the backed-up files back to IPT and removes <em>custom.js</em>:</p>

<code>
# revert customization<br />
cp backup-default/footer.ftl ../webapps/ipt/WEB-INF/pages/inc/<br />
cp backup-default/header_setup.ftl ../webapps/ipt/WEB-INF/pages/inc/<br />
cp backup-default/header.ftl ../webapps/ipt/WEB-INF/pages/inc/<br />
cp backup-default/menu.ftl ../webapps/ipt/WEB-INF/pages/inc/<br />
cp backup-default/main.css ../webapps/ipt/styles/<br />
rm ../webapps/ipt/js/custom.js<br />
</code>

<h2>Running the script</h2>

<p>From the command line, I log in to my server, navigate to the folder <em>ipt-customization</em> and make my script executable:</p>

<code>chmod +x apply-customization.sh</code>

<p>I only have to do this the first time I want to use my script. From then on I can use:</p>

<code>sh ./apply-customization.sh</code>

<p>To execute the script and customize my <a href="http://data.canadensys.net/ipt/">new version of IPT</a>!</p><img src="http://feeds.feedburner.com/~r/canadensys/~4/Zmnz-fliI0I" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.canadensys.net/2011/updating-a-customized-ipt/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.canadensys.net/2011/updating-a-customized-ipt</feedburner:origLink></item>
		<item>
		<title>How many species?</title>
		<link>http://feedproxy.google.com/~r/canadensys/~3/qHv8qduLPDs/how-many-species</link>
		<comments>http://www.canadensys.net/2011/how-many-species#comments</comments>
		<pubDate>Thu, 01 Dec 2011 15:40:30 +0000</pubDate>
		<dc:creator>Peter Desmet</dc:creator>
				<category><![CDATA[Entomological collections]]></category>

		<guid isPermaLink="false">http://www.canadensys.net/?p=1492</guid>
		<description><![CDATA[This guest post by Terry Wheeler originally appeared on the Lyman Entomological Museum blog. One of the fundamental rules of running a business is that you have to keep track of your inventory. If you don’t know what’s in the warehouse, or who works for you, you’re not going to get very far as a [...]]]></description>
			<content:encoded><![CDATA[<p class="note">This guest post by <a href="http://lyman.mcgill.ca/Wheeler.html">Terry Wheeler</a> originally appeared on the <a href="http://lymanmuseum.wordpress.com/2011/11/10/how-many-species/">Lyman Entomological Museum blog</a>.</p>

<p>One of the fundamental rules of running a business is that you have to keep track of your inventory. If you don’t know what’s in the warehouse, or who works for you, you’re not going to get very far as a manager. I think about this every time somebody asks me how many species of flies live in Quebec, or how many species of insects there are in Canada, or how many beetles there are on the island of Montreal. The answer to all three questions is the same – “we don’t know.”</p>

<p>Now, as embarrassing as that answer is, it’s not because I didn’t study, or because I can’t be bothered to look it up; it’s because we (“we” being the scientific community) simply DO NOT KNOW.</p>

<p>In 1979 a newly minted group called the <a href="http://www.biology.ualberta.ca/bsc/bschome.htm">Biological Survey of Canada</a> published a book called <em>Canada and its Insect Fauna</em> edited by H.V. Danks, but containing the collected wisdom of the Canadian entomological community. And the purpose of that book was to describe the state of our knowledge of the terrestrial arthropods of Canada. The question was “how many species?”. The answer was “33,672”. End of story? Not a chance. Because the other key number in that list was “32,826” – that’s a ballpark, top-of-the-head estimate of how many MORE species were living in Canada but remained unrecorded or undescribed. That number is an underestimate.</p>

<p>So, that means that in our own country, where the insects we study are running or flying or crawling or swimming around underfoot, we know less than half of our own species. It’s not because of a fundamental laziness in the entomological community (we’re a hard-working crew!); it’s simply because insects are difficult and diverse, and because there are only so many of us to get the job done.</p>

<p>In the 30-odd years since <em>Canada and its Insect Fauna</em> was published, the Biological Survey of Canada has continued to persevere in documenting the Canadian arthropod fauna. We know a lot more now than we did then, but there’s still a long way to go. New species are sitting preserved in our collections waiting to be discovered and described. New records are running around in our back yards waiting to be documented. Many habitats and regions of Canada have been explored in only a passing way for most arthropod groups. When potential new students come to my lab and ask if there are any good projects to do on taxonomy or phylogeny or inventory of insects around here, the first thing I do is pull up a chair for them. They’ll need to get comfy – it takes a while to run through the list.</p>

<p>Nothing gets a job done like spouting optimism, and the entomologists of Canada are, on the whole, a pretty optimistic crew. The job ahead is enormous, but with a great group of people chipping away, we will get the job done. <em>Canada and its Insect Fauna II</em> is the goal we need to aim for.</p>

<p>Why? Because arthropods are intimately connected to every single terrestrial ecosystem in this country. Insects and spiders and mites and their little-known relatives are the mechanics and engineers and drivers and barometers of life on earth and we won’t truly understand, or be able to manage or conserve, those ecosystems until we know who is doing the work.</p><img src="http://feeds.feedburner.com/~r/canadensys/~4/qHv8qduLPDs" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.canadensys.net/2011/how-many-species/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://www.canadensys.net/2011/how-many-species</feedburner:origLink></item>
		<item>
		<title>VASCAN functionalities</title>
		<link>http://feedproxy.google.com/~r/canadensys/~3/03Vl8Xypx8U/vascan-functionalities</link>
		<comments>http://www.canadensys.net/2011/vascan-functionalities#comments</comments>
		<pubDate>Wed, 16 Nov 2011 18:39:20 +0000</pubDate>
		<dc:creator>Peter Desmet</dc:creator>
				<category><![CDATA[Darwin Core]]></category>
		<category><![CDATA[VASCAN]]></category>

		<guid isPermaLink="false">http://www.canadensys.net/?p=1387</guid>
		<description><![CDATA[This post is also available in French. Our Database of Vascular Plants of Canada (VASCAN) is a bit over a year old now (launched on October 22, 2010) and the response has been great so far! The site is averaging 1,850 visits / 20,500 page views a month from over 100 countries, and [...]]]></description>
			<content:encoded><![CDATA[<p class="note"><a href="http://www.canadensys.net/2011/vascan-functionality?lang=fr">This post is also available in French</a>.</p>

<p>Our <a href="http://data.canadensys.net/vascan">Database of Vascular Plants of Canada (VASCAN)</a> is a bit over a year old now (launched on October 22, 2010) and the response has been great so far! The site is averaging 1,850 visits / 20,500 page views a month from over 100 countries, and is used as a source for vernacular names in <a href="http://fr.wikipedia.org/wiki/H%C3%AAtre_%C3%A0_grandes_feuilles#cite_note-1">Wikipedia</a> and as a tool in the <a href="http://www.horticulture-indigo.com/blog/?p=135">horticultural sector</a>. We thought this would be a good time to summarize some of the things you can do with VASCAN:</p>

<h2>What is VASCAN?</h2>

<p>You can read all about VASCAN in the <a href="http://data.canadensys.net/vascan/about">about page</a>. If you want a more visual explanation, here&#8217;s a <a href="http://www.slideshare.net/peterdesmet/vascan-demonstration">presentation about VASCAN</a> (also embedded at the end of this post).</p>

<h2>Name search</h2>

<p>You can search for any scientific or vernacular name by using the <a href="http://data.canadensys.net/vascan/search">name search</a>. The full search allows you to search on parts of names (starting from the beginning, e.g. <a href="http://data.canadensys.net/vascan/search?q=Carex+ag">Carex ag</a>) and the dropdown box will offer suggestions and provide shortcuts to name pages.</p>

<p><a href="http://www.canadensys.net/wp-content/uploads/vascan-functionality-dropdown.png"><img src="http://www.canadensys.net/wp-content/uploads/vascan-functionality-dropdown.png" alt="vascan-functionality-dropdown" class="aligncenter" /></a></p>

<h2>Name pages</h2>

<p>A name page interprets the scientific or vernacular name provided in the URL (e.g. 
<a href="http://data.canadensys.net/vascan/name/acer%20saccharum">http://data.canadensys.net/vascan/name/acer saccharum</a>) and shows 
relevant information if the name has been found. The system will automatically forward unique synonym and vernacular names to the accepted taxon name, displaying more relevant information (e.g. <a href="http://data.canadensys.net/vascan/name/Acer%20saccharum%20subsp.%20saccharum">Acer saccharum subsp. saccharum</a> or <a href="http://data.canadensys.net/vascan/name/sugar%20maple">Sugar maple</a>), but this functionality can be turned off by appending &#8220;<a href="http://data.canadensys.net/vascan/name/Acer%20saccharum%20subsp.%20saccharum?redirect=no">?redirect=no</a>&#8221; to the link. If a name is linked to multiple taxa, a disambiguation page will be shown (e.g. <a href="http://data.canadensys.net/vascan/name/white%20maple">white maple</a> or <a href="http://data.canadensys.net/vascan/name/Solanum%20nigrum">Solanum nigrum</a>).</p>

<p>You can use name pages to create dynamic links in your spreadsheet, database or website. Just concatenate &#8220;http://data.canadensys.net/vascan/name/&#8221; with the name. Please exclude the author and provide the infrageneric or infraspecific rank (&#8220;sect.&#8221; or &#8220;subsp.&#8221;) in the name if applicable. It is not necessary to capitalize names correctly or encode space characters as &#8220;%20&#8221;. An example of a formula you can use in Excel to create links to VASCAN is:</p>

<code>= HYPERLINK("http://data.canadensys.net/vascan/name/" &#038; A2, A2)</code>

<p><a href="http://www.canadensys.net/wp-content/uploads/vascan-functionality-namelinks.png"><img src="http://www.canadensys.net/wp-content/uploads/vascan-functionality-namelinks.png" alt="vascan-functionality-namelinks" class="aligncenter" /></a></p>
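<p>If you build these links outside Excel, the same concatenation can be sketched in Python. This is only a minimal sketch: the function name <code>vascan_name_url</code> is made up for illustration and is not part of VASCAN. Percent-encoding the name is optional, as mentioned above, but harmless.</p>

```python
from urllib.parse import quote

# Base address of the VASCAN name pages, as given in the post.
VASCAN_NAME_BASE = "http://data.canadensys.net/vascan/name/"


def vascan_name_url(name, redirect=True):
    """Build a name-page link for a scientific or vernacular name.

    Hypothetical helper: pass the name without the author; set
    redirect=False to append "?redirect=no" and stay on the synonym page.
    """
    url = VASCAN_NAME_BASE + quote(name.strip())
    if not redirect:
        url += "?redirect=no"
    return url


print(vascan_name_url("Acer saccharum"))
print(vascan_name_url("Acer saccharum subsp. saccharum", redirect=False))
```

<p>The first call prints the same address as the <em>Acer saccharum</em> example above, apart from the capital letter, which VASCAN does not mind.</p>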

<h2>Checklist builder</h2>
<p>Interested to know which <a href="http://data.canadensys.net/vascan/checklist?lang=en&#038;habit=tree&#038;taxon=0&#038;combination=anyof&#038;province=NL_N&#038;status=native&#038;rank=class&#038;rank=subclass&#038;rank=superorder&#038;rank=order&#038;rank=family&#038;rank=subfamily&#038;rank=tribe&#038;rank=subtribe&#038;rank=genus&#038;rank=subgenus&#038;rank=section&#038;rank=subsection&#038;rank=series&#038;rank=species&#038;rank=subspecies&#038;rank=variety&#038;hybrids=true&#038;nohybrids=false&#038;limitResults=true&#038;nolimit=false&#038;sort=taxonomically">trees are native in Newfoundland</a>, which <a href="http://data.canadensys.net/vascan/checklist?lang=en&#038;habit=all&#038;taxon=0&#038;combination=allof&#038;province=AB&#038;province=SK&#038;province=MB&#038;status=introduced&#038;rank=genus&#038;hybrids=true&#038;nohybrids=false&#038;limitResults=true&#038;nolimit=false&#038;sort=alphabetically">genera are introduced in all of the prairies</a> or which <a href="http://data.canadensys.net/vascan/checklist?lang=en&#038;habit=all&#038;taxon=1639&#038;combination=only_ca&#038;province=BC&#038;status=native&#038;status=introduced&#038;status=ephemeral&#038;status=excluded&#038;status=extirpated&#038;status=doubtful&#038;rank=species&#038;hybrids=true&#038;nohybrids=false&#038;limitResults=true&#038;nolimit=false&#038;sort=alphabetically"><em>Salix</em> species occur in British Columbia, but not in the rest of Canada</a>? You can find it out with the <a href="http://data.canadensys.net/vascan/checklist">checklist builder</a>, which allows you to combine a set of selection criteria (taxonomic group, habit, distribution, status or a combination of these) and display criteria (ranks to include, sort preference) to create your own customized checklist.</p>
<p><a href="http://www.canadensys.net/wp-content/uploads/vascan-functionality-checklist-builder.png"><img src="http://www.canadensys.net/wp-content/uploads/vascan-functionality-checklist-builder.png" alt="vascan-functionality-checklist-builder" class="aligncenter" /></a></p>
<p>Once you&#8217;re happy with the result, you can download the checklist as a simple tab-delimited text file or as a standardized <a href="http://www.canadensys.net/darwin-core">Darwin Core archive</a> following the <a href="http://www.gbif.org/informatics/name-services/sharing-taxonomic-data/the-gna-profile/">GBIF Global Names Architecture Profile</a>. The latter also includes the vernacular names and synonyms.</p>
<p>The data are there to be used, which is why we licensed them under the <a href="http://creativecommons.org/licenses/by-sa/3.0/">Creative Commons Attribution-ShareAlike 3.0 Unported License</a>, allowing you to build upon our work (read more about it <a href="http://data.canadensys.net/vascan/about#rights">here</a>). You can also download the <a href="http://data.canadensys.net/ipt/resource.do?r=vascan">full database as a Darwin Core archive</a>.</p>
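<p>The example checklist links above are ordinary query strings, so they can also be assembled in a script. The sketch below is an assumption-laden illustration: the parameter names and values are copied from those example links, but the function <code>checklist_url</code> and its defaults are hypothetical, and nothing here is an official VASCAN API.</p>

```python
from urllib.parse import urlencode

CHECKLIST_BASE = "http://data.canadensys.net/vascan/checklist"


def checklist_url(habit="all", taxon=0, combination="anyof",
                  provinces=(), statuses=(), ranks=(),
                  sort="taxonomically"):
    """Assemble a checklist builder link (hypothetical helper).

    Repeated parameters (province, status, rank) are written once per
    value, in the same order as in the example links from the post.
    """
    params = [("lang", "en"), ("habit", habit), ("taxon", taxon),
              ("combination", combination)]
    params += [("province", p) for p in provinces]
    params += [("status", s) for s in statuses]
    params += [("rank", r) for r in ranks]
    # Display options copied unchanged from the example links.
    params += [("hybrids", "true"), ("nohybrids", "false"),
               ("limitResults", "true"), ("nolimit", "false"),
               ("sort", sort)]
    return CHECKLIST_BASE + "?" + urlencode(params)


# Genera introduced in all of the prairie provinces.
print(checklist_url(combination="allof",
                    provinces=["AB", "SK", "MB"],
                    statuses=["introduced"],
                    ranks=["genus"],
                    sort="alphabetically"))
```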

<h2>Feedback</h2>

<p>If you discover an issue with the data or interface, or you want to send a suggestion, you can do so by clicking the feedback button on the right of every page. So far, the response has been amazing: users have submitted over 700 suggestions for correction, improving the overall quality of the database. All issues are recorded in our public <a href="http://code.google.com/p/canadensys/issues/list?can=2&#038;q=label%3AVascan">Google Code issue tracker</a>, and we take each one into consideration. You can read more about it <a href="http://data.canadensys.net/vascan/about#feedback">here</a> and <a href="http://code.google.com/p/canadensys/">here</a>.</p>

<h2>Summary</h2>

<p>And that pretty much sums it up. If you want an overview of the data and functionality, we gave a VASCAN demonstration at the <a href="http://neherbaria.org/2011_meeting">Consortium of Northeastern Herbaria meeting in Philadelphia</a>, June 2011, which is embedded below. Let us know what you think!</p>

<p><script async class="speakerdeck-embed" data-id="4f7c69a4847299001f000fe3" data-ratio="1.3333333333333333" src="//speakerdeck.com/assets/embed.js"></script></p><img src="http://feeds.feedburner.com/~r/canadensys/~4/03Vl8Xypx8U" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.canadensys.net/2011/vascan-functionalities/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://www.canadensys.net/2011/vascan-functionalities</feedburner:origLink></item>
		<item>
		<title>Counting all specimens at the Marie-Victorin Herbarium</title>
		<link>http://feedproxy.google.com/~r/canadensys/~3/0dlfZKaCVCw/counting-all-specimens-at-mt</link>
		<comments>http://www.canadensys.net/2011/counting-all-specimens-at-mt#comments</comments>
		<pubDate>Wed, 02 Nov 2011 15:30:15 +0000</pubDate>
		<dc:creator>Peter Desmet</dc:creator>
				<category><![CDATA[Collection management]]></category>
		<category><![CDATA[Darwin Core]]></category>
		<category><![CDATA[Herbaria]]></category>
		<category><![CDATA[Metadata]]></category>

		<guid isPermaLink="false">http://www.canadensys.net/?p=1366</guid>
		<description><![CDATA[Last summer, 36 volunteers inventoried all of the 22,000+ vascular plant specimen folders of our Marie-Victorin Herbarium (MT), in preparation for its move to the Université de Montréal Biodiversity Centre. I gave a talk about the process, results and potential two weeks ago at the TDWG 2011 Annual Conference in New Orleans: You can read [...]]]></description>
			<content:encoded><![CDATA[<p>Last summer, 36 volunteers inventoried all of the 22,000+ vascular plant specimen folders of our <a href="http://www.biodiversite.umontreal.ca/herbier-marie-victorin?lang=en">Marie-Victorin Herbarium (MT)</a>, in preparation for its move to the <a href="http://www.biodiversite.umontreal.ca/?lang=en">Université de Montréal Biodiversity Centre</a>. I gave a talk about the process, results and potential two weeks ago at the <a href="http://www.tdwg.org/conference2011/">TDWG 2011 Annual Conference</a> in New Orleans:</p>

<p><script async class="speakerdeck-embed" data-id="4ee52f2f6058ee004d00fb24" data-ratio="1.3333333333333333" src="//speakerdeck.com/assets/embed.js"></script></p>

<p>You can read the <a href="https://mbgserv18.mobot.org/ocs/index.php/tdwg/2011/paper/view/163">abstract for the contributed talk here</a>.</p>

<h2>Useful for the herbarium</h2>

<p>The take-home message is that this experiment has paid off really well. Thanks to a great group of volunteers, we were able to collect very useful quantitative metadata for the whole vascular plant collection, with a limited budget (5740$, all staff salary) and in a short time (158 work days, including all the post-counting processing). At 110 specimens per dollar, this was a hundred times cheaper than full digitization (1$/specimen is a number that has been floating around for a number of years).</p>
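<p>The arithmetic behind the &#8220;110 specimens per dollar&#8221; figure can be checked with a few lines of Python. This is a rough sketch that only uses numbers from this post: the budget above, and the specimen total and worldwide estimate given further down. The function names are made up for illustration.</p>

```python
SPECIMENS_COUNTED = 628_664      # specimen total reported in this post
BUDGET_DOLLARS = 5_740           # inventory budget, all staff salary
WORLDWIDE_SPECIMENS = 350_000_000  # worldwide estimate cited in this post


def specimens_per_dollar(specimens, budget):
    """Specimens inventoried for each dollar spent."""
    return specimens / budget


def inventory_cost(specimens, rate):
    """Cost of inventorying `specimens` at `rate` specimens per dollar."""
    return specimens / rate


rate = specimens_per_dollar(SPECIMENS_COUNTED, BUDGET_DOLLARS)
print(round(rate))  # 110

# Same rate applied to all herbarium specimens worldwide: roughly 3.2 million.
print(round(inventory_cost(WORLDWIDE_SPECIMENS, rate)))
```

<p>Compared with the often-quoted 1$ per specimen for full digitization, 110 specimens per dollar is the &#8220;hundred times cheaper&#8221; mentioned above.</p>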
<p>With the data, curator <a href="http://www.irbv.umontreal.ca/luc-brouillet?lang=en">Luc Brouillet</a> was able to reorganize the herbarium taxonomically, following <a href="http://www.mapress.com/phytotaxa/content/2011/f/pt00019p054.pdf">Christenhusz et al. (2011a)</a> for lycophytes, <a href="http://www.jstor.org/stable/25065646">Smith et al. (2006)</a> for ferns, <a href="http://www.mapress.com/phytotaxa/content/2011/f/pt00019p070.pdf">Christenhusz et al. (2011b)</a> for gymnosperms and <a href="http://dx.doi.org/10.1111/j.1095-8339.2009.00996.x">APG III (2009)</a> for flowering plants (the same classification as in <a href="http://data.canadensys.net/vascan/">VASCAN</a>). All folders have now been assigned a new case/tray number, which will help us tremendously in organizing the move.</p>

<p>With the data, we now also have a much more detailed overview of the collection:</p>
<ul>
	<li>628,664 specimens, which is lower than previously estimated. 21.5% are fully digitized.</li>
	<li>380 families: 82% of all known families</li>
	<li>5,298 genera</li>
	<li>6 continents. North America is further divided into Canada, the US and Central America. We also have a category for cultivated specimens.</li>
</ul>
<p>Since we counted the specimens per folder, which is a combination of a case, tray, region, genus and family, we can also calculate their distribution per variable:</p>

<iframe width="560" height="300" scrolling="no" frameborder="no" src="http://www.google.com/fusiontables/embedviz?gco_chartArea=%7B%22top%22%3A%2230%22%7D&#038;containerId=gviz_canvas&#038;q=select+col6%2C+SUM(col4)+from+1435313+&#038;qrs=where+col6+%3E%3D+&#038;qre=+and+col6+%3C%3D+&#038;qe=+group+by++col6+limit+9&#038;viz=GVIZ&#038;t=PIE&#038;width=500&#038;height=300"></iframe>

<p>Or answer questions like: &#8220;How many <em>Rubus</em> specimens do we have from Canada?&#8221; Answer: 2921, located in trays A236-07 and A238-04.</p>
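<p>Both kinds of question boil down to summing the per-folder counts over one or more variables. Here is a minimal Python sketch of that idea. The folder records, tray labels and counts below are invented for illustration only (they are not MT data), and the field names are assumptions rather than the column names of our dataset.</p>

```python
from collections import defaultdict

# Invented example records: one row per folder, with its specimen count.
folders = [
    {"case_tray": "T1", "region": "Canada", "family": "Rosaceae", "genus": "Rubus", "count": 12},
    {"case_tray": "T2", "region": "Canada", "family": "Rosaceae", "genus": "Rubus", "count": 8},
    {"case_tray": "T2", "region": "US", "family": "Rosaceae", "genus": "Rubus", "count": 5},
    {"case_tray": "T3", "region": "Canada", "family": "Sapindaceae", "genus": "Acer", "count": 20},
]


def total_by(records, *keys):
    """Sum specimen counts per combination of the given variables."""
    totals = defaultdict(int)
    for record in records:
        totals[tuple(record[k] for k in keys)] += record["count"]
    return dict(totals)


def locate(records, **criteria):
    """Return (number of specimens, sorted tray labels) for matching folders."""
    matches = [r for r in records
               if all(r[k] == v for k, v in criteria.items())]
    trays = sorted({r["case_tray"] for r in matches})
    return sum(r["count"] for r in matches), trays


print(total_by(folders, "region"))
print(locate(folders, genus="Rubus", region="Canada"))
```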

<h2>Useful for others</h2>

<p>We think our metadata is useful for others as well, which is why we published it online:</p>
<ul>
	<li>As a <a href="http://bit.ly/mt-inventory-gft">Google Fusion Table</a>, allowing you to filter, aggregate and visualize the data very quickly, exactly like the embedded pie-chart above.</li>
	<li>As a <a href="http://bit.ly/mt-inventory">Darwin Core Archive</a> on the <a href="http://data.canadensys.net/ipt">Canadensys repository</a>. Using <a href="http://rs.tdwg.org/dwc/">Darwin Core</a> to express this kind of metadata is a bit experimental, which is why the dataset is not registered with GBIF, but in my opinion it works pretty well. The only term I was missing is one for the folder&#8217;s location at the herbarium: caseTrayNumber, which is now shared in <a href="http://rs.tdwg.org/dwc/terms/index.htm#dynamicProperties">dynamicProperties</a>. The big advantage of using a Darwin Core Archive is that I can not only share the dataset, but also the purpose and process behind it (which you can read and download on the <a href="http://bit.ly/mt-inventory">IPT page</a>) <em>and</em> in a standardized format (<a href="http://knb.ecoinformatics.org/software/eml/">EML</a>); something that is not possible with Google Fusion Tables.</li>
</ul>
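<p>To give an idea of how a value such as caseTrayNumber can travel in dynamicProperties, here is a small Python sketch that encodes it as a JSON key/value pair. JSON is only one possible encoding and is an assumption here: this post does not say which encoding the published dataset uses, and the helper name is hypothetical.</p>

```python
import json


def dynamic_properties(**properties):
    """Encode extra key/value pairs as one dynamicProperties string.

    Hypothetical helper; JSON is assumed as the encoding.
    """
    return json.dumps(properties, sort_keys=True)


value = dynamic_properties(caseTrayNumber="A236-07")
print(value)  # {"caseTrayNumber": "A236-07"}

# A consumer can read the location back out again.
print(json.loads(value)["caseTrayNumber"])  # A236-07
```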
<p>Having the full inventory of the herbarium online will definitely help taxonomists who are interested in loans, but it might also attract the attention of other biodiversity researchers. It could even spark demand-driven digitization or set priorities for digitization according to the needs of users outside the taxonomic community (see <a href="https://journals.ku.edu/index.php/jbi/article/viewArticle/3988">Berendsohn &#038; Seltmann, 2011</a>), although the granularity of the data (genus, continent) might be too coarse. But at least it&#8217;s a first step towards some real numbers, and if we extrapolate our experiment to all 350,000,000 herbarium specimens worldwide (<a href="http://sciweb.nybg.org/science2/IndexHerbariorum.asp">Index Herbariorum</a>) it would &#8220;only&#8221; cost 3,200,000$ to get a geotaxonomic index for all herbarium specimens! We have updated our page on the <a href="http://biocol.org/urn:lsid:biocol.org:col:14437">Biodiversity Collections Index</a> and <a href="http://sweetgum.nybg.org/ih/herbarium.php?irn=126533">Index Herbariorum</a>. So can you!</p><img src="http://feeds.feedburner.com/~r/canadensys/~4/0dlfZKaCVCw" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.canadensys.net/2011/counting-all-specimens-at-mt/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.canadensys.net/2011/counting-all-specimens-at-mt</feedburner:origLink></item>
		<item>
		<title>Opening a Darwin Core Archive with Excel</title>
		<link>http://feedproxy.google.com/~r/canadensys/~3/QZ-6P_n4Whk/open-dwca-with-excel</link>
		<comments>http://www.canadensys.net/2011/open-dwca-with-excel#comments</comments>
		<pubDate>Fri, 26 Aug 2011 13:09:46 +0000</pubDate>
		<dc:creator>Peter Desmet</dc:creator>
				<category><![CDATA[Darwin Core]]></category>
		<category><![CDATA[Excel]]></category>
		<category><![CDATA[VASCAN]]></category>

		<guid isPermaLink="false">http://www.canadensys.net/?p=1251</guid>
		<description><![CDATA[One of the editors of VASCAN asked me for an Excel file with all the vernacular names in our database, so she could check them for typos. As the database administrator, I could have exported the names from our MySQL database, but I chose to approach the request from a user&#8217;s perspective instead. All VASCAN [...]]]></description>
			<content:encoded><![CDATA[<p>One of the editors of <a href="http://data.canadensys.net/vascan">VASCAN</a> asked me for an Excel file with all the vernacular names in our database, so she could check them for typos. As the database administrator, I could have exported the names from our MySQL database, but I chose to approach the request from a user&#8217;s perspective instead. All <a href="http://data.canadensys.net/ipt/resource.do?r=vascan">VASCAN data</a> are available as a Darwin Core Archive, so the question became: <strong>How do I open a <a href="http://rs.tdwg.org/dwc/terms/guides/text/index.htm">Darwin Core Archive</a> with <a href="http://office.microsoft.com/en-us/excel/">Excel</a></strong>? This post explains one approach.</p>

<h2>Downloading and opening the Darwin Core Archive</h2>

<p>You can download the entire VASCAN dataset as a Darwin Core Archive from the <a href="http://data.canadensys.net/ipt/resource.do?r=vascan">VASCAN website</a> or our <a href="http://data.canadensys.net/ipt/resource.do?r=vascan">IPT portal</a>. Both link to the <a href="http://data.canadensys.net/ipt/archive.do?r=vascan">same file</a>. You can also generate your own custom subset with the <a href="http://data.canadensys.net/vascan/checklist">checklist builder</a>.</p>
<p>I download the file and unzip it.</p>

<h2>Understanding the archive</h2>

<p><a href="http://www.canadensys.net/wp-content/uploads/open-dwca-with-excel-zip.png"><img src="http://www.canadensys.net/wp-content/uploads/open-dwca-with-excel-zip.png" alt="open-dwca-with-excel-zip" class="aligncenter" /></a></p>

<p>To understand the structure of the Darwin Core Archive, I need to take a look at the <a href="http://www.canadensys.net/wp-content/uploads/open-dwca-with-excel-meta.xml">meta.xml</a> file, shown here without clutter:</p>

<code>
&lt;?xml version="1.0" encoding="utf-8"?&gt;<br />
&lt;archive ... &gt;<br />
&nbsp;&lt;core ... rowType="<a href="http://rs.tdwg.org/dwc/terms/Taxon">http://rs.tdwg.org/dwc/terms/Taxon</a>"&gt;<br />
&nbsp;&nbsp;&lt;files&gt;&lt;location&gt;<strong>taxon.txt</strong>&lt;/location&gt;&lt;/files&gt;<br />
&nbsp;&nbsp;...<br />
&nbsp;&lt;/core&gt;<br />
&nbsp;&lt;extension ... rowType="<a href="http://rs.gbif.org/terms/1.0/Distribution">http://rs.gbif.org/terms/1.0/Distribution</a>"&gt;<br />
&nbsp;&nbsp;&lt;files&gt;&lt;location&gt;distribution.txt&lt;/location&gt;&lt;/files&gt;<br />
&nbsp;&nbsp;...<br />
&nbsp;&lt;/extension&gt;<br />
&nbsp;&lt;extension ... rowType="<a href="http://rs.gbif.org/terms/1.0/VernacularName">http://rs.gbif.org/terms/1.0/VernacularName</a>"&gt;<br />
&nbsp;&nbsp;&lt;files&gt;&lt;location&gt;<strong>vernacularname.txt</strong>&lt;/location&gt;&lt;/files&gt;<br />
&nbsp;&nbsp;...<br />
&nbsp;&lt;/extension&gt;<br />
&nbsp;&lt;extension ... rowType="<a href="http://rs.gbif.org/terms/1.0/Description">http://rs.gbif.org/terms/1.0/Description</a>"&gt;<br />
&nbsp;&nbsp;&lt;files&gt;&lt;location&gt;description.txt&lt;/location&gt;&lt;/files&gt;<br />
&nbsp;&nbsp;...<br />
&nbsp;&lt;/extension&gt;<br />
&lt;/archive&gt;<br />
</code>

<p>I see two interesting files:</p>
<ul>
	<li><strong>taxon.txt</strong>: the core, containing all the taxa in VASCAN</li>
	<li><strong>vernacularname.txt</strong>: an extension, containing all the vernacular names for those taxa</li>
</ul>
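<p>The same overview can be obtained with a script instead of reading meta.xml by eye. The Python sketch below parses a cut-down meta.xml modeled on the listing above. The attributes and elements hidden behind &#8220;...&#8221; in that listing are simply left out, and the function name is made up for illustration.</p>

```python
import xml.etree.ElementTree as ET

# Namespace used by Darwin Core Archive descriptor files.
DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"

# Cut-down meta.xml: only the parts shown in the post, two files kept.
META_XML = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Taxon">
    <files><location>taxon.txt</location></files>
  </core>
  <extension rowType="http://rs.gbif.org/terms/1.0/VernacularName">
    <files><location>vernacularname.txt</location></files>
  </extension>
</archive>"""


def list_data_files(meta_xml):
    """Return (role, rowType, file name) for the core and each extension."""
    root = ET.fromstring(meta_xml)
    found = []
    for element in root:
        role = element.tag.replace(DWC_TEXT_NS, "")  # "core" or "extension"
        location = element.find(f"{DWC_TEXT_NS}files/{DWC_TEXT_NS}location")
        found.append((role, element.get("rowType"), location.text))
    return found


for role, row_type, file_name in list_data_files(META_XML):
    print(role, file_name, row_type)
```

<p>For a real archive you would read the unzipped file with <code>ET.parse("meta.xml")</code> instead of parsing a string.</p>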
<p>A Darwin Core Extension allows you to store one-to-many relationships. One record in the core can link to many records in the extension. In this example, one taxon can link to many vernacular names (e.g. in different languages, preferred vs. alternative names, etc.). Just like cores, Darwin Core extensions have to be registered, so others can understand and use them. The vernacular name extension was created by GBIF and is registered <a href="http://rs.gbif.org/extension/gbif/1.0/vernacularname.xml">here</a>.</p>

<p><img src="http://www.canadensys.net/wp-content/uploads/open-dwca-with-excel-star-schema.png" alt="open-dwca-with-excel-star-schema" class="aligncenter" /></p>

<p>I decide to use both files, as I&#8217;m also interested in the taxa the vernacular names apply to.</p>
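<p>To make the one-to-many link concrete, here is a minimal Python sketch that groups vernacular names under the taxon they belong to. The two tab-separated snippets are tiny stand-ins for taxon.txt and vernacularname.txt: the identifiers are invented, and the column headers (<code>taxonID</code>, <code>coreid</code>, <code>vernacularName</code>) are assumptions, so check the headers of the actual files before reusing this.</p>

```python
import csv
import io
from collections import defaultdict

# Stand-in data with invented identifiers; real files are much larger.
TAXON_TXT = "taxonID\tscientificName\n1\tAcer saccharum\n2\tAcer rubrum\n"
VERNACULAR_TXT = (
    "coreid\tvernacularName\tlanguage\n"
    "1\tsugar maple\ten\n"
    "1\térable à sucre\tfr\n"
    "2\tred maple\ten\n"
)


def read_tsv(text):
    """Read tab-separated text into a list of dicts keyed by column header."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    return list(reader)


def names_by_taxon(taxa, vernaculars):
    """Map each scientific name to the vernacular names that point to it."""
    grouped = defaultdict(list)
    for row in vernaculars:
        grouped[row["coreid"]].append(row["vernacularName"])
    return {t["scientificName"]: grouped.get(t["taxonID"], []) for t in taxa}


print(names_by_taxon(read_tsv(TAXON_TXT), read_tsv(VERNACULAR_TXT)))
```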

<h2>Excel and UTF-8 files</h2>

<p>Like all Darwin Core text files, taxon.txt and vernacularname.txt are encoded as <a href="http://en.wikipedia.org/wiki/UTF-8">UTF-8</a>, which allows the storage of special characters and accents. In this case, this is useful for French vernacular names like &#8220;Fougères&#8221;.</p>
<p>Sadly though, Excel doesn&#8217;t open UTF-8 files correctly by default. I&#8217;m using Microsoft Excel 2008 for Mac, and all the following techniques <strong>do not work</strong>:</p>
<ul>
	<li>In Finder: Right-click file &gt; Open with &gt; Microsoft Excel</li>
	<li>In Excel: File &gt; Open&#8230; &gt; Enable: Text Files &gt; Open &gt; Text Import Wizard</li>
	<li>In Excel: File &gt; Import&#8230; &gt; Select: Text File &gt; Import &gt; Choose a file &gt; Get Data &gt; Text Import Wizard</li>
</ul>

<p><a href="http://www.canadensys.net/wp-content/uploads/open-dwca-with-excel-wizard.png"><img src="http://www.canadensys.net/wp-content/uploads/open-dwca-with-excel-wizard.png" alt="open-dwca-with-excel-wizard" class="aligncenter" /></a></p>

<p>In the Text Import Wizard I can only choose Macintosh, Windows (ANSI) and DOS or OS/2 (PC-8) as the file origin, none of which shows accents correctly, so my data in Excel looks like this:</p>

<p><a href="http://www.canadensys.net/wp-content/uploads/open-dwca-with-excel-utf8-error.png"><img src="http://www.canadensys.net/wp-content/uploads/open-dwca-with-excel-utf8-error.png" alt="open-dwca-with-excel-utf8-error" class="aligncenter" /></a></p>
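<p>What goes wrong can be reproduced in a few lines of Python: the UTF-8 bytes are fine, but they get decoded with the wrong character encoding. In this sketch I assume that &#8220;Windows (ANSI)&#8221; corresponds to the <code>cp1252</code> codec and &#8220;Macintosh&#8221; to <code>mac_roman</code>. That mapping is my assumption, not something Excel documents in the wizard.</p>

```python
def misread(text, wrong_encoding):
    """Encode text as UTF-8, then decode the bytes with another encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


# "è" is stored as two bytes in UTF-8; each byte becomes its own character.
print(misread("Fougères", "cp1252"))  # FougÃ¨res
print(misread("Fougères", "mac_roman"))

# Decoding with the right encoding gives the original text back.
print(misread("Fougères", "utf-8"))  # Fougères
```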

<p>In this example my data are stored as <a href="http://en.wikipedia.org/wiki/Tab-separated_values">tab-separated values (tsv)</a>, but the character encoding problem is the same for <a href="http://en.wikipedia.org/wiki/Comma-separated_values">comma-separated values (csv)</a>.</p>

<p>If you know for a fact that the dataset you&#8217;re working with doesn&#8217;t contain accents or special characters, you can of course use the Text Import Wizard. It has some advantages over the workaround method.</p>

<h2>Workaround</h2>

<p>The workaround I always use is to open my UTF-8 file in a decent <strong>text editor</strong>, like <a href="http://sourceforge.net/projects/smultron/">Smultron</a> for Mac (do not use <a href="http://en.wikipedia.org/wiki/TextEdit">TextEdit</a>) or <a href="http://notepad-plus-plus.org/">Notepad++</a> for PC (both free) and then copy and paste the data into Excel.</p>

<p><a href="http://www.canadensys.net/wp-content/uploads/open-dwca-with-excel-smultron.png"><img src="http://www.canadensys.net/wp-content/uploads/open-dwca-with-excel-smultron.png" alt="open-dwca-with-excel-smultron" class="aligncenter" /></a></p>

<ol>
	<li>Open the text file in a text editor</li>
	<li>Make sure the editor displays the text correctly</li>
	<li>Select all the data (cmd/ctrl+A)</li>
	<li>Copy all the data (cmd/ctrl+C)</li>
	<li>Open an empty spreadsheet in Excel</li>
	<li>Choose: Format &gt; Cells&#8230; &gt; Set category to &#8220;Text&#8221;</li>
	<li>Paste everything (cmd/ctrl+V)</li>
	<li>Choose: Format &gt; Cells&#8230; &gt; Set category to &#8220;General&#8221;</li>
	<li>Save the file as .xlsx.</li>
</ol>

<p>In step 6 you force Excel to interpret all the data as text, which avoids annoying date transformations like &#8220;2011-03-20&#8221; to &#8220;03/20/2011&#8221;. Numbers will be interpreted as text too though, so sorting might not work as expected. You can always transform numbers back by using <a href="http://www.techonthenet.com/excel/formulas/value.php">=VALUE()</a>. In step 8 we revert to &#8220;General&#8221;, so long cell values will not be shown as &#8220;######&#8221;. If you do not have dates in your dataset, you could skip steps 6 and 8.</p>

<p>Step 9: If you work with an older version of Excel (97-2004) and/or you save your file as .xls, your file will be limited to 65,536 rows.</p>

<p>This workaround method only works for tab-separated values, not for comma-separated values. It is also important to know that copying large chunks of data might slow down or crash your computer. Make sure to save often.</p>
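<p>A possible alternative to copy and paste, for those comfortable with a script, is to re-save the file as UTF-16 with a byte order mark, an encoding that Excel is commonly reported to recognize for tab-separated text. I have not verified this against Excel 2008 for Mac, so treat the Python sketch below as an experiment under that assumption. The function name and file names are hypothetical.</p>

```python
def utf8_to_utf16(src_path, dst_path):
    """Copy a UTF-8 text file to UTF-16 (little endian) with a BOM.

    "utf-8-sig" also accepts input that already starts with a UTF-8 BOM;
    newline="" keeps the original line endings untouched.
    """
    with open(src_path, encoding="utf-8-sig", newline="") as src, \
            open(dst_path, "w", encoding="utf-16-le", newline="") as dst:
        dst.write("\ufeff")  # byte order mark, so the encoding can be detected
        for line in src:
            dst.write(line)


# Example call (file names are placeholders):
# utf8_to_utf16("vernacularname.txt", "vernacularname-utf16.txt")
```

<p>As with the copy and paste method, check a few accented names and dates after opening the converted file.</p>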

<p><a href="http://www.canadensys.net/wp-content/uploads/open-dwca-with-excel-result.png"><img src="http://www.canadensys.net/wp-content/uploads/open-dwca-with-excel-result.png" alt="open-dwca-with-excel-result" class="aligncenter" /></a></p>

<h2>Final thoughts</h2>

<p>A lot of people (myself included) use Excel daily: it&#8217;s a powerful and user-friendly program to manipulate data, but it&#8217;s lousy at importing UTF-8 encoded data. I hope this post can help you avoid some of the common pitfalls. If you know an alternative method, please let us know in the comments.</p>

<p>My two text files are now correctly imported in Excel as two sheets/tabs: one for vernacular names and one for taxa. I will explain how you can link both together in another blog post.</p><img src="http://feeds.feedburner.com/~r/canadensys/~4/QZ-6P_n4Whk" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.canadensys.net/2011/open-dwca-with-excel/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://www.canadensys.net/2011/open-dwca-with-excel</feedburner:origLink></item>
		<item>
		<title>Customizing the IPT</title>
		<link>http://feedproxy.google.com/~r/canadensys/~3/aN8QMp541Tk/customizing-the-ipt</link>
		<comments>http://www.canadensys.net/2011/customizing-the-ipt#comments</comments>
		<pubDate>Fri, 29 Jul 2011 19:12:16 +0000</pubDate>
		<dc:creator>Peter Desmet</dc:creator>
				<category><![CDATA[IPT]]></category>

		<guid isPermaLink="false">http://www.canadensys.net/?p=1201</guid>
		<description><![CDATA[This post originally appeared on the GBIF Developer Blog. One of my responsibilities as the Biodiversity Informatics Manager for Canadensys is to develop a data portal giving access to all the biodiversity information published by the participants of our network. A huge portion of this task can now be done with the GBIF Integrated Publishing [...]]]></description>
			<content:encoded><![CDATA[<p class="note">This post originally appeared on the <a href="http://gbif.blogspot.com/2011/07/customizing-ipt.html">GBIF Developer Blog</a>.</p>

<p>One of my responsibilities as the Biodiversity Informatics Manager for <a href="http://www.canadensys.net/">Canadensys</a> is to develop a data portal giving access to all the biodiversity information published by the participants of our network. A huge portion of this task can now be done with the <a href="http://code.google.com/p/gbif-providertoolkit/">GBIF Integrated Publishing Toolkit version 2</a> or IPT. The IPT allows us to host biodiversity resources, manage their data and metadata, and register them with GBIF so they can appear on the <a href="http://data.gbif.org/">GBIF data portal</a>, which are all targets we want to achieve. Best of all, most management can be done by the collection managers themselves.</p>

<p>I have tested the IPT thoroughly and I am convinced the GBIF development team has done an excellent job creating a stable tool I can trust. This post explains how I have customized <a href="http://data.canadensys.net/ipt">our IPT installation</a> to integrate it with our other Canadensys websites.</p>

<p><a href="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-default.png"><img src="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-default.png" alt="customizing-the-ipt-default" class="aligncenter" /></a></p>

<h2>Background</h2>
<p>Our <a href="http://www.canadensys.net/">Canadensys community portal</a> is powered by WordPress (MySQL, PHP), while our data portal &#8211; which before the IPT installation only consisted of the <a href="http://data.canadensys.net/vascan">Database of Vascular Plants of Canada (VASCAN)</a> &#8211; is a Tomcat application. We are using different technologies because we want to use the most appropriate technology for each website. <a href="http://wordpress.org/">WordPress</a> (or <a href="http://drupal.org/">Drupal</a> for that matter) is an excellent and easy-to-use <a href="http://en.wikipedia.org/wiki/Content_management_system">CMS</a>, perfect for our community portal, but not suitable for a custom-made checklist website like VASCAN. To the user, however, both websites look the same:</p>

<p><a href="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-community-portal.png"><img src="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-community-portal.png" alt="customizing-the-ipt-community-portal" class="aligncenter" /></a></p>

<p><a href="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-data-portal.png"><img src="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-data-portal.png" alt="customizing-the-ipt-data-portal" class="aligncenter" /></a></p>

<p>We do this by using the same HTML markup and <a href="http://en.wikipedia.org/wiki/Cascading_Style_Sheets">CSS</a> for both websites. If you want to learn <a href="http://www.w3schools.com/html/default.asp">HTML</a> and <a href="http://www.w3schools.com/css/default.asp">CSS</a>, <a href="http://www.w3schools.com/">w3schools</a> provides excellent tutorials.</p>

<p>The HTML markup defines elements on a page (e.g. header, menu, content, sidebar, footer) and the CSS styles those elements (e.g. their position and color). The CSS is typically stored as one file (e.g. <a href="http://www.canadensys.net/wp-content/themes/canadensys/style.css">style.css</a>), which is referenced in the &lt;head&gt; section of a page. For dynamic websites, the HTML is typically stored as different files, one for each section of a page (e.g. header.php, sidebar.php). The server combines those files into one page whenever a page is requested. That way, changing a common element on all pages of a website (e.g. the header) can be done by changing just one file.</p>
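
<p>As a small illustration, assuming hypothetical element ids (#header, #sidebar) and values rather than the ones actually used on our websites, the CSS for two such elements might look like this:</p>

<code>/* Hypothetical example: color for the header, position for the sidebar */<br />
#header { background-color: #b00; color: #fff; }<br />
#sidebar { float: right; width: 200px; }</code>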

<p>All of this also applies to the IPT. Here&#8217;s what the IPT looks like without CSS:</p>

<p><a href="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-no-css.png"><img src="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-no-css.png" alt="customizing-the-ipt-no-css" class="aligncenter" /></a></p>

<h2>Attempt 1 &#8211; Editing the CSS and logo</h2>
<p>I made my first attempt at customizing the IPT at the <a href="http://www.gbif.org/participation/training/events/training-event-details/?eventid=113">Experts Workshop</a> in Copenhagen, changing only the CSS and logo, both of which you can find in the <a href="http://code.google.com/p/gbif-providertoolkit/source/browse/#svn%2Ftrunk%2Fgbif-ipt%2Fsrc%2Fmain%2Fwebapp%2Fstyles">/styles</a> folder of your IPT installation:</p>

<code>
<a href="http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/webapp/styles/main.css">/styles/main.css</a><br />
<a href="http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/webapp/styles/logo.jpg">/styles/logo.jpg</a>
</code>

<p>In 15 minutes, my IPT was Canadensys red and had a custom logo:</p>

<p><a href="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-red.png"><img src="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-red.png" alt="customizing-the-ipt-red" class="aligncenter" /></a></p>

<h2>Attempt 2 &#8211; Editing the FreeMarker files</h2>
<p>Even though my IPT now had its own branding, it was still noticeably different from the other Canadensys websites. The only way I could change that was by editing the HTML as well. Luckily, the sections I wanted to change were all stored as FreeMarker files in the <a href="http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/webapp/#webapp%2FWEB-INF%2Fpages%2Finc">/inc</a> folder:</p>

<code><a href="http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/webapp/WEB-INF/pages/inc/header.ftl">/WEB-INF/pages/inc/header.ftl</a> - the &lt;head&gt; section<br />
<a href="http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/webapp/WEB-INF/pages/inc/menu.ftl">/WEB-INF/pages/inc/menu.ftl</a> - the header, menu and sidebar<br />
<a href="http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/webapp/WEB-INF/pages/inc/footer.ftl">/WEB-INF/pages/inc/footer.ftl</a> - the footer<br />
<a href="http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/webapp/WEB-INF/pages/inc/header_setup.ftl">/WEB-INF/pages/inc/header_setup.ftl</a> - the header during installation</code>

<p>I incorporated the HTML structure I use for the VASCAN website into menu.ftl (including the header, menu, container and sidebar), making sure I did not break any of the IPT functionality.</p><p>I started doing the same with main.css by replacing chunks of now unused IPT CSS with CSS I copied over from VASCAN, but I quickly realized that this wasn&#8217;t the best option. Doing so would result in two CSS files, one for VASCAN and one for IPT, even though both web applications are under the same <a href="http://data.canadensys.net/">domain name</a> and share a lot of CSS. It would be easier if I only had to maintain a single stylesheet, used by both applications.</p>

<h2>Attempt 3 &#8211; One styles folder for the data portal</h2>
<p>I created a /common/styles folder under ROOT, where I placed my single common data portal stylesheet: <a href="http://data.canadensys.net/common/styles/common.css">/common/styles/common.css</a>. This would be the CSS file I could use for IPT and VASCAN. I did the same for my <a href="http://en.wikipedia.org/wiki/Favicon">favicon</a>: <a href="http://data.canadensys.net/common/images/favicon.png">/common/images/favicon.png</a>.</p>
<p>I added a reference to both files in the header.ftl of my IPT (and VASCAN):</p>

<code>&lt;link rel="stylesheet" type="text/css" href="${baseURL}/styles/main.css"&gt;<br />
&lt;link rel="stylesheet" type="text/css" href="http://data.canadensys.net/common/styles/common.css"&gt;<br />
&lt;link rel="shortcut icon" href="http://data.canadensys.net/common/images/favicon.png"&gt;</code>

<p>As you can see on the first line, I kept the reference to the default IPT stylesheet: <a href="http://data.canadensys.net/ipt/styles/main.css">${baseURL}/styles/main.css</a> (it&#8217;s perfectly fine to reference more than one CSS file). This is where I keep all the unaltered (=default) IPT CSS. In fact, I&#8217;m not removing anything from the default IPT stylesheet; I&#8217;m only commenting out the CSS that is unused or conflicting:</p>

<code>/* Unused or conflicting CSS */</code>

<p>The advantage of doing so is that I can now easily compare this commented file with the stylesheet of any new IPT version to see what has changed.</p>
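
<p>As a minimal sketch of that approach (the selector and values below are hypothetical, not taken from the actual IPT stylesheet), a conflicting rule stays in main.css but is commented out, while its replacement lives in common.css:</p>

<code>/* In main.css: a default IPT rule, kept for reference but disabled (hypothetical) */<br />
/* #footer { background: #ccc; } */<br />
<br />
/* In common.css: the shared rule used by both IPT and VASCAN (hypothetical) */<br />
#footer { background: #900; }</code>

<p>Because common.css is referenced after main.css in header.ftl, its rules also take precedence over any default rule with the same selector that is left active.</p>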

<p>With all of that done, my IPT now looks like <a href="http://data.canadensys.net/ipt">this</a>:</p>

<p><a href="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-final-1.png"><img src="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-final-1.png" alt="customizing-the-ipt-final-1" class="aligncenter" /></a></p>

<p><a href="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-final-2.png"><img src="http://www.canadensys.net/wp-content/uploads/customizing-the-ipt-final-2.png" alt="customizing-the-ipt-final-2" class="aligncenter" /></a></p>

<p>My IPT is now sporting the Canadensys header, footer and sidebar (only visible when editing a resource), making it indistinguishable from the other Canadensys websites. It is also using a more readable font-size (13.5px) and a fluid width.</p>
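
<p>Those last two changes could be expressed in common.css along these lines. Only the 13.5px value comes from this post; the selectors and the width and margin values are assumptions for illustration:</p>

<code>/* Hypothetical selectors: adapt them to the ids used in your own menu.ftl */<br />
body { font-size: 13.5px; }<br />
#container { width: auto; margin: 0 20px; } /* no fixed pixel width, so the layout follows the browser window */</code>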

<h2>Closing remarks</h2>
<p>I have (re)designed quite a lot of websites, and very often I have been so frustrated with the HTML and CSS that I just started over from scratch. I didn&#8217;t have that option here and it wasn&#8217;t necessary either. I would like to thank the GBIF development team for creating such an easily customizable tool, with logical HTML and CSS. As a reminder, the whole customization has been done by editing only 5 files:</p>

<code><a href="http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/webapp/styles/main.css">/styles/main.css</a><br />
<a href="http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/webapp/WEB-INF/pages/inc/header.ftl">/WEB-INF/pages/inc/header.ftl</a><br />
<a href="http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/webapp/WEB-INF/pages/inc/menu.ftl">/WEB-INF/pages/inc/menu.ftl</a><br />
<a href="http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/webapp/WEB-INF/pages/inc/footer.ftl">/WEB-INF/pages/inc/footer.ftl</a><br />
<a href="http://code.google.com/p/gbif-providertoolkit/source/browse/trunk/gbif-ipt/src/main/webapp/WEB-INF/pages/inc/header_setup.ftl">/WEB-INF/pages/inc/header_setup.ftl</a></code>

<p><span style="color:red;">Important</span>: Remember that installing a new IPT version will overwrite all the customized files, so make sure to back them up! I will try to figure out a way to reapply my customization automatically after an update and post about that experience in a follow-up post. In the meantime, I hope that this post will help others in the customization of their IPT.</p><img src="http://feeds.feedburner.com/~r/canadensys/~4/aN8QMp541Tk" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.canadensys.net/2011/customizing-the-ipt/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.canadensys.net/2011/customizing-the-ipt</feedburner:origLink></item>
		<item>
		<title>Specify at the Biodiversity Centre</title>
		<link>http://feedproxy.google.com/~r/canadensys/~3/DzeWZhnukB4/specify-at-the-biodiversity-centre</link>
		<comments>http://www.canadensys.net/2011/specify-at-the-biodiversity-centre#comments</comments>
		<pubDate>Sat, 09 Jul 2011 21:19:12 +0000</pubDate>
		<dc:creator>Peter Desmet</dc:creator>
				<category><![CDATA[Collection management]]></category>
		<category><![CDATA[Entomological collections]]></category>
		<category><![CDATA[Herbaria]]></category>

		<guid isPermaLink="false">http://www.canadensys.net/?p=1498</guid>
		<description><![CDATA[This post replaces an obsolete static page of the Canadensys website, which was last updated on June 8, 2010. We have since moved to FileMaker Pro for our database needs, which &#8211; although not perfect or cheap &#8211; allows batch importing, editing and exporting: features missing from Specify at the time (see also my message [...]]]></description>
			<content:encoded><![CDATA[<p class="note">This post replaces an obsolete static page of the Canadensys website, which was last updated on June 8, 2010. We have since moved to <a href="http://www.filemaker.com/">FileMaker Pro</a> for our database needs, which &#8211; although not perfect or cheap &#8211; allows batch importing, editing and exporting: features missing from <a href="http://specifysoftware.org/">Specify</a> at the time (see also <a href="https://groups.google.com/group/canadensys/browse_thread/thread/fdc5c136ca338b84/32ffc569d9329661">my message to Specify</a>)  and absolutely necessary for our data cleaning procedures. The database we developed in FileMaker is also less complicated (and thus more easily understandable) and highly customizable. The files and guidelines presented below are published in the hope that they will be useful for other Specify users, but they are no longer updated and we offer <strong>no warranty</strong> they will work. Please use at your own risk.</p>

<h2>Introduction</h2>
<p>We are using <a href="http://specifysoftware.org/">Specify</a> for the collections at the <a href="http://www.biodiversite.umontreal.ca/?lang=en">Université de Montréal Biodiversity Centre</a> and we want to share our experience and guidelines on this page. The collections that will be using Specify are:</p>

<ul>
	<li><a href="http://www.biodiversite.umontreal.ca/herbier-marie-victorin?lang=en">Marie-Victorin Herbarium (MT)</a></li>
	<li><a href="http://www.biodiversite.umontreal.ca/collection-entomologique-ouellet-robert?lang=en">Ouellet-Robert entomological collection (QMOR)</a></li>
</ul>

<h2>List of Specify fields we use</h2>
<p>The <a href="https://docs.google.com/spreadsheet/ccc?key=0Apk_TuFGIiOPdDNnNHhrbU5Rbklxc0RpZmpvTHZNa2c" class="doc_google">following document</a> is a list of the Specify fields we use per collection. We decided to follow the intended use of the Specify fields as closely as possible, as it will be more understandable for future database managers and to avoid problems with software updates. The file was first introduced in this <a href="http://groups.google.ca/group/canadensys/browse_thread/thread/26b2b9ea00debe6b">Canadensys forum post</a> and is updated as we go.</p>

<p><iframe src="https://spreadsheets.google.com/pub?key=0Apk_TuFGIiOPdDNnNHhrbU5Rbklxc0RpZmpvTHZNa2c&#038;hl=en&#038;output=html" height="420" width="840">Your browser does not support iframes.</iframe></p>

<h3>Information about <a href="https://docs.google.com/spreadsheet/ccc?key=0Apk_TuFGIiOPdDNnNHhrbU5Rbklxc0RpZmpvTHZNa2c" class="doc_google">this document</a></h3>

<p><strong>Columns 1 &#038; 2</strong>: Specify table and field name, logically grouped by colour.</p>

<p><strong>Column 3</strong>: Corresponding field in the Specify WorkBench.</p>
<ul>
	<li><strong>White</strong>: The field is available in the WorkBench by default.</li>
	<li><strong>Orange</strong>: The field is not available in the WorkBench.</li>
	<li><strong>Yellow (added to xml)</strong>: We added the field to the WorkBench by editing the xml files (see below). We hope to add all the fields we need eventually.</li>
</ul>

<p><strong>Column 4</strong>: Indication if the field is visible in the Specify user interface.</p>
<ul>
	<li><strong>Green (y)</strong>: The field is visible by default. This may depend on the collection type: botany, insect.</li>
	<li><strong>Green (xml)</strong>: We added the field to the user interface by editing the xml files. We hope we can add all the fields we need eventually.</li>
	<li><strong>Orange (n)</strong>: The field is not (yet) visible in the user interface.</li>
</ul>

<p><strong>Column 5</strong>: Explanation and remarks regarding the field or why we chose that field.</p>

<p><strong>Column 6</strong>: Field length in Specify. Useful to detect limitations.</p>

<h2>Adding fields to the WorkBench</h2>
<p>Not all fields of the Specify database are available in the Specify WorkBench (<a href="http://www.canadensys.net/wp-content/uploads/specify-workbench-fields.xls" class="doc_excel">see this outdated list</a>), but most can be added rather easily by editing two <a href="http://en.wikipedia.org/wiki/XML">XML</a> files. Below you can find the versions we created. To see what fields we have added, see column 3 of <a href="https://docs.google.com/spreadsheet/ccc?key=0Apk_TuFGIiOPdDNnNHhrbU5Rbklxc0RpZmpvTHZNa2c" class="doc_google">Specify fields used at the Université de Montréal Biodiversity Centre</a> (see above).</p>
<ul>
	<li>/Specify/config/<a href='http://www.canadensys.net/wp-content/uploads/specify_workbench_datamodel.xml'>specify_workbench_datamodel</a></li>
	<li>/Specify/config/<a href='http://www.canadensys.net/wp-content/uploads/specify_workbench_upload_def.xml'>specify_workbench_upload_def</a></li>
</ul>

<h2>How to install a Specify xml file</h2>
<ul>
	<li>Download the file. Select ‘File &gt; Save page as’ if the xml file opens in your browser.</li>
	<li>Go to the indicated folder of your Specify installation, for example: /Specify/config.</li>
	<li>Rename the current (default) file to ‘&lt;filename&gt;_default.xml’ so it is backed up.</li>
	<li>Move the downloaded (custom) file to the folder.</li>
	<li>Open Specify to check if everything works correctly.</li>
</ul>

<p><strong>Important</strong>: if you update Specify, it will overwrite the custom files with the default ones. Back up the custom files before you update!</p><img src="http://feeds.feedburner.com/~r/canadensys/~4/DzeWZhnukB4" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.canadensys.net/2011/specify-at-the-biodiversity-centre/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.canadensys.net/2011/specify-at-the-biodiversity-centre</feedburner:origLink></item>
	</channel>
</rss>

