<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Inundata</title>
	<atom:link href="http://inundata.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://inundata.org</link>
	<description></description>
	<lastBuildDate>Mon, 09 Nov 2015 20:09:32 +0000</lastBuildDate>
	<language>en-US</language>
		<sy:updatePeriod>hourly</sy:updatePeriod>
		<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.9.11</generator>
	<item>
		<title>A quick introduction to ggplot2</title>
		<link>http://inundata.org/2013/04/10/a-quick-introduction-to-ggplot2/</link>
		<comments>http://inundata.org/2013/04/10/a-quick-introduction-to-ggplot2/#comments</comments>
		<pubDate>Wed, 10 Apr 2013 20:14:33 +0000</pubDate>
		<dc:creator><![CDATA[Karthik Ram]]></dc:creator>
				<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://inundata.org/?p=816</guid>
		<description><![CDATA[My friend Jonah asked me to guest lecture in his R seminar aimed at grad students and postdocs in Integrative Biology. I gave Jonah a bunch of topic options ranging from reproducible research with R to data manipulation. The consensus was data visualization so I put together a 2 hour talk/hands on presentation for ggplot2 [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=A+quick+introduction+to+ggplot2&amp;rft.source=Inundata&amp;rft.date=2013-04-10&amp;rft.identifier=http%3A%2F%2Finundata.org%2F2013%2F04%2F10%2Fa-quick-introduction-to-ggplot2%2F&amp;rft.au=Karthik+Ram&amp;rft.format=text&amp;rft.language=English"></span><p>My friend <a href="https://sites.google.com/site/jpioviascott/">Jonah</a> asked me to guest lecture in his R seminar aimed at grad students and postdocs in <a href="http://ib.berkeley.edu/">Integrative Biology</a>. I gave Jonah a bunch of topic options ranging from reproducible research with R to data manipulation. The consensus was data visualization so I put together a 2-hour talk/hands-on presentation for ggplot2 beginners. Here are my slides and code in case anyone else might benefit from them.</p>
<p><script async class="speakerdeck-embed" data-id="ce4889d0822701304e2812313d0544b5" data-ratio="1.33333333333333" src="//speakerdeck.com/assets/embed.js"></script></p>
<h2>What worked</h2>
<ul>
<li>People got a good sense of what&#8217;s possible with <code>ggplot2</code> even if they couldn&#8217;t keep up with the examples.</li>
<li>The code worked for most attendees except for a few glitches (see below).</li>
</ul>
<h2>What didn&#8217;t work</h2>
<ul>
<li>Several people still had very old copies of R or outdated copies of ggplot2 dependencies (which made upgrading a little less intuitive).</li>
<li>Sourcing code/data from GitHub using <code>source_url</code> from <a href="https://github.com/hadley/devtools">devtools</a> doesn&#8217;t seem to work on Windows machines. I can&#8217;t replicate the problem without access to one, but I might have to work out a better way to share data.</li>
</ul>
<p>
<a href="https://speakerdeck.com/karthik/introduction-to-ggplot2">Slides are on Speakerdeck</a> and the <a href="https://github.com/karthikram/ggplot-lecture">full repository is on GitHub</a>. Please feel free to reuse, remix, or contribute to this presentation.</p>
]]></content:encoded>
			<wfw:commentRss>http://inundata.org/2013/04/10/a-quick-introduction-to-ggplot2/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Version control for science</title>
		<link>http://inundata.org/2013/02/28/version-control-for-science/</link>
		<comments>http://inundata.org/2013/02/28/version-control-for-science/#comments</comments>
		<pubDate>Thu, 28 Feb 2013 21:24:43 +0000</pubDate>
		<dc:creator><![CDATA[Karthik Ram]]></dc:creator>
				<category><![CDATA[Reproducible research]]></category>

		<guid isPermaLink="false">http://inundata.org/?p=806</guid>
		<description><![CDATA[I&#8217;ve been thinking a lot about the importance of version control in science of late. This is not just because of my involvement with multiple collaborative efforts that would be a nightmare to move forward without a structured workflow. I fortuitously got involved in a collaboration between GitHub, BiomedCentral, and a handful of bioinformatics scientists [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Version+control+for+science&amp;rft.source=Inundata&amp;rft.date=2013-02-28&amp;rft.identifier=http%3A%2F%2Finundata.org%2F2013%2F02%2F28%2Fversion-control-for-science%2F&amp;rft.au=Karthik+Ram&amp;rft.format=text&amp;rft.language=English"></span><p>I&#8217;ve been thinking a lot about the importance of version control in science of late. This is not just because of my involvement with multiple collaborative efforts that would be a nightmare to move forward without a structured workflow. I fortuitously got involved in a collaboration between GitHub, BiomedCentral, and a handful of bioinformatics scientists to explore the importance of version control in a scholarly communication context.</p>
<p>I&#8217;m happy to announce that the first outcome of the project, <a href="http://www.scfbm.org/content/8/1/7/abstract">a paper</a> (open access) outlining various use-cases for Git in science just went in press today.  Here is the <a href="http://blogs.biomedcentral.com/bmcblog/2013/02/28/github-and-biomed-central/">official announcement</a> from BMC and also a <a href="http://blogs.biomedcentral.com/bmcblog/2013/02/28/version-control-for-scientific-research/">blog post</a> that I co-wrote with <a href="http://ged.msu.edu/">C. Titus Brown</a>.</p>
<p><code><br />
Ram, K. (2013). git can facilitate greater reproducibility and increased transparency in science. Source Code Biol Med, 8, 7.<br />
</code></p>
<div data-badge-type='medium-donut' class='altmetric-embed' data-badge-details='right' data-doi='10.1186/1751-0473-8-7'></div>
]]></content:encoded>
			<wfw:commentRss>http://inundata.org/2013/02/28/version-control-for-science/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Altmetrics as a discovery tool</title>
		<link>http://inundata.org/2013/01/23/altmetrics-as-a-discovery-tool/</link>
		<comments>http://inundata.org/2013/01/23/altmetrics-as-a-discovery-tool/#comments</comments>
		<pubDate>Thu, 24 Jan 2013 00:10:46 +0000</pubDate>
		<dc:creator><![CDATA[Karthik Ram]]></dc:creator>
				<category><![CDATA[academia]]></category>
		<category><![CDATA[altmetrics]]></category>

		<guid isPermaLink="false">http://inundata.org/?p=796</guid>
		<description><![CDATA[Altmetrics is all the rage these days in the scientometrics world. One rationale for developing these metrics has been to quantify the entire range of academic output beyond publications to include everything from datasets and code to presentations. The idea is that these metrics would one day be used in tenure committees (and tenure track [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Altmetrics+as+a+discovery+tool&amp;rft.source=Inundata&amp;rft.date=2013-01-23&amp;rft.identifier=http%3A%2F%2Finundata.org%2F2013%2F01%2F23%2Faltmetrics-as-a-discovery-tool%2F&amp;rft.au=Karthik+Ram&amp;rft.format=text&amp;rft.language=English"></span><p><a href="http://altmetrics.org/manifesto/">Altmetrics</a> is all the rage these days in the scientometrics world. One rationale for developing these metrics has been to quantify the entire range of academic output beyond publications to include everything from datasets and code to presentations. The idea is that these metrics would one day be used in tenure committees (and tenure track applications?) to get a more complete picture of a researcher&#8217;s contributions. As much as I love the idea (since I do so much more than write papers) and the people behind these efforts, I honestly don&#8217;t think these metrics will have any impact on folks currently in the academic trenches (pre-tenure faculty and postdocs on the job market). But I&#8217;d love to be proven wrong here.</p>
<p>However, I do think altmetrics are a terrific discovery tool. I was recently approached with an idea for a collaboration (yes, yes I know my plate currently overfloweth but this would be several months down the line) on an emerging research topic. I was given a few hot-off-the-press articles as starting points to get my feet wet. When getting into new topics, I usually pull up some highly cited articles, look at reverse citations on Web of Science and go from there. This is somewhat harder to do for research that is really, really new. In this case, without a second thought (probably because I hang around the altmetrics community a good bit and also develop <a href="http://ropensci.org/packages/index.html#altmetrics">some tools</a> on that front), I popped the articles I had into <a href="http://impactstory.org/">ImpactStory</a> and <a href="http://altmetric.com/">Altmetric</a> and boom! pay dirt. </p>
<p>Both altmetrics providers led me to some <strong><em>really insightful blogs</em></strong> (not currently on my reading list) that gave me a lot more context about how the topic emerged and where it is likely headed. I also found Tweeps (in this case scientists on Twitter) who are working on this topic. A quick look through their lab pages and recent pubs and I have a pretty good sense of what this is all about. All over the course of a few hours. Doing the same thing a few years ago would have been impossible.</p>
]]></content:encoded>
			<wfw:commentRss>http://inundata.org/2013/01/23/altmetrics-as-a-discovery-tool/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Make your research a little more open this year</title>
		<link>http://inundata.org/2013/01/03/make-your-research-a-little-more-open-this-year/</link>
		<comments>http://inundata.org/2013/01/03/make-your-research-a-little-more-open-this-year/#comments</comments>
		<pubDate>Thu, 03 Jan 2013 19:44:53 +0000</pubDate>
		<dc:creator><![CDATA[Karthik Ram]]></dc:creator>
				<category><![CDATA[open-science]]></category>

		<guid isPermaLink="false">http://inundata.org/?p=748</guid>
		<description><![CDATA[After a winter break of working at half-speed, I&#8217;m finding it a little daunting to face the overwhelming number of projects that need my attention in the new year. As I sort through and prioritize the ones where the biggest fires are raging, I also like to use this time to reevaluate how I go [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Make+your+research+a+little+more+open+this+year&amp;rft.source=Inundata&amp;rft.date=2013-01-03&amp;rft.identifier=http%3A%2F%2Finundata.org%2F2013%2F01%2F03%2Fmake-your-research-a-little-more-open-this-year%2F&amp;rft.au=Karthik+Ram&amp;rft.format=text&amp;rft.language=English"></span><p>After a winter break of working at half-speed, I&#8217;m finding it a little daunting to face the overwhelming number of projects that need my attention in the new year. As I sort through and prioritize the ones where the biggest fires are raging, I also like to use this time to reevaluate how I go about these activities and identify areas for improvement.  If you&#8217;re in the same mental state right now, I&#8217;d like you to consider making your science a little more open as a challenge to take on in 2013.</p>
<p>More and more research published these days is <a href="http://www.youtube.com/watch?v=N2zK3sAtr-4&#038;feature=youtu.be" title="A slightly scary panda illustrating the problem">difficult to replicate</a>, validate, or build upon without all the critical components such as the underlying data and the code that was used to analyze it. Although support for open data and open science is steadily growing in the research community, putting this into practice requires some upfront investment. Since there are often no immediate incentives or payoffs, activities such as documenting code and metadata and making both available in permanent repositories with appropriate licences end up taking the back seat.<br />
<span id="more-748"></span><br />
At <a href="http://ropensci.org/">rOpenSci</a>, my fantastic <a href="http://ropensci.org/about/#devteam">colleagues</a> and I have been building <a href="http://ropensci.org/packages/">various R packages</a> that make it easy to retrieve and reuse existing data and also share your research output through persistent repositories. If you&#8217;ve come across some of these before and found them useful, but hesitated because of the learning curve associated with using them for a real-world project, you&#8217;re in luck! We&#8217;re offering our time and expertise to help you make your efforts (however small) to reuse data (or share your own research output) a reality. Our current suite of packages gets you access to a rich variety of data, from <a href="https://github.com/ropensci/treeBASE">phylogenetic</a> databases, <a href="https://github.com/ropensci/taxize_">taxonomic</a> databases, and fisheries time series to the <a href="https://github.com/ropensci/rplos">full-text of any PLOS article</a> and <a href="http://ropensci.github.com/rImpactStory/">various</a> <a href="http://ropensci.github.com/rAltmetric/">scientometrics</a> datasets. Perhaps the <a href="http://ropensci.org/tutorials/">tutorials</a> might inspire you. Even if you&#8217;re only working with data you collected, you can use rOpenSci tools to programmatically clean and submit your data, code, and/or manuscript pre-prints to <a href="http://figshare.com/">figshare</a>, a free science repository that will give you a permanent location (with a DOI) to share with colleagues. If you have additional data sources in mind that don&#8217;t have an associated R package, drop us a line. We might be able to put together something fairly quickly for you to use (or we&#8217;ll add it to our existing todo list). </p>
<p><a href="http://ropensci.org/open-science-challenge/"><img src="http://inundata.org/wp-content/uploads/2013/01/ropensci_challenge.png" alt="rOpenSci Challenge"  class="alignnone size-full wp-image-751" /></a></p>
<p>Learn more about the <a href="http://ropensci.org/open-science-challenge/" title="rOpenSci's open science challenge.">Open Science challenge</a> and get in touch. </p>
]]></content:encoded>
			<wfw:commentRss>http://inundata.org/2013/01/03/make-your-research-a-little-more-open-this-year/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
		</item>
		<item>
		<title>Formatting tables in markdown</title>
		<link>http://inundata.org/2012/12/25/formatting-tables-in-markdown/</link>
		<comments>http://inundata.org/2012/12/25/formatting-tables-in-markdown/#comments</comments>
		<pubDate>Tue, 25 Dec 2012 22:48:55 +0000</pubDate>
		<dc:creator><![CDATA[Karthik Ram]]></dc:creator>
				<category><![CDATA[markdown]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://inundata.org/?p=725</guid>
		<description><![CDATA[Since someone asked about tables in markdown in the comments section of an earlier post, I thought I&#8217;d elaborate a little more. Since the appeal of markdown is its minimalism, options for formatting tables are also fairly limited. LaTeX is a much better tool if one needs to work with complicated tables (like cells that [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Formatting+tables+in+markdown&amp;rft.source=Inundata&amp;rft.date=2012-12-25&amp;rft.identifier=http%3A%2F%2Finundata.org%2F2012%2F12%2F25%2Fformatting-tables-in-markdown%2F&amp;rft.au=Karthik+Ram&amp;rft.format=text&amp;rft.language=English"></span><p>Since <a href="http://inundata.org/2012/12/06/pre-print-servers/#comment-353">someone asked about tables in markdown</a> in the comments section of an earlier post, I thought I&#8217;d elaborate a little more. Since the appeal of markdown is its minimalism, options for formatting tables are also fairly limited. LaTeX is a much better tool if one needs to work with complicated tables (like cells that span multiple columns).</p>
<h2>Pandoc flavored tables</h2>
<p>Since I&#8217;m still discussing <a href="http://johnmacfarlane.net/pandoc/" target="_blank">Pandoc</a> as the markdown parser, I&#8217;ll stick with the table formatting options that it can correctly parse. The <a href="http://johnmacfarlane.net/pandoc/README.html#tables">pandoc user guide</a> has several examples for formatting tables. One can create simple, multi-line (although not multi-cell), and gridded (with borders) tables. Although it is possible to manually format tables (using a fixed-width font editor), it&#8217;s much easier to do it programmatically with R and one of several packages (<strong>ascii</strong>, <strong>pander</strong>). If the data were entered into a spreadsheet, those could easily be exported as <strong>csv</strong> files. Alternatively, if the data to be used in the tables were generated programmatically, perhaps as the output of an analysis in R, those could also be easily saved as <strong>csv</strong> files or directly formatted into markdown-flavored tables with R and knitr.</p>
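<p>As a rough sketch of the csv route, here is a round trip using only base R. The data frame, its column names, and the temporary file are made up for illustration; a real spreadsheet export would simply replace the temporary file.</p>
<pre><code lang="r"># Hypothetical analysis output; the column names are made up for illustration.
results &lt;- data.frame(site = c("A", "B", "C"), count = c(12, 7, 30))

# Save it as csv (a temporary file stands in for a real path) ...
csv_file &lt;- tempfile(fileext = ".csv")
write.csv(results, csv_file, row.names = FALSE)

# ... and read it back, just as you would a spreadsheet export.
tab &lt;- read.csv(csv_file, stringsAsFactors = FALSE)
tab
</code></pre>
<p>Setting <strong>row.names = FALSE</strong> keeps R from writing its row numbers as an extra, unnamed column.</p>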
<h2>Load the data</h2>
<p>If the data are already in a spreadsheet, simply read those in R. For the sake of this post, I&#8217;ll create a really simple table to use in a markdown document.</p>
<pre><code lang="r">&gt; foo &lt;- data.frame(x = 1:3, y = rnorm(3))
&gt; foo
  x          y
1 1 -1.3665947
2 2 -0.9967103
3 3 -0.6870180
</code></pre>
<p>Using the <strong>pandoc.table</strong> function in the <a href="http://cran.r-project.org/web/packages/pander/index.html">pander</a> package, this data.frame can easily be formatted with:</p>
<pre><code lang="r">
pandoc.table(foo)
</code></pre>
<p>resulting in</p>
<pre><code lang="r">-----------
 x     y   
--- -------
 1  -1.3666

 2  -0.9967

 3  -0.6870
-----------
</code></pre>
<p>with a <strong>caption</strong></p>
<pre><code lang="r">
&gt; pandoc.table(foo, caption = "This is the table caption")

-----------
 x     y   
--- -------
 1  -1.3666

 2  -0.9967

 3  -0.6870
-----------

Table: This is the table caption
</code></pre>
<p>as a <strong>gridded</strong> table</p>
<pre><code lang="r">
&gt; pandoc.table(foo, caption = "This is the table caption", 
style = "grid")


+-----+---------+
|  x  |    y    |
+=====+=========+
|  1  | -1.3666 |
+-----+---------+
|  2  | -0.9967 |
+-----+---------+
|  3  | -0.6870 |
+-----+---------+

Table: This is the table caption
</code></pre>
<p>One could also use <strong>multiline</strong> as a style option for tables containing lots of text. By default, the text is split at 30 characters, but one could specify a different width with <strong>split.cells</strong>. Wide tables can also be split (default is 80 characters) using <strong>split.tables</strong>.</p>
<p>Table headers can be justified (left, right, or center) with the <strong>justify</strong> option. Not so complicated, right?</p>
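<p>Here is a minimal sketch that combines these options, assuming a reasonably recent version of pander. The data frame and its contents are invented for illustration, and I&#8217;ve left the output out since the exact line breaks depend on the pander version.</p>
<pre><code lang="r">library(pander)

# Made-up table with one text-heavy column.
notes &lt;- data.frame(
  package = c("pander", "ascii"),
  description = c(
    "Renders R objects as pandoc markdown, including several table styles",
    "Exports R objects to a number of lightweight markup formats"
  ),
  stringsAsFactors = FALSE
)

# Multiline style, cells wrapped at 20 characters, left-justified columns.
pandoc.table(notes, style = "multiline", split.cells = 20,
             justify = "left", caption = "Packages for markdown tables")
</code></pre>
<p>If you need the markdown as a string rather than printed to the console, <strong>pandoc.table.return</strong> takes the same arguments.</p>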
]]></content:encoded>
			<wfw:commentRss>http://inundata.org/2012/12/25/formatting-tables-in-markdown/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Thoughts on a preprint server</title>
		<link>http://inundata.org/2012/12/06/pre-print-servers/</link>
		<comments>http://inundata.org/2012/12/06/pre-print-servers/#comments</comments>
		<pubDate>Thu, 06 Dec 2012 14:45:25 +0000</pubDate>
		<dc:creator><![CDATA[Karthik Ram]]></dc:creator>
				<category><![CDATA[academia]]></category>
		<category><![CDATA[markdown]]></category>

		<guid isPermaLink="false">http://inundata.org/?p=697</guid>
		<description><![CDATA[In my last post I sang praises for markdown as a way to write and collaborate on manuscripts and other scientific documents. As easy as it is to use, the one command line step is enough of a barrier for most academics. This brought back an old idea that I batted around with a few [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Thoughts+on+a+preprint+server&amp;rft.source=Inundata&amp;rft.date=2012-12-06&amp;rft.identifier=http%3A%2F%2Finundata.org%2F2012%2F12%2F06%2Fpre-print-servers%2F&amp;rft.au=Karthik+Ram&amp;rft.format=text&amp;rft.language=English"></span><p>In my <a href="http://inundata.org/2012/12/04/how-to-ditch-word/">last post</a> I sang the praises of markdown as a way to write and collaborate on manuscripts and other scientific documents. As easy as it is to use, the one command line step is enough of a barrier for most academics. This brought back an old idea that I batted around with a few folks right after <a href="http://esa.org/portland/">ESA</a>. With all the tools and web technologies that currently exist, it would be possible to create a pre-print server that runs on markdown and is version controlled with git. Here&#8217;s a rough vision for how all the pieces might fit together.</p>
<p><img src="http://inundata.org/images/mockup.png" alt="A quick draft of what this might look like" /></p>
<h2>Creating a new manuscript</h2>
<p>An author would begin by logging in with their <a href="https://github.com/">GitHub</a> (or similar) account. When a new manuscript is created, it automatically initializes a new git repository for the paper. Collaborators can be added using their GitHub accounts which automatically gives them write access. Documents are publicly readable by default (just like any GitHub repo) although one could change it to private if need be (which turns the repo private at the GitHub end).</p>
<h2>Choosing a reference library</h2>
<p>Next, authors choose a library to use with this manuscript. This could be either Mendeley or Zotero (or any other alternative) since both services have APIs and mechanisms for collaboration. With Mendeley, for example, it is trivial to read an up-to-date list of documents from a shared library <a href="http://apidocs.mendeley.com/home/user-specific-methods/user-library-document-details">using existing API methods</a>. Both services also have bookmarklets that make it easy to update missing references without ever leaving the browser (and all these efforts would show up on the desktop library at the next sync). An author could also manually upload a bib file. As authors type out citations in the editor window, the engine behind the web app autocompletes them by reading from the JSON response (when the library comes from an API call).</p>
<h2>Adding in tables and figures</h2>
<p>Although it would be ideal to embed R/Python/other code directly into the MS and have knitr add in the results and figures, I&#8217;ll leave that out of version 1 of this hypothetical server. Instead, authors can just use an uploader and add in tables (as csv files), and figures (as images).</p>
<p>As the author builds up the manuscript, snapshots (git commits) can be saved at any time with a human-friendly commit message. Even if an author doesn&#8217;t commit often, the document (and associated files) remains autosaved. At any time the manuscript can be previewed and exported into any format (with pandoc or other document conversion tool powering this part of the engine).</p>
<p>Authors familiar with git can skip the web app entirely and simply clone a copy, work locally, and commit back to the repo. This will appear seamlessly on the web version and remain transparent to co-authors.</p>
<h2>Submitting the manuscript</h2>
<p><img src="http://inundata.org/images/pull_request.png" alt="How neat would this be if you could send a pull request to a journal?" /></p>
<p>When authors are ready to submit, it could be as simple as <em>forking</em> the repo over (although a little too soon for something this efficient). For now, one could export the final PDF, or if the journal has a write API, then submit directly via an API call and quickly fill out author info with a form. Ideally there would also be a link to the full repository so reviewers can see everything.</p>
<h2>Some existing pieces</h2>
<p>There are several pieces that could be hacked together to make a first draft of this work.</p>
<ul>
<li>
<p><strong>markdown preview</strong> &#8211; There are several implementations of live rendering a markdown preview to html. <a href="http://socrates.io/#lGFHqLa">Here&#8217;s a particularly elegant one</a>. Pandoc (or even <a href="http://mojavelinux.github.com/decks/asciidoc-with-pleasure/">Asciidoc</a>) could also run behind the scenes and quickly parse the document.</p>
</li>
<li>
<p><strong>Git bindings</strong> &#8211; Abstracting git from the user (avoiding issues like merging and merge conflicts for the time being) could be done using GitHub&#8217;s existing API.</p>
</li>
<li>
<p><strong>Citations</strong> &#8211; This is already possible with the current version of Mendeley/Zotero API.</p>
</li>
<li>
<p><strong>Stats</strong> &#8211; With all the rapid development on <a href="http://shiny.rstudio.org/">Shiny</a>, executable papers with embedded R code aren&#8217;t far off. Here&#8217;s a <a href="http://glimmer.rstudio.com/yihui/knitr/">neat prototype</a> of a live, in-browser markdown file with embedded R code  being parsed by <a href="http://yihui.name/knitr/">knitr</a>.</p>
</li>
<li>
<p><strong>Comments</strong> &#8211; The issues feature on GitHub could be repurposed to serve as a feedback mechanism. Reviewers could refer to specific blocks of text using the line highlight feature.</p>
</li>
</ul>
<p><strong>Note</strong>: Although I mention a few services in the post (GitHub, Mendeley), the system is not dependent on these specific providers. Git repositories can be hosted anywhere (or even in multiple locations) and almost every reference manager can export citations as bibtex. GitHub just has the advantage of being a popular service (so a large user base), and already hosts the largest number of academic papers, software, and code used in data analyses.</p>
]]></content:encoded>
			<wfw:commentRss>http://inundata.org/2012/12/06/pre-print-servers/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>How to ditch Word</title>
		<link>http://inundata.org/2012/12/04/how-to-ditch-word/</link>
		<comments>http://inundata.org/2012/12/04/how-to-ditch-word/#comments</comments>
		<pubDate>Tue, 04 Dec 2012 20:48:36 +0000</pubDate>
		<dc:creator><![CDATA[Karthik Ram]]></dc:creator>
				<category><![CDATA[markdown]]></category>
		<category><![CDATA[open-science]]></category>
		<category><![CDATA[writing]]></category>

		<guid isPermaLink="false">http://inundata.org/?p=479</guid>
		<description><![CDATA[I spent an hour this morning polishing up a proposal. This mostly involved running spell-checks, cleaning up tables, and making sure I added in all the right references. That&#8217;s when I realized something. I haven&#8217;t used Microsoft Word to write anything in over 6 months. How fantastic! Like everyone else I&#8217;ve been complaining about MS [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=How+to+ditch+Word&amp;rft.source=Inundata&amp;rft.date=2012-12-04&amp;rft.identifier=http%3A%2F%2Finundata.org%2F2012%2F12%2F04%2Fhow-to-ditch-word%2F&amp;rft.au=Karthik+Ram&amp;rft.format=text&amp;rft.language=English"></span><p>I spent an hour this morning polishing up a proposal. This mostly involved running spell-checks, cleaning up tables, and making sure I added in all the right references. That&#8217;s when I realized something. I haven&#8217;t used <strong>Microsoft Word</strong> to write anything in over 6 months. How fantastic!</p>
<p>Like everyone else I&#8217;ve been complaining about MS Word since the last ice age but never had a better alternative. When <a href="http://daringfireball.net/projects/markdown/">Markdown</a> came around I was smitten but there were several things missing. Tables were hard to format and I still had to get the final text back into Word to insert citations. I&#8217;m happy to report that I&#8217;ve found great solutions to both these issues. So here&#8217;s a quick how-to (following up from my earlier <a href="http://inundata.org/2012/06/01/markdown-and-the-future-of-collaborative-manuscript-writing/">post</a>) for switching your writing workflow away from Word. There is a small learning curve but the payoff is wonderful.</p>
<h2>Software you&#8217;ll need</h2>
<ul>
<li>
<p><a href="http://johnmacfarlane.net/pandoc/">Pandoc</a> &#8211; It&#8217;s like the swiss army knife of document conversion. Although it&#8217;s command-line only, pandoc is easy to use and quickly converts any document into whatever format you desire.</p>
</li>
<li>
<p><a href="http://mendeley.com/">Mendeley</a> &#8211; A free reference manager. Mendeley is great for two reasons. First, it allows you to collaborate via shared libraries (especially when writing with multiple authors). Second, Mendeley can automatically export those libraries to a bib file anytime you make changes to them. <em>(update: You are free to use any reference manager. Just export a bibtex file to the folder containing your writing)</em>.</p>
</li>
<li>
<p>A markdown editor (optional) &#8211; Technically you don&#8217;t need any special software to write in markdown. Any text editor will do. However, there are several tools and helpers that make the process easier and more fun. <a href="http://markedapp.com/">Marked</a>, for example, renders a live preview in one of several styles (or custom ones). If you&#8217;re on a Mac, here is a <a href="http://mac.appstorm.net/roundups/productivity-roundups/35-markdown-apps-for-the-mac/">complete roundup of Markdown editors</a>. My favorites are <a href="http://www.iawriter.com/">iA Writer</a> and <a href="http://www.sublimetext.com/">Sublime Text</a> with the <a href="https://github.com/demon386/SmartMarkdown">SmartMarkdown package</a>. <a href="http://mouapp.com/">Mou</a> is great for beginners.</p>
</li>
<li>
<p><a href="http://yihui.name/knitr/">knitr</a> (optional) &#8211; If you plan to insert data tables, either from a spreadsheet or if you need to incorporate summary statistics, knitr will run the code for you in R and insert the output in pandoc friendly format (with the help of the <a href="http://rapporter.github.com/pander/">pander package</a>). This step isn&#8217;t necessary if you don&#8217;t require tables. I&#8217;ll describe this process in more detail in my next post.</p>
</li>
</ul>
<p>That&#8217;s it as far as set up goes.</p>
<h2>Writing your document</h2>
<p>The markdown syntax is super easy to learn. It takes all of 5 minutes to pick up and the documents are easily readable even when unparsed (unlike LaTeX). Here&#8217;s a <a href="http://warpedvisions.org/projects/markdown-cheat-sheet/">quick guide</a> to markdown syntax. Here&#8217;s what a simple markdown document looks like:</p>
<pre><code># Title
some text. 
some more text.
## a sub-heading
More text. A [link](http://google.com/). 
A figure
![Figure 1: caption](figure.png)</code></pre>
<p>This <a href="http://mouapp.com/images/Mou_Screenshot_1.png">screenshot</a> shows you unparsed and parsed markdown side by side.</p>
<h2>Adding in citations</h2>
<p>Now if you need to cite anything, first add documents to your Mendeley folder or group and have it automatically export a bib file to the same folder as your document (see Mendeley desktop&#8217;s settings). To cite any document, look in the details pane for its <strong>citation key</strong>.</p>
<p><img src="http://inundata.org/images/mendeley_key.png" alt="What the Mendeley citation key looks like" /><br />
To cite this reference, add it in like so:</p>
<p><code>some statement [@Costello2009].<br />
statement with multiple citations [@Costello2009; @Costello2010].</code></p>
<h2>Generating a pdf</h2>
<p>Now you can use Pandoc to turn this markdown file into any format you like: Word (docx), rtf, pdf, html, LaTeX, or plain text. Just change the pdf extension in the commands below to the output format you need.</p>
<p>The simple way:</p>
<p><code>pandoc document.md -o document.pdf</code></p>
<p>With citations:</p>
<p><code>pandoc document.md -o document.pdf --bibliography citations.bib</code></p>
<p>Formatting for a journal? Grab the citation style you need from <a href="https://github.com/citation-style-language/styles">here</a> and drop it into your folder. Then specify that style during document generation:</p>
<p><code>pandoc document.md -o document.pdf --bibliography citations.bib --csl style.csl</code></p>
<p>You can create a Makefile for each project and run that instead of typing the pandoc call into your terminal (although the call is super easy to remember once you use it a few times). That&#8217;s really it. You can do a lot more, like adding in results, tables, figures, and equations using <a href="http://www.mathjax.org/">mathjax</a>, but I&#8217;ll save the more advanced stuff for a future post.</p>
<h2>Workflow</h2>
<p>When starting any new writing project, I create a new folder with two files (my markdown document and a small script). If this folder doesn&#8217;t already sit inside a git repository, I initialize one so my writing is version controlled (to avoid <a href="http://www.phdcomics.com/comics/archive/phd101212s.gif" title="So this doesn't happen." target="_blank">this</a>) from the very beginning. Version control makes it really easy to return the document to any stage, remotely back it up on GitHub (and/or other locations), and edit asynchronously with multiple coauthors (all of which are impossible with Word). When I need the formatted version, I run the script, which:</p>
<ul>
<li>Copies in the most current version of the bib file from Mendeley</li>
<li>Parses my markdown with pandoc using the settings I need (citations, equations, margins) and outputs a pdf (for viewing) and a Word file (for some collaborators who still prefer this format).</li>
</ul>
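<p>A minimal sketch of what such a script could look like, using only the pandoc options shown earlier plus a margin setting. Everything specific here is an assumption to adjust for your own project: the Mendeley export path, the file names (<code>document.md</code>, <code>citations.bib</code>, <code>style.csl</code>), and the 1 inch margin. Depending on your pandoc version, you may also need to add <code>--citeproc</code> for citations to be processed.</p>
<pre><code>#!/bin/sh
# Hypothetical build script: every path and file name below is a placeholder.
set -e

# 1. Copy in the most current bib file exported by Mendeley (path is an assumption).
cp "$HOME/Documents/Mendeley/library.bib" citations.bib

# 2. Build a pdf for viewing; the geometry variable sets the page margins.
pandoc document.md -o document.pdf --bibliography citations.bib --csl style.csl -V geometry:margin=1in

# 3. Build a Word file for collaborators who prefer that format.
pandoc document.md -o document.docx --bibliography citations.bib --csl style.csl</code></pre>
<p>Saved next to the markdown file (as, say, <code>build.sh</code>), this can be run with <code>sh build.sh</code> whenever a formatted copy is needed.</p>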
<p><strong>Update:</strong> <a href="https://github.com/karthikram/smb_git">Here is a real world example</a> of how I do this. Just click the zip icon to grab a copy and test this out for yourself.</p>
]]></content:encoded>
			<wfw:commentRss>http://inundata.org/2012/12/04/how-to-ditch-word/feed/</wfw:commentRss>
		<slash:comments>41</slash:comments>
		</item>
		<item>
		<title>PLOS Altmetrics workshop</title>
		<link>http://inundata.org/2012/11/08/plos-altmetrics-workshop/</link>
		<comments>http://inundata.org/2012/11/08/plos-altmetrics-workshop/#comments</comments>
		<pubDate>Thu, 08 Nov 2012 23:48:59 +0000</pubDate>
		<dc:creator><![CDATA[Karthik Ram]]></dc:creator>
				<category><![CDATA[altmetrics]]></category>

		<guid isPermaLink="false">http://inundata.org/?p=448</guid>
		<description><![CDATA[I was fortunate enough to be invited to the PLOS altmetrics workshop held last week in Fort Mason as part of the rOpenSci team. For those of you that haven&#8217;t heard of the term altmetrics, it refers to alternative measures of scholarly impact beyond just citations which can take a very long time before being [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=PLOS+Altmetrics+workshop&amp;rft.source=Inundata&amp;rft.date=2012-11-08&amp;rft.identifier=http%3A%2F%2Finundata.org%2F2012%2F11%2F08%2Fplos-altmetrics-workshop%2F&amp;rft.au=Karthik+Ram&amp;rft.format=text&amp;rft.language=English"></span><p>I was fortunate enough to be invited to the <a href="http://www.plos.org/">PLOS</a> <a href="https://sites.google.com/site/altmetricsworkshop/">altmetrics workshop</a> held last week in Fort Mason as part of the <a href="http://ropensci.org/about/#devteam">rOpenSci team</a>. For those of you who haven&#8217;t heard of the term <strong>altmetrics</strong>, it refers to alternative measures of scholarly impact beyond just citations, which can take a very long time to become useful and may still not be a good indicator of real impact. A recent <a href="http://sciencecareers.sciencemag.org/career_magazine/previous_issues/articles/2012_11_09/caredit.a1200124">news piece in Science</a> and the <a href="http://altmetrics.org/manifesto/">original manifesto</a> written by <a href="http://jasonpriem.org/">Jason</a>, <a href="http://www.few.vu.nl/~pgroth/Site/Welcome.html">Paul</a>, and <a href="http://nitens.org/taraborelli/home">Dario</a> are also worth reading. Pedro Beltrao also posted a <a href="http://pbeltrao.blogspot.com/2012/11/scholarly-metrics-with-heart.html">summary</a> of the meeting.</p>
<p>We discussed a lot of challenges and approaches to using altmetrics, gaining wider adoption, and dealing with issues such as gaming, sentiment analysis, and context. Even though I am a strong supporter of the idea, I still struggle with these issues (as an academic) so I was happy to see some of the smartest people in this field tackle these ideas.</p>
<p>After two whole days of breakout groups and idea development, we spent day three at the really cool PLOS HQ <a href="https://sites.google.com/site/altmetricsworkshop/altmetrics-hackathon">hacking</a> together several of these ideas. I had a great time working with several others in the Alt Viz group (developing visualizations for article-level metrics), where we brainstormed and implemented a few ideas for the best ways to capture metrics for single articles over time and to build snapshots of multiple articles. You can see some of our efforts <a href="http://karthikram.github.com/almviz/">here</a> and <a href="https://github.com/karthikram/almviz">here</a>.</p>
<p>As far as rOpenSci&#8217;s contribution to ALMs, we briefly demo&#8217;ed our 3 altmetric packages: <a href="https://github.com/ropensci/raltmet">raltmet</a>, <a href="http://ropensci.github.com/rAltmetric/">rAltmetric</a> (which incidentally became available on <a href="http://cran.r-project.org/web/packages/rAltmetric/">CRAN</a> today), and <a href="http://ropensci.github.com/rImpactStory/">rImpactStory</a>. You can see slides from the demo <a href="http://ropensci.org/alm/">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://inundata.org/2012/11/08/plos-altmetrics-workshop/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Markdown and the future of collaborative manuscript writing</title>
		<link>http://inundata.org/2012/06/01/markdown-and-the-future-of-collaborative-manuscript-writing/</link>
		<comments>http://inundata.org/2012/06/01/markdown-and-the-future-of-collaborative-manuscript-writing/#comments</comments>
		<pubDate>Fri, 01 Jun 2012 16:25:43 +0000</pubDate>
		<dc:creator><![CDATA[Karthik Ram]]></dc:creator>
				<category><![CDATA[academia]]></category>
		<category><![CDATA[markdown]]></category>
		<category><![CDATA[writing]]></category>

		<guid isPermaLink="false">http://inundata.org/?p=375</guid>
		<description><![CDATA[When I first started using markdown a couple of years ago, I expected its popularity to be somewhat short lived and mostly in a blogging/note taking context. The greatest appeal of markdown is the fact the learning curve is non-existent, unparsed documents are easily readable (Latex on the other hand is not), and content can [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Markdown+and+the+future+of+collaborative+manuscript+writing&amp;rft.source=Inundata&amp;rft.date=2012-06-01&amp;rft.identifier=http%3A%2F%2Finundata.org%2F2012%2F06%2F01%2Fmarkdown-and-the-future-of-collaborative-manuscript-writing%2F&amp;rft.au=Karthik+Ram&amp;rft.format=text&amp;rft.language=English"></span><p>When I first started using <a href="http://en.wikipedia.org/wiki/Markdown">markdown</a> a couple of years ago, I expected its popularity to be somewhat short lived and mostly confined to a blogging/note taking context. The greatest appeal of markdown is that the learning curve is non-existent, unparsed documents are easily readable (LaTeX, on the other hand, is not), and content can easily be parsed to a variety of formats with minimal effort. Little did I envision that it might one day revolutionize the world of collaborative academic writing.</p>
<p>I believe a few factors have made this possible:</p>
<ol>
<li>
<p>In the last few years, <a href="https://github.com/">Github</a> has skyrocketed in popularity among academics as a way to collaborate on statistical analyses. Github is extremely markdown friendly (and has its own <a href="http://github.github.com/github-flavored-markdown/">flavored version</a>) allowing people to effortlessly document code.</p>
</li>
<li>
<p>Although document generation tools have existed in <a href="http://r-project.org">R</a> (<a href="http://r4stats.com/articles/popularity/">currently one of the most widely used statistical software tools in the academic community</a>) for quite some time, recent efforts such as <a href="http://yihui.name/">Yihui&#8217;s</a> <a href="http://yihui.name/knitr/">knitr</a> package have made it much easier for people to weave results and figures not only into traditional document formats such as LaTeX but also into markdown. The clutter-free readability of markdown makes it easy to write and edit text alongside results, and it has a lower barrier to entry than LaTeX. Combine this with the free, cross-platform document generator <a href="http://johnmacfarlane.net/pandoc/">Pandoc</a>, and one can easily embed citations and journal styles to programmatically generate a final document in any desired format.</p>
</li>
<li>
<p>Github&#8217;s <a href="https://github.com/blog/831-issues-2-0-the-next-generation">powerful issue tracker</a> provides a quick and easy way to solicit feedback from collaborators, track milestones, and, more importantly, leverage <a href="http://git-scm.com/">Git&#8217;s version control</a> capabilities (no more Word document clutter).</p>
</li>
</ol>
<p>Although only a handful of people are currently writing manuscripts in markdown, I&#8217;m really excited at the prospect of making this my primary workflow for all future (especially collaborative) manuscripts. All the results and figures can be generated by knitr, citations embedded using Pandoc, and the final document converted on the fly into one of many formats (latex, word, rtf, markdown) while the entire workflow (code, analyses, manuscript) remains synched with all collaborators via Github.</p>
<p>I&#8217;m planning to write a series of detailed posts describing my workflow that involves Github + Knitr + Pandoc. Stay tuned.</p>
]]></content:encoded>
			<wfw:commentRss>http://inundata.org/2012/06/01/markdown-and-the-future-of-collaborative-manuscript-writing/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Imposter week</title>
		<link>http://inundata.org/2012/04/30/imposter-week/</link>
		<comments>http://inundata.org/2012/04/30/imposter-week/#comments</comments>
		<pubDate>Mon, 30 Apr 2012 15:04:57 +0000</pubDate>
		<dc:creator><![CDATA[Karthik Ram]]></dc:creator>
				<category><![CDATA[academia]]></category>
		<category><![CDATA[ecology]]></category>

		<guid isPermaLink="false">http://inundata.org/?p=356</guid>
		<description><![CDATA[I&#8217;ll freely admit that even as a postdoc I suffer from quite a bit of impostor syndrome, more so than when I was a grad student. Although this feeling is widespread among academics, it is not impossible to beat. Looks like everyone has decided to speak out about it this week on the academic blogosphere. [&#8230;]]]></description>
				<content:encoded><![CDATA[<!-- coins metadata inserted by kblog-metadata -->
<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=kblog-metadata.php&amp;rft.title=Imposter+week&amp;rft.source=Inundata&amp;rft.date=2012-04-30&amp;rft.identifier=http%3A%2F%2Finundata.org%2F2012%2F04%2F30%2Fimposter-week%2F&amp;rft.au=Karthik+Ram&amp;rft.format=text&amp;rft.language=English"></span><p>I&#8217;ll freely admit that even as a postdoc I suffer from quite a bit of impostor syndrome, more so than when I was a grad student. Although this feeling is widespread among academics, it is not impossible to beat. Looks like everyone has decided to speak out about it this week on the academic blogosphere. It started out last week with a great post by fellow blogger and tweep <a href="http://contemplativemammoth.wordpress.com/2012/04/25/how-i-cured-my-imposter-syndrome/">Jacqueline Gill</a> on how she overcame her impostor syndrome. There is also this really comprehensive post (with a bucket load of links) at <a href="http://contemplativemammoth.wordpress.com/2012/04/25/how-i-cured-my-imposter-syndrome/">Neurotic Physiology</a>.<br />
If you&#8217;re a postdoc reading this, this post (<a href="http://thetightropeblog.wordpress.com/2012/04/19/some-days-i-just-want-to-crawl-under-my-desk-and-cry-16/">Some days, I just want to crawl under my desk and cry</a>) best describes how I feel some days.  </p>
<p>PS: It gets better.</p>
]]></content:encoded>
			<wfw:commentRss>http://inundata.org/2012/04/30/imposter-week/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
