<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
<channel>
	<title>Campagne Laboratory</title>
	
	<link>http://campagnelab.org</link>
	<description>News about research from the Campagne laboratory.</description>
	<lastBuildDate>Wed, 22 Feb 2012 19:41:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/campagnelab" /><feedburner:info uri="campagnelab" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><geo:lat>40.76842</geo:lat><geo:long>-73.96045</geo:long><item>
		<title>Estimating methylation rate over genomic regions and sites</title>
		<link>http://feedproxy.google.com/~r/campagnelab/~3/aS99ollYzn0/</link>
		<comments>http://campagnelab.org/3445/#comments</comments>
		<pubDate>Wed, 22 Feb 2012 13:45:48 +0000</pubDate>
		<dc:creator>Fabien Campagne</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">http://campagnelab.org/?p=3445</guid>
		<description><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/3445/"></g:plusone></div>
&#160; We posted a new tutorial that explains how to use Goby to estimate methylation rates over regions (requires Goby 1.9.8.3+). The illustration on the left shows estimates of methylation over both regions and individual sites. Files were generated with GobyWeb ready for visualization in IGV. The tutorial explains how to produce these files on [...]]]></description>
			<content:encoded><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/3445/"></g:plusone></div>
<div id="attachment_3436" class="wp-caption alignleft" style="width: 160px"><a href="http://campagnelab.org/wp-content/uploads/2012/02/Dnmt3b-region-sites-snapshot-23.png"><img class="size-thumbnail wp-image-3436" title="Dnmt3b-region-sites-snapshot-2" src="http://campagnelab.org/wp-content/uploads/2012/02/Dnmt3b-region-sites-snapshot-23-150x150.png" alt="Visualizing both region and site estimates of methylation rates with IGV" width="150" height="150" /></a><p class="wp-caption-text">Visualizing both region and site estimates of methylation rates with IGV</p></div>
<p>&nbsp;</p>
<p>We posted a new tutorial that explains how to use Goby to <a href="http://campagnelab.org/software/goby/tutorials/methylation-analyses-over-annotated-regions/">estimate methylation rates over regions</a> (requires Goby 1.9.8.3+). The illustration on the left shows estimates of methylation over both regions and individual sites. Files were generated with <a href="http://gobyweb.campagnelab.org">GobyWeb</a> ready for visualization in IGV. The tutorial explains how to produce these files on the command line with <a href="http://goby.campagnelab.org">Goby</a>. [<a href="http://campagnelab.org/software/goby/tutorials/methylation-analyses-over-annotated-regions/">tutorial</a>]</p>
<img src="http://feeds.feedburner.com/~r/campagnelab/~4/aS99ollYzn0" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://campagnelab.org/3445/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://campagnelab.org/3445/</feedburner:origLink></item>
		<item>
		<title>Goby 1.9.8.3 released</title>
		<link>http://feedproxy.google.com/~r/campagnelab/~3/d9C5xzfBXis/</link>
		<comments>http://campagnelab.org/goby-1-9-8-3-released/#comments</comments>
		<pubDate>Thu, 16 Feb 2012 22:19:51 +0000</pubDate>
		<dc:creator>Fabien Campagne</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">http://campagnelab.org/?p=3403</guid>
		<description><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/goby-1-9-8-3-released/"></g:plusone></div>
We have released Goby 1.9.8.3. This version makes it possible to estimate methylation rates over annotated regions of the genome and fixes a few minor bugs. The recently released version of GobyWeb (version 1.7.1) requires Goby 1.9.8.3+. See the project change log for more details.]]></description>
			<content:encoded><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/goby-1-9-8-3-released/"></g:plusone></div>
<p>We have released Goby 1.9.8.3. This version makes it possible to estimate methylation rates over annotated regions of the genome and fixes a few minor bugs. The recently released version of GobyWeb (version 1.7.1) requires Goby 1.9.8.3+. See the project <a href="http://campagnelab.org/software/goby/change-log/">change log</a> for more details.</p>
<img src="http://feeds.feedburner.com/~r/campagnelab/~4/d9C5xzfBXis" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://campagnelab.org/goby-1-9-8-3-released/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://campagnelab.org/goby-1-9-8-3-released/</feedburner:origLink></item>
		<item>
		<title>GobyWeb 1.7.1 released</title>
		<link>http://feedproxy.google.com/~r/campagnelab/~3/aaWzZx1RF1Y/</link>
		<comments>http://campagnelab.org/gobyweb-1-7-1-released/#comments</comments>
		<pubDate>Thu, 16 Feb 2012 22:13:48 +0000</pubDate>
		<dc:creator>Fabien Campagne</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">http://campagnelab.org/?p=3399</guid>
		<description><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/gobyweb-1-7-1-released/"></g:plusone></div>
We have released the binary distribution for GobyWeb 1.7.1 (release dated Feb 16 2012). This is a stable release with a few bug fixes and is the first public release to include the plugin system. The plugin mechanism makes it easier to integrate new aligners and analysis tools with GobyWeb. New plugins can be added at [...]]]></description>
			<content:encoded><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/gobyweb-1-7-1-released/"></g:plusone></div>
<p><a href="http://campagnelab.org/wp-content/uploads/2010/03/gobyweb_logo.png"><img class="alignleft size-full wp-image-1861" title="gobyweb_logo" src="http://campagnelab.org/wp-content/uploads/2010/03/gobyweb_logo.png" alt="" width="181" height="116" /></a>We have released the binary distribution for <a href="http://gobyweb.campagnelab.org/">GobyWeb</a> 1.7.1 (release dated Feb 16 2012). This is a stable release with a few bug fixes and is the first public release to include the plugin system. The plugin mechanism makes it easier to integrate new aligners and analysis tools with GobyWeb. New plugins can be added at run-time and the user interface adapts as needed. The binary distribution can be obtained <a href="http://campagnelab.org/software/gobyweb/license-binary-distribution-and-installation-instructions/">here</a>. We are in the process of updating the installation instructions (the plugin system has more hooks for configuration and to adapt to a local system).</p>
<img src="http://feeds.feedburner.com/~r/campagnelab/~4/aaWzZx1RF1Y" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://campagnelab.org/gobyweb-1-7-1-released/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://campagnelab.org/gobyweb-1-7-1-released/</feedburner:origLink></item>
		<item>
		<title>Goby 1.9.8.2</title>
		<link>http://feedproxy.google.com/~r/campagnelab/~3/pjt1rmlCJmg/</link>
		<comments>http://campagnelab.org/goby-1-9-8-2/#comments</comments>
		<pubDate>Sat, 28 Jan 2012 18:24:33 +0000</pubDate>
		<dc:creator>Fabien Campagne</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">http://campagnelab.org/?p=3373</guid>
		<description><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/goby-1-9-8-2/"></g:plusone></div>
We have released Goby 1.9.8.2. This version offers the vcf-subset and vcf-compare replacement tools I mentioned in my earlier VCF post. The release also packs an option to call indels with Goby. We use the method of Krawitz et al (Bioinformatics 2010) to find equivalent indel regions (EIR). This approach can reconcile distinct indel observations into canonical indel [...]]]></description>
			<content:encoded><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/goby-1-9-8-2/"></g:plusone></div>
<p>We have released Goby 1.9.8.2. This version offers the vcf-subset and vcf-compare replacement tools I mentioned in my earlier VCF post.</p>
<p>The release also packs an option to call indels with Goby. We use the method of Krawitz et al (Bioinformatics 2010) to find equivalent indel regions (EIR). This approach can reconcile distinct indel observations into canonical indel boundaries (an EIR). The genotype and compare-groups formats of the discover-sequence-variants mode will output EIRs at a frequency that sums over all the possible indel variations observed at the site that can be explained by that EIR. Of course, there is quite a bit more to the Goby indel calling approach than the Krawitz method. For instance, the approach is integrated with the fast algorithm for local realignment around indels, so that indels that open when realigning the ends of reads contribute to the frequency of an EIR.</p>
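<p>To give a feel for what an equivalent indel region is, here is a minimal sketch for the simplest case, a deletion in a plain reference string. It is an illustration only and not Goby's implementation: the function name and the 0-based coordinates are assumptions made for this example, and insertions are not handled.</p>

```python
def equivalent_deletion_region(reference, start, length):
    """Return (leftmost_start, rightmost_end), 0-based and end-exclusive.

    Deleting any `length` consecutive bases that lie entirely inside this
    region yields the same sequence as deleting reference[start:start + length].
    """
    # Shifting the deletion one base to the left leaves the result unchanged
    # when the base entering on the left equals the base leaving on the right.
    left = start
    while left > 0 and reference[left - 1] == reference[left + length - 1]:
        left -= 1
    # Same reasoning for shifting one base to the right.
    right = start
    while len(reference) > right + length and reference[right] == reference[right + length]:
        right += 1
    return left, right + length


# Hypothetical example: deleting one T from a run of four Ts.
region = equivalent_deletion_region("GATTTTC", 3, 1)
```

<p>Two reads that report the deletion at different positions inside the same region describe the same event, which is why their counts can be summed.</p>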
<p>Programmers will find that Goby represents  observed indels at a site in a very similar way to base genotypes. Reading a base or indel frequency at a position in a sample is done with the same API (see the <a href="http://icbtools.med.cornell.edu/javadocs/goby/">SampleCountInfo</a> class). This makes it easy to support indels in different output formats.</p>
<p>The vcf-compare replacement (new in this release) can keep random samples of positions that differ between input files according to each category of differences it tallies (e.g., missed one allele RA vs RR, missed two alleles AA vs RR, genotypes differ C/T vs A/T where R=G). This is quite useful in inspecting positions in a genome viewer to try and understand differences between calls made by two approaches.</p>
<p>More details about this release are in the <a href="http://campagnelab.org/software/goby/change-log/">ChangeLog</a>.</p>
<img src="http://feeds.feedburner.com/~r/campagnelab/~4/pjt1rmlCJmg" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://campagnelab.org/goby-1-9-8-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://campagnelab.org/goby-1-9-8-2/</feedburner:origLink></item>
		<item>
		<title>Stumbled on PLINK/SEQ while looking for a tool to estimate Ti/Tv from VCF files. At the moment, P…</title>
		<link>http://feedproxy.google.com/~r/campagnelab/~3/kwiv_RRa_kI/</link>
		<comments>http://campagnelab.org/stumbled-on-plinkseq-while-looking-for-a-tool-to-estimate-titv-from-vcf-files-at-the-moment-p/#comments</comments>
		<pubDate>Thu, 12 Jan 2012 00:09:58 +0000</pubDate>
		<dc:creator>Fabien Campagne</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">http://campagnelab.org/stumbled-on-plinkseq-while-looking-for-a-tool-to-estimate-titv-from-vcf-files-at-the-moment-p/</guid>
		<description><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/stumbled-on-plinkseq-while-looking-for-a-tool-to-estimate-titv-from-vcf-files-at-the-moment-p/"></g:plusone></div>
Stumbled on PLINK/SEQ while looking for a tool to estimate Ti/Tv from VCF files. At the moment, PLINK/SEQ seems limited to some older version of VCF, so it does not quite work with the files Goby generates (4.1), but I am interested in the mention of a binary file alternative to VCF, which could speed [...]]]></description>
			<content:encoded><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/stumbled-on-plinkseq-while-looking-for-a-tool-to-estimate-titv-from-vcf-files-at-the-moment-p/"></g:plusone></div>
<p>Stumbled on PLINK/SEQ while looking for a tool to estimate Ti/Tv from VCF files. At the moment, PLINK/SEQ seems limited to some older version of VCF, so it does not quite work with the files Goby generates (4.1), but I am interested in the mention of a binary file alternative to VCF, which could speed up some of the work we do (see section about project creation).
<div class="g-crossposting-att">
<div class="g-crossposting-att-title"><a href="http://atgu.mgh.harvard.edu/plinkseq/tutorial.shtml" target="_blank">PLINK/SEQ genetics library</a></div>
<div class="g-crossposting-att-img" style="float:left"><a href="http://atgu.mgh.harvard.edu/plinkseq/tutorial.shtml" target="_blank"><img src="http://images0-focus-opensocial.googleusercontent.com/gadgets/proxy?container=focus&amp;gadget=a&amp;resize_h=100&amp;url=http%3A%2F%2Fatgu.mgh.harvard.edu%2Fplinkseq%2Fimg%2Fsing-by-dp.png" /></a></div>
<div class="g-crossposting-att-txt">Tutorial: working with 1000 Genomes Pilot 3 VCFs. By way of introducing some of the features and approaches of PLINK/Seq, this page provides a tutorial that uses PSEQ and the R interface to PLINK/Seq &#8230;</div>
</div>
<div class="g-crossposting-backlink"><a href="https://plus.google.com/116874816214311977726/posts/a1rFXT4q58U" target="_blank">This was posted on Google+&hellip;</a></div>
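<p>For readers unfamiliar with Ti/Tv: it is the ratio of transitions (A/G and C/T changes) to transversions (all other single-base changes). A minimal sketch of the calculation follows. The function name and the input shape (REF, ALT pairs already extracted from a VCF file) are assumptions for illustration; this is neither PLINK/SEQ nor Goby code.</p>

```python
BASES = set("ACGT")
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(ref_alt_pairs):
    """Ratio of transitions to transversions over single-base (REF, ALT) pairs."""
    ti = tv = 0
    for ref, alt in ref_alt_pairs:
        ref, alt = ref.upper(), alt.upper()
        # Skip indels, symbolic or missing alleles, and non-variant records.
        if ref not in BASES or alt not in BASES or ref == alt:
            continue
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    # With no transversions the ratio is undefined.
    return ti / tv if tv else float("nan")
```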
<img src="http://feeds.feedburner.com/~r/campagnelab/~4/kwiv_RRa_kI" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://campagnelab.org/stumbled-on-plinkseq-while-looking-for-a-tool-to-estimate-titv-from-vcf-files-at-the-moment-p/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://campagnelab.org/stumbled-on-plinkseq-while-looking-for-a-tool-to-estimate-titv-from-vcf-files-at-the-moment-p/</feedburner:origLink></item>
		<item>
		<title>Here’s a nice blog-review about studies that discovered variations causing diseases with next-gen…</title>
		<link>http://feedproxy.google.com/~r/campagnelab/~3/tRpcDFz7mdk/</link>
		<comments>http://campagnelab.org/heres-a-nice-blog-review-about-studies-that-discovered-variations-causing-diseases-with-next-gen/#comments</comments>
		<pubDate>Wed, 11 Jan 2012 16:21:07 +0000</pubDate>
		<dc:creator>Fabien Campagne</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">http://campagnelab.org/heres-a-nice-blog-review-about-studies-that-discovered-variations-causing-diseases-with-next-gen</guid>
		<description><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/heres-a-nice-blog-review-about-studies-that-discovered-variations-causing-diseases-with-next-gen/"></g:plusone></div>
Here&#39;s a nice blog-review about studies that discovered variations causing diseases with next-generation sequencing. In addition to the content, I like the comment of the author that there is no time to write a review paper given the speed of the field development. Obviously what he means (since he has written the material already), is [...]]]></description>
			<content:encoded><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/heres-a-nice-blog-review-about-studies-that-discovered-variations-causing-diseases-with-next-gen/"></g:plusone></div>
<p>Here&#39;s a nice blog-review about studies that discovered variations causing diseases with next-generation sequencing. </p>
<p>In addition to the content, I like the comment of the author that there is no time to write a review paper given the speed of the field development. Obviously what he means (since he has written the material already), is that publishing in a journal has very high overheads for the author(s) that simply slow down communication of information. In this case, I would argue that the comments have served a similar role as peer-review, prompting the author to add links to the papers, adding to the information or modulating the claims of novelty (see comment about RET).
<div class="g-crossposting-att">
<div class="g-crossposting-att-title"><a href="http://www.massgenomics.org/2011/12/disease-causing-mutations-discovered-by-ngs-in-2011.html" target="_blank">Disease-causing Mutations Discovered by NGS | MassGenomics</a></div>
<div class="g-crossposting-att-txt">The number of human genetic diseases unraveled by next-generation sequencing skyrocketed this year. Several factors contributed to this growth, two of which were the ever-increasing throughput of sequ&#8230;</div>
</div>
<div class="g-crossposting-backlink"><a href="https://plus.google.com/116874816214311977726/posts/UUidexCw1U3" target="_blank">This was posted on Google+&hellip;</a></div>
<img src="http://feeds.feedburner.com/~r/campagnelab/~4/tRpcDFz7mdk" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://campagnelab.org/heres-a-nice-blog-review-about-studies-that-discovered-variations-causing-diseases-with-next-gen/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://campagnelab.org/heres-a-nice-blog-review-about-studies-that-discovered-variations-causing-diseases-with-next-gen/</feedburner:origLink></item>
		<item>
		<title>Evaluating Goby against the 1000 genome genotype calls and why is VCF so inefficient?</title>
		<link>http://feedproxy.google.com/~r/campagnelab/~3/pjkPwrDV564/</link>
		<comments>http://campagnelab.org/evaluating-goby-against-the-1000-genome-genotype-calls-and-why-is-vcf-so-inefficient/#comments</comments>
		<pubDate>Sat, 07 Jan 2012 19:22:08 +0000</pubDate>
		<dc:creator>Fabien Campagne</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">http://campagnelab.org/?p=3328</guid>
		<description><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/evaluating-goby-against-the-1000-genome-genotype-calls-and-why-is-vcf-so-inefficient/"></g:plusone></div>
We have recently started a large-scale evaluation of the genotype calling features of Goby and GobyWeb. To this end, we decided to obtain exome data from the 1000 genome project, and compare the genotypes called by Goby when all processing is done with GobyWeb (alignment and genotype calls). Since the 1000g project has way more [...]]]></description>
			<content:encoded><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/evaluating-goby-against-the-1000-genome-genotype-calls-and-why-is-vcf-so-inefficient/"></g:plusone></div>
<p>We have recently started a large-scale evaluation of the genotype calling features of Goby and GobyWeb. To this end, we decided to obtain exome data from the 1000 genome project, and compare the genotypes called by Goby when all processing is done with GobyWeb (alignment and genotype calls). Since the 1000g project has way more data than we need for this evaluation, we picked two exome samples semi-randomly. Both are paired-end and one has length 76bp, while the other is 90bp long.</p>
<h3>Exome data realignments</h3>
<p>Realigning the reads to the 1000g reference was no trouble: we simply converted the bam files distributed by the 1000g project to the compact-reads format and uploaded them to GobyWeb. The rest is pretty much automated and was done in a matter of hours.</p>
<h3>Extracting a few samples from the 1000g VCF files</h3>
<p>The 1000g project distributes many versions of the genotype calls, in the VCF format. Locating the version that was produced against the 1000g reference (based on hg19) that we have installed in GobyWeb was a bit tricky since there is really no summary of all the versions. Thanks to Juan Rodriguez-Flores (in Jason Mesei&#8217;s lab at the ICB), for recommending this version:</p>
<p><a href="ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/technical/working/20111111_old_phase1_release_files/">ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/technical/working/20111111_old_phase1_release_files/</a></p>
<p>(we had initially gone directly to the very latest release, being wary of the &#8220;old&#8221; keyword in this version, but the most recent version turned out to have been aligned against some ancestral reference reconstructed from primates, as far as we could tell, and would not work for this validation).</p>
<p>As one can see from this directory, genotype calls are given in the VCF format, and split into one file per chromosome (excluding the X and Y chromosomes and MT). The files are in the tens of GB even though they are compressed. The files are large because they contain genotypes and annotations for hundreds of samples studied in the 1000g project.</p>
<p>Trying vcf-compare against just one of these files convinced us the comparison would be too slow against the complete files. We decided to extract just the two samples we selected for validation to yield smaller files that could be compared more efficiently.</p>
<p>Fortunately (or so we initially thought), VCF-tools provides a program called vcf-subset. Let&#8217;s just run this program with the names of the two samples we need to extract on each of these chromosomes, then concat the results. It turns out that vcf-subset is incredibly slow for the work it needs to perform. To be more specific, on a fast server, after a day of processing, we had not finished extracting the two samples from the chromosome 1 file. Upon closer inspection, the perl process was running at 100% CPU, but did not appear to make much progress through this file (as judged from the speed at which results were added to the output). At this point, rather than throwing more CPUs at the problem and going on vacation (Christmas time), we decided to go green (consider that inefficient programs are just as detrimental to our environment as other wasteful ways to burn oil).</p>
<p>Since Goby provides an efficient VCF parser, we reasoned we could write a more efficient way to extract the data we needed without too much trouble. To this end, we added a vcf-subset mode in Goby (an early implementation of this mode made it to 1.9.8.1, but we suggest getting the source code directly from our <a href="http://campagnelab.org/software/goby/download-goby/">subversion server</a> until we push 1.9.8.2 since the mode has improved a lot since that first release). Re-implementing the mode indeed provided a performance boost, but also offered an opportunity to add new options. One new feature that we added quickly was to process a number of files in parallel. We are now able to subset the 1000g VCF files in a few hours on a multi-threaded server.</p>
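<p>For readers who have not looked inside a VCF file, the core of the subsetting task can be sketched in a few lines. This is only an illustration of the idea, not the Goby mode or the VCF-tools script: the function name is hypothetical, and the sketch copies the INFO column as-is instead of recomputing per-site summaries for the reduced set of samples.</p>

```python
def subset_vcf_lines(lines, wanted_samples):
    """Yield VCF lines that keep the nine fixed columns plus the wanted samples."""
    keep = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            yield line  # meta-information lines pass through unchanged
            continue
        columns = line.split("\t")
        if line.startswith("#CHROM"):
            # Header row: columns 0-8 are fixed, sample names follow.
            keep = list(range(9)) + [
                i for i, name in enumerate(columns)
                if i >= 9 and name in wanted_samples
            ]
        elif keep is None:
            raise ValueError("data line seen before the #CHROM header")
        yield "\t".join(columns[i] for i in keep)


# Hypothetical three-sample input, reduced to samples S1 and S3.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\t0/0",
]
subset = list(subset_vcf_lines(example, {"S1", "S3"}))
```

<p>Since each chromosome file is independent, a function like this can be applied to several files at once, which is the kind of parallelism described above.</p>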
<h3>Why is vcf-subset so slow?</h3>
<p>This is obviously much better, but one has to wonder what is taking so long. After all, the input files are only a few Gigabytes, and we don&#8217;t need to do anything complicated, just extract a subset of information. It turns out that the design of the VCF format makes the task very computationally demanding (much more so than it would need to be). First of all, VCF is a text-based format, which is inherently slower to parse than a binary format. To complicate matters, the format of the file can vary from line to line (look at the specification of the FORMAT and sample columns). In my opinion, this is a very poor design decision. Have you ever encountered a VCF file that uses this feature (e.g., one that has different fields in the FORMAT column on different lines)? The need is not common, yet every program must be written to support this &#8220;flexibility&#8221; and there is a clear computing cost for supporting the feature. This is a typical red flag that should have told the designers that the added flexibility was not worth the compute cost.</p>
<p>Another complication introduced by VCF is that the type of delimiter varies by field (the INFO fields are delimited by ;, FORMAT fields use :, while other fields are delimited by tabs). All these &#8220;features&#8221; may seem to provide flexibility, but they combine to create significant inefficiencies. The format looks as if it was designed so that humans can read it, yet it is now used to store gigabytes of data. All this suggests to me that the committee/members of the mailing list who have been responsible for the design of VCF could have paid more attention to the actual uses of the format and should have weighed the impact of adding &#8220;nice to have, but not that frequent&#8221; features against the computational cost of these features. Then again, most people don&#8217;t care if a program or format is inefficient, at least until the computational cost makes some needed tasks impractical. It seems that VCF may be ripe for a redesign: it clearly meets requirements, but it is so complicated and inefficient that it would make sense to replace it with a leaner and more efficient alternative.</p>
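<p>To make the per-line FORMAT issue concrete, here is a minimal sketch of what any reader has to do for each data line. The function name and the two example lines are made up for illustration; this is not the Goby parser.</p>

```python
def parse_sample_fields(vcf_line):
    """Map FORMAT keys to values for every sample column of one VCF data line."""
    columns = vcf_line.rstrip("\n").split("\t")
    # Columns 0-7 are fixed (CHROM..INFO), column 8 is FORMAT, samples follow.
    # The keys must be re-read on every line because FORMAT may change.
    format_keys = columns[8].split(":")
    return [dict(zip(format_keys, sample.split(":"))) for sample in columns[9:]]


# Two hypothetical lines from the same file with different FORMAT layouts.
line_a = "1\t100\t.\tA\tG\t50\tPASS\tDP=14\tGT:DP\t0/1:8\t1/1:6"
line_b = "1\t200\t.\tC\tT\t50\tPASS\tDP=9\tGT:GQ:DP\t0/0:40:5\t0/1:35:4"
samples_a = parse_sample_fields(line_a)
samples_b = parse_sample_fields(line_b)
```

<p>If FORMAT were fixed for the whole file, the key list could be computed once and each sample column read by position.</p>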
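<p>The nesting of delimiters is easy to see in the INFO column alone. The following sketch (hypothetical function name, illustration only) has to split on semicolons, then on the equal sign, then on commas, all inside a column that was itself obtained by splitting the line on tabs.</p>

```python
def parse_info(info_column):
    """Split a VCF INFO column into a dict; flags without a value map to True."""
    info = {}
    if info_column == ".":
        return info  # "." marks an empty INFO column
    for entry in info_column.split(";"):
        key, sep, value = entry.partition("=")
        # Values may hold several comma-separated items, e.g. one per allele.
        info[key] = value.split(",") if sep else True
    return info


parsed = parse_info("DP=14;AF=0.5,0.25;DB")
```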
<p>I would not be surprised if the techniques we used in Goby formats could yield a more compressed VCF format alternative that could be subset in minutes rather than hours. Whether the community is feeling enough pain to consider adopting an alternative is a different question. What do you think?</p>
<img src="http://feeds.feedburner.com/~r/campagnelab/~4/pjkPwrDV564" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://campagnelab.org/evaluating-goby-against-the-1000-genome-genotype-calls-and-why-is-vcf-so-inefficient/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://campagnelab.org/evaluating-goby-against-the-1000-genome-genotype-calls-and-why-is-vcf-so-inefficient/</feedburner:origLink></item>
		<item>
		<title>Goby 1.9.8.1</title>
		<link>http://feedproxy.google.com/~r/campagnelab/~3/-40vv57axpo/</link>
		<comments>http://campagnelab.org/goby-1-9-8-1/#comments</comments>
		<pubDate>Sat, 17 Dec 2011 15:30:34 +0000</pubDate>
		<dc:creator>Fabien Campagne</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">http://campagnelab.org/?p=3319</guid>
		<description><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/goby-1-9-8-1/"></g:plusone></div>
We have released critical performance enhancements and bug fixes in Goby 1.9.8.1. A detailed list of changes can be found in the Change Log. All users are encouraged to install this latest distribution. Highlights include much better performance merging alignments with large &#62;100MB .tmh files (as required when calling variations across many samples) and several [...]]]></description>
			<content:encoded><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/goby-1-9-8-1/"></g:plusone></div>
<p>We have released critical performance enhancements and bug fixes in Goby 1.9.8.1. A detailed list of changes can be found in the <a href="http://campagnelab.org/software/goby/change-log/">Change Log</a>. All users are encouraged to install this <a href="http://campagnelab.org/software/goby/download-goby/">latest distribution</a>. Highlights include much better performance merging alignments with large &gt;100MB .tmh files (as required when calling variations across many samples) and several fixes for subtle bugs that extensive testing has recently uncovered.</p>
<img src="http://feeds.feedburner.com/~r/campagnelab/~4/-40vv57axpo" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://campagnelab.org/goby-1-9-8-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://campagnelab.org/goby-1-9-8-1/</feedburner:origLink></item>
		<item>
		<title>GobyWeb 1.6.1</title>
		<link>http://feedproxy.google.com/~r/campagnelab/~3/L7eJUmWPcyQ/</link>
		<comments>http://campagnelab.org/gobyweb-1-6-1/#comments</comments>
		<pubDate>Thu, 10 Nov 2011 03:57:02 +0000</pubDate>
		<dc:creator>Fabien Campagne</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">http://campagnelab.org/?p=3298</guid>
		<description><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/gobyweb-1-6-1/"></g:plusone></div>
We have just released a binary distribution of GobyWeb version 1.6.1. This is the first public release of GobyWeb. Detailed installation instructions are available on the download page. Please let us know if you are planning a local installation and have questions not covered in the instructions. See the change log for details about this version.]]></description>
			<content:encoded><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/gobyweb-1-6-1/"></g:plusone></div>
<table border="0">
<tbody>
<tr>
<td><a href="http://campagnelab.org/wp-content/uploads/2010/03/gobyweb_logo.png"><img class="alignleft size-full wp-image-1861" title="gobyweb_logo" src="http://campagnelab.org/wp-content/uploads/2010/03/gobyweb_logo.png" alt="" width="181" height="116" /></a></td>
<td>We have just released a binary distribution of <a href="http://gobyweb.campagnelab.org">GobyWeb</a> version 1.6.1. This is the first public release of GobyWeb. Detailed installation instructions are available on the <a href="http://campagnelab.org/software/gobyweb/license-binary-distribution-and-installation-instructions/">download page</a>. Please let us know if you are planning a local installation and have questions not covered in the instructions. See the <a href="http://campagnelab.org/software/gobyweb/change-log/">change log</a> for details about this version.</td>
</tr>
</tbody>
</table>
<p></p>
<img src="http://feeds.feedburner.com/~r/campagnelab/~4/L7eJUmWPcyQ" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://campagnelab.org/gobyweb-1-6-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://campagnelab.org/gobyweb-1-6-1/</feedburner:origLink></item>
		<item>
		<title>Goby 1.9.8</title>
		<link>http://feedproxy.google.com/~r/campagnelab/~3/SvU0iGpZi8E/</link>
		<comments>http://campagnelab.org/goby-1-9-8/#comments</comments>
		<pubDate>Wed, 09 Nov 2011 12:31:00 +0000</pubDate>
		<dc:creator>Fabien Campagne</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<guid isPermaLink="false">http://campagnelab.org/?p=3229</guid>
		<description><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/goby-1-9-8/"></g:plusone></div>
We have released Goby 1.9.8. This version includes the enhancements and bug fixes that are needed in the version of GobyWeb that we are preparing for release. See detailed change information in the project Change Log. &#160; &#160;]]></description>
			<content:encoded><![CDATA[<div style="display:inline;float:right;margin-left:1em"><g:plusone href="http://campagnelab.org/goby-1-9-8/"></g:plusone></div>
<div>We have released <a href="http://campagnelab.org/software/goby/download-goby/">Goby 1.9.8</a>. This version includes the enhancements and bug fixes that are needed in the version of GobyWeb that we are preparing for release. See detailed change information in the project <a href="http://campagnelab.org/software/goby/change-log/">Change Log</a>.</div>
<img src="http://feeds.feedburner.com/~r/campagnelab/~4/SvU0iGpZi8E" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://campagnelab.org/goby-1-9-8/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://campagnelab.org/goby-1-9-8/</feedburner:origLink></item>
	</channel>
</rss>

