<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
  <id>http://watson.nci.nih.gov/~sdavis/</id>
  <title>Blog for Sean Davis</title>
  <updated>2013-05-10T04:00:00Z</updated>
  <link rel="alternate" href="http://watson.nci.nih.gov/~sdavis/" />
  
  <author>
    <name>Sean Davis</name>
    <uri>http://watson.nci.nih.gov/~sdavis/</uri>
  </author>
  <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/Bio-blue" /><feedburner:info uri="bio-blue" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><geo:lat>39.175004</geo:lat><geo:long>-76.853199</geo:long><feedburner:emailServiceId>Bio-blue</feedburner:emailServiceId><feedburner:feedburnerHostname>http://feedburner.google.com</feedburner:feedburnerHostname><feedburner:feedFlare href="http://add.my.yahoo.com/rss?url=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://us.i1.yimg.com/us.yimg.com/i/us/my/addtomyyahoo4.gif">Subscribe with My Yahoo!</feedburner:feedFlare><feedburner:feedFlare href="http://www.newsgator.com/ngs/subscriber/subext.aspx?url=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://www.newsgator.com/images/ngsub1.gif">Subscribe with NewsGator</feedburner:feedFlare><feedburner:feedFlare href="http://feeds.my.aol.com/add.jsp?url=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://o.aolcdn.com/favorites.my.aol.com/webmaster/ffclient/webroot/locale/en-US/images/myAOLButtonSmall.gif">Subscribe with My AOL</feedburner:feedFlare><feedburner:feedFlare href="http://www.bloglines.com/sub/http://feeds.feedburner.com/Bio-blue" src="http://www.bloglines.com/images/sub_modern11.gif">Subscribe with Bloglines</feedburner:feedFlare><feedburner:feedFlare href="http://www.netvibes.com/subscribe.php?url=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://www.netvibes.com/img/add2netvibes.gif">Subscribe with Netvibes</feedburner:feedFlare><feedburner:feedFlare href="http://fusion.google.com/add?feedurl=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://buttons.googlesyndication.com/fusion/add.gif">Subscribe with Google</feedburner:feedFlare><feedburner:feedFlare href="http://www.pageflakes.com/subscribe.aspx?url=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://www.pageflakes.com/ImageFile.ashx?instanceId=Static_4&amp;fileName=ATP_blu_91x17.gif">Subscribe with Pageflakes</feedburner:feedFlare><feedburner:feedFlare href="http://www.plusmo.com/add?url=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://plusmo.com/res/graphics/fbplusmo.gif">Subscribe with Plusmo</feedburner:feedFlare><feedburner:feedFlare href="http://www.thefreedictionary.com/_/hp/AddRSS.aspx?http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://img.tfd.com/hp/addToTheFreeDictionary.gif">Subscribe with The Free Dictionary</feedburner:feedFlare><feedburner:feedFlare href="http://www.bitty.com/manual/?contenttype=rssfeed&amp;contentvalue=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://www.bitty.com/img/bittychicklet_91x17.gif">Subscribe with Bitty Browser</feedburner:feedFlare><feedburner:feedFlare href="http://www.live.com/?add=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://tkfiles.storage.msn.com/x1piYkpqHC_35nIp1gLE68-wvzLZO8iXl_JMledmJQXP-XTBOLfmQv4zhj4MhcWEJh_GtoBIiAl1Mjh-ndp9k47If7hTaFno0mxW9_i3p_5qQw">Subscribe with Live.com</feedburner:feedFlare><feedburner:feedFlare href="http://mix.excite.eu/add?feedurl=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://image.excite.co.uk/mix/addtomix.gif">Subscribe with Excite MIX</feedburner:feedFlare><feedburner:feedFlare href="http://www.webwag.com/wwgthis.php?url=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://www.webwag.com/images/wwgthis.gif">Subscribe with Webwag</feedburner:feedFlare><feedburner:feedFlare href="http://www.podcastready.com/oneclick_bookmark.php?url=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://www.podcastready.com/images/podcastready_button.gif">Subscribe with Podcast Ready</feedburner:feedFlare><feedburner:feedFlare href="http://www.wikio.com/subscribe?url=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://www.wikio.com/shared/img/add2wikio.gif">Subscribe with Wikio</feedburner:feedFlare><feedburner:feedFlare href="http://www.dailyrotation.com/index.php?feed=http%3A%2F%2Ffeeds.feedburner.com%2FBio-blue" src="http://www.dailyrotation.com/rss-dr2.gif">Subscribe with Daily Rotation</feedburner:feedFlare><entry>
    <id>tag:watson.nci.nih.gov,2013-05-10:/~sdavis/blog/flexible_bioinformatics_pipelines_with_snakemake/</id>
    <title type="html">flexible bioinformatics pipelines with snakemake</title>
    <published>2013-05-10T04:00:00Z</published>
    <updated>2013-05-10T04:00:00Z</updated>
    <author>
      <name>Sean Davis</name>
      <uri>http://watson.nci.nih.gov/~sdavis/</uri>
    </author>
    <link rel="alternate" href="http://feedproxy.google.com/~r/Bio-blue/~3/iVKDS2bSAQ8/" />
    <content type="html">
&lt;p&gt;Building bioinformatics pipelines can be a somewhat tedious process that seemingly never ends and often results in significant refactoring when assumptions or workflows change.  The &lt;a href="https://bitbucket.org/johanneskoester/snakemake/wiki/Home"&gt;snakemake project&lt;/a&gt; is a small framework that facilitates building flexible, reuseable pipelines.&lt;/p&gt;

&lt;p&gt;I am not going to go through the installation process in this entry, but since snakemake requires python3, suffice it to say that virtualenvs almost certainly will be involved since most systems default to python2.X.&lt;/p&gt;

&lt;p&gt;I had a small revelation last night about how snakemake can be used to run the GATK pipeline.  Like many folks, I use filename “tags” to track individual transformations and ordering of the tags in the filename to capture the ordering of the transformations.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Tool&lt;/th&gt;
      &lt;th&gt;Added filename extension&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;mark duplicates&lt;/td&gt;
      &lt;td&gt;md&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;quality recalibration&lt;/td&gt;
      &lt;td&gt;recal&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;indel realignment&lt;/td&gt;
      &lt;td&gt;realigned&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;So, a file with the name “stuff.realigned.recal.md.bam” has been realigned, then recalibrated, then had duplicates marked.  Different orderings of tags like “md”, “realigned”, and “recal” represent different processing orders.  A simple snakefile with rules for these transformations might look like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;rule qualrecal:
    input: bam="{base}.bam"
    output: bam="{base}.recal.bam"
    shell: "touch {output}"

rule indelrealigner:
    input: bam="{base}.bam",intervals="{base}.intervals"
    output: bam="{base}.realigned.bam"
    shell: "touch {output}"

rule realignertargetcreator:
    input: "{base}.bam"
    output: "{base}.intervals"
    shell: 'touch {output}'

rule indexbam:
    input: '{base}.bam'
    output: '{base}.bam.bai'
    shell: 'touch {output}'

rule markdups:
    input: '{base}.bam'
    output: bam='{base}.md.bam',metrics='{base}.dupmetrics'
    shell: "touch {output.bam}; touch {output.metrics}"
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I have not put any shell commands in this file just to keep things simple.  Obviously, this Snakefile will not produce any meaningful output, but it is enough to be instructive.&lt;/p&gt;

&lt;p&gt;If I add a rule, “final” that contains the output filename(s) of interest and then run &lt;code&gt;snakemake -n&lt;/code&gt;, we can see how the finename(s) dictate the run order for the rules.  This allows one to change the pipeline by simply adjusting output filenames.  Quite cool!&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;rule final:
    input: "stuff.realigned.md.bam", "stuff.realigned.md.bam.bai"
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Running &lt;code&gt;snakemake -n&lt;/code&gt; with this rule included as the first rule in the snakefile yields:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;rule realignertargetcreator:
        input: stuff.bam
        output: stuff.intervals
rule indelrealigner:
        input: stuff.bam, stuff.intervals
        output: stuff.realigned.bam
rule markdups:
        input: stuff.realigned.bam
        output: stuff.realigned.md.bam, stuff.realigned.dupmetrics
rule indexbam:
        input: stuff.realigned.md.bam
        output: stuff.realigned.md.bam.bai
rule final:
        input: stuff.realigned.md.bam, stuff.realigned.md.bam.bai
Job counts:
        count	jobs
        1	final
        1	indelrealigner
        1	realignertargetcreator
        1	indexbam
        1	markdups
        5
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Changing the rule “final” to this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;rule final:
    input: "stuff.md.realigned.bam", "stuff.md.realigned.bam.bai"
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and running &lt;code&gt;snakemake -n&lt;/code&gt; again gives a different ordering of steps.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;rule markdups:
        input: stuff.bam
        output: stuff.md.bam, stuff.dupmetrics
rule realignertargetcreator:
        input: stuff.md.bam
        output: stuff.md.intervals
rule indelrealigner:
        input: stuff.md.bam, stuff.md.intervals
        output: stuff.md.realigned.bam
rule indexbam:
        input: stuff.md.realigned.bam
        output: stuff.md.realigned.bam.bai
rule final:
        input: stuff.md.realigned.bam, stuff.md.realigned.bam.bai
Job counts:
        count	jobs
        1	final
        1	realignertargetcreator
        1	markdups
        1	indelrealigner
        1	indexbam
        5
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Adding recalibration to the process is as simple as including “.recal” in the filename in the correct order.  This works with one bam file or 1000 bam files.  Nice….&lt;/p&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/Bio-blue?a=iVKDS2bSAQ8:gsD__eVMCeo:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/Bio-blue?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/Bio-blue/~4/iVKDS2bSAQ8" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://watson.nci.nih.gov/~sdavis/blog/flexible_bioinformatics_pipelines_with_snakemake/</feedburner:origLink></entry>
  <entry>
    <id>tag:watson.nci.nih.gov,2013-05-09:/~sdavis/blog/ipython-autoreload-notes/</id>
    <title type="html">autoreload in ipython to ease development</title>
    <published>2013-05-09T04:00:00Z</published>
    <updated>2013-05-09T04:00:00Z</updated>
    <author>
      <name>Sean Davis</name>
      <uri>http://watson.nci.nih.gov/~sdavis/</uri>
    </author>
    <link rel="alternate" href="http://feedproxy.google.com/~r/Bio-blue/~3/db9E7ewI6nI/" />
    <content type="html">
&lt;p&gt;New versions of ipython (after 0.12) ship with a convenient autoreload functionality.  This is just a note-to-self about usage.&lt;/p&gt;

&lt;p&gt;From the &lt;a href="http://ipython.org/ipython-doc/dev/config/extensions/autoreload.html" title="Ipython autoreload documentation"&gt;ipython documentation&lt;/a&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;In [1]: %load_ext autoreload

In [2]: %autoreload 2

In [3]: from foo import some_function

In [4]: some_function()
Out[4]: 42

In [5]: # open foo.py in an editor and change some_function to return 43

In [6]: some_function()
Out[6]: 43
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And here are the usage notes:&lt;/p&gt;

&lt;dl&gt;
  &lt;dt&gt;&lt;code&gt;%autoreload&lt;/code&gt;&lt;/dt&gt;
  &lt;dd&gt;Reload all modules (except those excluded by %aimport) automatically now.&lt;/dd&gt;
  &lt;dt&gt;&lt;code&gt;%autoreload 0&lt;/code&gt;&lt;/dt&gt;
  &lt;dd&gt;Disable automatic reloading.&lt;/dd&gt;
  &lt;dt&gt;&lt;code&gt;%autoreload 1&lt;/code&gt;&lt;/dt&gt;
  &lt;dd&gt;Reload all modules imported with %aimport every time before executing the Python code typed.&lt;/dd&gt;
  &lt;dt&gt;&lt;code&gt;%autoreload 2&lt;/code&gt;&lt;/dt&gt;
  &lt;dd&gt;Reload all modules (except those excluded by %aimport) every time before executing the Python code typed.&lt;/dd&gt;
  &lt;dt&gt;&lt;code&gt;%aimport&lt;/code&gt;&lt;/dt&gt;
  &lt;dd&gt;List modules which are to be automatically imported or not to be imported.&lt;/dd&gt;
  &lt;dt&gt;&lt;code&gt;%aimport foo&lt;/code&gt;&lt;/dt&gt;
  &lt;dd&gt;Import module ‘foo’ and mark it to be autoreloaded for %autoreload 1&lt;/dd&gt;
  &lt;dt&gt;&lt;code&gt;%aimport -foo&lt;/code&gt;&lt;/dt&gt;
  &lt;dd&gt;Mark module ‘foo’ to not be autoreloaded.&lt;/dd&gt;
&lt;/dl&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/Bio-blue?a=db9E7ewI6nI:0YT6_srSfNk:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/Bio-blue?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/Bio-blue/~4/db9E7ewI6nI" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://watson.nci.nih.gov/~sdavis/blog/ipython-autoreload-notes/</feedburner:origLink></entry>
  <entry>
    <id>tag:watson.nci.nih.gov,2013-03-22:/~sdavis/blog/testing-for-non-differentially-expressed-genes/</id>
    <title type="html">Testing for Non-Differentially-Expressed Genes</title>
    <published>2013-03-22T04:00:00Z</published>
    <updated>2013-03-22T04:00:00Z</updated>
    <author>
      <name>Sean Davis</name>
      <uri>http://watson.nci.nih.gov/~sdavis/</uri>
    </author>
    <link rel="alternate" href="http://feedproxy.google.com/~r/Bio-blue/~3/IaBz-hwQJ6g/" />
    <content type="html">&lt;div class="document"&gt;
&lt;p&gt;There has been an ongoing discussion on the &lt;a class="reference external" href="http://thread.gmane.org/gmane.science.biology.informatics.conductor/47098/focus=47127"&gt;Bioconductor mailing list&lt;/a&gt; dealing with a commonly-asked question (at least by folks I work with) but uncommonly answered.  "How can I test for genes that are statistically not changing?"&lt;/p&gt;
&lt;p&gt;Albyn Jones mentions that a common approach for &lt;strong&gt;tests of equivalence&lt;/strong&gt; is the two-one-sided-tests approach that was apparently described by DJ Shuirmann in &lt;a class="reference external" href="http://www.ncbi.nlm.nih.gov/pubmed/3450848"&gt;A comparison of the two one-sided tests procedure and the power approach for assessing the equivalence of average bioavailability&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Simon Anders answered the question quite thoroughly in text like so.  And I quote:&lt;/p&gt;
&lt;blockquote&gt;
One of the new features of DESeq2 is that it provides a empirical-Bayes style shrinkage estimates of coefficients (i.e., log fold changes) and also estimates standard errors for these coefficients (taken from the the reciprocal curvature of the posterior). Your task is one of the application we had in mind for this.  You want to know whether the fold change is zero or nearly zero, and you also want to be ascertained that this estimate of a nearly-zero fold change is a precise one. So, to find genes whose absolute log fold change is _reliably_ smaller than some threshold, take all those genes for which the estimated abs log fold change is below the threshold and the standard error is well below the threshold, too.  The reason, why I write "below a threshold" rather than "equal to zero" is that it's biologically unreasonable to assume that a treatment has exactly zero effect on a gene's expression. Gene regulation is such an interconnected network that, at least in my view, every gene will react ever so slightly to every perturbance, but often, this reaction is too small to be measurable. This is why you should define a threshold for "biologically unlikely to be relevant". Let's say, we don't believe that an expression change of less than 15% (0.2 on a log2 scale) is worth bothering. So, if you might take all genes with abs log2 fold change below 0.2, and furthermore require that the standard error of this log2 fold change estimate is below, say, 0.05, than you will get gene with true values well below at least 0.3 or so.  In the end, the thresholds won't matter too much but if you want real p values, this is possible, too: Simply divide the log2 fold change estimates by their standard errors to get z scores (this works because the sampling distribution of our fold change estimates is reasonably close to normal), and do two one-sided tests (TOST).&lt;/blockquote&gt;
&lt;p&gt;Simon goes on to give code to do this analysis using the DESeq2 bioconductor package.  He introduces the code with this comment:&lt;/p&gt;
&lt;blockquote&gt;
To recap, you want to test the null hypothesis that your log fold change beta is larger than some threshold theta, i.e., you want to find genes for which you can reject this null hypothesis and hence say with some certainty that |beta| &amp;lt; theta. The TOST scheme suggests, as its name implies to perform two pone sided tests, one with the null hypothesis H0_A: beta &amp;gt; theta, the other with H0_B: beta &amp;lt; -theta, and then use the larger of the two p values.&lt;/blockquote&gt;
&lt;p&gt;The code is so clear and well-commented that it speaks for itself.&lt;/p&gt;
&lt;pre class="code r literal-block"&gt;
&lt;span class="c1"&gt;# Let's use the example data from the pasilla package&lt;/span&gt;
library&lt;span class="p"&gt;(&lt;/span&gt; DESeq2 &lt;span class="p"&gt;)&lt;/span&gt;
library&lt;span class="p"&gt;(&lt;/span&gt; pasilla &lt;span class="p"&gt;)&lt;/span&gt;
data&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="s"&gt;"pasillaGenes"&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Create a DESeq2 data object from the pasilla data&lt;/span&gt;
dse &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; DESeqSummarizedExperimentFromMatrix&lt;span class="p"&gt;(&lt;/span&gt;
   counts&lt;span class="p"&gt;(&lt;/span&gt;pasillaGenes&lt;span class="p"&gt;),&lt;/span&gt; pData&lt;span class="p"&gt;(&lt;/span&gt;pasillaGenes&lt;span class="p"&gt;)[,&lt;/span&gt;c&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"condition"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
   &lt;span class="o"&gt;~&lt;/span&gt; type &lt;span class="o"&gt;+&lt;/span&gt; condition &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Perform a standard DESeq2 analysis&lt;/span&gt;
dse &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; DESeq&lt;span class="p"&gt;(&lt;/span&gt;dse&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# The log2 fold changes are found here&lt;/span&gt;
beta &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; results&lt;span class="p"&gt;(&lt;/span&gt;dse&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;log2FoldChange

&lt;span class="c1"&gt;# Just to make the following clearer, I should point out that the&lt;/span&gt;
&lt;span class="c1"&gt;# "results" accessor is just a short-cut for this access to the rowData:&lt;/span&gt;
all&lt;span class="p"&gt;(&lt;/span&gt; beta &lt;span class="o"&gt;==&lt;/span&gt; mcols&lt;span class="p"&gt;(&lt;/span&gt;rowData&lt;span class="p"&gt;(&lt;/span&gt;dse&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;conditionuntreated&lt;span class="p"&gt;,&lt;/span&gt; na.rm&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# (returns TRUE)&lt;/span&gt;

&lt;span class="c1"&gt;# The log fold change estimates all come with standard error information&lt;/span&gt;
&lt;span class="c1"&gt;# which we find in the rowData (maybe we should copy this to the&lt;/span&gt;
&lt;span class="c1"&gt;# 'results', too)&lt;/span&gt;
betaSE &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; mcols&lt;span class="p"&gt;(&lt;/span&gt;rowData&lt;span class="p"&gt;(&lt;/span&gt;dse&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;SE_conditionuntreated

&lt;span class="c1"&gt;# Internally, the Wald test is implemented as a simple two-sided&lt;/span&gt;
&lt;span class="c1"&gt;# z test of beta/betaSE. Two demonstrate this, we to the test&lt;/span&gt;
&lt;span class="c1"&gt;# manually and compare&lt;/span&gt;
pvalDE &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; pnorm&lt;span class="p"&gt;(&lt;/span&gt; abs&lt;span class="p"&gt;(&lt;/span&gt; beta &lt;span class="p"&gt;),&lt;/span&gt; sd &lt;span class="o"&gt;=&lt;/span&gt; betaSE&lt;span class="p"&gt;,&lt;/span&gt; lower.tail&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;FALSE&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;
all&lt;span class="p"&gt;(&lt;/span&gt; abs&lt;span class="p"&gt;(&lt;/span&gt; pvalDE &lt;span class="o"&gt;-&lt;/span&gt; results&lt;span class="p"&gt;(&lt;/span&gt;dse&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;$&lt;/span&gt;pvalue &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;1e-15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; na.rm&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# (returns TRUE)&lt;/span&gt;

&lt;span class="c1"&gt;# This was the test for DE, of course, i.e., small pvalDE means that&lt;/span&gt;
&lt;span class="c1"&gt;# the gene's expression change (the true value of beta) is not zero&lt;/span&gt;

&lt;span class="c1"&gt;# What we want is the opposite, namely find gene, for which abs(beta)&lt;/span&gt;
&lt;span class="c1"&gt;# is smaller than some threshold, theta&lt;/span&gt;
theta &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="m"&gt;.3&lt;/span&gt;

&lt;span class="c1"&gt;# So, we do our two one-sided tests. For a one-sided z test, we&lt;/span&gt;
&lt;span class="c1"&gt;# simply use tail probabilities from the normal distribution.&lt;/span&gt;

&lt;span class="c1"&gt;# First, the test of H0_A: true_beta &amp;gt; thr&lt;/span&gt;
pA &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; pnorm&lt;span class="p"&gt;(&lt;/span&gt; beta&lt;span class="p"&gt;,&lt;/span&gt; thr&lt;span class="p"&gt;,&lt;/span&gt; betaSE&lt;span class="p"&gt;,&lt;/span&gt; lower.tail&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Next, the test of H0_B: true_beta &amp;lt; -thr&lt;/span&gt;
pB &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; pnorm&lt;span class="p"&gt;(&lt;/span&gt; beta&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;thr&lt;span class="p"&gt;,&lt;/span&gt; betaSE&lt;span class="p"&gt;,&lt;/span&gt; lower.tail&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;FALSE&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# The overall p value is the maximum, because we want to reject H0_A&lt;/span&gt;
&lt;span class="c1"&gt;# and H0_B simultaneously&lt;/span&gt;
pvalTOST &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; pmax&lt;span class="p"&gt;(&lt;/span&gt; pA&lt;span class="p"&gt;,&lt;/span&gt; pB &lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="c1"&gt;# Let's adjust our two p values with BH:&lt;/span&gt;
sigDE &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; p.adjust&lt;span class="p"&gt;(&lt;/span&gt; pvalDE&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"BH"&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;.1&lt;/span&gt;
sigSmall &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; p.adjust&lt;span class="p"&gt;(&lt;/span&gt; pvalTOST&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"BH"&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="m"&gt;.1&lt;/span&gt;

&lt;span class="c1"&gt;# And make an MA plot, with sigDE in red and sigSmall in green&lt;/span&gt;
plot&lt;span class="p"&gt;(&lt;/span&gt;
   rowMeans&lt;span class="p"&gt;(&lt;/span&gt; counts&lt;span class="p"&gt;(&lt;/span&gt;dse&lt;span class="p"&gt;,&lt;/span&gt;normalized&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;),&lt;/span&gt; beta&lt;span class="p"&gt;,&lt;/span&gt;
   log&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; pch&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; cex&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
   col &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; sigDE &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;sigSmall &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Plot is attached.&lt;/span&gt;
&lt;/pre&gt;
&lt;p&gt;The plot that is generated by the last few lines shows in green the genes that are, indeed, statistically likely to be unchanged between conditions.&lt;/p&gt;
&lt;img alt="/assets/Rplot001.png" src="/assets/Rplot001.png"&gt;
&lt;/div&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/Bio-blue?a=IaBz-hwQJ6g:EfbQVp0Kw0U:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/Bio-blue?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/Bio-blue/~4/IaBz-hwQJ6g" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://watson.nci.nih.gov/~sdavis/blog/testing-for-non-differentially-expressed-genes/</feedburner:origLink></entry>
  <entry>
    <id>tag:watson.nci.nih.gov,2013-03-21:/~sdavis/blog/sqlalchemy-with-python-3-and-mysql/</id>
    <title type="html">sqlalchemy with python 3 and mysql</title>
    <published>2013-03-21T04:00:00Z</published>
    <updated>2013-03-21T04:00:00Z</updated>
    <author>
      <name>Sean Davis</name>
      <uri>http://watson.nci.nih.gov/~sdavis/</uri>
    </author>
    <link rel="alternate" href="http://feedproxy.google.com/~r/Bio-blue/~3/q_awo8sbCY0/" />
    <content type="html">
&lt;p&gt;I have a few sqlalchemy-based projects that I have wanted to move over to
python 3.  Interestingly, the usual python-mysql is not supported.  There are 
a number of other options, but I settled on &lt;a href="https://pypi.python.org/pypi/mysql-connector-python/"&gt;Oracle’s mysql-connector-python&lt;/a&gt;.  The usual pip incantation
gets you the library (pure python).&lt;/p&gt;

&lt;pre&gt;
pip install mysql-connector-python
&lt;/pre&gt;

&lt;p&gt;Using the library is a matter of &lt;a href="http://docs.sqlalchemy.org/en/latest/dialects/mysql.html#module-sqlalchemy.dialects.mysql.mysqlconnector"&gt;using the right connection string&lt;/a&gt;:&lt;/p&gt;

&lt;pre&gt;
"mysql+mysqlconnector://..."
&lt;/pre&gt;
&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/Bio-blue?a=q_awo8sbCY0:gNNGmlZHzUs:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/Bio-blue?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/Bio-blue/~4/q_awo8sbCY0" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://watson.nci.nih.gov/~sdavis/blog/sqlalchemy-with-python-3-and-mysql/</feedburner:origLink></entry>
  <entry>
    <id>tag:watson.nci.nih.gov,2012-02-28:/~sdavis/blog/latex_conversion_notes/</id>
    <title type="html">Latex conversion notes</title>
    <published>2012-02-28T05:00:00Z</published>
    <updated>2012-02-28T05:00:00Z</updated>
    <author>
      <name>Sean Davis</name>
      <uri>http://watson.nci.nih.gov/~sdavis/</uri>
    </author>
    <link rel="alternate" href="http://feedproxy.google.com/~r/Bio-blue/~3/Oobwck1ReYk/" />
    <content type="html">

&lt;p&gt;I have been avoiding latex lately because of a desire to publish documents in HTML as well as PDF and my sense was that converters from latex to other formats were lacking.  After watching several beautiful presentations given based on the latex beamer class and being immersed in a Bioconductor course where vignettes are the coin of the realm, I decided to look a bit more fully.  Since I work on a Mac much of the time, I started poking around in /usr/texbin and found a command, htlatex that looked promising.&lt;/p&gt;

&lt;p&gt;To convert a .tex file to html is as simple as:
&lt;/p&gt;&lt;pre&gt;
htlatex filename.tex
&lt;/pre&gt;
This, by default, produces HTML4 output as file "filename.html".  To request xhtml output is as simple as supplying an argument:
&lt;pre&gt;
htlatex filename.tex "xhtml"
&lt;/pre&gt;
Noting that OpenOffice is XML-based, someone has written a configuration file for converting latex to a file that OpenOffice (or, my current flavor, LibreOffice) can read and display.  
&lt;pre&gt;
htlatex filename.tex "xhtml,ooffice" "ooffice/! -cmozhtf" "-coo -cvalidate"
&lt;/pre&gt;
&lt;p&gt;The little incarnation above creates an .odt file that opens directly in OpenOffice.  Of course, OpenOffice can then convert to &lt;emph&gt;many&lt;/emph&gt; other formats.  Conversion to MS Word is also possible.&lt;/p&gt;
&lt;pre&gt;
htlatex filename "html,word" "symbol/!" "-cvalidate"
&lt;/pre&gt;

&lt;h3&gt;References&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href="http://www.tug.org/TUGboat/tb25-1/gurari.pdf"&gt;Tex4ht: HTML conversion&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="http://biostat.mc.vanderbilt.edu/wiki/Main/SweaveConvert"&gt;Conversion from Sweave documents&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href="http://uce.uniovi.es/tips/indexch6.html"&gt;Some notes on configuration of tex4ht&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/Bio-blue?a=Oobwck1ReYk:hfbNGqcD8T0:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/Bio-blue?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/Bio-blue/~4/Oobwck1ReYk" height="1" width="1"/&gt;</content>
  <feedburner:origLink>http://watson.nci.nih.gov/~sdavis/blog/latex_conversion_notes/</feedburner:origLink></entry>
</feed>
