<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:blogger='http://schemas.google.com/blogger/2008' xmlns:georss='http://www.georss.org/georss' xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-7223713061342925300</id><updated>2026-05-08T23:51:04.826+05:30</updated><category term="HOW  TO"/><category term="Bioinformatics resources"/><category term="Perl Script"/><category term="Bioinformatics Video Tutorial"/><category term="Sequence analysis"/><category term="Society"/><category term="PERL"/><category term="NCBI"/><category term="Software"/><category term="BLAST"/><category term="R"/><category term="हिंदी"/><category term="Learn PERL"/><category term="Protein Structure Analysis"/><category term="Noisy News"/><category term="tool"/><category term="Conferences"/><category term="Microarray"/><category term="My Favorites"/><category term="bash"/><category term="राम कहानी"/><category term="समाज"/><title type='text'>Bioinformatics Made Simple.com</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default?start-index=26&amp;max-results=25'/><author><name>Priyanka Paul</name><uri>http://www.blogger.com/profile/17644327799399630455</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='28' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirsLO8hr8bPes1b60iXlTSNntif6YpT5KLCmSO-bvwmgHz7zsjZw-aNfipzq8kfozOq-GvtJrIpCUvQF30T5cpMCwM4PSp2B6EQtAhks3wS80ZXGASfRt5bARS4D5_Fg/s220/Untitled-1.png'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>176</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-492707548844459969</id><published>2020-09-02T23:06:00.007+05:30</published><updated>2020-09-02T23:23:56.902+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="R"/><title type='text'>How to create a 3D pie chart in R</title><content type='html'>&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
A pie chart is a circular chart that makes complex data easy to understand: each slice of the circle shows the proportion of one data value. Pie charts can be drawn in two or three dimensions, depending on the taste of the user. In this example, I use the R package plotrix to draw a 3D pie chart. 
 &lt;br/&gt; 
  
  
  &lt;h3&gt;
Prerequisite
&lt;/h3&gt;
We need the following R library to run the script
&lt;ul&gt;
&lt;li&gt;&lt;b&gt; plotrix&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt; 
  
 &lt;h3&gt;
File format
&lt;/h3&gt;
&lt;ul&gt;
&lt;pre style=&quot;border: 1px dashed rgb(153, 153, 153); color: black; font-family: &amp;quot;andale mono&amp;quot;, &amp;quot;lucida console&amp;quot;, monaco, fixed, monospace; font-size: 12px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;&quot;&gt;&lt;code&gt;Category    Number of genes    Color
Transcription_O    24    #ed2f52
Transport_O    13    #efc023
Catalytic     31    #008080
Phosphorylation    5    #8FBC8B
Cell Wall     7    #AFEEEE
Defense    16    #CD853F
Secondary metabolites    4    #A0522D
Unknown    26    #9ACD32
Miscellaneous    28    #D8BFD8
Uncharacterized    10    #E6E6FA

&lt;/code&gt;&lt;/pre&gt;
&lt;/ul&gt; 
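&lt;br /&gt;
To make the file format concrete, here is a minimal sketch of how such a table could be parsed and turned into slice percentages. It is written in Python purely for illustration and is not part of the R script; the helper names parse_rows and shares are hypothetical, and the sample holds only a few rows of the table above.
&lt;pre style=&quot;border: 1px dashed #999999; font-size: 12px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;&quot;&gt;&lt;code&gt;
```python
# Hypothetical helpers: parse the three-column table shown above and
# report each category's share of the total gene count.
SAMPLE = """Category    Number of genes    Color
Transcription_O    24    #ed2f52
Transport_O    13    #efc023
Catalytic     31    #008080
Cell Wall     7    #AFEEEE
Secondary metabolites    4    #A0522D
"""


def parse_rows(text):
    """Return (category, count, color) tuples, skipping the header line."""
    rows = []
    for line in text.strip().splitlines()[1:]:
        # Split from the right so categories containing spaces stay intact.
        category, count, color = line.rsplit(None, 2)
        rows.append((category.strip(), int(count), color))
    return rows


def shares(rows):
    """Map each category to its percentage of the total gene count."""
    total = sum(count for _, count, _ in rows)
    return {cat: 100.0 * count / total for cat, count, _ in rows}


if __name__ == "__main__":
    for cat, pct in shares(parse_rows(SAMPLE)).items():
        print(f"{cat}: {pct:.1f}%")
```
&lt;/code&gt;&lt;/pre&gt;
Splitting each line from the right keeps multi-word categories such as Cell Wall in one piece, which matters if the columns are separated by plain whitespace rather than tabs.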
  
&lt;h3&gt;
Script
&lt;/h3&gt;
&lt;script src=&quot;http://gist-it.appspot.com/https://github.com/sanjaysingh765/Generals/blob/master/3Dpie.R&quot;&gt;&lt;/script&gt;
  
&lt;h3&gt;
Result
&lt;/h3&gt;
  &lt;div class=&quot;separator&quot; style=&quot;clear: both;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqYOOXypVQfxvcgRBHrf9ipC4ENwqEYU0doWrw5vM38I2wcwIzr87-7yvnvpf1sT7QmJcnq83_BTJl4olWJrIqnfevfPagxMrTLLb_TFMKeaT2pUPcrZ38NMmtau1GrPu5vMkGbIpyI8wP/s1800/diet3d.png&quot; style=&quot;display: block; padding: 1em 0; text-align: center;&quot;&gt;&lt;img alt=&quot;&quot; border=&quot;0&quot; width=&quot;320&quot; data-original-height=&quot;900&quot; data-original-width=&quot;1800&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqYOOXypVQfxvcgRBHrf9ipC4ENwqEYU0doWrw5vM38I2wcwIzr87-7yvnvpf1sT7QmJcnq83_BTJl4olWJrIqnfevfPagxMrTLLb_TFMKeaT2pUPcrZ38NMmtau1GrPu5vMkGbIpyI8wP/s320/diet3d.png&quot;/&gt;&lt;/a&gt;&lt;/div&gt;  
  
  

&lt;div style=&quot;border-bottom: 1px solid rgb(204, 204, 204); border-top: 1px solid rgb(204, 204, 204); color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;How to add function descriptions to FASTA sequences &lt;a href=&quot;http://bioinformatics-made-simple.blogspot.com/2019/08/how-to-add-function-descriptions-to.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/492707548844459969/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2020/09/how-to-create-3d-pie-chart-in-r.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/492707548844459969'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/492707548844459969'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2020/09/how-to-create-3d-pie-chart-in-r.html' title='How to create a 3D pie chart in R'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqYOOXypVQfxvcgRBHrf9ipC4ENwqEYU0doWrw5vM38I2wcwIzr87-7yvnvpf1sT7QmJcnq83_BTJl4olWJrIqnfevfPagxMrTLLb_TFMKeaT2pUPcrZ38NMmtau1GrPu5vMkGbIpyI8wP/s72-c/diet3d.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-9058773349614499901</id><published>2020-04-09T00:11:00.001+05:30</published><updated>2020-09-02T23:10:34.250+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="Bioinformatics resources"/><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><title type='text'>KEGG Sequence Downloader : 
retrieve gene sequences in Fasta format from KEGG database</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
I wanted to download the gene sequences of &lt;a href=&quot;http://www.bioinformatics-made-simple.com/search/label/NCBI&quot;&gt;tobacco from NCBI&lt;/a&gt;. Because NCBI also returns isoforms and some other unwanted genes, I chose to get them from KEGG instead. KEGGREST is a wonderful &lt;a href=&quot;http://www.bioinformatics-made-simple.com/search/label/R&quot;&gt;R package&lt;/a&gt; for retrieving data from KEGG, but it limits how much can be retrieved at once. The following bash script can download thousands of sequences in a single go without that limitation. It is a crude solution, and there is probably a more efficient way to do it, but it worked for me. The script works in three steps:&lt;br /&gt;

&lt;ul&gt;
&lt;li&gt;Split the IDs into chunks of a given size&amp;nbsp;&lt;/li&gt;
&lt;li&gt;Download the FASTA sequences as HTML files&amp;nbsp;&lt;/li&gt;
&lt;li&gt;&amp;nbsp;Clean the HTML files and save the result
&lt;/li&gt;
&lt;/ul&gt;
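The first step above can be sketched as follows. This is a Python illustration of the chunking idea only, not the bash script itself; the function name chunk_ids and the example IDs are hypothetical, and the download and cleaning steps are left out because they depend on how KEGG is queried.
&lt;pre style=&quot;border: 1px dashed #999999; font-size: 12px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;&quot;&gt;&lt;code&gt;
```python
def chunk_ids(ids, size):
    """Split a list of IDs into consecutive chunks of at most `size` items."""
    if size > 0:
        return [ids[i:i + size] for i in range(0, len(ids), size)]
    raise ValueError("size must be a positive integer")


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    example = ["id1", "id2", "id3", "id4", "id5"]
    for number, chunk in enumerate(chunk_ids(example, 2), start=1):
        print(number, " ".join(chunk))
```
&lt;/code&gt;&lt;/pre&gt;
Each chunk would then be requested separately, so no single request exceeds the size chosen by the user.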

&lt;h3&gt;
Usage&lt;/h3&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;bash KEGG_sequence_downloader.sh query_file number_of_sequence
&lt;/pre&gt;
&lt;div style=&quot;border-bottom: 1px solid #ccc; border-top: 1px solid #ccc; color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;
How to download only viridiplantae miRNA from miRBase &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2016/01/how-to-download-only-viridiplantae.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;h3&gt;
Script&lt;/h3&gt;
&lt;br /&gt;
&lt;center&gt;
&lt;table border=&quot;1px&quot; style=&quot;width: 100%px;&quot;&gt;
  &lt;tbody&gt;
&lt;tr&gt;
    &lt;th align=&quot;middle&quot; bgcolor=&quot;#db3341&quot;&gt;&lt;span style=&quot;color: #f2edf2;&quot;&gt;Script name&lt;/span&gt;&lt;/th&gt;
    &lt;th align=&quot;middle&quot; bgcolor=&quot;#db3341&quot;&gt;&lt;span style=&quot;color: #f2edf2;&quot;&gt;Download&lt;/span&gt;&lt;/th&gt;
  &lt;/tr&gt;
&lt;tr&gt;
    &lt;td align=&quot;middle&quot; bgcolor=&quot;#e9e9f2&quot;&gt;&lt;span style=&quot;color: #0e040f; font-size: large;&quot;&gt;KEGG_sequence_downloader.sh&lt;/span&gt;&lt;/td&gt;
    &lt;td align=&quot;middle&quot;&gt;&lt;a href=&quot;https://github.com/sanjaysingh765/KEGG_sequence_downloader&quot; target=&quot;_blank&quot;&gt;&lt;img src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5Kf09VcZfLResOosU4S2BRFnMQtqp5NzT3siXwInqpcH7qtRjitTs4I7YokviiUQAivJZSQuvz6hpeKUIDuWRqB8U_jZcGKbFCnUL0nug8yCj4MJNHM91bVg-jltzDO-9Vt9VnXh-A_EK/s144/readmore1.png&quot; /&gt;&lt;/a&gt;&lt;/td&gt;
  &lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;/center&gt;
&lt;/div&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/9058773349614499901/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2020/04/kegg-sequence-downloader-retrieve.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/9058773349614499901'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/9058773349614499901'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2020/04/kegg-sequence-downloader-retrieve.html' title='KEGG Sequence Downloader : retrieve gene sequences in Fasta format from KEGG database'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5Kf09VcZfLResOosU4S2BRFnMQtqp5NzT3siXwInqpcH7qtRjitTs4I7YokviiUQAivJZSQuvz6hpeKUIDuWRqB8U_jZcGKbFCnUL0nug8yCj4MJNHM91bVg-jltzDO-9Vt9VnXh-A_EK/s72-c/readmore1.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-8242831823757457288</id><published>2020-01-22T06:22:00.004+05:30</published><updated>2020-09-02T23:15:15.027+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="R"/><title type='text'>Easiest way 
to find number of cluster in gene expression data</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
Gene clustering is a common method for finding groups of genes with similar expression patterns.&amp;nbsp; However, it is not always easy to decide how many clusters a whole dataset contains. The following R script uses the most popular methods for determining the optimal number of clusters. It takes &quot;&lt;i&gt;&lt;b&gt;TF_average.csv&lt;/b&gt;&lt;/i&gt;&quot; as input and saves the result as &quot;&lt;i&gt;&lt;b&gt;optial_cluster.png&lt;/b&gt;&lt;/i&gt;&quot;.&lt;br /&gt;
&lt;br /&gt;

&lt;div style=&quot;border-bottom: 1px solid #ccc; border-top: 1px solid #ccc; color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;
How to perform Non-metric multidimensional scaling (NMDS) analysis script &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2017/10/how-to-perform-non-metric.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;h3&gt;
Prerequisite
&lt;/h3&gt;
We need the following R libraries to run the script
&lt;br /&gt;
&lt;ul&gt;
&lt;li&gt;&lt;b&gt;factoextra&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;NbClust&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
optimal_cluster_finder.R
&lt;/h3&gt;
&lt;br /&gt;
&lt;script src=&quot;http://gist-it.appspot.com/https://github.com/sanjaysingh765/Generals/blob/master/optimal_cluster_finder.R&quot;&gt;&lt;/script&gt;

&lt;/div&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/8242831823757457288/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2020/01/easiest-way-to-find-number-of-cluster.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/8242831823757457288'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/8242831823757457288'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2020/01/easiest-way-to-find-number-of-cluster.html' title='Easiest way to find number of cluster in gene expression data'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-4864939276969194989</id><published>2020-01-18T00:09:00.002+05:30</published><updated>2020-01-22T06:26:23.764+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="R"/><title type='text'>Easiest way to calculate Ka Ks ratio and divergence time</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
The Ka/Ks ratio is used to infer which kind of evolution (neutral evolution, purifying selection, or positive selection for beneficial mutations) is acting on a set of homologous protein-coding genes. There are several wonderful programs for calculating Ka, Ks and the Ka/Ks ratio, but the &lt;i&gt;&lt;b&gt;kaks&lt;/b&gt;&lt;/i&gt; function of the R package &lt;i&gt;&lt;b&gt;seqinr&lt;/b&gt;&lt;/i&gt; is the easiest way to do it. The function&lt;i&gt;&lt;b&gt; kaks&lt;/b&gt;&lt;/i&gt; uses the method published by Li (J. Mol. Evol., 36:96-99, 1993) to calculate Ka, Ks and the Ka/Ks ratio.
The following script takes aligned &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2012/07/perl-script-7-how-to-extract-fasta.html&quot; target=&quot;_blank&quot;&gt;fasta &lt;/a&gt;sequences in the Clustal format and then uses the R package &lt;i&gt;&lt;b&gt;seqinr&lt;/b&gt;&lt;/i&gt; to calculate Ka, Ks and the Ka/Ks ratio.
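&lt;br /&gt;
As a rough guide to reading the numbers, a ratio above 1 is usually taken as a sign of positive selection, a ratio near 1 as neutral evolution, and a ratio below 1 as purifying selection. The sketch below, in Python and independent of the R script, shows this reading together with the commonly used approximation T = Ks / (2 * rate) for divergence time. The function names, the tolerance band around 1 and the example rate are my assumptions; whether the R script uses the same formula is not confirmed here, and the substitution rate must come from the literature for your own lineage.
&lt;pre style=&quot;border: 1px dashed #999999; font-size: 12px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;&quot;&gt;&lt;code&gt;
```python
def selection_type(ka, ks, tolerance=0.05):
    """Classify a Ka/Ks ratio; `tolerance` is an arbitrary band around 1."""
    if ks == 0:
        raise ValueError("Ks is zero, so Ka/Ks is undefined")
    ratio = ka / ks
    if ratio > 1 + tolerance:
        return "positive selection"
    if 1 - tolerance > ratio:
        return "purifying selection"
    return "neutral evolution"


def divergence_time(ks, rate):
    """Estimate divergence time in years as T = Ks / (2 * rate).

    `rate` is the synonymous substitution rate per site per year and has
    to be supplied by the user for the lineage under study.
    """
    return ks / (2.0 * rate)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(selection_type(ka=0.02, ks=0.13))
    print(divergence_time(ks=0.13, rate=6.5e-9))
```
&lt;/code&gt;&lt;/pre&gt;
With the illustrative values above, the gene pair would be read as being under purifying selection, with a divergence time of roughly 10 million years.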

&lt;br /&gt;
&lt;div style=&quot;border-bottom: 1px solid #ccc; border-top: 1px solid #ccc; color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;
Cheat Sheet to Install and work with R on Ubuntu &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2019/02/cheat-sheet-to-install-and-work-with-r.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;h3&gt;
Prerequisite
&lt;/h3&gt;
We need the following R libraries to run the script
&lt;br /&gt;
&lt;ul&gt;
&lt;li&gt;&lt;b&gt;seqinr&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;ape&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;phangorn&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
Ka_Ks_calculator.R
&lt;/h3&gt;
&lt;br /&gt;
&lt;script src=&quot;http://gist-it.appspot.com/https://github.com/sanjaysingh765/Generals/blob/master/Ka_Ks_calculator.R&quot;&gt;&lt;/script&gt;


&lt;/div&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/4864939276969194989/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2020/01/easiest-way-to-calculate-ka-ks-ratio.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/4864939276969194989'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/4864939276969194989'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2020/01/easiest-way-to-calculate-ka-ks-ratio.html' title='Easiest way to calculate Ka Ks ratio and divergence time'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-7073244281691250038</id><published>2019-09-10T21:47:00.003+05:30</published><updated>2020-01-18T01:06:14.319+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="R"/><title type='text'>Draw a heatmap with Custom Symbol in Cell</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpl4OBH__dLdn28pjjOGYHZNPJKcTLBoPKh1ut0j6kZMBkc8vPPQkerZ7O_2lbgzzMMieOcU5DkO0Tq5R8okNIElD9iuJR6qvLaRyIo8s7jQOdFvsg1GSQSJz7T3MjtjLXorf3wtjxOE-P/s1600/Expression.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;600&quot; data-original-width=&quot;750&quot; height=&quot;256&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpl4OBH__dLdn28pjjOGYHZNPJKcTLBoPKh1ut0j6kZMBkc8vPPQkerZ7O_2lbgzzMMieOcU5DkO0Tq5R8okNIElD9iuJR6qvLaRyIo8s7jQOdFvsg1GSQSJz7T3MjtjLXorf3wtjxOE-P/s320/Expression.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;a href=&quot;http://www.bioinformatics-made-simple.com/2018/04/how-to-make-heatmap-with-multiple.html&quot; target=&quot;_blank&quot;&gt;Heatmap &lt;/a&gt;is a good way to save space when you want to compose a figure with lots of panels. I had some gene expression data that were supposed to go into a big figure. Although I could easily have created a bar graph, I chose to draw a heatmap to save space. The easiest way would have been to create this &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2014/07/easiet-way-to-create-heat-map-in-excel.html&quot; target=&quot;_blank&quot;&gt;heatmap in Excel&lt;/a&gt;, but I chose the ggplot2 R package because it handles big data easily and makes the annotation easy to customize. My aim was to draw the heatmap and annotate the cells where gene expression differs significantly from the control.&amp;nbsp; I chose a star (*) to mark those cells.&lt;br /&gt;
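The annotation logic can be sketched independently of ggplot2: every cell gets a star when its p-value is below a chosen threshold and an empty label otherwise. The Python snippet below is only an illustration of that idea; the function names, the 0.05 threshold and the example p-values are assumptions, not values taken from the R script or the sample data.
&lt;pre style=&quot;border: 1px dashed #999999; font-size: 12px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;&quot;&gt;&lt;code&gt;
```python
def star_label(p_value, alpha=0.05):
    """Return "*" when p_value is below alpha, otherwise an empty string."""
    return "*" if alpha > p_value else ""


def annotate(p_values, alpha=0.05):
    """Build a grid of cell labels matching a grid of p-values."""
    return [[star_label(p, alpha) for p in row] for row in p_values]


if __name__ == "__main__":
    # Hypothetical p-values: rows are genes, columns are treatments.
    grid = [[0.01, 0.20], [0.05, 0.049]]
    for row in annotate(grid):
        print(row)
```
&lt;/code&gt;&lt;/pre&gt;
The resulting label grid has the same shape as the expression matrix, so each label can be drawn as text on top of its heatmap cell.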

&lt;h3&gt;
Prerequisite &lt;/h3&gt;
We need the following R libraries to run the script&lt;br /&gt;
&lt;ul&gt;
&lt;li&gt;ggplot2&amp;nbsp;&lt;/li&gt;
&lt;li&gt;plyr&amp;nbsp;&lt;/li&gt;
&lt;li&gt;scales


&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div style=&quot;border-bottom: 1px solid #ccc; border-top: 1px solid #ccc; color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;
Cheat Sheet to Install and work with R on Ubuntu &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2019/02/cheat-sheet-to-install-and-work-with-r.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;h3&gt;
Sample data&lt;/h3&gt;
&lt;br /&gt;
&lt;script src=&quot;http://gist-it.appspot.com/https://github.com/sanjaysingh765/Generals/blob/master/expression.csv&quot;&gt;&lt;/script&gt;


&lt;br /&gt;
&lt;h3&gt;
ggplot2_heatmap.R&lt;/h3&gt;
&lt;br /&gt;
&lt;script src=&quot;http://gist-it.appspot.com/https://github.com/sanjaysingh765/Generals/blob/master/ggplot2_heatmap.R&quot;&gt;&lt;/script&gt;




&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/7073244281691250038/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/09/draw-heatmap-with-custom-symbol-in-cell.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/7073244281691250038'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/7073244281691250038'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/09/draw-heatmap-with-custom-symbol-in-cell.html' title='Draw a heatmap with Custom Symbol in Cell'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpl4OBH__dLdn28pjjOGYHZNPJKcTLBoPKh1ut0j6kZMBkc8vPPQkerZ7O_2lbgzzMMieOcU5DkO0Tq5R8okNIElD9iuJR6qvLaRyIo8s7jQOdFvsg1GSQSJz7T3MjtjLXorf3wtjxOE-P/s72-c/Expression.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-7137320245149076504</id><published>2019-08-28T23:36:00.002+05:30</published><updated>2019-09-10T21:48:38.866+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="bash"/><category scheme="http://www.blogger.com/atom/ns#" term="BLAST"/><category scheme="http://www.blogger.com/atom/ns#" term="HOW 
 TO"/><title type='text'>How to add function descriptions to FASTA sequences</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
Short descriptions in FASTA headers help us quickly gain insight into important information about a sequence. Automatic assignment of Human Readable Descriptions (AHRD) is widely used to select descriptions and Gene Ontology terms that are concise, informative and precise. This bash script runs DIAMOND to search for homologs in three different databases (TAIR, uniprot_sprot, and uniprot_trembl), then runs AHRD and, finally, adds the new information to the FASTA descriptions.&lt;br /&gt;
&lt;br /&gt;
&lt;h3&gt;
USAGE&amp;nbsp;&lt;/h3&gt;
&lt;br /&gt;
&lt;blockquote class=&quot;tr_bq&quot;&gt;
bash run_AHRD.sh fasta_input

&lt;/blockquote&gt;
&lt;div style=&quot;border-bottom: 1px solid #ccc; border-top: 1px solid #ccc; color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;
Extract Part of a FASTA Sequences with Position by python script &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2013/10/actually-i-have-hundreds-of-protein.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;h3&gt;
Dependencies&lt;/h3&gt;
&lt;br /&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;Install AHRD&lt;/li&gt;
&lt;li&gt;Install DIAMOND (move to &lt;b&gt;dist&lt;/b&gt; directory)&lt;/li&gt;
&lt;li&gt;Download the database sequences for TAIR, uniprot_sprot, and uniprot_trembl. Make the DIAMOND BLAST databases, name them uniprot_sprot, arabidopsis, and uniprot_trembl, and move the files to a new directory named &lt;b&gt;database&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;Download and decompress the&amp;nbsp;&lt;b&gt;resources.tar &lt;/b&gt;in the working directory&lt;b&gt;.&lt;/b&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;br /&gt;
&lt;h3&gt;
run_AHRD.sh&lt;/h3&gt;
&lt;br /&gt;
&lt;script src=&quot;http://gist-it.appspot.com/https://github.com/sanjaysingh765/Generals/blob/master/run_AHRD.sh&quot;&gt;&lt;/script&gt;







&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/7137320245149076504/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/08/how-to-add-function-descriptions-to.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/7137320245149076504'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/7137320245149076504'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/08/how-to-add-function-descriptions-to.html' title='How to add function descriptions to FASTA sequences'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-5024924214381877120</id><published>2019-08-26T21:34:00.001+05:30</published><updated>2019-08-28T23:37:48.186+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="PERL"/><category scheme="http://www.blogger.com/atom/ns#" term="Perl Script"/><title type='text'>How to rename fasta headers according to a matching name list</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
&lt;a href=&quot;http://www.bioinformatics-made-simple.com/2012/02/fabox-best-free-fasta-file-editor.html&quot;&gt;FaBox&lt;/a&gt;&amp;nbsp;has several utilities for manipulating&amp;nbsp;FASTA sequences. I wanted to replace FASTA headers with new headers or descriptions that are saved in a file. FaBox can do this, but it struggles when the number of sequences&amp;nbsp;is huge. This Perl script renames the FASTA&amp;nbsp;sequences according to the names stored in another file.&lt;br /&gt;
&lt;h3 style=&quot;text-align: left;&quot;&gt;
Header&lt;/h3&gt;
The old header and the new FASTA header should be separated by a TAB
&lt;br /&gt;
&lt;pre style=&quot;border: 1px dashed #999999; color: black; font-family: &amp;quot;andale mono&amp;quot; , &amp;quot;lucida console&amp;quot; , &amp;quot;monaco&amp;quot; , &amp;quot;fixed&amp;quot; , monospace; line-height: 14px; overflow: auto; padding: 5px; width: 100%;&quot;&gt;&lt;code&gt;M54089d protein1
M54089c protein2
M54089b protein3
M54089a protein4
&lt;/code&gt;&lt;/pre&gt;
&lt;br /&gt;
&lt;h3 style=&quot;text-align: left;&quot;&gt;
Sequence&amp;nbsp;&lt;/h3&gt;
Each FASTA sequence should be on a single line

&lt;br /&gt;
&lt;div style=&quot;border-bottom: 1px solid #ccc; border-top: 1px solid #ccc; color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;
Convert Multi line Fasta file into a Single line FASTA File  &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2012/02/perl-script-2-convert-multi-fasta-file.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;pre style=&quot;background-color: #eeeeee; border: 1px dashed #999999; color: black; font-family: &amp;quot;andale mono&amp;quot; , &amp;quot;lucida console&amp;quot; , &amp;quot;monaco&amp;quot; , &amp;quot;fixed&amp;quot; , monospace; font-size: 12px; line-height: 14px; overflow: auto; padding: 5px; width: 100%;&quot;&gt;&lt;code&gt;&amp;gt;M54089d
MEQCRQGSRQNGSVTSGKGLALRAGHGGPSPEPVGCRWTARAAPAARAGRRVPAGGRTGNGSFGGLPRASHSQLRTGTDKGNPTV
&amp;gt;M54089c
MINFDHLFACLHGHYGEVENKLKCILHYFGRICSSMPLGYVSFERKVLSLECTPSCIPYPKEKAWSQSNISLCPIEITISGLIEDQSREAIEVDFANMYLGGGALVRGCVQQEEIRFMINPELIAGMLFLPCMADNEAVEIVGTERFSSYTGRLTKHFVASWINSSVISINSFSKMMASWDFNMIKMLKTPVEGPLLIFCRLVILQLHLKKLRKHRKTS
&amp;gt;M54089b
MIGRADIEGSKSNVAMNAWLHKPVIPVVTFLTPLASNSEGLKIVRPRFHGSYSYWKSESNELLPSVPHEISVRVELILGHLRYLLTDVPPQPNSPPDNVFRRIGLQASLGSKKRGSAPLPLHGISKITLEVVVFHFRLSAPTYTTPLKSFTKSD
&amp;gt;M54089a
MNGLTRFHCPCLLSSETTAKGTGLAESAGKEDPVELDSSRLCEMT
&lt;/code&gt;&lt;/pre&gt;
&lt;br /&gt;
&lt;h3 style=&quot;text-align: left;&quot;&gt;
Script&amp;nbsp;&lt;/h3&gt;
This Perl script asks for the header list and the FASTA sequences (file formats given above) and saves the FASTA file with the new headers in &lt;b&gt;result.fasta&lt;/b&gt;
&lt;br /&gt;
&lt;script src=&quot;http://gist-it.appspot.com/https://github.com/sanjaysingh765/Generals/blob/master/fasta_header_replace.pl&quot;&gt;&lt;/script&gt;


If you are working on a Unix-based system, this AWK one-liner will be very useful 
&lt;pre style=&quot;font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%&quot;&gt;&lt;code&gt;awk &#39;FNR==NR{  a[&amp;quot;&amp;gt;&amp;quot;$1]=$2;next}$1 in a{  sub(/&amp;gt;/,&amp;quot;&amp;gt;&amp;quot;a[$1]&amp;quot;&amp;#124;&amp;quot;,$1)}1&#39; header_list.txt sequence.fasta
&lt;/code&gt;&lt;/pre&gt;
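To see what the one-liner does, here is a small sketch you can run in a scratch directory. It assumes header_list.txt has two whitespace-separated columns (current ID, new name). The new names geneA and geneB, the shortened sequences and the function name rename_headers are made up for the demo.

```shell
#!/bin/bash
# Hypothetical demo data: the IDs come from the example above, the new names are invented.
workdir=$(mktemp -d)
printf 'M54089d\tgeneA\nM54089c\tgeneB\n' > "$workdir/header_list.txt"
printf '>M54089d\nMEQCRQ\n>M54089c\nMINFDH\n>M54089b\nMIGRAD\n' > "$workdir/sequence.fasta"

rename_headers() {
    # $1 = header list (old ID, new name), $2 = single-line FASTA file
    awk 'FNR==NR { a[">"$1]=$2; next }              # first file: remember new name per ID
         $1 in a { sub(/>/, ">" a[$1] "|", $1) }    # matching header: prefix the new name
         1' "$1" "$2"                               # print every line, changed or not
}

rename_headers "$workdir/header_list.txt" "$workdir/sequence.fasta" > "$workdir/result.fasta"
cat "$workdir/result.fasta"
```

Headers that are not in the list (M54089b here) pass through unchanged, so it is worth checking the output for IDs that were missed.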

&lt;/div&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/5024924214381877120/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/08/how-to-renaming-fasta-headers-according.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/5024924214381877120'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/5024924214381877120'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/08/how-to-renaming-fasta-headers-according.html' title='How to rename fasta headers according to a matching name list'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-4106541027173433686</id><published>2019-08-16T20:06:00.003+05:30</published><updated>2019-08-26T21:43:21.918+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="R"/><title type='text'>How to get gene expression value from Arrayexpress</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
ArrayExpress has a wonderful &lt;a href=&quot;https://bioconductor.org/packages/release/bioc/html/ArrayExpress.html&quot; target=&quot;_blank&quot;&gt;R package&lt;/a&gt; for data search, download and analysis, but it doesn&#39;t always work perfectly. It is therefore good to have an alternative way to download the raw data (CEL files) and analyze them. This R script takes an accession number as a command-line argument and saves the expression data in a file named&amp;nbsp;&lt;span style=&quot;color: #008800; font-family: &amp;quot;bitstream vera sans mono&amp;quot; , &amp;quot;courier&amp;quot; , monospace; font-size: 16px; text-align: left;&quot;&gt;Gene_expression.txt&lt;/span&gt;.&lt;br /&gt;
&lt;blockquote class=&quot;tr_bq&quot;&gt;
&lt;span style=&quot;color: #a5a4a4; font-style: italic; font-weight: bold; text-align: center;&quot;&gt;How to download expression data set from NCBI GEO &lt;/span&gt;&lt;a href=&quot;http://www.bioinformatics-made-simple.com/2018/03/how-to-download-expression-data-set.html&quot; style=&quot;font-style: italic; font-weight: bold; text-align: center;&quot;&gt;HERE&lt;/a&gt;&lt;/blockquote&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;h3 style=&quot;text-align: left;&quot;&gt;
Prerequisite&amp;nbsp;&lt;/h3&gt;
&lt;br /&gt;
The script needs the following R libraries:&lt;br /&gt;
&lt;ul style=&quot;text-align: left;&quot;&gt;
&lt;li&gt;ArrayExpress&amp;nbsp;&lt;/li&gt;
&lt;li&gt;affy&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 style=&quot;text-align: left;&quot;&gt;
Usage&amp;nbsp;&lt;/h3&gt;
&lt;br /&gt;
&lt;blockquote class=&quot;tr_bq&quot;&gt;
Rscript script_name accession_number

&lt;/blockquote&gt;
&lt;h3 style=&quot;text-align: left;&quot;&gt;
Limitations&amp;nbsp;&lt;/h3&gt;
&lt;div&gt;
&lt;ul style=&quot;text-align: left;&quot;&gt;
&lt;li&gt;This script is written specifically for Arabidopsis data sets, so you will have to modify it for other data sets.&amp;nbsp;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;br /&gt;
&lt;h3 style=&quot;text-align: left;&quot;&gt;
Script&amp;nbsp;&lt;/h3&gt;
&lt;br /&gt;
&lt;script src=&quot;http://gist-it.appspot.com/https://github.com/sanjaysingh765/Generals/blob/master/Arrayexpress_analysis.R&quot;&gt;&lt;/script&gt;



&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/4106541027173433686/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/08/how-to-get-gene-expression-value-from.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/4106541027173433686'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/4106541027173433686'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/08/how-to-get-gene-expression-value-from.html' title='How to get gene expression value from Arrayexpress'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-6435065202518347901</id><published>2019-07-25T17:49:00.001+05:30</published><updated>2019-08-16T20:10:22.854+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="Bioinformatics resources"/><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="NCBI"/><title type='text'>Easiest way to download multiple sequences from NCBI</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
NCBI and I have each shared several tricks for downloading large sets of sequences from the database, &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/guide/howto/dwn-records/&quot;&gt;HERE&lt;/a&gt; and &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2012/07/some-easy-ways-to-download-multiple.html&quot;&gt;HERE&lt;/a&gt;, respectively. In this post, I am going to share another easy way to download multiple sequences from NCBI. The script reads a file of accession numbers named id.list (one accession number per line) and downloads each sequence into its own file. Finally, it concatenates those files into a single &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2012/02/perl-script-2-convert-multi-fasta-file.html&quot;&gt;multi-line FASTA&lt;/a&gt; file and deletes the individual files.

&lt;div style=&quot;text-align: center; font-style:italic; color: #a5a4a4; font-weight:bold; border-top: 1px solid #ccc; border-bottom: 1px solid #ccc; padding:30px; margin:30px; &quot;&gt;How to BLAST multiple sequences against NCBI database using PERL script &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2012/02/perl-script-4-how-to-blast-multiple.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;

&lt;pre class=&quot;brush:perl&quot;&gt;
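#!/bin/bash

# Download each accession listed in id.list (one per line) into its own FASTA file
while IFS= read -r i; do curl -s  &quot;https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&amp;id=${i}&amp;rettype=fasta&amp;retmode=text&quot; &gt;&quot;$i.fasta&quot;; done &lt; id.list

# Combine the individual files into one multi-FASTA file, then remove them
# Alternative for a very large number of files:
#find . -name &#39;*.fasta&#39; -exec cat {} \; &gt;protein.fas
cat *.fasta &gt;protein.fas
rm *.fasta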
&lt;/pre&gt;
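A slightly more defensive variant is sketched below. It is only one possible arrangement, not the script above: the function names clean_ids and fetch_fasta are made up, the sequences are appended straight to the combined file instead of going through intermediate files, and a one-second pause keeps the loop under the NCBI limit of three requests per second for users without an API key.

```shell
#!/bin/bash
# Sketch only: clean_ids and fetch_fasta are illustrative names, not part of the original script.

clean_ids() {
    # Remove carriage returns, surrounding blanks and empty lines from an ID list on stdin.
    tr -d '\r' | awk '{ gsub(/^[ \t]+|[ \t]+$/, "") } NF'
}

fetch_fasta() {
    # $1 = accession list (one per line), $2 = combined output FASTA file
    : > "$2"
    cat "$1" | clean_ids | while IFS= read -r acc; do
        # -G sends the -d fields as a query string; -f makes curl fail on HTTP errors
        if ! curl -s -f -G "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi" \
                -d "db=protein" -d "id=$acc" -d "rettype=fasta" -d "retmode=text" >> "$2"; then
            echo "download failed for $acc" >> "$2.failed"
        fi
        sleep 1   # stay under the NCBI request limit
    done
}

# Example call (needs network access):
# fetch_fasta id.list protein.fas
```

Cleaning the list first helps when id.list was saved on Windows, because a trailing carriage return would otherwise become part of the accession.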


&lt;br /&gt;

&lt;/div&gt;


&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/6435065202518347901/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/07/easiest-way-to-download-multiple.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/6435065202518347901'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/6435065202518347901'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/07/easiest-way-to-download-multiple.html' title='Easiest way to download multiple sequences from NCBI'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-4936815776081406466</id><published>2019-02-24T00:48:00.001+05:30</published><updated>2019-08-26T21:43:14.612+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="Bioinformatics resources"/><category scheme="http://www.blogger.com/atom/ns#" term="BLAST"/><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><title type='text'>How to perform parallel BLAST</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOKhXtyljwxpRJ-lN3nE7MTGaw0n6JFNuVyIPRlfOnj-Mq_mSDpkSXXTcFS47ChwBj9MzjJk-8EJ_5uo4MRCTiLpkyuwHq_ubAvemBO4NkP2a0iQwaRqDwSVaXofyjUeEabU1yQpNIs8MH/s1600/ncbi_blast.gif&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;100&quot; data-original-width=&quot;200&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOKhXtyljwxpRJ-lN3nE7MTGaw0n6JFNuVyIPRlfOnj-Mq_mSDpkSXXTcFS47ChwBj9MzjJk-8EJ_5uo4MRCTiLpkyuwHq_ubAvemBO4NkP2a0iQwaRqDwSVaXofyjUeEabU1yQpNIs8MH/s1600/ncbi_blast.gif&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;a href=&quot;http://www.bioinformatics-made-simple.com/2011/11/sequence-similarity-search-i-basic.html&quot; target=&quot;_blank&quot;&gt;BLAST&lt;/a&gt;&amp;nbsp;can be time-consuming, especially with a large number of query sequences, so running it in parallel can help. This Bash script performs a parallel BLAST in three steps:&lt;br /&gt;

&lt;ul&gt;
&lt;li&gt;Split fasta file into files with a given number of sequences&amp;nbsp;&lt;/li&gt;
&lt;li&gt;BLAST the files in parallel&lt;/li&gt;
&lt;li&gt;Combine the results&lt;/li&gt;
&lt;/ul&gt;

It needs no additional&amp;nbsp;software other than &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2012/06/how-to-install-ncbi-blast-on-window-7.html&quot; target=&quot;_blank&quot;&gt;BLAST+&lt;/a&gt; and GNU parallel, so it is easy to use.&amp;nbsp; If you want to parse your BLAST results, you can use the &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2012/07/ncbi-blast-parser-extract-query-and.html&quot; target=&quot;_blank&quot;&gt;NCBI BLAST parser from here&lt;/a&gt;.
&lt;h2&gt;
&lt;b&gt;USAGE&lt;/b&gt;&lt;/h2&gt;
The command line consists of the following&amp;nbsp;parts, in this order:&lt;br /&gt;
&lt;b&gt;script_name: &lt;/b&gt;Name of given script&lt;br /&gt;
&lt;b&gt;blast_type: &lt;/b&gt;What kind of BLAST you want to perform e.g. blastn&lt;br /&gt;
&lt;b&gt;query_file: &lt;/b&gt;Name of query file&lt;br /&gt;
&lt;b&gt;database_name:&amp;nbsp;&lt;/b&gt;Name of the database&lt;br /&gt;
&lt;b&gt;number_of_sequence_each_file: &lt;/b&gt;how many sequences you want per query file

&lt;h2&gt;SCRIPT&lt;/h2&gt;

&lt;pre class=&quot;brush:perl&quot;&gt;#!/bin/bash

########################################################################################################################
#
#     split fasta file into files with given number of sequences 
#     blast them in parallel fashion
#     combine their result
########################################################################################################################

prog=&quot;$1&quot;
query=&quot;$2&quot;
database=&quot;$3&quot;
num_seq=&quot;$4&quot;


if [[ $# -lt 4 ]] ; then
    printf &quot;\033[1;31mGive me a proper command\033[0m\n&quot;
    printf &quot;\033[1;31mUsage: script_name blast_type query_file database_name number_of_sequence_each_file\033[0m\n\n&quot;
    
exit 1;


else

start=`date +%s`

#split fasta sequences in given number of sequences per file 
awk -v a_seq=&quot;$num_seq&quot; &#39;BEGIN {n_seq=0;} /^&amp;gt;/ {if(n_seq%a_seq==0){file=sprintf(&quot;myseq%d.fa&quot;,n_seq);} print &amp;gt;&amp;gt; file; n_seq++; next;} { print &amp;gt;&amp;gt; file; }&#39; &amp;lt; &quot;$query&quot;



#run blast
# only the myseq*.fa chunks are searched, so other .fa files in the directory are left alone
ls myseq*.fa | parallel -a - $prog -query {} -db $database -out {.}.out -evalue 0.001 -num_descriptions 1 -num_alignments 1 -num_threads 8 

#combine the results
cat myseq*.out &amp;gt;&quot;$query.blast.result&quot;


while true; do
    read -p &quot;Do you want to delete intermediate files? [y/n] &quot; yn
    case $yn in
        [Yy]* ) rm -f myseq*.out myseq*.fa; break;;
        [Nn]* ) break;;
        * ) echo &quot;Please answer yes or no.&quot;;;

    esac
done

end=`date +%s`
runtime=$((end-start))
printf &quot;\033[1;31mYour analysis was done in $((runtime/60)) minutes\033[0m\n&quot;
printf &quot;\033[1;31mTHANKS\033[0m\n&quot;
   
fi


&lt;/pre&gt;
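The splitting step is the part that is easiest to get wrong, so here is a small stand-alone sketch of it that you can test without BLAST. The function name split_fasta, the output directory argument and the five toy sequences are assumptions for the demo; the chunk naming (myseq0.fa, myseq2.fa, ...) follows the script above.

```shell
#!/bin/bash
# Sketch only: split_fasta is an illustrative name; it assumes the file starts with a ">" header.
split_fasta() {
    # $1 = FASTA file, $2 = sequences per chunk, $3 = output directory
    awk -v n="$2" -v dir="$3" '
        /^>/ {
            if (count % n == 0) {            # time to start a new chunk
                if (file) close(file)
                file = sprintf("%s/myseq%d.fa", dir, count)
            }
            count++
        }
        { print > file }                     # header and sequence lines go to the current chunk
    ' "$1"
}

workdir=$(mktemp -d)
mkdir "$workdir/chunks"
printf '>s1\nAAA\n>s2\nCCC\n>s3\nGGG\n>s4\nTTT\n>s5\nACG\n' > "$workdir/query.fasta"

split_fasta "$workdir/query.fasta" 2 "$workdir/chunks"
ls "$workdir/chunks"
```

With five sequences and two per chunk, this leaves three files, the last one holding a single sequence. Closing each chunk before opening the next keeps the number of open files low when the query file is large.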
&lt;br /&gt;&lt;/div&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/4936815776081406466/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/02/how-to-perform-parallel-blast.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/4936815776081406466'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/4936815776081406466'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/02/how-to-perform-parallel-blast.html' title='How to perform parallel BLAST'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOKhXtyljwxpRJ-lN3nE7MTGaw0n6JFNuVyIPRlfOnj-Mq_mSDpkSXXTcFS47ChwBj9MzjJk-8EJ_5uo4MRCTiLpkyuwHq_ubAvemBO4NkP2a0iQwaRqDwSVaXofyjUeEabU1yQpNIs8MH/s72-c/ncbi_blast.gif" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-7184424457317427388</id><published>2019-02-08T22:09:00.000+05:30</published><updated>2019-11-06T02:17:05.725+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="R"/><title type='text'>Cheat Sheet to Install and work with R on Ubuntu</title><content 
type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;

&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
Tips and tricks to Install and work with R on Ubuntu&lt;br /&gt;
&lt;h3&gt;
Run these commands in terminal&lt;/h3&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;#Update and Install R
sudo apt-get update
sudo apt-get install r-base r-base-dev
sudo apt-get upgrade

# Know R version
R --version
&lt;/pre&gt;




&lt;div style=&quot;border-bottom: 1px solid #ccc; border-top: 1px solid #ccc; color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;
Extract Part of a FASTA Sequences with Position by python script &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2013/10/actually-i-have-hundreds-of-protein.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;h3&gt;
Run these commands in &lt;span style=&quot;color: red;&quot;&gt;R&lt;/span&gt; terminal&lt;/h3&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;# Know R version
sessionInfo()

# Know version of specific packages
packageVersion(&quot;ggplot2&quot;)

# check installed packages
installed.packages()

# list all packages where an update is available
old.packages()

# update all available packages
update.packages()

# update all packages without prompts for permission/clarification
update.packages(ask = FALSE)
# same, but also treat packages built under an older version of R as outdated
update.packages(checkBuilt = TRUE, ask = FALSE, repos = &quot;https://cran.rstudio.com&quot;)

# install or update a single package with install.packages()
install.packages(&quot;ggplot2&quot;)


# install packages from bioconductor
# (biocLite() is deprecated; BiocManager replaces it for R 3.5 and later)
install.packages(&quot;BiocManager&quot;)
BiocManager::install(&quot;ComplexHeatmap&quot;)

# install multiple packages from bioconductor (pass the names as one vector)
BiocManager::install(c(&quot;ComplexHeatmap&quot;, &quot;ggplot2&quot;))


# install packages from github
library(devtools)
install_github(&quot;jokergoo/ComplexHeatmap&quot;)


# start library
library(&#39;ggplot2&#39;)

&lt;/pre&gt;
&lt;/div&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/7184424457317427388/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/02/cheat-sheet-to-install-and-work-with-r.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/7184424457317427388'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/7184424457317427388'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/02/cheat-sheet-to-install-and-work-with-r.html' title='Cheat Sheet to Install and work with R on Ubuntu'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-7077891414777939070</id><published>2019-01-28T00:06:00.005+05:30</published><updated>2019-02-24T00:52:42.551+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><title type='text'>Get multiple strings from a file and replace them in another file with AWK</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
&lt;div style=&quot;text-align: center;&quot;&gt;
&lt;h3 style=&quot;text-align: left;&quot;&gt;
Get multiple strings from a file and replace them in another file&lt;/h3&gt;
&lt;/div&gt;
I have multiple strings (old strings) and their substitutions (new strings) in tab-delimited format in a file named &lt;b&gt;string&lt;/b&gt;. I want to replace them in another file named &lt;b&gt;Inputfile&lt;/b&gt; and save the result in &lt;b&gt;Outfile&lt;/b&gt;.&lt;/div&gt;
&lt;/div&gt;
&lt;h3&gt;
Script&lt;/h3&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;awk &#39;
NR==FNR {
    old[NR] = $1
    gsub(/&amp;amp;/,RS,$2)
    new[NR] = $2
    next
}
{
    for (i=1; i in old; i++) {
        gsub(old[i],new[i])
    }
    gsub(RS,&quot;\\&amp;amp;&quot;)
    print
}
&#39; string Inputfile &amp;gt;Outfile
&lt;/pre&gt;
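One thing to keep in mind is that gsub() treats each old string as a regular expression, so a dot in an old string matches any character. If your old strings should be taken literally, a variant along the following lines may help. This is a sketch using a different technique (index and substr instead of gsub), and the function name replace_literal and the sample data are made up.

```shell
#!/bin/bash
# Sketch only: replace_literal is an illustrative name; old strings are matched as plain text.
replace_literal() {
    # $1 = tab-delimited map (old string, new string), $2 = input file
    awk -F'\t' '
        NR == FNR { old[NR] = $1; new[NR] = $2; next }
        {
            line = $0
            for (i = 1; i in old; i++) {
                if (old[i] == "") continue           # skip empty map entries
                out = ""
                # copy text up to each literal match, then append the replacement
                while ((p = index(line, old[i])) > 0) {
                    out = out substr(line, 1, p - 1) new[i]
                    line = substr(line, p + length(old[i]))
                }
                line = out line
            }
            print line
        }' "$1" "$2"
}

workdir=$(mktemp -d)
printf 'a.c\tX\nRam\tShyam\n' > "$workdir/string"
printf 'a.c abc\nRam and Ram\n' > "$workdir/Inputfile"

replace_literal "$workdir/string" "$workdir/Inputfile" > "$workdir/Outfile"
cat "$workdir/Outfile"
```

In the first line only the literal a.c is replaced and abc is left alone, which is the difference from the regex-based script.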

&lt;br /&gt;
&lt;h3&gt;
Terminal screenshot&lt;/h3&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinSLDQ88HMhNNsyKO_xXJphJZ-esAlDm4J4LUGKd8-E2JnQWgQT3oHCP_UlXgCySyaopdwDqhjKXCCmKH7hHDwaD6cGIWdoC92HEoMtY3AA0Dm8URrF_Gjvoan5-16obz_doW_EClTztb4/s1600/string.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;506&quot; data-original-width=&quot;774&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinSLDQ88HMhNNsyKO_xXJphJZ-esAlDm4J4LUGKd8-E2JnQWgQT3oHCP_UlXgCySyaopdwDqhjKXCCmKH7hHDwaD6cGIWdoC92HEoMtY3AA0Dm8URrF_Gjvoan5-16obz_doW_EClTztb4/s1600/string.png&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/7077891414777939070/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/01/get-multiple-strings-from-file-and.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/7077891414777939070'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/7077891414777939070'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/01/get-multiple-strings-from-file-and.html' title='Get multiple strings from a file and replace them in another file with AWK'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinSLDQ88HMhNNsyKO_xXJphJZ-esAlDm4J4LUGKd8-E2JnQWgQT3oHCP_UlXgCySyaopdwDqhjKXCCmKH7hHDwaD6cGIWdoC92HEoMtY3AA0Dm8URrF_Gjvoan5-16obz_doW_EClTztb4/s72-c/string.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-3670269355055995156</id><published>2019-01-25T00:18:00.000+05:30</published><updated>2019-08-16T20:09:41.461+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><title type='text'>Examples For Sed Linux Command In Text Manipulation and File 
Handling</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
&lt;br /&gt;
&lt;script language=&quot;JavaScript&quot;&gt; 
function show(c) { 
if (document.getElementById &amp;&amp; document.getElementById(c)!= null) 
node = document.getElementById(c).style.display=&#39;&#39;; 
else if (document.layers &amp;&amp; document.layers[c]!= null) 
document.layers[c].display = &#39;&#39;; 
else if (document.all) 
document.all[c].style.display = &#39;&#39;; 
} 
function hide(c) { 
if (document.getElementById &amp;&amp; document.getElementById(c)!= null) 
node = document.getElementById(c).style.display=&#39;none&#39;; 
else if (document.layers &amp;&amp; document.layers[c]!= null) 
document.layers[c].display = &#39;none&#39;; 
else if (document.all) 
document.all[c].style.display = &#39;none&#39;; 
} 
&lt;/script&gt; 

        &lt;center&gt;&lt;a href=&quot;&quot; class=&quot;myButton&quot;&gt;Click on red strip to expand it &lt;/a&gt;&lt;/center&gt;&lt;br/&gt;
&lt;style&gt;
.myButton {
 -moz-box-shadow: 3px 4px 0px 0px #8a2a21;
 -webkit-box-shadow: 3px 4px 0px 0px #8a2a21;
 box-shadow: 3px 4px 0px 0px #8a2a21;
 background-color:#c62d1f;
 -moz-border-radius:18px;
 -webkit-border-radius:18px;
 border-radius:18px;
 border:1px solid #d02718;
 display:inline-block;
 cursor:pointer;
 color:#ffffff;
 font-family:Times New Roman;
 font-size:17px;
 font-weight:bold;
 padding:7px 25px;
 text-decoration:none;
 text-shadow:0px 1px 0px #810e05;
}
.myButton:hover {
 background-color:#f24437;
}
.myButton:active {
 position:relative;
 top:1px;
}
&lt;/style&gt;
&lt;br /&gt;
&lt;table&gt; 
&lt;tbody&gt;
&lt;tr&gt;&lt;td align=&quot;middle&quot; bgcolor=&quot;#db3341&quot; span=&quot;&quot; style=&quot;color: #f2edf2; font-size: 16; font-weight: bold;&quot;&gt;&lt;div onclick=&quot;show(&#39;1&#39;)&quot;&gt;
1. Replace all occurrences &lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

&lt;div id=&quot;1&quot; style=&quot;display: none;&quot;&gt;

1. Replace all occurrences of &lt;b&gt;Ram&lt;/b&gt; in &lt;b&gt;Inputfile&lt;/b&gt; with &lt;b&gt;Shyam&lt;/b&gt; and save the result in &lt;b&gt;Outfile&lt;/b&gt;. The &lt;b&gt;g&lt;/b&gt; flag is needed; without it, only the first occurrence on each line is replaced.
&lt;br /&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;sed &#39;s/Ram/Shyam/g&#39; Inputfile  &amp;gt;Outfile&lt;/pre&gt;&lt;/div&gt;

&lt;table&gt; 
&lt;tbody&gt;
&lt;tr&gt;&lt;td align=&quot;middle&quot; bgcolor=&quot;#db3341&quot; span=&quot;&quot; style=&quot;color: #f2edf2; font-size: 16; font-weight: bold;&quot;&gt;&lt;div onclick=&quot;show(&#39;2&#39;)&quot;&gt;
2. Replace all occurrences of multiple strings &lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

&lt;div id=&quot;2&quot; style=&quot;display: none;&quot;&gt;

2. Replace all occurrences of &lt;b&gt;Ram&lt;/b&gt; and &lt;b&gt;Sita&lt;/b&gt; in &lt;b&gt;Inputfile&lt;/b&gt; with &lt;b&gt;Shyam&lt;/b&gt; and &lt;b&gt;Geeta&lt;/b&gt;, respectively, and save the result in &lt;b&gt;Outfile&lt;/b&gt;
&lt;br /&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;sed -e &#39;s/Ram/Shyam/g; s/Sita/Geeta/g&#39; Inputfile  &amp;gt;Outfile&lt;/pre&gt;

&lt;/div&gt;

&lt;table&gt; 
&lt;tbody&gt;
&lt;tr&gt;&lt;td align=&quot;middle&quot; bgcolor=&quot;#db3341&quot; span=&quot;&quot; style=&quot;color: #f2edf2; font-size: 16; font-weight: bold;&quot;&gt;&lt;div onclick=&quot;show(&#39;3&#39;)&quot;&gt;
3. Reading sed commands from a file &lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;div id=&quot;3&quot; style=&quot;display: none;&quot;&gt;
3. &lt;b&gt;Reading Commands From a File&lt;/b&gt;: I have several substitution commands (one per line, for example s/Ram/Shyam/) saved in a file named &lt;b&gt;string&lt;/b&gt;. I want to apply all of them to &lt;b&gt;Inputfile&lt;/b&gt; and save the result in &lt;b&gt;Outfile&lt;/b&gt;
&lt;br /&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;sed -f string Inputfile  &amp;gt;Outfile&lt;/pre&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3kbewmD1NlefP8vLI5CA4UODSsE7ZV1o-tpn2etCcmkKTNV-3CBiwACLcR6IDuEQ5rA1t-VAOu4KANulGqt0-JAbAdvLHd0M1KsZjbyia83lWwghRFzvu5Jxav4my4I-zeXT0xRCr7oyu/s1600/Screenshot+from+2019-01-24+14-58-07.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;198&quot; data-original-width=&quot;336&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3kbewmD1NlefP8vLI5CA4UODSsE7ZV1o-tpn2etCcmkKTNV-3CBiwACLcR6IDuEQ5rA1t-VAOu4KANulGqt0-JAbAdvLHd0M1KsZjbyia83lWwghRFzvu5Jxav4my4I-zeXT0xRCr7oyu/s1600/Screenshot+from+2019-01-24+14-58-07.png&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;table&gt; 
&lt;tbody&gt;
&lt;tr&gt;&lt;td align=&quot;middle&quot; bgcolor=&quot;#db3341&quot; span=&quot;&quot; style=&quot;color: #f2edf2; font-size: 16; font-weight: bold;&quot;&gt;&lt;div onclick=&quot;show(&#39;4&#39;)&quot;&gt;
4. &lt;b&gt;Substitution Flags&lt;/b&gt;: to control the replacement in file &lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

&lt;div id=&quot;4&quot; style=&quot;display: none;&quot;&gt;
4. &lt;b&gt;Substitution Flags&lt;/b&gt;: The substitute command accepts four kinds of flags:
&lt;ul&gt;
&lt;li&gt;g: replace all occurrences on each line.&amp;nbsp;&lt;/li&gt;
&lt;li&gt;A number N: replace only the Nth occurrence on each line.&amp;nbsp;&lt;/li&gt;
&lt;li&gt;p: print the line if a substitution was made (usually combined with sed -n).&amp;nbsp;&lt;/li&gt;
&lt;li&gt;w file: write the lines where a substitution was made to a file.&lt;/li&gt;
&lt;/ul&gt;
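The flags listed above can be tried on a tiny file. The sketch below uses a made-up two-line Inputfile in a temporary directory; changed.txt is just an example name for the w flag.

```shell
#!/bin/bash
# Hypothetical sample data for trying the substitution flags.
workdir=$(mktemp -d)
printf 'Ram met Ram and Ram\nSita stayed home\n' > "$workdir/Inputfile"

sed 's/Ram/Shyam/'     "$workdir/Inputfile"   # no flag: first occurrence on each line
sed 's/Ram/Shyam/g'    "$workdir/Inputfile"   # g: every occurrence
sed 's/Ram/Shyam/2'    "$workdir/Inputfile"   # 2: only the second occurrence on each line
sed -n 's/Ram/Shyam/p' "$workdir/Inputfile"   # p with -n: print only the lines that changed

# w: write the changed lines to a file (the file name runs to the end of the command)
sed -n "s/Ram/Shyam/w $workdir/changed.txt" "$workdir/Inputfile"
cat "$workdir/changed.txt"
```

Note that the second line of the sample never appears in the p and w output, because no substitution happens on it.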
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiim1OzsWgengHUM56-DgkS8HQ9Ksoz6Fq35nz4fHAx4q6JdBFsYGpbMFXQLO0O0d5v4QYQwk_SGNdItXXY_TnsjffV9m-iGJOF4Mb5woR_BSbT29BjEKYE3LZzXgfiWIOvF76CZrkQU68f/s1600/Screenshot+from+2019-01-24+15-15-44.png&quot; imageanchor=&quot;1&quot; style=&quot;margin-left: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-width=&quot;783&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiim1OzsWgengHUM56-DgkS8HQ9Ksoz6Fq35nz4fHAx4q6JdBFsYGpbMFXQLO0O0d5v4QYQwk_SGNdItXXY_TnsjffV9m-iGJOF4Mb5woR_BSbT29BjEKYE3LZzXgfiWIOvF76CZrkQU68f/s1600/Screenshot+from+2019-01-24+15-15-44.png&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;
&lt;br /&gt;
&lt;/div&gt;


&lt;table&gt; 
&lt;tbody&gt;
&lt;tr&gt;&lt;td align=&quot;middle&quot; bgcolor=&quot;#db3341&quot; span=&quot;&quot; style=&quot;color: #f2edf2; font-size: 16; font-weight: bold;&quot;&gt;&lt;div onclick=&quot;show(&#39;5&#39;)&quot;&gt;
5. &lt;b&gt;Limiting sed in a file &lt;/b&gt;&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

&lt;div id=&quot;5&quot; style=&quot;display: none;&quot;&gt;
5. &lt;b&gt;Limiting sed in a file&lt;/b&gt;: By default, sed applies a command to every line of the file. However, you can restrict a command to part of the file in two ways:&lt;br /&gt;
&lt;br /&gt;
&lt;ul style=&quot;text-align: left;&quot;&gt;
&lt;li&gt;A line number or a range of line numbers.&amp;nbsp;&lt;/li&gt;
- The following command replaces the string on the &lt;b&gt;second line&lt;/b&gt; only: &lt;pre class=&quot;brush:perl&quot;&gt;sed &#39;2s/Ram/Shyam/&#39; Inputfile  &amp;gt;Outfile&lt;/pre&gt;
- The following command replaces the string on the &lt;b&gt;second and third lines&lt;/b&gt; only: &lt;pre class=&quot;brush:perl&quot;&gt;sed &#39;2,3s/Ram/Shyam/&#39; Inputfile  &amp;gt;Outfile&lt;/pre&gt;
- The following command replaces the string from the &lt;b&gt;second line to the end of the file&lt;/b&gt;: &lt;pre class=&quot;brush:perl&quot;&gt;sed &#39;2,$s/Ram/Shyam/&#39; Inputfile  &amp;gt;Outfile&lt;/pre&gt;
&lt;li&gt;A pattern that selects the matching lines.&lt;/li&gt;
&lt;/ul&gt;&lt;/div&gt;
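Limiting by pattern works the same way as limiting by line number: the address in front of the command is a regular expression between slashes instead of a number. A minimal sketch, assuming a made-up three-line input file:

```shell
# Sample file with hypothetical contents.
workdir=$(mktemp -d)
printf 'Ram lives here\n# Ram in a comment\nRam again\n' > "$workdir/Inputfile"

# /pattern/ address: substitute only on lines that contain "#".
sed '/#/s/Ram/Shyam/' "$workdir/Inputfile"

# /start/,/end/ address: substitute from the first line matching "comment"
# through the next line matching "again".
sed '/comment/,/again/s/Ram/Shyam/' "$workdir/Inputfile"
```

The first command changes only the second line; the second command changes the second and third lines.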





&lt;table&gt; 
&lt;tbody&gt;
&lt;tr&gt;&lt;td align=&quot;middle&quot; bgcolor=&quot;#db3341&quot; span=&quot;&quot; style=&quot;color: #f2edf2; font-size: 16; font-weight: bold;&quot;&gt;&lt;div onclick=&quot;show(&#39;6&#39;)&quot;&gt;
6. &lt;b&gt;Delete Lines in a file with sed &lt;/b&gt;&lt;/div&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;

&lt;div id=&quot;6&quot; style=&quot;display: none;&quot;&gt;
&lt;b&gt;6. Delete Lines in a file with &lt;b&gt;sed&amp;nbsp;&lt;/b&gt;&lt;/b&gt;: The &lt;b&gt;delete (d) command&lt;/b&gt; removes lines from the output stream; the original file is not modified.&lt;br /&gt;
- The following command deletes the &lt;b&gt;second line&lt;/b&gt;:&lt;/div&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;sed &#39;2d&#39; Inputfile  &amp;gt;Outfile&lt;/pre&gt;
- The following command deletes the &lt;b&gt;second and third lines&lt;/b&gt;:
&lt;br /&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;sed &#39;2,3d&#39; Inputfile  &amp;gt;Outfile&lt;/pre&gt;
- The following command deletes everything from the &lt;b&gt;second line to the end&lt;/b&gt; of the file:&lt;br /&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;sed &#39;2,$d&#39; Inputfile  &amp;gt;Outfile&lt;/pre&gt;
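The d command also accepts a pattern instead of line numbers. The sketch below uses a made-up four-line file and shows that the input file keeps all of its lines afterwards:

```shell
# Sample file with hypothetical contents; the second line is empty.
workdir=$(mktemp -d)
printf 'header\n\nRam\nShyam\n' > "$workdir/Inputfile"

# Delete every empty line and save the result in Outfile.
sed '/^$/d' "$workdir/Inputfile" > "$workdir/Outfile"

# Delete every line that contains "Ram".
sed '/Ram/d' "$workdir/Inputfile"

# The input file itself is untouched and still has four lines.
wc -l "$workdir/Inputfile"
```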
&lt;/div&gt;



</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/3670269355055995156/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/01/50-examples-for-sed-linux-command-in.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/3670269355055995156'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/3670269355055995156'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2019/01/50-examples-for-sed-linux-command-in.html' title='Examples For Sed Linux Command In Text Manipulation and File Handling'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3kbewmD1NlefP8vLI5CA4UODSsE7ZV1o-tpn2etCcmkKTNV-3CBiwACLcR6IDuEQ5rA1t-VAOu4KANulGqt0-JAbAdvLHd0M1KsZjbyia83lWwghRFzvu5Jxav4my4I-zeXT0xRCr7oyu/s72-c/Screenshot+from+2019-01-24+14-58-07.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-3454495058032250921</id><published>2018-07-05T19:59:00.000+05:30</published><updated>2019-01-28T00:07:46.120+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="Bioinformatics resources"/><category scheme="http://www.blogger.com/atom/ns#" 
term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="R"/><title type='text'>How to compare multiple sets using UpsetR </title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivk_VkBqryLmzVE-o6lTTVQbRznJKM7TA5EnvHimaEP7jB7jW2YBhYK1BgOq4C2Lz_KltDB4hDRu6uOGroitP09G1GiLWylpCRPhrp4NAQVpEpFyIlMbeZz05L0Z7_Gqr08CrebmbY8iGP/s1600/output.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; data-original-height=&quot;706&quot; data-original-width=&quot;1600&quot; height=&quot;141&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivk_VkBqryLmzVE-o6lTTVQbRznJKM7TA5EnvHimaEP7jB7jW2YBhYK1BgOq4C2Lz_KltDB4hDRu6uOGroitP09G1GiLWylpCRPhrp4NAQVpEpFyIlMbeZz05L0Z7_Gqr08CrebmbY8iGP/s320/output.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;h1&gt;
Why UpSet&lt;/h1&gt;
Every day I face problems that require understanding the relationships between sets. A Venn diagram does a great job when the number of sets is small (up to about 4), but it gets clumsy as the number of sets increases: a Venn diagram with many sets is difficult to interpret and easy to get lost in. &lt;b&gt;&lt;a href=&quot;https://www.ncbi.nlm.nih.gov/pubmed/28645171&quot; target=&quot;_blank&quot;&gt;UpSet&lt;/a&gt;&lt;/b&gt; is an alternative &quot;visualization technique for the quantitative analysis of sets, their intersections, and aggregates of intersections&quot;.&lt;br /&gt;
&lt;h1&gt;
How to use UpSet&lt;/h1&gt;
The source code of the Python implementation of UpSet can be downloaded from &lt;b&gt;&lt;a href=&quot;https://github.com/ImSoErgodic/py-upset&quot; target=&quot;_blank&quot;&gt;HERE&lt;/a&gt;&lt;/b&gt;, while the R version is &lt;b&gt;&lt;a href=&quot;https://github.com/hms-dbmi/UpSetR&quot; target=&quot;_blank&quot;&gt;HERE&lt;/a&gt;&lt;/b&gt;. The web version of UpSet can be used from &lt;a href=&quot;http://vcg.github.io/upset/?dataset=0&amp;amp;duration=1000&amp;amp;orderBy=subsetSize&amp;amp;grouping=groupByIntersectionSize&amp;amp;selection=&quot; target=&quot;_blank&quot;&gt;&lt;b&gt;HERE&lt;/b&gt;&lt;/a&gt; or &lt;b&gt;&lt;a href=&quot;https://gehlenborglab.shinyapps.io/upsetr/&quot; target=&quot;_blank&quot;&gt;HERE&lt;/a&gt;&lt;/b&gt;. The web versions are easy to use for any project, but their styling does not always match our taste. I modified the two main functions that draw the main plots. The new functions give the plot more flexibility:&lt;br /&gt;
&lt;ol&gt;
&lt;li&gt;The number of unique colours for each comparison is calculated automatically.&lt;/li&gt;
&lt;li&gt;The colour of the numbers can be changed and does not have to match the bar colour.&lt;/li&gt;
&lt;li&gt;Fonts can also be changed.&lt;/li&gt;
&lt;/ol&gt;
&lt;h1&gt;
Prerequisite&lt;/h1&gt;
We need the following R libraries to run the script:&lt;br /&gt;
&lt;ol style=&quot;text-align: left;&quot;&gt;
&lt;li&gt;&lt;b&gt;UpSetR&amp;nbsp;&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;ggplot2&amp;nbsp;&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;grid&amp;nbsp;&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;RColorBrewer&amp;nbsp;&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;&lt;b&gt;extrafont&lt;/b&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h1&gt;Downloads&lt;/h1&gt;
All files can be downloaded from here&lt;br /&gt;
&lt;blockquote class=&quot;embedly-card&quot;&gt;
&lt;h4&gt;
&lt;a href=&quot;https://github.com/sanjaysingh765/UpsetR_modified&quot;&gt;UpsetR_modified&lt;/a&gt;&lt;/h4&gt;
UpsetR_modified - This repository contains the R script the make plot using UpsetR library. Two functions were modified to make the package more flexible.&lt;/blockquote&gt;
&lt;script async=&quot;&quot; charset=&quot;UTF-8&quot; src=&quot;https://raw.githubusercontent.com/embedly/embedly-jquery/master/jquery.embedly.js&quot;&gt;&lt;/script&gt;
&lt;/div&gt;
&lt;div style=&quot;border-bottom: 1px solid #ccc; border-top: 1px solid #ccc; color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;
Extract Part of a FASTA Sequences with Position by python script &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2013/10/actually-i-have-hundreds-of-protein.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;

&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/3454495058032250921/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2018/07/why-upset-everyday-i-face-problems-that.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/3454495058032250921'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/3454495058032250921'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2018/07/why-upset-everyday-i-face-problems-that.html' title='How to compare multiple sets using UpsetR '/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivk_VkBqryLmzVE-o6lTTVQbRznJKM7TA5EnvHimaEP7jB7jW2YBhYK1BgOq4C2Lz_KltDB4hDRu6uOGroitP09G1GiLWylpCRPhrp4NAQVpEpFyIlMbeZz05L0Z7_Gqr08CrebmbY8iGP/s72-c/output.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-7107802415250353851</id><published>2018-05-19T01:22:00.000+05:30</published><updated>2019-01-25T01:00:16.493+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="Bioinformatics resources"/><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category 
scheme="http://www.blogger.com/atom/ns#" term="R"/><title type='text'>How to make a group bar graph with error bars and split y axis </title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQb0hYZuk-8-W-TjvGZOgldthe9emR01rjuSxTHywh5AUq-uA017AA1TePSVhOMTMElh4eLpdTCLH-AnjkiBFGhMtp_wNJdDj36mNSJ1p9fmfTQZnMqkG8LVkY1nTFf00FP8FxLh3blKr7/s1600/heatmap1.jpg&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQb0hYZuk-8-W-TjvGZOgldthe9emR01rjuSxTHywh5AUq-uA017AA1TePSVhOMTMElh4eLpdTCLH-AnjkiBFGhMtp_wNJdDj36mNSJ1p9fmfTQZnMqkG8LVkY1nTFf00FP8FxLh3blKr7/s1600/heatmap1.jpg&quot; width=&quot;320&quot; height=&quot;256&quot; data-original-width=&quot;1500&quot; data-original-height=&quot;1200&quot; /&gt;&lt;/a&gt;&lt;/div&gt;I wanted to draw a grouped bar graph with error bars and a split y axis to show both smaller and larger values in the same plot. Although plotrix has a function to do that, I don&#39;t know how to modify its awful-looking graphs. I found a good solution &lt;a href=&quot;http://sickel.net/blogg/?p=688&quot;&gt;&lt;b&gt;HERE&lt;/b&gt;&lt;/a&gt; and modified it to suit my taste and requirements. It needs &lt;b&gt;gplots&lt;/b&gt;, &lt;b&gt;extrafont&lt;/b&gt; and&lt;b&gt; RColorBrewer&lt;/b&gt; and produces a beautiful high-resolution chart.

&lt;blockquote&gt;&lt;a href=&quot;http://www.bioinformatics-made-simple.com/2017/10/how-to-perform-non-metric.html&quot;&gt;How to perform Non-metric multidimensional scaling (NMDS) analysis&lt;/a&gt;&lt;/blockquote&gt;
&lt;script src=&quot;https://gist.github.com/sanjaysingh765/33932075a728a19320ff416d0b0ade85.js&quot;&gt;&lt;/script&gt;


   &lt;/div&gt;

&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/7107802415250353851/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2018/05/how-to-make-group-bar-graph-with-error.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/7107802415250353851'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/7107802415250353851'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2018/05/how-to-make-group-bar-graph-with-error.html' title='How to make a group bar graph with error bars and split y axis '/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-1008382590006924214</id><published>2018-04-11T02:05:00.005+05:30</published><updated>2019-08-16T19:48:24.387+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="Bioinformatics resources"/><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="R"/><title type='text'>How to make a Heatmap with multiple annotation</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;&lt;a href=&quot;https://raw.githubusercontent.com/sanjaysingh765/complexheatmap/master/blog.png&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; src=&quot;https://raw.githubusercontent.com/sanjaysingh765/complexheatmap/master/blog.png&quot; width=&quot;320&quot; height=&quot;213&quot; data-original-width=&quot;800&quot; data-original-height=&quot;533&quot; /&gt;&lt;/a&gt;&lt;/div&gt;I wanted to make a heatmap with multiple annotations with as little manual intervention as possible. This script creates a heatmap with multiple annotations such as genotype, treatment, gene and class. The legends for all annotations are placed on the right side of the heatmap. Please visit &lt;a href=&quot;https://github.com/sanjaysingh765/complexheatmap&quot;&gt;this page&lt;/a&gt; for all files related to this R script.   
We need R libraries&lt;b&gt; extrafont&lt;/b&gt;, &lt;b&gt;ComplexHeatmap&lt;/b&gt;,&lt;b&gt; circlize&lt;/b&gt; and &lt;b&gt;RColorBrewer&lt;/b&gt; to run this script.&lt;br /&gt;
&lt;blockquote&gt;
1. &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2014/07/easiet-way-to-create-heat-map-in-excel.html&quot; target=&quot;_blank&quot;&gt;Easiet Way to Create A Heat Map In Excel&lt;/a&gt;&lt;/blockquote&gt;
&lt;script src=&quot;http://gist-it.appspot.com/github/sanjaysingh765/complexheatmap/blob/master/complexheap.R&quot;&gt;&lt;/script&gt;

&lt;/div&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/1008382590006924214/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2018/04/how-to-make-heatmap-with-multiple.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/1008382590006924214'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/1008382590006924214'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2018/04/how-to-make-heatmap-with-multiple.html' title='How to make a Heatmap with multiple annotation'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-6191076959328077403</id><published>2018-03-02T03:33:00.000+05:30</published><updated>2018-04-11T02:07:48.192+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="R"/><title type='text'>How to download expression data set from NCBI GEO</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
NCBI provides a wonderful tool, &lt;b&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=EUPmGWS8ik0&quot;&gt;GEO2R&lt;/a&gt;&lt;/b&gt;, for analysing microarray data sets, but sometimes I need the normalized data to check the expression of a given probe across experiments. In that case, the R script given here downloads the expression data for a whole experiment into a simple text file that can be opened in Excel or a similar program. The script assumes that &lt;b&gt; GEOquery&lt;/b&gt; is installed on your machine. If GEOquery is not installed, run this command:





&lt;br /&gt;
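&lt;pre class=&quot;brush:perl&quot;&gt;# biocLite() was retired in Bioconductor 3.8; BiocManager is its replacement
if (!requireNamespace(&quot;BiocManager&quot;, quietly = TRUE)) install.packages(&quot;BiocManager&quot;)
BiocManager::install(&quot;GEOquery&quot;)&lt;/pre&gt;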
&lt;br /&gt;
The script can be run directly from your terminal:
&lt;br /&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;Rscript GEO_expression.R accession_number&lt;/pre&gt;

&lt;blockquote&gt;
1. &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2016/11/how-to-download-all-sra-samples-or.html&quot; target=&quot;_blank&quot;&gt;Easiest Way to Download All Sra Samples or Multi Experiment file from NCBI SRA database&lt;/a&gt;&lt;/blockquote&gt;
&lt;script src=&quot;http://gist-it.appspot.com/github/sanjaysingh765/GEO_expression.R/blob/master/GEO_expression.R&quot;&gt;&lt;/script&gt;


&lt;/div&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/6191076959328077403/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2018/03/how-to-download-expression-data-set.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/6191076959328077403'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/6191076959328077403'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2018/03/how-to-download-expression-data-set.html' title='How to download expression data set from NCBI GEO'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-6792323460859376874</id><published>2017-10-10T23:11:00.002+05:30</published><updated>2018-03-02T03:36:19.455+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="Bioinformatics resources"/><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="R"/><category scheme="http://www.blogger.com/atom/ns#" term="tool"/><title type='text'>How to perform Non-metric multidimensional scaling (NMDS) analysis</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a 
name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
&lt;b&gt;Requirements&lt;/b&gt;&lt;br /&gt;

We need R libraries &lt;b&gt;vegan&lt;/b&gt;, &lt;b&gt;ggplot2&lt;/b&gt;, &lt;b&gt;extrafont&lt;/b&gt; to run this script. &lt;br /&gt;&lt;br /&gt;


&lt;b&gt;R scripts&lt;/b&gt;&lt;br /&gt;
&lt;script src=&quot;http://gist-it.appspot.com/github/sanjaysingh765/NMDS_analysis/blob/master/NMDS.R&quot;&gt;&lt;/script&gt;

&lt;br /&gt;
&lt;br /&gt;

For the complete analysis results and raw files, please visit &lt;b&gt;&lt;a href=&quot;https://github.com/sanjaysingh765/NMDS_analysis&quot;&gt;HERE&lt;/a&gt;&lt;/b&gt;


&lt;br /&gt;
&lt;br /&gt;
&lt;div style=&quot;border-bottom: 1px solid #ccc; border-top: 1px solid #ccc; color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;
How To Predict CRISPR-Cas9 target site in R &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2017/04/how-to-predict-crispr-cas9-target-site.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/6792323460859376874/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2017/10/how-to-perform-non-metric.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/6792323460859376874'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/6792323460859376874'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2017/10/how-to-perform-non-metric.html' title='How to perform Non-metric multidimensional scaling (NMDS) analysis'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-3920396249105331553</id><published>2017-07-06T02:07:00.003+05:30</published><updated>2017-10-10T23:12:44.848+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="Bioinformatics resources"/><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><title type='text'>Easiest Way to Download All Sra Samples or Multi Experiment file from NCBI SRA database- II</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
Previously I shared an &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2016/11/how-to-download-all-sra-samples-or.html&quot;&gt;easy way to download the data files from NCBI SRA database&lt;/a&gt;. Although it needed only &lt;b&gt;wget&lt;/b&gt; to download the data files, it required a lot of link editing. Here I share another script that downloads either all files of a biostudy or a single file, depending on the accession number given to the script. The bash script also converts the native SRA files into FASTQ files as part of the same run. I hope it will be of some help.

&lt;h2&gt;sra_download.sh&lt;/h2&gt;
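&lt;pre class=&quot;brush:perl&quot;&gt;
#!/bin/bash
# Download every run for an SRA accession and convert it to FASTQ.
# Requires Entrez Direct (esearch, efetch) and the SRA Toolkit (fastq-dump).
if [ &quot;$#&quot; -eq 0 ]
 then
   echo &quot;No accession number provided&quot;
   exit 1
else
   id=$1
   echo &quot;Please wait....&quot;
   # Look up the accession, keep the first (Run) column of the runinfo table,
   # keep only the SRR run IDs and pass them to fastq-dump.
   esearch -db sra -query &quot;$id&quot;  | efetch --format runinfo | cut -d &#39;,&#39; -f 1 | grep SRR | xargs fastq-dump --split-files
   echo &quot;Download complete....&quot;
fi
&lt;/pre&gt;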
Save the script as sra_download.sh and run it as:
&lt;pre class=&quot;brush:perl&quot;&gt; bash sra_download.sh SRA_accession_number &lt;/pre&gt;
You can use any kind of accession number, such as SRR3114162, PRJNA309373 or SRP068795. I have tested this script on Ubuntu; it assumes that &lt;a href=&quot;ftp://ftp.ncbi.nlm.nih.gov/entrez/entrezdirect/&quot;&gt;Entrez Direct &lt;/a&gt; and &lt;a href=&quot;https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=software&quot;&gt;NCBI SRA Toolkit&lt;/a&gt; are installed on your machine. 
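The middle of the pipeline (cut, then grep) can be checked without contacting NCBI. The sketch below feeds it a made-up runinfo excerpt; the accessions and the shortened column list are hypothetical, since the real runinfo table has many more columns.

```shell
# Made-up runinfo excerpt (hypothetical accessions and columns, not real NCBI data).
runinfo='Run,ReleaseDate,spots
SRR0000001,2016-01-01,100
SRR0000002,2016-01-02,200'

# Same filter as in sra_download.sh: keep column 1, then keep only the SRR lines.
printf '%s\n' "$runinfo" | cut -d ',' -f 1 | grep SRR
```

The header line is dropped because its first field, Run, does not contain SRR, so only the run IDs reach xargs and fastq-dump.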

&lt;/div&gt;
&lt;div style=&quot;border-bottom: 1px solid #ccc; border-top: 1px solid #ccc; color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;
Extract Part of a FASTA Sequences with Position by python script &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2013/10/actually-i-have-hundreds-of-protein.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/3920396249105331553/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2017/07/easiest-way-to-download-all-sra-samples.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/3920396249105331553'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/3920396249105331553'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2017/07/easiest-way-to-download-all-sra-samples.html' title='Easiest Way to Download All Sra Samples or Multi Experiment file from NCBI SRA database- II'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-8779771805989172688</id><published>2017-04-28T23:45:00.002+05:30</published><updated>2017-10-10T23:12:26.046+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="Bioinformatics resources"/><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="Software"/><title type='text'>How To Predict CRISPR-Cas9 target site in R</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
Although there are several &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2012/02/5-bioinformatics-link-directories-you.html&quot;&gt;online bioinformatics tools&lt;/a&gt; to predict target sites for CRISPR-Cas9, I was looking for an offline solution. Here I share an R script that uses two R libraries, &lt;b&gt;CRISPRseek&lt;/b&gt; and &lt;b&gt;msa&lt;/b&gt;. The best part is that it also predicts the restriction enzyme sites in the target region, which helps in the downstream screening of mutants. I hope this script will help someone. As always, the R script is heavily commented to make it easy to follow. 
&lt;br /&gt;
&lt;br /&gt;
&lt;script src=&quot;http://gist-it.appspot.com/github/sanjaysingh765/crispr_target_prediction_in_R/blob/master/crispr_target_prediction.R&quot;&gt;&lt;/script&gt;

&lt;/div&gt;
&lt;div style=&quot;border-bottom: 1px solid #ccc; border-top: 1px solid #ccc; color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;
How To Perform Basic Multiple Sequence Alignments In R &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2017/03/how-to-perform-basic-multiple-sequence.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/8779771805989172688/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2017/04/how-to-predict-crispr-cas9-target-site.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/8779771805989172688'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/8779771805989172688'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2017/04/how-to-predict-crispr-cas9-target-site.html' title='How To Predict CRISPR-Cas9 target site in R'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-8235548147978871812</id><published>2017-03-20T01:41:00.000+05:30</published><updated>2020-01-18T00:01:10.362+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="Bioinformatics resources"/><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><category scheme="http://www.blogger.com/atom/ns#" term="R"/><title type='text'>How To Perform Basic Multiple Sequence Alignments In R</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
There are several servers for performing multiple sequence alignment and visualization. I was looking for an all-in-one &lt;a href=&quot;http://bioinfo-tool.blogspot.com/p/online-bioinformatics-tools.html&quot;&gt;offline tool&lt;/a&gt; that can perform the alignment and generate a publication-quality image. 
The &lt;b&gt;&lt;a href=&quot;https://bioconductor.org/packages/devel/bioc/vignettes/msa/inst/doc/msa.pdf&quot;&gt;msa package&lt;/a&gt;&lt;/b&gt; handles both amino acid and DNA alignments. The following R script can be used for this purpose. It is heavily commented so that it is easy to follow. I hope it will be helpful for others too.
&lt;br /&gt;&lt;br /&gt;
&lt;script src=&quot;http://gist-it.appspot.com/github/sanjaysingh765/Pairwise_Sequence_Alignment_With_R/blob/master/alignment.R&quot;&gt;&lt;/script&gt;


&lt;/div&gt;




&lt;div style=&quot;text-align: center; font-style:italic; color: #a5a4a4; font-weight:bold; border-top: 1px solid #ccc; border-bottom: 1px solid #ccc; padding:30px; margin:30px; &quot;&gt;Sequence Similarity Search - I : Basic Local Alignment Search Tool (BLAST)  &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2011/11/sequence-similarity-search-i-basic.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/8235548147978871812/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2017/03/how-to-perform-basic-multiple-sequence.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/8235548147978871812'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/8235548147978871812'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2017/03/how-to-perform-basic-multiple-sequence.html' title='How To Perform Basic Multiple Sequence Alignments In R'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-2375376091177134998</id><published>2016-11-17T20:46:00.000+05:30</published><updated>2018-03-02T03:35:57.827+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><title type='text'>Easiest Way to Download All Sra Samples or Multi Experiment file from NCBI SRA database</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
The &lt;a href=&quot;http://www.ebi.ac.uk/ena&quot;&gt;European Nucleotide Archive&lt;/a&gt; is a good place to start when downloading raw fastq files. Downloading multiple run files from the NCBI SRA database, however, is not as easy. I recently learned a relatively easy way to do it. Suppose I want to download these four runs together from the NCBI SRA database: SRR122247, SRR122248, SRR122249, SRR122250. The basic URL has this format:&lt;br /&gt;

&lt;pre style=&quot;font-family: Andale Mono, Lucida Console, Monaco, fixed, monospace; color: #000000; font-size: 12px;border: 1px dashed #999999;line-height: 14px;padding: 5px; overflow: auto; width: 100%&quot;&gt;&lt;code&gt;ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/{SRR&amp;#124;ERR&amp;#124;DRR}/&amp;lt;first 6 characters of accession&amp;gt;/&amp;lt;accession&amp;gt;/&amp;lt;accession&amp;gt;.sra
&lt;/code&gt;&lt;/pre&gt;

Here, {SRR|ERR|DRR} must be either ‘SRR’, ‘ERR’, or ‘DRR’ and must match the prefix of the target .sra file.

This is described in detail &lt;a href=&quot;https://www.ncbi.nlm.nih.gov/books/NBK158899/&quot; target=&quot;_blank&quot;&gt;&lt;b&gt;HERE&lt;/b&gt;&lt;/a&gt;. 
So my final URLs look like this:

&lt;pre class=&quot;brush:perl&quot;&gt;

ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR122/SRR122247/SRR122247.sra
ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR122/SRR122248/SRR122248.sra
ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR122/SRR122249/SRR122249.sra
ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/SRR/SRR122/SRR122250/SRR122250.sra
&lt;/pre&gt;


Now save these URLs in a text file (url_list.txt, for example) and run &lt;b&gt;wget&lt;/b&gt; like this:

&lt;pre class=&quot;brush:perl&quot;&gt;wget -i url_list.txt&lt;/pre&gt;

&lt;blockquote&gt;On Debian or Ubuntu, run &lt;b&gt;sudo apt-get install wget&lt;/b&gt; from the terminal if you don&#39;t have wget.&lt;/blockquote&gt;
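The URL list above can also be generated instead of typed by hand. The following is only a minimal sketch, assuming the URL layout quoted above is still what the server uses; the function name sra_url is hypothetical and introduced purely for illustration.

```shell
#!/bin/sh
# Sketch: build an SRA download URL from a run accession, using the layout quoted above.
# sra_url is a hypothetical helper name, not part of any NCBI tool.
sra_url() {
    acc="$1"
    prefix=$(printf '%s' "$acc" | cut -c1-3)   # SRR, ERR or DRR
    first6=$(printf '%s' "$acc" | cut -c1-6)   # first 6 characters of the accession
    printf 'ftp://ftp-trace.ncbi.nih.gov/sra/sra-instant/reads/ByRun/sra/%s/%s/%s/%s.sra\n' \
        "$prefix" "$first6" "$acc" "$acc"
}

# Write one URL per accession to url_list.txt
for run in SRR122247 SRR122248 SRR122249 SRR122250; do
    sra_url "$run"
done > url_list.txt
```

The resulting url_list.txt can then be passed to wget -i url_list.txt exactly as described above.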


&lt;/div&gt;

&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/2375376091177134998/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2016/11/how-to-download-all-sra-samples-or.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/2375376091177134998'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/2375376091177134998'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2016/11/how-to-download-all-sra-samples-or.html' title='Easiest Way to Download All Sra Samples or Multi Experiment file from NCBI SRA database'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-8318157745325678321</id><published>2016-06-15T20:37:00.001+05:30</published><updated>2017-04-15T07:36:10.098+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="Bioinformatics resources"/><title type='text'>Journal Citation Reports 2015 / 2016</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div class=&quot;separator&quot; style=&quot;clear: both; text-align: center;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDhCcKU45PPQ9egWG3nGKCjVMerXKIvO2tjb-gZBrli4fWeF8ZCOfIcvo2ohqUeXN5ICZwP0XJo8Xn1nom_u4y9uN21zjnQl53BWMMV7pCjUE32x4deC8N2TvbiWZIOE_-Pz6KFTnDF0-1/s1600/journal+impact+factor+2012.jpg&quot; imageanchor=&quot;1&quot; style=&quot;clear: left; float: left; margin-bottom: 1em; margin-right: 1em;&quot;&gt;&lt;img border=&quot;0&quot; height=&quot;213&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDhCcKU45PPQ9egWG3nGKCjVMerXKIvO2tjb-gZBrli4fWeF8ZCOfIcvo2ohqUeXN5ICZwP0XJo8Xn1nom_u4y9uN21zjnQl53BWMMV7pCjUE32x4deC8N2TvbiWZIOE_-Pz6KFTnDF0-1/s320/journal+impact+factor+2012.jpg&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div style=&quot;text-align: justify;&quot;&gt;
&lt;span style=&quot;font-size: large;&quot;&gt;&lt;span style=&quot;font-family: &amp;quot;times&amp;quot; , &amp;quot;times new roman&amp;quot; , serif;&quot;&gt;Thomson Reuters recently announced the Journal Citation Reports® (JCR) for 2016. JCR offers a systematic means to critically evaluate the world&#39;s leading journals, with quantifiable, statistical information based on citation data compiled from articles&#39; cited references. As a result, the scientific community now widely treats journal impact as a quality measure of published work. JCR contains each journal&#39;s impact factor, the number of times it was cited in other journals, and other statistical values. Because this list of journal impact factors is built from data gathered throughout 2015, it is also called the journal impact factor 2015 list. 
You can view this SCI journal impact factor 2015 list in another tab by clicking &lt;b&gt;&lt;a href=&quot;https://drive.google.com/file/d/0B-PwDYXWDBdgNURaeGRyU1U1YkE/view?usp=sharing&quot; target=&quot;_blank&quot;&gt;HERE&lt;/a&gt;&lt;/b&gt;, or you can download it &lt;b&gt;&lt;a href=&quot;https://drive.google.com/uc?export=download&amp;id=0B-PwDYXWDBdgNURaeGRyU1U1YkE&quot; target=&quot;_blank&quot;&gt;HERE&lt;/a&gt;&lt;/b&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/8318157745325678321/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2016/06/journal-citation-reports-2015-2016.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/8318157745325678321'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/8318157745325678321'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2016/06/journal-citation-reports-2015-2016.html' title='Journal Citation Reports 2015 / 2016'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDhCcKU45PPQ9egWG3nGKCjVMerXKIvO2tjb-gZBrli4fWeF8ZCOfIcvo2ohqUeXN5ICZwP0XJo8Xn1nom_u4y9uN21zjnQl53BWMMV7pCjUE32x4deC8N2TvbiWZIOE_-Pz6KFTnDF0-1/s72-c/journal+impact+factor+2012.jpg" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-5605449194804027697</id><published>2016-01-19T23:08:00.003+05:30</published><updated>2017-04-15T07:35:42.791+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><title type='text'>How to download only viridiplantae miRNA from miRBase</title><content type='html'>&lt;div 
dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
There is no direct way to download organism-specific miRNAs from the miRBase database, so I extracted the Viridiplantae (plant) miRNAs from miRBase using a few Unix commands. The steps are as follows:&lt;br /&gt;
&lt;ul&gt;
&lt;li&gt;Download the organism information from &lt;a href=&quot;ftp://mirbase.org/pub/mirbase/CURRENT/organisms.txt.gz&quot; target=&quot;_blank&quot;&gt;&lt;b&gt;HERE&lt;/b&gt;&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Download the mature miRNA sequences from &lt;a href=&quot;ftp://mirbase.org/pub/mirbase/CURRENT/mature.fa.gz&quot; target=&quot;_blank&quot;&gt;&lt;b&gt;HERE&lt;/b&gt;&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Extract both files into the same directory.&lt;/li&gt;
&lt;li&gt;Download the FASTA dereplicating Python script from &lt;b&gt;&lt;a href=&quot;https://drive.google.com/uc?export=download&amp;amp;id=0B7s04L2fqhwDTXY1UjFTOHAybjg&quot; target=&quot;_blank&quot;&gt;HERE&lt;/a&gt;&lt;/b&gt;.&lt;/li&gt;
&lt;li&gt;Now run the bash script given below from the same directory:&lt;/li&gt;

&lt;pre class=&quot;brush:perl&quot;&gt;#!/bin/bash
# Script to extract plant miRNAs from the miRBase database.
# Expects mature.fa, organisms.txt and derep.py in the current directory.

# Convert FASTA to one record per line (header fields followed by the sequence)
awk &#39;BEGIN{RS=&quot;&amp;gt;&quot;}{gsub(&quot;\n&quot;,&quot; &quot;,$0); print &quot;&amp;gt;&quot;$0}&#39; mature.fa &amp;gt; mature.tab

# Extract the organisms that belong to Viridiplantae. You can extract the miRNAs
# of another group by changing the word &quot;Viridiplantae&quot;
grep Viridiplantae organisms.txt &amp;gt; plants_mirbase.txt

# Extract the plant names (genus and species, fields 3 and 4)
awk &#39;{ print $3 &quot; &quot; $4 }&#39; plants_mirbase.txt &amp;gt; plant_name.txt

# Extract the miRNA records whose header contains one of the plant names
grep -F -f plant_name.txt mature.tab &amp;gt; plant_mirna.tab

# Convert back to FASTA; assumes five header fields followed by the sequence
awk &#39;{print &quot;&quot;$1&quot; &quot;$2&quot; &quot;$3&quot; &quot;$4&quot; &quot;$5&quot;\n&quot;$6}&#39; plant_mirna.tab &amp;gt; plant_mirna.rna

# Convert RNA to DNA (U to T) on sequence lines only
sed &#39;/^[^&amp;gt;]/ y/uU/tT/&#39; plant_mirna.rna &amp;gt; plant_mirna.fasta

# Dereplicate the miRNA file
python derep.py -i plant_mirna.fasta

# Clean the FASTA headers (keep only the text before the first &quot;;&quot;)
awk -F &#39;;&#39; &#39;{print $1}&#39; derep_plant_mirna.fasta &amp;gt; plant_mature_mirna_unique.fasta

# Remove the intermediate files
rm mature.tab plants_mirbase.txt plant_mirna.tab plant_mirna.rna plant_name.txt derep_plant_mirna.fasta

echo &quot;Mature miRNAs from all plants are in plant_mirna.fasta&quot;
echo &quot;Unique mature miRNAs from all plants are in plant_mature_mirna_unique.fasta&quot;
echo &quot;All jobs done&quot;
&lt;/pre&gt;
&lt;/ul&gt;
In short, the bash script above extracts the plant miRNAs deposited in the miRBase database and saves them to the file &lt;b&gt;plant_mirna.fasta&lt;/b&gt;. In the second part, it removes the duplicate miRNAs and saves the result in another file, &lt;b&gt;plant_mature_mirna_unique.fasta&lt;/b&gt;.&lt;/div&gt;
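The dereplication step depends on the separate derep.py script. As a rough illustration of the idea only (this is not derep.py itself, and the function name derep_fasta is hypothetical), duplicate sequences in a FASTA file with one sequence line per record can be dropped with awk, keeping the first header seen for each distinct sequence:

```shell
#!/bin/sh
# Sketch: keep the first record for each distinct sequence.
# Assumes one sequence line per record, as in plant_mirna.fasta.
derep_fasta() {
    awk '
        /^>/        { header = $0; next }          # remember the current header
        !seen[$0]++ { print header; print $0 }     # first time this sequence appears
    ' "$1"
}
```

It could be called as derep_fasta plant_mirna.fasta, with the output redirected to a new file. Unlike the pipeline above, this sketch leaves the headers untouched, so no header cleaning step is needed afterwards.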
&lt;div style=&quot;border-bottom: 1px solid #ccc; border-top: 1px solid #ccc; color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;
How to remove duplicate sequences from FASTA file &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2013/04/how-to-remove-duplicate-sequences-from.html&quot; target=&quot;_blank&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;
</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/5605449194804027697/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2016/01/how-to-download-only-viridiplantae.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/5605449194804027697'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/5605449194804027697'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2016/01/how-to-download-only-viridiplantae.html' title='How to download only viridiplantae miRNA from miRBase'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7223713061342925300.post-4003437738074042369</id><published>2015-12-28T04:32:00.000+05:30</published><updated>2016-06-15T20:42:52.128+05:30</updated><category scheme="http://www.blogger.com/atom/ns#" term="BLAST"/><category scheme="http://www.blogger.com/atom/ns#" term="HOW  TO"/><title type='text'>BLAST Database creation error</title><content type='html'>&lt;div dir=&quot;ltr&quot; style=&quot;text-align: left;&quot; trbidi=&quot;on&quot;&gt;&lt;a name=&#39;more&#39;&gt;&lt;/a&gt;
&lt;div style=&quot;line-height: 30px; text-align: justify;&quot;&gt;
&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOKhXtyljwxpRJ-lN3nE7MTGaw0n6JFNuVyIPRlfOnj-Mq_mSDpkSXXTcFS47ChwBj9MzjJk-8EJ_5uo4MRCTiLpkyuwHq_ubAvemBO4NkP2a0iQwaRqDwSVaXofyjUeEabU1yQpNIs8MH/s1600/ncbi_blast.gif&quot; imageanchor=&quot;1&quot;&gt;&lt;img border=&quot;0&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOKhXtyljwxpRJ-lN3nE7MTGaw0n6JFNuVyIPRlfOnj-Mq_mSDpkSXXTcFS47ChwBj9MzjJk-8EJ_5uo4MRCTiLpkyuwHq_ubAvemBO4NkP2a0iQwaRqDwSVaXofyjUeEabU1yQpNIs8MH/s320/ncbi_blast.gif&quot; /&gt;&lt;/a&gt;

I was trying to create a &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2013/05/how-to-blast-multiple-sequences-against.html&quot; target=&quot;_blank&quot;&gt;BLAST database&lt;/a&gt;, but I got this error:

&lt;br /&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;Building a new DB, current time: 12/03/2015 09:44:18
New DB name:   plant_protein
New DB title:  /home/sanjay/bin/Genomes/plant_protein_from_plantgdb.fa
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B

volume: plant_protein

file: plant_protein.pin
file: plant_protein.phr
file: plant_protein.psq

BLAST Database creation error: FASTA-Reader: No residues given&lt;/pre&gt;
Then I checked whether any of my FASTA records was empty by counting the blank lines in the file with this command:

&lt;br /&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;grep -c &quot;^$&quot; ~/bin/Genomes/plant_protein_from_plantgdb.fa&lt;/pre&gt;
I found one record that had only a FASTA header and no sequence. To remove the &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2013/09/how-to-remove-all-empty-fasta-sequences.html&quot; target=&quot;_blank&quot;&gt;empty FASTA&lt;/a&gt; record, I ran this command:
&lt;br /&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;awk &#39;BEGIN {RS = &quot;&amp;gt;&quot; ; FS = &quot;\n&quot; ; ORS = &quot;&quot;} $2 {print &quot;&amp;gt;&quot;$0}&#39; ~/bin/Genomes/plant_protein_from_plantgdb.fa &amp;gt;~/bin/Genomes/plant_protein_from_plantgdb.fasta&lt;/pre&gt;
And finally I got the success message:



&lt;br /&gt;
&lt;pre class=&quot;brush:perl&quot;&gt;Building a new DB, current time: 12/03/2015 09:48:01
New DB name:   plant_protein
New DB title:  /home/sanjay/bin/Genomes/plant_protein_from_plantgdb.fasta
Sequence type: Protein
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 980219 sequences in 24.2583 seconds.
&lt;/pre&gt;
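Counting blank lines shows that something is empty, but not which record. The following is a minimal sketch for listing the headers that have no sequence, assuming a plain FASTA file; the function name empty_fasta_headers is hypothetical and used only for illustration.

```shell
#!/bin/sh
# Sketch: print the header of every FASTA record that has no sequence lines.
empty_fasta_headers() {
    awk '
        # Print the pending header if no sequence line followed it
        function report() { if (header != "") { if (!has_seq) print header } }
        /^>/ { report(); header = $0; has_seq = 0; next }   # a new record starts
        NF   { has_seq = 1 }                                # non-blank sequence line
        END  { report() }                                   # check the last record
    ' "$1"
}
```

Running empty_fasta_headers on the original .fa file before filtering would show which header caused the error.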
&lt;br /&gt;
&lt;div style=&quot;border-bottom: 1px solid #ccc; border-top: 1px solid #ccc; color: #a5a4a4; font-style: italic; font-weight: bold; margin: 30px; padding: 30px; text-align: center;&quot;&gt;
How to install &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2012/02/5-server-for-batch-blast-against-ncbi.html&quot; target=&quot;_blank&quot;&gt;NCBI BLAST&lt;/a&gt; program on your computer &lt;a href=&quot;http://www.bioinformatics-made-simple.com/2012/06/how-to-install-ncbi-blast-on-window-7.html&quot;&gt;HERE&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinformatics-made-simple.blogspot.com/feeds/4003437738074042369/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2015/12/blast-database-creation-error.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/4003437738074042369'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7223713061342925300/posts/default/4003437738074042369'/><link rel='alternate' type='text/html' href='http://bioinformatics-made-simple.blogspot.com/2015/12/blast-database-creation-error.html' title='BLAST Database creation error'/><author><name>संजय</name><uri>http://www.blogger.com/profile/13208510103131669624</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='30' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtjLJMWy7g_2LQrSdECZZcRipQtCx0wpxHkPlF2Y4MPR3lVxzb-gA4ZvHLKH8UOwUxDHZC9sw32QGH3_iBUhnL-GvZ13i3hfRnwFn34aR8HZfwMC5pRyIm4R4jccijhQ/s220/Sadhu_V%25C3%25A2r%25C3%25A2nas%25C3%25AE_.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhOKhXtyljwxpRJ-lN3nE7MTGaw0n6JFNuVyIPRlfOnj-Mq_mSDpkSXXTcFS47ChwBj9MzjJk-8EJ_5uo4MRCTiLpkyuwHq_ubAvemBO4NkP2a0iQwaRqDwSVaXofyjUeEabU1yQpNIs8MH/s72-c/ncbi_blast.gif" height="72" width="72"/><thr:total>0</thr:total></entry></feed>