Force download of fasta files and aln files with Apache

By kenglish

If you are serving fasta files or alignment files on your server, you may want to force users to download them instead of previewing them in the browser. My application returned the fasta files with Content-Type text/plain. I wanted to change that to application/x-fasta and force a download. This is accomplished rather easily in Apache with the following directives:

  <FilesMatch "\.(?i:fasta)$">
    ForceType application/x-fasta
    Header set Content-Disposition attachment
  </FilesMatch>
  <FilesMatch "\.(?i:aln)$">
    ForceType application/x-aln
    Header set Content-Disposition attachment
  </FilesMatch>

You will have to enable the Apache module “mod_headers” for this to work, because it provides the Header directive.
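
To see which file names the two FilesMatch patterns would catch, here is a small Ruby sketch that applies the same regular expressions to a few made-up file names. The method name forced_download? and the sample names are hypothetical, and Ruby's regex engine is not the one Apache uses, so treat this only as a quick sanity check of the case-insensitive (?i:...) group:

```ruby
# The two patterns from the FilesMatch blocks above.
FORCED_PATTERNS = [/\.(?i:fasta)$/, /\.(?i:aln)$/]

# Hypothetical helper: true when a file name matches either pattern.
def forced_download?(name)
  FORCED_PATTERNS.any? { |pattern| name =~ pattern }
end

# Made-up file names, for illustration only.
%w[frags.fasta FRAGS.FASTA tree.Aln frags.fasta.txt frags.fa].each do |name|
  verdict = forced_download?(name) ? 'forced download' : 'served as usual'
  puts "#{name}: #{verdict}"
end
```

Only names that end in the extension match, whatever the case, so frags.fasta.txt and frags.fa are left alone.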

Parsing Emboss Water output with Ruby

By kenglish

First, you will need to install the emboss suite on your computer:

sudo apt-get install emboss emboss-lib

If you don’t already have BioRuby installed, you will need that too:

sudo gem install bio --no-ri --no-rdoc

A first Ruby script that calls Emboss Water:

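require 'rubygems'
require 'bio'
# The two fasta files to align are passed on the command line.
test_filename = ARGV.shift
target_filename = ARGV.shift
# Run water and capture its text report.
result = Bio::EMBOSS.run('water', '-asequence', test_filename, '-bsequence', target_filename)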

Unfortunately, BioRuby has no report class for Emboss Water output, so you will have to parse it yourself. Here’s an example script that finds the percent similarity:

require 'rubygems'
require 'bio'
test_filename = ARGV.shift
target_filename = ARGV.shift
result = Bio::EMBOSS.run('water', '-asequence', test_filename, '-bsequence', target_filename)
# result now has the text output of water...
# Here's an example of looping through each line of the result to get the similarity:
 
test_seq = ""
target_seq = ""
similarity = ''
 
 
result.split("\n").each do | line |
  # A new alignment block begins: print the previous alignment's result (if any) and reset
  if line =~ /^# Aligned_sequences/
    puts "Seq '#{test_seq}' has similarity to Seq '#{target_seq}' of #{similarity}"  unless (test_seq == "" ) && (target_seq == "")
    test_seq = ""
    target_seq = ""
  end
  # Get sequence numbers 
  if line =~ /^# (\d+): (\d+)/
     test_seq  = $2 if $1 == '1'
     target_seq = $2 if $1 == '2'
  end
  # parse similarity
  if line =~ /^# Similarity:.*\((.*)%\)/
    similarity  = $1
  end
end
 
puts "Seq '#{test_seq}' has similarity to Seq '#{target_seq}' of #{similarity}"

Save this in a file called water.rb and run it against frags1.fasta and frags.fasta. It will print one line per alignment:

$ ruby water.rb fastas/frags1.fasta frags.fasta 
Seq '1' has similarity to Seq '1' of 100.0
Seq '1' has similarity to Seq '2' of 96.6
Seq '1' has similarity to Seq '3' of 64.3
Seq '1' has similarity to Seq '4' of 97.9
Seq '1' has similarity to Seq '5' of 96.9
Seq '1' has similarity to Seq '6' of 94.1
Seq '1' has similarity to Seq '7' of 62.5
Seq '1' has similarity to Seq '8' of 61.1
Seq '1' has similarity to Seq '9' of 62.5
Seq '1' has similarity to Seq '10' of 57.1
Seq '1' has similarity to Seq '11' of 57.4
Seq '1' has similarity to Seq '12' of 97.8
Seq '1' has similarity to Seq '13' of 50.0
Seq '1' has similarity to Seq '14' of 62.5
Seq '1' has similarity to Seq '15' of 97.9
Seq '1' has similarity to Seq '16' of 62.5
Seq '1' has similarity to Seq '17' of 59.1
Seq '1' has similarity to Seq '18' of 55.9
Seq '1' has similarity to Seq '19' of 61.9
Seq '1' has similarity to Seq '20' of 60.0
Seq '1' has similarity to Seq '21' of 56.4
Seq '1' has similarity to Seq '22' of 56.2
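
If you would rather collect the results than print them as you go, the same line matching can be wrapped in a method that returns one record per alignment. This is only a sketch: parse_water_similarities is a hypothetical name, the sample text is hand-written to imitate the header lines the regexes above look for (it is not real water output), and unlike the script above it accepts any non-blank sequence name rather than digits only.

```ruby
# Collect one record per alignment from the text that water produces.
# Returns an array of hashes with :test_seq, :target_seq and :similarity.
def parse_water_similarities(text)
  records = []
  current = nil
  text.each_line do |line|
    case line
    when /^# Aligned_sequences/
      # A new alignment header starts, so begin a fresh record.
      current = {}
      records << current
    when /^# (\d+): (\S+)/
      next unless current
      current[:test_seq] = $2 if $1 == '1'
      current[:target_seq] = $2 if $1 == '2'
    when /^# Similarity:.*\(\s*([\d.]+)%\)/
      current[:similarity] = $1.to_f if current
    end
  end
  records
end

# Hand-written sample in the shape the regexes expect (an assumption).
sample = <<REPORT
# Aligned_sequences: 2
# 1: 1
# 2: 2
# Similarity:    28/29 (96.6%)
REPORT

parse_water_similarities(sample).each do |r|
  puts "Seq '#{r[:test_seq]}' has similarity to Seq '#{r[:target_seq]}' of #{r[:similarity]}"
end
```

Returning records instead of printing makes it easy to sort the alignments or filter them by a similarity cutoff afterwards.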

Water is the worst name for a program, EVER, because it is impossible to Google…

Category: Bioinformatics, Programming. Date: November 20th, 2009

blast notes

By kenglish

Create db from fasta file:

formatdb -p F -i EST_Clade_A.fasta -n EST_Clade_A

This will create three files: EST_Clade_A.nhr, EST_Clade_A.nin, and EST_Clade_A.nsq. If you omit the -n option, it will create EST_Clade_A.fasta.nhr, EST_Clade_A.fasta.nin, and EST_Clade_A.fasta.nsq instead.

Blast example:

blastall -p blastn -i EST_Clade_C_1.fasta -d EST_Clade_A  -e 25
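
The file naming described in these notes can be written down as a tiny Ruby helper, which is handy if a script needs to check that the database exists before calling blastall. The method name formatdb_nucleotide_files is hypothetical, and the sketch only covers the three-file nucleotide case (-p F) described in this note:

```ruby
# Predict the files formatdb writes for a nucleotide database (-p F).
# db_name corresponds to the -n option; without it the input name is the base.
def formatdb_nucleotide_files(input_fasta, db_name = nil)
  base = db_name || input_fasta
  %w[nhr nin nsq].map { |ext| "#{base}.#{ext}" }
end

puts formatdb_nucleotide_files('EST_Clade_A.fasta', 'EST_Clade_A')
puts formatdb_nucleotide_files('EST_Clade_A.fasta')
```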

Category: Bioinformatics. Date: March 23rd, 2009