Force download of fasta files and aln files with Apache

By kenglish

If you are serving fasta files or alignment files on your server, you may want to force users to download them instead of previewing them in the browser. My application would return the fasta files as Content-Type text/plain. I wanted to force it to application/x-fasta and force download. This is accomplished rather easily in Apache with the following directive:

  <FilesMatch "\.(?i:fasta)$">
    ForceType application/x-fasta
    Header set Content-Disposition attachment
  </FilesMatch>
  <FilesMatch "\.(?i:aln)$">
    ForceType application/x-aln
    Header set Content-Disposition attachment
  </FilesMatch>

You will have to enable the Apache module “mod_headers” for the Header directive to work.

Best Music of 2008: My picks

By kenglish

1) R.E.M. : Accelerate

The three remaining veteran musicians of R.E.M. returned to the sound of Lifes Rich Pageant and Document to breathe life back into a band that was one of the forebears of alternative music in the 80s and 90s. At only 34 minutes, this record is perfectly paced. The majority of the songs are loud, distorted garage rockers. “Man-Sized Wreath” and “Supernatural Superserious” have that pop feel that could only be R.E.M. The title track and “Sing for the Submarine” explore darker territory while “Houston” and “Until the Day Is Done” show that they are still masters of acoustic rock. The highlight of the album for me is “Mr. Richards”, a poetic, political critique of the Bush Administration decorated with great melodies and metaphors. Like “Exhuming McCarthy”, “Disturbance at the Heron House” and “Fall on Me”, that song will still sound good in 10-15 years.

2) Air France: No Way Down

I discovered this album by accident. This group from Sweden combines ambient sounds, dance beats and symphonic arrangements with just enough vocals to keep it out of the instrumental category. Another short record, this one is only 22 minutes, and the songs blend into one another. The best song is “No Excuses.”

3) Bob Dylan: Tell Tale Signs

A co-worker told me that PBS was making available a streaming version of this Dylan album when it was released. I listened to it online for about 2 weeks and fell in love. I’m not one of those who thinks that Love and Theft and Modern Times are the awesome masterpieces that critics and other fans declare. However, this release is a great sample of his last 15 years of material. It makes you wonder how some of the versions of these songs never ended up on albums. “Born in Time” in particular mesmerizes the ears here, whereas the original version is awful. Other standouts include “Tell Ol’ Bill”, “Someday Baby” & “Dreamin’ of You.”

4) Torche: Meanderthal

This music isn’t for everyone. It’s pretty heavy but it’s the perfect combination of melodic and hard for me. “Grenades” feels like it could be the anthem of a new generation, “Sundown” is a Jawbox-ish slow epic, and the short instrumentals (“Triumph of Venus”, “Little Champion”), like every song on the album, let the band showcase their technical chops. Only the last 3 songs are longer than 3:30. Watch out for the long octave solo in “Fat Waves.”

5) Cervantes: Making Friends and Enemies

Perhaps this decade’s most underrated act in San Francisco, Cervantes (formerly Dumbwaiter) has undergone a number of personnel changes over the years, but the core members and songwriters remain to help the band reinvigorate and reinvent themselves each time. This album represents the pinnacle of their effort. The guitar work, the angst-driven vocals, the creative song structures and the hat-tips to their influences forge this record. This is one album that should be in your collection.

Category: Music | 1 Comment | January 4th, 2010

Using Ruby & Hpricot to find lowest mortgage rate in Hawaii

By kenglish

Each week the Honolulu Board of Realtors publishes a report of Hawaii Mortgage Rates. Finding the lowest rate for your category is difficult. A non-programming solution would be to copy it into Excel, delete all the rows that you don’t need and then sort by the rate column. This takes too much time, so I wrote a Ruby script that parses this data. This is also a demonstration of how to use the Ruby tool Hpricot, an HTML parser. You will need to install the Hpricot gem for this to work.

require 'rubygems'
require 'hpricot'
require 'open-uri'
 
#term = '1-YR ARM'
#term = '30-YR Fixed'
term = '15-YR Fixed'
 
doc = Hpricot.parse(open("http://www.hicentral.com/MortgageRates.asp")) 
 
rates = [] 
lender_name = ""
 
(doc/"table"/"table"/"tr").each do | row |
   arr =[]
   (row/"td").each do | cell|  
     arr << cell.inner_html() 
   end
   if arr[1] =~ /15-YR Fixed/
      lender_name = arr[0]
      arr.delete(lender_name) 
      lender_name.sub!('<br />',' - ')
   end
   next unless arr[0] =~ /#{term}/
   lender_data ={} 
   lender_data[:lender_name] = lender_name    
   lender_data[:term] = arr[0] 
   lender_data[:apr] = arr[3].to_f
   rates << lender_data
end
 
5.times do | rank |
  puts "Lowest Rate ##{rank+1}"
  row = rates.min{|a,b| a[:apr] <=> b[:apr] } 
  puts "  Name: #{row[:lender_name]}"  
  puts "  Term: #{row[:term]}" 
  puts "  APR: #{row[:apr]}"  
  rates.delete(row)
end

Here’s what’s going on:

  • On line 7 you will notice that I am interested in the 15-year mortgage rate. Change the value of term to get the report for the term you want.
  • On line 9, the program downloads the latest rates from the hicentral.com website and parses the page, returning an Hpricot doc object.
  • From line 14 to 30, the program walks each row in the mortgage rate table. The logic is custom to this table, which is unusual because the lender name appears in the first cell only on the first row (15-YR Fixed) for each lender. To accommodate this, we match the row that has “15-YR Fixed” in the second position, remember the lender name and delete it from the array (lines 19-23). We then assign the data to our summary data structure (lines 25-29).
  • Finally, we show the 5 lowest mortgage rates (lines 32-39). To do this we use the Ruby min method (line 34). We delete the current minimum so it is not counted in the next loop iteration (line 38).
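If you want to play with the row logic without hitting the website, here is a minimal sketch that runs the same idea over hard-coded cells. The lender names, the numbers and the column order in ROWS are made up for illustration, and lowest_rates is a hypothetical helper, not part of Hpricot. It also swaps the repeated min-and-delete loop for sort_by and first, which gives the same ranking in one pass:

```ruby
# Hypothetical cell data standing in for the parsed table; the lender names,
# column order and numbers are invented for this sketch.
ROWS = [
  ["Lender A", "15-YR Fixed", "4.250", "1.0", "4.61"],
  ["30-YR Fixed", "4.875", "1.0", "5.02"],
  ["Lender B", "15-YR Fixed", "4.375", "0.5", "4.48"],
  ["30-YR Fixed", "5.000", "0.5", "5.10"],
  ["Lender C", "15-YR Fixed", "4.125", "2.0", "4.73"]
]

# Returns the `count` entries with the lowest APR for the given term.
def lowest_rates(rows, term, count)
  lender_name = ""
  rates = []
  rows.each do |cells|
    cells = cells.dup
    # Only the first row of each lender carries the lender name up front.
    lender_name = cells.shift if cells[1] == "15-YR Fixed"
    next unless cells[0] == term
    rates << { :lender_name => lender_name, :term => cells[0], :apr => cells[3].to_f }
  end
  rates.sort_by { |rate| rate[:apr] }.first(count)
end

lowest_rates(ROWS, "15-YR Fixed", 2).each_with_index do |row, rank|
  puts "Lowest Rate ##{rank + 1}: #{row[:lender_name]} (#{row[:apr]})"
end
```

Because first(count) simply returns fewer entries when the list is short, this version does not fail when fewer than 5 lenders match the term.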

Thomas Lecklider has a great tutorial on how to use Hpricot called Using Hpricot to Traverse and Parse HTML.

Your comments are welcome. Give some refactoring advice if you like. I have a WordPress plugin for the pre tag, so to write Ruby code in a comment just do:

<pre lang="ruby">
puts "YEAH"
</pre>

Category: Programming | 1 Comment | December 14th, 2009

Best Music of 2009: My picks

By kenglish

1) Neko Case: Middle Cyclone

Fox Confessor Brings the Flood seemed a little off and I didn’t warm up to it the way I did to Blacklisted and Furnace Room Lullaby, but this record was on rotation in my music player for countless weeks. Radio-friendly tracks like “This Tornado Loves You”, “People Got A Lotta Nerve” and “Red Tide” have a sleek, polished feel that could someday make them Alt-Country classics, but the real gems here are the acoustic, non-traditional songs like “Polar Nettles”, “Fever”, “Vengeance Is Sleeping” and the title track. These numbers weave in lovely harmonies, melodies and lyrics that are nothing less than mesmerizing.

2) Islands: Vapours

This album actually surprised me. Islands, a Montreal-based band, followed their remarkable 2005 debut Return to the Sea with a disappointing, overproduced and swollen album, Arm’s Way. Vapours doesn’t revert to the sound of Return to the Sea. It’s more like a new beginning. It pulls the best sounds from the 80s and lays them on top of some very cleverly crafted indie-synth-pop tunes. This is fun music and each time I listen to this record I marvel at how well these songs are arranged. “Tender Torture”, “Devout” and “Switched On” are my favorites.

3) U2: No Line on the Horizon

I was talking to a guy at a campout this summer who described U2 as the “Beatles of our generation.” This pisses music critics off. When Bono said at the beginning of Rattle & Hum, “This song Charles Manson stole from the Beatles and now we’re stealing it back”, wasn’t he implying that U2 was as good as or better than the Beatles and thus claiming their spot as the top rock act of all time? Oh, the nerve. As cliché as it sounds, U2 does know how to reinvent itself very well, and on this record they did it again while maintaining their core unique elements: The Edge’s infinite guitar notes, Larry Mullen Jr.’s creative rock drumming and Bono’s soulful rock vocals. Great songs: “Breathe”, “Magnificent”, “Fez”.

4) Morrissey : Years of Refusal

It’s not that I’m obsessed with my 7th grade heroes, it’s just that they are still putting out great music. Morrissey returns from the melodrama of Ringleader of the Tormentors with an album that has much heavier rock influences. His new band is tight, full of punk/pop hooks, raw and energetic. Of course, the lyrics are great too. Only one voice can successfully belt out lines like “It’s not your birthday anymore, did you really think we meant all of those syrupy, sentimental things that we said yesterday”.

5) George Jones Musicor Recordings Box Sets (1965-1971)

While this isn’t “new” music, these two box sets were released this year and are must-haves for any fan of old country. They are appropriately titled Walk Through This World with Me and A Good Year for the Roses after two of his biggest hits of the era, possibly of his career. You’ll find plenty of saloon-jumping honky tonks and soothing “lost my wife” ballads (literally, one is called “When The Wife Runs Off”). All of these are decorated with beautiful piano parts, thick female background harmonies, whiny slide guitars and one of the greatest voices in country music. Due to pressure from his manager and a changing country scene, this was a very productive period for George. The sheer size of these 2 box sets is a tribute to that: 10 discs, 320 songs and 12 hours and 38 minutes of music. Not every song is a gem, but that’s why there’s a next button on your mp3 player. However, there are songs here that you can’t miss: “Love Bug”, “I Cried Myself Awake”, “No Blues is Good News”, and one of my all-time favorites, “You’re Still On My Mind.” Country gets a bad rap from a lot of folks these days, but this box set is simple, American music at its finest.

Category: Music | 2 Comments | December 4th, 2009

A SQL Server file ‘basename’ function

By kenglish

Given a file path: /var/www/html/index.html
Returns: index.html

Pretty common, here’s how you do it:

Perl:

use File::Basename; 
$fullname = "/usr/local/src/perl-5.6.1.tar.gz"; 
$file = basename($fullname);

PHP:

$path = "/home/httpd/html/index.php";
$file = basename($path);

Ruby:

path = "/usr/lib/ruby/site_ruby/1.8/rubygems/version.rb"
File.basename path

Python:

import os.path
path =  "/usr/local/bin/python"
os.path.basename(path)

T-SQL (MS SQL Server):

CREATE FUNCTION [dbo].[fn_file_basename]
(
    -- Full path to a file, using backslash separators
    @file_path VARCHAR(255)
)
RETURNS VARCHAR(255)
AS
BEGIN
    DECLARE @file_basename VARCHAR(255)
    IF CHARINDEX('\', @file_path) != 0
        SET @file_basename = REVERSE(SUBSTRING(REVERSE(@file_path), 1, CHARINDEX('\', REVERSE(@file_path)) - 1))
    ELSE
        SET @file_basename = @file_path

    RETURN @file_basename
END

My brilliant co-worker figured this out. The magic is done on line 11:

reverse(SUBSTRING(reverse(@file_path), 1, charindex('\', reverse(@file_path))-1))

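It reverses the string, finds the first occurrence of the character ‘\’, takes the substring up to that character and then reverses it again. Because the string is reversed, that first ‘\’ is really the last separator in the original path. How elegant!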
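To make the trick easier to follow, here is the same sequence of steps spelled out in Ruby. This is only a sketch for illustration: reverse_basename is a made-up name, and in real Ruby code you would normally reach for File.basename instead.

```ruby
# Mirrors the T-SQL expression step by step; reverse_basename is a
# hypothetical helper written only to illustrate the reverse trick.
def reverse_basename(file_path, separator = "\\")
  # No separator at all: the whole string is already the basename.
  return file_path unless file_path.include?(separator)

  reversed = file_path.reverse
  # The first separator in the reversed string is the last one in the original.
  cut = reversed.index(separator)
  # Take everything before that separator and flip it back around.
  reversed[0, cut].reverse
end

puts reverse_basename('C:\inetpub\wwwroot\index.html')
puts reverse_basename('index.html')
```

Note that Ruby's index is 0-based while CHARINDEX is 1-based, which is why the T-SQL version subtracts 1 and this one does not.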

Category: Programming | No Comments | November 24th, 2009

Remove Invalid XML Characters with an SSIS Visual Basic Script

By kenglish

My boss is forcing us to use Microsoft SQL Server Integration Services (SSIS) for our ETL process. I Googled around for a bit and could not find a good example of how to do this simple task: open an XML file, read the text, replace any invalid characters and write it back out to the same file. My VB is very rusty but this works pretty well. The ReadVariable portion is specific to SSIS but the rest should be generic. Hopefully, the next poor soul who needs to do this will be able to find this blog entry!

 
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.Text.RegularExpressions
Imports System.Object
 
 
Public Class ScriptMain
 
    Public Sub Main()
        Dts.TaskResult = Dts.Results.Success
        Dim strPath, strXML As String
        strPath = CStr(ReadVariable("User::strFileName"))
        strXML = FileIO.FileSystem.ReadAllText(strPath)
 
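        ' Control characters that are not allowed in XML 1.0 (tab, LF and CR are kept)
        Dim rgx As Regex = New Regex("[\x00-\x08\x0B-\x0C\x0E-\x1F]", RegexOptions.None)
        ' Replace returns a new string, so the result has to be assigned back
        strXML = rgx.Replace(strXML, " ")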
        FileIO.FileSystem.WriteAllText(strPath, strXML, False)
    End Sub
 
    Private Function ReadVariable(ByVal varName As String) As Object
        Dim result As Object
 
        Try
            Dim vars As Variables
            Dts.VariableDispenser.LockForRead(varName)
            Dts.VariableDispenser.GetVariables(vars)
            Try
                result = vars(varName).Value
            Catch ex As Exception
                Throw ex
            Finally
                vars.Unlock()
            End Try
        Catch ex As Exception
            Throw ex
        End Try
 
        Return result
    End Function
 
End Class
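For comparison, the generic part (everything except ReadVariable) is only a few lines in Ruby. This is a rough sketch under the assumption that the file is valid text in your default encoding; INVALID_XML_CHARS, strip_invalid_xml_chars and clean_xml_file are names I made up for the example.

```ruby
# Control characters that XML 1.0 does not allow: everything below 0x20
# except tab (0x09), line feed (0x0A) and carriage return (0x0D).
INVALID_XML_CHARS = /[\x00-\x08\x0B\x0C\x0E-\x1F]/

# Returns a copy of the text with each invalid character replaced by a space.
def strip_invalid_xml_chars(text)
  text.gsub(INVALID_XML_CHARS, " ")
end

# Reads the file, cleans it and writes the result back to the same path.
def clean_xml_file(path)
  xml = File.read(path)
  File.open(path, "w") { |file| file.write(strip_invalid_xml_chars(xml)) }
end

puts strip_invalid_xml_chars("<name>bad\x00byte</name>")
```

As in the VB version, the important detail is that the replace call returns a new string rather than changing the original in place.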

Category: Programming | No Comments | November 24th, 2009

Install NetBeans jVi plugin

By kenglish

NetBeans is the only IDE with a great vi key binding plugin. This was the sole reason I switched to NetBeans as my Ruby/Rails IDE of choice this year. There are 2 vi plugins for Eclipse: one you have to pay for and the other relies on gVim. The NetBeans vi plugin is called jVi and can be found at http://jvi.sourceforge.net. There are a few caveats and extra configuration options that you need to set.

  1. Download the latest jVi release. Unzip the file in your home directory. This will create the directory nbvi-1.2.6.
  2. In the NetBeans menu bar, select Tools | Plugins. Click on the Downloaded tab.
  3. Press the “Add Plugin” button. Browse to the nbvi-1.2.6 directory and select the two files: org-netbeans-modules-jvi.nbm and com-raelity-jvi.nbm. You should now have 2 plugins available in the downloaded list: jVi Key Bindings and jVi Core.
  4. Click Install and then click through the installation process. NetBeans will need to restart.
  5. After NetBeans has restarted, from the menu bar, select Tools | Options. Click on the last tab which should be jVi Config.
  6. Select the “Buffer Modifications” panel in the jVi Config screen.
  7. Make sure the ‘expandtab’ value is checked.
  8. Change the value of ‘shiftwidth’ to 2.
  9. Change the value of ‘tabstop’ to 2.

The last three settings (expandtab, shiftwidth and tabstop) are pretty important. If you don’t set them, jVi will insert tabs into your Ruby files. This could cause your co-workers to complain about you ruining the formatting in the project.
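If you suspect tabs have already crept in, a few lines of Ruby can point at the offending lines. This is just a sketch; tab_indented_lines and SAMPLE_SOURCE are names invented for the example, and you would feed it File.read of a real file instead of the sample string.

```ruby
# Returns the 1-based numbers of the lines whose indentation contains a tab.
def tab_indented_lines(source)
  numbers = []
  source.each_line.with_index(1) do |line, number|
    # Optional leading spaces followed by a tab counts as tab indentation.
    numbers << number if line =~ /\A *\t/
  end
  numbers
end

# Made-up sample: only the second line is indented with a tab.
SAMPLE_SOURCE = "class Person\n\tdef hug\n  end\nend\n"
puts tab_indented_lines(SAMPLE_SOURCE).inspect
```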

Congratulations, jVi is now set up in NetBeans. Enjoy.

This blog entry was written for NetBeans 6.7 and with jVi version 1.2.6.

Category: Programming | 3 Comments | November 22nd, 2009

Parsing Emboss Water output with Ruby

By kenglish

First, you will need to install the emboss suite on your computer:

sudo apt-get install emboss emboss-lib

If you don’t already have BioRuby installed, you will need that too:

sudo gem install bio --no-ri --no-rdoc

Your first Ruby script calling Emboss Water:

require 'rubygems'
require 'bio'
test_filename = ARGV.shift
target_filename = ARGV.shift
result = Bio::EMBOSS.run('water', '-asequence', test_filename, '-bsequence', target_filename)

Unfortunately, there is no nice report result class in BioRuby for Emboss Water, so you will have to parse the output yourself. Here’s an example script that finds percent similarity:

require 'rubygems'
require 'bio'
test_filename = ARGV.shift
target_filename = ARGV.shift
result = Bio::EMBOSS.run('water', '-asequence', test_filename, '-bsequence', target_filename)
# result now has the text output of water...
# Here's an example of looping through each line of the result to get the similarity:
 
test_seq = ""
target_seq = ""
similarity = ''
 
 
result.split("\n").each do | line |
  # This means a new alignment block is starting, so report the previous one
  if line =~ /^# Aligned_sequences/
    puts "Seq '#{test_seq}' has similarity to Seq '#{target_seq}' of #{similarity}"  unless (test_seq == "" ) && (target_seq == "")
    test_seq = ""
    target_seq = ""
  end
  # Get sequence numbers 
  if line =~ /^# (\d+): (\d+)/
     test_seq  = $2 if $1 == '1'
     target_seq = $2 if $1 == '2'
  end
  # parse similarity
  if line =~ /^# Similarity:.*\((.*)%\)/
    similarity  = $1
  end
end
 
puts "Seq '#{test_seq}' has similarity to Seq '#{target_seq}' of #{similarity}"

Place this in a file called water.rb and run it with frags1.fasta and frags.fasta, and it will output this:

$ ruby water.rb fastas/frags1.fasta frags.fasta 
Seq '1' has similarity to Seq '1' of 100.0
Seq '1' has similarity to Seq '2' of 96.6
Seq '1' has similarity to Seq '3' of 64.3
Seq '1' has similarity to Seq '4' of 97.9
Seq '1' has similarity to Seq '5' of 96.9
Seq '1' has similarity to Seq '6' of 94.1
Seq '1' has similarity to Seq '7' of 62.5
Seq '1' has similarity to Seq '8' of 61.1
Seq '1' has similarity to Seq '9' of 62.5
Seq '1' has similarity to Seq '10' of 57.1
Seq '1' has similarity to Seq '11' of 57.4
Seq '1' has similarity to Seq '12' of 97.8
Seq '1' has similarity to Seq '13' of 50.0
Seq '1' has similarity to Seq '14' of 62.5
Seq '1' has similarity to Seq '15' of 97.9
Seq '1' has similarity to Seq '16' of 62.5
Seq '1' has similarity to Seq '17' of 59.1
Seq '1' has similarity to Seq '18' of 55.9
Seq '1' has similarity to Seq '19' of 61.9
Seq '1' has similarity to Seq '20' of 60.0
Seq '1' has similarity to Seq '21' of 56.4
Seq '1' has similarity to Seq '22' of 56.2
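If you would rather collect the results than print them as you go, the same line matching can fill an array of hashes. The sketch below runs on a hand-typed imitation of the Water report header (SAMPLE_REPORT is not real program output, and parse_water_report is a made-up name), so it works without Emboss or BioRuby installed:

```ruby
# Hand-typed imitation of two Water report headers, for illustration only.
SAMPLE_REPORT = <<'REPORT'
# Aligned_sequences: 2
# 1: 1
# 2: 2
# Identity:      28/29 (96.6%)
# Similarity:    28/29 (96.6%)
# Aligned_sequences: 2
# 1: 1
# 2: 3
# Identity:       9/14 (64.3%)
# Similarity:     9/14 (64.3%)
REPORT

# Returns one hash per alignment block found in the report text.
def parse_water_report(text)
  alignments = []
  current = nil
  text.each_line do |line|
    case line
    when /^# Aligned_sequences/
      # A new alignment block starts here.
      current = {}
      alignments << current
    when /^# ([12]): (\S+)/
      next if current.nil?
      key = $1 == "1" ? :test_seq : :target_seq
      current[key] = $2
    when /^# Similarity:.*\(\s*([\d.]+)%\)/
      current[:similarity] = $1.to_f unless current.nil?
    end
  end
  alignments
end

parse_water_report(SAMPLE_REPORT).each do |a|
  puts "Seq '#{a[:test_seq]}' has similarity to Seq '#{a[:target_seq]}' of #{a[:similarity]}"
end
```

Starting a fresh hash at each “Aligned_sequences” line removes the need for the extra puts after the loop, and matching the name with \S+ means sequence names do not have to be numeric.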

Water is the worst name for a program, EVER, because it is impossible to Google…

Category: Bioinformatics, Programming | No Comments | November 20th, 2009

Defining Class methods in a Module

By kenglish

The code should speak for itself. Make sense?

module Loveable
  module ClassMethods
    def give_hug
    end
  end
  def self.included(base)
    base.extend(ClassMethods)
  end
end
 
class Person
  include Loveable
 
  give_hug
 
end

I’m fuzzy as to why a certain Rails genius would suggest it is better to do it this way (see line 7):

module Loveable
  module ClassMethods
    def give_hug
    end
  end
  def self.included(base)
    base.send :extend, ClassMethods
  end
end
 
class Person
  include Loveable
 
  give_hug
 
end
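For what it’s worth, a quick experiment suggests the two forms behave the same here. extend is a public method, so it can be called directly; the send idiom is more commonly seen with methods that were private at the time, such as include in older Rubies. A minimal sketch, where Huggable, Robot and the returned strings are made up for the comparison:

```ruby
module Loveable
  module ClassMethods
    def give_hug
      "#{name} gives a hug"
    end
  end

  # Direct call: works because extend is public.
  def self.included(base)
    base.extend(ClassMethods)
  end
end

# Hypothetical twin of Loveable that uses the send form instead.
module Huggable
  def self.included(base)
    base.send :extend, Loveable::ClassMethods
  end
end

class Person
  include Loveable
end

class Robot
  include Huggable
end

puts Person.give_hug
puts Robot.give_hug
puts Object.public_method_defined?(:extend)
```

Both classes end up with give_hug as a class method, so unless there is a reason I am missing, the plain base.extend call seems like the clearer choice.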

Feel free to comment…

Category: Programming | No Comments | November 20th, 2009

Install thoughtbot shoulda and rcov (the right way)

By kenglish

Install rcov & ruby-prof (rcov-0.9.6 & ruby-prof-0.7.3 at the time of this writing).

sudo gem install ruby-prof rcov --no-ri --no-rdoc

Update your test/test_helper.rb, add:

require 'shoulda/rails'

Install Thoughtbot’s Shoulda gem (shoulda-2.10.2 at the time of this writing). Make sure you have added GemCutter as one of your RubyGems sources.

sudo gem install shoulda --no-ri --no-rdoc

Edit your application’s main Rakefile and add:

require(File.join(File.dirname(__FILE__), 'config', 'boot'))
require 'rake'
require 'rake/testtask'
require 'rake/rdoctask'
require 'tasks/rails'
require 'shoulda/tasks'
 
def run_coverage(files)
  rm_rf "coverage"   # coverage is a directory, so rm_f alone would leave it in place
  rm_f "coverage.data"
 
  # turn the files we want to run into a  string
  if files.length == 0
    puts "No files were specified for testing"
    return
  end
 
  files = files.join(" ")
 
  if RUBY_PLATFORM =~ /darwin/
    exclude = '--exclude "gems/*"'
  else
    exclude = '--exclude "rubygems/*"'
  end
 
  rcov = "rcov --rails -Ilib:test --sort coverage --text-report #{exclude}  --aggregate coverage.data"
  cmd = "#{rcov} #{files}"
  puts cmd
  sh cmd
end
namespace :test do
 
  desc "Measures unit, functional, and integration test coverage"
  task :coverage do
    run_coverage Dir["test/**/*.rb"]
  end
 
  namespace :coverage do
    desc "Runs coverage on unit tests"
    task :units do
      run_coverage Dir["test/unit/**/*.rb"]
    end
    desc "Runs coverage on functional tests"
    task :functionals do
      run_coverage Dir["test/functional/**/*.rb"]
    end
    desc "Runs coverage on integration tests"
    task :integration do
      run_coverage Dir["test/integration/**/*.rb"]
    end
  end
end

Check out your new coverage rake tasks:

rake -T | grep cov

This should show you:

rake test:coverage                        # Measures unit, functional, and integration test coverage
rake test:coverage:functionals            # Runs coverage on functional tests
rake test:coverage:integration            # Runs coverage on integration tests
rake test:coverage:units                  # Runs coverage on unit tests

Category: Programming | 1 Comment | November 19th, 2009