Sean’s Obsessions

Sean Walberg’s blog

Practical Packet Analysis, 2ed

No Starch Press sent me Practical Packet Analysis, 2ed a little while back. At about 250 pages it’s a lot smaller than Chappell’s “Wireshark Network Analysis”, and more appropriate for someone who wants to get up and running quickly rather than going for a certification.

The book assumes no knowledge of Wireshark and only a basic understanding of networking. More than half the book is devoted to teaching the Wireshark interface and how the popular protocols work, so if you don’t know anything about DNS recursion, for example, you’ll get a taste of it here along with what it looks like in Wireshark. The first half covers everything from filtering inside Wireshark to the inner workings of the common protocols.

The second half of the book follows fairly typical examples, such as decoding HTTP streams and troubleshooting the causes of network congestion. Of special interest is Chapter 10, which is about using Wireshark for security analysis. This chapter is merely an introduction to a huge topic, but the author has chosen some interesting examples such as an ARP poisoning attack and analysis of a trojan horse.

One theme the author continually comes back to is appropriate placement of the analysis tool. The early chapters discuss the matter in theory, and every example in the second half has some text that analyzes the options for where to use Wireshark and where the best spot is.

Some of the highlights of the book:

  • A great discussion of TCP congestion and analysis of a congestion scenario
  • A good tradeoff between depth and breadth. This is a “getting started” guide.
  • Uses many of the features of Wireshark in a practical context
  • A good, though basic, chapter about wireless sniffing

Some of the downsides:

  • No IPv6 (other than a brief mention of a host filter)
  • Would have liked to see more use of IO graphs and TCP stream graphs, especially when talking about congestion.

On the whole, a great book for the IT administrator who wants to get started quickly with Wireshark. Cover price is $49.95 US; Amazon.com is showing it for $30, which is a bargain.

Nagios Paging Using Twilio

I use Nagios to monitor the health of a few servers, and would like to be paged if something goes wrong.

When I set it up a couple of years ago, I used SMS Gateway, which charged $10 for 100 SMSes. I was able to page with a simple curl command. However, I’d get the odd page that wouldn’t go through, and while the support was very responsive, it wasn’t very reassuring.

Now that I’ve depleted my 100 pages, I figured I’d move over to Twilio because they’re pretty slick, and the reliability has to be better.

Some Nagios code, first:

define contactgroup{
        contactgroup_name       important
        alias                   Sean Buzz
        members                 sean, page
}
define contact{
        contact_name                    page
        alias                           page
        service_notification_commands   notify-by-page
        host_notification_commands      host-notify-by-page
        service_notification_period 24x7
        host_notification_period 24x7
        service_notification_options w,u,c,r
        host_notification_options d,u,r
        pager                           nobody@localhost
}
define service{
        use                             local-service         ; Name of service template to use
        host_name                       localhost
        service_description             smallpayroll.ca
        contact_groups                  important
        check_command                   check_http_string!smallpayroll.ca!Easy to use
        }

The first stanza creates a contact group called “important” that notifies both the sean and page contacts. The second stanza defines the “page” contact, which uses the “notify-by-page” and “host-notify-by-page” commands to do the actual paging. The final stanza is an example of a service that would trigger a page: if the check_http_string check fails, the “important” contact group is notified.
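The check_http_string command referenced above isn’t shown in the post and isn’t a stock Nagios command; a definition along these lines (my sketch, using the standard check_http plugin’s -s option, which expects a string in the page content) would match how it’s called:

```
define command{
        command_name    check_http_string
        command_line    $USER1$/check_http -H $ARG1$ -s "$ARG2$"
}
```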

The code to page is as follows:

define command {
        command_name    notify-by-page
        command_line    curl --data-urlencode "From=YOURTWILIONUMBER" --data-urlencode "To=YOURCELL" --data-urlencode "Body=[Nagios] $NOTIFICATIONTYPE$ $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$" https://SID:TOKEN@api.twilio.com/2010-04-01/Accounts/SID/SMS/Messages >> /tmp/sms
}
define command {
        command_name    host-notify-by-page
        command_line    curl --data-urlencode "From=YOURTWILIONUMBER" --data-urlencode "To=YOURCELL" --data-urlencode "Body=[Nagios] $HOSTSTATE$ alert for $HOSTNAME$" https://SID:TOKEN@api.twilio.com/2010-04-01/Accounts/SID/SMS/Messages >> /tmp/sms
}

To get the SID and TOKEN (note there are two instances of the SID; the second is in the URL right after Accounts), go to your dashboard and look at the top:

To get a number click on Numbers then Buy a Number:

Then search for a number. It should be in the USA, as it looks like Canadian numbers don’t support SMS. You can verify this by clicking on “Buy”:

Buy the number for $1/month. You don’t have to set up any URLs with it if you’re doing outbound only.

YOURCELL should be obvious :) It could also be templated within Nagios.
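One way to template it (my assumption, not from the original setup): Nagios expands the $CONTACTPAGER$ macro to the pager field of the contact being notified, so the number can live in the contact definition rather than being hard-coded into the command:

```
define contact{
        contact_name    page
        pager           +15551234567    ; destination cell, picked up by $CONTACTPAGER$
}
# Then in the command definitions, replace the hard-coded number:
#   --data-urlencode "To=$CONTACTPAGER$"
```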

Good for Now Does Not Mean Good Forever

SmallPayroll.ca was my first big Rails project, and looking back at some of the code, it shows. One of the first things I did was the timesheet. The form has 21 input fields per employee, and the server then has to go through the database and figure out which days have been changed or deleted. So it’s doing a lot, and at the time I was still trying to figure out how both Ruby and Rails worked, so the code ended up being a mess.

But I was OK with that. If I were going to get anywhere with SmallPayroll, people had to be able to submit a timesheet. They didn’t care whether the server side code was efficient as long as it worked. And, as I was to find out, they didn’t seem to care if it was slow either. In order to build the rest of the app I had to have a timesheet, so I left my ugly, inefficient code in, along with the tests that exercised it, and got to building the rest of the application.

Between Rails Analyzer and New Relic I kept an eye on things. The timesheet did get worse as time passed. Now that SmallPayroll has become more successful and I can spend more time on it, I’ve come back to look at fixing this. But before I know if I’m doing a better job, I have to know how I’m doing now.

I found Advanced Performance Optimization of Rails Applications, which had a neat trick to capture the queries generated by a test.

Put this in test_helper.rb:

module ActiveSupport
  class BufferedLogger
    attr_reader :tracked_queries
    def tracking=(val)
      @tracked_queries = []
      @tracking = val
    end
    def debug_with_tracking(message)
      @tracked_queries << message if @tracking
      debug_without_tracking(message)
    end
    alias_method_chain :debug, :tracking
  end
end
class ActiveSupport::TestCase
  def track_queries(&block)
    RAILS_DEFAULT_LOGGER.tracking = true
    # Had to add this to get queries to be logged
    Rails.logger.level = ActiveSupport::BufferedLogger::DEBUG
    yield
    result = RAILS_DEFAULT_LOGGER.tracked_queries
    RAILS_DEFAULT_LOGGER.tracking = false
    Rails.logger.level = ActiveSupport::BufferedLogger::INFO
    result
  end
end

Then wrap your test inside a track_queries block:

def test_submit_timesheet
  visit "/timesheet?date=2011-07-10"
  10.upto(16) do |day|
    fill_in /.*_shift_REG_2011-07-#{day}_hours/, :with => 8
  end
  queries = track_queries do
    click_button "Update"
  end
  puts queries.inspect
end

After that it was a matter of making a performance test, copying over some of my functional tests that represented a case I was trying to optimize, and then doing some before/after comparisons.

Before
------
TimesheetTest#test_submit_timesheet (283 ms warmup)
        process_time: 707 ms
        memory: 18915.89 KB
        objects: 232124
Queries: ["User Load", "User Update", "Employer Load", "Employee Load", "Shift Load", "Shift Create", "Shift Load", "Shift Create",
 "Shift Load", "Shift Create", "Shift Load", "Shift Create", "Shift Load", "Shift Create", "Shift Load", "Shift Create", "Shift Load",
 "Shift Create", "Shift Load", "Shift Create", "Shift Load", "Shift Create", "Shift Load", "Shift Create", "Shift Load", "Shift Create",
 "Shift Load", "Shift Create", "Shift Load", "Shift Create", "Shift Load", "Shift Create", "Shift Load", "Shift Create", "Shift Load",
 "Shift Create", "Shift Load", "Shift Create", "Shift Load", "Shift Create", "Shift Load", "Shift Create", "Shift Load", "Shift Create",
 "Shift Load", "Shift Create", "Employee Load", "Shift Load", "Property Load"]
After
-----
TimesheetTest#test_submit_timesheet (243 ms warmup)
        process_time: 578 ms
        memory: 14788.10 KB
        objects: 204551
Queries: ["User Load", "User Update", "Employer Load", "Employee Load", "Shift Load", "Shift Create", "Shift Create", "Shift Create",
"Shift Create", "Shift Create", "Shift Create", "Shift Create", "Employee Load", "Shift Load", "Property Load"]

So it would seem I’ve been able to knock off some time and memory consumption, along with lots of queries, by optimizing my code. Since I had already written test cases, I was able to show that it worked the same as before.
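The query lists hint at the pattern behind the fix: the old code did a Shift Load before every Shift Create, one pair per day, while the new code loads the shifts once and checks them in memory. A minimal sketch of that pattern in plain Ruby (the hashes here stand in for the real ActiveRecord models, which aren’t shown in the post):

```ruby
# Shifts already in the database for this employee and pay period
# (stand-ins for ActiveRecord objects).
existing = [
  { :date => "2011-07-10", :hours => 8 },
  { :date => "2011-07-11", :hours => 8 },
]

# One "Shift Load" for the whole period, then O(1) in-memory lookups per day,
# instead of one SELECT per day of the timesheet.
by_date = existing.each_with_object({}) { |shift, index| index[shift[:date]] = shift }

days = ("10".."16").map { |d| "2011-07-#{d}" }
to_create = days.reject { |date| by_date.key?(date) }
puts to_create.inspect
```

The same number of inserts still happens, but the per-day round trip to check for an existing row is gone, which matches the shorter query list above.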

But I think the better point to make here is that I could have spent a lot of time trying to build these optimizations in from day 1 and detracted from building a good product. Instead, I deferred the hard work until the time that it mattered. And now that I have more Ruby and Rails experience, doing the optimization was much easier. Something that might have taken several evenings over the course of a couple of weeks was done in less than a day. And while I don’t follow TDD, having existing tests to start from made a huge difference.

Linode Review

This site is hosted on a Linode 768 VPS, and has been for a couple of years now, along with some other domains. I’ve previously hosted it at home and on a GoDaddy VPS, which didn’t end up being all that good, but I’m now very happy with Linode. I host a combination of PHP (mostly WordPress) and Ruby on Rails applications.

Over the years Linode has kept the price the same ($30/month for my plan) but has increased the disk and memory of its plans every year. When I started out, my plan had 18GB of disk space and around 512MB of RAM; now it has 30GB of disk and 768MB of RAM. So the value for money keeps getting better.

I’ve also set up Linode VPSes for a few people, including TwiTip.com and TopMMANews.com, and continue to assist in their management. Both are fairly heavy sites and also run on a Linode 768. TwiTip hit 11 Mbps of traffic when it was tweeted by Ashton Kutcher, and TopMMANews is a fairly active site.

I’ve found the service to be very reliable. At one point one of their data centres was having problems but they were fixed reasonably quickly and the company kept the customers updated.

You get a control panel that lets you see your CPU/disk/IO status, along with how much disk and network you’ve used. The screenshot below shows my system (you can see that I haven’t yet taken advantage of the 6GB of disk space they added to my account).

One feature I really like about the service is that you get free DNS hosting, and the interface is very simple (I mean “simple” as “does not get in your way”, not as in “stripped of features”). You can do AAAA, TXT, and SRV records, or control the whole thing through an API.

I can’t speak highly enough about Linode VPSes. If you’re looking for a VPS service they offer great value for money and a high service level. If you’re wondering about which size to buy, I’ve found the 768 to be a real workhorse. You can also upgrade/downgrade your plan with minimal downtime and no loss of data, so there’s little risk in picking the wrong plan.

Freshbooks/Heroku and Twilio APIs

I have been playing with the Freshbooks API and the Twilio API as part of a contest that Twilio is running. It’s a great excuse to try something I’ve been meaning to do for a while.

I ran into a few problems.

The freshbooks gem doesn’t work under Ruby 1.9.2, which I found out after deploying to Heroku and then trying locally under RVM. The error was:

NoMethodError in OutstandingController#index
undefined method `gsub!' for :invoice_id:Symbol

Someone made a compatible gem on GitHub; to use it, put the following in your Gemfile instead of gem “freshbooks”:

gem 'bcurren-freshbooks.rb',
  :require => 'freshbooks',
  :git => 'git://github.com/bcurren/freshbooks.rb'

There were a couple of differences I found:

  • Instead of using FreshBooks.setup to enter your credentials, use FreshBooks::Base.establish_connection
  • The original gem let you do FreshBooks::Invoice.send_by_email(id), the bcurren one makes it an instance method… FreshBooks::Invoice.get(id).send_by_email

Those were the only two changes I had to make.
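Put together, usage with the bcurren gem looks roughly like this (the account URL, API token, and invoice id are placeholders, not values from the post):

```ruby
require 'freshbooks'

# Connect - replaces FreshBooks.setup from the original gem
FreshBooks::Base.establish_connection('yourcompany.freshbooks.com', 'YOURAPITOKEN')

# send_by_email is an instance method in the bcurren gem
invoice = FreshBooks::Invoice.get(some_invoice_id)
invoice.send_by_email
```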

The second problem was with the Twilio gem. I got SSL errors if I had to make calls to Twilio (as opposed to incoming requests from the Twilio API).

OpenSSL::SSL::SSLError (SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed):
  app/controllers/phone_controller.rb:8:in `call'

The reason for this problem is that the OpenSSL library is making a call to an HTTPS resource, but it has no way to verify the certificate. There are two ways to fix this problem:

  1. Tell OpenSSL not to verify the certificate
  2. Give OpenSSL the proper Certification Authority (CA) certificates to verify the certificate

I’ll confess that my normal approach here is #1, but this time I felt like doing it properly… Since the Twilio module includes HTTParty, you can call HTTParty methods right on the Twilio module. So add an initializer inside config/initializers, such as twilio.rb:

Twilio::ssl_ca_file File.join(Rails.root, "config/cacert.pem")

Then, all you have to do is grab cacert.pem from somewhere else. Many other gems include it, so if you look for the file on your hard drive you should find it. FWIW, the NewRelic one doesn’t work; it only includes the certificates they need. I ended up finding mine in ActiveMerchant.
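If you’re not sure which gem ships one, a quick search of your gem paths will turn up candidates (a throwaway helper I’m sketching here, not part of any gem):

```ruby
require 'rubygems'

# Search every installed gem directory for a bundled cacert.pem.
# Returns an array of absolute paths (possibly empty).
def cacert_candidates
  Gem.path.flat_map do |base|
    Dir.glob(File.join(base, "gems", "**", "cacert.pem"))
  end
end

puts cacert_candidates
```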

*edit* I forked the twilio gem to try and bundle the cacerts.pem file, and as I was going through the code I saw that they look for the certs in /etc/ssl/certs unless you use the method above.

After that, things just worked!

Showing Git Commits in a Rails View

I have an administration panel for my Rails application that shows various information. I’ve found it helpful to show the last few commits along with a link to the repository. Here’s the code:

Controller:

@commits = Array.new
git = `git log -15 --abbrev-commit --pretty=format:"%H - %cr - %s - %d"`
git.split(/\n/).each do |commit|
  elements = commit.split(/ - /)
  @commits << {
    :hash => elements[0],
    :time => elements[1],
    :subject => elements[2],
    :refs => (elements[3] || "").match(/deploy_[^,]*/)
  }
end

The :refs never got used; I had thought that I’d tag all my deploys and be able to highlight them later, but it never ended up working out.

View: (It’s in HAML)

- @commits.each do |commit|
  %li
    = link_to truncate(commit[:subject], :length => 80), "http://git.example.com/git/myapp.git/commit/#{commit[:hash]}"
    = "(#{commit[:time]})"
    %b= commit[:refs]

It’s pretty simple: it just parses the output of git log and spits it out as a list, showing the description and how long ago it was checked in. If you have gitweb or something similar installed, you get a link to the repo.

It’s helped me to find production bugs, and also when I deploy without pushing my code to the git repo, and end up forgetting some changes!

"cd" Tricks to Increase Your Efficiency

I was doing some work that involved moving between several directories. Remembering pushd and popd, I googled around to try to find out how to use them properly. I found an article which was helpful, but what was even better was one of the comments talking about “cd -”.

[root@host tmp]# pwd
/tmp
[root@host tmp]# cd -
/auto/tmp/sean
[root@host sean]# pwd
/auto/tmp/sean
[root@host sean]# cd -
/tmp
[root@host tmp]# pwd
/tmp

What it does is cycle you between your last two directories. It also operates outside of the pushd/popd stack:

[root@host tmp]# dirs
/tmp

So you should still be able to use those!

Telling Your WordPress Environments Apart

I am doing some work with WordPress, where we have a development server and a production server. The development side is set up as a git repo, and the production side pulls from the dev repo when we want to bring in changes:

git pull origin master

I move between the two environments using the hosts file, which sometimes means I’m not sure which environment I’m in. I put the following in functions.php to help me out:

function mysite_i_am_in_dev() {
  // Red border if we're in dev mode
  echo '<!-- dev mode -->
<style type="text/css"> body { border: 2px solid #FF0000; } </style>';
}
if ($_SERVER["SERVER_ADDR"] == "x.x.x.x") { // x.x.x.x is the IP address of your dev server
  add_action('wp_head', 'mysite_i_am_in_dev');
  add_action('admin_head', 'mysite_i_am_in_dev');
}

So now, anyone viewing the development server will have a small red border around the screen.

Starting the Beanstalk Worker From Capistrano

I recently changed SmallPayroll to use Beanstalkd instead of delayed_job for background processing. delayed_job is an awesome tool and makes asynchronous processing very simple. However, I wanted to have multiple queues so that different workers could process different queues, and I have some upcoming needs to process jobs more quickly than the 5-second polling interval allows.

After watching the Railscasts episodes on beanstalkd and stalker, I decided to use that combination. Beanstalkd is a lightweight job processor, and stalker makes it very simple to use from the client end.

I used to have an observer that said:

class UserObserver < ActiveRecord::Observer
  def after_create(user)
    UserMailer.send_later(:deliver_welcome_email, user)
    UserMailer.send_later(:deliver_notify_admin, user)
  end
end

This became:

class UserObserver < ActiveRecord::Observer
  def after_create(user)
    Stalker.enqueue("email.new_user", :user_id => user.id)
  end
end

delayed_job was nice in that the job would just run against the model, but now I have to process the job in config/worker.rb:

require 'stalker'
include Stalker
require File.expand_path("../environment", __FILE__)
job 'default' do |args|
  puts "I don't support the default queue"
end
job 'email.new_user' do |args|
  user = User.find(args["user_id"])
  UserMailer.deliver_welcome_email(user)
  UserMailer.deliver_notify_admin(user)
end

One thing about stalker is that it wants you to pass simple objects instead of ActiveRecord objects, so I queue the user_id instead of the user model.
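The reason, as I understand it, is that stalker serializes job arguments to JSON before putting them on the tube, so only simple types survive the round trip; an ActiveRecord object wouldn’t come back as a usable model. A quick illustration:

```ruby
require 'json'

# What effectively happens to job arguments between enqueue and worker:
# they are serialized out and parsed back, so only plain types survive.
args = { "user_id" => 42 }
restored = JSON.parse(JSON.generate(args))

# The worker then re-fetches the record itself, e.g.:
#   user = User.find(restored["user_id"])
puts restored.inspect
```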

The script above also watches the default tube, which I don’t use, because the Nagios plugin for beanstalkd expects someone to be monitoring it (after setting it all up, I realize I could have configured it to ignore that tube). I’m also using the Munin plugin for beanstalkd to graph the activity in the daemon.

Then, script/worker uses the daemons gem to start the job and restart it if it crashes:

#!/usr/bin/env ruby
require 'rubygems'
require 'daemons'
pwd  = File.dirname(File.expand_path(__FILE__))
file = pwd + '/../config/worker.rb'
Daemons.run_proc(
  'payroll-generic-worker', # name of daemon
  :dir_mode => :normal,
  :dir => File.join(pwd, '../tmp/pids'),
  :backtrace => true,
  :monitor => true,
  :log_output => true
) do
  exec "stalk #{file}"
end

Finally, some Capistrano magic inside config/deploy.rb to start the worker with the deploy:

namespace :beanstalk do
  desc "Start beanstalk process"
  task :start, :roles => :app do
    run "cd #{current_path}; RAILS_ENV=production script/worker start"
  end
  desc "Stop beanstalk process"
  task :stop, :roles => :app do
    run "cd #{current_path}; RAILS_ENV=production script/worker stop"
  end
  desc "Restart beanstalk process"
  task :restart, :roles => :app do
    run "cd #{current_path}; RAILS_ENV=production script/worker restart"
  end
end
after "deploy:start", "beanstalk:start"
after "deploy:stop", "beanstalk:stop"
after "deploy:restart", "beanstalk:restart"

The only problem I’ve run into so far is that my HTML email seems to go out without the text/html content-type. Fixing that was a simple matter of putting content_type 'text/html' inside my mailer, which wasn’t needed when I was using delayed_job.
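For reference, the fix looks something like this (sketched in the ActionMailer 2.x style that matches the deliver_* calls above; the sender address and subject are placeholders):

```ruby
class UserMailer < ActionMailer::Base
  def welcome_email(user)
    recipients   user.email
    from         "noreply@example.com"
    subject      "Welcome to SmallPayroll"
    content_type 'text/html'   # without this, the mail went out as text/plain
    body         :user => user
  end
end
```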

Installing ExceptionNotifier for Rails 3

I was just fighting with this, and being new to the Rails 3 way of doing things, the docs didn’t quite make sense.

Step 1 - Install the plugin

rails plugin install https://github.com/rails/exception_notification.git

Step 2 - Edit config/application.rb:

config/application.rb
  class Application < Rails::Application
    # .....
    # somewhere in this block put the following:
    config.middleware.use "::ExceptionNotifier",
      :email_prefix => "[MyApp Error] ",
      :sender_address => %{"notifier" <notifier@example.com>},
      :exception_recipients => %w{youraddress@example.com}

Step 3 - Verify:

$ rake middleware | grep ExceptionNotifier
use ExceptionNotifier

Now you’ll get any application errors emailed to the addresses in the exception_recipients array.