Sean’s Obsessions

Sean Walberg’s blog

Thinking About Cyber Security

There have been a lot of high profile security breaches lately. If people like Sony can get hacked, what chance do you have?

The people at Sony are humans. They get together at the water cooler and complain about the state of affairs and all the legacy applications they have to support. Even new companies like Dropbox are going to have dark corners in their applications. Money isn’t going to solve all these problems - Apple has billions of dollars and still gave up information about users. The bigger the company, the bigger the attack surface and the more the team has to defend.

How do you prevent your company or product from being hacked given your resources are finite and you may not be able to change everything you want to?

I’ve been thinking of how Agile methods such as rapid iteration and sprints could be applied to security. With that in mind, some high level principles:

  • Solutions to problems should be ranked in terms of business value
  • If a solution takes more than a week or two to implement, it should be broken down into individual phases with their own business value scores
  • It’s not necessary to completely solve the problem as long as you’re better off than you were before. You can make it better the next iteration
  • Instead of “how can we eliminate all risk?” the better question is “how can we make an attacker’s work more difficult?”
  • Detection is just as important as prevention. Look at safes – they protect valuables against a determined adversary for a given period of time; it’s still up to you to make sure you can react in that timeframe

The list above is trying to get away from the traditional security project where you spend lots of time producing documentation, shelve it, and then provide a half-assed solution to meet the deadline. Instead you break the solution into parts and try and continually produce business value. Time for a diagram:

[Diagram: agile vs waterfall value delivery]

Even in the best case where you deliver full value, why not try to deliver parts of it sooner?

Look at it this way – at any point in time you know the least about your problem that you ever will. It’s folly to think you can solve it all with one mammoth project. Pick something, fix it, move on. You have an end goal for sure, but the path may change as you progress.

With that in mind, how do you figure out what to do?

One organization technique I’ve found helpful is the attack tree. Here you define the goal of the attacker: Steal some money, take down the site, and so forth. Then you start coming up with some high level tasks the attacker would have to do in order to accomplish the goal. The leaf nodes are the more actionable things. For example, consider what it would take to deface a website:

[Diagram: attack tree to deface a website]

While not totally complete, this attack tree shows where the attack points are. Given that, some low effort and high value activities that could be done:

  • Audit who has access to CDN, DNS, and registrar accounts
  • Audit CMS accounts

Some higher effort activities would then be:

  • Code fix to enforce strong passwords
  • Code fix to lock out accounts after a certain period
  • Code fix to centralize the authentication to the corporate directory
  • Investigate two factor or SAML login with hosting providers
  • Network fix to ban IPs after a certain number of attempts
  • Monitor failed login attempts and investigate manually
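As an illustration, the “ban IPs after a certain number of attempts” idea can often be handled at the web tier before writing any code. This is a hypothetical nginx fragment (the server name, login path, and limits are made up), a sketch rather than a drop-in config:

```nginx
# Throttle requests to the CMS login page by client IP:
# one request per second with a small burst; excess requests get HTTP 429.
limit_req_zone $binary_remote_addr zone=login:10m rate=1r/s;

server {
    listen 80;
    server_name www.example.com;      # hypothetical

    location = /admin/login {         # or wherever your CMS login lives
        limit_req zone=login burst=5 nodelay;
        limit_req_status 429;
        proxy_pass http://cms_backend; # hypothetical upstream
    }
}
```

This doesn’t eliminate brute forcing, but it makes the attacker’s work slower and noisier, which is exactly the kind of incremental win the list above is after.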

Some of those options may be a lot of work. But if you first start with a simple password policy, build on that in the next iteration to lock out accounts, and finally tie in to another directory, you’re able to incrementally improve by making small fixes and learning as you go.

What if a group like Anonymous threatens to attack your website on the weekend? Look at the attack tree, what kind of confusion can you throw at the attacker? Change the URL of the login page? Put up a static password on the web server to view the login screen itself? Security through obscurity is not a long term fix, but as a tactical approach it can be enough to get you past a hurdle.

Too often security projects are treated in a waterfall manner. You must figure everything out front and then implement the solution, with the business value delivered at the end. Instead, treat this all as a continual learning exercise and strive to add value at each iteration. If the situation changes in the middle of the project, like an imminent threat, you’re in a better position to change course and respond.

Test Driven Infrastructure

In software, test driven development happens when you write an automated test that proves what you are about to write is correct, you write the code to make the test pass, then you move on to the next thing. Even if you don’t follow that strict order (e.g. write your code, then write a test), the fact that there’s a durable test that you can run later to prove the system still works is very valuable. All the tests together give you a suite of tools to help prove that you have done the right thing and that regressions haven’t happened.

What about the infrastructure world? We’ve always had some variant of “can you ping it now?”, or some high level Nagios tests. But there’s still value in knowing that your test itself was good – if you make a change and then test, how can you be sure the test would have failed beforehand? If you run the same test first and watch it fail, you know your change is what made it pass. And then there’s the regression suite: a set of tests that may be too expensive to run every 5 minutes through Nagios but that are great for verifying your change didn’t break anything else.
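The “watch it fail first” loop can be sketched in plain shell, independent of any framework. The probe below is a stand-in; in real life it might be a curl against the service you’re changing:

```shell
state_file=$(mktemp)           # stand-in for the system being changed
echo "broken" > "$state_file"

probe() {
  # Replace with a real check, e.g.:
  # curl -s -o /dev/null -w '%{http_code}' http://app.example.com/login
  cat "$state_file"
}

before=$(probe)                # run the test FIRST and watch it fail
echo "fixed" > "$state_file"   # ...then make the change...
after=$(probe)                 # ...and the same test now proves the fix

echo "before=$before after=$after"
```

If the probe had passed before you touched anything, you’d know the test wasn’t actually exercising the thing you were about to change.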

Enter the Bash Automated Testing System (BATS) – a Bash-based test framework. It’s a thin wrapper around the commands you’d normally run in a script, but if you follow its conventions you get some easy-to-use helpers and easy-to-interpret output.

As an example, I needed to configure an nginx web server to perform a complicated series of redirects based on the user agent and link. I had a list of “if this then that” type instructions from the developer but had to translate them into a set of cURL commands. Once I had that it was simple to translate them into a BATS test that I could use to prove the system was working as requested and ideally share with my team so they could verify correctness if they made changes.

share_link tests
#!/usr/bin/env bats

@test "root" {
  run curl http://example.com
  [[ $output =~ "doctype html" ]]
}

@test "mobile redirects to share" {
  run curl -H "User-Agent: this is an iphone" -i -k http://app.example.com/shareapp/65ac7f12-ac2e-43f4-8b09-b3359137f36c
  [[ $output =~ "302 Found" ]]
  [[ $output =~ "Location: http://app.example.com/share/65ac7f12-ac2e-43f4-8b09-b3359137f36c" ]]
}

@test "mobile redirects to share and keeps query string" {
  run curl -H "User-Agent: this is an iphone" -i -k http://app.example.com/shareapp/65ac7f12-ac2e-43f4-8b09-b3359137f36c?a=b
  [[ $output =~ "302 Found" ]]
  [[ $output =~ "Location: http://app.example.com/share/65ac7f12-ac2e-43f4-8b09-b3359137f36c?a=b" ]]
}

@test "desktop redirects to play" {
  run curl -H "User-Agent: dunno bob" -i -k http://app.example.com/shareapp/65ac7f12-ac2e-43f4-8b09-b3359137f36c
  [[ $output =~ "302 Found" ]]
  [[ $output =~ "Location: http://app.example.com/play/65ac7f12-ac2e-43f4-8b09-b3359137f36c" ]]
}

@test "desktop redirects to play and keeps query string" {
  run curl -H "User-Agent: dunno bob" -i -k http://app.example.com/shareapp/65ac7f12-ac2e-43f4-8b09-b3359137f36c?a=b
  [[ $output =~ "302 Found" ]]
  [[ $output =~ "Location: http://app.example.com/play/65ac7f12-ac2e-43f4-8b09-b3359137f36c?a=b" ]]
}

@test "bots redirect to main site" {
  run curl -H "User-Agent: facebookexternalhit" -i -k http://app.example.com/shareapp/65ac7f12-ac2e-43f4-8b09-b3359137f36c
  [[ $output =~ "302 Found" ]]
  [[ $output =~ "Location: http://www.example.com/app/social?id=65ac7f12-ac2e-43f4-8b09-b3359137f36c" ]]
}

@test "bots redirect to main site and keeps query string" {
  run curl -H "User-Agent: facebookexternalhit" -i -k http://app.example.com/shareapp/65ac7f12-ac2e-43f4-8b09-b3359137f36c?a=b
  [[ $output =~ "302 Found" ]]
  [[ $output =~ "Location: http://www.example.com/app/social?id=65ac7f12-ac2e-43f4-8b09-b3359137f36c&a=b" ]]
}

And running the tests with one mistake in the configuration:

$ bats ~/Downloads/share_link.bats
 ✓ root
 ✓ mobile redirects to share
 ✗ mobile redirects to share and keeps query string
   (in test file /Users/sean/Downloads/share_link.bats, line 16)
     `[[ $output =~ "302 Found" ]]' failed
 ✓ desktop redirects to play
 ✓ desktop redirects to play and keeps query string
 ✓ bots redirect to main site
 ✓ bots redirect to main site and keeps query string

 7 tests, 1 failure

With the tests in place it’s clearer when the configuration is correct.

As a bonus, if you use Test Kitchen for your Chef recipes, you can include BATS-style tests that will be run. So since this configuration is in Chef (which it was), I can have my CI system run these tests whenever the cookbook changes (which I don’t do yet).
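For reference, Test Kitchen picks up BATS files from the suite’s test directory and runs them after convergence. A minimal .kitchen.yml sketch (the cookbook, suite, and platform names are made up):

```yaml
driver:
  name: vagrant

provisioner:
  name: chef_solo

platforms:
  - name: ubuntu-14.04

suites:
  - name: default
    run_list:
      - recipe[myapp_nginx::default]   # hypothetical cookbook
# BATS tests placed in test/integration/default/bats/*.bats
# are discovered and run automatically by busser.
```

With that layout, the same share_link tests above can live alongside the cookbook and gate changes to it.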

Using Google Universal Analytics With NationBuilder

We spent a lot of time trying to understand our visitors at Bowman for Winnipeg and part of that was using Google Analytics. The site was built with NationBuilder but they only support the async version of Analytics and it’s difficult to customize. In particular, we used the demographic and remarketing extensions and there was no easy way to alter the generated javascript to get it to work.

Normally you’d just turn off your platform’s analytics plugins and do it yourself, but NationBuilder has a great feature that fires virtual page views when people fill out forms, and we wanted to use that for goal tracking.

The solution was to turn off NationBuilder’s analytics and do it ourselves but write some hooks to translate any async calls into universal calls. Even with analytics turned off in our NationBuilder account, they’d fire the conversion events so this worked out well.

In the beginning of our template:

Header code
<script type="text/javascript">
  var _gaq = _gaq || []; // leave for legacy compatibility
  var engagement = {% if request.logged_in? %}"member"{% else %}"public"{% endif %};
  (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
  (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
  m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
  })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
  ga('create', 'UA-XXXXXXXX-1', 'www.bowmanforwinnipeg.ca');
  ga('require', 'displayfeatures');
  ga('require', 'linkid', 'linkid.js');
  ga('send', 'pageview', { "dimension1": engagement });
</script>

It’s pretty vanilla Google Analytics code with a couple of extras – display features for demographic data and retargeting integration, and enhanced link attribution for better tracking of clicks within the site. We also added a custom dimension so we could separate people who took the time to create an account in our nation from the general public.

Then, at the bottom:

Footer code
<script type="text/javascript">
  $(function() {
    // Send anything NB tried to inject
    for (var i in _gaq) { var c = _gaq[i]; if (c[0] == "_trackPageview") { ga('send', 'pageview', c[1]); } }
  });
</script>

That’s a simple loop that iterates through the _gaq array (that the async version of the tracker uses as an event queue) and pushes out any page views using the universal API. We didn’t have to worry about the initial page view because turning off NationBuilder’s analytics feature removes all of that.

Review: Agile Web Development With Rails 4

I had the good fortune to receive Agile Web Development with Rails 4 from the publisher to review.

The book is really two books in one. The first is a walkthrough of building a simple Rails 4 application for an online bookstore. The second is a component by component look at Rails 4. If you want to be generous there’s a quick introduction to Ruby and Rails concepts at the beginning of the book.

The online bookstore walkthrough is interesting especially if you are new to Rails and the ideas behind Agile development. You take the role of a developer who is building an online bookstore for a client. You start off with a general idea of what needs to be done, but you build it incrementally, showing it to your customer at each step. Based on feedback you decide what to tackle next, such as showing pictures, getting details, or adding a shopping cart. Along the way there are some discussions of deployment, internationalization, authentication, and sending email.

Through the examples you learn the basics of creating Rails models, views, and controllers. Though the examples lean heavily on scaffolding to automatically generate the CRUD actions, you do extend it somewhat to handle XML and JSON responses. You also do some AJAX and automated testing. The book does stick pretty closely to the default Rails toolset including the test framework, though at the very end of the book there are some quick examples on using HAML for views.

At the end of each chapter are some exercises. Unlike many other books I found them to be of appropriate difficulty, with answers available online.

The second half of the book is a detailed look at the components of Rails. This is the more technical part of the book as it gets into specifics of how a request is processed and how a response is generated. There’s no bookstore application anymore, it’s just discussion of what’s available, code samples, and diagrams.

Along the way there are some interesting sidebars that explain some of the Rails philosophies or some real world scenarios. For example, one of the sidebars talks about when you want to use a database lookup that raises an exception on failure versus one that returns a nil or an empty set.

I didn’t read any of the previous editions so I can’t say with authority how much has changed. The book is up to date on the many changes that came in Rails 4 so it is current in that respect. However there are times when you read it and some older terminology, like fragment or page caching, creeps in. This is more a matter of editing than it is about it being out of date, as the associated examples are correct. The index is fairly weak – many of the terms I tried to look up, including those featured on the back cover, were not found.

If you’re an experienced Rails developer this book will not help you much. But if you’re looking to get into Ruby on Rails, either from another language or even with a weak programming background, this book is an excellent way to get started. At a little over 400 pages it’ll take you a few weekends to get through depending on your skill level.

Using an IP Phone With Twilio

Twilio has supported SIP termination for a while but recently announced SIP origination. This means that previously you could receive calls with SIP but now you can also make calls from a hard phone using SIP instead of using the browser client or going through the PSTN.

It was this second announcement that got my interest. I have an IP phone that I use in my office; currently it’s through Les.net, but I like the pricing and interface of Twilio and would rather use them.

For some reason everything I read about SIP and Twilio uses a separate SIP proxy even if they have a compliant SIP device. Even their own blog takes a working SIP ATA and puts it behind Asterisk. I knew I could do better.

What you’ll need

  • An IP phone. I used a Cisco 7960 converted to SIP
  • A publicly available web server running PHP (feel free to use another language; we have to do some basic processing of the request, so a static page won’t work)
  • A Twilio account

When thinking about VoIP, always think in two directions. Inbound and outbound. Sending and receiving. Talking and listening.

Receiving calls

Get a number and point the Voice Request URL to your web server. Please don’t use mine.

[Screenshot: Twilio phone number dashboard, Voice Request URL settings]

Your inbound.php script will send some TwiML to dial your phone:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Dial>
    <Sip>
        sip:line_number@address_of_phone
    </Sip>
</Dial>
</Response>

Note: this part was a lot of trouble. After some packet tracing and some brilliant detective work by Brian from Twilio support, it turns out that the address of the phone in the SIP invite had to be an IP address, not a hostname. With a hostname the phone received the INVITE but always returned 404.

Your phone will need to be on the Internet, either with a public address or with UDP port 5060 port forwarded to it. The “line_number” has to match the name of the line on your phone. In my case, I named my line after my phone number:

proxy2_address: "ertw.sip.twilio.com"
line2_name: "204xxxxxxx"
line2_shortname: "204xxxxxxx"
line2_displayname: "204xxxxxxx"
line2_authname: "204xxxxxxx"

One thing to note is that you don’t register your phone to Twilio. I left the proxy address there so that the requests will keep the NAT translation alive. After detailed packet tracing it looks like the Twilio SIP messages originate from different IP addresses so this might not be helping as much as I thought.

At this point you should be able to dial your Twilio number from a regular phone. Twilio will request inbound.php and then do a SIP INVITE to the phone. The phone will accept it and then you have voice!

Making calls

The first step is to set up a SIP domain in Twilio:

[Screenshot: Twilio Account SIP Domains]

Call it whatever you want, but you’ll need to set the Voice URL.

[Screenshot: Twilio SIP domain Voice URL settings]

The script you point it at has to parse the data coming in from Twilio to find the phone number and then issue a Dial instruction to get Twilio to dial the phone and connect the two ends.

<?php
  $called = preg_replace('/sip:1?(.*?)@.*/', '$1', $_POST['Called']);
  header("content-type: text/xml");
  echo "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
  ob_start();
  var_dump($called);
  var_dump($_POST);
  $contents = ob_get_contents();
  ob_end_clean();
  error_log($contents, 3, "/tmp/twilio.txt");
?>
<Response>
  <Dial timeout="10" record="false" callerId="YOURTWILIONUMBER"><?= $called ?></Dial>
</Response>

All we’re doing here is extracting the phone number from the Called header that Twilio sends us, stripping any leading 1’s, and then sending a TwiML response to dial that number. The ob_start through to the error_log is just logging the parameters if you’re interested.
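For illustration, the same extraction can be checked from a shell (the SIP URI below is a made-up example number on the same domain):

```shell
# Strip the "sip:" scheme, an optional leading 1, and the @host portion,
# leaving the 10-digit number that Twilio should dial.
called=$(echo 'sip:12045551234@ertw.sip.twilio.com' | sed -E 's/^sip:1?([^@]*)@.*$/\1/')
echo "$called"
```

Running it prints the bare number, which is what ends up inside the Dial verb.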

Don’t forget to change the caller ID to your phone number, otherwise you get errors in the Twilio console.

So now when you place a call on your phone, Twilio will send the digits to the application which will return a Dial verb and the proper 10 digit number. Twilio links the two calls.

Conclusions

It took a bit of playing around to get this going but now I’ve shown that you don’t need an Asterisk server to integrate a SIP phone with Twilio. If you are setting up a phone bank or something with hard phones you can just configure them to hit Twilio, and for Twilio to hit them.

Of course, if you are expecting significant inbound traffic the benefit of a SIP proxy is that it can direct calls to multiple numbers without needing Twilio to be able to reach the phone directly. I’m hoping that Twilio can improve on that in the future!

Find Method Definitions in Vim With Ctags

Ever asked yourself one of these questions?

  • Where is the foo method defined in my code?
  • Where is the foo method defined in the Rails source?

Then you spend a couple of minutes either grepping your source tree or looking on GitHub and then going back to your editor.

This weekend I went through VIM for Rails developers. There’s a lot that’s out of date, but there’s also some timeless stuff in there too. One thing I saw in there was the use of ctags which is a way of indexing code to help you find out where methods are defined.

Install the ctags package with brew/yum/apt/whatever. Then generate the tags with

ctags -R --exclude=.git --exclude=log *

You may want to add tags to your ~/.gitignore because you don’t want to check this file in.

Also add set tags=./tags; to your .vimrc, which tells vim to look for the tags file in the current directory. If the tags file is in a parent directory, use set tags=./tags;/ which tells vim to search upward until it’s found.

Then, put your cursor on a method and type control-] and you’ll jump to the definition. control-T or control-O will take you back to your code. control-W control-] opens it up in a horizontal split. Stack Overflow has some mappings you can use to reduce the number of keystrokes or use vertical splits.
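The Stack Overflow-style mappings referred to above look roughly like this in a .vimrc (the key choices are illustrative, not prescribed):

```vim
" Open the tag under the cursor in a new tab
nnoremap <C-\> :tab split<CR>:exec("tag ".expand("<cword>"))<CR>
" Open the tag under the cursor in a vertical split
nnoremap <A-]> :vsp<CR>:exec("tag ".expand("<cword>"))<CR>
```

Both use expand("<cword>") to grab the word under the cursor and feed it to :tag, so one keystroke replaces the split-then-jump dance.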

If you use bundler and have it put the gems in your vendor directory, ctags will index those too. So you can look up Rails (or any gem) methods.

Is It Hacked?

After coming across a few sites that were serving pharma spam to Google Bot but not to regular browsers, I thought it would be fun to give Sinatra a try and come up with a quick web app that checks for cloaked links. That led to a few more checks, then some improvements, and Is it hacked? was launched. I’ve got some ideas for where I want to go with the project, but in the meantime it’s catching stuff that other sites in the space are missing.

There’s also a bookmarklet at the bottom of the page. You can drag it to your button bar and check the site you’re on for any infection.

Update: I’ve sold the site to someone else and am no longer involved.

Understanding Capistrano Hooks and Deploy vs Deploy:migrations

Capistrano is a deployment framework for Rails, though it can be extended to do pretty much anything. The system comes with a bunch of built in tasks, and each task can be made up of other tasks and code. Plugins can add their own tasks and hook into default tasks, and the user can do the same through the configuration files.

Like any popular open source project, Capistrano has gone through changes. Documentation on the Internet is plentiful but often refers to old ways of doing things, so copy/pasting recipes can result in stuff that doesn’t work quite right.

One common thing people need to do is to run some command after the deploy or at some time during the deploy. Capistrano has a very easy way of doing this.

after 'deploy:update_code', 'deploy:symlink_shared'

This says “after running deploy:update_code, run deploy:symlink_shared”. The latter is a custom task defined elsewhere.

The problem comes in when you look at the way the “deploy” and “deploy:migrations” tasks differ. I’ve seen a common problem where something works when deploying without migrations but not when migrations are used. Usually this is because the hook used is not the right one, either because of bad information or the user figured out where to hook into by looking at the output of a deploy.

If you look at Capistrano’s default deploy.rb you can piece together the tasks that are run in both cases.

deploy:migrations
  update_code
   strategy.deploy!
   finalize_update
  migrate
  create_symlink
  restart
deploy
  update
    update_code
      strategy.deploy!
      finalize_update
    create_symlink
  restart

From this, you can see that the sequence is somewhat different. The “update” task isn’t used in the migrations case. Instead, the components are replicated.

If you want to hook in, use

  • update_code to run before/after the code is put into the release directory, such as if you want to make more symlinks or do something before migrations are potentially run.
  • create_symlink to run before/after the symbolic link pointing to the current release is made. Note that symlink is deprecated. You can run it from the command line, but if you try and hook in to it, you won’t be happy.
  • restart to run before/after the application server is restarted, e.g. to restart background workers or do deployment notifications
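Putting it together, a Capistrano 2-style deploy.rb fragment might look like this (the shared file is a hypothetical example):

```ruby
namespace :deploy do
  desc "Symlink shared configuration into the new release"
  task :symlink_shared, :roles => :app do
    run "ln -nfs #{shared_path}/config/database.yml #{release_path}/config/database.yml"
  end
end

# update_code runs in BOTH `deploy` and `deploy:migrations`,
# so the shared files are in place before any migrations run.
after 'deploy:update_code', 'deploy:symlink_shared'
```

Hooking update_code rather than update is exactly what keeps this working when someone deploys with migrations.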

BASH History Expansion

The Unix shell has a whole bunch of features to make your life easier. One has to do with the history. Some I have managed to ingrain into muscle memory, others I have to remember which often means I do it the long way. I hope these examples help you out.

# Start off with some files
$ touch one two three four
# !* expands to all the arguments from the last command
$ ls !*
ls one two three four
four  one three   two
# !foo runs the last command that starts with foo
$ !touch
touch one two three four
# Caret (^) does a search/replace in the previous command
$ ^touch^ls
ls one two three four
four  one three   two
# !$ is the last item in the previous command, !^ is the first _argument_
$ head !$
head four
$ ls four three two one
four  one three   two
$ cat !^
cat four
# !:n is the n'th item on the previous command line, 0 indexed
$ ls four three two one
four  one three   two
$ cat !:3
cat two
$ ls four three two one
four  one three   two
$ which !:0
which ls
/bin/ls

There are a lot more; run man bash and look for the section called HISTORY EXPANSION.

Mixpanel Track_links and Track_forms

I’ve long been a fan of Mixpanel and was happy to get to use them again over at Wave Payroll. Last time I used them you could only track a page view, but now they have added track_links and track_forms which fire the event asynchronously after the link is clicked or the form is submitted.

I started off by using the event_tracker gem which only handles the page view events, but it does make it easier to fire events in the view based on controller actions so it is well worth using. I talked to the author about adding support for track_links and track_forms, but after a good discussion he convinced me that the gem was not the right place for this and that I should pursue something more elegant such as tagging the links.

Ideally, what we wanted to arrive at was something like

<a href="blah" class="track-with-mixpanel" data-event="clicked on login">

or with Rails helpers:

=link_to login_path, :class => "track-with-mixpanel", :data => { :event => "clicked on login" }

which would magically call

mixpanel.track_links("#id_of_element", "clicked on login")

One problem is that not all the elements had IDs and I didn’t want the added friction of having to add that in.

What I came up with was:

$(".track-with-mixpanel").each( function(i) {
  var obj = $(this);
  // Make up an ID if there is none
  if (!obj.attr('id')) {
    obj.attr('id', "mixpanel" + Math.floor(Math.random() * 4294967296));
  }
  if (obj.is("form")) {
    mixpanel.track_forms("#" + obj.attr("id"), obj.data('event'), obj.data('properties'));
  } else {
    mixpanel.track_links("#" + obj.attr("id"), obj.data('event'), obj.data('properties'));
  }
});

So this simply finds all the links with the track-with-mixpanel class, assigns them a random ID if they don’t have one, and then calls the appropriate mixpanel function on them.

A couple of notes, though…

The first is that the data-properties doesn’t quite work right. Pulling the data into a hash requires some finesse that I haven’t been able to figure out.

The second is that track_links and track_forms are horribly broken and will mess up any onClick handlers you have. So if you have links that use :method => :post, or third-party JavaScript like validations or Stripe, you’re better off with the old track method because Mixpanel won’t play nicely with them. But for regular links and forms, this is great.