RabbitMQ: A Quick and Dirty Introduction

RabbitMQ has become one of the most popular solutions for application messaging. It is owned and maintained by VMware, and its proven track record, stability, and speed have made it the poster child of what a powerful message bus should be. This introduction should be viewed as a boiled-down crash course in RabbitMQ.

The Quick

AMQP

RabbitMQ is a broker for AMQP (Advanced Message Queuing Protocol) v0.9.1. On the face of it, AMQP is pretty simple. If you are used to working with email, the concepts in AMQP will be nothing new: it can be viewed as a series of mailboxes, mail exchanges, and addresses working together to get the job done. My purpose is not to go into great detail about AMQP itself or its history; I will leave that to your own research or perhaps a later post. Rather, this post is intended to give developers enough information to find their way around, or at least to understand how the pieces fit together.

RabbitMQ

RabbitMQ is the go-to broker for AMQP messaging. With the financial backing of EMC and VMware under the guise of GoPivotal, this will probably be the case for many years to come. It is written in Erlang, a functional language known for its strength in concurrent processing, and backed by Mnesia, Erlang’s powerful persistence database.

Currently RabbitMQ is used by some of the biggest names in the tech world including:

  • VMware/EMC (obviously)
  • AT&T Interactive
  • NASA
  • Digg
  • BBC
  • Nokia
  • Heroku

The Dirty

Channels

When working with RabbitMQ (or AMQP for that matter) one of the first things you will run into is the Channel. A channel is essentially a virtual connection allowing separate threads to maintain their own point of communication while still using a single TCP connection.

Exchanges

The Exchange can be viewed as the backbone of the AMQP protocol. It acts much as an individual mail server determining which mailboxes (queues) to deliver messages to. These come in several different flavors covering almost all of your delivery needs:

  • Direct – The most commonly used exchange type. Direct exchanges are designed for fast and, you guessed it, direct delivery to queues using a key, or effectively the Queue’s address on that exchange.
  • Topic – The Topic Exchange is similar to the direct exchange with the exception that queues bound to this exchange can listen to variable keys. For example, in a direct exchange, if you wanted a queue to receive every message whose key starts with ‘foo.bar.’ you would have to bind that queue for every possible key. A topic exchange instead allows wildcard subscriptions: ‘*’ matches exactly one word and ‘#’ matches zero or more words. So you can bind to ‘foo.bar.#’ and it will consume everything from ‘foo.bar.baz’ to ‘foo.bar.why.does.everyone.use.foo.bar’.
  • Fanout – In the Fanout Exchange, the message keys are completely ignored. Instead this exchange delivers to every queue bound to it.
  • Headers – This is possibly the least used, but most powerful type of Exchange. Instead of using the message key it actually looks at the message headers to determine where it is routed to.
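To make the wildcard rules concrete, here is a small Ruby sketch of how a topic binding key can be compared with a routing key. It is only an illustration of the matching rules, not RabbitMQ's implementation, and the names topic_match? and match_words are made up for this example.

```ruby
# Returns true when routing_key satisfies the topic binding pattern.
# '*' matches exactly one dot-separated word, '#' matches zero or more words.
def topic_match?(pattern, routing_key)
  match_words(pattern.split('.'), routing_key.split('.'))
end

def match_words(pattern_words, key_words)
  return key_words.empty? if pattern_words.empty?

  head, *rest = pattern_words
  case head
  when '#'
    # Let '#' absorb 0, 1, 2, ... words and see if the remainder still matches.
    (0..key_words.length).any? { |n| match_words(rest, key_words.drop(n)) }
  when '*'
    !key_words.empty? && match_words(rest, key_words.drop(1))
  else
    key_words.first == head && match_words(rest, key_words.drop(1))
  end
end

puts topic_match?('foo.bar.#', 'foo.bar.baz')      # => true
puts topic_match?('foo.bar.*', 'foo.bar.baz.qux')  # => false
```

A direct exchange corresponds to plain string equality on the key, and a fanout exchange skips the comparison entirely.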

Queues

The proverbial mailbox. Queues are pretty self-descriptive. The only things of real note are the options. When queues are declared they can be passed a number of attributes that determine how the queue behaves:

  • Durable – A durable queue will survive a broker restart. Messages in it survive only if they were also published as persistent.
  • Exclusive – The queue is dedicated to a single connection. When that connection is lost, the queue will be removed.
  • AutoDelete – Similar to Exclusive queues, AutoDelete will cause the queue to be deleted when the last subscriber disappears. Unlike Exclusive queues, this allows multiple subscribers.

Bindings

Bindings are the glue that holds a queue to an exchange. A queue can be bound to multiple exchanges, with each binding defining its own rules for how messages are delivered to the queue. For example, you may bind a single queue to a Topic Exchange with key ‘foo.bar.*’ and to a Fanout Exchange. The queue will then receive the matching messages from the Topic Exchange as well as every message from the Fanout Exchange.
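The binding example above can be modelled with a toy, in-memory broker. This is a sketch for building intuition only: ToyBroker and its methods are hypothetical names, nothing here talks to RabbitMQ, and the topic matching is simplified to handle only the ‘*’ wildcard.

```ruby
# A toy, in-memory model of exchanges, bindings, and queues.
class ToyBroker
  def initialize
    @exchange_types = {}                       # exchange name => :direct, :fanout, or :topic
    @bindings = Hash.new { |h, k| h[k] = [] }  # exchange name => [[queue name, binding key], ...]
    @queues   = Hash.new { |h, k| h[k] = [] }  # queue name => delivered message bodies
  end

  def declare_exchange(name, type)
    @exchange_types[name] = type
  end

  def bind(queue, exchange, key = '')
    @bindings[exchange] << [queue, key]
  end

  # Deliver body to every queue with at least one matching binding.
  def publish(exchange, routing_key, body)
    type = @exchange_types.fetch(exchange)
    matched = @bindings[exchange].select { |_queue, key| routes?(type, key, routing_key) }
    matched.map(&:first).uniq.each { |queue| @queues[queue] << body }
  end

  def messages(queue)
    @queues[queue]
  end

  private

  def routes?(type, binding_key, routing_key)
    case type
    when :fanout then true
    when :direct then binding_key == routing_key
    when :topic  then topic_pattern(binding_key).match?(routing_key)
    else false
    end
  end

  # Simplified: only handles '*' (exactly one word), not '#'.
  def topic_pattern(binding_key)
    words = binding_key.split('.').map { |w| w == '*' ? '[^.]+' : Regexp.escape(w) }
    Regexp.new('\A' + words.join('\.') + '\z')
  end
end

broker = ToyBroker.new
broker.declare_exchange('events', :topic)
broker.declare_exchange('broadcast', :fanout)
broker.bind('inbox', 'events', 'foo.bar.*')
broker.bind('inbox', 'broadcast')

broker.publish('events', 'foo.bar.baz', 'from topic')
broker.publish('events', 'foo.qux', 'not routed')
broker.publish('broadcast', 'anything', 'from fanout')
p broker.messages('inbox')  # => ["from topic", "from fanout"]
```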

Consumers

While not technically part of the broker, consumers still bear mentioning. The most useful point here is that consumers can subscribe with explicit acknowledgements. This allows the subscriber to either acknowledge or reject each message based on whether or not it was able to process it. Combine this with RabbitMQ’s Dead Lettering capabilities and you have the makings of an extraordinarily fault-tolerant messaging system.

Differences Between RabbitMQ and other AMQP Brokers

While most of the above describes AMQP in general, I would be remiss if I didn’t point out some slight differences between RabbitMQ and AMQP 0.9.1 proper.

RabbitMQ comes with a number of protocol extensions giving the developer a bit more power than they would have otherwise:

  • Confirms – The ability for the publisher to confirm message delivery.
  • Exchange to Exchange Bindings – RabbitMQ provides the ability to bind exchanges to each other.
  • Message TTL – RabbitMQ implements a TTL header that defines how long a message is allowed to live in a queue. When this time expires, it is removed from the queue.
  • Dead Lettering – This is, in my opinion, one of the best extensions RabbitMQ provides. When a message is considered dead (for example because it was rejected or its TTL expired), you can define an exchange to transfer that message to for handling. Using this in combination with Message TTL, one can implement a timed retry of messages or even an automatic consumer delay.
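As a rough sketch of the timed-retry idea, the queue that holds messages while they wait can be declared with two optional arguments that RabbitMQ recognises: x-message-ttl (in milliseconds) and x-dead-letter-exchange. The helper name retry_queue_arguments and the exchange name 'work' are assumptions made for this example.

```ruby
# Builds the optional arguments for a hypothetical "retry" queue: messages sit
# in it for ttl_ms milliseconds and are then dead-lettered to the given exchange.
def retry_queue_arguments(dead_letter_exchange:, ttl_ms:, routing_key: nil)
  args = {
    'x-dead-letter-exchange' => dead_letter_exchange,
    'x-message-ttl'          => ttl_ms
  }
  # Optionally override the routing key used when the message is dead-lettered.
  args['x-dead-letter-routing-key'] = routing_key if routing_key
  args
end

# Wait 30 seconds, then hand the message back to the (assumed) 'work' exchange.
p retry_queue_arguments(dead_letter_exchange: 'work', ttl_ms: 30_000)
```

With a client library such as Bunny, a hash like this would typically be passed as the arguments option when the queue is declared. A consumer that fails to process a message can then publish it to the retry queue and let the broker redeliver it after the delay.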

For those of you who have only passively heard about RabbitMQ, hopefully you find this post useful. My goal is to later go into more details about implementing and using Rabbit’s various features, so stay tuned.

A Primer on Ruby C Extensions Part 2 - FFI

In my previous post I covered building a simple C extension for Ruby. There are times, however, when you may need to call functions defined in an existing native library. Fortunately there is a tool precisely for this job. Enter FFI, or Foreign Function Interface. As defined by Wikipedia, “A foreign function interface (or FFI) is a mechanism by which a program written in one programming language can call routines or make use of services written in another”. Thanks to the team working on Ruby FFI we have a relatively easy set of tools to work with.

First, let's say there is already a native library available that defines a function for determining whether a number is prime. Let's call this libprime.so. Tying into this library is actually quite simple using Ruby FFI. The first step, obviously, is installing it: gem install ffi should do the trick. Now we need to take a look at the code defined in libprime.so.

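#include <stdlib.h>
#include <stdint.h>
#include <math.h>

/* Returns 1 if n is prime, 0 otherwise. */
int prime(int n)
{
  /* 0, 1, and negative numbers are not prime. */
  if (n < 2)
    return 0;
  if (n == 2)
    return 1;
  if ((n & 1) == 0)
    return 0;

  /* Trial division by odd numbers up to (roughly) the square root of n. */
  uint64_t sqrt_n = ((uint64_t)sqrt(n)) + 1;
  for (uint64_t i = 3; i <= sqrt_n; i += 2) {
    if ((uint64_t)n % i == 0)
      return 0;
  }
  return 1;
}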

As you can see, the library makes the function prime available. So our next step is to tie into the library using FFI.

require 'ffi'

module Primed
  extend FFI::Library
  # Load the shared library that exports prime()
  ffi_lib "libprime.so"
  # C signature: int prime(int n)
  attach_function :prime, [:int], :int

  # Assumes the receiver (self) is the integer to test
  def prime?
    prime(self) != 0
  end

end

Looks simple enough, right? Let me quickly explain what this is doing. Our first task is obviously to make FFI available to us. We call extend FFI::Library and that gives us all the tools that we need to work with. Second, we need to define the library to tie into, in this case libprime.so. Third, we call attach_function. This takes 3 arguments: the name of the library’s function to call, an array of the argument types that the function takes, and the data type the function returns. Finally, we wrap the function to our liking and we’re done. For more information check out the FFI Wiki over at GitHub.

A Primer on Ruby C Extensions

NOTE: This is a repost of an older blog post of mine from another Blog

I will address one of the primary uses for a C extension in Ruby: speed. Due to its very nature, Ruby is slow (as compared to compiled languages like C). It gets the job done, but sometimes it takes its sweet time doing it. Sometimes it is necessary to speed things up a bit, and this is where C extensions come in. There are several methods of implementing extensions, from the generic C extension to ruby-inline. In this particular article I will focus on the generic C extension.

In this example, I am going to use a fairly inefficient piece of Ruby code I created a while ago for Project Euler (Problem 10) for finding the sum of all primes under 2000000:

class Integer

  def prime?
    return true if self == 2
    return false if self < 2 || (self & 1) == 0
    limit = Math.sqrt(self).floor
    i = 3
    while i <= limit
      # Test the divisor before advancing, so that 3 itself is tried
      return false if (self % i) == 0
      i += 2
    end
    true
  end

end

numbers = (2..2000000).to_a

numbers = numbers.select { |n| n.prime? }
puts numbers.inject { |result, element| result + element }

UPDATE: Changed i on line 7 from 1 to 3, as otherwise it will cause 3 and 5 to return false.
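For comparison with the trial-division approach above, a sieve computes the same sum with far less work. The sketch below uses the sieve of Eratosthenes, which is a different technique from the code timed in this post; sum_primes_below is a name chosen for this example.

```ruby
# Sums every prime strictly below limit using the sieve of Eratosthenes.
def sum_primes_below(limit)
  return 0 if limit < 3

  is_prime = Array.new(limit, true)
  is_prime[0] = is_prime[1] = false

  (2..Math.sqrt(limit).floor).each do |i|
    next unless is_prime[i]
    # Cross out multiples of i, starting at i * i (smaller ones are already done).
    (i * i).step(limit - 1, i) { |multiple| is_prime[multiple] = false }
  end

  is_prime.each_index.select { |i| is_prime[i] }.sum
end

puts sum_primes_below(10)  # => 17 (2 + 3 + 5 + 7)
```

The trade-off is memory: the sieve keeps one flag per number below the limit, whereas trial division needs none.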

At the time that I wrote this, I was relatively unaware of more efficient ways of finding prime numbers (such as Euler's sieve); however, the code still ran under the allotted 2 minute window (52 seconds), so I went with it. Now to speed it up. To write a C extension you need, at a bare minimum, two things:

  • an extconf.rb file – this file is used by Ruby to generate the Makefile that is used to compile the extension
  • the source file for the extension (in this case primed.c)

Here is a look at these two files for my new version of problem 10:

primed.c

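#include "ruby.h"
#include <stdint.h>
#include <stdlib.h>
#include <math.h>

VALUE Primed;

/* Arity -2: Ruby passes self first and any other arguments as a Ruby array. */
VALUE method_prime(VALUE obj, VALUE args)
{
  int value = NUM2INT(obj);
  /* 0, 1, and negative numbers are not prime. */
  if (value < 2)
    return Qfalse;

  uint64_t n = (uint64_t)value;
  if (n == 2)
    return Qtrue;
  if ((n & 1) == 0)
    return Qfalse;

  /* Trial division by odd numbers up to (roughly) the square root of n. */
  uint64_t sqrt_n = ((uint64_t)sqrt((double)n)) + 1;
  for (uint64_t i = 3; i <= sqrt_n; i += 2) {
    if (n % i == 0)
      return Qfalse;
  }
  return Qtrue;
}

/* Called by Ruby when the compiled extension is loaded. */
void Init_primed(void) {
  Primed = rb_define_module("Primed");
  rb_define_method(Primed, "prime?", method_prime, -2);
}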

extconf.rb

# Loads mkmf which is used to make makefiles for Ruby extensions
require 'mkmf'

# Give it a name
extension_name = 'primed'

# The destination
dir_config(extension_name)

# Do the work
create_makefile(extension_name)

First let me explain primed.c. The objective of this extension is to determine whether or not a number is prime, so that an integer can call x.prime? and get back true or false (once the Primed module has been mixed into Integer). It is essentially identical to the method used in the pure Ruby script above. One of the first things you may notice is this line:

VALUE Primed;

VALUE is a data type defined by Ruby that represents a Ruby object in C. It is a pointer-sized value that usually refers to the struct holding the object's data (a few simple values, such as small integers, true, false, and nil, are encoded directly in it). In this case, the object will represent the Primed module in Ruby, so it will refer to data about the instance methods, variables, etc. for that module. All Ruby objects are represented in C by VALUE, regardless of their type within the Ruby VM; handing Ruby's API anything else will likely result in a segfault.

Next we define the actual function to calculate whether the value is prime. Note that because we need to return a Ruby object, we set the return type as VALUE as well. Qtrue and Qfalse are directly representative of true and false in Ruby, and also behave correctly in C conditions (Qtrue evaluates as true, Qfalse evaluates as false).

Finally we see the Init_primed function. When Ruby loads a compiled extension (for example through require 'primed'), it calls the Init_name function, where name matches the extension's file name. It is here we actually create the Primed module and bind the method_prime function to the Ruby method prime?. Both functions used are pretty self-explanatory, except for the last argument to rb_define_method, which is essentially the arity, or number of arguments to expect in the Ruby method. In this case, -2 causes Ruby to pass self as the first argument to the method_prime function, and a Ruby array of any other arguments as the second.

Now we have all of our code. The last thing to put in place is extconf.rb:

# Loads mkmf which is used to make makefiles for Ruby extensions
require 'mkmf'

# Give it a name
extension_name = 'primed'

# The destination
dir_config(extension_name)

# Do the work
create_makefile(extension_name)

Pretty simple, right? Now when you call ruby extconf.rb it will generate a Makefile that you can use to build the extension with make. And the final result? Using the C extension the code runs in just under 3 seconds. Still not really efficient, but it demonstrates the point. When Ruby’s speed is the bottleneck, using C is a viable and easy option.

Go to Part 2 – FFI

The Wrong and Right of Rewrite

Almost every developer has faced it. Staring down code with frustration building until something inside you just snaps. Suddenly you see the light. This code is crap. Useless. There is too much duplication. It’s too hard to understand. There’s no documentation and the guy who wrote it went missing on a jungle safari in the Congo. It’s time to rewrite. Or is it?

The rewrite is an easy trap to fall into. You have to ask yourself, though: am I considering a rewrite for the right reasons? The answer lies in your ability to be honest with yourself. First, a rewrite is NEVER a snap decision. If it took you less than a few days to determine you need to rewrite, then you are 99.99% likely to be doing it for the wrong reasons. Having recently found myself considering a rewrite, I decided to stop myself and think it through. Here is a small list of various reasons for a rewrite. Some good, some bad.

The Wrong Reasons (or the most commonly used excuses)

  • I hate PHP. Or whatever language it is you find a particular distaste for. If this is your reason, you have probably spoiled yourself (I’m talking to most Ruby developers, including myself). Changing languages should only be a factor if you already have legitimate reasons for doing a rewrite.
  • I don’t want to learn the code. The actual typical excuse is that it was just “done wrong”. I think this is the most common excuse for a rewrite. Face it, working with other people’s code is a part of being a Software Engineer.
  • The person that wrote the original code sucks. This is often closely linked to the above. They suck because I cannot understand it.
  • It’s not the way I would have done it. Well of course it isn’t. Code style is almost as unique to a Software Engineer as a fingerprint.

The Right Reasons

  • Unusable in its current state. If the codebase is unusable a rewrite is justified. Keep in mind that unusable code doesn’t mean unreadable code, or code you don’t understand, or code you don’t want to debug.
  • NOBODY can read it. Okay, if it is that unreadable then it is likely that a rewrite is justified.
  • The codebase cannot support the current requirements of the software, either due to language limitations or because the software was created in a way that prevents the requirements being met. A good, albeit controversial, example is Twitter’s move from Ruby to Java.

Lightweight API Authentication With Lua and NGiNX

I was recently looking for a way to handle API authentication with as little overhead as possible. So first a little background. At Scan we are following a service-oriented structure, where each service defines and maintains its own API endpoints. The “API” is in fact nothing more than a glorified proxy that passes each request on to the appropriate service, which then handles the logic. This is all handled through NGiNX. More on the pros and cons of that structure in a future post. That being said, my goals were simple:

  • Little to no long-term code maintenance. I want to be able to forget it is even there.
  • Fast. After all, it is authenticating every API request.
  • I wanted it to behave almost like a Ruby on Rails before_filter call. It should authenticate the key before the API endpoints are actually even hit.

Enter the lua-nginx-module. Lua is an extremely lightweight, fast scripting language made popular by its ability to be embedded in almost anything, even in other scripting languages. In this case, the Lua runtime can be compiled directly into NGiNX. The lua-nginx-module conveniently adds an NGiNX configuration option called access_by_lua which seems to fit all of the above requirements nicely:

  • Little or no maintenance – the entire authentication script run by NGiNX is a total of less than 70 lines of code (with comments and logging removed), and depending on requirements could be done in less.
  • Fast – Lua is on average five times faster than Ruby, and is embedded in the actual webserver. It also runs concurrently. Each NGiNX worker gets its own Lua runtime.
  • Prevents access before it hits the first endpoint – The access_by_lua acts in the same way as a simple basic authentication directive would. It will stop the call from ever continuing to the API endpoints if authentication fails.

The best thing is that it is flexible. How the Lua script authenticates is entirely up to the developer. It can read from a file, call a back end service, or even read directly from a database.