Reading notes from RabbitMQ in Action


  1. Getting RabbitMQ
  1. sudo apt-get install rabbitmq-server
  2. /usr/sbin/rabbitmqctl status
  3. sudo pip install pika

RabbitMQ in Action

  1. RabbitMQ terminology
  1. producer – publishes the message to an exchange
  2. channel – sub-socket inside an open TCP connection to RabbitMQ (TCP connections are re-used to avoid connection-setup overhead)
  3. consumers – consume the message from the queue
  1. basic.consume – passively receive messages from the queue (push mechanism)
  2. basic.get – poll-based retrieval; essentially a subscribe, consume of a single message, and unsubscribe rolled into one
  3. with auto_ack set true, the acknowledgement fires automatically on receipt; with auto_ack false, the consumer must call basic.ack explicitly
  4. basic.reject rejects the message; requeue=true puts it back on the queue, use requeue=false for malformed messages
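The ack/reject flow above can be sketched with an in-memory stand-in for a broker queue (pure Python, no broker involved; the class and method names are illustrative, not the real pika/RabbitMQ API):

```python
from collections import deque

class InMemoryQueue:
    """Toy stand-in for a broker queue, to illustrate ack/reject semantics."""
    def __init__(self):
        self._messages = deque()
        self._unacked = {}          # delivery_tag -> message body
        self._next_tag = 0

    def publish(self, body):
        self._messages.append(body)

    def get(self):
        """Like basic.get: pull one message; it stays unacked until ack/reject."""
        if not self._messages:
            return None, None
        self._next_tag += 1
        body = self._messages.popleft()
        self._unacked[self._next_tag] = body
        return self._next_tag, body

    def ack(self, delivery_tag):
        """Like basic.ack: the broker may now discard the message."""
        del self._unacked[delivery_tag]

    def reject(self, delivery_tag, requeue):
        """Like basic.reject: requeue=True puts it back, requeue=False drops it."""
        body = self._unacked.pop(delivery_tag)
        if requeue:
            self._messages.appendleft(body)

q = InMemoryQueue()
q.publish("good message")
q.publish("malformed message")

tag, body = q.get()
q.ack(tag)                      # processed fine

tag, body = q.get()
q.reject(tag, requeue=False)    # malformed: drop it rather than requeue it
```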
  1. queue properties
  1. exclusive – can only be subscribed by 1 consumer, useful for testing
  2. auto-delete – delete as soon as last consumer unsubscribes
  3. durable – the queue is re-created after the broker restarts
  1. exchanges – publishers send messages to exchanges, who then route them to one or more queues
  1. direct – routing key must match with the name of the queue
  1. $channel->basic_publish($message, $exchange='', $routing='test-queue')
  1. fanout – broadcast the message to every queue that’s bound to a particular exchange
  2. topic exchange – binding keys may use wildcards: * matches exactly one dot-separated word, # matches zero or more words
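Those wildcard rules can be sketched with a small stdlib-only matcher (illustrative logic only, not RabbitMQ's actual implementation; the routing keys are made-up examples in the 'facility.severity' style often used for logs):

```python
def topic_matches(pattern, routing_key):
    """Check a routing key against a topic binding pattern.
    '*' matches exactly one dot-separated word; '#' matches zero or more words."""
    def match(pat, key):
        if not pat:
            return not key                      # both exhausted -> match
        head, rest = pat[0], pat[1:]
        if head == "#":
            # '#' may consume zero or more words of the key
            return any(match(rest, key[i:]) for i in range(len(key) + 1))
        if not key:
            return False
        if head == "*" or head == key[0]:
            return match(rest, key[1:])
        return False
    return match(pattern.split("."), routing_key.split("."))

print(topic_matches("*.critical", "kern.critical"))   # → True ('*' = one word)
print(topic_matches("kern.#", "kern.disk.failure"))   # → True ('#' spans words)
```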
  1. Vhosts
  1. each vhost has its own exchanges, bindings and queues
  2. permissions are per vhost
  3. default: username=guest, password=guest
  4. rabbitmqctl list_vhosts
  5. rabbitmqctl add_vhost
  6. rabbitmqctl delete_vhost
  1. Persistence
  1. set exchange durable=true, survives the restart
  2. set queue durable=true, survives the restart
  3. set message persistent=true (delivery mode = 2)
  4. persistent message is flagged in persistence log after being delivered and ACKed
  1. if the server crashes before the garbage collector has run, it will be re-delivered
  1. persistence can cause a 10x throughput decrease, but SSDs help
  2. structure clusters into critical (persistent, SSD) and fast (no persistence)
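Points 3–4 above (the persistence log and its garbage collector) can be modeled with a toy broker in pure Python — an illustrative simulation of the semantics, not RabbitMQ's internals; the gc_runs flag just models whether the collector got to prune the log before a crash:

```python
class ToyBroker:
    """In-memory sketch of RabbitMQ's persistence semantics."""
    def __init__(self):
        self.queue = []        # live (RAM) queue
        self.log = []          # persistence log on "disk"

    def publish(self, body, persistent):
        self.queue.append(body)
        if persistent:
            self.log.append(body)    # persistent messages are written to the log

    def consume_and_ack(self, gc_runs):
        body = self.queue.pop(0)
        if gc_runs and body in self.log:
            self.log.remove(body)    # GC removes ACKed messages from the log
        return body

    def crash_and_restart(self):
        self.queue = list(self.log)  # only logged messages are recovered

broker = ToyBroker()
broker.publish("persistent", persistent=True)
broker.publish("transient", persistent=False)
broker.crash_and_restart()           # only the persistent message survives

broker2 = ToyBroker()
broker2.publish("important", persistent=True)
broker2.consume_and_ack(gc_runs=False)   # crash happens before GC prunes the log
broker2.crash_and_restart()              # the message is delivered again
```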
  1. Queue management
  1. rabbitmqctl list_queues
  2. rabbitmqctl list_queues -p specific.vhost
  3. rabbitmqctl rotate_logs suffix_to_append.to_old_logs
  1. For messages requiring confirmation, the publisher can set the reply-to header of the AMQP message and then listen for a confirmation to arrive on a separate queue
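The reply-to pattern can be sketched with stdlib queues standing in for the AMQP queues (names and payload shapes here are illustrative; real code would use pika's reply_to and correlation_id message properties):

```python
import queue
import uuid

work_queue = queue.Queue()
reply_queue = queue.Queue()     # the separate queue the publisher listens on

def publish_with_reply_to():
    """Publisher: tag the message with a reply-to queue and a correlation id."""
    correlation_id = str(uuid.uuid4())
    work_queue.put({"body": "do some work",
                    "reply_to": "reply_queue",
                    "correlation_id": correlation_id})
    return correlation_id

def consume_and_confirm():
    """Consumer: process the message, then confirm on its reply-to queue."""
    msg = work_queue.get()
    # ... real work on msg["body"] would happen here ...
    reply_queue.put({"confirmed": msg["correlation_id"]})

sent_id = publish_with_reply_to()
consume_and_confirm()
confirmation = reply_queue.get()   # publisher matches this against sent_id
```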
  2. Inside a cluster, RabbitMQ does not automatically replicate the queues – for persistent queues this would imply being copied over the network, and being copied to disk, which could pose significant performance issues
  1. As a follow-up, if the queue was durable and its node crashed, that node would have to be resurrected before the queue’s messages could be retrieved from persistent storage
  1. Exchanges in RabbitMQ are lightweight – they’re essentially routing tables storing the routing keys and queue names. Therefore replicating (and scaling out) exchanges is pretty easy
  2. What happens when you publish the message into the exchange, and the exchange node fails? You lose the message
  1. If you employ AMQP transactions, publishing will block until you get a confirmation
  2. You could also be listening for publisher confirms and have some logic in publisher to deal with messages that were lost by the exchange
  1. Storing metadata on disk makes restarts easier; storing metadata in RAM makes declaring new exchanges, queues or bindings faster, since a declaration blocks until all of the nodes in the cluster have recorded it
  1. But what if you’re doing RPC with separate anonymous queues for ACKs? Super-noisy.
  2. RabbitMQ requires that at least one node store metadata on disk; every other node can store metadata in RAM
  1. Declaring a mirrored queue requires hard-coding the specific node name into your application
  1. Avoid mirroring on all nodes, since that could degrade performance
  2. A newly declared mirror only receives messages published after the point of declaration – it doesn’t copy the queue’s existing contents
  1. Mirrored queue is really another queue with a fanout exchange
  1. Using publisher confirms on mirrored queue will cause ACKs to be delivered only when ALL slaves received the message
  2. You can end up in a weird situation where the master node fails and the ACK never arrives
  1. When the master fails, all consumers need to re-attach to the new master, the failover is not automatic
  1. This means the app has to understand and process consumer cancellation notices
  2. Shovel is RabbitMQ’s cross-data-center replication plugin: it effectively subscribes to the source queue and re-publishes the data to a remote server
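That subscribe-and-republish loop can be sketched with two stdlib queues standing in for the local and remote brokers (an illustrative simulation, not the Shovel plugin’s API; the message body is made up):

```python
import queue

local_queue = queue.Queue()    # queue in the local data center
remote_queue = queue.Queue()   # queue on the remote server

def shovel_once():
    """One iteration of what Shovel does continuously:
    consume a message locally and republish it remotely."""
    body = local_queue.get_nowait()
    remote_queue.put(body)

local_queue.put("order-created event")
shovel_once()
```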