Redis Part III

In part 2 we looked at the two different ways Redis is set up in production, Cluster and Sentinel, and we started working towards configuring a Redis Sentinel setup.

We’re going to continue configuring Redis Sentinel using our already running Redis1, Redis2, and Redis3 machines.

The next step in the process is to set up the Redis-Sentinel services in Docker. We are going to pin our services to specific nodes to ensure we never schedule multiple Redis-Sentinel instances on the same node.

This time, however, we need to provide a configuration to our sentinel setup. There are several ways we could do that, but the best solution in terms of future maintainability is to build your own Docker image with your configuration baked in. Once you have done that you can use Docker Hub’s CI/CD pipeline to fully automate things. That topic could easily fill a full post, so we’ll come back to it.

For now, using your favorite tool (I’m using VS Code), let’s get a new project going. If you’re using VS Code, create a directory where you like to store projects, open Code, choose Open Folder, and select your new folder. I then like to “Save Workspace” into the same folder so I can easily launch Code back into this project.

Once you have the workspace setup create a new file called: Dockerfile

Notice the lack of file extension? You can name it something else like Redis-Sentinel.Dockerfile if you’d like

Next create another file called redis-sentinel.conf and give it these contents:

#You can read the documentation at that link
port 26379
sentinel monitor mymaster 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1
  • The 1st line is a comment that points to the Sentinel docs so you can get all smart
  • port tells Sentinel to listen on port 26379
  • sentinel monitor defines the master we want our sentinel to monitor: a name, followed by the IP, port, and quorum. The quorum is the number of sentinels that must agree the master is down before a failover is initiated.
  • down-after-milliseconds is how long a master must fail PING checks before being considered down
  • failover-timeout is how long we’ll wait before Sentinel will try to fail over the same master again
  • parallel-syncs sets the number of slaves that can be reconfigured to use the new master at the same time after a failover. The lower the number, the more time it will take for the failover process to complete. However, if the slaves are configured to serve old data, you may not want all of them to re-synchronize with the master at the same time.
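To make the quorum setting a little more concrete, here is a minimal Python sketch (not part of the Sentinel tooling) that parses a sentinel monitor line and checks whether enough sentinels agree. The address 192.0.2.10 is a made-up documentation address, and the names MonitorDirective, parse_monitor, and quorum_reached are my own for illustration.

```python
from dataclasses import dataclass


@dataclass
class MonitorDirective:
    """Fields of a 'sentinel monitor <name> <ip> <port> <quorum>' line."""
    name: str
    host: str
    port: int
    quorum: int


def parse_monitor(line: str) -> MonitorDirective:
    """Split a 'sentinel monitor' directive into its four arguments."""
    parts = line.split()
    if len(parts) != 6 or parts[:2] != ["sentinel", "monitor"]:
        raise ValueError(f"not a sentinel monitor directive: {line!r}")
    return MonitorDirective(parts[2], parts[3], int(parts[4]), int(parts[5]))


def quorum_reached(directive: MonitorDirective, sentinels_reporting_down: int) -> bool:
    """True when enough sentinels agree that the master is down."""
    return sentinels_reporting_down >= directive.quorum


# 192.0.2.10 is a placeholder address, not one from this setup.
directive = parse_monitor("sentinel monitor mymaster 192.0.2.10 6379 2")
print(directive)
print(quorum_reached(directive, 1))
print(quorum_reached(directive, 2))
```

With a quorum of 2, a single sentinel losing sight of the master is not enough to start a failover; a second one has to agree.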

Now update your Dockerfile with these contents:

FROM redis:5
COPY redis-sentinel.conf /usr/local/etc/redis/redis-sentinel.conf
#CMD [ "redis-server", "/usr/local/etc/redis/redis.conf" ]

Notice a couple things here:

  • I specify redis:5 – using a specific version instead of latest is a must for production
  • We copy the configuration file to the container
  • The CMD is for my reference later when we create the containers. We’ll eventually update the redis-server installation to use this custom image as well and I want to be able to use the same image for both.

I have also created a .dockerignore file with this in it:

This prevents the vagrant stuff from being sent to the docker daemon during build.

Now that we have our Dockerfile built and our redis-sentinel.conf file ready we can build the image

docker build -t wjdavis5/redis_sentinel_wordpress .

This isn’t a Docker tutorial, but in short this command builds an image tagged as “wjdavis5/redis_sentinel_wordpress” based on the current directory “.”

Once it is built we can push it to Docker Hub

docker push wjdavis5/redis_sentinel_wordpress:latest

Now we can move back over to our Docker host and create our services.

sudo docker service create --name redis_sentinel1 --constraint node.hostname==redis1 --hostname redis_sentinel1 --mode global --publish published=26379,target=26379,mode=host wjdavis5/redis_sentinel_wordpress:latest redis-sentinel /usr/local/etc/redis/redis-sentinel.conf

sudo docker service create --name redis_sentinel2 --constraint node.hostname==redis2 --hostname redis_sentinel2 --mode global --publish published=26379,target=26379,mode=host wjdavis5/redis_sentinel_wordpress:latest redis-sentinel /usr/local/etc/redis/redis-sentinel.conf

sudo docker service create --name redis_sentinel3 --constraint node.hostname==redis3 --hostname redis_sentinel3 --mode global --publish published=26379,target=26379,mode=host wjdavis5/redis_sentinel_wordpress:latest redis-sentinel /usr/local/etc/redis/redis-sentinel.conf

Now we should be able to connect to an instance of redis sentinel, using the docker run command from before, only this time we need to specify the port -p 26379

vagrant@redis1:~$ sudo docker run -it redis:latest redis-cli -h -p 26379> info
# Server
os:Linux 4.4.0-131-generic x86_64

# Clients


# Sentinel

Yeah, mine says 4 sentinels b/c I created an additional one while testing this out; you should see 3. The next thing we want to do is test failover. To do that we first want to tail the logs for redis-sentinel. We can do that with the docker logs command:

sudo docker ps|grep sentinel|awk '{print $1}'|xargs sudo docker logs -f

Now from another window connect to the master (should be redis1) and issue the shutdown command:

vagrant@redis1:~$ sudo docker run -it redis:latest redis-cli -h> shutdown
not connected>

Now if you watch the redis-sentinel log you should see the fail over initiate

20 Jan 2019 15:01:07.934 * +sentinel-address-switch master mymaster 6379 ip port 26379 for 08aa07e38473a554911f3aebfd720692b3b7c948
1:X 20 Jan 2019 15:08:36.236 # +sdown master mymaster 6379
1:X 20 Jan 2019 15:08:36.236 # +sdown master mymaster 6379
1:X 20 Jan 2019 15:08:36.326 # +odown master mymaster 6379 #quorum 4/2
1:X 20 Jan 2019 15:08:36.326 # +new-epoch 1
1:X 20 Jan 2019 15:08:36.326 # +try-failover master mymaster 6379
1:X 20 Jan 2019 15:08:36.330 # +vote-for-leader 08aa07e38473a554911f3aebfd720692b3b7c948 1
1:X 20 Jan 2019 15:08:36.330 # 63e82269c31d372eaee4d3bde15aea8a2c2e65f4 voted for 08aa07e38473a554911f3aebfd720692b3b7c948 1
1:X 20 Jan 2019 15:08:36.330 # d7b04497a0b905f692aea802979d030d5c73d0e9 voted for 08aa07e38473a554911f3aebfd720692b3b7c948 1
1:X 20 Jan 2019 15:08:36.330 # 08aa07e38473a554911f3aebfd720692b3b7c948 voted for 08aa07e38473a554911f3aebfd720692b3b7c948 1
1:X 20 Jan 2019 15:08:36.383 # +elected-leader master mymaster 6379
1:X 20 Jan 2019 15:08:36.383 # +failover-state-select-slave master mymaster 6379
1:X 20 Jan 2019 15:08:36.445 # +selected-slave slave 6379 @ mymaster 6379
1:X 20 Jan 2019 15:08:36.445 * +failover-state-send-slaveof-noone slave 6379 @ mymaster 6379
1:X 20 Jan 2019 15:08:36.516 * +failover-state-wait-promotion slave 6379 @ mymaster 6379
1:X 20 Jan 2019 15:08:36.670 # +config-update-from sentinel 63e82269c31d372eaee4d3bde15aea8a2c2e65f4 26379 @ mymaster 6379
1:X 20 Jan 2019 15:08:36.670 # +switch-master mymaster 6379 6379
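If you want to pick the interesting events out of a log like the one above, a rough Python sketch follows. It assumes the line layout shown in this post (a "#" or "*" marker followed by a "+event" token). The sample lines are copied from the log above with the addresses omitted, and the helper names and the choice of milestone events are mine.

```python
import re

# An event token such as "+sdown" follows the "#" or "*" marker on each line.
EVENT_RE = re.compile(r"[#*] ([+-][a-z-]+)")

# Events I treat as the main steps of a failover (my own selection).
MILESTONES = ("+sdown", "+odown", "+try-failover", "+elected-leader", "+switch-master")


def sentinel_events(lines):
    """Return the event token of every line that has one, in order."""
    events = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            events.append(match.group(1))
    return events


def failover_milestones(lines):
    """Return the milestone events seen, each listed once, in log order."""
    seen = []
    for event in sentinel_events(lines):
        if event in MILESTONES and event not in seen:
            seen.append(event)
    return seen


sample = [
    "1:X 20 Jan 2019 15:08:36.236 # +sdown master mymaster 6379",
    "1:X 20 Jan 2019 15:08:36.326 # +odown master mymaster 6379 #quorum 4/2",
    "1:X 20 Jan 2019 15:08:36.326 # +try-failover master mymaster 6379",
    "1:X 20 Jan 2019 15:08:36.330 # 63e82269c31d372eaee4d3bde15aea8a2c2e65f4 voted for 08aa07e38473a554911f3aebfd720692b3b7c948 1",
    "1:X 20 Jan 2019 15:08:36.383 # +elected-leader master mymaster 6379",
    "1:X 20 Jan 2019 15:08:36.670 # +switch-master mymaster 6379 6379",
]

print(failover_milestones(sample))
```

Reading the milestones in order tells the story: the master is subjectively down, then objectively down once the quorum agrees, a leader is elected, and finally the master is switched.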

When we issued the shutdown command it should have terminated the Docker container that was running. Because we configured this as a Docker service, when the container terminates Docker will automagically reschedule it. So at this point you should be able to reconnect to redis1, run an info command, and see that it is now a secondary (slave).

# Replication

Congratulations, you have now created a Redis Sentinel deployment on Docker Swarm. In the next post we’ll cover configuring Redis Cluster.

Redis – Part II

In Part I we covered some very basic steps to start up a Redis server and connect to it with the redis-cli tool. This is useful information for playing around in your dev environment, but doesn’t help us much when it’s time to move to production. In this post I will start to cover the steps we took to deploy Redis to production, and keep it running smoothly.

Redis Sentinel

When it’s time to run in production there are generally two primary ways to offer high availability. The older way of doing this is using Redis Sentinel. In this setup you have a Primary (also called the master) and 2 or more Secondaries (also called slaves). All writes must go to the primary; data is then replicated to the secondaries over the network.

You also must have at least 3 more instances of Redis running in sentinel mode. These sentinels monitor the primary (also called the master). If they determine the master is unavailable a new master will be promoted from one of the available secondaries. All other secondaries will automatically be reconfigured to pull from the new master.

There is a bit more to it than this, but that is a sufficient explanation for right now.

Redis Cluster

In cluster mode we shard reads/writes across multiple instances, and these multiple instances also have Secondaries.


Reads and writes are distributed across the primaries by using a computed hash slot. Hash slots are pretty easy to understand if you want to review the Cluster Specification. But suffice it to say that a hash slot is computed based upon the key name, the hash slots are divided between primaries, and it is up to the client to route to the correct instance.
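As a rough illustration of the hash slot idea, here is a small Python sketch. Per the Cluster Specification the slot is CRC16 of the key modulo 16384, and if the key contains a non-empty {tag} only the tag is hashed. I’m using Python’s binascii.crc_hqx with an initial value of 0 as the CRC16 variant; the function names and the three-way slot split at the bottom are my own illustration, not something a real client library exposes.

```python
import binascii

SLOT_COUNT = 16384


def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that is hashed (the {tag} if one exists)."""
    start = key.find(b"{")
    if start == -1:
        return key
    end = key.find(b"}", start + 1)
    if end == -1 or end == start + 1:
        # No closing brace, or an empty tag "{}": hash the whole key.
        return key
    return key[start + 1:end]


def key_slot(key: str) -> int:
    """Compute the cluster hash slot for a key name."""
    data = hash_tag(key.encode("utf-8"))
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


# Hypothetical even split of the slots between three primaries.
SLOT_RANGES = {
    "primary-a": range(0, 5461),
    "primary-b": range(5461, 10923),
    "primary-c": range(10923, 16384),
}


def owner(slot: int) -> str:
    """Find which (hypothetical) primary owns a slot."""
    for name, slots in SLOT_RANGES.items():
        if slot in slots:
            return name
    raise ValueError(f"slot out of range: {slot}")


for key in ("foo", "hello", "{user1000}.following", "{user1000}.followers"):
    slot = key_slot(key)
    print(key, slot, owner(slot))
```

The two {user1000} keys land in the same slot, which is how you keep related keys on the same primary.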

Note, when I say Client, I don’t mean your application, unless you plan to connect directly to Redis and speak the Redis protocol. You’ll likely be using a client library like StackExchange.Redis though.
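For a sense of what “speaking the Redis protocol” means, here is a minimal Python sketch of how a client library encodes a command before putting it on the wire. Commands are sent as an array of bulk strings; encode_command is a name I made up, and this only covers the request side.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        # Each argument is sent as: $<byte length>\r\n<bytes>\r\n
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


print(encode_command("SET", "Hello", "World"))
print(encode_command("GET", "Hello"))
```

A client library does this for you, along with parsing replies, pooling connections, and handling redirects, which is why you rarely want to write it yourself.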

Which One To Choose?

Generally speaking, I think you should just go ahead and choose Redis Cluster if you’re going to be setting this up in a production environment. It gives you the ability to scale horizontally when you need to, and honestly isn’t much more difficult to set up than Sentinel. But I’ll cover the setup of both.

Configure Redis Sentinel

I’m going to continue using Docker here. And we’ll assume you have 3 nodes called redis1, redis2, and redis3. If you’re following along on your local machine you can use the following Vagrant file to get started:

Vagrant.configure("2") do |config|
  config.vm.define "redis1" do |redis1|
    redis1.vm.box = "bento/ubuntu-16.04"
    redis1.vm.box_version = "201812.27.0"
    redis1.vm.provision :shell, path: "", :args => "redis1"
  end

  config.vm.define "redis2" do |redis2|
    redis2.vm.box = "bento/ubuntu-16.04"
    redis2.vm.box_version = "201812.27.0"
    redis2.vm.provision :shell, path: "", :args => "redis2"
  end

  config.vm.define "redis3" do |redis3|
    redis3.vm.box = "bento/ubuntu-16.04"
    redis3.vm.box_version = "201812.27.0"
    redis3.vm.provision :shell, path: "", :args => "redis3"
  end
end


And here is the file mentioned in the above config:

#!/usr/bin/env bash

hostnamectl set-hostname $1
echo $1 >> /etc/hosts
apt-get update
apt-get remove docker docker-engine containerd runc
apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg2 \
    software-properties-common -y
curl -fsSL | sudo apt-key add -
add-apt-repository \
   "deb [arch=amd64] \
   $(lsb_release -cs) \

apt-get update
sudo apt-get install docker-ce  -y

Now when you run vagrant up, in a few moments you’ll have 3 machines running with docker installed. I’ll be configuring a swarm, like I have in production. So next I need to init the swarm.

vagrant ssh redis1
#wait for connect
sudo docker swarm init
#copy the join command that gets output by the last command, it'll look like this
#sudo docker swarm join --token SWMTKN-1-2wwy0py488uvo6u0lhbpgqpvbhha1kd6w4k1t95uox9m0t4ln0-1fjjc15308hvndpdj8ui7lts9

Now you’ll ssh into the other two and join the swarm with the command from the previous example

vagrant ssh redis2
sudo docker swarm join --token SWMTKN-1-2wwy0py488uvo6u0lhbpgqpvbhha1kd6w4k1t95uox9m0t4ln0-1fjjc15308hvndpdj8ui7lts9

Repeat that step on redis3

Now we have a running docker swarm with redis1 as our master. The next steps are to create our redis services. We’ll be pinning each redis instance to a specific node. Here are the commands to create the redis services

sudo docker service create --name redis1 --constraint node.hostname==redis1 --hostname redis1 --mode global --publish published=6379,target=6379,mode=host redis:latest

sudo docker service create --name redis2 --constraint node.hostname==redis2 --hostname redis2 --mode global --publish published=6379,target=6379,mode=host redis:latest

sudo docker service create --name redis3 --constraint node.hostname==redis3 --hostname redis3 --mode global --publish published=6379,target=6379,mode=host redis:latest

We are pinning each instance of Redis to a node b/c we don’t want Docker to ever schedule the primary and secondaries on the same Docker host. That would remove some of the high availability we get from running multiple instances.

We now have 3 instances of redis running. You can test connecting to them using the examples from part 1 of this series.

docker run -it redis:latest redis-cli -h
#obviously update your ip address

The next step is to enlist redis2 and redis3 as slaves of redis1. To do that we’ll connect to each and run the slaveof command. First ssh into redis 2 and then connect to redis using the docker command above. Then run

slaveof 6379
#again, update your IP and port (if you changed the port)

Redis should respond with “OK”
Repeat this step on redis 3

Once this is done you should be able to again connect to the instance of redis on redis1 and run the info replication command:

redis-cli info Replication

# Replication

Whew! Now we have 3 instances of redis running with 1 master and 2 slaves. In the next post we’ll work on getting redis-sentinel up and running to monitor them.

Redis Part I

This will be the first in a multi-part write up of how I use redis. I will focus on a few key areas:

  • Configuring redis server
  • General design for storing and retrieving data
  • Language specific stuff using C# / StackExchange.Redis.

To get started you’ll need to download and install redis in some form or fashion.

  1. You can go to the site and download it directly:
  2. Or you can download the source and build it:
  3. Or, my preferred way, is to install Docker, and run redis from there:

I’m not going to walk you through installing Docker in this guide, but it’s pretty easy.

We’ll get started by opening up your favorite command prompt and running:

docker run -it -p 6379:6379 redis:latest

This will download the image, run it interactively, and map the default port to the container.

That’s it! Now you have an instance of redis running that you can play with.

Now there are several ways for you to connect to, and play with, redis once you have it running:

  • A number of gui applications (learn these later if you really want to get good)
  • Install redis-cli locally
  • Run redis-cli from a docker container

Again we’re going to use Docker; it’s just the easiest way to get things going quickly. So again from your favorite command prompt type

docker run -it redis:latest redis-cli -h

You’ll want to enter the ip address of the docker host where redis was started.

I’m greeted by the following:


Now that it is up and running, and we have a client connected, we can easily save our first entry:

> set Hello World
OK
> get Hello

That was pretty simple!

In the next article we’ll cover different production setups, and dive into getting things set up for production.

Success as measured by time

I recently read an article on HBR.ORG titled “Stop Working All Those Hours” and it has really made me stop and think about how people, myself included, measure success. The general premise here is that success in the workplace is often measured by the amount of time spent in the office. Stop and think about this for a moment; is it true?
Will spending 50+ hours a week in the office really make you a better employee? Or is it a perfect demonstration of not being as efficient as you could be? Metrics are an important factor in a business; I mean, you have to have some empirical data by which you compare employees against each other. But is hours worked a valid metric? Does the employee that came in on Saturday deserve recognition if they could have completed the task on Friday? Is the employee that leaves early to watch their children’s little league game less dedicated to his or her job?

Stop working so much!

That last question sort of hits close to home for me because I am frequently hard on myself for having to leave or miss work for any reason, but at the same time I refuse to be one of those parents that doesn’t get to see their children grow up. If time spent truly is a valid metric for measuring employees then my career progression is possibly doomed to be a slow and arduous process.

Admittedly, time spent is a metric we instantly gravitate towards. I do, however, believe that it is, at the very least, an unreliable metric for the total picture. As leaders and managers we should strive to identify metrics that are specific to a position. By doing so we will be able to obtain a much better idea of how well our employees are actually performing.

As an employee I feel that it is important to closely evaluate our efficiency. If you can make yourself more efficient then perhaps you can spend a few of those hours focusing on your personal life as well. A couple more hours spent catering to yourself will go a long way in ensuring your happiness and overall well being.

Our Move to Dot Net Core

I work at Synovia Solutions LLC, creators of the Silverlining Fleet Management software and Here Comes The Bus. Our solution installs hardware devices on vehicles that then report back over cellular to us. During peak times we are processing about 3000 messages / second over UDP.

Our current system includes a monolithic Windows service that handles pretty much all aspects of message processing. It’s written in .NET (currently 4.6.1) and runs on several physical machines located in a local data center. It uses SQL Server as a backend data store.

When I was brought on board one of my primary tasks was to migrate the existing queuing infrastructure, several Sql Server Tables, into a new queuing solution. We chose RabbitMq via the hosted provider CloudAMQP. This was a pretty new paradigm for me, as I had never worked with anything other than MSMQ (GAG!) .

After the initial implementation of Rabbit was written we discovered a show stopper. To explain that I’ll need to cover a bit more on how this all works.

You see, the devices on the vehicle communicate over UDP only, but once we receive a message and persist it we have to send an ACK message back to the device. If the device doesn’t receive this ACK within 30 seconds it will retransmit the same message. With our existing infrastructure already strained, we found ourselves falling behind on inbound messages several times. As both the number of incoming messages and the average processing time increased during peak hours, we hit critical mass. If we had 3k messages in the queue to persist, and persisting was taking upwards of 10ms, devices would begin retransmitting messages, which at this point are duplicates, and our queue would snowball.
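The arithmetic behind that snowball is worth spelling out. Here is a small Python sketch, assuming messages are persisted one at a time by a single worker (my simplification, not a description of the real service); the function names are made up for illustration.

```python
ACK_TIMEOUT_SECONDS = 30.0


def drain_time_seconds(queued_messages: int, persist_ms: float) -> float:
    """Time for one worker to persist everything currently in the queue."""
    return queued_messages * persist_ms / 1000.0


def will_retransmit(queued_messages: int, persist_ms: float) -> bool:
    """True when the last message in the queue waits at least the ACK timeout."""
    return drain_time_seconds(queued_messages, persist_ms) >= ACK_TIMEOUT_SECONDS


print(drain_time_seconds(3000, 10))
print(will_retransmit(3000, 10))
print(will_retransmit(3000, 5))
```

At 3000 queued messages and 10ms each, the message at the back of the queue waits the full 30 seconds, so its device gives up and sends it again while the original is still waiting to be persisted.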

This problem was only made worse by the fact that if the vehicle turned off before all of its data was sent and ack’d the data would reside on the device until the next time the vehicle’s ignition was turned on, at which point it would again resume trying to send out. This was usually during another peak time and so the cycle continued.

When we introduced hosted RabbitMq this problem got worse, because now we had at least a 25ms round trip time from our data center to the AWS data center where CloudAMQP was hosted. We could have opted to host RabbitMq ourselves, but lacking a dedicated sysadmin this just wasn’t in the cards.

It was around this time that Dot Net Core was in beta and we were looking at migrating our ‘Listener’ infrastructure into AWS to eliminate the 25ms round trip time, and move forward with RabbitMq.

We had the idea to take another step, and write a Listener microservice. At the time I was really torn between using NodeJs, Python, Java or risking it and using the beta version of Dot Net Core. The main requirements were:

*Be cross platform, specifically it had to run on Linux (Ubuntu).
*Be really freakin fast
*Accept messages, persist them, and Ack them
*Be stateless
*Scale automagically.
*Live behind a load balancer.

That’s it, small and fast. That last one though, the load balancer, yeah, there wasn’t much in the way of UDP load balancing when we started the project. AWS didn’t support UDP, NGINX didn’t, and I only found one option at the time. Once I was about halfway done NGINX released their UDP load balancing. AWS still doesn’t support it.

At the end of the day we went with the following stack:

* for balancing
*Ubuntu OS
*AWS OpsWorks
*Dot Net Core
*Rabbit Mq
*Redis (for syncing sequence numbers across all instances)

There is still a slight bottleneck when persisting to RabbitMq b/c we have to utilize CONFIRMS to ensure the message is persisted before we can send the ACK. The average time from start to finish is between 5 – 7ms.

That may not sound like much of an improvement, but it doesn’t tell the whole story either. That’s the time to process a single message, on a single instance, on a single thread. It’s about 200 messages / second.

But when I use a typical Producer Consumer model with 25 processing threads we can hit 5000 messages / second, and do so without incurring any additional latency b/c RabbitMq is just awesome. At that rate here is my cpu utilization (from DataDog) over the last 24 hours:

Needless to say we aren’t even scratching the surface of what this thing can do. Of course we have a few instances running behind the LB.ORG appliance, so we can easily handle 30k messages / second.

All of this is running on a T2.Medium instance with 2 vCPUs and 4GB of RAM, costing ~ $300 / year / instance to run 24/7. We could save even more money utilizing Spot Instances for peak times, but we just don’t need it right now.
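The producer/consumer arrangement and the throughput numbers above can be sketched roughly like this. It is Python rather than the C# the service is written in, the persist-and-ACK step is a placeholder, and the function names are invented for the example.

```python
import queue
import threading


def theoretical_rate(workers: int, per_message_ms: float) -> float:
    """Messages per second if every worker spends per_message_ms on each message."""
    return workers * 1000.0 / per_message_ms


def run_pipeline(messages, worker_count: int = 25):
    """Push messages through a queue drained by worker_count consumer threads."""
    inbox = queue.Queue()
    acked = []
    lock = threading.Lock()

    def consumer():
        while True:
            message = inbox.get()
            if message is None:
                # Sentinel value: no more work for this consumer.
                break
            # Placeholder for "persist to the broker, then ACK the device".
            with lock:
                acked.append(message)

    workers = [threading.Thread(target=consumer) for _ in range(worker_count)]
    for worker in workers:
        worker.start()
    for message in messages:
        inbox.put(message)
    for _ in workers:
        inbox.put(None)
    for worker in workers:
        worker.join()
    return acked


print(theoretical_rate(1, 5))
print(theoretical_rate(25, 5))
print(len(run_pipeline(range(1000))))
```

One thread at roughly 5ms per message gives the 200 messages / second figure, and 25 threads doing the same work in parallel gives 5000, as long as the broker keeps up.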

At the end of the day it has been a pretty awesome experience learning all of these new technologies. RabbitMq is amazing. But I have to give an enormous shout out to the Dot Net Core team and MSFT. What they have done is really going to shake shit up in the development world.

Gone are the days of going to a Hackathon and being laughed at for not rocking Ruby or NodeJs. C# is an incredibly powerful language and now that it is truly cross-platform I think we are going to start seeing a major paradigm shift in the open-source world.

Some will say yeah, Java has been doing that forever, and I get that. Java is great. But what Java lacked, at least in comparison to .NET / C#, was Visual Studio. VS is, in my humble opinion, hands down the best development environment in existence.

Seriously, you couple VS with JetBrains ReSharper and you have a code churning productivity machine. Now add in Docker for Windows and I can prototype and hack on a level never before possible in this world; and on a level that is, at the very least, equal to that of any other language. I would probably even say it is superior.

High Speed Log4Net

Log4Net is a great logging extension for the .NET ecosystem that also supports .NET Standard / .NET Core (which you should be using if you aren’t).

Unless you want to really read up on the framework extensively it can be easy to fall into some performance traps. I’ve found that in many cases these performance issues are caused by less-than-desirable appender configuration. For example, let’s say you have a FileAppender.

<appender name="FileAppender" type="log4net.Appender.FileAppender">
  <file value="log-file.txt" />
  <appendToFile value="true" />
  <layout type="log4net.Layout.SimpleLayout" />
</appender>

When you log (assuming no other appenders are configured) your application will call into log4net, which will attempt to get a file handle, and once successful will write this to your log file, followed by releasing the lock. The point is your application is going to block on this thread whilst completing all those steps.

One solution that can be implemented easily is to just use the BufferingForwardingAppender.

You simply add this appender (which is a forwarding appender)

<appender name="BufferingForwardingAppender" type="log4net.Appender.BufferingForwardingAppender" >
  <bufferSize value="100"/>
  <appender-ref ref="FileAppender" />
  <evaluator type="log4net.Core.LevelEvaluator">
    <threshold value="WARN"/>
  </evaluator>
</appender>
<appender name="FileAppender" type="log4net.Appender.FileAppender">
  <file value="log-file.txt" />
  <appendToFile value="true" />
  <layout type="log4net.Layout.SimpleLayout" />
</appender>

Now instead of writing directly to the FileAppender you will write to the BufferingForwardingAppender. Essentially this thread will drop your message in a queue. When that queue fills up (bufferSize) or if you log an event that will trigger the evaluator (warn) all messages will be written to the downstream appenders. In this case the FileAppender. This action will block the calling thread that triggered it, but writing 100 entries to the logfile in one action is much faster than writing 100 individual messages.
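If you want to see the buffering behavior in isolation, the same idea exists in Python’s standard library as logging.handlers.MemoryHandler, so here is a small sketch using that rather than log4net itself. ListHandler is a stand-in I wrote for the downstream FileAppender so the result is easy to inspect; the capacity of 100 and the WARNING flush level mirror the bufferSize and threshold values above.

```python
import logging
from logging.handlers import MemoryHandler


class ListHandler(logging.Handler):
    """Stand-in for the downstream file appender: collects messages in a list."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record.getMessage())


target = ListHandler()
# Buffer up to 100 records; flush early when a WARNING or higher arrives.
buffered = MemoryHandler(capacity=100, flushLevel=logging.WARNING, target=target)

logger = logging.getLogger("buffering_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(buffered)

for i in range(5):
    logger.info("message %d", i)
held_before_warning = len(target.records)

logger.warning("something looks off")
written_after_warning = len(target.records)

print(held_before_warning, written_after_warning)
```

The five info messages sit in the buffer and nothing reaches the target until the warning arrives, at which point all six are written in one go.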


Here Are Some Links: Security Concerns

Please note that as of this writing, the majority of the problems discussed below have been addressed by the team. I will point out that they were fairly responsive and thankful for the issues that I presented them.

However there are still some of these problems that exist on their site.

On or around June 25th I discovered several security issues with the website, which provides a platform for photographers and other artists to sell their work. I reached out to a well known security researcher whose name I won’t mention until I get permission.

Following that individual’s guidance I contacted the folks at the site and provided them with the details of my findings. Over the course of the next few weeks we emailed a few times.

Below are the details of my findings.

Vulnerability


Cookies are used to store session information. The cookies that are set contain 3 pieces of information:

  1. PHPSessionId
  2. Adult Content filter setting
  3. User Id

It is possible to sign in with a valid user name and password to get a valid PHPSessionId and then edit the cookies stored locally to insert a different User Id in order to access another user’s private account information.

You can easily obtain any user’s user id value by browsing to one of their boxes and looking at the page source

var loggedinUserId="";var userId="";

It is trivially simple to gain access to any user’s private information in this manner. In fact the entire website could be easily compromised with a simple script used to harvest user IDs.

What information is accessible?

From my initial research it appears that everything related to a user’s account is accessible, including:

  1. Private account settings
  2. Private messages that have been sent / received
  3. All boxes and their content
  4. Payment Information

Other Information

Another major issue is that the website does not use HTTPS by default. This means that every time the page is loaded the cookies containing the PHPSessionId value are transmitted in clear text. This is a major problem b/c it allows for trivial session hijacking.

How to Fix?

Here are some of my suggestions on how to resolve this problem:

  1. Enable HTTPS by default for ALL PAGES
  2. Do not store the userId in the cookie, instead only the session id.
  3. Map session Ids to the correct account in memory server side
  4. Enforce access control checks on all page loads that verify that the session is still active and is valid for the account.
  5. I would suggest not using that user’s Id anywhere in the page, but, it appears this would require significant work to achieve and may not be feasible.
  6. Expire sessions after a shorter period of time.
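To illustrate suggestions 2 through 4 and 6, here is a minimal Python sketch of a server-side session map. It is not the site’s code (theirs is PHP), and SessionStore, its methods, and the 15 minute lifetime are all assumptions for the example. The point is that the cookie carries only a random session id, and the server decides which account that id belongs to.

```python
import secrets
from typing import Dict, Optional, Tuple


class SessionStore:
    """In-memory map of session id -> (user id, expiry time)."""

    def __init__(self, ttl_seconds: float = 900.0) -> None:
        self._ttl = ttl_seconds
        self._sessions: Dict[str, Tuple[str, float]] = {}

    def create(self, user_id: str, now: float) -> str:
        """Issue an unguessable session id after a successful login."""
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = (user_id, now + self._ttl)
        return session_id

    def user_for(self, session_id: str, now: float) -> Optional[str]:
        """Return the user behind a session, or None if unknown or expired."""
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        user_id, expires_at = entry
        if now >= expires_at:
            del self._sessions[session_id]
            return None
        return user_id

    def can_access(self, session_id: str, account_id: str, now: float) -> bool:
        """Access check to run on every page load."""
        user_id = self.user_for(session_id, now)
        return user_id is not None and user_id == account_id


store = SessionStore(ttl_seconds=900.0)
sid = store.create("alice", now=0.0)
print(store.can_access(sid, "alice", now=10.0))
print(store.can_access(sid, "bob", now=10.0))
print(store.can_access(sid, "alice", now=1000.0))
```

Because the user id never travels in the cookie, editing the cookie can at worst invalidate your own session; it cannot point the server at someone else’s account.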