Mar 23 23:38

It's just rubber, with some air in it.

- A child's balloon.
You inflate it, play with it for a while, then it pops and you throw it away. You inflate another one, maybe even in a different color. The kid plays with it until it breaks, then you throw it away again.

- An inflatable castle.
You inflate it, play with it for a while, deflate it, move it around, and inflate it again. If it has a hole in it, you patch the hole.

- The tyres on your kid's bike.
You inflate them, the kid rides on them, and if they start losing air you fix them ASAP so you can continue using them.

All three are really valuable use cases for rubber with air in it.
In a way they are all equally valuable, but they serve different purposes.

Now think about this the next time you spin up a container that's running a database and an application server, and that your users ssh into.

It's not just another virtual machine.

Mar 23 23:28

Lies, Damn Lies and Statistics, 2016 Edition

When people sign up for Configuration Management Camp, we ask them which community room they are most interested in.
We ask this question because we have rooms of different sizes: we don't want to put a community with 20 interested people in a 120-seat room, and we don't want to put a community with 200 people in a 60-seat room.

But it also gives us the opportunity to build some very interesting graphs of the potential evolution of the communities.

So looking at the figures ... the overall community is obviously growing, from 350 to 420 to just short of 600 people registered now.

The Puppet community is not the biggest anymore; that spot went to the Ansible community room. And all communities but the CFEngine one are growing.

One more thing: the organisation team discussed several times whether we should rebrand the event. We opted not to .. could have been an alternative name .. but we decided to stick with the name that is already known.
The content will evolve, but Config Management Camp will stay the place where people who care about Infrastructure as Code and infrastructure automation meet.

Mar 23 22:58

Recent and Upcoming Talks

I gave a couple of new and updated talks over the last couple of months.

At FOSDEM, I opened the Distributions Devroom with a talk on how we could improve the collaboration between the developers of a distro and their users, the ops folks. I turned out not to be the only person with such ideas: the day before, at the CentOS Dojo, the smart folks over at the CentOS project were already actively talking about their efforts to make some of my ideas become reality.

Another talk from early this year was an update on why security requires a devops approach, why you want to embed security as a standard practice in your development process, and why continuous delivery actually is a security requirement.

At last week's FLOSS UK Conference in London, I gave an updated version of my MonitoringLove talk, giving an opinionated overview of the current state of Open Source monitoring tools.

I was scheduled to give the opening keynote today (23/3/2016) at the Making Open Source Software Conference, but sadly I had to cancel that due to yesterday's (22/3/2016) events at Brussels Airport. My flight to Bucharest was obviously cancelled.

I'm scheduled to open the 2nd day of the upcoming Devopsdays London edition.

And Bernd Erk tricked me into giving a follow-up to my popular 7 tools for your devops stack talk, aptly titled Another 7 tools for your devops stack.

Jan 07 10:48

Bimodal IT, redefined

There's been a lot of discussion about the silliness of the term BiModal IT, aka the next big thing for IT organisations that don't dare to change but still want to sound cool.

So here is my idea to reuse that term for something relevant.

BiModal IT is the idea where you take a fully automated infrastructure that has been built on the principles of Infrastructure as Code. For the majority of your services, that infrastructure gets periodic idempotent updates (e.g. every 15 or 30 minutes, or when you orchestrate it) and consistency checks, and its source code is versioned, tested and delivered through a traditional Continuous Delivery pipeline. For the other services, the ones that really change fast or in a truly elastic way, you add realtime reconfiguration capabilities based on service discovery, using tools like Consul, consul-template, etcd, etc.

That way you have 2 modes of managing your infrastructure, aka BiModal
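
The two modes can be sketched roughly as follows. This is a minimal Python illustration of the idea only, not how Consul, consul-template or any configuration management tool is implemented; the function names `converge` and `render_backends`, the settings and the addresses are all hypothetical.

```python
"""Sketch of two modes of managing infrastructure (all names are hypothetical)."""


def converge(desired: dict, actual: dict) -> dict:
    """Mode 1: periodic idempotent run.

    Returns only the settings that differ from the desired state, so a second
    run against an already converged system has nothing left to change.
    """
    return {key: value for key, value in desired.items() if actual.get(key) != value}


def render_backends(instances: list) -> str:
    """Mode 2: realtime reconfiguration from service discovery.

    Renders a load balancer backend list from the instances that the discovery
    service currently reports. Sorting keeps the output stable between renders.
    """
    lines = [f"server {host}:{port}" for host, port in sorted(instances)]
    return "\n".join(lines) + "\n"


# Mode 1: versioned desired state, applied every 15 or 30 minutes.
desired = {"ntp_server": "ntp.example.com", "ssh_root_login": "no"}
actual = {"ntp_server": "ntp.example.com", "ssh_root_login": "yes"}
changes = converge(desired, actual)
actual.update(changes)
second_run = converge(desired, actual)  # already converged, so empty

# Mode 2: re-render as soon as the set of discovered instances changes.
backends = render_backends([("10.0.0.2", 8080), ("10.0.0.1", 8080)])
```

In this sketch the slow-moving settings go through the first function on a schedule, while the fast-moving list of backends is re-rendered whenever discovery reports a change.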

Jul 28 08:35

The power of packaging software, package all the things

Software delivery is hard. Plenty of people all over this planet are struggling with delivering software in their own controlled environment. They have invented great patterns that will build an artifact, then do some magic, and the application is up and running.

When talking about continuous delivery, people invariably discuss their delivery pipeline and the different components that need to be in that pipeline.
Often, the focus on getting the application deployed or upgraded from that pipeline is so strong that teams forget how to deploy their environment from scratch.

After running a number of tests on the code, and compiling it where needed, people want to move forward quickly and deploy their release artifact on an actual platform.
This deployment is typically done via a file upload, or a checkout from a source-control tool on the dedicated computer on which the application resides.
Sometimes, dedicated tools are integrated to simulate what a developer would do manually on a computer to get the application running: copy three files left, one right, and make sure you restart the service. Although this is obviously already a large improvement over people manually pasting commands from a 42-page runbook, it doesn’t solve all problems.

Take the guy who quickly makes a change on the production server and never commits it (say goodbye to git pull for your upgrade process).
If you package your software, there are a couple of things you get for free from your packaging system.
Questions like "has this file been modified since I deployed it?", "where did this file come from?", "when was it deployed?" and "what version of software X do I have running on all my servers?" are easily answered by the same tools we already use for every other package on the system. Not only can you use existing tools, you are also using tools that are well known by your ops team and that they already use for every other piece of software on your system.
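
To make the first of those questions concrete: package managers record a checksum for every file they install and can compare it later (for example `rpm -V` to verify a package, or `rpm -qf` and `dpkg -S` to find out which package owns a file). The Python sketch below only illustrates that idea with a hypothetical in-memory manifest; it is not how rpm or dpkg are implemented, and the file names are made up.

```python
"""Illustration of package-style file verification (manifest format is hypothetical)."""
import hashlib
import tempfile
from pathlib import Path


def checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def record_manifest(files: list) -> dict:
    """At deploy time: remember the checksum of every deployed file."""
    return {str(path): checksum(path) for path in files}


def modified_files(manifest: dict) -> list:
    """Later: list the files that are missing or no longer match the manifest."""
    return sorted(
        name for name, digest in manifest.items()
        if not Path(name).exists() or checksum(Path(name)) != digest
    )


with tempfile.TemporaryDirectory() as workdir:
    config = Path(workdir) / "app.conf"
    binary = Path(workdir) / "app.bin"
    config.write_text("workers = 4\n")
    binary.write_text("pretend this is the application\n")

    manifest = record_manifest([config, binary])
    clean = modified_files(manifest)  # nothing has been touched yet

    # Someone quickly "fixes" production by hand.
    config.write_text("workers = 16\n")
    dirty = modified_files(manifest)
    dirty_names = [Path(name).name for name in dirty]
```

With real packages you get this bookkeeping for free, which is the point: nobody has to write or maintain a manifest like the one above.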

If your build process creates a package and uploads it to a package repository that is available to the hosts in the environment you want to deploy to, there is no longer any need for a script that copies the artifact from a 3rd party location. There is even less need for that 42-page text document which never gets updated and still tells you to download yaja.3.1.9.war from a location where you can only find 3.2 and 3.1.8, while the developer who knows whether you can use 3.2, or why 3.1.9 got removed, just left for the long weekend.

Another, and maybe even more important, thing is the sadly growing practice of having yet another tool in place that translates that 42-page text document into a bunch of shell scripts created from a drag-and-drop interface. Typically that "deploy tool" is even triggered from within the pipeline. Apart from the fact that it usually stimulates a pattern of non-reusable code, distributes even more ssh keys, or adds yet another agent on all systems, it doesn’t take into account that you want to think of your servers as cattle and be able to deploy new instances of your application fast.
Do you really want to deploy your five new nodes on AWS with a full Apache stack ready for production, then reconfigure your load balancers, only to figure out that someone needs to go click in your continuous integration or deployment tool to deploy the application to the new hosts? That one manual action someone forgets?
Imvho, deployment tools are a phase in the maturity process of a product team. Yes, it's a step up from manually deploying software, but it creates more and other problems. Once your team grows in maturity, refactoring out that tool is trivial.

The obvious and trivial approach to this problem, which comes with even more benefits, is called packaging. When you package your artifacts as operating system packages (e.g., .deb or .rpm), you can include that package in the list of packages to be deployed at installation time (via Kickstart or debootstrap). Similarly, when your configuration management tool (e.g., Puppet or Chef) provisions the computer, you can specify which version of the application you want to have deployed by default.

So, when you’re designing how you want to deploy your application, think about both deploying new instances and deploying to existing setups (or rather, upgrading your application).
Doing so will make life so much easier when you want to deploy a new batch of servers.

May 11 20:06

On the importance of idempotence.

A couple of months ago we were seeing weird behaviour, with Consul not knowing all its members, at a customer where we had deployed Consul for service registration as a POC.
For the first couple of weeks we hadn't noticed any difficulties, but after a while we had the impression that the number of nodes in the cluster wasn't stable.

Obviously the first thought is that such a new tool probably isn't stable enough, so this is expected behaviour, but rest assured that was not the case.

We set out to frequently monitor the number of nodes with a simple cron job to create a graph.

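  # Sample the Consul member count and send it to Graphite (plaintext protocol).
  NOW=$(date +%s)
  HOST=$(hostname -f)
  # Note: the count includes the header line that `consul members` prints.
  MEMBERS=$(/usr/local/bin/consul members | wc -l)
  # Assumption: the original piped into "graphite 2003"; netcat to a host named "graphite" is the likely intent.
  echo "consul_members.$HOST $MEMBERS $NOW" | nc graphite 2003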

It didn't take us very long to see that the number of members in the cluster indeed wasn't stable: frequently there were fewer nodes in the cluster, then slowly the expected number of nodes came back on our graph.

Some digging taught us that the changes in the number of nodes were in sync with our Puppet runs.
But we weren't reconfiguring Consul anymore; there were no changes in the configuration of our nodes.
Yet Puppet triggered a restart of Consul on every run. The restart happened because Puppet knew it had rewritten the Consul config file.
Which was weird, as the values in that file were the same.

On closer inspection we noticed that the values in the file didn't change; however, the order of the values in the file did. From a functional point of view that did not introduce any changes, but Puppet rightfully assumed the configuration file had changed and thus dutifully restarted the service.

The actual problem lay in the implementation of the writing of the config file, which was in JSON.
The ancient Ruby library just took the hash and wrote it in no specific order, each time potentially resulting in a file with the content in a different order.

A bug fix to the Puppet module made sure that the hash was written out in a sorted way, so each run results in the same file being generated.

After that bugfix our graph of the number of nodes in the cluster obviously flatlined, as restarts were no longer being introduced.

This is yet another example of the importance of idempotence. When we trigger a configuration run, we want to be absolutely sure that it won't change the state of the system if the system is already in the state we want.
Rewriting the config file should only happen if it gets new content.
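
The principle can be illustrated in a few lines. This is a Python sketch of the idea, not the actual patch to the Puppet module (which is Ruby); `render`, `write_config` and the sample settings are hypothetical. It serialises the settings with sorted keys and only touches the file, and thus only asks for a restart, when the content really differs.

```python
"""Sketch: deterministic config rendering plus write-only-on-change."""
import json
import tempfile
from pathlib import Path


def render(settings: dict) -> str:
    """Serialise settings with sorted keys so equal data gives equal text."""
    return json.dumps(settings, sort_keys=True, indent=2) + "\n"


def write_config(path: Path, settings: dict) -> bool:
    """Write the config file only if its content would change.

    Returns True when the file was (re)written, which is the only case in
    which the service would need a restart.
    """
    new_content = render(settings)
    if path.exists() and path.read_text() == new_content:
        return False
    path.write_text(new_content)
    return True


with tempfile.TemporaryDirectory() as workdir:
    config_file = Path(workdir) / "config.json"

    first = write_config(config_file, {"server": False, "datacenter": "dc1"})
    # Same values, different key order: must not count as a change.
    second = write_config(config_file, {"datacenter": "dc1", "server": False})
    # A real change does rewrite the file.
    third = write_config(config_file, {"datacenter": "dc2", "server": False})
```

Here only the first and third calls report a change; the second call, with the same values in a different order, leaves the file alone.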

The yak is shaved .. and sometimes it's not a funky DNS problem but just a legacy Ruby library one ..

May 03 2015

What done REALLY looks like in devops

Steve Ropa blogged about What done looks like in devops. I must say I respectfully, but fully, disagree with Steve here.

For those of you who remember, I gave an Ignite about my views on the use of the Definition of Done back at #devopsdays 2013 in Amsterdam.

In the early days we talked about the #devops movement partly being a reaction against the late Friday night deployments where the ops people got a tarball with some minimalistic notes and were supposed to put stuff in production. The work of the development team was Done, but the operations team's work had just started.

Things have improved .. as Steve mentions, for a lot of teams done now means that their software is deployable, that we have metrics from it, and that we can monitor the application.

But let's face it .. even if all of that is in place there is still going to be maintenance, security fixes, major stack upgrades and minor application changes. We all still need to keep the delivery pipelines running.

A security patch on an application stack means that both the ops people and the developers need to figure out the required changes together.

Building and delivering value to your end users is something that never ends; we are never actually done.

So let me repeat:

"Done is when your last end user is in his grave"
In other words, when the application is decommissioned.

And that is the shared responsibility mindset devops really brings: everybody, both developers and operations people, cares about the value they are bringing to their customers and thinks about keeping the application running, instead of assuming that we are done because a list of requirements has been validated at the end of a sprint. Because we never are...

BTW, here are my original slides for that #devopsdays Amsterdam talk.

Feb 09 2015

2014 vs 2015 interest in Open Source Configuration Management

A couple of people asked me for the results of the survey of the 2015 vs 2014 Configuration Management Camp room interests.

This is a bunch of 350 people last year and 420 people this year telling us what tools they are interested in, so we can map the right room sizes to the communities.

2014: [chart of room interest]

Enjoy .. but remember there's Lies, Damn Lies and Statistics ..
PS. This is a mostly European audience.

Sep 20 2014

On Systemd and devops

If it's not broken, don't fix it.
Those who don't understand Unix are doomed to reinvent it, poorly.
Complexity is the enemy of reliability.

These are some of the more frequently heard arguments in the systemd discussion. Indeed, I see and hear a lot of senior Linux people react openly, and probably way too late, against the introduction of systemd in a lot of our favorite Linux distributions.

To me this is a typical example of the devops gap: the gap between developers writing code and operations needing to manage that code on production platforms at scale.
Often developers write code that they think is useful and relevant while not listening to their target audience: in this case not the end users of the systems, but the people who maintain the platforms, the people who work with these tools on a daily basis.

I have had numerous conversations with people in favor of and against systemd; to this day I have not found a single general-purpose use case that could convince me of the relevance of this large change in our platforms. I've found edge cases where it might be relevant, but not mainstream ones. I've also seen many more people against it than in favor. I've invited speakers to conferences to come and teach me. I've probably spoken to the wrong people.

But this is not supposed to be yet another systemd rant .. I want to tackle a bigger problem: this change and some others have been forced upon us by distributions that should be open and listen to their users. Apparently both Debian and Fedora/RHEL largely failed to listen to their respective communities. Yes, we know that e.g. Fedora is the development platform and acts as a preview of what might come up in RHEL, and thus CentOS, later, but not everything eventually ends up in RHEL. So it's not like we didn't have an 'acceptance' platform where we could play with the new technology. The main problem here is that we had no simple way to stop the pipeline; it really feels like that long-ago Friday evening rush deploy, not like a good conversation between developers and actual ops on the benefits and problems of implementing these changes. This feels like the developers of the distributions deciding what goes in from their own little silo and voting in a 'private' committee.

It also feels like the ops people were too busy to react: "Someone else will respond to this change. It's obvious that this change is wrong; someone else will block this for sure."

And then there is the fact that operating system developers, like our Fedora and Debian friends, indeed kinda live in their own silo (specifically not listing CentOS here ..).

So my bigger question is .. how do we prevent this from happening again? How do we make sure that distributions actually listen to their core users and not just to the distribution developers?

Rest assured, systemd is not the only case with this problem .. there are plenty of cases where features that were used by people, sometimes even something people considered the core feature of a project, got changed or even ripped out by the developers because they didn't realize the features were being used, sometimes almost killing that Open Source project by accident.
And I don't want that to happen to some of my favourite open source projects ..

Aug 12 2014

Upcoming Conferences

After not being able to give my planned Ignite at #devopsdays Amsterdam because I was down with the flu, here are some fresh opportunities to listen to my rants :)

In September I'll be talking at PuppetConf 2014 in San Francisco, USA, about some of the horror stories we went through over the past couple of years when deploying infrastructure in an automated fashion.

Just one week later I'll be opening the #devops track at DrupalCon Amsterdam together with @cyberswat (Kevin Bridges), where we'll talk about the current state of #drupal and #devops. We'll be reopening the #drupal and #devops survey shortly; more info about that later here ..

Just a couple of weeks later I will be ranting about packaging software on Linux at LinuxCon Europe in Düsseldorf, Germany.

And in November, I'm headed to Nuremberg, Germany, where I will be opening the Open Source Monitoring Conference, pondering the current state of Open Source monitoring: do we love it .. or does it still suck :)

That's all ..
for now ..