Mar 08 2012

Devops and Drupal, the Survey, the Results

I've just finished presenting the results of our Drupal and Devops survey at the Belgian Drupal User Group meetup at our office

and I've uploaded the slides to slideshare for the rest of the world to cry read.

Honestly I was hoping for the audience to prove me wrong and I was expecting all of them to claim they were doing automated and repeatable deployments.

But there's hope...

Feb 15 2012

Logstash and ElasticSearch

"An expert is a man who has made all the mistakes which can be made, in a narrow field."
Niels Bohr

When I setup Logstash for the very first time I got bitten by an empty search, aparently no logs were indexed. Reading the log files indeed told me about

  1. WARN: org.elasticsearch.discovery.zen.ping.unicast: [Blaire, Allison] failed to send ping to [[#zen_unicast_1#][inet[/127.0.0.1:9300]]]
  2. INFO | jvm 1 | 2012/02/06 22:45:55 | org.elasticsearch.transport.RemoteTransportException: [Page, Karen][inet[/127 .0.0.1:9300]][discovery/zen/unicast]
  3. INFO | jvm 1 | 2012/02/06 22:45:55 | Caused by: java.io.EOFException

The above is the typical error when the ElasticSearch version you are using externally is not in sync with the one Logstash is using, yes those versions need to match.

Fast forward a couple of weeks.. and I`m upgrading Logstash and therefore also ElasticSearch .. I have a Vagrant setup to play with so all of the components are running on 1 node.

I kept running into a similar problem, this time however I saw log entries being indexed, I could get data from my ElasticSearch setup using
wget -q -S -O - http://localhost:9200/_status?pretty=true

But the web interface kept showing no results ;(

While nagging about it on irc .. Jordan gave me the insight :

2012-01-31.194347+0100CET.txt:(07:55:36 PM) whack: slight caveat that elasticsearch clients also join the cluster, so if you point everyone at 127.0.0.1:9300, that :9300 could be one of your clients, not the server

Indeed when you by accident start any of the logstash instances (server/shipper/web) before you start your ElasticSearch instance you can be in trouble.
Ordering really matters , you really need to start ElasticSearch before you start the clients.

Obviously is you don't use the unicast setup you don't run into this problem ..

So what other mistakes should I make ?

Feb 11 2012

The ultimate 2012 open source and devops conference

Kent Skaar pinged me last week , asking for feedback on Lisa'11 and input for Lisa 2012.

Thought I should share my advise to him with the rest of the world

So If I were to host an event similar to Lisa I'd had either
Jordan Sissel or Mitchell Hashimoto give the keynote because over the past 24 months those people have written more relevant tools for me than anyone else :)

I'd have someone talk about Kanban for Operations, There's 2 names that pop up Dominica DeGrandis and Mattias Skarin

I'd have the Ubuntu folks talk about JuJu and I'd have RI Pienaar talk about MCollective .. while you have RI have him talk about Hiera too. Have Dean Wilson carry RI's bags and put him unknowingly on a panel. (Masquerade it as a Pub with hidden cameras)

Obviously as #monitoringsucks you want to hear about new monitoring tools initiatives and how people are dealing with them , so you want people talking about Graphite, Collectd, Statsd, Sensu , Icinga-MQ And how people are reviving Ganglia and using that in large scale environments.

You want someone to demistify Queues, I mean .. who still knows about the differences between Active, Rabbit , Zero, Hornet and many other Q's ?

You want people talking about how they deal with logs, so talks about Logstash and Graylog2.

You want to cover Test Driven Infrastructure How do you test your infrastructure , someone to demystify Cucumber and Webrat , and talk about testing Charms, Modules, and Cookbooks.

Oh and Filesystems , distributed ones the Ceph, FraunhoverFS, Moose, KosmosFS, Glusters, Swifts of this world ... you want people to talk about their experiences , good and bad with any of the above, someone who can actually compare those rather than heresay stuff. :) With recent updates on what's going on in these projects.

Now someone please organise this for me :) In a warm and sunny place ... preferably with 27 holes next door , and daycare for my kids :)

PS. Yes the absence of any openstack related topic is on purpose .. that's for 2013 :)

Feb 10 2012

We didn't fix it

MonitoringSucks and we didn't fix it.

Earlier this week Inuits hosted a 2 day hackfest titled #MonitoringSucks. A good number of people with a variety of backgrounds showed up on monday morning. I don't know why but people had high expectations for this event , did they really expect us to fix the #monitoringsucks problem in a mere 2 days ?

Next to myselve we had Patrick Debois , Grégory Karékinian, Stefan Jourdan, Colin Humphreys, Andrew Crump, Ohad Levy , Frank Marien, Toshaan Bharvani, Devdas Bhagat, Maciej Pasternacki Axel Beckert Jelle Smet, Noa Resare @blippie , John John Tedro @udoprog, Christian Trabold @ctrabold and obviously some people I missed


A good mixture of Fosdem visitors that stayed a litte longer in our cold country and locals with ideas. We had people from TomTom, RedHat , Spotify, Booking.com, Inuits, Atlassian, coming from Belgium, The Netherlands France, Israel, the UK, Sweden, Germany, Poland and Switzerland if I`m not mistaken.

The format was pretty open, much of the first day was spend around the drawingboard.

people around the drawingboard
(Ohad Levy, Jelle Smet, PatrickDebois and Frank Marien) discussing a variety of topics

This monitoring topic is complex, there are different areas that need to be covered. The drawing below documents how we splitted the problem into different areas , and listed the different tools people use for these areas.

  • Collection: Collectd, Nagios, Ganglia
  • Transport: XMPP, Smiple, Smtp, 0mq , APMQ, rsyslog, irc, stomp
  • Storage : rrd, graphite, opentsdb, hbase,
  • Filtering: logstash, esper,
  • Visualisation : Graphite,
  • Notifcation: PagerDuty
  • Reporting: Jasper

Obviously above list is far from complete.




The afternoon discussion continued where we left of before lunch, just after the powercut. Only now we started refocussing on filtering and aggregating values using Logstash
@patrickdebois had been talking about the idea to use Logstash as a way to collect data , transform it and throw it either to another tool, or onto a Queue before.
Looking at Logstash it makes kind of sense. Logstash already has a zillion of input types, filters and outputs. Including popular queues such as amqp and zeromq. Yes, the default behaviour for a lot of people is to get data from different inputs, filter it and then send it to ElasticSearch, but much more is possible with the available outputs.






It was only on tuesday that people really started writing code
So what did really come out of the #monitoringsucks hackfest. ?

A couple of people were working on packaging existing tools for their favourite distro. Others were working on integrating a number of other already existing tools (e.g Patrick working on more inputs for Logstash., me working on replacing logster with Logstash, setting up Kibana etc. New tools were learned, items were added to todolists (Kibana, (doesn't work on older Firefox instances) Tattle, statsd) and items were scratched from todolists (Graylog2 (Kibana replaces that as a good Frontend for Logstash) )

A lot of experiences with different tools were exchanged

Frank Marien showed us a demo of his freshly release ExtremeMon framework. A really promising project.

The sad part about a workshop like this one is that you enter with a bunch of ideas , and leave with even more ideas, hence more work. We haven't solved the problem yet, but a lot of more people are now thiking about the problem and how to solve it a more modulare (unix style) approach. With different litte tools, all being good at something and all being interconnectable.

Feb 01 2012

#monitoringsucks hackathon 6&7 february Practical details:

As announced earlier next monday and tuesday we're opening up the Inuits offices for everybody working on monitoring problems.

There's already a good number of people that have confirmed their presence and some people have asked

As for practical details .. the plan is simple.
I`m going to be at the place somewhere between 8:30 and 9:00 on monday. ( Hey .. it's the day after Fosdem you know :))

The only thing I've planned is to do a get to know eachother round around 10:30 after that I`m expecting the hackathon to be self organising,

There will be water, coffee , etc , IP connectivity, and electricity.

The location is still Duboisstraat 50, Antwerp

Free parking is on the Hardenvoort or Kempenstraat ( 3minutes walk) , paid parking right in front of the door.

Jan 03 2012

Graphite, JMXTrans, Ganglia, Logster, Collectd, say what ?

Given that @patrickdebois is working on improving data collection I thought it would be a good idea to describe the setup I currently have hacked together.

(Something which can be used as a starting point to improve stuff, and I have to write documentation anyhow)

I currently have 3 sources , and one target, which will eventually expand to at least another target and most probably more sources too.

The 3 sources are basically typical system data which I collect using collectd, However I`m using collectd-carbon from https://github.com/indygreg/collectd-carbon.git to send data to Graphite.

I`m parsing the Apache and Tomcat logfiles with logster , currently sending them only to Graphite, but logster has an option to send them to Ganglia too.

And I`m using JMXTrans to collect JMX data from Java apps that have this data exposed and send it to Graphite. (JMXTrans also comes with a Ganglia target option)

Rather than going in depth over the config it's probably easier to point to a Vagrant box I build https://github.com/KrisBuytaert/vagrant-graphite which brings up a machine that does pretty much all of this on localhost.

Obviously it's still a work in progress and lots of classes will need to be parametrized and cleaned up. But it's a working setup, and not just on my machine ..

Jan 03 2012

#monitoringsucks and we'll fix it !

If you are hacking on monitoring solutions, and want to talk to your peers solving the problem
Block the monday and tuesday after fosdem in your calendar !

That's right on february 6 and 7 a bunch of people interrested to fix the problem will be meeting , discussing and hacking stuff together in Antwerp

In short a #monitoringsucks hackathon

Inuits is opening up their offices for everybody who wants to join the effort Please let us (@KrisBuytaert and @patrickdebois) know if you want to join us in Antwerp

Obviously if you can't make it to Antwerp you can join the effort on ##monitoringsucks on Freenode or on Twitter.

The location will be Duboistraat 50 , Antwerp
It is about 10 minutes walk from the Antwerp Central Trainstation
Depending on Traffic Antwerp is about half an hour north of Brussels and there are hotels at walking distance from the venue.

Plenty of parking space is available on the other side of the Park

Dec 31 2011

What is devops ?

I`m parsing the responses of the Deploying Drupal survey I started a couple of months ago (more on that later)

One of the questions in the survey is "What is devops" , apparently when you ask a zillion people (ok ok, just a large bunch of Tweeps..), you get a large amount of different answers ranging from totally wrong to spot on.

So let's go over them and see what we can learn from them ..

The most Wrong definition one can give is probably :

  • A buzzword

I think we've long passed the buzzword phase, definitely since it's not new, it's a new term we put to an existing practice. A new term that gives a lot of people that were already doing devops , a common word to dicuss about it. Also lots of people still seem to think that devops is a specific role, a job description , that it points to a specific group of people doing a certain job, it's not . Yes you'll see a lot of organisations looing for devops people, and giving them a devops job title. But it's kinda hard to be the only one doing devops in an organisation.

I described one of my current roles as Devops Kickstarter, it pretty much describes what I`m doing and it does contain devops :)

But devops also isn't

  • The connection between operations and development.
  • people that keep it running
  • crazy little fellows who find beauty in black/white letters( aka code) rather than a view like that of Taj in a full moon light.
  • the combination of developer and operations into one overall functionality
  • The perfect mixture between a developer and a system engineer. Someone who can optimize and simplify certain flows that are required by developers and system engineers, but sometimes are just outside of the scope for both of them.
  • Proxy between developer and management
  • The people in charge of the build/release cycle and planning.
  • A creature, made from 8-bit cells, with the knowledge of a seasoned developer, the skillset of a trained systems engineer and the perseverence of a true hacker.
  • The people filling the gap between the developer world and the sysadmin world. They understand dev. issues and system issues as well. They use tools from both world to solve them.

Or

  • Developers looking at the operations of the company and how we can save the company time and money

And it's definitely not

  • Someone who mixes both a sysop and dev duties
  • developers who know how to deploy and manage sites, including content and configuration.
  • I believe there's a thin line line between Ops and Devs where we need to do parts of each others jobs (or at least try) to reach our common goal..
  • A developer that creates and maintains environments tools to help other developers be more successful in building and releasing new products
  • Developers who also do IT operations, or visa versa.
  • Software developers that support development teams and assist with infrastructure systems

So no, developers that take on systems roles next to their own role and want to go for NoOps isn't feasable at all ..you really want collaboration, you want people with different skillsets that (try to) understand eachoter and (try to) work together towards a common goal.

Devops is also not just infrastructure as code

  • Writing software to manage operations
  • system administrators with a development culture.
  • Bring code management to operations, automating system admin tasks.
  • The melding of the art of Systems Administration and the skill of development with a focus on automation. A side effect of devops is the tearing down of the virtual wall that has existed between SA's and developers.
  • Infrastructure as code.
  • Applying some of the development worlds techniques (eg source control, builds, testing etc) to the operations world.
  • Code for infrastructure

Sure infastructure as code is a big part of the Automation part listed in CAMS, but just because you are doing puppet/chef doesn't mean you are doing devops.
Devops is also not just continous delivery

  • A way to let operations deploy sites in regular intervals to enable developers to interact on the systems earlier and make deployments easier.
  • Devops is the process of how you go from development to release.

Obviously lots of people doing devops also often try to achieve Continuous delivery, but just like Infrastructure as Code it devops is not limited to that :)

But I guess the truth is somewhere in the definitions below ...

  • That sweet spot between "operating system" or platform stack and the application layer. It is wanting sys admins who are willing to go beyond the normal package installers, and developers who know how to make their platform hum with their application.
  • Breaking the wall between dev and ops in the same way agile breaks the wall between business and dev e.g. coming to terms with changing requirements, iterative cycles
  • Not being an arsehole!
  • Sysadmin best-practise, using configuration as code, and facilitating communication between sysadmins and developers, with each understanding and participating in the activities of the other.
  • Devops is both the process of developers and system operators working closer together, as well as people who know (or who have worked in) both development and system operations.
  • Culture collaboration, tool-chains
  • Removing barriers to communication and efficiency through shared vocabulary, ideals, and business objectives to to deliver value.
  • A set of principles and good practices to improve the interactions between Operations and Development.
  • Collaboration between developers and sysadmins to work towards more reliable platforms
  • Building a bridge between development and operations
  • The systematic process of building, deploying, managing, and using an application or group of applications such as a drupal site.
  • Devops is collaboration and Integration between Software Development and System Administration.
  • Devops is an emerging set of principles, methods and practices for communication, collaboration and integration between software development (application/software engineering) and IT operations (systems administration/infrastructure) professionals.[1] It has developed in response to the emerging understanding of the interdependence and importance of both the development and operations disciplines in meeting an organization's goal of rapidly producing software products and services.
  • bringing together technology (development) & content (management) closer together
  • Making developers and admins understand each other.
  • Communication between developers and systems folk.
  • a cultural movement to improve agility between dev and ops
  • The cultural extension of agile to bring operations into development teams.
  • Tight collaboration of developers, operations team (sys admins) and QA-team.

But I can only conclude that there is a huge amount of evangelisation that still needs to be done, Lots of people still don't understand what devops is , or have a totally different view on it.

A number of technology conferences are and have taken up devops as a part of their conference program, inviting experienced people from outside of their focus field to talk about how they improve the quality of life !

There is still a large number of devops related problems to solve, so that's what I`ll be doing in 2012

Dec 29 2011

Installing Vagrant, on Ubuntu Natty

(Warning some Ubuntu ranting ahead)

  1. apt-get install virtualbox-ose
  2. apt-get install rubygems
  3. gem install vagrant

That's what I assumed it would take me to install vagrant on a spare Ubuntu (Natty) laptop.

Well it's not. after that I was greeted with some weirdness.

  1. $vagrant
  2. vagrant: command not found...

Yet gem list --local showed the vagrant gem installed.

  1. $ruby
  2. ruby: command not found

I looked twice, checked again and indeed it seems you can install rubygems on natty with no ruby installed #dazedandconfused

So unlike other distro's on Ubuntu doesn't add the rubygems binary path to it's default path
After adding that to my .bashrc things started working better.

The active reader has noticed that by now half of the Twittersphere was pointing me to the already implemented
above solution and the other half was telling me to not install rubygems using apt-get, or to use rvm for all my rubygem troubles

Apart from the point that if you need tools to like rvm to fix things that are fundamentally broken, the fact is that joe average java developer doens't want to be bothered with RubyGem hell , he just wants to do apt-get install Vagrant and get on with his real work, and that's exactly what I'd expect from Linux for human beings

I'd expect any junior guy to be able to go to vagrantup.com read the 4 commands on the main page and be up and running
Coz that's how it works on my Bleeding Edge Enterprise Development Distro, the one I usually would not advise those people (and my mother) to use.

Dec 19 2011

How I like my Java

This is a repost of my article earlier posted at Jordan Sissel's awesome SysAdvent

After years of working in Java-based environments, there are a number of things that I like to implement together with the teams I`m working with - the application doesn't matter much, whether it's plain java, Tomcat, JBoss, etc, these deployment strategies will help your ops and dev teams build more managable services.

Packaging

The first step is to have the native operating system packages as build artifacts rolling out of your continuous integration server - No .ear, .war or .jar files: I want to have rpms or debs. With things like fpm or the maven rpm plugin this should not be an extra hassle, and the advantages you get from doing this are priceless.

What advantages? Most native package systems support dependency resolution, file verification, and upgrades (or downgrades). These are things you would have to implement yourself or cobble together from multiple tools. As a bonus, your fellow sysadmins are likely already comfortable with the native package tool used on your systems, so why not do it?

Proxied, not running as root

Shaken, not stirred

Just like any other daemon, for security reasons, I prefer to run run Tomcat or JBoss as its own user, rather than as root. In most cases, however, only root can bind to ports below 1024, so you need to put a proxy in front. This is a convenient requirement because proxying (with something like Apache) can be used to terminate SSL connections, give improved logging (access logs, etc), and provides the ability to run multiple java application server instances on the same infrastructure.

Service Management

Lots of Java application servers have a semi functional shell script that allows you to start the service. Often, these services don't daemonize in a clean way, so that's why I prefer to use the Java Service wrapper from Tanuki to manage most Java based services. With a small config file, you get a clean way to stop and start java as a service and even the possibility to add some more monitoring to it.

However, there are some problems the Java Service wrapper leaves unsolved. For example, after launching the service, the wrapper can return back with a successful exit code while your service is not ready yet. The application server might be ready, but your applications themselves are still starting up. If you are monitoring these applications (e.g for High Availability), you really only want to treat them as 'active' when the application is ready, so you don't want your wrapper script to return, "OK," before the application has been deployed and ready. Otherwise, you end up with false positives or nodes that failover before the application has ever started. It's pretty easy to create a ping-pong service flapping scenario on a cluster this way.

One application per host

I prefer to deploy one application per host even though you can easily deploy multiple applications within a single Java VM. With one-per-host, management becomes much easier. Given the availability and popularity of good virtualization, the overhead of launching multiple Linux VM's for different applications is so low that there are more benefits than disadvantages.

Configuration

What about configuration of the application? Where should remote API urls, database settings, and other tunables go? A good approach is to create a standard location for all your applications, like /etc/$vendor/app/, where you place the appropriate configuration files. Volatile application configuration must be outside the artifact that comes out the build (.ear , .jar, .war, .rpm). The content of these files should be managed by a configuration management tool such as puppet, chef, or cfengine. The developers should be given a basic training so they can provide the systems team with the appropriate configuration templates.

Logs

Logs are pretty important too, and very easy to neglect. There are plenty of alternative tools around to log from a Java application: Log4j, Logback, etc .. Use them and make sure that they are configured to log to syslog, then they can be collected centrally and parsed by tools much easier than if they were spread all over the filesystem.

Monitoring

You also want your application to have some ways to monitor it besides just checking if it is running - it is usually insufficient to simply check if a tcp server is listening. A nice solution is to have a simple plain text page with a list of critical services and whether they are OK or not (true/false), for example:

  1. someService: true
  2. otherService: false

This benefits humans as well as machines. Tools like mon, heartbeat or loadbalancers can just grep for "false" in the file. If the file contains false, it reports a failure and fails over. This page should live on a standard location for all your applications, maybe a pattern like this http://host / servicename/health.html and an example "http://10.0.129.10:8080/mrs-controller/health.html". The page should be accessible as soon as the app is deployed.

This true/false health report should not be a static HTML file; it should be a dynamically generated page. Text means that you can also use curl, wget, or any command-line tool or browser to check the status of your service.

The 'health.html' page should report honestly about health, executing any code necessary to compute 'health' before yielding a result. For example, if your app is a simple calculator, it should verify health by doing tests internally like adding up some numbers before sharing 'myCalculator:true' in the health report.

The 'health.html' page should report honestly about health, executing any code necessary to compute 'health' before yielding a result. For example, if your app is a simple calculator, then before reporting health it should put two and two together and get four.

This kind of approach could also be used to provide you with metrics you can't learn from the JVM, such as number of concurrent users or other valid application metadata for measurement and trending purposes.

Conclusion

If you can't convince your developers, then maybe more data can help: Check out Martin Jackson's (presentation on java deployments) Automated Java Deployments with RPM

With good strategies in packaging, deployment, logging, and monitoring, you are in a good position to have an easily manageable, reproducible, and scalable environment. You'll give your developers the opportunity to focus on writing the application, they can use the same setup on their local development boxes (e.g. by using vagrant) as you are using on production.