hpc

Nov 10 2007

Scaling Drupal

John Quinn writes about Scaling Drupal he is taking a one step at a time approach and is still writing his 4th and 5 stages.

His first step obviously is separating the drupal from a separate database server, and he chooses mysql for this purpose, moving your DB to a different machine is a good thing to do.

However then he gets this crazy idea of using NFS to share his his drupal shared files :(
(he even dares to mention that the setup ease is good) Folks, we abandonned NFS in the late nineties. NFS is still a recipe for disaster, it has performance issues , it as stability issues (stale locks), and no security admin in his right mind will tolerate portmap to be running in his DMZ.
(Also think about the IO path that one has to follow to serve a static file to a surfer when the file is stored on a remote NFS volume)

On top of that he adds complexity in a phase where it isn't needed yet. Because of the fact he needs to manage and secure NFS and he is storing his critical files on the other side of the ethernet cable he did create a single point of failure he didn't need creating yet.
Yes as soon as you start to scale you need to look at a scalable and redundant way to share your files.
When those files are pretty static you'll start out with a set of rsync scripts or scripts that push them to different servers upon deploying your application. When they are changing often you start looking into filesystems or block devices that bring you replication, such as DRBD or Lustre
But if today his NFS server goes down he is screwed, much harder than when his database has a hickup.

One could discuss the order of scaling, but adding more webservers might not always be the first step to take, one might want to tackle the database first depending on the application.
He decides to share the load of his application over multiple Drupal instances using apache mod_proxy , then adds Linux-HA to make it highly available.
I`m interested in knowing why he chose for apache mod_proxy and not for LVS

Although using NFS for me belongs in a How NOT to scale tutorial, his other steps give you a good idea of the steps to take.

I`m looking forward to his next steps :) I hope that in part 4 he also removes NFS in favour of a solution with no performance and locking issues that really takes away a big fat single point of failure. In part 5 he discusses how to scale your database environment. The actual order of implementing step 2 and 5 will be different for each setup.

Anyway.. I`m following up on his next steps.. interesting reading