Knee-deep

Posted 11 months, 3 weeks ago at 11:32. 5 comments

I’m in the process of wading into the Mozilla Buildbot story. It’s a little murky, and things seem to be trying to grab my ankles, but otherwise the water looks strangely familiar so far.

Tinderbox is a bit of a beast, have no doubt. Much of the base code remains unchanged since I first interacted with it in 1998, but we’ve made some key improvements “recently.” Things like automatic config updating from CVS and CLOBBER support have reduced our Tinderbox maintenance substantially, and need to find their way into our Buildbot setup ASAP (bugs are on file for as much). Buildbot allows us to do some things in a much smarter way than Tinderbox, but our current Buildbot setup seems like a bit of a step backwards to me, maintenance-wise.

I’m happy to bring my experiences with Tinderbox to bear on the problem. Hopefully we can apply those hard-learned lessons and quickly get to a better maintenance state.

Current Tunes: The Orb - DJ Asylum (7" Edit) | Filed under Build/Release, Mozilla, QA

5 Replies

  1. For better or worse, I think this is the result of the way we “chose” Buildbot as a continuous integration tool (which is to say, a bunch of people said “Hey, let’s switch to Buildbot!” but then didn’t really take the time to understand the intricacies of fully replacing Tinderbox, so it wasn’t really ever a solidly-made choice), and then started using it for a bunch of things, and suddenly, we’re relying on it in production.

    Because the decision wasn’t coherently made, the rollout stage was haphazard, and conducted at different times, and so knowledge learned from setting up certain types of Buildbots wasn’t available to others setting up other classes of Buildbots. And then you got a proliferation of different configurations and setups… and as you point out, it’s sort of a mess and it’s missing some pretty core functionality that… well… turns out to be pretty important in a production environment.

    Don’t get me wrong… I think Buildbot is a lot better than Tinderbox… I mean, come on… any tool that reliably captures STDERR has to be… but I wish we had been more… what’s the word I’m looking for… coherent? Measured? Predictable? about it… and less “Let’s replace this [admittedly crappy] tool that’s got a decade of history in the form of bugfixes with this other tool that is shiny and new.”

    Having said that, when the problems do finally get addressed, I think there will be a LOT less hand holding than Tinderbox required.

  2. rhelmer Aug 29th 2007

    Preed, what you are saying would make sense if we were doing a wholesale replacement of Tinderbox and had to have all of these issues fixed up-front. What’s happening is an incremental shift to using Buildbot where appropriate. We’re obviously still using Tinderbox server and client, precisely because of this type of issue.

    I am not sure what “the decision wasn’t coherently made” is supposed to mean, because decisions are still actively being made and problems are being fixed as they come up.

  3. Things are grabbing your ankles, or you are grabbing your ankles? :)

  4. rhelmer: What you say is true, but we’ve gotten to the point now where Buildbot works well enough that the maintenance is killing us (well, robcee, but I’m not anxious to join his funeral march) because those parts have been neglected.

    My lament here is that people seem to have discounted everything to do with Tinderbox in the move to Buildbot (thrown out the baby with the bathwater, if you will), and now we’re playing catch-up with functionality. We’re just getting scripts and configs into CVS now, e.g., which is completely backwards in my book, especially after the config horror story that was Tinderbox.

    kev: The end result is the same either way.

  5. I thought this was going to be about some Indian guy named Kneedeep. =(