Sunday, June 17, 2007

Nerd Food: Ubuntu in the Real World

After raving and ranting about Ubuntu so many times, I decided it was time to put it to the test in really demanding conditions. And there are no conditions more demanding than those set by children, in particular when they are nearing their teens. So it was that I installed Feisty on two machines and gave them to my nephews. After 24 hours, the experiment had already produced some interesting results.

The two machines in question are as follows:
  • Machine A is a 64-bit AMD with an NVidia graphics card and 512MB of RAM (less than 4 years old);
  • Machine B is a 32-bit AMD with an ATI graphics card and 512MB of RAM (around 6 years old).
The installation on both was pretty straightforward, with all hardware detected. Additional software was easily installed via Synaptic. After this easy start, however, the problems began to appear.

Problem one is related to the well-known issues with ATI cards. Enabling restricted drivers doesn't seem to do anything at all (it finds none suitable for my card), and enabling desktop effects without binary drivers results in some kind of video corruption (there's a 5 centimetre area on the right side of my screen that doesn't appear to be used). Even if I had managed to get the ATI card going with binary drivers - which I don't think I would have, as this particular card is really old and unsupported - I would still have had to set up XGL in order to use Compiz, which is really not something I think the average user should do. On the positive side, NVidia support is brilliant. All I had to do on machine A was to enable the restricted driver and enable desktop effects. Compiz was up and running in no time.

Problem two is related to Flash on 64-bit. There is no Firefox/Epiphany Flash plug-in available for 64-bit, a vital element of the browsing experience for any young kid these days. I could have gone and installed some less standard Flash support, but again, this is well beyond the call of duty for a normal user.

Problem three was DVD playback. I never quite got Totem-GStreamer playing encrypted DVDs. I'm not sure if it's me being thick and not understanding how to configure CSS decryption for GStreamer, but regardless of the underlying causes, my solution to this problem has always been to install Totem-Xine. In addition, I never quite got subtitles working with Totem-Xine, so I play all movies requiring subtitles from xine-ui (and here, for some reason, I never quite managed to increase the subtitles' font size, but at least they're there). None of these steps make sense to the average user.

At this point I had the machine sorted out, and ready for my two eager customers. The first one was impressed with her desktop for a few moments, until she realised I hadn't installed The Sims. Now, as far as games go, she isn't much of a gamer. In fact, in the last couple of years she has bought only two PC games: The Sims and its expansion pack. So yes, this was a vital requirement for the experiment.

Problem four: the dreaded Sims. The installer worked well enough - as well as one could expect even on a Windows box - up to the point when it finished CD 1. You are then expected to insert CD 2 and continue the installation. Alarm bells started ringing at this point. You see, UNIX and Linux have a very special relationship with devices. When you mount your CD and start running applications from it, it's nigh impossible to unmount it until you close all the open files you have on that device. In my particular case I had cd'd to /media/cdrom to run wine start.exe. This basically meant I could not unmount the CD-ROM properly until I exited from wine, but I couldn't really exit from wine until I got CD 2 mounted. I was about to restart the whole process when it occurred to me that even if I hadn't cd'd into that directory, I would still have start.exe as an open file (I needed to run it, of course), which meant that I would not be able to unmount the device. I'm not sure about this, but logic seems to imply that wine cannot cope with installing programs from two or more discs because of the underlying UNIX mentality. I hope some wine person will prove me wrong.

The temporary solution was to copy the contents of disc 1 to a local folder and execute start.exe from there. This worked a treat until I got to disc 2. Here, for some random reason, the installer refused to recognise the disc. Because the installer gives you absolutely no clues as to where it is looking for disc 2, I couldn't tell if there was something wrong with the CD or if it was just looking in the wrong place. After much fiddling it occurred to me that the installer was probably looking for disc 2 in the local directory. However, disc 2 had a similar structure to disc 1 (same setup directory), which meant I couldn't just copy it over disc 1. My final solution was to copy disc two's setup over as setup2 while installing disc one, and then, when it asked for disc 2, to move disc one's setup out of the way and rename setup2 to setup. This perverted experiment actually produced the expected results and the installation completed successfully.
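For the record, the workaround boils down to something like the following sketch. All paths and file names here are illustrative, not the actual Sims disc contents; on the real machine each disc was mounted at /media/cdrom in turn, simulated below with scratch directories:

```shell
# Simulate the two discs with scratch directories; on the real machine
# each would be the mounted CD. The trick: keep disc 2's setup
# directory under a different name until the installer asks for it.
INSTALL=$(mktemp -d)   # local install folder
DISC1=$(mktemp -d)     # stands in for disc 1's mount point
DISC2=$(mktemp -d)     # stands in for disc 2's mount point
mkdir -p "$DISC1/setup" "$DISC2/setup"
echo "disc 1 data" > "$DISC1/setup/data.cab"
echo "disc 2 data" > "$DISC2/setup/data.cab"

cp -r "$DISC1/." "$INSTALL/"            # copy disc 1 as-is
cp -r "$DISC2/setup" "$INSTALL/setup2"  # disc 2's setup, renamed

# (cd "$INSTALL" && wine start.exe)     # run the installer locally
# ...when the installer asks for disc 2, swap the directories:
mv "$INSTALL/setup" "$INSTALL/setup.disc1"
mv "$INSTALL/setup2" "$INSTALL/setup"
```

The key point is that the installer only ever sees local files, so the CD-ROM device is never busy when it's time to swap discs.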

However, the problems were far from over. After all that pain, the sims.exe binary simply refused to launch. It would start but then do nothing. The problem appears to be as reported in the wine DB here and here. There isn't much I can do until wine support for The Sims improves. As one would expect, this did not please my user at all.

Problem five: the crash. Just when I thought everything was quiet and everyone was happy enough, it all collapsed in front of me. The two kids were happily playing with the machine when the electricity suddenly went off. On reboot, they were stuck at the fsck screen, at which point they proceeded to randomly press keys and switch the computer on and off, probably several times, in a desperate attempt to fix it. Apparently this technique worked quite well with their previous box running Windows 95.

This, methinks, is one fundamental problem with the current Ubuntu boot. If some operation takes too long, you are thrown back into the console with huge amounts of text output. This is fine for a nerd, but an absolutely scary experience for any normal user. It would be much better to have some kind of user interface with a massive warning saying "DO NOT SWITCH THE COMPUTER OFF" or something like that. Text mode just scares people off, and when it is compounded with things like "running fsck" there's absolutely no hope of survival. The end result of all of this was that the machine was rendered unbootable by the time I got there. Yep, you read that right. The first time it went all the way to fsck again and froze; the second time it froze on GRUB, the exact same point it froze on for the subsequent twenty reboots.

So here I am at the local Internet cafe, downloading an Ubuntu amd64 ISO image, preparing myself to re-install it yet again. There are days like this...


Monday, June 04, 2007

Nerd Food: Merging and Branching Procedures

It seems version control is a popular topic again, thanks to the ever courteous Linus. If you haven't seen the talk he gave at Google, do watch it, as it's quite interesting. Linus, in his usual so-offensive-it's-funny style, criticises SVN to death. I've got to say that I quite like SVN, perhaps because I've been forced to use ClearCase, SourceSafe, RCS and CVS for far too long. My only complaint with SVN has always been the terrible merging, something that Linus rightly criticises in his talk. The good news is that it appears the most severe problems with merging will be fixed in the next SVN release.

Linus' talk did make me more aware of distributed version control though, but I'm not entirely convinced it would work in a commercial software house. After all, we already have a hard time with branches - let alone having multiple repositories...

All this talk about version control reminded me of a set of procedures for merging and branching I once wrote. I can't take all the credit, of course, since my good friends Kevin and Chris - the ClearCase genius - fixed some mistakes and added important bits. Here is the procedure, in the hope that someone else may find it not entirely without merit. Apologies for the (lack of) indentation.

Merging and Branching

1. Trunk (HEAD/LATEST/[insert version control term here]) is always compilable and stable.

2. When a new project is started, a set of branches are created:
2.1. Integration branch: off trunk;
2.2. Branch for each developer (or whatever the required granularity is, one branch per 2 developers, etc.): off the integration branch;
2.3. All branches should be named sensibly and follow a repository-wide convention (e.g. PRODUCT_TEAM_PROJECT or something equally meaningful).
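In SVN, step 2 might look something like this (the repository URL and branch names are hypothetical, made up purely to illustrate the convention in 2.3):

```shell
# Hypothetical repository and naming; branches in SVN are cheap copies.
REPO=http://svn.example.com/repo

# 2.1. Integration branch, off trunk:
svn copy "$REPO/trunk" \
         "$REPO/branches/ACME_BILLING_NEWREPORTS_INTEGRATION" \
         -m "Create integration branch for the new reports project"

# 2.2. One development branch per developer, off integration:
svn copy "$REPO/branches/ACME_BILLING_NEWREPORTS_INTEGRATION" \
         "$REPO/branches/ACME_BILLING_NEWREPORTS_DEV_JSMITH" \
         -m "Create development branch for jsmith"
```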

3. Each developer works on his/her own development branch. Developers must check in every day (PC never contains unsaved data), but are encouraged to do so more frequently.

4. When the developer is happy enough with his/her changes, he/she "rebases" (aka forward merges), that is:
4.1. Updates development branch to the current state of the integration branch (this should be done as often as possible anyways);
4.2. Ensures no one else is merging to the integration branch;
4.3. Tests development with the new changes (in theory runs his/her [J|N]Unit tests; in practice, well... :-));
4.4. "merges back", that is updates integration branch to the state of the development branch;
4.5. Features should be merged one-at-a-time, that is if a developer is working on 5 features for a given release, he/she should merge to integration each feature at a time, allowing other developers to pick and test each change rather than one huge patch.
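Again in SVN terms, the rebase / merge-back cycle of step 4 might look roughly like this (paths and revision numbers are hypothetical; this is SVN 1.4-era usage, where you have to track the last merged revision yourself):

```shell
REPO=http://svn.example.com/repo
INTEGRATION="$REPO/branches/ACME_BILLING_NEWREPORTS_INTEGRATION"
DEV="$REPO/branches/ACME_BILLING_NEWREPORTS_DEV_JSMITH"

# 4.1. Rebase: bring integration changes into the dev working copy
cd ~/work/dev
svn merge -r 1200:HEAD "$INTEGRATION" .   # 1200 = last rebase point
svn commit -m "Rebase from integration (r1200:HEAD)"

# 4.3. Run the tests... then:
# 4.4. Merge back: bring the dev branch into an integration working copy
cd ~/work/integration
svn merge -r 1200:HEAD "$DEV" .
svn commit -m "Merge back feature X from jsmith's dev branch"
```

This hand-tracking of revision ranges is precisely the merging pain mentioned above; SVN's upcoming merge tracking should automate it.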

5. While the project is in development, the integration branch may be rebased from the trunk, but never the opposite (see small bugfixes below).

6. When the project enters development testing (feature freeze):
6.1. All developers rebase from integration and merge back to integration as described above;
6.2. Developers test the current state of the integration branch (normally this means validating the functionality they've coded). Integration branch is by now equal to dev branches;
6.3. Bugfixes are applied to development branches, and rebased/merged back to integration (iterate till right).

7. When integration branch is ready for a release to QC (UAT):
7.1. Release branch is created with unique release number, off integration branch. (i.e. release is "labeled", but this is equivalent to branching in SVN). All dev branches are locked;
7.2. Release is shipped to QC and release branch is locked;

7.3. If release passes QC, ship it. If release needs another spin:
7.3.1. Bugfix branch is created with the bug ticket number and the version number. This is off the integration branch;
7.3.2. Bugfix is made, tested in bugfix branch and rebased back to integration;
7.3.3. When all bugfix branches have been merged in, integration branch is dev tested;
7.3.4. New release branch with release number is created and tested. Rinse, repeat;
7.3.5. If required, a "special" release branch is created for the final release so we can distinguish between release candidates and final release.

8. When a release passes QC and is shipped:
8.1. No one is allowed to merge back to trunk (this has to be serialized across all teams using the trunk and must be done asap after the release);
8.2. Integration branch, which at this point is identical to the final release branch, is rebased off trunk;
8.3. Integration branch is tested (QC should get involved);
8.4. Integration branch is merged back into the trunk;
8.5. At this point the release is complete.

9. Small bugfixes:
9.1. Branch off trunk with version number and bug ticket number (just an integration branch will do, no need for dev branches);
9.2. Do bugfix in integration branch;
9.3. Dev test integration branch;
9.4. Create release branch off integration branch;
9.5. QC release branch;
9.6. Ship;
9.7. Rebase / merge back to trunk.

10. Ideally, once all the merging is done, branches should be deleted IF the version control system keeps the history of the merged files. This greatly reduces clutter (i.e. you can actually find a branch when looking for it), makes the repository smaller and improves the performance of a few operations.

11. Ideally you should be running something like CruiseControl on selected branches (such as HEAD and integration).

Saturday, June 02, 2007

Nerd Food: On Ubuntu, DELL and the Playstation 3

Unlike many ubunteros, I'm not entirely pleased with the DELL "victory". I mean, I was initially, but reading the small print made me cringe uncontrollably. As a quick summary, for those not following the latest developments: DELL set up a suggestions website called IdeaStorm, which was quickly swamped with "I Want Linux" comments. At first DELL thought they wouldn't be able to pick and choose from the myriad of available Linux distros, but eventually someone upstairs concluded that Ubuntu was popular enough. As a result, DELL now has a limited range of models with Ubuntu pre-installed (a better description is available here).

While the idea is a good one in principle, the devil is in the details. These machines will not ship with any of the codecs modern users require, and there will be no DVD support (as in, you can't play movies on encrypted DVDs). The comments on this are very telling of the Linux attitude towards codecs: "they are not free!", "you'd have to install them on windows anyway!", "it's not that hard!".

I've criticised this attitude in the past and will have to do so again. Just because Microsoft, the biggest desktop company in the world, can get away with things, it doesn't mean that all aspiring desktop wannabes can do it too. The Mac is gaining market share because their stuff "just works" - or at least, it's perceived as such by everyone. We're not trying to be like Microsoft; we're trying to improve on them. I'm not a strategist, but it seems obvious that DELL and Ubuntu should have talked with Fluendo before embarking on this adventure and made sure the full range of codecs was available as standard. This would have been great for all parties involved. Fluendo would have agreed to a massively discounted price, still a rather rewarding proposition due to the potential in terms of volume. Codeweavers could also have had a piece of the pie, since software such as iTunes is popular with the crowds. This would have been a more challenging proposition a) because Microsoft seems to dislike Wine quite a lot (and for all we know explicitly asked DELL not to include it) and b) because, as people start installing random Windows software, the support load would grow beyond DELL's capacity.

I know Ubuntu has tried to make configuration of codecs and restricted drivers easier, but to be absolutely honest, both failed when I tried to use them. The technology does not appear to be entirely mature yet. Now, if the same happens to the new converts, they will most likely say that "Linux does not work". This was a great chance to woo new users with the beauty of Compiz (if not Beryl), GStreamer et al, but I cannot help but think that a lot of new buyers will end up giving up on Linux because they won't get the whole "configuration" thing. And it's not because Windows is easier to configure; it's because Linux is configured differently, and the 50 or so USD you save are not enough to compensate for the time needed to learn a new way of doing things.

Which brings me neatly to my next topic. Even more important than DELL is the Playstation 3. There are over 6 million PS3s out there. Some estimate Ubuntu to be installed on 6 to 20 million computers worldwide, so adding 6 million to that number would have a major impact. And the relationship would be entirely symbiotic, since Sony managed to price the PS3 out of the console range and into the low-end PC range; it is possible to buy a DELL model, including a TFT monitor, for around the same price as a PS3 - which is, of course, monitorless. I personally wanted to get a PS3 and use it as a PC, but was not amused when I found out that much of its functionality doesn't work under Linux (including accelerated graphics and wireless, plus problems with sound - and let's not forget that Flash seems to be 32-bit x86 only at the moment). Not only that, but the entire installation process is non-trivial, meaning that only die-hard ubunteros are going to go for it.

Now, you tell me: if you were a manager at Sony, would you not have started talking to Linux vendors long before the PS3 was due to launch, to ensure Linux would be 100% compatible with your hardware? And would you not select a Linux vendor and pre-install the distro? After all, many console users are not IT savvy; they see the console as yet another "white good" in their house. If not, ask yourself: what is the point of buying a "'Computer', Not A Console", as Sony's CEO said, if it has no decent general-purpose software on it?

It's hard not to feel that we've wasted two great opportunities to fight for market and mind share.

Update: check this for some pics of the setup of a new DELL laptop.

Thursday, May 24, 2007

Nerd Food: On Maintenance

The many years I've spent working in the bespoke industry and using free software finally made me understand the obvious: the single most important aspect in the whole of software development is maintenance. Yes, you heard it right. It's not the language, not the platform, not the methodologies, not the technologies involved, not even the pretty Gantt charts. All these tools are important, of course, but if one looks at the entire lifespan of a program, maintenance overshadows every other aspect by a wide margin. You may think I'm not saying anything new here, and with good reason. Classic texts like Bertrand Meyer's Object Oriented Software Construction already pointed out that the highest cost in a software project is maintenance; Meyer was far from the first to pick up on this. The problem was not with their diagnosis but rather with the cure they proposed. Allow me to expand on this.

The first thing one must realise is that code is in itself the only complete system specification there is. I'm not going to spend much time explaining this view of the world, since I cannot possibly improve upon Jack Reeves' "What is Software Design?". Any experienced developer knows that the only way to really understand how a system works is by looking at the source. Let's face it: in the real world, manuals don't exist. Comments are sketchy and, more often than not, totally wrong. You may get developers to write good documentation in the early stages, but in all these years I'm yet to see a large five-year-old project properly documented. The only thing you can always rely on, the only thing that truly documents the behaviour of a program, is its source code. I know, I know, you'll bring up Knuth and literate programming. Unfortunately, I have no option but to checkmate you with real world experience. The sad truth is, most people don't even know about Knuth. While Doxygen et al are nice and make documenting much easier, very few people bother making sure the text matches the source when they are on a tight deadline, and the life of a bespoke developer is nothing but one tight deadline after another, ad infinitum. You can imagine your project manager's face when you explain that the deadline won't be met because you still need to finish off the commenting.

Speaking in very empirical terms, most projects seem to have an average lifespan of around seven to ten years, with the caveat that the final stage can drag on for a very long time. The first two or three years are all about adding large amounts of new features, cramming in as much as possible in the shortest possible time. During this period, lurking in the shadows, there is a steady increase of complexity. If these things were easily quantifiable, I'd expect the data would display a high correlation between the number of added features and the increase in complexity (i.e. each feature dramatically raises the complexity bar). Thus, adding each feature (and fixing each bug) starts taking longer and longer over time. At some point the project will reach the "complexity barrier"; this is the point at which adding new features (or fixing existing bugs) is so expensive that it's cheaper to create a new product from scratch, one which addresses all the "scalability" issues that the current system cannot. At this point the code-base is kept in "life-support" mode, with a bare minimum number of developers working on it to keep existing customers happy, but unable to do any fundamental changes to the project's core. If any new major features are required, they are implemented by extremely complex workarounds over existing architectural deficiencies. Eventually, the next generation system leaves the sheltered green house and is ready for deployment. Customers are moved over with varying degrees of grumbling, but with little choice on the matter. This pretty much describes every other project I have worked on over the last decade, some of them in different stages, of course, but all of them describing the exact same arc. First, let's make this clear: this methodology works. Companies are making ridiculous amounts of money by religiously following it, and at the end of the day, from a financial perspective, all that matters is the bottom line. 
However, this can't be The Right Way from an engineering perspective. I'm afraid you'll need your engineering hat on for the remainder of the article.

Let's step back for a second and reflect. Why do we throw away code-bases in the bespoke market so readily, when both commercial and open source shops do it a lot less frequently? It's all to do with the development process. Truth is, bespoke projects die _by design_; their environment is so entropic and hostile that they cannot but die. Software development changed fundamentally when the day-to-day running of a project was taken from the hands of programmers and handed over to professionals. In time, project management became a science in itself, complete with its own language of Gantt charts, milestones and deliverables. The entire development ecosystem in which we now live is geared towards delivering more and more features in ever smaller timescales by people who have less and less technical ability - i.e. people who think at ever higher levels of abstraction. The first victim in this quest for "time to market" is the code-base. When a developer is asked to implement a new feature, the key question asked by a good project manager is: can you "reuse" some of the existing infrastructure to do this? The project manager may not even know what reuse means technically, but he knows that "with reuse" the estimates are much lower than "without reuse". So "reuse" is good; writing from scratch or re-engineering is bad, really bad. The developer will most likely explain that the existing infrastructure was not designed with the new feature in mind, and so, given the current timescales, there is no option but to bend the code-base out of shape to shoehorn the functionality in (also known as a kludge). In the ears of a good project manager this translates to "yes, we can reuse the existing infrastructure, we'll sort the mess later". Alas, later never comes. Eventually, after years of kludges to deliver features, the code-base becomes so unmaintainable - so complex - that it is cheaper to write a new system from scratch than to maintain the existing system.
The complexity barrier has been reached, the dreaded point of no return.

What the project manager fails to grasp - or does not want to grasp - is that the code-base is in itself a repository of knowledge of sorts: the summary of the experience of a large group of developers over a long period of time attempting to tame a given problem domain. To make an extreme analogy, this is akin to someone taking every single copy of every volume of The Art of Computer Programming, writing a few sketchy notes about it in fifty or so pages and then burning the books, happily thinking that all important detail has been captured. You'd think that most software houses would understand the importance of the code-base as an asset; after all, ask to take a copy of the code home and you'll have the police breathing down your neck in seconds. However, this sort of behaviour is a bit like the attitude of the peasant who keeps his money under the mattress, not really knowing what it's worth but thinking that it must be really important. Companies don't really understand the value of the code-base. If they did, they would take _really_ good care of it. Instead, they treat it like any other perishable resource, a computer or a car, a trite commodity spewed out of a production line of developer drones. The decommissioning of a software system should always be seen as an immense tragedy, a great loss of knowledge. Management is just not able to comprehend the amount of detail that is contained in a code-base, detail that simply cannot be transposed to a new system and will have to be rediscovered. The problem is, an existing code-base hasn't got an easily computable dollar value - man-years are a very bad way of estimating effort, and it isn't possible to estimate the cost of a yet-to-be-developed system - so we're all in the dark. (Not being an expert, I'm not going to try to propose ways of valuing an existing code-base, but whatever methodology one comes up with is bound to produce some astonishingly high figures.)

Unconvinced, you may ask: what is so wrong with starting new projects? After all, many lessons can be learned, new technologies can be used and the end result will be a faster, more featureful, more maintainable system. Before everyone starts chanting "oh, you Luddite in disguise", it's important to bear in mind the following:

  • The failure rate for new projects is extremely high;
  • Incremental changes have a lower risk, whereas big changes are always highly risky;
  • It's much easier to estimate costs and timescales on an existing project which has been running for years than on a new one, for which baselines are yet to be created;
  • The second system effect forces architects and developers to create new projects that aim to boil the ocean and use every other new technology, adding even more variables to an already complex problem;
  • New systems introduce a host of new bugs; you're basically trading an existing set of bugs that are either known, or not known but also not known to seriously impact production, for an unknown (but almost always large) quantity;
  • You'll need to find new people or retrain existing people for the new skills required - particularly on the developer side, but quite often on the user side too;
  • Your system and component requirements will almost always miss vital features or important little bits of detail and ignore many of the lessons already learned simply because the latest crop of developers writing the specs is not aware of them;
  • Your project planning will almost always underestimate the complexity of implementing some or all of the existing features;
  • Your project managers may be excellent at managing an existing system but totally inexperienced at managing at this huge scale of uncertainty;
  • Your developers may be excellent maintainers of an aging code-base but terrible green house developers, getting continuously lost in blue-skies approaches;
  • The architecture of the existing system may not transpose very well to the new technologies your developers insist on using, limiting reuse even at this fundamental level.

As you can see, replacing an established system is like spending a million dollars in a casino over a few months. If you do win, you'll make a fortune - but the odds are heavily stacked against you. For all the reasons above - and probably many more which I have failed to uncover - it is vital to keep an existing code-base running healthily, avoiding the complexity barrier at all costs. In order to do so, one must maintain a system properly. This entails:

  • Removing functionality which is no longer necessary, thereby reducing complexity;
  • Looking for opportunities to refactor existing code into separate modules, and replace existing modules with open source libraries if suitable ones exist;
  • Tracking and fixing _all_ reported bugs;
  • Ensuring the code compiles with no warnings at the maximum warning level;
  • Refactoring code when implementing new functionality that does not fit the existing infrastructure;
  • Continuously measuring system performance, ensuring it does not degrade over time;
  • Ensuring consistency with existing standards and conventions, avoiding in-house protocols;
  • Improving readability of existing code;
  • Regression testing the code-base after changes;
  • Striving for platform independence;
  • Making continuous releases after changes to ensure there isn't a feature pile-up; in other words, release early, release often.

Yep, you've noticed it. These are all obvious tasks, pretty much the standard you'd expect from an average free software maintainer. Unfortunately, for the reasons outlined above, these tasks are rarely present in bespoke software houses' project plans. You may find that some commercial off-the-shelf shops actually take maintenance seriously, but most bespoke houses just can't afford to spend the required time on maintenance. IMHO, herein lies the key, the biggest change needed: project managers have to start allocating slots for maintenance. They have to treat maintenance work like they treat enhancements, allocating adequate resources to it, asking developers to make and keep updated a list of the top issues in the code-base, and making sure these are addressed.

The intrepid reader may reasonably ask: but what if the system is designed in such a way that a large new feature just cannot be implemented within its framework? To that I must counter that _no_ feature is so complex as to be unimplementable in _any_ existing system which has been well maintained. This is a fallacy in which I believed for many years but which I think has been comprehensively disproved by many projects such as the linux kernel, GTK and Qt. Take the kernel. If a system that was designed to run only on x86, with no virtual memory, minimal driver support, minimal filesystem support and all sorts of other constraints can be made to do what linux does today, then any project can do the same. I mean, the v2.6.x kernel has excellent portability, large SMP scalability, close to real-time scheduling, and many, many more features that are all but impossible when looking at them from a v0.0.1 perspective. Linus feels quite strongly about the fact that the kernel was not originally designed to do any of these things, but _evolved_ solutions to these problems over time, and in many cases these solutions work extremely well. The question is not whether it is possible or not, but rather how much effort is required to get there. And any discussion about resource allocation must always take into account the huge benefits of keeping the same code-base.

The other important aspect of maintenance is code-base reduction, mentioned in the first two points above. Code-base reduction may appear counter-intuitive at first sight; after all, new features must require adding code. However, the best way to look at this is from a resource allocation perspective. There is a finite number of developers, call it d, working on a code-base of a given size, say s. Let's call c the ratio between s and d (c = s/d). I always dreamed of coming up with a law, and here, finally, is my chance: Craveiro's law states that the higher c is, the harder it is to maintain a code-base. Of course, this is a highly empirical law, but useful nonetheless. Now, there are two very straightforward ways of reducing c: either increase the number of developers until you run into Brooks' law, or decrease the size of the code-base until you start impacting required features. The latter is more interesting, very much reminiscent of Saint-Exupéry: a designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.
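As a back-of-the-envelope sketch of the law (the figures below are invented purely for illustration):

```shell
# Craveiro's law, informally: the higher c = s / d, the harder the
# code-base is to maintain. Figures are made up for illustration.
s=500000              # code-base size, in lines of code
d=10                  # number of developers
c=$((s / d))
echo "c = $c lines per developer"

# The two levers: more developers (until Brooks' law bites)...
echo "double d: c = $((s / (d * 2)))"
# ...or a smaller code-base (until you cut required features).
echo "halve s:  c = $(((s / 2) / d))"
```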

Since you can't literally start removing required functionality, the next best thing is to find other people willing to share the maintenance burden with you, reducing the individual maintenance cost (if not the overall cost). This is routinely done in open source projects, and it is incredibly successful. Basically, you want your developers to aggressively look for parts of your code-base which offer no discernible competitive advantage; once located, these are stripped out of the system and added to your company's portfolio of open source components. These have important strategic value and should be managed very carefully (a community needs to be developed around them, the maintainer must listen to the community, etc.). The end result should be a significant reduction in the size of your core code-base.

I'll leave you with a couple of interesting corollaries from Craveiro's law:

  • Like many other "new" technologies, OOP by itself does not help or hinder the maintenance problem. Regardless of how elegantly your system is designed and implemented, if you are not maintaining it properly it will die. Conversely, a system written in Perl that is actively and adequately maintained may prove extremely resilient to time. _However_, choosing a popular language has an important indirect impact on maintenance, because it defines the size of the developer pool you can tap.
  • Java and C# are incredibly useful programming languages, but not for the reasons you might expect: that is, not because of garbage collection, nice syntactic sugar, improved security or the VM. The one key element that distinguishes them from most other languages is their extensive and standardised class libraries, readily supplemented with huge amounts of open and closed source components. These reduce the footprint of your code-base dramatically. Why are these languages better than, say, Delphi or RogueWave's extensions to C++? Because they ensure vendor independence by standardising most of their interfaces.


VITORIA!!! VITOOORIA!!!

Incredible. After an exhausting season, Vitoria finally managed to stamp its ticket and remain in the superliga (the Portuguese equivalent of the English Premier League). It's hard to believe we made it, since it was one of those rather improbable events: two teams had to lose or draw and we had to win away. One of those teams, Beira Mar, had a cushy job, since they were playing at home against a reasonable but beatable Paços de Ferreira. In the end, after much, much suffering, the boys managed to bring it home. There were hundreds of faithful Vitorians who made the trip all the way up north to cheer the team on, and I don't think we would have done it without them. The victory was so huge that it even caused some fans to slightly lose track of reality over at the forums, so let me just clarify: no, we're not the champions; we just managed to stay in for yet another season of pain and excitement.




Vitoria does make things hard for itself. It was yet another of those soap opera seasons, with managers getting fired, money gone missing, wages unpaid, players threatening to leave, players leaving, new Presidents, new financial partners that never quite materialise, the whole nine yards. A pretty standard season as far as Vitoria is concerned but hell on earth for any other club I can think of. No wonder Carlos Cardoso does not want to take up the job next season. I secretly believe that he spends the first few months of the season recovering from the previous season and getting mentally ready to take over as caretaker manager... Let's see what the new boy is all about...


Saturday, April 21, 2007

Nerd Food: A Feisty Fawn Indeed!

Like many nerds, I've been looking forward to Feisty Fawn, the latest incarnation of my favourite distro, Ubuntu. Unfortunately, due to some incredibly bad internet access, I was unable to take the plunge for more than two days after the release. You can imagine the pain. This morning I decided that enough was enough and searched for options. Just as luck would have it, I found out that there are no download limits at the local friendly internet cafe and laptops are welcome. Perfect.

Like any good ubuntero, I started by reading posts. Although I keep to the manly maxim of "never read manuals", I learned a long time ago that it doesn't apply to Linux. I think it was after a mega-botched Red Hat 6.0 upgrade. A quick skim through the release notes and the millions of Fawn-related posts revealed that:

1. You must make sure Edgy is up to date.
2. There are some potential problems with the new version of the ATA driver.

The first one was a major bummer. After four months in Africa, all my machines were massively out of date. I had 400 megs' worth of packages to download and install. But since there was no other option, I went for it. Whilst waiting for the update, I investigated point 2. It appears to be well documented:

#84603: Hard disk I/O randomly freezes when hald is running and optical drive is empty
#107271: Acer 9410 ATA hangs, system freezes

A quick perusal of the bug reports showed that there were known good workarounds, which I saved to a local file. Nothing else showed up in my quick review, so, once the machine was up to date, it was time to get Fawned. Or not.

The update manager dutifully informed me that my machine was up to date, showed a little "Upgrade to Fawn" button (as per the documentation) and told me I needed to reboot the box to complete the updates. It seemed sensible to reboot first. After rebooting, and moaning about the fact that my framebuffer theme is still black and grey instead of the shiny normal one (I don't know what happened there, but it has been that way since Edgy was released; I tried manually changing the theme and everything, to no avail), I logged into my Gnome desktop.

But when I restarted update-manager, I was greeted by a surprise: the "Upgrade to Fawn" button was gone! Totally, inexplicably, inextricably gone. In addition, there was a new version of the update-manager, 0.45.3. Posts I had read previously indicated that 0.45.2 was the minimum required one, but latest is always the greatest - and the button was missing! - so I upgraded. I then restarted update-manager but the button was still missing.

Remembering Dapper - when one had to use some command line switches to get the distro level updates going - I went down to the command line. I tried all sorts of combinations:

$ update-manager -c
$ update-manager -c -d
...

But the button just didn't want to return. Then I remembered reading somewhere that update-manager was written in Python. Being a C++ sort of chap, I know nothing about Python and avoid high-level languages like the plague, but these were desperate times. Source code is always source code, right? A bit of browsing revealed the "binary" was located in /usr/bin/update-manager. A bit of finding revealed the "packages" (not sure how pythonians refer to them) lived under /usr/lib/python2.4/site-packages/UpdateManager.

After lots of debug statements (which in Python are, apparently, print followed by a printf-like syntax), I eventually located the culprit in MetaRelease.py (in the core directory). In the download def (function?), we were trying to GET but never got:

try:
    print "here"
    uri = urllib2.urlopen(req)
    print "after"
Alas, there was never an "after". I checked the revision control history. There had been a couple of changes since the last release, connected to fixing a bug:

# make sure that we always get the latest file (#107716)
req.add_header("Cache-Control", "No-Cache")
req.add_header("Pragma", "no-cache")

Thinking that these headers were perhaps damaging the HTTP request, I first removed the lines. No luck. I then wrote a simple test program to see if the code would work in isolation:

import sys
import urllib2
import os
import time

print "test"

req = urllib2.Request("http://changelogs.ubuntu.com/meta-release")

# make sure that we always get the latest file (#107716)
req.add_header("Cache-Control", "No-Cache")
req.add_header("Pragma", "no-cache")

if os.access("/var/lib/update-manager/meta-release", os.R_OK):
    print "access ok"
    lastmodified = os.stat("/var/lib/update-manager/meta-release").st_mtime
    if lastmodified > 0:
        req.add_header("If-Modified-Since",
                       time.asctime(time.gmtime(lastmodified)))
try:
    print lastmodified
    uri = urllib2.urlopen(req)
    print "here"
    f = open("/home/phase/test", "w+")
    for line in uri.readlines():
        f.write(line)
    f.flush()
    f.seek(0, 0)
    uri.close()
    print "here"
except:
    print "Unexpected error:", sys.exc_info()[0]
    raise

When I ran this little program, it appeared to grind to a halt; several seconds of wait produced no "here". Clearly something was wrong. But loading the page in Epiphany worked fine! What was going on here?!

I installed tshark, the latest iteration of tethereal (after all, you don't really want people to see you using a packet sniffer in a web cafe, right? A text-mode one is easier to conceal...), and configured it to listen to my traffic:

tshark -V -i eth0 host 82.211.81.132 > out.txt

I obtained the IP address by pinging changelogs.ubuntu.com. Although the site does not answer pings, all I was interested in was the address resolution, so that's all good. What I didn't notice at the time, bogged down as I was with my multitasking, was that the address resolution took a long time. And so had the Epiphany page load. The penny would have dropped then, had I only paid attention.

Onwards. A few seconds of waiting and nothing got written to my file, so I killed and restarted the process. Eventually, I decided to download the CD ISO, which is apparently another way of upgrading. But without noticing, I had left my code running. A few minutes later I went back to emacs and, surprise, surprise, lots of packets filled my out.txt. For some reason, the GET was being delayed quite a lot. Quick confirmation came from time and curl:

$ time curl http://changelogs.ubuntu.com/meta-release

real 0m24.322s
user 0m0.024s
sys 0m0.016s

By George! A slow network? But my ISO download was flying at over 100 kbytes/s! Nevertheless, things are actually working, right? So if I leave update-manager running for a bit, it should eventually figure out that the HTTP data has been delivered and present me with a lovely button I can click, right? After all, the check() def is being run on a timer (Performance Federico would have loved this, after the outcry with poll()...):

gobject.timeout_add(1000,self.check)

A few prints later demonstrated that, whilst I could get my simple python script to work by waiting long enough, update-manager just didn't want to know. This is probably because the check() is done on a separate thread, but at this point I had more than exhausted my non-existent python knowledge. So the trick was to find out what was causing the network delays.
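For what it's worth, the generic pattern for keeping a periodic check responsive is to push the blocking fetch onto a worker thread with a deadline. A sketch of my own (under my assumption about the threading, not update-manager's actual code; `fetch_with_deadline` is a name I made up):

```python
import threading

def fetch_with_deadline(fetch, on_done, timeout=10.0):
    """Run a blocking fetch() on a worker thread, but give up after
    `timeout` seconds so the caller (e.g. a GUI timer) is never starved."""
    result = {}

    def worker():
        result["value"] = fetch()

    t = threading.Thread(target=worker)
    t.daemon = True   # don't keep the process alive for a hung fetch
    t.start()
    t.join(timeout)   # wait at most `timeout` seconds
    on_done(result.get("value"))  # None if the fetch is still hanging

# Usage: a fetch that returns promptly is reported as-is.
fetch_with_deadline(lambda: "meta-release data", on_done=lambda v: None)
```

With something like this, a hung DNS lookup would simply report "no release data yet" instead of silently eating the button.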

I'll save you the suspense: DNS resolution is the culprit. Repeating the same exercise with the IP address improved things a little bit:

$ time curl http://82.211.81.132/meta-release

real 0m0.133s
user 0m0.016s
sys 0m0.004s
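With hindsight, the whole diagnosis fits in a few lines of Python (a sketch I wrote for this post, not part of update-manager): time the bare lookup and you separate the DNS cost from the transfer cost.

```python
import socket
import time

def resolve_time(host):
    """Time a bare DNS lookup for host, returning (ip, seconds)."""
    start = time.time()
    ip = socket.gethostbyname(host)
    return ip, time.time() - start

# On the cafe's network this took tens of seconds for
# changelogs.ubuntu.com; "localhost" resolves instantly anywhere.
ip, elapsed = resolve_time("localhost")
print("%s resolved in %.3fs" % (ip, elapsed))
```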

I then reinstalled update-manager to make sure all my changes hadn't broken it beyond recovery, and did a dirty, dirty hack in MetaRelease.py, replacing

METARELEASE_URI = "http://changelogs.ubuntu.com/meta-release"

with

METARELEASE_URI = "http://82.211.81.132/meta-release"

Lo and behold, the button was back. But will the upgrade work? Only time will tell. At any rate, the ISO has finished downloading.

Bugs will be filed, but that will have to wait for my next session at the internet cafe. After all, today's bill will be a hefty one...

Update: Bug had already been reported (#108541).

Friday, March 23, 2007

Nerd Food: On World Domination

Well, three months into 2007 and very few mass migrations to desktop Linux have been announced. A few thousand seats here and there, but not the millions we all want. It's beginning to look like our hopes for 2007 as the Linux Desktop Year (TM) have been misplaced yet again. The algorithm for the Linux Desktop Year is becoming clear now: N + 1, where N is the current year. Yep, it's always next year. What's going on here? Are we never going to have Linux on the desktop? Products like Ubuntu and Novell's SUSE are looking more and more solid, so why aren't people installing them en masse? What about World Domination?

Let's start by defining what we really mean by "World Domination". It is unlikely that we will see a world where Linux has 90% market share across all segments, in particular on the Desktop. There's just too big an installed base for that, and inertia is too strong. Nor would it happen overnight, as any migration at the corporate level can take years to plan. My personal definition of World Domination is much more pragmatic:
  • For every new PC being sold, the buyer would consider whether to install Windows or Linux, basing his or her decision on technical and financial aspects;
  • For every migration to the latest version of Windows, the IT department would consider migrating to Linux instead, basing its decision on technical and financial aspects.
In other words, I want to join a company and be asked whether I want Linux or Windows on my desktop, rather than just be given Windows and told to shut up. And when I go to PC World, I want to be asked if I want Linux or Windows. A lot of people think these things are not happening because a) Microsoft is putting illegal pressure on vendors to stop Linux adoption, and b) there are too many Linux variants, so vendors don't know what to do (Dell was a good example). Actually, whilst I think these two factors are important, they are also very misleading. For starters, they didn't stop Linux in other market segments. Let's look at recent history.

A few years ago, fifteen or so, Linux was virtually unknown in all computing markets (call them markets, call them segments, you choose). The mainstay of Linux support was the college dorms and the homebrew engineers. If you were to read the articles about Linux in those days, the few that existed, they all said that Linux would never leave the college dorms. Not long after that, Linux became one of the most popular platforms for ISPs and web servers in general, in partnership with Apache. We were then told that Linux would never be more than a platform for web serving. Fast forward a few more years, around the turn of the century, and we were told that Linux would never leave the server room. This was a time when Linux proved itself as a good file and print server, and a good citizen on a Windows network, all thanks to Samba. It was also the time when Linux's presence in the clustering, supercomputing and embedded markets was consolidated. Around this time we also noticed Linux's presence in the high-end server market, in databases and proprietary server applications. Fast forward again, now to the present time. To all these segments we've now added POS, kiosking and other restricted desktop markets. We are now told that Linux will never leave the restricted desktop markets.

If you consider my narrower definition of world domination, then one can say it has been achieved in all these markets. In all of them, people spend time looking at alternatives before settling on Windows or Linux. In some cases Linux has more than 50% market share, in many others it does not, but there is an ongoing battle for market share.

But let's have a closer look at the database segment, because many lessons can be learned from it. The high end is dominated by colossi such as DB2 and Oracle. Here Linux is extremely successful, mainly because people don't really care about the operating system; they care about the database product. Move to the middle and low ends of the market, though, and Linux penetration is very small. Yes, MySQL and PostgreSQL are making inroads, but the truth is they are yet to make a dent in the market share of SQLServer and Access, both Microsoft products. The same principle applies: people don't want to run PostgreSQL, they want to run SQLServer; and since it only works on Windows, well, that means they can only consider Windows.

The lesson to take home from all of this is simple. Linux is extremely competitive in markets:
  • a) that have no installed base, or where the field moves so fast that the installed base is obsoleted quickly and has to be replaced (embedded, clustering, supercomputing);
  • b) that depend only on standard protocols, or on protocols that can be legally reverse engineered (file serving, web serving, authentication);
  • c) that depend on applications which have already been ported to Linux, and where the port is of the same grade or higher than the original version (Java application servers, SAP, Oracle, DB2, bespoke applications, game servers);
  • d) where there is a clone of a key application, and the clone offers a superset of the features of the original application, providing full compatibility;
  • e) where there is a compatibility layer that allows Linux to run applications designed for other platforms (.Net SWF GUI applications, Wine).

All of these are pretty self-explanatory, with the exception of d) and e), so let's have a look at those.

In terms of d), I don't mean cloning as in the GIMP being a clone of Photoshop; I mean cloning as in EnterpriseDB. EnterpriseDB, the PostgreSQL derivative which aims to be fully compatible with Oracle, is experiencing huge growth, and the root cause of this growth is the claimed drop-in Oracle compatibility. It appears companies are buying the product in droves, trying to save money on Oracle licences. In my personal opinion, whilst the principle is brilliant, EnterpriseDB made a mistake by going after Oracle. Most people who buy Oracle have money to spare and are not that worried about costs; there are people who buy it only because they have to, but their number is rather small. However, if EnterpriseDB were to offer a drop-in SQLServer replacement, I am convinced demand would have been higher by orders of magnitude. I am talking specifically about a product that can: talk TDS, requiring no modification for clients to connect to it; import stored procedures, data and schemas from SQLServer with one click; be managed from Enterprise Manager and any other SQLServer tools as if it were just another instance of SQLServer; and run TSQL stored procedures without modification. Such a product would sell a lot more, because people who buy SQLServer are much more cost sensitive than people who buy Oracle. The principle here is that a clone can open a lot of doors.

As far as e) is concerned, this hasn't been proven yet because the emulation layers are not 100% complete. The big difference between .Net and Java is that many .Net GUI applications rely on Windows Forms (SWF), which makes them a lot more Windows dependent. Mono is working hard on a good SWF implementation, but this is a hard task and 100% compatibility will take a while. The same can be said of Wine, in the making for over a decade, forever getting closer but still struggling with compatibility. IMHO, all the big Linux companies should get together and finance Wine, either by investing in CodeWeavers or by having their own Wine developers. If Wine were able to run _all_ Windows applications, say up to XP, and do so smoothly, without any problems, this would open many, many doors to Linux (and all other operating systems with Wine). I'm not talking about "it almost works, or it works sometimes" compatibility; I'm talking about rock-solid, uncrashable, perfect, flawless support. This will require huge amounts of investment. However, such a level of compatibility would allow IT departments to consider a Linux/Windows migration separately from the migration of Office and other key Windows applications. To migrate everything in one go is just too deep a plunge for many people, too much risk. It's not that OpenOffice and other applications aren't good; it's just a question of reducing the amount of change required in one go. Both approaches are good and should be pursued, for different reasons: OpenOffice caters for a less demanding segment of the population, Office on Wine caters for a totally different segment.

One last rant goes to the multimedia situation. We need to have _all_ popular codecs available on Linux legally, and Fluendo's work is a significant milestone in this regard. Ubuntu is also making strides in this department, and pragmatism is the only way to win the day here. Don't get me wrong, there is a lot of merit in many of the religious causes. OGG over MP3 any day. Software patents are evil. GNU rules. But to create an operating system that ignores the current state of the world, and is instead designed for the world we would all like to live in, is to doom it to failure. A case in point: last year the basketball World Cup was on, and I desperately wanted to watch Angola play. I found out that the games were available on-line, but could not get any of the existing Linux media players to work with FIBA's website. Yes, proprietary codecs are evil, but this was the World Cup we're talking about and I'm not RMS, so I compromised. Since we don't have anything other than Ubuntu at home, I ended up having to watch the games at work. Fortunately I had understanding bosses, but is it really fair to demand this sort of commitment from the mass market? And shutting these people out of Linux is not beneficial for us because, as we all know, critical mass is important. If Linux had 20% of the desktop market we would get a lot more attention from hardware companies, media websites, game developers, the world at large. They would think about us when they release new products. Would we really care if 19% of the 20% didn't know anything about freedom and GNU? Would we be worse off with them on board than without? Besides, it is a lot more likely that they would find out about freedom once they've started using Linux. "Who are these people who give their time for free to create such a good product?"

Forget about all the religious wars for a moment and let's put our business hats on. The truth is, the _vast_ majority of people out there have never heard of Linux. Let me tell you this: I walked around Africa for four months and met _two_ people who had heard of Linux, and even then only vaguely. "Like Mac, right?". A girl asked me if my Debian t-shirt had anything to do with lesbians. Even in South Africa, the home of Ubuntu, I saw nowt, not a single mention of it. Not on the telly, not in the shops, not on the streets, not in the cybercafes, not in the big supermarket chains. You ask about Ubuntu on the streets and people think you're trying to learn Zulu or Xhosa and that you mean peace and unity. If we want mind share we need to be functionally equivalent to Windows, with no excuses. You have to understand: from an outsider's perspective, religion _is_ an excuse. You can't really promote Linux to these people and then say "but you can't really play proprietary media without breaking the law, and even then it will take you days to configure". To start promoting Linux we first need the ability to play all the popular media formats, and to do so legally and without placing _any_ configuration demands on the user other than clicking a button. Keep in mind that even when we do get to this level, we still have a long mind share struggle to face; it will take years to get the word out there, to get people to try.

So when are we going to get world domination?
  • When we can run all the popular applications faultlessly, in particular the Windows ones, regardless of how that is done (port, emulation);
  • When we can play all the media formats flawlessly and legally;
  • When Linux is mass advertised.
We are getting close. But remember the rule of credibility: "The first 90% of the code accounts for the first 90% of the development time. The remaining 10% of the code accounts for the other 90% of the development time."