The Difference Between Knowledge and Wisdom

If you haven’t heard about this, you need to. All Debian-based Linux systems, including Ubuntu, have a horrible problem in their crypto. This is so important that if you have a Debian-based system, stop reading this and go fix it, then come back to finish reading. In fact, unless you know you’re safe, I’d take a look at updating your system anyway.

The problem is that they “fixed” the random number generator so that it doesn’t generate random numbers, but a semi-fixed stream of pseudo-random bytes.

A friend of a friend is now working on generating the whole set of possible keys, and will release them to the world here. (Agree or not with this, but remember that the bad guys have them by now.)

Ben Laurie has written about it in gory detail here and here. If you want a summary, this problem comes about because the OpenSSL random number generator does some things that are unconventional, but not wrong. The unconventional coding was flagged by a code-analysis tool, and a Debian person removed it. That change made all randomness vanish from the random number generator.
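To see why a generator with almost no entropy makes the whole key set enumerable, here is a minimal sketch in C. It is a toy, not OpenSSL's algorithm: the mixer, the names `toy_key_from_seed` and `toy_recover_seed`, and the limit of 32768 are all assumptions of mine for illustration. The limit reflects the widely reported detail that, after the Debian change, the process ID was about the only input that still varied. The point is only that when a "key" depends on a seed with a few thousand possible values, anyone can try them all.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed upper bound on the only varying seed input (think: a process ID). */
#define TOY_SEED_LIMIT 32768u

/* Toy generator using a splitmix64-style mixer. This is NOT OpenSSL's code;
 * it only stands in for "a deterministic function of a tiny seed". */
uint64_t toy_key_from_seed(uint32_t seed)
{
    uint64_t z = (uint64_t)seed + 0x9E3779B97F4A7C15ULL;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Try every possible seed and return the one that produced `key`,
 * or -1 if no seed below TOY_SEED_LIMIT matches. */
long toy_recover_seed(uint64_t key)
{
    for (uint32_t seed = 0; seed < TOY_SEED_LIMIT; seed++) {
        if (toy_key_from_seed(seed) == key)
            return (long)seed;
    }
    return -1;
}

/* Round-trip self-check: derive a key from a seed, then recover the seed. */
void toy_self_check(void)
{
    assert(toy_recover_seed(toy_key_from_seed(12345u)) == 12345L);
}
```

Calling `toy_recover_seed` on any key produced from a seed below the limit finds that seed after at most 32768 tries, which is why precomputing every possible key is a weekend project rather than a research problem.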

Plenty of people have debated the whole thing. For example, some say the Debian developer was an idiot, and others say that the folks who did unconventional things were the idiots.

I think that this is the sort of expected failure that happens in complex systems. I am reminded of code optimizers that see that a programmer clears a variable and then doesn’t use it, so they optimize out the clearing, not realizing that the clearing was there to erase keys or passwords or whatever.
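The optimizer case is easy to sketch in C. In the first function below, the buffer is never read after `memset`, so a compiler is allowed to treat the call as a dead store and drop it. The second function, `wipe_memory`, is a hypothetical helper of mine that writes through a `volatile` pointer, a commonly used workaround rather than a formal guarantee. Where they are available, purpose-built calls such as `explicit_bzero` (glibc, the BSDs), `memset_s` (optional in C11 Annex K) or `SecureZeroMemory` (Windows) are probably the better choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Risky pattern: `password` is never read after the memset, so an
 * optimizing compiler may remove the memset as a dead store. */
void handle_secret_risky(void)
{
    char password[32];
    strcpy(password, "hunter2");          /* stand-in for real secret handling */
    memset(password, 0, sizeof password); /* may be optimized away */
}

/* Hypothetical helper: clears `len` bytes at `buf` through a volatile
 * pointer, so each store counts as an observable access that the
 * compiler is not supposed to elide. */
void wipe_memory(void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    volatile unsigned char *p = buf;
    while (len--)
        *p++ = 0;
}
```

In the risky function, replacing the `memset` line with `wipe_memory(password, sizeof password);` is the intended use. Whether a given compiler actually removes the `memset` depends on the optimization level, so the only way to know is to look at the generated code.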

I’ll add in that what leapt out at me was that the unconventional coding had an excessively vague comment noting that the analysis tool wouldn’t like it. It would have been much better to have an over-the-top comment.

I was once notorious for a comment I had in some extremely hairy code that said something akin to:

This code is delicate. Don’t modify it unless you understand it. If you think you understand it, you don’t. I wrote it and I don’t understand it.

That’s what I meant by an over-the-top comment. I wanted the poor person who maintained my code to think three times. When you do something unconventional, you need to point out to the other developers in the ecosystem that you did what you did intentionally.

And for those of you who read the whole of this article before patching — shoo. Go. Install that update. Now.

Photo “Random # 15 MSH” by Saffanna.

10 thoughts on “The Difference Between Knowledge and Wisdom”

  1. There’s a systematic problem here for Linux, and that’s the tendency for upstream developers to get combative with vendors who want to change their code. The long, sad saga of cdrtools is another example of this.

  3. The unconventional coding was flagged by a code-analysis tool, and a Debian person removed it. That change made all randomness vanish from the random number generator.

    That’s not entirely true. Please read the actual diff in question (preferably in the context of the whole source file). Removing the unconventional coding was one thing, and not especially controversial, but the Debian person who committed the fix also made the same “fix” to what appeared to be a similar line elsewhere in the same source file. It was this second “fix” that was wrong, and had the ghastly consequence of preventing entropy from flowing into the randomness pool.

  4. re Ross Younger
    Yes, but. I gave a summary. Summaries are short. Short is incomplete. I didn’t want to get tied up in all the gory details because I think they detract from the core issue.
    Your more complete explanation of the etiology is a more complete explanation, but I put plenty of links for people interested in more to chase things.
    The real issue is that if the code had carried a comment like:
    /*
    * Yes, we’re stirring in uninitialized memory. This is salt that helps make
    * this machine unique in its pool. Valgrind will complain. Ignore valgrind
    * because it’s wrong about this.
    */
    then this would never have happened.

  5. The discussion in one of the linked blogs also revealed some interesting things about the OpenSSL team’s communications. Basically, the Debian maintainer making the change actually asked about the patch on the openssl-dev mailing list, and got no real objections. But contrary to what their website says, openssl-dev is apparently not the place to ask these sorts of things; that would be the (unpublished) openssl-team list.
    So what we have is a series of mistakes that added up to a big disaster:
    – A person changing critical code they didn’t fully understand.
    – Tricky code that was not sufficiently commented as such.
    – Upstream developers who did not make their preferred communications channels clear.
    – Insufficient testing of the final code.
    In aviation it’s common to talk about an “accident chain,” the series of small mistakes that lead up to an incident. Breaking any one link in the chain would have stopped the accident from happening. That’s kind of what happened here.

  6. Thank you. That’s exactly my point. My poor brain didn’t have “accident chain” cached in active vocabulary.
    My teachers, long ago, taught me to “take pity on the poor sucker who has to maintain your code.” In an open source project, those poor suckers include the arrogant little bastidge upstream from you who thinks he’s fixing a bug.
    There are many people who seem to write code with an attitude of, “if it was hard to code, it should be hard to read.” That’s a form of arrogance. A better form of arrogance is to think, “if it was hard to code, it should be easy to read, because people dumber than me will be reading it.” That arrogance, that I’m going to explain this because you aren’t going to figure it out for yourself, has better characteristics for making the world a better place.

  7. I like the term “accident chain.” That is precisely what happened here, and the attempts by the Debian and OpenSSL camps to blame each other only distract.
    It’s worth summarising the experience. I think it makes a great case study of the security world today.
