A Mini-Review of “The Practice of Network Security Monitoring”

NSM book coverRecently the kind folks at No Starch Press sent me a review copy of Rich Bejtlich’s newest book The Practice of Network Security Monitoring and I can’t recommend it enough. It is well worth reading from a theory perspective, but where it really shines is digging into the nuts and bolts of building an NSM program from the ground up. He has essentially built a full end to end tutorial on a broad variety of tools (especially Open Source ones) that will help with every aspect of the program, from collection to analysis to reporting.

As someone who used to own security monitoring and incident response for various organizations, the book was a great refresher on the why and wherefores of building an NSM program and it was really interesting to see how much the tools have evolved over the last 10 years or so since I was in the trenches with the bits and bytes. This is a great resource though regardless of your level of experience and will be a great reference work for years to come. Go read it…

Emergent Map: Streets of the US

This is really cool. All Streets is a map of the United States made of nothing but roads. A surprisingly accurate map of the country emerges from the chaos of our roads:

Allstreets poster

All Streets consists of 240 million individual road segments. No other features — no outlines, cities, or types of terrain — are marked, yet canyons and mountains emerge as the roads course around them, and sparser webs of road mark less populated areas. More details can be found here, with additional discussion of the previous version here.

In the discussion page, “Fry” writes:

The result is a map made of 240 million segments of road. It’s very difficult to say exactly how many individual streets are involved — since a winding road might consist of dozens or even hundreds of segments — but I’m sure there’s someone deep inside the Census Bureau who knows the exact number.

Which raises a fascinating question: is there a Platonic definition of “a road”? Is the question answerable in the sort of concrete way that I can say “there are 2 pens in my hand”? We tend to believe that things are countable, but as you try to count them in larger scales, the question of what is a discrete thing grows in importance. We see this when map software tells us to “continue on Foo Street.” Most drivers don’t care about such instructions; the road is the same road, insofar as you can drive in a straight line and be on what seems the same “stretch of pavement.” All that differs is the signs (if there are signs). There’s a story that when Bostonians named Washington Street after our first President, they changed the names of all the streets as they cross Washington Street, to draw attention to the great man. Are those different streets? They are likely different segments, but I think that for someone to know the number of streets in the US requires not an ontological analysis of the nature of street, but rather a purpose-driven one. Who needs to know how many individual streets are in the US? What would they do with that knowledge? Will they count gravel roads? What about new roads, under construction, or roads in the process of being torn up? This weekend of “carmageddeon” closing of 405 in LA, does 405 count as a road?

Only with these questions answered could someone answer the question of “how many streets are there?” People often steam-roller over such issues to get to answers when they need them, and that may be ok, depending on what details are flattened. Me, I’ll stick with “a great many,” since it is accurate enough for all my purposes.

So the takeaway for you? Well, there’s two. First, even with the seemingly most concrete of questions, definitions matter a lot. When someone gives you big numbers and the influence behavior, be sure to understand what they measured and how, and what decisions they made along the way. In information security, a great many people announce seemingly precise and often scary-sounding numbers that, on investigation, mean far different things than they seem to. (Or, more often, far less.)

And second, despite what I wrote above, it’s not the whole country that emerges. It’s the contiguous 48. Again, watch those definitions, especially for what’s not there.

Previously on Emergent Chaos: Steve Coast’s “Map of London” and “Map of Where Tourists Take Pictures.”

Map of Where Tourists Take Pictures

Eric Fischer is doing work on comparing locals and tourists and where they photograph based on big Flickr data. It’s fascinating to try to identify cities from the thumbnails in his “Locals and Tourists” set. (I admit, I got very few right, either from “one at a time” or by looking for cities I know.)

Seattle Photographers

This reminds me a lot of Steve Coast’s work on Open Street Map, which I blogged about in “Map of London.” It’s fascinating to watch the implicit maps and the differences emerge from the location data in photos.

Via Data Mining blog and

Happy Banned Books Week!

banned-books.jpgQuoting Michael Zimmer:

[Yesterday was] the start of Banned Books Week 2009, the 28th annual celebration of the freedom to choose what we read, as well as the freedom to select from a full array of possibilities.

Hundreds of books are challenged in schools and libraries in the United States each year. Here’s a great map of challenges from 2007-2009, although I’m sure it under-represents the nature of the problem, as most challenges are never reported. (Note the West Bend library controversy is marked on the map.)

According to the American Library Association, there were 513 challenges reported to the Office of Intellectual Freedom in 2008.

I’m somewhat surprised by how many bluenoses dots there are in the northeast. Does anyone know of a good tutorial that would help me to re-map the data against population?

Just Landed in…

Just Landed: Processing, Twitter, MetaCarta & Hidden Data:

This got me thinking about the data that is hidden in various social network information streams – Facebook & Twitter updates in particular. People share a lot of information in their tweets – some of it shared intentionally, and some of it which could be uncovered with some rudimentary searching. I wondered if it would be possible to extract travel information from people’s public Twitter streams by searching for the term ‘Just landed in…’.

just-landed.jpg

This is a cool emergent effect of people chaotically announcing themselves on Twitter, a MetaCarta service that allows you to get longitude/latitude and a bunch of other bits all coming together to make something really cool looking.

Via Information Aesthetics

My Wolfram Alpha Demo

I got the opportunity a couple days ago to get a demo of Wolfram Alpha from Stephen Wolfram himself. It’s an impressive thing, and I can sympathize a bit with them on the overblown publicity. Wolfram said that they didn’t expect the press reaction, which I both empathize with and cast a raised eyebrow at.

There’s no difference, as you know, between an arbitrarily advanced technology and a rigged demo. And of course anyone whose spent a lot of time trying to create something grand is going to give you the good demo. It’s hard to know what the difference is between a rigged demo and a good one.

The major problem right now with Alpha is the overblown publicity. The last time I remember such gaga effusiveness it was over the Segway before we knew it was a scooter.

Alpha has had to suffer through not only its creator’s overblown assessments, but reviews from neophiles whose minds are so open that their occipital lobes face forward.

My short assessment is that it is the anti-Wikipedia and makes a huge splat on the fine line between clever and stupid, extending equally far in both directions. What they’ve done is create something very much like the computerized idiot savant. As much as that might sound like criticism, it isn’t. Alpha is very, very, very cool. Jaw-droppingly cool. And it is also incredibly cringe-worthily dumb. Let me give some examples.

Stephen gave us a lot of things that it can compute and the way it can infer answers. You can type “gdp france / germany” and it will give you plots of that. A query like “who was the president of brazil in 1930″ will get you the right answer and a smear of the surrounding Presidents of Brazil as well.

It also has lovely deductions it makes. It geolocates your IP address and so if you ask it something involving “cups” it will infer from your location whether that should be American cups or English cups and give you a quick little link to change the preference on that. Very, very, clever.

It will also use your location to make other nice deductions. Stephen asked it a question about the population of Springfield, and since he is in Massachusetts, it inferred that Springfield, and there’s a little pop-up with a long list of other Springfields, as well. It’s very, very clever.

That list, however, got me the first glimpse of the stupid. I scanned the list of Springfields and realized something. Nowhere in that list appeared the Springfield of The Simpsons. Yeah, it’s fictional, and yeah that’s in many ways a relief, but dammit, it’s supposed to be a computational engine that can compute any fact that can be computed. While that Springfield is fictional, its population is a fact.

The group of us getting the demo got tired of Stephen’s enthusiastic typing in this query and that query. Many of them are very cool but boring. Comparing stock prices, market caps, changes in portfolio whatevers is something that a zillion financial web sites can do. We wanted more. We wanted our queries.

My query, which I didn’t ask because I thought it would be disruptive, is this: Which weighs more, a pound of gold or a pound of feathers? When I get to drive, that will be the first thing I ask.

The answer, in case you don’t know this famous question is a pound of feathers. Amusingly, Google gets it on the first link. Wolfram emphasizes that Alpha computes and is smart as opposed to Google just dumbly searching and collating.

I also didn’t really need to ask because one of the other people asked Alpha to plot swine flu in the the US, and it came up with — nil. It knows nothing about swine flu. Stephen helpfully suggested, “I can show you colon cancer instead” and did.

And there it is, the line between clever and stupid, and being on both sides of it. Alpha can’t tell you about swine flu because the data it works on is “curated,” meaning they have experts vet it. I approve. I’m a Wikipedia-sneerer, and I like an anti-mob system. However, having experts curate the data means that there’s nothing about the Springfield that pops to most people’s minds (because it’s pop culture) nor anything about swine flu. We asked Stephen about sources, and specifically about Wikipedia. He said that they use Wikipedia for some sorts of folk knowledge, like knowing that The Big Apple is a synonym for New York City but not for many things other than that.

Alpha is not a Google-killer. It is not ever going to compute anything that can be computed. It’s a humorless idiot savant that has an impressive database (presently some ten terabytes, according to the Wolfram folks), and its Mathematica-on-steroids engine gives a lot of wows.

On the other hand, as one of the people in my demo pointed out, there’s not anything beyond a spew of facts. Another of our queries was “17/hr” and Alpha told us what that is in terms of weekly, monthly, yearly salary. It did not tell us the sort of jobs that pay 17 per hour, which would be useful not only to people who need a job, but to socioeconomic researchers. It could tell us that, and very well might rather soon. But it doesn’t.

Alpha is an impressive tool that I can hardly wait to use (supposedly it goes on line perhaps this week). It’s something that will be a useful tool for many people and fills a much-needed niche. We need an anti-Wikipedia that has only curated facts. We need a computational engine that uses deductions and heuristics.

But we also need web resources that know about a fictional Springfield, and resources that can show you maps of the swine flu.

We also need tech reviewers who have critical faculties. Alpha is not a Google-killer. It’s also not likely as useful as Google. The gushing, open-brained reviews do us and Alpha a disservice by uncritically watching the rigged demo and refusing to ask about its limits. Alpha may straddle the line between clever and stupid, but the present reviewers all stand proudly on stupid.