Archive for the ‘internet’ Category

Google Analyzing HTML

Thursday, January 26th, 2006

Google has done a pretty interesting analysis of the HTML markup of about a billion web pages. They parsed all the pages and have some nice graphs available showing the most-used elements, tags, classes and attributes (unfortunately, these graphs only show in Firefox 1.5+).

It is pretty interesting to see that almost all pages at least get the basics right: they define html, head, title and body. Most pages contain at least one a element, but frighteningly, more than half use a target attribute, meaning they open another window for you.

Natural markup using paragraphs (p) is used less often than the br tag. There is also a huge number of pages that use table, but apparently only half of the pages that use tables put cells in them!

A lot of people use presentational attributes on their webpages. This is most obvious with the attributes for the body element. Also interesting to see is that authors don’t really care about standards: of the top twenty body attributes, nine are invalid and five have been deprecated for over eight years.
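The kind of tally behind such graphs can be sketched in a few lines of Python. This is only an illustration using the standard library's html.parser, not Google's actual method; the class name MarkupTally, the function tally_pages and the two sample pages are made up for the example. It counts, per element and per (element, attribute) pair, how many pages use it at least once.

```python
from collections import Counter
from html.parser import HTMLParser


class MarkupTally(HTMLParser):
    """Collects the distinct elements and attributes seen on one page."""

    def __init__(self):
        super().__init__()
        self.elements = set()
        self.attributes = set()

    def handle_starttag(self, tag, attrs):
        # html.parser hands over tag and attribute names in lower case.
        self.elements.add(tag)
        for name, _value in attrs:
            self.attributes.add((tag, name))


def tally_pages(pages):
    """Return two Counters: pages per element, pages per (element, attribute)."""
    element_pages = Counter()
    attribute_pages = Counter()
    for html in pages:
        parser = MarkupTally()
        parser.feed(html)
        parser.close()
        element_pages.update(parser.elements)
        attribute_pages.update(parser.attributes)
    return element_pages, attribute_pages


# Two invented sample pages, purely for illustration.
sample_pages = [
    '<html><head><title>A</title></head>'
    '<body bgcolor="#fff"><a href="/" target="_blank">x</a></body></html>',
    '<html><body><table></table>line<br>break</body></html>',
]

elements, attributes = tally_pages(sample_pages)
print(elements.most_common())
print(attributes.most_common())
```

Counting pages rather than raw occurrences matches statements like "more than half use a target attribute", which are about how many pages do something, not how often.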

Have a look through the graphs; they give some interesting insights into how the web is currently built. They also provide plenty of food for thought for the new HTML5 standard.

The Bandwidth of a Train

Tuesday, November 15th, 2005

I recently came across the famous quote from Andrew Tanenbaum again:

Never underestimate the bandwidth of a station-wagon full of tapes hurtling down the highway.

This is still very true and it got me thinking about the bandwidth of a train. During rush hour in the Netherlands, trains are always full of people with MP3 players. So what would be the bandwidth of a train filled with commuters?

  1. Assume that the average person is carrying a 1GiB MP3 player. This is pretty reasonable, since there are also people carrying 80GiB iPods.
  2. A train at full speed travels at 140km/h, which is about 40m/s.
  3. 40 meters of train contains about 100 people. (About 80 people per wagon and wagons are about 28 meters long).

This totals up to about 100 GiB/sec, or about 1 Terabit/sec. This is of course peak capacity. If we use this to calculate the sustained data-rate between Utrecht and Amsterdam, then it comes to about 10 Gigabit/sec.
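The peak figure follows directly from the three assumptions in the list: 100 players of 1 GiB pass a fixed point every second. Here is a small Python sketch of that arithmetic. The function names are my own, and the sustained-rate helper only shows the shape of the calculation (data carried divided by trip time), since the post does not state the passenger count or travel time behind the 10 Gigabit/sec figure.

```python
GIB = 2**30  # bytes in one gibibyte


def peak_bandwidth_bits_per_sec(bytes_per_person, people_per_segment,
                                segment_m, speed_m_per_s):
    """Bits per second passing a fixed point while the train goes by."""
    people_per_metre = people_per_segment / segment_m
    return bytes_per_person * 8 * people_per_metre * speed_m_per_s


def sustained_bandwidth_bits_per_sec(bytes_per_person, passengers, trip_seconds):
    """Average bits per second when the whole load is delivered over one trip."""
    return bytes_per_person * 8 * passengers / trip_seconds


# Numbers from the list: 1 GiB per person, 100 people per 40 m, 40 m/s.
peak = peak_bandwidth_bits_per_sec(1 * GIB, 100, 40, 40)
print(f"peak: {peak / 8 / GIB:.0f} GiB/s = {peak / 1e12:.2f} Tbit/s")
```

The exact result is 100 GiB/s, or roughly 0.86 Tbit/s, which rounds to the "about 1 Terabit/sec" above. The sustained rate is much lower because the same data is spread over the whole journey instead of one second.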

Makes you wonder why we’re mucking about with 10 Gigabit/sec fibers…

iGrid 2005

Monday, September 26th, 2005

This week I’m at iGrid 2005. The equipment and demos people are showing here are amazing: it’s one big collection of fast machines, tile displays, huge displays, huge projectors, webcams, etcetera. I will post some pictures once I get back home.

In the meantime you can also enjoy some first-hand experiences from the official iGrid 2005 blog. There’s some marketing stuff in there, but also some nice coverage of what people are doing.

Asking questions the smart way

Wednesday, August 31st, 2005

About two weeks ago I saw a link to the document How to ask questions the smart way. It seems a bit of a long article, but I can really recommend reading it; see it as an investment. Plus, it is written in a good style as well.

Immediately after reading it I ran into a problem, applied the advice from the article, and it really helped. It will save you time, because most of the time you can find an answer yourself and you won’t have to wait for someone else to point you to Google.

And if you can’t find the solution, it helps you formulate your questions better, which certainly improves the responsiveness of people in IRC channels. If you sum up all the (simpler) solutions you already tried, finding the right answer becomes more of a challenge for them, and more interesting. And even if it doesn’t become more interesting that way, it certainly makes you seem more polite.

On Sender-ID and SPF

Saturday, June 25th, 2005

Just as a warning: Don’t expect me to implement either SPF or Sender-ID.

Both have just been accepted as experimental standards by the IETF. But both also have problems and therefore I don’t want to implement them.

For SPF it’s mind-boggling that it has been accepted as a standard. It essentially breaks the ability to forward mail: If you send an email to one of my old addresses, it is forwarded to my current one with a .forward. That will not work anymore with the current SPF implementations, because the mail seems to be coming from the old server, instead of your mail server. And of course, you also lose the ability to roam freely with your laptop and use any mailserver you are entitled to use as a guest user, because those are not in your SPF DNS records.
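The forwarding problem can be shown with a deliberately simplified model in Python. It is not a real SPF evaluator: the table SPF_RECORDS stands in for the DNS records, the function spf_check is hypothetical, the domain and IP addresses are reserved documentation values, and the only outcomes modelled are pass, fail and none.

```python
import ipaddress

# Stand-in for published SPF DNS records (hypothetical data):
# sender domain -> networks that are allowed to send mail for it.
SPF_RECORDS = {
    "sender.example": [ipaddress.ip_network("192.0.2.0/24")],
}


def spf_check(envelope_from_domain, connecting_ip):
    """Very rough SPF-style check: is the connecting IP listed for the domain?"""
    networks = SPF_RECORDS.get(envelope_from_domain)
    if networks is None:
        return "none"  # the domain publishes no record in this model
    ip = ipaddress.ip_address(connecting_ip)
    return "pass" if any(ip in net for net in networks) else "fail"


# Direct delivery: the sender's own mail server connects.
direct = spf_check("sender.example", "192.0.2.25")

# Forwarded delivery: my old server relays the message via .forward,
# keeping the original sender address but connecting from its own IP.
forwarded = spf_check("sender.example", "198.51.100.7")

print("direct:", direct)
print("forwarded:", forwarded)
```

The sender address is unchanged after forwarding, but the connecting IP is now the forwarder's, which the sender's record does not list. The roaming laptop case fails the same way: the guest mailserver's address is not in the record either.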

Sender-ID seems to ‘solve’ the first problem of SPF by adding a PRA header (Purported Responsible Address), which essentially says: “From Sender on behalf of From”, where the PRA encodes the Sender part. This seems like a nice idea, except that Microsoft patented this idea and then submitted it to the IETF as a standard. And since Microsoft has not taken an official stance as to whether they will enforce this patent, and my mail will probably pass through the United States, I’m not going to take any chances.

And Microsoft is now also pulling a monopoly-like stunt: If your e-mail does not have a Sender ID, Microsoft wants to junk your message