dean.edwards.name/weblog/2005/09/bbc-space/

How BBC News Could Save 25% On Bandwidth

With my WaSP hat on I’ve been looking at the markup of the major news sites lately. Suffice to say that they all suck (especially Google News).

Whilst looking at the BBC’s source code something struck me. They are sending a tremendous amount of white space up the wire. The BBC is funded by British tax payers. That means all you lucky people outside of the UK get to read quality news coverage paid for by myself and my compatriots. This is annoying enough in itself. Today I discovered that the profligate BBC are frittering away money sending people lots of empty space.

A typical page contained 23% unnecessary white space characters (tabs, line feeds and normal spaces). Other pages reveal as much as 30% white space. I am not talking about the total white space, just the space between tags.
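
For anyone who wants to check a page themselves, here is a rough sketch of how a figure like that could be measured. It is not the method used for the numbers above: the pattern and the function name `inter_tag_whitespace_ratio` are my own assumptions, and it counts characters rather than bytes on the wire.

```python
import re

# Whitespace runs that sit directly between a closing '>' and an opening '<'.
# Whitespace inside text content or inside a tag is deliberately not counted.
INTER_TAG_WS = re.compile(r"(?<=>)\s+(?=<)")


def inter_tag_whitespace_ratio(html: str) -> float:
    """Return the fraction of characters that are whitespace between tags."""
    if not html:
        return 0.0
    wasted = sum(len(match.group()) for match in INTER_TAG_WS.finditer(html))
    return wasted / len(html)


# 4 whitespace characters out of 20 in total.
sample = "<p>a</p>\n\n  <p>b</p>"
print(f"{inter_tag_whitespace_ratio(sample):.0%}")  # prints 20%
```

Run it over the saved source of a page to get a comparable percentage.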

I don’t know what tools The Beeb are using to produce their content but surely they can be tweaked to remove this redundancy? It will result in faster loading pages and lower bandwidth costs. They can put the savings toward lowering the license fee. Then they won’t have to resort to harassing single mothers for money.

Comments (15)


Sending pages out gzipped to browsers that can handle it would be ace, too.

  • Comment by: Olly
  • Posted:

Well, with *my* WaSP hat on, I can say that a number of us UK WaSPs spent some time with some of the web guys down at the BBC earlier this year. We learned two relevant things:

1) The BBC really understand web standards, but largely due to the scale of everything (sites, content, organisation, structure) things take a while to change.

2) Their biggest bandwidth munchers are the audio and video feeds *by far*. And with the volume of stuff they’re serving up down there, a bit of whitespace really is just a teeny tiny drop in the ocean.

Not that your point isn’t valid of course, it is.

This is a very interesting point, though I don’t believe the BBC is aware of it, and it’s actually the browsers who should deploy such a compression engine.

  • Comment by: Jane Jolin
  • Posted:

One area in which the BBC does triumph online is Internationalization. Their multilingual document management is probably one of the best examples of broad application of Internationalization techniques anywhere. This is not to say their practices are particularly compliant, but overall they definitely take on challenges many other news sites have shied away from.

Wearing my WaSP hat, I have to say I lean toward agreeing with you, Dean, in terms of that suck feeling regarding source. Many large sites aren’t getting the message, which is impeding progress, wasting money, wearing down their human resources and not helping the Web get cleaned up.

I also agree with you, Drew, in that individual environments produce real life optimization challenges – particularly on very large, active sites. AOL, Yahoo! – they each address their optimization issues differently for different reasons. In some cases, the http requests are of greater concern, in some cases it is optimization of markup and code.

Wearing my WaSP hat, I wonder why you aren’t cross-posting WaSPy posts like this TO WaSP, Dean. :P

“it’s actually the browsers who should deploy such a compression engine.”

No kiddin’. It’s the network traffic BEFORE the payload gets to the web browser that’s being questioned here.

  • Comment by: anona
  • Posted:

It’s actually the British TV license payers that fund the BBC, not British tax payers. If you don’t own a TV (or if you don’t pay for a TV license) then you don’t fund the BBC.

  • Comment by: Anthony Williams
  • Posted:

Now that is a lot of whitespace. Combined with cleaned-up markup, the BBC could surely save another 25%. And you are right about Google, though it’s not only Google News: all their pages are badly written.

Are bandwidth costs really an issue?

  • Comment by: Martijn
  • Posted:

“Are bandwidth costs really an issue?”

Depends on who’s paying. ;-)

Page load time is also an issue, however. Excess white space is a common problem that I’ve seen on plenty of other sites.

  • Comment by: -dean
  • Posted:

And seeing that the Beeb does everything with XSLT (I flunked one of their tests majorly some months ago when I entertained the notion of starting to work there), I suppose it could be achieved by a change in the XSLT parser. Which of course can mean that MSIE will change the display.

There is an article on the Beeb news site (just found it) about all the servers and how it’s managed. See: Under the bonnet and this network diagram.

Anywho, perhaps you could give this guy a poke about it, he might be able to tell you more and/or pass it on to someone that can.

  • Comment by: Tom
  • Posted:

Hi Dean,

How did you strip the whitespace out of their site’s source code? I’m interested in the regex you are using.

Thanks, Markus

  • Comment by: Markus
  • Posted:

Markus – I used my text editor which has pretty good regex support.

  • Comment by: -dean
  • Posted:
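
For readers following along: the exact pattern isn’t given in this thread, so what follows is only a guess at the general idea, a minimal sketch that deletes whitespace runs sitting directly between two tags. The function name `strip_inter_tag_whitespace` is made up for illustration. Be aware that removing the space between inline elements can change how a page renders, and that content inside `<pre>`, `<textarea>` or `<script>` would need to be excluded in real use.

```python
import re

# Match whitespace that has '>' immediately before it and '<' immediately after.
# Lookarounds keep the angle brackets themselves out of the match.
INTER_TAG_WS = re.compile(r"(?<=>)\s+(?=<)")


def strip_inter_tag_whitespace(html: str) -> str:
    """Delete whitespace between tags; text content is left untouched."""
    return INTER_TAG_WS.sub("", html)


before = "<ul>\n\t<li>one two</li>\n\t<li>three</li>\n</ul>"
after = strip_inter_tag_whitespace(before)
print(after)  # <ul><li>one two</li><li>three</li></ul>
print(len(before), "->", len(after))
```

Most text editors with regex search and replace accept the same kind of pattern, though lookaround support varies between them.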

Hi Dean,

thanks for your reply.

I’m interested in the exact pattern you are using.

Thanks, Markus

  • Comment by: Markus
  • Posted:

I looked into writing a module for Apache to strip out unnecessary characters before sending the content. After investigating the benefits on a large portion of sites, I felt it wasn’t worth the effort.

On the majority of sites I checked, the percentage gain broke down something like this:

The conclusion was simple: install a compression module and essentially get the same benefit.

Al.

  • Comment by: Alistair
  • Posted:
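
The compression point is easy to try out. The sketch below is not the survey described in the comment above and uses no real site data: it builds a synthetic, heavily indented page (an assumption for illustration), then compares how many bytes whitespace stripping saves before and after gzip. The helper names are hypothetical.

```python
import gzip
import re

INTER_TAG_WS = re.compile(r"(?<=>)\s+(?=<)")


def strip_inter_tag_whitespace(html: str) -> str:
    """Delete whitespace that sits directly between two tags."""
    return INTER_TAG_WS.sub("", html)


def gzipped_size(html: str) -> int:
    """Size in bytes of the page after gzip compression."""
    return len(gzip.compress(html.encode("utf-8")))


# Synthetic page: a table with deeply indented rows (made-up content).
row = "\t\t<tr>\n\t\t\t<td>cell</td>\n\t\t</tr>\n"
page = "<table>\n" + row * 200 + "</table>\n"
stripped = strip_inter_tag_whitespace(page)

raw_saved = len(page.encode("utf-8")) - len(stripped.encode("utf-8"))
gz_saved = gzipped_size(page) - gzipped_size(stripped)

print("bytes saved by stripping, uncompressed:", raw_saved)
print("bytes saved by stripping, after gzip:  ", gz_saved)
print("gzip alone:", len(page), "->", gzipped_size(page))
```

Repetitive indentation compresses very well, so once gzip is on, stripping white space saves far fewer bytes than it does on uncompressed pages. Real pages are less repetitive than this example, so the gap will be less extreme.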
