One of our correspondents, Paul Cooper, asked about the pros and cons of writing websites using Microsoft Office tools, such as Word and Excel, after a friend of his said that Office introduces lots of unnecessary elements into your code.
Office bloat
To an extent, that’s true. You can simply save a file as a web page from Word’s File menu. The file opens in the browser and looks pretty much identical to the Word document. So far, so good. But count the characters in the document, then check the properties of the HTML file, and you may well find that the text accounts for only a small fraction of the file’s size.
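If you’d rather not count characters by hand, a few lines of Python can make the same comparison. This is only a rough sketch: the class and function names are my own, and it assumes that "text" means whatever sits outside the tags, style blocks, scripts and comments.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <style> and <script>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Comments never reach this method, so Word's XML blocks are ignored.
        if not self._skip_depth:
            self.parts.append(data)


def text_share(html: str) -> float:
    """Return the fraction of the file's characters that are visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join("".join(parser.parts).split())  # whitespace is not counted
    return len(text) / len(html) if html else 0.0
```

Read the saved page into a string, pass it to `text_share()`, and a result of 0.1 would mean that only a tenth of the file is the words you actually wrote.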
Of course there’s always some overhead with HTML, since the tags themselves can be verbose, although you could keep it down by dispensing with style sheets completely and using just a couple of heading styles and one paragraph style. Even so, it’s clear that what Word produces is not the most compact way of doing things.
To find out why, it’s worth taking a look at the source of the web page that Word produces. First, you’ll see references to the Microsoft Office XML schema, followed by tags telling you which version of Word was used. Then there’s a chunk of XML that contains the document properties, as set via Word’s File menu.
Additionally, the author’s name, as set up in your Word installation, will be added – so if you’re writing a critical web page that you intend to post anonymously, don’t use Word without checking the code later to make sure your identity isn’t revealed to all and sundry.
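Checking the code for your name can be automated. The sketch below assumes the document properties appear as elements such as `<o:Author>`, `<o:LastAuthor>` and `<o:Company>`, which is how they look in the Word output I’ve seen; the list of tags and the function name are my own choices, so compare them against a page you’ve saved yourself.

```python
import re

# Assumed element names; check them against your own saved file.
PERSONAL_TAGS = ("Author", "LastAuthor", "Company")


def find_personal_metadata(html: str) -> dict:
    """Return any personal details found in the page's document properties."""
    found = {}
    for tag in PERSONAL_TAGS:
        pattern = rf"<o:{tag}>(.*?)</o:{tag}>"
        match = re.search(pattern, html, re.IGNORECASE | re.DOTALL)
        if match:
            found[tag] = match.group(1).strip()
    return found
```

An empty result only means these particular tags weren’t found; it’s still worth reading through the source before you publish.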
Following the description of the document, there are details about the template used to create it, and then a CSS style sheet that includes all the font and style definitions used in the document before, finally, the actual text of the web page.
As I’ve mentioned before, there are sound reasons for using CSS in your web pages – it makes them easy to manage. But style definitions can also be verbose, and including all of them, even if only a couple are in use, makes the Word-created web page much longer than it needs to be.
For a simple site, you probably don’t need to worry too much about this, but if you start adding multiple pages, you’ll end up with a lot of information duplicated, because Word keeps the style information embedded. With other web-editing tools, such as Dreamweaver, you can easily link to an external stylesheet, where all the information is in a separate file.
That has two advantages. First, the stylesheet is loaded only once by the browser, so the individual pages can be smaller and your site has to send less data. Second, if you want to give your site a makeover – new corporate colours, perhaps, or a seasonal theme – you just need to change the single stylesheet file, and all the pages that refer to it will update.
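To show what moving to an external stylesheet involves, here is a minimal sketch that pulls the embedded style blocks out of a page and leaves a link in their place. It assumes every page on the site can share the same styles; the function name and the `site.css` file name are made up for the example.

```python
import re

STYLE_BLOCK = re.compile(r"<style\b[^>]*>(.*?)</style>", re.IGNORECASE | re.DOTALL)


def externalise_styles(html: str, href: str = "site.css") -> tuple[str, str]:
    """Return (page with a <link> instead of its <style> blocks, extracted CSS)."""
    css = "\n".join(block.strip() for block in STYLE_BLOCK.findall(html))
    link = f'<link rel="stylesheet" href="{href}">'
    # Put the link where the first block was, then drop any remaining blocks.
    page = STYLE_BLOCK.sub(link, html, count=1)
    page = STYLE_BLOCK.sub("", page)
    return page, css
```

You would save the returned CSS once as `site.css` and upload the slimmed-down page alongside it. A page with no style block comes back unchanged, with an empty string for the CSS.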
Other Office HTML quirks
Compactness isn’t the only reason to use tools other than Microsoft Office for more complex websites. If you turn pages that have pictures in them into web documents, Word will deal with all the pictures for you, creating a folder for each page’s images. You can upload them all easily enough, but it might not be the way you want to organise your website.
For instance, when I’m doing a site, I tend to group all the images – say product images – into one folder. Then, if I had a set of new pictures, perhaps for Christmas, with products covered in snow, I could just upload them all into the same folder. If Word has created a folder of images for each individual product, it’s potentially going to be a lot fiddlier to sort out, instead of one batch upload.
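If you do want Word’s pages to use one shared image folder, the image references in each page have to be rewritten as well. The sketch below is one way to do that, under some assumptions of mine: the shared folder is called `images`, the `src` attributes are quoted, and no two pages use the same file name for different pictures.

```python
import re
from pathlib import PurePosixPath

# Matches src="..." or src='...'; unquoted attributes are not handled.
SRC_ATTR = re.compile(r'(src\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE)


def point_images_at(html: str, shared_folder: str = "images") -> str:
    """Rewrite relative src paths so they point into one shared folder."""

    def rewrite(match):
        prefix, quote, path = match.groups()
        if "://" in path:  # leave absolute URLs alone
            return match.group(0)
        name = PurePosixPath(path).name  # keep only the file name
        return f"{prefix}{quote}{shared_folder}/{name}{quote}"

    return SRC_ATTR.sub(rewrite, html)
```

Bear in mind that if two pages each have their own picture under the same file name, copying both into one folder would overwrite one of them, so you’d need to rename the files first.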
Word is not the only part of Office that can save as HTML. Excel can do it too, and you’ll often encounter tables on the web that have been created by choosing the ‘Save as a web page’ option.
And sometimes they can be utterly confusing. Not only do you get the same privacy issue and extra information as with Word, but one site I regularly look at used to show up another Office flaw well: an Excel spreadsheet listing features of competing products was saved as a web page, with ticks or crosses in the columns to indicate whether a particular product had a feature.
It all looks great in Excel, and great if you’re a PC user with Internet Explorer; but look at it from a Mac, or some other browsers, and instead of easy-to-understand symbols, you end up with characters of the alphabet that shed absolutely no light on what the table was supposed to tell people in the first place.
