Tuesday, 1 May 2012

How to convert Word documents to clean HTML

While you can "Save as" a DOC or DOCX document in Microsoft Word as "Web Page, Filtered", the resulting HTML still includes Word-specific classes such as MsoNormal. That is a bit messy for my taste - I often want clean HTML.

To convert the Word document to pure, simple HTML, rather than trying to email the document to yourself in Gmail (which no longer works the same way, anyway), I've discovered a quicker, cleaner way:
  1. Download and install Windows Live Writer (which I use anyway for blogging - it's the best free blog writing software there is).
  2. Launch Live Writer.
  3. Switch to your Word document and copy everything: press Ctrl-A to select all, then Ctrl-C to copy.
  4. Switch back to Live Writer, click in its document window if necessary, then press Ctrl-V to paste the Word document's text into Live Writer.
  5. In Live Writer, go to the Source view (bottom left). Copy the nice clean HTML from that view and paste it into Notepad or another text editor, then "Save as" an .html file.

That's it. Clean, simple HTML. And no Mso classes either!
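
If you'd rather not install anything and already have a "Web Page, Filtered" file, a small script can do a rough version of the same cleanup. This is not the Live Writer method above, just a minimal Python sketch of a different approach. The function name strip_word_markup is my own invention, and the sketch assumes the only clutter you care about is class attributes starting with "Mso" and inline style attributes. It uses regular expressions rather than a real HTML parser, so treat it as a starting point and check the output by eye.

```python
import re

# Matches a class attribute whose value starts with "Mso", quoted or unquoted,
# e.g. class=MsoNormal or class="MsoListParagraph".
MSO_CLASS = re.compile(
    r'\s+class=(?:"Mso[^"]*"|\'Mso[^\']*\'|Mso[\w-]*)', re.IGNORECASE
)

# Matches an inline style attribute in double or single quotes.
INLINE_STYLE = re.compile(r'\s+style=(?:"[^"]*"|\'[^\']*\')', re.IGNORECASE)


def strip_word_markup(html: str) -> str:
    """Remove Mso* class attributes and inline styles from an HTML string."""
    html = MSO_CLASS.sub("", html)
    html = INLINE_STYLE.sub("", html)
    return html


if __name__ == "__main__":
    # Hypothetical sample line, made up for illustration.
    sample = '<p class=MsoNormal style="margin-bottom:0cm">Hello <b>world</b></p>'
    print(strip_word_markup(sample))
```

To use it on a real file, read the saved .html file into a string, pass it through strip_word_markup, and write the result back out. Anything else Word adds (extra span tags, the style block in the head) is left alone by this sketch.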
