Over the last few weeks I have been listening to the archive of Hanselminutes, a new(ish) .NET developer podcast by Scott Hanselman. Scott is the kind of developer who can’t resist a beta and is currently reporting his progress with Vista via his blog. The shows are usually 20-30 minutes long and cut straight to the chase, giving us developer types the sort of content we crave, so point your podcatching client at the show and enjoy.
I did some digging into the upcoming Open XML formats and found a great article by Ted Pattison explaining the changes MS have made. Essentially the new docx format is a zip file containing a bunch of other files, mostly XML, that together make up a Word document. MS refer to this as the ‘Package’. Ted's article explains how it all hangs together in some detail.
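To get a feel for the ‘Package’ idea, here is a minimal sketch in Python (my choice for illustration, not the .NET route Ted describes) that opens a docx as an ordinary zip file and lists the parts inside. The sample package is built in memory and is only a stand-in, not a complete Word document. The part names are the ones I would expect from the published format and may differ in the betas.

```python
import io
import zipfile


def list_package_parts(docx_bytes: bytes) -> list[str]:
    """Return the names of the parts stored inside a docx package."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        return package.namelist()


def build_sample_package() -> bytes:
    """Build a tiny stand-in package in memory.

    This is NOT a complete, valid Word document; the XML bodies are
    placeholders so the example runs without a real file.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("_rels/.rels", "<Relationships/>")
        package.writestr("word/document.xml", "<document/>")
    return buffer.getvalue()


if __name__ == "__main__":
    for name in list_package_parts(build_sample_package()):
        print(name)
```

To try it on a real document, read the file's bytes and pass them to `list_package_parts` instead of the sample.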
So why did MS go to all this trouble? Well, folks like me want to be able to programmatically work with Word documents on the server side so we can automate certain processes. The old binary formats supported by previous versions were unfriendly, so we tended to use the Word object model or VBA to do this. These were not very performant and less than 100% reliable. According to Ted, the new formats will allow manipulation of Word files without installing Word, via some new .NET classes. To start with this will not be straightforward, as you will first have to familiarise yourself with how a docx package is structured. I imagine some coding gurus have begun work on this already, though, and by the time Office 2007 is released you will be able to purchase a 3rd party component that will make indenting paragraphs a snip.
We have a need to manipulate bookmarks so we can inject data into documents prior to converting them to PDFs. I am on the lookout for worthy open source projects that are attempting to undertake this task, and I would gladly make a contribution to their efforts in order to reap the rewards :o)
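As a rough idea of what the bookmark injection might look like against the XML, here is a sketch in Python using only the standard library. It assumes the main document part marks bookmarks with `w:bookmarkStart` and `w:bookmarkEnd` elements and uses the namespace URI from the published format; the betas may differ. The sample XML, the `CustomerName` bookmark and both function names are made up for illustration, and only the simple case where a bookmark starts and ends in the same paragraph is handled.

```python
import xml.etree.ElementTree as ET

# Assumption: WordprocessingML namespace as published; beta builds may differ.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"

# Hypothetical fragment standing in for word/document.xml.
SAMPLE_DOCUMENT_XML = (
    '<w:document xmlns:w="' + W_NS + '"><w:body><w:p>'
    '<w:r><w:t xml:space="preserve">Dear </w:t></w:r>'
    '<w:bookmarkStart w:id="0" w:name="CustomerName"/>'
    '<w:r><w:t>placeholder</w:t></w:r>'
    '<w:bookmarkEnd w:id="0"/>'
    '</w:p></w:body></w:document>'
)


def bookmark_names(root):
    """Return the names of all bookmarks found in the document."""
    return [el.get(W + "name") for el in root.iter(W + "bookmarkStart")]


def fill_bookmark(root, name, value):
    """Put value into the text runs enclosed by the named bookmark.

    Only handles a bookmark whose start and end sit in the same paragraph.
    Returns True if text was replaced, False otherwise.
    """
    for paragraph in root.iter(W + "p"):
        children = list(paragraph)
        for start_index, child in enumerate(children):
            if child.tag != W + "bookmarkStart" or child.get(W + "name") != name:
                continue
            bookmark_id = child.get(W + "id")
            runs = []
            for sibling in children[start_index + 1:]:
                if sibling.tag == W + "bookmarkEnd" and sibling.get(W + "id") == bookmark_id:
                    texts = [t for run in runs for t in run.iter(W + "t")]
                    # First text element gets the value, the rest are blanked.
                    for position, text in enumerate(texts):
                        text.text = value if position == 0 else ""
                    return bool(texts)
                if sibling.tag == W + "r":
                    runs.append(sibling)
            return False  # bookmark ends in another paragraph: not handled here
    return False


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_DOCUMENT_XML)
    print(bookmark_names(root))
    fill_bookmark(root, "CustomerName", "Ada Lovelace")
    print("".join(t.text for t in root.iter(W + "t")))
```

In a real package the XML would be read from the main document part, modified, and written back into the zip before the PDF conversion step.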
Another great article that goes into more detail
A project with example code on how to create a simple docx from scratch.
I have been using Google Spreadsheets to see how useful a tool it really is. I uploaded a small Excel sheet that I built for the purpose of testing out the GS application.
Unlike Excel, in GS you have to insert extra rows and columns as and when you need them. You can perform multiple inserts by highlighting several rows, then right-clicking and selecting insert above or below (with 5 rows highlighted, for example, the menu offers to insert 5). I selected 30 rows and inserted 30 below, then did the same with 60, 120 and 240. When I tried to insert 240, something broke: the 240 rows did not get inserted and there was a permanent Updating… message. Maybe this is just a glitch, or maybe you cannot work with this many rows. I also noticed that Internet Explorer’s memory usage rose to ~80 MB; typically it is about 20 MB.
My conclusion is that Google Spreadsheets is currently only suitable for light duties where online access and/or collaboration is needed. With time and effort on Google's part this may change.
I just wasted a day figuring this out. Basically we have an ASP.NET application written in VB.NET where all the pages inherit from a base page. The base page in turn inherits from System.Web.UI.Page. We use the base page along with some user controls to ‘theme’ our application.
The application is quite mature, so I was a little surprised when I checked it out of source control and it blew up in my face when I attempted to edit one of the web forms in the forms designer. A large message box would pop up complaining that ….
In the end, the solution was to remove the base page file from the project, copy its contents to a new file and add that back to the project. After that, all was well again.
Apparently VS2003 is not very smart when it comes to design time / run time issues. Good things are already being said about VS2005 in this regard, but time will tell.
We recently visited the Cornish Seal Sanctuary, and I have posted a video on Google Video that I edited with Windows Movie Maker. This is the "video editing for dummies" utility offered free by Microsoft.