Monday, January 04, 2010

Our Infrastructure Upgrades - Domino (Part 1)

Our Domino infrastructure upgrades were originally scheduled for early 2009, but they were put on hold when we discovered that Notes 8 closed the "duplicate subform" loophole, which, as it turned out, was used by one of our mission-critical apps.

In fact, the upgrade was on hold for so long that 8.5 and then 8.5.1 were released in the meantime. After a careful look at both, we decided to deploy 8.5 because of some unresolved issues in 8.5.1.

The Notes 8.5 Client
The rollout of the 8.5 client was done quietly, one person at a time. Given that this is such a major change to the client (it is now built on Eclipse), it's hardly surprising that there were issues - some of which are, again, still unresolved.

We did quite a few experiments in that initial rollout:

  • Networked Data Folders
    We tried putting the data directory on the users' home drives (network shares). It's an old trick that provides better (more complete and seamless) roaming than Notes itself does. Unfortunately, Notes still "hammers" the network connection, even at 100 Mbps. The last time I tried this trick, under Notes 4, I had it working (most of the time) at 10 Mbps.

    A partial network install of the Eclipse version of Notes generates a great deal more traffic than its R7 counterpart. This is because, while previous versions of Notes were one big .exe file with a few toolbars and icons, the Eclipse version contains hundreds of little files and resources, all of which are loaded on demand. This bandwidth use, which competes directly with scheduled replication tasks, can add minutes to Notes startup.

    The other downside to running this type of Notes install is that the client needs constant, uninterrupted access to the desktop8.ndk and cache.ndk files. If access drops, even for a second, the Notes client goes into a loop from which the only escape is to end the task. There is actually a way of moving the files, but we didn't bother trying it.

    It was obvious that this method was a "no-go".

  • Standard Multiuser
    The next install type we tried for the clients was the standard multiuser install. This gave much better network performance but restricted our roaming options slightly. Roaming was still supported, but you had to run Notes setup for each user on each machine.

    While this method was certainly faster and more stable than the networked method, it still wasn't as fast as we'd like.

    The reason? Well, it's again related to having all those little files instead of one big one. Normally, when you start an application with a big EXE file, there is only one delay while the on-access anti-virus scanner gets hold of it. When there are hundreds of little startup files, however, the on-access scanner doesn't know what to scan until the app "reaches" for each file. The result is a delay on every file, and as the number of files being read increases, the total delay becomes increasingly significant.

    There is only one way around the problem - remove the on-access scanning. Now obviously we're not willing to do that, so the next best bet was to disable it for the Notes program directory. I was still a little uncomfortable with this idea, but we tried it and got significant improvements.

    Even so, startup was still nowhere near as fast as my Designer client, and after a little investigation we discovered that on-access scanning was still covering the Notes DATA directory. We tried to exclude it via policies, but that didn't work - we couldn't specify the folder name because of that stupid Microsoft profile folder naming convention.

    The data folder was in:
    C:\Documents and Settings\%username%\Local Settings\Application Data\Lotus.

    Since the folder was different for each user, and since our McAfee software doesn't accept variables in exclusion folder names, it wasn't possible to set a policy. I wasn't even going to consider adding the entire "Documents and Settings" folder to the exclusions - if you do that, you might as well turn off the anti-virus altogether.

    In the end, we asked ourselves how important the casual roaming ability was and decided that it was less critical than everyday startup times. We're now converting to a standard single-user install and have perfected a procedure for making the move without a reinstall.
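
For anyone wondering why hundreds of little files hurt so much more than one big EXE under anti-virus scanning, here is a rough back-of-the-envelope model in Python. It is only a sketch: it assumes the scanner adds a fixed overhead every time a file is opened, plus a cost proportional to the amount of data scanned. The function name and every number below are made up for illustration; none of them are measurements from our machines or figures from McAfee.

```python
def scan_delay_seconds(file_count, total_mb, per_file_overhead_s, scan_mb_per_s):
    """Estimate the delay added by anti-virus scanning at startup.

    Assumed model (not vendor documentation): each file opened costs a
    fixed overhead, and the data itself is scanned at a constant rate.
    """
    fixed_cost = file_count * per_file_overhead_s
    data_cost = total_mb / scan_mb_per_s
    return fixed_cost + data_cost


# Hypothetical figures: the same 50 MB of program data, packaged two ways.
TOTAL_MB = 50
PER_FILE_OVERHEAD_S = 0.02   # assumed fixed cost per file opened
SCAN_MB_PER_S = 100          # assumed scanner throughput

one_big_exe = scan_delay_seconds(1, TOTAL_MB, PER_FILE_OVERHEAD_S, SCAN_MB_PER_S)
many_small = scan_delay_seconds(500, TOTAL_MB, PER_FILE_OVERHEAD_S, SCAN_MB_PER_S)

print(f"one big EXE:     {one_big_exe:.2f} s")
print(f"500 small files: {many_small:.2f} s")
```

With these invented numbers, the single EXE costs about half a second, while the 500 small files cost about ten and a half seconds, even though the amount of data scanned is identical. The per-file overhead is what dominates, which matches what we saw: excluding a directory from scanning removes that overhead for every file in it.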

Wow - that post was longer than intended, so I'm breaking it up. Next time I'll go into detail about our client settings (what we changed and why) and talk about our server and design installs.
