In certain computing circles, "clustering" is a dirty word. I've heard of situations where, far from providing business continuity peace of mind, it creates more work and greater risk than there would be without it.
This is not the case with Domino clustering. Done properly, it is extremely reliable.
Our Problem
Recently, our cluster seems to have "picked up a slight flutter". Actually, I think that perhaps the rules behind it may have changed sometime around our 8.0 or 8.5 migration.
So, first I want to cover what our cluster looks like:
Ok, this is quite a simplistic view and there are servers missing. I'm concentrating on the problem area only.
We have an onsite and offsite clustered Lotus Domino server, both running Lotus Domino 8.5 HF 1021. We'll call them "Onsite" and "Offsite" for ease of reference. The servers are quite a distance apart because we're clustering for business continuity purposes.
The theory is that our onsite staff members should access the onsite server unless it is down. The majority of our agents also run on this server, as does an intranet, extranet and several web sites. It's a busy and powerful box.
We discovered recently that many of our clients have been using the offsite server but we don't know exactly why.
It seems that if you open a database for which you don't already have a desktop icon, then the Notes client will default to opening it from the offsite server. What has exacerbated this problem is that we upgraded our clients to 8.5.1 and blew away their desktops. Now, suddenly all the computers are trying to access everything off the offsite server.
The reasons?
We don't know, but we were thinking that it was either:
- Alphabetic: Because "Offsite" comes before "Onsite" in alphabetical order
or
- Task Related: Because the Onsite server is much busier than the offsite one.
Does anyone have any ideas as to how we could go about finding out?
Comments
Folks' clients go to the onsite one first, so based on that I'd vote for the alphabetical explanation.
But it's most likely because computers hate us and enjoy making life hard. ;-)
1) Do you enforce the mail server by desktop policy?
a) There was an SPR addressed in 8.0.2 FP5: AJAS7PDKER
SPR# AJAS7PDKER - After restart, the top of the workspace icon stack does not honor the Mail file location set defined in the location document. This regression was introduced from 8.0.2
2) Have you tried server_restricted on the outside server, to make sure the users only access the inside server?
If you use policies to push out the applications and such to be only on the onsite server then they should find it first.
Domino does lookup by alpha so it is possible you have multiple ways to do this.
You could also disable the offsite server from anyone using it by setting its threshold to maximum and stop people from accessing it, if you so desired.
I looked at server_restricted and although it looks interesting, it seems to be suggesting that replication with a restricted server will fail. Since we need the other server in case of DR, I need things replicated on it.
I'll have a look at the Server_Availability_Threshold notes .ini option.
Second thing. Use a server user restricted to stop users hitting the box, but that will still let replication do its thing.
Last - check SAI and expansion factor / trans info range.
By all means email me if I can help.
P.
While you could use SERVER_RESTRICTED=2 to prevent users from accessing the DR server, this isn't ideal because access won't be seamless in the event the primary server is down. Instead, the better solution is to set SERVER_AVAILABILITY_THRESHOLD=100 on the DR server, which means it is always in a busy state and will only take user connections if no other server is available.
http://www-01.ibm.com/support/docview.wss?rs=0&uid=swg21260389
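As a rough sketch of the two options discussed above, the notes.ini on the DR server ("Offsite" in this post) might carry one of the following. The setting names and values are the ones quoted in the comments; the annotation lines and the choice of which server to apply them to are illustrative assumptions, and behaviour may differ between releases.

```ini
; notes.ini on the Offsite (DR) server: illustrative fragment only.
; The ";" lines are annotations for the reader; leave them out if in doubt.

; Option A: keep the server permanently in a BUSY state, so cluster-aware
; clients are sent to another cluster member whenever one is available.
SERVER_AVAILABILITY_THRESHOLD=100

; Option B (an alternative to A, not an addition): refuse new user
; connections outright. Failover to this server will not be seamless.
; SERVER_RESTRICTED=2
```

If workload balancing is suspected, the availability index each server is currently reporting can be checked at the server console with `show stat Server.AvailabilityIndex`, which may help tell the "task related" explanation apart from the alphabetical one.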
I agree with the previous post; the best way to solve your problem is to use SERVER_AVAILABILITY_THRESHOLD=100.
But you need to know that this parameter is actually bugged and doesn't work, as you can see in the following email I got from IBM:
"I have managed to reproduce the issue using 8.5 and 8.5.1 versions however Development is already aware of the situation and SPR # JSMN825TC8 is opened with the issue. We are expecting the issue to be resolved in 8.5.2 but the status of the SPR is still open and the only available workaround is to use the server_restricted=1 notes.ini value instead.
At the moment we have to wait for the specific SPRs resolution any progress can be checked from the Fixlist database or by directly calling HelpDesk. Please inform me If you require any further assistance from my side or should l conclude the PMR from now on.
Thanks in advance for your understanding "