Monday, May 28, 2007

Who has the Right to Perform Updates on your Computer and how Important are they?

These days, it seems that just about everyone feels that they have the right to update your computer automatically, even for the least important applications.

But some questions need to be asked:

  1. Exactly how critical is the update and what will happen if I don't do it?

  2. How much testing has been done on the impact of this update on the applications I like to run?

  3. What is the update doing to my startup?

  4. Who gets to decide the timing and what warning is given?

  5. What will the application do to my network?

I'm not going to attempt to answer these questions in this post. I'm basically just going to whinge and then post updates as I get my policy in order. At the moment, I don't know what the best answers are.

The Microsoft Whinge
Microsoft used to label only important security updates as "critical", but they are now pushing out all sorts of unwarranted updates, like Internet Explorer 7, via the automatic distribution system.

The really annoying thing about Microsoft's update strategy, though, is that they believe that they have the right to reboot your PC without your permission just because you don't seem to be doing anything much at the time.

I often leave my PC running overnight to do big things, such as download large files, convert pictures to DVD with fancy fades etc. In Microsoft's view, I'm not using my PC (keyboard and mouse), so it's alright to reboot.

There's worse though, much, much worse. I've left the automatic update on the recommended settings on servers just to see what would happen. Microsoft thinks nothing of rebooting the Domino server just to put an IE update on it. I don't even use IE on my servers, but Domino really needs to run 24x7.

I've also seen the Microsoft update reboot a server in the middle of a backup job.

Needless to say, I've turned it off on the servers I care about but now there are some serious considerations at work as to whether or not we need to turn it off on all our PCs as well.
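For what it's worth, there is a middle ground between "recommended settings" and "off". The sketch below is Python that only builds the text of a .reg file; it doesn't touch the registry itself. It assumes the `AUOptions` and `NoAutoRebootWithLoggedOnUsers` values under the Windows Update policy key, and the function name `build_reg_file` is made up for illustration. As I understand it, the no-reboot value only helps while somebody is logged on, so it won't protect an unattended server by itself.

```python
# Sketch: generate a .reg file that tames automatic update reboots.
# The key and value names are the Windows Update policy settings as I
# understand them; check them against your own Windows version first.

POLICY_KEY = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU"
)


def build_reg_file(au_options=3, no_auto_reboot=True):
    """Return the text of a .reg file for the given update policy.

    au_options: 2 = notify before download, 3 = download and notify
                before install, 4 = download and install on a schedule.
    no_auto_reboot: if True, ask Windows not to restart automatically
                    while a user is logged on.
    """
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{POLICY_KEY}]",
        f'"AUOptions"=dword:{au_options:08x}',
        f'"NoAutoRebootWithLoggedOnUsers"=dword:{int(no_auto_reboot):08x}',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the file so it can be reviewed before anyone imports it.
    print(build_reg_file())
```

Setting `au_options` to 3 means updates are downloaded but nothing installs (or reboots) until somebody says so, which is probably the safer choice for a server.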

Other Update Culprits
Microsoft isn't the only one. There are lots of other update culprits too, with probably the worst of these being:

  • Sun Java
    Aside from being overly interactive and very slow to update, the Sun Java update has one particularly nasty flaw. I've never seen "good" come from it. In other words, I've never had an instance where something that didn't work suddenly started working after the update. Of course, I've seen plenty of things go the other way - for instance, I can no longer remotely manage our Symantec Firewall from my PC. Luckily, I have another un-updated PC I can use. This update should certainly be stopped.

  • Anti-Virus (Symantec and McAfee)

    Last Thursday, McAfee decided to do an automatic update to their personal firewalls. There were a number of side-effects. The update blocked most applications, including those previously given authorization, from the internet. It forced users to reboot. Sure, it did give them a Yes/No choice, but it also prevented any access to the file servers (bad luck if you had open files). Finally, it reset everything back to an "untrusted network" status. The timing was very unfortunate as Thursday was our "Board of Directors' meeting day" and everyone was in a rush. It basically meant that I had to drop everything to deal with the problem - and it took all morning.

    Although the update was messy, at least it worked. We ditched Symantec about a year ago because it was deploying updates which failed regularly. From the look of the news, this is still happening - Article: Anti-virus cock-up paralyses millions of PCs (thanks Anna for the link).

    So should anti-virus updates be turned off? Certainly not, but there has to be a better way of testing them first. I'll be contacting McAfee today (hopefully) and will post some results (if I get them) here.

  • Adobe Acrobat

    Adobe should win some kind of award for the worst deployment mechanism for updates. Where else can you find an update system that wants to reboot in the middle of an update and then continue installing? Also - why do they keep trying to sneak extra bandwidth-hogging software in? Photoshop LE? Yahoo Toolbar? Come on, if we wanted it we would go get it ourselves. You should certainly remove the automatic Acrobat update utility from your computer. It doesn't serve any useful purpose as far as I can see.

  • RealPlayer

    Unlike its nice brother QuickTime, RealPlayer loves to be automated and loves to be updated. I don't play a lot of RealPlayer stuff but I've noticed that it refuses to play files and wants to update almost every time I use it. Unless you absolutely need to access RealPlayer things, this belongs OFF your system - not the updates, the whole application. If you do need to run it, go through the preferences and file associations with a fine-tooth comb; nearly everything is NOT what you would want. Oh, and get it out of startup.

Mucking Around with Startup
This post is getting long, so I'm not going to worry about covering startup here. Suffice it to say that you should get your hands on Mike Lin's excellent Startup Control Panel Applet and start blowing unnecessary things out of startup. I'll explain how to identify things in another post, but for the moment, consider removing the following from startup:

  • Adobe Acrobat Speed Launcher

  • Adobe Acrobat Assistant

  • Quicktime Tasks

  • iTunes Helper

  • RealPlayer

  • Sun Java Update Sched

  • CD Burner Utilities (like Nero)

  • MS Messenger unless you use it

  • Spyware - such as anything starting with WhenU

You don't need these applications in startup, and they add a lot to the load time of your computer. If you need Acrobat, you can start it (or it will start when you open a PDF file). You may need to wait a few seconds, but better to lose a few seconds there than during startup every time. The same applies to most of the other applications.
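If you'd rather check a list of startup entries than eyeball them, the idea can be sketched in a few lines of Python. This is only an illustration: the name fragments are my guesses at how the items above show up on a real machine (they vary by version), the sample entries are made up, and `flag_unneeded` is a hypothetical helper, not part of any tool mentioned here.

```python
# Sketch: flag startup entries that match a list of known-unnecessary programs.

# Name fragments based on the list above. The exact entry names vary by
# product version, so treat these patterns as assumptions.
UNNEEDED_PATTERNS = (
    "acrobat speed launcher",
    "acrobat assistant",
    "quicktime",
    "ituneshelper",
    "realplayer",
    "jusched",  # assumed executable name of the Sun Java update scheduler
    "whenu",
)


def flag_unneeded(entries):
    """Return the sorted names of entries that match an unneeded pattern.

    `entries` maps a startup entry name to the command line it launches.
    Both the name and the command are searched, ignoring case.
    """
    flagged = []
    for name, command in entries.items():
        haystack = f"{name} {command}".lower()
        if any(pattern in haystack for pattern in UNNEEDED_PATTERNS):
            flagged.append(name)
    return sorted(flagged)


if __name__ == "__main__":
    # Made-up sample data standing in for a real startup list.
    sample = {
        "QuickTime Task": r"C:\Program Files\QuickTime\qttask.exe -atboottime",
        "Antivirus Shield": r"C:\AV\shield.exe",
        "SunJavaUpdateSched": r"C:\Java\bin\jusched.exe",
    }
    for name in flag_unneeded(sample):
        print("Consider removing:", name)
```

The sketch only reports candidates; actually removing anything is still best done by hand, one item at a time, so you can tell what broke.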

Thursday, May 24, 2007

Recovering Deleted and Formatted Photos from CompactFlash and other Camera Cards

Last week my neighbour accidentally pressed the format button on her camera. After all, format usually means something quite different in non-computing language. To her horror, her camera quickly counted up percentages and then responded with "Card is Empty". She had not backed up any of her photos and was quite upset because the manual said that they were gone forever.

Now I had heard that it was possible to unerase files on these cards, but I wasn't so sure about formatting. I tried a few things and presto, the photos came back. Well, actually it was quite a while later, because recovery of over 1000 images took almost 3 hours.

Here's how to do it...

A Few Important Notes

1. Get yourself a card reader.
This is really important. Stop connecting your camera to your computer. Those USB cables carry POWER as well as data. You don't really want to be putting power in places where your camera doesn't expect it, do you? The USB card readers are about $20 these days and well worth it for the time they save (much faster via the card than via the camera). Also, you can't perform these recovery steps with the camera.

2. Delete Less and Archive More
NEVER delete photos that you want to keep, even if you've already printed them. Set up a place on your hard drive where you can store these photos and save them there regularly. You should also burn them to non-erasable media, such as CD-R, on a regular basis.

You should try not to delete any photos, even those horrid blurry ones, unless you have to. If you must delete, it's best to delete a run of photos from one end of the card (i.e. fill the card with 800 photos, then delete the first 100 to make room), or copy the whole lot elsewhere and reformat the card. Deleting files in the middle tends to reduce the chances of recovery.

OK, now on to recovery.

1. Obtain and install this fantastic software - it's free and it's easy to use.
Zero Assumption Digital Image Recovery

2. Run the Software and Select Digital Camera Card (the card needs to be in a card reader and addressable as a drive letter).

3. Wait - there are three passes:
a. A Detect Data pass, which looks at the entire card and locates all "used" blocks of data. This looks pretty much like the defragmentation screens of yesteryear.
b. A Locate Photos phase, where I think the software looks at the blocks of data and tries to identify headers for pictures.
c. A recovery phase, where the software starts creating files.
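I don't know how this particular program works inside, but the general "look for picture headers" idea can be sketched in Python. JPEG files start with the bytes FF D8 FF and end with FF D9, so a naive scan of the raw card data can pull out anything between those markers. The function name `carve_jpegs` and the test bytes are made up for illustration, and real recovery tools have to cope with things this sketch ignores, such as embedded thumbnails (which carry their own markers) and files that aren't stored in one contiguous piece.

```python
# Sketch: naive JPEG "carving" from raw bytes by scanning for file markers.

SOI = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker
EOI = b"\xff\xd9"      # end-of-image marker


def carve_jpegs(raw):
    """Yield byte strings that look like complete JPEG files inside `raw`.

    Each result runs from a start marker to the next end marker. Data that
    has a start marker but no end marker is skipped.
    """
    pos = 0
    while True:
        start = raw.find(SOI, pos)
        if start == -1:
            return  # no more start markers
        end = raw.find(EOI, start + len(SOI))
        if end == -1:
            return  # truncated picture, nothing complete left to recover
        yield raw[start:end + len(EOI)]
        pos = end + len(EOI)


if __name__ == "__main__":
    # Fake "card" contents: padding, two tiny pretend pictures, more padding.
    card = (
        b"\x00" * 16
        + b"\xff\xd8\xff\xe0AAAA\xff\xd9"
        + b"junk"
        + b"\xff\xd8\xff\xe1BB\xff\xd9"
        + b"\x00" * 8
    )
    for number, picture in enumerate(carve_jpegs(card), start=1):
        print(f"picture {number}: {len(picture)} bytes")
```

In practice the raw bytes would come from an image of the card read through the card reader, and each recovered picture would be written to the hard drive rather than back to the card.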

I'd suggest that you DON'T restore to your camera card - instead, you should restore to an alternative location such as your hard drive.

Wednesday, May 02, 2007

Domain Controller Update

After obtaining some older hardware yesterday, we attempted to proceed with the Windows NT Domain Controller installation. Once again, our attempts ended in failure as we found the same problems as before (the network card needed service pack 3 or greater before it would install).

Eventually, we ended up installing Microsoft Windows NT 4.0 Server but not as a domain controller. We then used a product that we found on the Internet called UPromote to convert the server to a Backup Domain Controller. It worked, and our domain infrastructure is now safe for the moment...

The one caveat with UPromote is that it prevents existing NT domains from being upgraded to Active Directory. I am not concerned with this limitation, however, as Active Directory should be designed and implemented from the ground up, not migrated from an existing NT domain.

Tuesday, May 01, 2007

How Video Killed the NT Domain

Our company will shortly be moving office and as part of our pre-move testing, we need to give all of our servers a cold start. This doesn't guarantee that they will start at the new location but at least it proves that they have recently been able to cold boot.

All of our servers, except one, are Microsoft Windows 2003 Server. The one exception is a Microsoft Windows NT 4.0 server which acts as our Primary Domain Controller. We used to have a backup domain controller but when we were asked to move everything to Windows 2003, we lost the device. Windows 2003 Server cannot provide Windows NT Domain services. If you need login services in Windows 2003, you are required to run Microsoft Active Directory.

It was always our intention to put Microsoft Active Directory on our servers at the earliest possible opportunity and to remove the Windows NT server entirely. Unfortunately, we never got around to the job due to other work commitments and the amount of planning required to implement an Active Directory infrastructure.

The result was that we were in a precarious situation where we only had a single domain controller on some very old hardware. Management were aware of the problem, but I don't think they fully understood the implications, as I had been told on several occasions that if we did not need to go to Active Directory immediately, we should leave it for later.

Our biggest mistake was not making sure that we had a backup domain controller available at all times. In the IT world, temporary solutions have a way of becoming permanent and as we'd been discussing the move to Active Directory for 2 years, it should have become obvious that we were going nowhere fast and needed to re-implement the NT domain in a more permanent fashion.

The Cold Boot
We aren't a 24 hour shop and it is much easier to do system maintenance early in the morning than it is to do it late at night. Nobody gets in early but lots of people stay back. In addition, we service the whole of Australia, so people in Perth are always working later than people in Sydney. The side effect of early morning work is that you only get a short window of opportunity and if something goes wrong there isn't much time to fix it before business commences.

Our servers don't get rebooted very often. The domain controller hasn't had a reboot in six months. It's most likely that the last few reboots were warm boots and it probably hasn't had a cold boot in more than two years. I wasn't really expecting a problem - if I thought it likely, I wouldn't have done the reboot that morning.

So, at 6:30 a.m., I powered the server down. I waited a full 10 seconds after power was off and then pushed the power button again. The power came on and the lights flickered for about five seconds and then the server shut down. I tried again with the same result. From the third try onwards, I wasn't even getting any lights flickering; the server was dead.

I quickly removed the cover of the server to determine whether or not there was any kind of burning smell - this would confirm my fears of a power failure. I couldn't detect any smell. With the lid off, I checked the disk drives - they were SCSI. This meant that they could not easily be transferred to another PC.

[An Aside: While I am quite happy to sing the praises of SCSI drives on RAID 5 for data critical servers, I'm really not convinced that RAID is a great way to go for an operating system. It seriously limits your options for transferring drives to other computers and I've heard it said that there is a considerable performance impact.]

Problem Solving
The next step was to remove the server from the computer room. After all, it is just too cramped in there to work. I got the server to my desk and after a bit of a scrounge found some PS/2 cables (everything else is on USB). I tried powering the server on and I got five seconds of lights - better than I was getting in the computer room. After a few more goes, though, the lights stopped appearing.

There wasn't much I could do, so I started working on a PC in the hope of loading NT Server on it. All the time, I was thinking: how am I going to replicate the domain given that the domain controller is dead? I considered ghosting from one server to another, but without power that wasn't really an option. I also thought about our "Full" nightly backup, but how are you supposed to get an operating system from tape on a server to a PC when the tape drive is internal? I am sure that there is a way of doing this but I think I need additional resources. In reality, I think that backups are for data, not for operating systems.

Once I had hit some stumbling blocks on the PC (more on them later), I returned my attention to the server. I powered the server on and it started running. Grateful for at least one piece of good luck, I went back to the PC in the hope that I could get it to operate as a backup domain controller before the primary domain controller failed again. Unfortunately, the primary domain controller only lasted 5 minutes.

I started to make a few phone calls but I really couldn't think of anyone who would be able to bring some hardware with them and install it. The server in question was well and truly out of warranty and was a no-name server. I don't approve of unbranded servers but this one existed in the company long before I started.

Staff started to appear and ask questions about why they couldn't log in. Our server automatically logs everyone out at night, so nobody had a working login. They did, however, have access to Notes mail and the web, which supports the theory that systems segregation is probably more important than integration. Our problem wasn't visible to the outside world as all the external systems were functioning normally.

It was about this time that my boss arrived and, in the most time-honoured of traditions, the server worked almost as soon as he touched the power button. More than that, it made a liar out of me by staying running for hours. It was still making the most dreadful high-pitched noise though, which led to complaints from some staff (who didn't realize how lucky they were to be able to log in at all).

At this time, a technician whom I had contacted earlier arrived. Embarrassingly, the server was operational. The technician listened to the noise and advised that we do something as early as possible. He said he would give us some time to try to set up a backup domain controller and to allow our staff to access their documents.

NT Server Issues
We started trying to build a new backup domain controller on a PC. We installed Windows NT 4.0 Server Service Pack 1 and started going through setup and selected backup domain controller when prompted. Of course we were unable to log in to the network as most network cards need at least Service Pack 4 under Windows NT.

The problem was that you can't apply service packs until you have completed installation and you can't complete installation on an NT Domain Controller until you have replicated the domain settings from a primary domain controller. It was a classic catch-22 and we would not be able to install the server.

We looked at other options. Obviously ghosting was one, but it would only give us a single primary domain controller. The best solution would be to set up a backup domain controller on trusted hardware and then later promote it to primary. This would give us fallback for the future.

The next thing we thought about was installing Windows NT Server as a standalone server and then upgrading it to a domain controller. Yes, I know that it isn't actually possible under Microsoft, but there is a very good product out there called UPromote which we intended to use. We got to the point of installing Service Pack 6a but still couldn't get a network card running as it was too new. By this stage, we had exhausted the hardware available at the office and the technician had come back ready to perform some server maintenance.

Server Maintenance
The technician first replaced the power supply and we waited with bated breath for the server to restart. It did restart but continued to make the awful noise. The noise was coming from the drives but it was obvious that there was not a problem with them. There was however still a problem with the power. Eventually, the technician discovered that the video RAM had suddenly decided to become faulty. We looked around for a video card on-site but as most PCs now have video on board, it was not easy to find one. Eventually, we replaced it with a $600 High Definition TV/Video card from our presentation PC. The server is now running happily.

[One note of interest: Users were still able to access shared drives when the server was down the second time. This was because they were still logged in. All the more reason to do server maintenance at night - BEFORE the server logs people out]

The End?
The story is not complete as we still need to build ourselves a backup domain controller. I'll document this as it happens. There are also a lot of issues coming out of the reboot. In particular, I guess we need to treat a reboot as part of change management even though there isn't really a change involved.

Finally, there will be the move to consider and migration to Active Directory. I will post stories about these as and when they happen.