I recently had to restore a Navision database backup (Native, 2.50) after a database corruption, and was faced with the prospect of not having enough time to complete the restore before our office opened the next day. To speed up the creation of the 40 GB of native Navision files and the restore of the data, I turned on the write cache on the hard drives. The speed of the creation and restore increased by a factor of 4. The database was restored and no data was lost. I know the Navision documentation says to turn off write caching on controllers and hard drives. I’m not questioning the controller cache issue; I’m wondering about the hard drive cache. The Navision manual says data may be lost should the hard drive crash before the data is written from cache to the hard drive. Is it only a matter of losing data, or do I risk corrupting the database should a drive fail? (The drives are mirrored and of course the server is on a UPS.) Any ideas?
I am on the SQL version of Nav 3.7… My server is on a pretty good UPS; it can stay up for quite a few hours before the UPS shuts it down. From everything I had read, I thought I would be safe turning on the write cache on my RAID 5 array (14 drives, all in an external enclosure), not to mention the write cache on my RAID controller (via the BIOS). To make the story short, someone yanked the SCSI cables by accident on Sunday. And guess what, I paid the price of getting only 4 hours of sleep on Sunday and Monday. And yes, the DB did get corrupted! And no, I could not restore from backup, because the warehouse was doing an inventory count and sales was selling. The lesson of my story… power failure IS NOT the only reason to turn the write cache off. Even if nobody pulls your cables, you could still have a BSOD or driver issues. Why risk it? The speed improvement on an already properly tuned server is not that much.
Hi chubby10! Just a question: if you have 14 disks, why do you run RAID 5, which is generally the slowest option for writes? If you could set up RAID 10, your performance should increase remarkably!
I was planning to go with RAID 10, but was told RAID 5 might be faster. I do plan to try RAID 10 in the next few months and see if there is a speed change.
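For anyone weighing the RAID 5 vs RAID 10 question above, here is a rough back-of-the-envelope sketch. It uses the common rule-of-thumb write penalties for small random writes (4 back-end I/Os per write for RAID 5, 2 for RAID 10). The per-drive IOPS figure is an assumed placeholder, not a measurement from this array, and controller cache, stripe size and workload mix can all move the real numbers.

```python
# Rule-of-thumb estimate of random-write throughput for a RAID set.
# IOPS_PER_DRIVE is an ASSUMED placeholder, not a measured value.

def random_write_iops(drives, iops_per_drive, write_penalty):
    """Usable random-write IOPS = raw IOPS / back-end I/Os per logical write."""
    return drives * iops_per_drive / write_penalty

DRIVES = 14            # drive count mentioned in the thread
IOPS_PER_DRIVE = 150   # assumption for illustration only

RAID5_PENALTY = 4      # read data, read parity, write data, write parity
RAID10_PENALTY = 2     # write to both halves of the mirror

raid5 = random_write_iops(DRIVES, IOPS_PER_DRIVE, RAID5_PENALTY)
raid10 = random_write_iops(DRIVES, IOPS_PER_DRIVE, RAID10_PENALTY)

print(f"RAID 5 : ~{raid5:.0f} random-write IOPS")
print(f"RAID 10: ~{raid10:.0f} random-write IOPS")
```

Under these assumptions RAID 10 comes out at twice the random-write rate of RAID 5 on the same drives, at the cost of usable capacity. Reads are a different story, so the only way to settle it for a given Navision workload is to test.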
I’m not talking about turning on the write cache on the controller card. I’d never do that for Navision; that’s asking for trouble. I’m wondering about the cache on the hard drive. In a mirrored setup, even if one drive crashed, you’d still be writing to the second. A BSOD would not stop the hard drive from writing out everything already in its cache. The hard drives we’re looking at (WD Raptors, 10K) use command queuing technology. I talked to technical support at Western Digital and they assured me that data is written to the disk in FIFO order. I don’t believe them; I believe the drive writes in whatever order is most efficient, not in FIFO order. If it were FIFO, I’d have no concern. I’m trying to figure out what scenario could corrupt the database. If there weren’t such a speed advantage I wouldn’t make an issue of it. Any ideas?
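To make the FIFO question concrete, here is a toy model of why write ordering matters. It is only a sketch: it assumes a hypothetical database that issues a data page write followed by a commit marker write, which is not necessarily how the native Navision database lays out its writes. The names `data_page`, `commit_marker` and `is_corrupt` are made up for illustration.

```python
from itertools import permutations

# Order in which the (hypothetical) database issues its writes.
WRITES = ["data_page", "commit_marker"]

def states_after_crash(order):
    """Every on-disk state reachable if the drive loses its cache
    after 0, 1, ... writes have landed in the given order."""
    return [frozenset(order[:k]) for k in range(len(order) + 1)]

def is_corrupt(on_disk):
    """Corrupt = the commit marker says 'done' but the data never landed."""
    return "commit_marker" in on_disk and "data_page" not in on_disk

# FIFO cache: writes land exactly in the order they were issued.
fifo_states = states_after_crash(WRITES)

# Reordering cache: the drive may flush the writes in any order.
reordered_states = {
    state
    for order in permutations(WRITES)
    for state in states_after_crash(list(order))
}

fifo_corrupt = [s for s in fifo_states if is_corrupt(s)]
reordered_corrupt = [s for s in reordered_states if is_corrupt(s)]

print("corrupt states with FIFO cache:      ", len(fifo_corrupt))
print("corrupt states with reordering cache:", len(reordered_corrupt))
```

In this model a FIFO cache can only lose the tail of the write stream, so a crash costs you recent data but leaves a consistent state. A reordering cache can leave the commit marker on disk without the data it refers to, and that is the corruption case rather than the data-loss case.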
As they would say in other forums, I’ll put on the flak jacket… At a certain point, if you want the performance you have to trust the hardware. BUT this is not blind faith. You will have to darn well do all you can to make sure that the hardware will NOT FAIL. And you have to have a procedure to recover if and when it does fail. So you have to do your homework (eliminate or minimize failure points), don’t pinch pennies, and get proven stuff (not second-rate equipment).
An example of drive caching on major production systems is EMC2. A few years ago I was talking to the guys at EMC2. They have a drive storage system with a LOT of cache in front of hundreds of GB of drive storage. If the hw “dies”, that is a LOT of data that won’t get onto the drives. But their stuff isn’t cheap, and it is reliable (well, for the most part… grimace).
And work on the environment and the dumb stuff:
- Label ALL the cables, so you don’t unplug the wrong one (don’t ask…).
- Get a UPS with enough capacity (not just 5 min of capacity; seen this one too).
- Use securable racks. These are racks in secured enclosures, so people can’t get to either the front OR the back of the equipment w/o unlocking it. See the next item…
- Discipline/stupidity… I don’t know how else to say this. I had a server “failure” where the users could not access it, yet it was running. “Someone” had removed the network cable from the server (cuz the cable wasn’t lying on the floor… it was GONE!!!). “I need a patch cable… hey, I know who has one on his server, he won’t miss it…” grrrr. Replace “patch cable” with “SCSI array cable” and you see what I mean.
gud luk
Gary