Log / Cache Device - If Failed?

Added by David Venter about 1 year ago

Hello,

I've just learnt through loads of reading how a failed log device can result in a totally corrupt system. It's also my understanding that a failed log device can prevent the pool from being imported, possibly leaving a single point of failure with zero recovery?

Please could somebody confirm this for me?

Secondly:

  • Is it possible to add a mirror to a log drive on a live system?
  • Is it possible to remove it and prevent the above single point of failure?

The log and cache devices that I'm using are Intel 320 series SSDs: 80GB for the log and 120GB for the cache.

Please feel free to ask for any information that you may require and I'll gladly forward it.

Thank you in advance.


Replies

RE: Log / Cache Device - If Failed? - Added by Linda Kateley about 1 year ago

Please could somebody confirm this for me?

Yes. We typically recommend mirroring log devices.

Secondly:

Is it possible to add a mirror to a log drive on a live system?

Yes, but the mirror only protects writes that are logged from that point forward.

Is it possible to remove it and prevent the above single point of failure?

Yes, log devices can be removed.
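
For anyone following along, the mirroring step mentioned above might look roughly like the sketch below. It is a minimal, untested outline rather than a verified procedure: the pool name `tank` and the device names `c1t2d0` (existing log) and `c1t3d0` (new SSD) are placeholders, not details from this thread. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: attach a second SSD to an existing single log device so the
# log becomes a two-way mirror. All names below are hypothetical.
POOL="${POOL:-tank}"          # placeholder pool name
LOG_DEV="${LOG_DEV:-c1t2d0}"  # placeholder: current log device
NEW_DEV="${NEW_DEV:-c1t3d0}"  # placeholder: SSD to add as the mirror half
DRY_RUN="${DRY_RUN:-1}"       # 1 = only print commands, 0 = execute them

# Print the command, and run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

mirror_log() {
    run zpool status "$POOL"                        # find the device under "logs"
    run zpool attach "$POOL" "$LOG_DEV" "$NEW_DEV"  # turn the log into a mirror
    run zpool status "$POOL"                        # confirm the mirrored log
}

mirror_log
```

Check the device names against your own `zpool status` output before setting DRY_RUN=0.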

RE: Log / Cache Device - If Failed? - Added by David Venter about 1 year ago

Hello Linda,

Thank you so much for your reply.

I assume that by removing the log device, I'll be eliminating the single point of failure?

What would be the easiest/best way to remove this on a live system?

RE: Log / Cache Device - If Failed? - Added by David Bond about 1 year ago

My understanding is that pool version 19 added the ability to remove the log device (ZIL), which moves the ZIL back into the pool. It also allows the ZIL to fall back to the pool in case the log device fails. If the log device fails, your pool should still import fine (it wouldn't prior to v19), and it should also fail over to the pool without a problem. The only time you risk losing data is if the log device fails while there are uncommitted writes and you then get a crash or panic (which could itself be caused by the log device failing); in that case the uncommitted data would not be written when you start up and import the pool.

But it's not supposed to be a problem. The only way to be sure you will be OK is to test it (pull the log device out while the system is running), but you appear to be running it in production, so that may be a problem.

RE: Log / Cache Device - If Failed? - Added by David Venter 12 months ago

Hello David,

Thank you for your reply, that makes me feel slightly better but of course not 100% confident.

For the record: "This system is currently running ZFS pool version 28."

So, given the above, I should be safe with a failover to the pool?

Is it easy enough to safely remove the ZIL, so that I no longer have a single point of failure?
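
As a quick sanity check of the version question, something like the sketch below could be used. It assumes the v19 threshold given in the previous reply; the value 28 is the one reported in this post, and on a real system you would read it from the VALUE column of `zpool get version <pool>` instead of hard-coding it.

```shell
#!/bin/sh
# Sketch: decide whether a pool version is new enough for log device removal.
# MIN_VERSION comes from the earlier reply in this thread (v19).
MIN_VERSION=19

# Succeeds (exit status 0) when the given pool version is at least MIN_VERSION.
supports_log_removal() {
    [ "$1" -ge "$MIN_VERSION" ]
}

version=28   # value reported in this post; hard-coded here for illustration
if supports_log_removal "$version"; then
    echo "pool version $version: log device removal should be available"
else
    echo "pool version $version: too old for log device removal"
fi
```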

RE: Log / Cache Device - If Failed? - Added by Jeff Gibson 12 months ago

If you do not have mirrored log devices (I don't in our systems because we've weighed the risks), there is still a window in which the pool could become corrupted.

Let's say you're writing heavily to your ZIL and you have 4GB of data (you'd have to be pushing ~400MB/s for that) that is cached and waiting to be written to your disks. This data is currently stored in two locations: memory and your ZIL device(s). With the current version of ZFS, if your system crashes it will use the ZIL to recover that 4GB of data on the next system boot.

Now let's say instead that you're running along writing 400MB/s to your pool and the ZIL drops offline or fails. The first thing ZFS is going to do is try to flush all of the data that is stored in memory to your pool disks. This means that for however long it takes to flush 4GB of data to your spinning media, ZFS (should) block new transactions until the flush is completed. Here is where you can get into trouble: if ANYTHING causes the system to freeze or reboot before this finishes, your pool is now corrupted because the pool does not know what has or has not been written.

The biggest issue is that it'll seem like the system is frozen while this is going on. Let's say you have a nice healthy pool that's able to write at 100MB/s with small random blocks; you would have to wait 40 seconds before any new data can be written to your pool. In VMware the timeout is 60s before it starts dropping LUNs, thinking they've gone offline. If you have a pool that's only able to write at 65MB/s (or less), your guests may appear to be frozen, possibly causing you to "troubleshoot" by rebooting your SAN.

**Disclosure: this is a bit of a scare tactic, because if your pool can only write at 65MB/s you wouldn't have 4GB of outstanding data in the ZIL; you would only have about 650MB outstanding. This means that unless something has broken inside ZFS, you should only need about 10-30 seconds to recover to a safe state where a crash won't destroy or corrupt your pool.
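
The arithmetic in this post can be checked with a few lines of shell. This is only a back-of-the-envelope sketch using the figures quoted above (4GB taken as 4000MB, write rates of 100MB/s and 65MB/s, a 60s timeout); the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: how long a flush takes for a given backlog and pool write rate.
# flush_seconds OUTSTANDING_MB WRITE_MB_PER_S -> whole seconds, rounded up.
flush_seconds() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

TIMEOUT_S=60   # VMware timeout quoted in the post

for rate in 100 65; do
    secs=$(flush_seconds 4000 "$rate")
    if [ "$secs" -ge "$TIMEOUT_S" ]; then
        echo "4000 MB at ${rate} MB/s: ${secs}s (exceeds the ${TIMEOUT_S}s timeout)"
    else
        echo "4000 MB at ${rate} MB/s: ${secs}s (within the ${TIMEOUT_S}s timeout)"
    fi
done
```

At 100MB/s the flush takes 40 seconds, and at 65MB/s it takes 62 seconds, just past the timeout. With the more realistic 650MB backlog, `flush_seconds 650 65` gives 10 seconds, the low end of the 10-30 second range.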

RE: Log / Cache Device - If Failed? - Added by David Venter 12 months ago

Wow Jeff, thank you for the detailed information! Makes loads of sense!

RE: Log / Cache Device - If Failed? - Added by David Venter 11 months ago

I'd feel much better if somebody could guide me in removing this single point of failure.

I want to remove the ZIL (log drive) and let it just fall back onto the pool.

How would I go about doing this?

RE: Log / Cache Device - If Failed? - Added by Dan Swartzendruber 11 months ago

It should be safe to just do 'zpool remove XXX YYY'. If you're that freaked out that the host might crash in the tiny window, reboot into single-user mode and do so (or boot from a live CD and do the remove).
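
Spelled out, the removal might look like the sketch below, where XXX is the pool and YYY is the log device. The names `tank` and `c1t2d0` are placeholders, not taken from this thread, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: remove a separate log device so the ZIL falls back onto the pool.
# POOL and LOG_DEV are hypothetical; take the real names from `zpool status`.
POOL="${POOL:-tank}"          # placeholder pool name (XXX)
LOG_DEV="${LOG_DEV:-c1t2d0}"  # placeholder log device (YYY)
DRY_RUN="${DRY_RUN:-1}"       # 1 = only print commands, 0 = execute them

# Print the command, and run it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

remove_log() {
    run zpool status "$POOL"             # note the device listed under "logs"
    run zpool remove "$POOL" "$LOG_DEV"  # detach the log device from the pool
    run zpool status "$POOL"             # verify the log device is gone
}

remove_log
```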