Net-burst wrote:
I'm also not an expert, but if the VM suddenly shuts down or hangs, you will have corrupted data if there was a write to the filesystem that wasn't flushed. And this can happen to anyone, because disk writes inside the VM are also cached. A battery-backed RAID is needed to sustain the integrity of the RAID itself and to flush all data from the RAID's own cache. But we also have the VM cache.

The combination of journaling filesystems and the BBU RAID should prevent any filesystem corruption, but you could certainly have application-level data that did not make it to disk if it was only held in in-memory buffers (at any level) at the time of failure. But that's the application's fault: without application flush requests, there is never any guarantee about data consistency on media.
On any system, applications that require such consistency should be handling it themselves with explicit flushing, and nothing you can impose externally can correct things if they don't. Consistency has to start at the top, with the application. Databases (at least ACID-compliant ones), for example, usually have their own level of journaling which is flushed prior to writing any actual record data. (I suppose that case could arguably be considered corruption on restart, but the database will just replay the journal and no data will be lost.) Even a simple logging application needs to flush if it wants any assurance that the data has been written, regardless of what's happening beneath it at the system level. The flush may not turn out to be sufficient, but it's required.
The problem with modern disks, and what write barriers were introduced to help address, is that the disks themselves may cache and reorder writes. So even when the filesystem driver believes it has written data to its journal, or in the proper order (which the application-level flush is trusting to mean its data is on physical media), the data may exist only in the drive's cache, and a sudden outage can end with it never reaching the disk media. A write barrier prevents the filesystem driver from writing any further data until the disk guarantees prior data has hit the media successfully (assuming the disk isn't fibbing, which some have in the past). The exception is LVM volumes, which I believe do not currently pass barrier requests through to the media.
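For reference, barriers are controlled per-filesystem at mount time. A rough sketch, assuming a typical Linux setup of that era (device and mount point names are made up; exact defaults and whether the option appears in `/proc/mounts` vary by kernel version):

```shell
# ext4 enables barriers by default; ext3 historically did not.
mount -o barrier=1 /dev/sda1 /mnt/data    # request write barriers explicitly
mount -o remount,barrier=0 /mnt/data      # disable them, e.g. behind a trusted BBU
grep /mnt/data /proc/mounts               # inspect the options currently in effect
```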
However, the BBU on the arrays in the Linode case solves this in a separate way. It's there to ensure that, at a minimum, any data currently held in its cache is preserved until the following reboot, at which point it is written to media before any other operations. So, at least theoretically, there's no way for data not to reach media once the filesystem driver has handed it to the drive array, and barriers would offer no particular benefit, beyond likely slowing down the application while it waits, in a shared environment, for the data to be written to media. Some early measurements put the hit as high as 30% for some workloads, and I don't think that was even in a shared environment. Now, the BBU isn't quite an absolute guarantee (it could fail, or the disks could stay offline longer than it can maintain the cache, probably a few days at most), but it's pretty darn good, and the most critical applications will have their own way of dealing with actual corruption in such rare cases, à la the databases above.
Perhaps a more succinct way to think of it is that barriers were introduced when you couldn't trust your disks, but rather than barriers, a BBU just lets you trust your disks again. And without the performance hit barriers introduce.
While I'm not 100% sure, I also don't believe barriers have any impact on higher-level application data consistency, since applications would still need to have flushed their internal data (otherwise the filesystem driver might not yet have chosen to write the data at all). The barrier option in ext4, for example, affects the journal commit record (and data sent to the disk prior to that point), but not unflushed in-memory cache data. So appropriate flushing is needed, barrier or not. I'd certainly want any application whose consistency I cared about to explicitly flush the data it requires to be stored.
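The layering argument above can be demonstrated directly: data an application has written but not flushed lives in its own user-space buffer, invisible even to other processes on the same machine, so no barrier or BBU below can help it. A small Python sketch (paths are illustrative; it relies on CPython's default buffered file objects, where a small write stays in the user-space buffer until flushed):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.bin")
f = open(path, "wb")          # buffered: small writes stay in user space
f.write(b"unflushed")         # only in this process's own buffer so far

before = open(path, "rb").read()  # another reader sees nothing yet

f.flush()                     # hand the bytes to the kernel page cache
after = open(path, "rb").read()   # now visible to every reader on this host

os.fsync(f.fileno())          # only now is the data asked to reach media
f.close()
```

Everything below the `flush()` line (page cache, barriers, RAID cache, BBU) is moot for those nine bytes until the application itself lets go of them.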
-- David