db3l wrote:
bji wrote:
I found that s3fs is kind of poorly written (I knew this going in, which is why I was really hesitant to use it, but now that it's my best option I've switched over to it)
I'm curious whether you gave s3backer any testing? I haven't had need to set up S3 access for one of my nodes yet, but I sort of like the s3backer design, and it's a somewhat more active project. You would lose the ability to reference buckets externally, but as you say with respect to gallery2, you're not doing that right now anyway.
Treating S3 as a raw block device, as s3backer does, likely has different trade-offs (including potential operational costs) than s3fs's approach, but then again maybe it wouldn't have quite as many warts?
-- David
s3backer is a much, much better designed and implemented filesystem from what I have seen. I had some interaction with the author, and both from the code and from his communication, it's definitely something I would trust *a lot* more than s3fs.
However, s3backer stores logical file blocks on S3, not whole files, so as you said, you completely lose the ability to serve those files directly from S3. Given that I still intend to eventually hack gallery2 to redirect the browser directly to the S3 URLs, I want my files stored in a form a browser can download directly.
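For what it's worth, that redirect hack mostly comes down to mapping a gallery file path to the object's public URL. A minimal sketch of such a mapping follows; the function name is hypothetical, I've assumed S3's virtual-hosted bucket URL style, and of course the bucket would have to permit anonymous reads for the browser to fetch it:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the direct-download URL for an S3 object,
 * using the virtual-hosted bucket style (bucket name in the hostname).
 * Assumes the key is already URL-safe and the bucket is public-readable.
 * Returns 0 on success, -1 if the URL would not fit in the buffer. */
static int s3_object_url(const char *bucket, const char *key,
                         char *url, size_t len)
{
    int n = snprintf(url, len, "https://%s.s3.amazonaws.com/%s",
                     bucket, key);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

gallery2 would then emit an HTTP redirect to that URL instead of streaming the image through the web server itself.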
There is another problem with s3backer, one that I raised with the author but that we don't see eye to eye on. s3backer relies on Linux's standard block cache for caching; it doesn't persist cached blocks to disk on its own. The author considers this the more elegant approach because it reuses an existing Linux mechanism. The problem is that every time your server reboots you lose your cache, because it was never persisted to disk; it also limits the effective cache size to what fits in virtual memory. I would not want that behavior; I would not want to re-download several gigabytes of thumbnails and medium-sized images every time the server rebooted. s3fs persists its cache on disk, so it survives a reboot.
Also, I did test s3backer in an early form and it was very slow; but I believe the author has done considerable performance work since then, so I can't say whether it still is.
As I may have written in these forums before, at one time I was on the road to implementing my own version of s3fs with more robustness and better performance. My idea was to implement the caching separately, as a FUSE filesystem that did nothing but cache requests to one mount point, storing the cached files on some segment of the disk; this could be layered over any filesystem and would solve the caching problem generically. For example, you'd do something like:
cachemount /mnt/foo /mnt/bar /var/cache/baz
This would intercept any filesystem request for a file under /mnt/foo, first checking for a locally cached copy under /var/cache/baz; if none was found, it would load the file from the corresponding location under /mnt/bar, cache it under /var/cache/baz, and then satisfy the request from the cached copy.
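The core of that cachemount idea (check the cache directory first, fall back to the backing mount, populate the cache on a miss) can be sketched without any of the FUSE plumbing. This is purely my own illustration of the read-through policy described above; the function names are made up, and a real implementation would also need locking, eviction, and invalidation when the backing file changes:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Copy src to dst byte-for-byte; returns 0 on success, -1 on error. */
static int copy_file(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb");
    if (!in) return -1;
    FILE *out = fopen(dst, "wb");
    if (!out) { fclose(in); return -1; }
    char buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
        fwrite(buf, 1, n, out);
    fclose(in);
    fclose(out);
    return 0;
}

/* Read-through resolution: given a path relative to the mount point,
 * return (in 'resolved') a local path that can be read, preferring the
 * cache directory and filling it from the backing directory on a miss.
 * Returns 0 on success, -1 if the file exists nowhere. */
static int cache_resolve(const char *backing_dir, const char *cache_dir,
                         const char *rel, char *resolved, size_t len)
{
    char cached[4096], backed[4096];
    snprintf(cached, sizeof cached, "%s/%s", cache_dir, rel);
    snprintf(backed, sizeof backed, "%s/%s", backing_dir, rel);

    struct stat st;
    if (stat(cached, &st) != 0) {           /* cache miss */
        if (copy_file(backed, cached) != 0) /* populate from backing store */
            return -1;
    }
    snprintf(resolved, len, "%s", cached);
    return 0;
}
```

In the cachemount example above, backing_dir would correspond to /mnt/bar and cache_dir to /var/cache/baz, with a FUSE layer translating each open/read on /mnt/foo into a cache_resolve call.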
Then the S3-based filesystem itself would be pretty simple, as it wouldn't need any internal caching at all; it would just assume completely uncached access to S3. In the end, to mount an S3 filesystem with local disk caching, you'd do something like this:
cachemount /mnt/foo /mnt/bar /var/cache/baz
(as above: requests for files under /mnt/foo are satisfied from the cache in /var/cache/baz, backed by files in /mnt/bar)
s3mount bucket /mnt/bar
(which would cause any request for a file under /mnt/bar to be satisfied by an S3 request for the corresponding object in the given bucket)
I only got as far as writing a robust and fast S3 interface in C, which I turned into a library (libs3) and actually licensed to several companies for enough $$$ to make the entire exercise well worth my while. But I lost interest and never finished the rest. Maybe someday ...