Linode Forum
Linode Community Forums
Posted: Thu Nov 19, 2009 3:01 pm
Senior Member

Joined: Sun Aug 10, 2008 11:26 am
Posts: 104
Location: ~$
I'm serving user-uploaded pictures from the filesystem using nginx. I'm not using a database for performance reasons. I want this to be able to scale like crazy, like into the hundreds of thousands of pictures. If I select the filesystem carefully, is there any problem with dumping all the files into the same directory?

I was thinking of creating multiple buckets to distribute the files based on hashing their id. But then, how many buckets do I need? Sub-buckets?

ReiserFS and ext3 both support b-tree searches. I read that ext3 supports around 10**20 files per directory, but I couldn't find any data for ReiserFS.

Anybody have experience doing this kind of thing?


Posted: Thu Nov 19, 2009 6:11 pm
Senior Member

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
It may be best to avoid MurderFS until its future is a bit more certain.


Posted: Thu Nov 19, 2009 7:17 pm
Junior Member

Joined: Sat Oct 24, 2009 2:16 pm
Posts: 21
You could look at hash-like databases (CouchDB, etc.) and then do a quick lookup to get the file location (create 100 dirs and randomize which dir is used for each image). Or create dirs as you go, making sure each dir only has X images in it.
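
If the images have sequential numeric ids, the "create dirs as you go" idea can be done without any lookup table. This is only a rough sketch: the cap of 1000 files per dir, the `bucket_path` name, the `/srv/images` root, and the `.jpg` extension are all made up for illustration.

```python
from pathlib import Path

FILES_PER_DIR = 1000  # assumed value for "X"; pick whatever cap you like


def bucket_path(root, image_id, ext="jpg"):
    """Map a sequential numeric id to root/<bucket>/<id>.<ext>.

    Ids 0-999 land in bucket 00000, ids 1000-1999 in 00001, and so on,
    so no directory ever holds more than FILES_PER_DIR images.
    """
    bucket = image_id // FILES_PER_DIR
    return Path(root, f"{bucket:05d}", f"{image_id}.{ext}")


print(bucket_path("/srv/images", 123456).as_posix())  # /srv/images/00123/123456.jpg
```

New directories only appear as the id counter grows, and the path can be recomputed from the id alone.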

Also, it might be worth looking at pastebin's source code (http://pastebin.com/pastebin.php?help=1).


Posted: Thu Nov 19, 2009 9:30 pm
Senior Member

Joined: Wed May 13, 2009 1:18 am
Posts: 681
funkytastic wrote:
Anybody have experience doing this kind of thing?

I'd avoid extremes such as that. Even if the filesystem technically supports that many files in a single directory, various admin tools you may wish to use when working with that tree are likely to bog down, sometimes severely.

I'd certainly suggest sharding the set of files among one or more levels depending on your expected scale. If you're in control of the filenames (say assigning uuids or something), just create a few levels based on initial characters. For example, with a uuid scheme, using 2-character directories (00-ff) with 2 levels you can support a million files with an average leaf directory size of about 16, assuming even uuid distribution.
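
A minimal sketch of that layout, assuming the ids are uuid hex strings. The function name `sharded_path` and the `/srv/images` root are made up for the example and are not from any particular library.

```python
import uuid
from pathlib import Path


def sharded_path(root, file_id, levels=2, width=2):
    """Build root/ab/cd/abcd... from the leading hex characters of file_id."""
    # Normalize so "3F2A-..." and "3f2a..." map to the same directory.
    name = file_id.replace("-", "").lower()
    # Level 0 uses characters [0:2], level 1 uses [2:4], and so on.
    parts = [name[i * width:(i + 1) * width] for i in range(levels)]
    return Path(root, *parts, name)


# A random uuid lands in one of 256 * 256 = 65536 leaf directories.
new_id = uuid.uuid4().hex
print(sharded_path("/srv/images", new_id))
```

The path is derived purely from the id, so whatever serves the files can rebuild it without a lookup. Leaf directories would still need to be created on first write, for example with `path.parent.mkdir(parents=True, exist_ok=True)`.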

If you're only going to be in low hundreds of thousands, a single level of directories would still average only ~400 files in each leaf node per hundred thousand.
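
The averages are simple division, sketched here assuming ids spread evenly over hex-named directories (`average_leaf_size` is just an illustrative helper):

```python
def average_leaf_size(total_files, levels, width=2):
    """Average files per leaf directory for hex-named shard directories."""
    dirs_per_level = 16 ** width        # 2 hex characters -> 256 directories
    leaves = dirs_per_level ** levels   # 256 for one level, 65536 for two
    return total_files / leaves


print(average_leaf_size(100_000, levels=1))    # 390.625
print(average_leaf_size(1_000_000, levels=2))  # 15.2587890625
```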

If you don't have control over the filenames, you may want to hash the filename and then use characters from the hash since otherwise common naming patterns could significantly skew the tree.
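
A rough sketch of the hashed variant. SHA-1 is an arbitrary choice here (any hash with reasonably uniform output would do), and `hashed_path` and the root directory are invented for the example. It also assumes the filename has already been sanitized, i.e. contains no path separators.

```python
import hashlib
from pathlib import Path


def hashed_path(root, filename, levels=2, width=2):
    """Place filename under directories taken from a hash of its name."""
    # Hash the name so patterns like IMG_0001.jpg, IMG_0002.jpg
    # do not all pile into the same directory.
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    # Keep the original filename at the leaf; only the directories are hashed.
    return Path(root, *parts, filename)


print(hashed_path("/srv/images", "IMG_0001.jpg"))
print(hashed_path("/srv/images", "IMG_0002.jpg"))
```

The same filename always maps to the same directory, so the location can be recomputed at request time.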

-- David


Posted: Thu Nov 19, 2009 11:17 pm
Senior Member

Joined: Mon Apr 27, 2009 7:36 pm
Posts: 59
Website: http://www.xenscale.com
Location: Boise, ID
Linode is a good place to house your website, but for scalable mass image hosting you would be better served pushing your images to Amazon S3 or Rackspace Cloud Files.

It's going to be cheaper for you in the long run when it comes to raw file storage (but potentially more for actual bandwidth). Rackspace has a CDN built in, with no extra charge for bandwidth from Cloud Files to the CDN edge, which Amazon does charge for.

Amazon has better access controls.

This would be the more scalable way to do it, and the infrastructure is already there, so you don't have to reinvent it.


Posted: Fri Nov 20, 2009 12:18 am
Senior Member

Joined: Sun Aug 10, 2008 11:26 am
Posts: 104
Location: ~$
Thanks for pointing out Rackspace Cloud Files. I hadn't heard of it. I just signed up and it looks good so far. This will certainly simplify things!

