Awesome - thanks for the info, guys.
Quote:
There's no hard limit on files in a directory though, as long as you have free inode blocks on the filesystem. So it does depend on the initial filesystem creation of inode space, but that's usually more than enough for the practical number of files that would use up the actual data space. You can use "df -i" to see how things stand on the filesystem in question.
Running df -i shows I'm using 20% of my available inodes, while my disk space is 50% used, so it seems there's no immediate danger of running out. Glad I know this now, though, so I can keep an eye on it.
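If you ever want to check this from inside a program rather than shelling out to df -i, the same numbers are available via statvfs. A minimal sketch (POSIX only; the path "/" is just an example mount point):

```python
import os

def inode_usage(path="/"):
    """Return the fraction of inodes in use on the filesystem holding `path`.

    This reads the same counters that `df -i` reports: f_files is the
    total inode count, f_ffree is the number still free.
    """
    st = os.statvfs(path)
    if st.f_files == 0:  # some filesystems report 0 total inodes
        return 0.0
    return 1.0 - (st.f_ffree / st.f_files)

print(f"inodes used: {inode_usage('/'):.0%}")
```

Handy for a monitoring cron job so the "keep an eye on it" part doesn't depend on remembering to run df by hand.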

Quote:
As for downsides, it's mostly a question of performance at very large sizes, which in turn will depend on the applications being used and whether or not they become inefficient in processing very large numbers of files.
Makes sense. I do notice things are a little sluggish when trying to, say, auto-complete filenames in the shell or run an ls; however, my application never tries to list the files - it always knows the exact filename it's looking for, and performance doesn't seem to be suffering.
Quote:
For example, with such large sets of files, there's usually some pattern to the naming, and if an even distribution, you can create an extra level of sub-directory using the first character or two of the filename. So a file of an arbitrary name is still trivial to locate (including the containing directory), but you divide things up into smaller chunks of files.
Yep, that's actually what I'm planning to do, and it also gives me natural partitions for spreading this out over multiple servers when that becomes necessary.
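The prefix-sharding scheme described above is simple to sketch. Here's one way it might look (the `shard_path` helper and a two-level depth are my own illustrative choices, assuming filenames are evenly distributed and at least `depth` characters long):

```python
import os

def shard_path(root, filename, depth=2):
    """Map a filename to a sharded directory path using its first characters.

    e.g. shard_path("/data", "abcdef.blob") -> "/data/a/b/abcdef.blob"
    Because the shard is derived from the name itself, any file is still
    trivial to locate given only its filename.
    """
    parts = [filename[i] for i in range(depth)]
    return os.path.join(root, *parts, filename)

print(shard_path("/data", "abcdef.blob"))  # /data/a/b/abcdef.blob
```

With hex-ish names, depth=2 gives 256 buckets, so 136k files becomes ~500 per directory; the prefix also makes a natural key for routing to different servers later.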
Quote:
If you've got 136k files in a single directory, you might want to be asking yourself if that should be living in a database instead.
It's actually done this way on purpose - it's essentially blob data, I always need an entire blob at a time, and it's not relational in any sense. In general, I've found relational databases to be the most expensive way (in terms of I/O and CPU) to store and access data like this, whereas in this particular case filesystem access is efficient and cheap.
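For this whole-blob-at-a-time access pattern, the entire "store" boils down to a pair of helpers. A minimal sketch (the function names and layout are hypothetical, not from the thread):

```python
import os
import tempfile

def write_blob(root, name, data: bytes):
    """Write a blob under `root`, creating any shard subdirectories in `name`."""
    path = os.path.join(root, name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return path

def read_blob(root, name) -> bytes:
    """Read an entire blob by exact name - no directory listing needed."""
    with open(os.path.join(root, name), "rb") as f:
        return f.read()
```

Since lookups are always by exact name, neither function ever enumerates a directory, which is why the per-directory file count barely matters for the application itself.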
Thanks everyone!