|
Scaling horizontally can be done equally well with dedicated machines, but that's beside the point.
I'd say start with a 512 node. As traffic grows you'll see whether you need to scale horizontally or vertically. For example, if your application processes max out memory while CPU and IOPS stay relatively unused, you'll benefit from vertical scaling. If you're using too much CPU without utilizing all the memory, you should scale out horizontally.
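To make that rule of thumb concrete, here is a rough sketch in Python. The function name, the 0-to-1 utilization scale (fraction of the node's total capacity) and the 85%/50% thresholds are all my own assumptions for illustration, not anything Linode prescribes; tune them to your own monitoring.

```python
def scaling_hint(cpu, mem, iops, high=0.85, low=0.50):
    """Return a rough scaling hint from utilization fractions (0.0 to 1.0).

    The thresholds are arbitrary illustrative defaults, not measured values.
    """
    # Memory is exhausted while CPU and disk sit idle: a bigger node helps.
    if mem >= high and cpu < low and iops < low:
        return "vertical"
    # CPU is saturated but memory is not: spread the load over more nodes.
    if cpu >= high and mem < high:
        return "horizontal"
    # Nothing is clearly the bottleneck yet.
    return "keep watching"
```

For instance, `scaling_hint(cpu=0.2, mem=0.95, iops=0.1)` suggests vertical scaling, while `scaling_hint(cpu=0.95, mem=0.4, iops=0.1)` suggests scaling out.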
Now, before someone jumps in to tell me that CPU will seldom be the bottleneck here... I happen to manage a node for a client who runs an extremely inefficient PHP/MySQL application. Each request scans through tens of thousands of rows with complex sorted joins and no proper indices, which translates to an average of 120% CPU for a puny 4 application requests per second (some 20-30 r/s overall including static content), with frequent spikes above 300%.
Another angle: if your app is well written and optimized for your RDBMS, you'll likely max out bandwidth before you max out CPU. In that case you need to scale horizontally.
Horizontal scaling is always nice because it brings redundancy into the equation, but it is more difficult to manage, e.g. you need either to offload uploaded content to a CDN or do some NFS magic with a common static server. At any rate, you'll need to design your app to be server-agnostic, i.e. never rely on assets being available on the local machine: files, images, sessions, ...
However, scaling the database horizontally is not as easy and will require planning. How hard it is depends on your read-to-write ratio, i.e. whether you can manage with a single master for writes and many replicated nodes for reads, which is not that difficult to achieve.
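The single-master, many-replicas setup boils down to sending writes to one place and spreading reads around. A minimal sketch of that routing, assuming a hypothetical `ReadWriteRouter` class and plain strings standing in for real connection objects:

```python
import itertools


class ReadWriteRouter:
    """Send SELECTs round-robin to replicas and everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # With no replicas configured, reads fall back to the master.
        self._readers = itertools.cycle(replicas or [master])

    def route(self, sql):
        words = sql.split(None, 1)
        first = words[0].upper() if words else ""
        if first == "SELECT":
            return next(self._readers)
        # INSERT/UPDATE/DELETE/DDL and anything unrecognized go to the master.
        return self.master


router = ReadWriteRouter("master", ["replica1", "replica2"])
print(router.route("SELECT * FROM posts"))          # a replica
print(router.route("UPDATE posts SET title = 'x'"))  # the master
```

This is deliberately naive: it ignores replication lag, transactions, and statements like `SELECT ... FOR UPDATE` that really belong on the master, so treat it as an illustration of the idea rather than something to deploy.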
Linode offers backups, but I don't know how well that works for busy nodes (it being a file-based backup). Personally I don't use it; I do offsite backups, and the offsite target can be another Linode: tar -mtime ... | curl (S)FTP(S) for images, and database dumps for the database, although that locks the database. I don't know about MySQL, but with PostgreSQL I can do continuous WAL archiving and PITR without locking the db.
Also, design the app with an RDBMS at the core of your data management, but think ahead: some day you might want to introduce layers of memcached or other NoSQL trickery to help performance.
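One way to leave room for that is to put all reads behind a small wrapper from day one, so a cache can be slotted in later without touching callers. A sketch of the usual cache-aside pattern, where `CachedRepo` and `load_from_db` are hypothetical names and a plain dict stands in for memcached:

```python
class CachedRepo:
    """Cache-aside lookup: try the cache first, fall back to the database."""

    def __init__(self, load_from_db, cache=None):
        self._load = load_from_db  # callable that hits the RDBMS
        # A dict is a stand-in here; a real memcached client would replace it.
        self._cache = {} if cache is None else cache

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        value = self._load(key)    # cache miss: go to the database
        self._cache[key] = value
        return value

    def invalidate(self, key):
        # Call this after writing to the database so stale data is dropped.
        self._cache.pop(key, None)
```

The database stays the source of truth; the cache only ever holds copies, which is what makes it safe to add or remove later.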
|