Hi, James!
irgeek wrote:
It's definitely possible. The only caveat I'd give is that you honor any robots.txt files for pages that aren't your own--failing to do so may result in complaints being filed about you.
I don't need to worry about that, because I'll be tracking some pictures - and tags - on Flickr and Fotolog.com using their APIs. It's not rocket science at all, just a Java program running some queries.
irgeek wrote:
As for shipping the data off, you can easily run a MySQL server on a Linode so you don't need to send it off remotely if you don't want to.
Currently I have a hosting plan on another server, but I can't run the crawler on it. On the other hand, I don't want to administer an entire server. I prefer to focus on my business and pay someone to do the dirty work for me.

So a good solution for me is to buy a Linode and continue with my current plan.
Thank you, James!