Linode Forum
Linode Community Forums
 Post subject: RE: S3 post
Posted: Fri Dec 05, 2008 2:24 am
Senior Newbie

Joined: Sun Nov 30, 2008 1:36 am
Posts: 12
S3sync automatically retries the uploads. This happens with S3; it's more of a TCP problem than an S3 issue, so to speak.

However, with 99 attempts the sync process makes damn sure to get every last bit of the specified data up to S3. I back up lots of DVD source backups every day and run into the same issue. I've been using it for more than a year now and have never had a data corruption problem (at least with S3; EC2 is completely useless).


Best Regards
Hareem Haque.


 Post subject:
Posted: Fri Dec 05, 2008 2:29 am
Senior Newbie

Joined: Thu Nov 20, 2008 5:39 pm
Posts: 17
ICQ: 221298635
Well, I hope so. Right now I'm looking at this:

Code:
With result 500 Internal Server Error
72 retries left, sleeping for 30 seconds


Since retries are spaced 30 seconds apart and it has apparently tried 28 times by now (100 - 72), this effectively translates to 14 minutes of S3 downtime/failure.

I suppose in one of the remaining 72 tries it will succeed...

EDIT: Actually, I just realized that wasn't on a single file, but in general. It now has 63 retries left, so overall I've had 37 errors during this upload. Hopefully it won't reach 100 before it's done (it's got gigabytes of stuff to do).
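
The retry counter described in this post appears to be shared across the whole run rather than reset for each file. A minimal Python sketch of that behaviour is below. It is not s3sync's actual code: the names RetryBudget and RetryBudgetExhausted and the injectable sleep parameter are hypothetical, and only the 100-retry pool and 30-second delay come from the log output quoted above.

```python
import time


class RetryBudgetExhausted(Exception):
    """Raised when the shared pool of retries has been used up."""


class RetryBudget:
    """A pool of retries shared by every operation in one run (hypothetical)."""

    def __init__(self, total=100, delay_seconds=30, sleep=time.sleep):
        self.remaining = total            # retries left for the whole run
        self.delay_seconds = delay_seconds
        self._sleep = sleep               # injectable so tests need not wait

    def call(self, operation):
        """Run operation(), retrying on IOError until the pool is empty."""
        while True:
            try:
                return operation()
            except IOError as exc:
                if self.remaining == 0:
                    raise RetryBudgetExhausted("no retries left") from exc
                self.remaining -= 1
                print(f"{self.remaining} retries left, "
                      f"sleeping for {self.delay_seconds} seconds")
                self._sleep(self.delay_seconds)
```

With 28 failures the pool drops from 100 to 72 and the total wait is 28 x 30 s = 840 s, which is the 14 minutes worked out above. Because the pool is shared, a long upload with many files can exhaust it even if no single file fails often.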


 Post subject:
Posted: Fri Dec 05, 2008 3:36 am

Joined: Fri Dec 05, 2008 3:33 am
Posts: 1
Have you considered Joyent's BingoDisk before?

http://www.bingodisk.com/


 Post subject: Weird
Posted: Fri Dec 05, 2008 9:48 am
Senior Newbie

Joined: Sun Nov 30, 2008 1:36 am
Posts: 12
memenode:

Are you chunking your data, or have you selected a whole drive with tons of data in it? If so, it will start to bug you. It creates a list of files to upload. Sync size has to be less than 5GB.

I have not tested this on a Linode yet. It works fine from EC2.


 Post subject:
Posted: Fri Dec 05, 2008 10:42 am
Newbie

Joined: Fri Dec 05, 2008 10:03 am
Posts: 3
Website: http://www.tarsnap.com/
Hi all, I'm the author of tarsnap (and saw this forum appearing in my web server logs).

memenode wrote:
Atourino, thanks for suggesting tarsnap. It looks interesting, but is still in beta and the site is super-minimal so not sure.. Its pricing is in line with S3 and it also uses some nonstandard tech so in that sense it's similar to S3... though with more security..


Tarsnap has some features in common with S3, such as the linear pricing model and the very very low probability that your data will ever be lost (data stored via tarsnap currently gets stored on S3 behind the scenes, in fact); but tarsnap is designed fundamentally as a backup system rather than a general-purpose storage system like S3.

In addition to the improved security, tarsnap works with a snapshot model of backups: Instead of just synchronizing the latest version of your data to S3 like s3sync does (which can cause problems if you realize that you mangled a file after the next sync happens), tarsnap allows you to store as many archives as you want (either snapshots of the same files/directories or completely different data, it doesn't matter) and uses some magic behind the scenes to remove any duplicate bits. Each archive stored on tarsnap can be deleted independently of all the others; so you can do things like creating backups every hour but deleting most of them later so that at any point you have (for example) hourly backups for the past week, daily backups for the past month, and weekly backups for the past year.
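
The retention scheme in that example (hourly for a week, daily for a month, weekly for a year) can be sketched as a small selection function. This is only an illustration in Python, not part of tarsnap and not how tarsnap decides anything: the function name archives_to_keep is hypothetical, a "month" is approximated as 30 days and a "year" as 365 days, and the caller would still have to delete the unselected archives itself.

```python
from datetime import datetime, timedelta


def archives_to_keep(timestamps, now):
    """Return the set of archive timestamps a tiered policy would retain."""
    keep = set()
    seen_days = set()    # calendar days already represented in the daily tier
    seen_weeks = set()   # (ISO year, ISO week) pairs already in the weekly tier
    for ts in sorted(timestamps, reverse=True):   # newest first
        age = now - ts
        if age <= timedelta(days=7):
            keep.add(ts)                          # past week: keep everything
        elif age <= timedelta(days=30):
            if ts.date() not in seen_days:        # newest archive of each day
                seen_days.add(ts.date())
                keep.add(ts)
        elif age <= timedelta(days=365):
            week = tuple(ts.isocalendar())[:2]
            if week not in seen_weeks:            # newest archive of each week
                seen_weeks.add(week)
                keep.add(ts)
    return keep
```

Anything older than the last tier is simply not selected. Since each archive can be deleted independently, the set difference between all archives and the returned set is the list to delete.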

I'll stop my evangelizing there for now :-). If anyone has any questions about tarsnap, feel free to post here or email me at cperciva@tarsnap.com .


 Post subject: RE: cperciva
Posted: Fri Dec 05, 2008 10:44 am
Senior Newbie

Joined: Sun Nov 30, 2008 1:36 am
Posts: 12
Hi cperciva

I reside in Canada. Is there any way that I could test the service?


 Post subject: Re: RE: cperciva
Posted: Fri Dec 05, 2008 11:01 am
Newbie

Joined: Fri Dec 05, 2008 10:03 am
Posts: 3
Website: http://www.tarsnap.com/
hareem wrote:
I reside in Canada. Is there any way that I could test the service?


Send me an email and I'll try to work something out.


 Post subject:
Posted: Fri Dec 05, 2008 11:19 am
Senior Newbie

Joined: Thu Nov 20, 2008 5:39 pm
Posts: 17
ICQ: 221298635
LatecomerX wrote:
Have you considered Joyent's BingoDisk before?

http://www.bingodisk.com/


Yep, and bookmarked them too. That also seems to be a "cloud" though since it doesn't mention rsync or SSH, but rather WebDAV. Pricing is good though.

hareem wrote:
Are you chunking your data, or have you selected a whole drive with tons of data in it? If so, it will start to bug you. It creates a list of files to upload. Sync size has to be less than 5GB.


It's in chunks and I'm only backing up select directories so no single s3sync command actually gets 5GB at once (/home comes closest at 3.5GB).

It seems to be complete and is showing 5.1GB bandwidth spent on S3 with 68,637 PUT/COPY/etc requests and costing $1.23. The only thing that's confusing me is that for storage used it says "0.002 GB-Mo" while there should be about 5GB on it..

cperciva wrote:
Hi all, I'm the author of tarsnap (and saw this forum appearing in my web server logs).


Hi, thanks for the info. Sounds good overall (better than S3, except that I like S3's charging model better since I don't risk losing access by failing to fund an account at any point, cause it funds itself).

I'm not sure though if anything cloud-style is actually better than classic methods (getting a storage box with SSH and rsync support with a fixed fee and limit, so you know you won't go over that, and when you do it's upgrade time :P ).

Thanks.


Last edited by memenode on Fri Dec 05, 2008 11:36 am, edited 2 times in total.

 Post subject:
Posted: Fri Dec 05, 2008 11:22 am
Senior Newbie

Joined: Sun Nov 30, 2008 1:36 am
Posts: 12
memenode

Your storage charge is based on a 30-day model, so if it's stored there for 30 days you get billed for 5GB.

So you would see a slight increase in storage cost each day, like $0.0015 or something.

Regards
Hareem Haque
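
The "GB-Mo" figure can be read as gigabytes multiplied by the fraction of the month they were stored. A rough Python sketch of that proration is below. The function names are hypothetical, the 30-day month follows the post above, and the price passed in is a parameter the caller must supply from the current price list, not a quoted rate.

```python
def gb_months(gigabytes, hours_stored, hours_in_month=30 * 24):
    """Prorated storage usage: GB held for a fraction of a 30-day month."""
    return gigabytes * hours_stored / hours_in_month


def storage_cost(gigabytes, hours_stored, price_per_gb_month):
    """Storage charge for that usage at an assumed per-GB-month price."""
    return gb_months(gigabytes, hours_stored) * price_per_gb_month
```

For example, gb_months(5, 24) is about 0.167, so 5GB shows up as a small fraction of a GB-month after one day and only reaches 5.0 once it has been stored for the full 30 days.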


 Post subject:
Posted: Fri Dec 05, 2008 11:36 am
Newbie

Joined: Fri Dec 05, 2008 10:03 am
Posts: 3
Website: http://www.tarsnap.com/
memenode wrote:
I like S3's charging model better since I don't risk losing access by failing to fund an account at any point, cause it funds itself


I might add an automatic funding mechanism in the future -- my PayPal-fu is rather limited, but I understand that PayPal does have some sort of mechanism for recurring payments. In the meantime, I send out emails warning people when their tarsnap account balances get low, so as long as you're signed up for tarsnap with a working email address it would be very hard to accidentally fail to fund your account when needed.

Quote:
I'm not sure though if anything cloud-style is actually better than classic methods (getting a storage box with SSH and rsync support with a fixed fee and limit, so you know you won't go over that, and when you do it's upgrade time :P ).


If you have lots of data to back up, running your own backup server might be the most cost-efficient approach -- but it has the downside that if you rent a server with a 1 TB disk, you're paying for the entire 1 TB disk even if it's only half full. Tarsnap (and S3 and other "cloud" storage) is more expensive per GB, but at least you're not paying for unused disk space. :-)
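
The trade-off between a flat-fee server and metered storage comes down to a break-even size. The Python sketch below is a simplification with hypothetical function names: it compares storage charges only and ignores bandwidth and per-request fees, and the prices are whatever the caller passes in, not actual rates for any service.

```python
def break_even_gb(flat_monthly_fee, price_per_gb_month):
    """Stored size at which metered storage costs as much as the flat fee."""
    return flat_monthly_fee / price_per_gb_month


def cheaper_option(stored_gb, flat_monthly_fee, price_per_gb_month):
    """Return "metered" or "flat" for a given amount of stored data."""
    metered = stored_gb * price_per_gb_month
    return "metered" if metered < flat_monthly_fee else "flat"
```

Below the break-even size the metered service is cheaper; above it the flat fee wins, provided the data still fits on the rented disk.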


 Post subject:
Posted: Fri Dec 05, 2008 11:49 am
Senior Newbie

Joined: Thu Nov 20, 2008 5:39 pm
Posts: 17
ICQ: 221298635
hareem wrote:
Your storage charge is based on a 30-day model, so if it's stored there for 30 days you get billed for 5GB.

So you would see a slight increase in storage cost each day, like $0.0015 or something.


Oh I see. Makes sense.

cperciva wrote:
I might add an automatic funding mechanism in the future -- my PayPal-fu is rather limited, but I understand that PayPal does have some sort of mechanism for recurring payments. In the meantime, I send out emails warning people when their tarsnap account balances get low, so as long as you're signed up for tarsnap with a working email address it would be very hard to accidentally fail to fund your account when needed.


Ah that takes care of it then. :)

cperciva wrote:
If you have lots of data to back up, running your own backup server might be the most cost-efficient approach -- but it has the downside that if you rent a server with a 1 TB disk, you're paying for the entire 1 TB disk even if it's only half full. Tarsnap (and S3 and other "cloud" storage) is more expensive per GB, but at least you're not paying for unused disk space. :)


I'm not anywhere near needing a TB. :) I got the impression that clouds have the price advantage though, so that's not quite why I'm still wary, except slightly for the lack of a fixed price; mostly it's the non-standardness of the tech used and that you have a little less control over what's happening.

For example, there are multiple tools for S3, but sometimes changes made by one are not recognized by another, and the more convenient ones are either not fully functional or proprietary (with things like 30-day trials and such).

As for less control, for example, sometimes it could happen that an "internal server error" is something that I could potentially fix myself if I had SSH access or there would maybe be less of a chance of encountering those because I always enter into the same virtual space of the same box which if solid is solid. With clouds, you don't quite know where you are so to speak.. it's kinda random. Any time you connect you're relying on a set of unknown parameters.

That's at least according to my limited understanding of it. I'm basically comparing a VPS server to an account that just uses an unknown number of servers in a cluster with space (and other resources) being dedicated from any which one of them at any point in time.

Thanks


Posted: Fri Dec 05, 2008 3:11 pm
Senior Member

Joined: Wed Mar 17, 2004 4:11 pm
Posts: 554
Website: http://www.unixtastic.com
Location: Europe
S3 - If you give your data away you can't be totally sure you can get it back and you can't be sure who else has a copy. If you use S3 use heavy encryption.

http://www.bqbackup.com - Same as above, assume third parties get copies and use heavy encryption.

I use BackupPC to back up to a machine I control. BackupPC does rsync, file pooling, and compression, so you can save hundreds of points in time in little more disk space than a single backup. It's a nice tool; take a look at http://backuppc.sourceforge.net


Posted: Sat Dec 06, 2008 4:22 pm
Senior Newbie

Joined: Sun Nov 30, 2008 1:36 am
Posts: 12
I'll stick with S3 for the time being. I have tested sync on Nirvanix. Nirvanix is expensive but it beats S3 enormously.

If you're worried about your data, you can always encrypt it prior to the sync operation. Remember that S3sync uploads the latest content that you specify, so it works best with incremental backups etc.



You can always customize the bash scripts to do whatever you want.


 Post subject:
Posted: Sat Dec 20, 2008 11:04 pm
Senior Newbie

Joined: Thu Nov 20, 2008 5:39 pm
Posts: 17
ICQ: 221298635
I think I'm gonna get a $5 a month FTP/SSH backup solution from one of the mentioned companies and delete everything from S3.

S3 has gone up to almost $3.50 pretty quickly and at this rate may reach $5 a month, nullifying the price advantage. I suppose it's just due to the specifics of my use case. I have multiple sites and am dumping their databases daily, meaning that it re-uploads them to S3 every day, thus pumping up my bandwidth. Also I suppose the requests pile up easily.

For $5 a month I get unlimited bandwidth, enough space for my current needs and no worries about the price going beyond $5 so that at this point seems to be a better way for my specific case. :)

Thanks everyone for the suggestions. I'll take another quick sweep of the available solutions before making a decision. No big hurry. At least I have some backup for the time being and I've now set syncs for once a week. ;)


 Post subject:
Posted: Sun Mar 22, 2009 3:57 pm
Newbie

Joined: Sun Mar 22, 2009 2:01 pm
Posts: 4
Website: http://eagereyes.org/
I've been using S3 via JungleDisk for over a year, and it's working great. JungleDisk provides an incremental backup mechanism, can keep previous versions of files around, and also does encryption if you want. The only caveat is that you have to create the config file on a local Linux machine or have X installed and run the junglediskmonitor X program on your node to configure it. I've found that to work reasonably well, though (and I don't have to do that very often).

The cool thing about JungleDisk (and S3 in general) is that you can easily access the data from any machine - which I've used more than once. I also have my S3 backup drive mounted on my node so I can get a file I've deleted or changed without going through the whole restore thing.

Regarding the errors you see with S3: these are normal. I'm using a bunch of Amazon Web Services, and when you look at their developer documentation, they tell you to expect errors and be prepared to retry. They have a massively parallel system that does fail on single transactions occasionally, but that overall works very well and is very reliable.

