Linode Community Forums

PostPosted: Wed Jan 27, 2010 6:46 am - Senior Member (Joined: Thu Nov 19, 2009; Posts: 52)
I'm using the beta backup service, but from what I've been reading it seems you shouldn't rely on it alone.

I've never had to restore from a backup, so I'd like to ask a few questions if you don't mind.


Some points:

My main server is a PHP/Apache/MySQL vBulletin server. The MySQL database is 800MB.

Questions:

Backups are typically kept on daily, weekly, and monthly schedules when done correctly, right? So if I backed up to S3, I'd be storing at least 800MB per day, for a total of 800MB x 30? That seems like a lot of space, and the cost would add up really fast.

Is Amazon S3 the ideal way to back it up?

How do I back up MySQL properly without bogging down the server during the dump (the server gets slow when I do this)? Will the SQL dump be consistent? What happens if the dump runs in the middle of someone trying to post something?

What are the alternatives to Amazon S3? Is there something cheaper, or easier to restore from?


PostPosted: Wed Jan 27, 2010 7:04 am - Senior Member (Joined: Mon Dec 07, 2009; Posts: 331)
I use GigaPros FTP hosting, and I encrypt the archives before FTPing them. It's all done automatically, daily. No issues so far (I've been using them for a few months), and they have FTP plans ranging from $2.50/mo for 1GB+2GB up to $25/mo for 20GB+400GB (space + bandwidth).

As for MySQL dumping, I don't know; I don't use MySQL. PostgreSQL, which I do use, dumps transactionally (with pg_dump), i.e., the dump is a snapshot of the database at the time it was taken.
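For what it's worth, a nightly pg_dump like the one described could be wired up as a cron entry along these lines (a sketch only; "mydb" and the backup path are placeholders):

```shell
# crontab sketch: consistent nightly pg_dump at 02:00, compressed on
# the fly. "mydb" and /var/backups are placeholders. Note that % must
# be escaped as \% inside a crontab line.
0 2 * * * pg_dump mydb | gzip > /var/backups/mydb-$(date +\%F).sql.gz
```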


PostPosted: Wed Jan 27, 2010 7:19 am - Senior Member (Joined: Thu Nov 19, 2009; Posts: 52)
Ah, hmm, I think that's more expensive than Amazon S3, isn't it?

I'm really confused about how much space I'll be needing.


PostPosted: Wed Jan 27, 2010 1:20 pm - Senior Member, Montreal, QC (Joined: Tue May 26, 2009; Posts: 1691)
There's always the DIY solution of backing up to home. If you have a few gigs of space at home, you can use rsync to send only the differences. So if only 10MB of your 800MB database changes per day, you can do daily backups that use only 10MB of bandwidth on your home connection.

The problem with S3 is that rsync can't run on the S3 end; from the linode's point of view, both the source and the destination are effectively local. That means rsync can't tell which parts of a file have changed without actually reading the whole file (unchanged files can be detected by filesize/modification time, but telling WHAT changed requires reading them).

Whereas in a local/remote scenario, you run rsync on the remote end (in this case, the home server), where it can scan its copies of the files without transferring data over the net.
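The home-pull setup described above might look like this as a cron entry on the home server (a sketch only; the user, host, port, and paths are all made up):

```shell
# crontab sketch on the HOME machine: pull nightly differences from
# the linode at 03:30. "myuser", the hostname, the ssh port, and both
# paths are placeholders.
30 3 * * * rsync -az --delete -e "ssh -p 22" myuser@mylinode.example.com:/var/www/ /backups/linode/var-www/
```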


PostPosted: Wed Jan 27, 2010 1:21 pm - Senior Member (Joined: Mon Dec 07, 2009; Posts: 331)
arachn1d wrote:
Ah, hmm I think that's more expensive than Amazon S3 isn't it?

I'm really confused at how much space I'll be needing.


Well, don't you know your current usage? For what it's worth, I'm not affiliated with GigaPros, but the S3 online calculator shows S3 costing about twice as much. Granted, it also says inbound traffic will be free as of June, and inbound is mostly what you'd have, I guess.

As for the backup plan, it depends on what kind of backups you need/want. I currently keep daily-only backups, meaning not 30 archives per month but just the last 24 hours. That's all I need, since the backup is just to prevent data loss if my node pushes up daisies for whatever reason.

So unless you need historical snapshots throughout the month, the last 24 hours plus maybe a couple of extra snapshots per week should be more than enough. Or just keep the last 5 days' worth of backups or something.
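A "keep the last 5 days" rotation can be sketched in a couple of lines of shell; the /tmp paths and file names here are invented purely for the demo:

```shell
#!/bin/sh
# Demo of a keep-the-newest-5 rotation (paths and names are made up).
mkdir -p /tmp/backups
for d in 01 02 03 04 05 06 07; do
    touch /tmp/backups/dump-2010-01-$d.sql.gz
done
# List newest first, skip the first 5, delete the rest.
ls -1t /tmp/backups/dump-*.sql.gz | tail -n +6 | xargs -r rm --
ls /tmp/backups | wc -l   # 5 files remain
```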


PostPosted: Wed Jan 27, 2010 10:12 pm - Senior Member (Joined: Thu Nov 19, 2009; Posts: 52)
Ah, that's where my confusion came from: how many total backups one would have.

So if I kept a week's worth of backups, that would be 800MB x 7, correct?

Does anyone have an answer to my MySQL concern?

What scripts do you guys use?

So Amazon isn't good because you can't rsync to it? If I did a daily backup with rsync, wouldn't each transfer be 800MB+ no matter what? It doesn't just version against the previous file...?


PostPosted: Thu Jan 28, 2010 12:01 am - Senior Member (Joined: Fri May 02, 2008; Posts: 1121)
Database dumps compress well. Unless you have large blobs in there (which you shouldn't), the compressed dump file is likely to be only 100MB or so.

Have you considered backing up the binary logs (binlogs) instead of shipping an entire database dump every time? If only a small part of the database changes during the day, this might be more space-efficient. Binary logs are also more rsync-friendly than raw database dumps, because they're append-only. The downside to binary logs is that they're tricky to verify.

Just a couple of reminders:

1) If you use the MyISAM storage engine, mysqldump will lock all your tables while the dump is in progress. If your application tries to insert or update some rows during this time, the page is likely to hang until the dump is complete.

2) If you use InnoDB tables, inserts and updates are allowed to happen even while the dump is in progress. So you should specify the --single-transaction --quick options in mysqldump to prevent getting an inconsistent snapshot.
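Putting those flags together, a consistent InnoDB dump might look like the command sketched below. The user, password, and database name are placeholders; the runnable part just demonstrates the compress-then-store pipeline shape with echo standing in for mysqldump:

```shell
#!/bin/sh
# Hedged sketch of a consistent InnoDB dump, compressed on the fly
# ("backupuser", the password, and "mydb" are placeholders):
#
#   mysqldump --single-transaction --quick -u backupuser -p'...' mydb \
#       | gzip > /var/dumps/mydb.sql.gz
#
# The same pipeline shape, with echo standing in for mysqldump:
echo "CREATE TABLE t (id INT);" | gzip > /tmp/mydb.sql.gz
gunzip -c /tmp/mydb.sql.gz   # prints the original SQL line
```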


PostPosted: Thu Jan 28, 2010 5:01 am - Newbie (Joined: Thu Dec 03, 2009; Posts: 4)
I use Webbycart's backup service: 120GB of storage with unmetered bandwidth for $15/mo.
Nightly backups with rsync.
Before the backup I dump the MySQL tables and .tgz them.

http://www.webbycart.com/backup.htm


PostPosted: Thu Jan 28, 2010 5:06 am - Senior Member, Portugal (Joined: Wed Jan 21, 2009; Posts: 126)
Hi,

I use bqbackup (rsync) and S3 (duplicity).
Nightly, I rsync the bqbackup copy down to my home Mac.


PostPosted: Thu Jan 28, 2010 5:39 am - Senior Newbie (Joined: Thu Jun 18, 2009; Posts: 11)
I use duplicity with Amazon S3: I dump the MySQL databases, then run a duplicity backup.
I use a script that makes an incremental backup every day and a full backup every month.
With 4GB of (uncompressed) data to back up and one year of backups retained (12 full plus more than 300 incrementals), my Amazon bill is $6-7 a month.

Every month I also make a full backup from my home box with rdiff-backup.
And every time I shut down, I also take a few minutes to clone the disk image, shrink it to the minimum size, and keep it as an additional backup. Now that I have two linodes (one for production and one for testing), I keep the latest image on the testing linode.
A bit paranoid, I know ;-)


PostPosted: Thu Jan 28, 2010 7:48 am - Senior Member (Joined: Thu Nov 19, 2009; Posts: 52)
nexnova wrote:
I use duplicity with amazon s3. Dump of mysql databases then duplicity backup. [...]


I like your style. Could you share your script perhaps?


PostPosted: Thu Jan 28, 2010 10:47 am - Senior Newbie (Joined: Thu Jun 18, 2009; Posts: 11)
arachn1d wrote:
I like your style. Could you share your script perhaps?


Sure!

This is the script for the DB backup (one file per database; I like it that way rather than all the databases in one file):

Code:
#!/bin/bash
# back up MySQL - every DB in its own file

MUSER="root"
MPASS="XXXXXXXX"
MDBAK="/var/dumps"
MYSQLDUMP="$(which mysqldump)"
MYSQL="$(which mysql)"

# clean old backups
rm $MDBAK/*bak > /dev/null 2>&1

# save db list
DBS="$($MYSQL -u $MUSER -p$MPASS -Bse 'show databases')"

# dump every database
for db in $DBS; do
 MFILE="$MDBAK/$db.bak"
 $MYSQLDUMP -u $MUSER -p$MPASS $db > $MFILE
 #echo "$db -> $MFILE"
done

exit 0


And this is the script for duplicity backup:

Code:
#!/bin/bash

## NEX: full bkup on Amazon S3
# NOTE: In a shared environment it is not safe to export env vars

# Export variables
export AWS_ACCESS_KEY_ID='your AWS Access Key ID'
export AWS_SECRET_ACCESS_KEY='your AWS Secret Key'
export PASSPHRASE='your passphrase'

GPG_KEY='your GPG key'

# day of the month
DDATE=`date +%d`

# full backup only on 1st of the month, otherwise incremental
if [ $DDATE = 01 ]
then
    DO_FULL=full
else
    DO_FULL=
fi

# Backup source
SOURCE=/

# Bucket backup destination
DEST=s3+http://your.real.unique.bucket.amazon

#NEX - enable to prune old backups (needs the target and --force)
#duplicity remove-older-than 1Y ${DEST} --force

duplicity ${DO_FULL} \
    --encrypt-key=${GPG_KEY} \
    --sign-key=${GPG_KEY} \
    --exclude=/root/download/** \
    --exclude=/var/www/www.toexclude.com/** \
    --exclude=/var/www/web23/user/** \
    --include=/etc \
    --include=/home \
    --include=/root \
    --include=/usr/local \
    --include=/var/www \
    --include=/var/dumps \
    --include=/var/mail \
    --exclude=/** \
    ${SOURCE} ${DEST}

# Reset env variables
unset AWS_ACCESS_KEY_ID
unset AWS_SECRET_ACCESS_KEY
unset PASSPHRASE

exit 0


Useful links:

https://help.ubuntu.com/community/DuplicityBackupHowto
http://www.randys.org/2007/11/16/how-to-automated-backups-to-amazon-s-s3-with-duplicity/
http://www.lullabot.com/blog/how_do_you_backup_your_webserver#
http://www.debian-administration.org/articles/209


PostPosted: Thu Jan 28, 2010 11:16 am - Senior Newbie (Joined: Thu Jun 18, 2009; Posts: 11)
And this is the very simple script for the backup with rdiff-backup.
The difference from duplicity: duplicity can encrypt, so it is safe to use with untrusted backup destinations (like Amazon S3), while rdiff-backup stores files without encryption, so it is suited to a trusted environment (your home PC, I hope :-) ) and is very easy to restore from.

Code:
#!/bin/bash

## NEX - full rdiff bkup
##       must be executed manually with root privileges (sudo)
##       better to create ssh account only for backup, so it can be launched unattended

# backup source (your linode)
SOURCE=root@yourlinode.com::/

# Local destination (on your home pc)
DEST=/var/mirrors/my-linode-full

# Replace 12345 with your ssh port number, or remove "-p 12345" if it is on standard 22
rdiff-backup -v5 --print-statistics \
        --remote-schema 'ssh -p 12345 -C %s rdiff-backup --server' \
        --exclude=/root/download/** \
        --exclude=/var/www/www.toexclude.com/user/** \
        --exclude=/var/www/web23/** \
        --exclude /lost+found \
        --exclude /media \
        --exclude /mnt \
        --exclude /proc \
        --exclude /sys \
        --exclude /tmp \
        ${SOURCE} ${DEST}

exit 0


PostPosted: Thu Jan 28, 2010 1:25 pm - Senior Member, Montreal, QC (Joined: Tue May 26, 2009; Posts: 1691)
As a reminder, rdiff won't save you any transfer with S3, or with an S3-backed tool like duplicity, unless you keep local snapshots to diff against. rdiff and rsync have the same limitation: they need to fully read each copy of the file at least once in order to determine what changed (in rdiff's case) or generate the set of checksums (in rsync's case).

So your backup procedure for file "foo" could be:

1) Backup time! Copy /mystuff/foo to /backups/2010-01-28/foo
2) rdiff /backups/2010-01-28/foo against /backups/2010-01-27/foo
3) Compress the diff
4) Send the diff to S3 for storage

The next day, repeat the process, diffing against the previous day. You would also want to do a periodic full backup: perhaps every week, compress the whole shebang and send it, and then send diffs for each day of that week.

In this scenario, you also only ever need to keep the most recent backup snapshot locally; each day, you just need to diff against the previous day, unless it's full-backup day.

Restoring a backup just involves taking the latest full backup and then applying the diffs sequentially until you reach the desired date. That can also be automated with scripts.
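The snapshot-then-diff cycle above can be sketched with plain diff/patch standing in for binary rdiff (all the paths and contents here are invented for the demo):

```shell
#!/bin/sh
# Snapshot-then-diff cycle, with diff/patch as a stand-in for rdiff.
set -e
mkdir -p /tmp/bk/day1 /tmp/bk/day2
printf 'line one\n' > /tmp/bk/day1/foo
printf 'line one\nline two\n' > /tmp/bk/day2/foo   # today's changed copy

# Diff today against yesterday, compress, and (in real life) ship the
# compressed diff off to S3. diff exits 1 when files differ, hence || true.
diff -u /tmp/bk/day1/foo /tmp/bk/day2/foo > /tmp/bk/foo.diff || true
gzip -f /tmp/bk/foo.diff

# Restore side: apply the diff to yesterday's copy to rebuild today's.
gunzip -f /tmp/bk/foo.diff.gz
patch -s -o /tmp/bk/foo.restored /tmp/bk/day1/foo /tmp/bk/foo.diff
cat /tmp/bk/foo.restored
```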

Of course, this is all far more complex than the old "rsync home each night and then let your home backup solution take care of things like incremental stuff for history".


PostPosted: Thu Jan 28, 2010 2:04 pm - Senior Member, Rochester, New York (Joined: Sat Aug 30, 2008; Posts: 1739)
As far as incremental diff strategy goes, what I do is something like:

Day 1: Full backup.
Day 2: Incremental against Day 1
Day 3: Incremental against Day 2
Day 4: Incremental against Day 1
Day 5: Incremental against Day 4
Day 6: Incremental against Day 1
(and so forth -- I actually do it every 0.7 days, but you get the idea)

Basically, ensure that the chain from the full backup to your most recent incremental stays reasonably short. This will make your incremental backups larger more often than not, but it will reduce the amount of work needed to restore. Also, if one component of the backup gets corrupted or deleted, you've got a better shot at not losing everything.
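One reading of the schedule above (the even/odd rule is my interpretation of the listed days, not gospel) could be sketched like this; "day" would come from date arithmetic in a real script:

```shell
#!/bin/sh
# Pick the diff base for a given day number: even days go back to the
# full backup, odd days diff against the previous day (my reading of
# the schedule above; "day" is hard-coded for the demo).
day=5
if [ $((day % 2)) -eq 0 ]; then
    base="the full backup"
else
    base="day $((day - 1))"
fi
echo "day $day: incremental against $base"   # prints: day 5: incremental against day 4
```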

Bandwidth is cheap and storage is cheap, but neither your data nor your time is. Have a backup strategy that works, is automatic, assures you that everything is up to date, and has a restore method you know how to use. And practice a restore: grab a Linode 360 for the day and restore to it. It'll cost you a buck or two, but you'll sleep better.

And if you're me, you'll find out why backing up to home really sucks for full restores :-)

EDIT: And I might as well plug my personal backup methods:
0. Linode's backup service (ideal for full restores, not to be relied upon yet)
1. BackupPC on my home server (ideal for full LAN restores and single-file restores; stores ~3 months of data with pooling across machines)
2. Keyfobs with tarballs generated by BackupPC and moved off-site monthly (ideal for full restores and sphincter-clenching disasters)
3. Experimental backups from BackupPC to S3 (ideal for full restores, somewhat more automated than #2 but slow due to upstream bandwidth constraints)

Also, most of my works-in-progress are stored on Dropbox, which is synced across all of my computers and backed up by BackupPC. I use git for revision control and a script I wrote to back up my remote IMAP accounts (gmail, live@edu, etc).

I... think of too many worst-case scenarios.

