Linode Community Forums
 Post subject: Large Databases
PostPosted: Wed Jul 02, 2008 3:41 pm 
Senior Member

Joined: Wed Apr 11, 2007 8:23 pm
Posts: 76
Hey everyone,

I'm currently designing an online database and backend using PHP/MySQL where a client-side program uploads a large amount of data for each user. I'm rather new to MySQL programming and am wondering about the best way to go about this. If some of you play Team Fortress 2 then you will understand; otherwise, just assume each class is a different player type.

Basically I have 44 fields for every class, plus an ID and name for each, which comes to 46. With 10 classes that brings us to 460 fields, plus possibly a few more. So I ask you this: would the best way to store this data be one table that is 460+ fields wide, or should I store the data in 10 separate tables of 46 fields each? Note that when the page loads, all (or at least most) of the data will have to be retrieved.

Is it unusual for MySQL to have tables that may potentially hold 100+ entries with 460+ fields each?

If anyone has a better way to go about this, then I am willing to listen!

Thanks,
Smark


 Post subject:
PostPosted: Wed Jul 02, 2008 3:56 pm 
Senior Member

Joined: Sat Mar 24, 2007 6:09 pm
Posts: 59
Location: South Africa
Hmmm... 460 fields per record is a bit extreme. Read up on database normalisation. It's the concept of breaking a record into bits of linked information.

http://en.wikipedia.org/wiki/Database_normalization

Thus, have one table containing your list of users, and link it to a class table holding the per-class info (i.e. 10 records for each userID, one record for each class).

If you want to extract information, you use a JOIN to link the data sets together.
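The two-table layout and JOIN described above can be sketched end to end; a minimal example using SQLite in place of MySQL, with made-up table and column names:

```python
import sqlite3

# Two normalized tables instead of one 460-field-wide monster.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("""CREATE TABLE class_stats (
    user_id INTEGER REFERENCES users(id),
    class   TEXT,
    kills   INTEGER,
    deaths  INTEGER)""")
con.execute("INSERT INTO users VALUES (1, 'Smark')")
con.executemany("INSERT INTO class_stats VALUES (?, ?, ?, ?)",
                [(1, 'Scout', 34, 32), (1, 'Medic', 13, 64)])

# A JOIN links the two data sets back together per user.
rows = con.execute("""SELECT u.name, c.class, c.kills
                      FROM users u JOIN class_stats c ON c.user_id = u.id
                      ORDER BY c.kills DESC""").fetchall()
print(rows)  # [('Smark', 'Scout', 34), ('Smark', 'Medic', 13)]
```

Adding an eleventh class is just another row in class_stats, not a schema change.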

--deckert


 Post subject:
PostPosted: Wed Jul 02, 2008 3:59 pm 
Senior Member

Joined: Wed Apr 11, 2007 8:23 pm
Posts: 76
This is the type of thing I was looking for; I'm going to look into it... Anyone else have any other opinions or suggestions?

Thanks,
Smark


 Post subject:
PostPosted: Wed Jul 02, 2008 6:34 pm 
Junior Member

Joined: Fri Mar 05, 2004 12:30 am
Posts: 39
ICQ: 181450
Website: http://tkatch.com
AOL: TMHChacham
Location: Oak Park, MI
Are the 44 fields related to each other?

Are the ten sets of 46 fields the same for each class?

Can a user have more than one character?


 Post subject: Re: Large Databases
PostPosted: Wed Jul 02, 2008 8:17 pm 
Senior Member

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
Smark wrote:
Basically I have 44 fields for every class


Unlikely. You probably have 44 _attributes_ for each class; an attribute isn't a field. I don't know Team Fortress, so I'll just use a pseudo-RPG for examples.

So a simple table could be:

classname,attributename,value

With this you'd have multiple rows for each class

Fighter,Strength,10
Fighter,Speed,20
Fighter,Dexterity,15

and so on. To get all the fighter attributes you'd do "select attributename,value from details where classname='Fighter'"

Now this is where you'd start to normalise data; you'd have one table listing classname and associating it with a number and maybe other info. So

1, Fighter, "This character has two handed swords as a feat"
2, Rogue, "Strong but clumsy"
3, Elf, "Tolkien is spinning in his grave"

You'd have another table listing attribute name and a number

1, Strength
2, Speed
3, Dexterity

Now your main table would be
1,1,10
1,2,20
1,3,15
2,1,25
2,2,5
2,3,10
3,1,5
3,2,30
3,3,40

Doesn't mean anything to humans, but this is a database, and with joins against the two lookup tables we've defined 3 classes and 3 attributes for each class.

Your main table would be 440 rows long (your 10 classes with 44 attributes each).

This is a very small database :-)
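The three tables above (class lookup, attribute lookup, and the main class/attribute/value table) can be sketched like this; SQLite stands in for MySQL and the names are illustrative:

```python
import sqlite3

# Lookup tables for classes and attributes, plus the main table of numbers.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE classes (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("CREATE TABLE attributes (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("CREATE TABLE details (class_id INTEGER, attr_id INTEGER, value INTEGER)")
con.executemany("INSERT INTO classes VALUES (?, ?)",
                [(1, 'Fighter'), (2, 'Rogue'), (3, 'Elf')])
con.executemany("INSERT INTO attributes VALUES (?, ?)",
                [(1, 'Strength'), (2, 'Speed'), (3, 'Dexterity')])
con.executemany("INSERT INTO details VALUES (?, ?, ?)",
                [(1, 1, 10), (1, 2, 20), (1, 3, 15)])

# Joining the main table to both lookups turns the numbers back into names.
rows = con.execute("""SELECT a.name, d.value
                      FROM details d
                      JOIN classes c ON c.id = d.class_id
                      JOIN attributes a ON a.id = d.attr_id
                      WHERE c.name = 'Fighter'
                      ORDER BY d.attr_id""").fetchall()
print(rows)  # [('Strength', 10), ('Speed', 20), ('Dexterity', 15)]
```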

_________________
Rgds
Stephen
(Linux user since kernel version 0.11)


 Post subject:
PostPosted: Thu Jul 03, 2008 11:06 pm 
Junior Member

Joined: Fri Mar 05, 2004 12:30 am
Posts: 39
Long is usually a worse performer than wide; long is better suited to logs. If they really are 44 distinct attributes, 44 fields would probably be better.


 Post subject:
PostPosted: Sat Jul 05, 2008 8:08 am 
Senior Member

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
If you want performance, then don't use a database at all. Since we're talking about tiny datasets, store them in plain text files and load them into memory. I'd do it in perl with hashes ($character{Thief}{Agility}=30;) :-) :-)

But seriously, the problem with creating a gazillion fields is one of scalability, especially if there's a large number of attributes that could be selected from. What if character A's attributes are different to character B's? Maybe there's 20 in common, but the other 20-odd are different; now the table has to be 60+ fields wide, a lot of which are empty. It's just bad database design. Conversely the design I described is scalable, extensible, normalized, maintainable and all other good stuff :-)

(Actually, I am semi-serious about the perl hash; this is too small and too simple to use a database for; the overhead a database introduces is amazing)
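The perl-hash idea translates directly to an in-memory map in other languages; here is a rough sketch in Python, with an invented "Class:Attribute=Value" text format standing in for the plain text files:

```python
# A nested dict playing the role of %character in the perl example;
# the line format here is made up for illustration.
lines = ["Thief:Agility=30", "Thief:Stealth=25", "Fighter:Strength=10"]

character = {}
for line in lines:
    klass, rest = line.split(":")
    attr, value = rest.split("=")
    character.setdefault(klass, {})[attr] = int(value)

print(character["Thief"]["Agility"])  # 30
```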

_________________
Rgds

Stephen

(Linux user since kernel version 0.11)


 Post subject:
PostPosted: Sat Jul 05, 2008 1:52 pm 
Senior Member

Joined: Sun Nov 30, 2003 2:28 pm
Posts: 245
sweh wrote:
If you want performance, then don't use a database at all. Since we're talking about tiny datasets, store them in plain text files and load them into memory.


Which is fine until you have two programs or instances accessing and modifying the data. At that point you either convert to a real DB or come up with your own half-assed locking and sharing scheme, which will make you crazy until you give up and convert to a real DB anyway. And the performance thing is mostly irrelevant: the dataset, as described, is so small[1] that it will live in memory regardless of the implementation method. Let the OS and DB worry about caching and disk write-out; after all, a few people have worked on this problem before. The footprint and setup of something like SQLite isn't all that big if you're loading Perl anyway.

Steve

[1] Assuming that the 40+ fields aren't all multi-megabyte BLOBS, that is.

_________________
The irony is that Bill Gates claims to be making a stable operating system and Linus Torvalds claims to be trying to take over the world.
-- seen on the net


 Post subject:
PostPosted: Sat Jul 05, 2008 4:00 pm 
Senior Member

Joined: Wed Apr 11, 2007 8:23 pm
Posts: 76
Thanks for the help, everyone... After posting on a few other forums and talking with some people, it looks like my post misled everyone...

I ended up making a table something like this:

Code:
Name   ID   Class      Kills   Deaths
-------------------------------------
Smark    1   Scout      34    32
Smark    1   Medic      13    64
Smark    1   Heavy      76    12
Smark    1   Spy        24    9
Smark    1   Engineer   67    12
Smark    1   Demoman    12    53
Smark    1   Pyro       64    23
Smark    1   Soldier    12    12
Smark    1   Total      302   217
Reject   2   Scout      324   214
Reject   2   Medic      352   242
Reject   2   Heavy      894   252
Reject   2   Spy        124   25
Reject   2   Engineer   96    47
Reject   2   Demoman    121   256
Reject   2   Pyro       134   132
Reject   2   Soldier    324   231
Reject   2   Total      2369  1399


The table goes on with 10 entries per player, and it extends farther to the right with more fields; I just showed kills and deaths to give you an idea of how it works...
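One quirk worth noting in this layout: the stored "Total" rows duplicate data the database can compute. A rough sketch (SQLite standing in for MySQL, columns as in the table above) of deriving the totals with SUM/GROUP BY instead:

```python
import sqlite3

# Same columns as the table above, but with no stored "Total" row;
# SUM/GROUP BY computes it on demand, so it can never drift out of sync.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE stats (name TEXT, id INTEGER, class TEXT,
                                   kills INTEGER, deaths INTEGER)""")
con.executemany("INSERT INTO stats VALUES (?, ?, ?, ?, ?)",
                [('Smark', 1, 'Scout', 34, 32),
                 ('Smark', 1, 'Medic', 13, 64),
                 ('Smark', 1, 'Heavy', 76, 12)])

totals = con.execute("""SELECT name, SUM(kills), SUM(deaths)
                        FROM stats GROUP BY name""").fetchall()
print(totals)  # [('Smark', 123, 108)]
```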


 Post subject:
PostPosted: Tue Jul 08, 2008 7:17 am 
Junior Member

Joined: Fri Mar 05, 2004 12:30 am
Posts: 39
sweh wrote:
If you want performance, then don't use a database at all.


Or learn how to use one? I do databases all day long, so i'll have to take some issue with your statement. :)

sweh wrote:
Since we're talking about tiny datasets, store them in plain text files and load them into memory.


If he's not worried about multiple processes accessing it, that should be ok.

sweh wrote:
But seriously, the problem with creating a gazillion fields is one of scalability, especially if there's a large number of attributes that could be selected from. What if character A's attributes are different to character B's?


Which was my original question. If A's attributes are different from B's, it would probably be a good idea to have a separate TABLE for each one, with a one-to-one relationship to the main TABLE. Then a VIEW could join them all together, which is especially helpful for the common fields.
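That per-class-table-plus-VIEW idea might look roughly like this; a sketch using SQLite, with hypothetical table and column names:

```python
import sqlite3

# One row per character, one-to-one side tables per class, and a VIEW
# that joins them back into a single wide record.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE characters (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("CREATE TABLE fighter_stats (char_id INTEGER UNIQUE, strength INTEGER)")
con.execute("CREATE TABLE rogue_stats (char_id INTEGER UNIQUE, stealth INTEGER)")
con.execute("""CREATE VIEW character_sheet AS
               SELECT c.id, c.name, f.strength, r.stealth
               FROM characters c
               LEFT JOIN fighter_stats f ON f.char_id = c.id
               LEFT JOIN rogue_stats r ON r.char_id = c.id""")
con.execute("INSERT INTO characters VALUES (1, 'Conan')")
con.execute("INSERT INTO fighter_stats VALUES (1, 18)")

# Classes the character doesn't have simply come back NULL.
row = con.execute("SELECT name, strength, stealth FROM character_sheet").fetchone()
print(row)  # ('Conan', 18, None)
```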

sweh wrote:
Conversely the design I described is scalable, extensible, normalized, maintainable and all other good stuff :-)


scalable - Not really. In practice, such a TABLE would probably kill performance, especially if any sort of aggregation was required.

extensible - If, and only if, only one value was required per entry.

normalized - Absolutely not. In this TABLE all values must be of the same data type.

maintainable - Let me guess, you're a coder who has never touched a database before, and figures as long as the statement can be written, the database is maintainable?

sweh wrote:
(Actually, I am semi-serious about the perl hash; this is too small and too simple to use a database for; the overhead a database introduces is amazing)


A database introduces overhead? I think you mean the connection adds it to the code. A DB connection adds no more to code than any library's API does.

Sweh, i don't mean to knock you here. It just sounds as if you don't usually touch databases too often, so even if the logic is correct, as far as databases are concerned, a long TABLE is definitely not the way to go.

Now, the OP wants to use MySQL, so i think all he wants is quick data access anyway and not a real database. But that is another story.


 Post subject:
PostPosted: Tue Jul 08, 2008 8:47 am 
Senior Member

Joined: Tue Apr 13, 2004 6:54 pm
Posts: 833
"Sweh, i don't mean to knock you here. It just sounds as if you don't usually touch databases too often, so even if the logic is correct, as far as databases are concerned, a long TABLE is definitely not the way to go. "

Yeah well. I've only got 12 years of commercial experience, mostly with Oracle (I used to DBA Oracle 7.1.6 and Oracle 8.1.3), recently with postgres. Complicated SQL still causes my brain to leak out of my ears, though, so I tend to do most logic in perl (which helps with portability as well). My current application consists of 50-odd tables, the longest of which is a couple of million rows long by 30 fields wide.

Your objections:

scalable - I was talking about scalability in terms of the number of attributes; a 460-field-wide table is _bad_.

extensible - It's extensible because you can easily add more attributes without needing to modify the schema (you can even have multiple rows per attribute, though that's not required in this problem).

normalized - Absolutely YES it is ("Name" and "Class" should both be lookups into secondary tables; I'm not sure you know what a normalized dataset actually is).

maintainable - Schema changes are _bad_ when it comes to maintainability; my design didn't require any.

But, at heart, I'm a unix geek and for data this small I'd still do it with text files :-) (locking is not an issue; it's a trivial task; done this too many times; saving is already atomic and the solution automatically provides read consistency)

But *shrug* each to their own. There's more than one way to skin a cat.

_________________
Rgds

Stephen

(Linux user since kernel version 0.11)


 Post subject:
PostPosted: Thu Jul 10, 2008 10:49 pm 
Junior Member

Joined: Fri Mar 05, 2004 12:30 am
Posts: 39
I disagree with you, but to each their own. I guess we'll have to leave it at that.


 Post subject:
PostPosted: Mon Jan 05, 2009 8:06 pm 
Newbie

Joined: Wed Mar 12, 2008 3:45 pm
Posts: 4
Location: Scotland
Smark wrote:
Thanks for the help, everyone... After posting on a few other forums and talking with some people, it looks like my post misled everyone...

I ended up making a table something like this:

Code:
Name   ID   Class      Kills   Deaths
-------------------------------------
Smark    1   Scout      34    32
Smark    1   Medic      13    64
Smark    1   Heavy      76    12
Smark    1   Spy        24    9
Smark    1   Engineer   67    12
Smark    1   Demoman    12    53
Smark    1   Pyro       64    23
Smark    1   Soldier    12    12
Smark    1   Total      302   217
Reject   2   Scout      324   214
Reject   2   Medic      352   242
Reject   2   Heavy      894   252
Reject   2   Spy        124   25
Reject   2   Engineer   96    47
Reject   2   Demoman    121   256
Reject   2   Pyro       134   132
Reject   2   Soldier    324   231
Reject   2   Total      2369  1399


The table goes on with 10 entries per player, and it extends farther to the right with more fields; I just showed kills and deaths to give you an idea of how it works...


I think you could still do with some work on that design: there is repetition, and repetition has repercussions down the line. For example, if a player changes their name you have to update every row that contains it.

I would start with something like this.

One table for player info:
id
name

Another table to store the kills info:
id
owner_id
class
kills
deaths

owner_id is a foreign key referencing id in the player table.
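The payoff of this split is that a rename touches exactly one row. A rough sketch of the two tables above (SQLite standing in for MySQL):

```python
import sqlite3

# player holds the name once; kills references it via owner_id,
# so renaming a player is a single-row UPDATE.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("""CREATE TABLE kills (id INTEGER PRIMARY KEY,
               owner_id INTEGER REFERENCES player(id),
               class TEXT, kills INTEGER, deaths INTEGER)""")
con.execute("INSERT INTO player VALUES (1, 'Smark')")
con.executemany("INSERT INTO kills(owner_id, class, kills, deaths) VALUES (?, ?, ?, ?)",
                [(1, 'Scout', 34, 32), (1, 'Medic', 13, 64)])

con.execute("UPDATE player SET name = 'Reject' WHERE id = 1")  # one row touched
rows = con.execute("""SELECT p.name, k.class, k.kills
                      FROM kills k JOIN player p ON p.id = k.owner_id
                      ORDER BY k.id""").fetchall()
print(rows)  # [('Reject', 'Scout', 34), ('Reject', 'Medic', 13)]
```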


Powered by phpBB® Forum Software © phpBB Group