Wednesday, April 25, 2012

MySQL, NoSQL: help me to choose the right one! (on a )


Ok, i've a bit of experience with mysql+php. In fact, i'm not a full time developer and that's the only stuff i really do know.



Here is what i have to do.



There is a big database, 1G of records (please note: not 1Gb of database size, 1G of records (1'000'000'000 rows)), called threads (these threads actually exist, i'm not making things harder just because of i enjoy it). Threads has only a few stuff in it, to make things faster: (int id, string hash, int replycount, int dateline (timestamp), int forumid, string title)



Query:



select * from thread where forumid = 100 and replycount > 1 order by dateline desc limit 10000, 100



Since that there are 1G of records it's quite a slow query. So i thought, let's split this 1G of records in as many tables as many forums(category) i have! That is almost perfect. Having many tables i have less record to search around and it's really faster. The query now becomes:



select * from thread_{forum_id} where replycount > 1 order by dateline desc limit 10000, 100



This is really faster with 99% of the forums (category) since that most of those have only a few of topics (100k-1M). However because there are some with about 10M of records, some query are still to slow (0.1/.2 seconds, to much for my app!, i'm already using indexes! ).



I don't know how to improve this using MySQL. Is there a way?



For this project i will use 10 Servers (12GB ram, 4x7200rpm hard disk on software raid 10, quad core)



The idea was to simply split the databases among the servers, but with the problem explained above that is still not enought.



If i install cassandra on these 10 servers (by supposing i find the time to make it works as it is supposed to) should i be suppose to have a performance boost?



What should i do? Keep working with mysql with distributed database on multiple machines or build a cassandra cluster?



Thank you, a lot!



I was asked to post what are the indexes, here they are:




mysql> show index in thread;
PRIMARY id
forumid
dateline
replycount



Select explain:




mysql> explain SELECT * FROM thread WHERE forumid = 655 AND visible = 1 AND open <> 10 ORDER BY dateline ASC LIMIT 268000, 250;
+----+-------------+--------+------+---------------+---------+---------+-------------+--------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+---------------+---------+---------+-------------+--------+-----------------------------+
| 1 | SIMPLE | thread | ref | forumid | forumid | 4 | const,const | 221575 | Using where; Using filesort |
+----+-------------+--------+------+---------------+---------+---------+-------------+--------+-----------------------------+


Source: Tips4all

No comments:

Post a Comment