Sunday, May 20, 2012

What database does Facebook use?


Does it use any of the standard ones like Oracle, DB2, or SQL Server, or does it have something of its own?



Considering the type of data (text + images + videos) that they have to manage, it would be interesting to know how they deal with it.



Is this information publicly available? Any links would also be helpful.


Source: Tips4all

8 comments:

  1. It should be no surprise that a site as high-scale as Facebook uses a variety of data management technologies. Each database product has its strengths, and Facebook needs all of them.

    They have also changed their data management from time to time, as they find solutions that meet their needs.

    According to Exploring the software behind Facebook, the world’s largest site (2010/6/18):


    MySQL
    Memcached
    Haystack for photo retrieval
    Cassandra
    Hadoop and Hive
    Scribe for high-speed distributed logging

  2. I've discussed this extensively with some sysops from Facebook in the past.

    Facebook primarily uses MySQL for structured data storage. For instance, wall posts, user information, etc. are all stored in MySQL. They replicate this between their various data centers.

    For blob storage (photos, video, etc.), Facebook makes use of a custom solution that involves a CDN (fbcdn) externally and NFS internally.

    For some document storage and write-heavy applications (such as inbox search), Cassandra is used. Contrary to popular belief, Cassandra is NOT the primary database at Facebook. In fact, it isn't anywhere close to being the primary database platform; it is used for very specific scenarios where the NoSQL paradigm fits best.

    Hope this helps

    EDIT:

    I should also note that this is by no means the full extent of technologies that FB uses, but it does represent the vast majority of storage that they take advantage of.

  3. They use Apache Cassandra for some of their storage (as a document database), and make heavy use of memcached to help it scale well.
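
To illustrate why a cache such as memcached helps a database scale, here is a minimal sketch of the general cache-aside pattern. It is not Facebook's implementation: the `CacheAside` class, the `fake_db` dict, and the key format are all hypothetical, and an in-process dict stands in for a real memcached server.

```python
import time


class CacheAside:
    """Check the cache first; fall back to the backing store on a miss."""

    def __init__(self, load_from_db, ttl_seconds=60.0):
        self._load_from_db = load_from_db  # callable: key -> value
        self._ttl = ttl_seconds
        self._cache = {}   # key -> (expires_at, value); stand-in for memcached
        self.db_reads = 0  # counts how often the backing store is hit

    def get(self, key):
        entry = self._cache.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]  # cache hit: no database work
        # Cache miss or expired entry: read the store, then populate the cache.
        self.db_reads += 1
        value = self._load_from_db(key)
        self._cache[key] = (time.monotonic() + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key after a write so the next read reloads fresh data."""
        self._cache.pop(key, None)


# Hypothetical "database": a dict standing in for a MySQL or Cassandra table.
fake_db = {"user:1": {"name": "Alice"}}
profiles = CacheAside(fake_db.get)

first = profiles.get("user:1")   # miss: reads the backing store
second = profiles.get("user:1")  # hit: served from the cache
```

Repeated reads of the same key reach the backing store only once per TTL window, which is where the read scaling comes from; the cost is that writes have to invalidate or update the cached copy.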

  4. According to Wikipedia's Hadoop page and the PoweredBy page on the Hadoop site, they use Hadoop. However, the text on the Hadoop page reads:


    We use Hadoop to store copies of internal log and dimension data sources and use it as a source for reporting/analytics and machine learning.


    That makes me think that their user profiles are not stored in Hadoop.

  5. They use Cassandra and MySQL. For more about Cassandra, see http://www.facebook.com/note.php?note_id=24413138919

  6. If you are interested in what technologies Facebook uses, follow their engineering "blog".
    http://www.facebook.com/Engineering

    There is lots of good stuff in there.

  7. MySQL

    Source: http://www.datacenterknowledge.com/archives/2008/04/23/facebook-now-running-10000-web-servers/

    They may have migrated since, but I doubt it.

  8. I'm fairly sure they used to use MySQL; however, they now use some sort of NoSQL database for heavier transaction loads. The number of transactions Facebook has to handle is sometimes too much for a relational database. Relational databases must adhere to the ACID principles (atomicity, consistency, isolation, durability), and maintaining ACID at large scale is costly. NoSQL variants don't adhere to as strict a set of rules as relational databases do.
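
As a small illustration of what the ACID guarantee means in practice, the sketch below uses Python's built-in `sqlite3` module to show atomicity: either both updates in a transfer are applied or neither is. The `accounts` table, the `transfer` function, and the sample balances are hypothetical, and a single in-memory SQLite database says nothing about the cost of the same guarantee across many servers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            # Violates the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # fails, credit to bob is undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

In the failed transfer the credit to `bob` runs first and is then rolled back when the debit violates the constraint, so no partial update is ever visible. Coordinating that kind of all-or-nothing behavior across many machines is the cost the comment refers to.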
