November 20, 2010

Finding a needle in Haystack: Facebook's photo storage

A summary of the ideas behind Haystack, the system Facebook started to use for storing photos.

Facebook's total image workload:
  • 260 billion images (~20 petabytes of data)
  • 1 billion new photos (~60 terabytes) are uploaded every week
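To get a feel for these numbers, the short calculation below derives average sizes and the average upload rate from the figures quoted above. It is only a back-of-the-envelope sketch and assumes decimal units (1 PB = 10^15 bytes, 1 TB = 10^12 bytes); the variable names are illustrative.

```python
# Back-of-the-envelope numbers from the workload figures quoted above.
# Assumption: decimal units (1 PB = 10**15 bytes, 1 TB = 10**12 bytes).
PB = 10**15
TB = 10**12

total_images = 260 * 10**9
total_bytes = 20 * PB
weekly_photos = 1 * 10**9
weekly_bytes = 60 * TB

avg_image_kb = total_bytes / total_images / 1000
avg_upload_kb = weekly_bytes / weekly_photos / 1000
uploads_per_second = weekly_photos / (7 * 24 * 3600)

print(f"average stored image: ~{avg_image_kb:.0f} KB")
print(f"average weekly upload: ~{avg_upload_kb:.0f} KB per photo")
print(f"average upload rate: ~{uploads_per_second:.0f} photos/s")
```

Under these assumptions the averages come out at roughly 77 KB per stored image, 60 KB per uploaded photo, and about 1,650 uploads per second.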

Main characteristics of Facebook images:
  • read often
  • written once
  • no modification
  • rarely deleted

Traditional file systems are too slow for this workload because they need too many disk accesses per read, and an external CDN alone will not be enough in the near future as the workload grows, especially for the long tail of less popular photos. As a solution, Haystack is designed to provide:
  1. High throughput and low latency
    • keeps all metadata in main memory, so a read needs at most one disk access
  2. Fault tolerance
    • replicas are in different geographical regions
  3. Cost-effective and simple
    • compared to an NFS-based NAS appliance:
    • each usable terabyte costs ~28% less
    • each usable terabyte serves ~4x more reads per second
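The first goal above (metadata in main memory, at most one disk access per read) can be illustrated with a minimal sketch. This is not Facebook's implementation: the class name `TinyHaystack`, the file layout, and the index format are all hypothetical. The sketch only shows the general idea of appending many photos to one large file and keeping a small in-memory index of where each one lives.

```python
import os
import tempfile


class TinyHaystack:
    """Hypothetical append-only store: many photos in one file, index in RAM."""

    def __init__(self, path):
        self.path = path
        self.index = {}  # photo_id -> (offset, size); kept only in memory
        open(path, "ab").close()  # create the store file if it is missing

    def put(self, photo_id, data):
        # Photos are written once and never modified, so appending is enough.
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)
            f.write(data)
        self.index[photo_id] = (offset, len(data))

    def get(self, photo_id):
        # The lookup touches memory only; the read below is the single data
        # access. A real server would also keep the store file open.
        offset, size = self.index[photo_id]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(size)


store_path = os.path.join(tempfile.mkdtemp(), "haystack.dat")
store = TinyHaystack(store_path)
store.put(1, b"first photo bytes")
store.put(2, b"second photo")
assert store.get(2) == b"second photo"
```

Because the index entry is just an offset and a size, it is far smaller than per-file file system metadata, which is what makes holding the whole index in RAM plausible.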

Design Prior to Haystack



What was learned from the NFS-based design:
  • more than 10 disk operations were needed to read a single image
  • with smaller directories, fetching an image still takes 3 disk operations
  • filename-to-file-handle mappings are cached for likely future requests, using a new kernel function, open_by_filehandle
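The effect of caching file handles can be sketched with a small simulation. The cost of 3 operations for a cold read follows the list above (one each for directory metadata, inode, and file data); the cost of 2 for a read with a cached handle is an assumption of this sketch, not a figure from the paper, and `HandleCache` is a hypothetical name.

```python
# Simulated cost model for the NFS-based read path.
COLD_READ_OPS = 3    # directory metadata + inode + file data
CACHED_READ_OPS = 2  # assumed: inode + file data, directory lookup skipped


class HandleCache:
    """Hypothetical filename -> file handle cache that counts disk operations."""

    def __init__(self):
        self.handles = {}  # filename -> opaque handle
        self.disk_ops = 0

    def read(self, filename):
        if filename in self.handles:
            self.disk_ops += CACHED_READ_OPS
        else:
            self.handles[filename] = object()  # stand-in for a real handle
            self.disk_ops += COLD_READ_OPS


cache = HandleCache()
for name in ["popular.jpg", "popular.jpg", "popular.jpg", "rare.jpg"]:
    cache.read(name)
print(cache.disk_ops)  # 3 + 2 + 2 + 3 = 10
```

The rarely requested photo pays the full cold-read cost every time it falls out of (or never enters) the cache, which is why this optimization helps little for the long tail.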
Takeaways from the previous design:
  • Focusing only on caching has limited impact on reducing disk operations for the long tail
  • CDNs are not effective for the long tail
  • Would a GoogleFS-like system be useful?
  • The current system lacks the right RAM-to-disk ratio
Haystack Solution:
  • use XFS (an extent-based file system)
    • reduce metadata size per picture so all metadata can fit into RAM
    • store multiple photos per file
    • this gives a very good price/performance point, better than buying more NAS appliances
    • holding all metadata at its regular size in RAM would be far too expensive
  • design your own CDN (Haystack Cache)
    • uses a distributed hash table
    • if the requested photo cannot be found in the cache, it is fetched from the Haystack Store
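The cache behaviour described above can be sketched as follows. This is a simplified illustration, not the actual Haystack Cache: the class names, the `fetch_from_store` callback, and the node names are hypothetical, and plain modulo hashing stands in for whatever placement scheme the real distributed hash table uses.

```python
import hashlib


class CacheNode:
    def __init__(self, name):
        self.name = name
        self.photos = {}  # photo_id -> bytes


class HaystackCacheSketch:
    """Hypothetical cache: photo id -> node via hashing, Store as fallback."""

    def __init__(self, node_names, fetch_from_store):
        self.nodes = [CacheNode(n) for n in node_names]
        self.fetch_from_store = fetch_from_store  # hypothetical Store callback
        self.misses = 0

    def node_for(self, photo_id):
        # A stable hash of the photo id picks the responsible node.
        digest = hashlib.sha256(str(photo_id).encode()).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def get(self, photo_id):
        node = self.node_for(photo_id)
        if photo_id not in node.photos:
            # Cache miss: fetch from the Store and remember the result.
            self.misses += 1
            node.photos[photo_id] = self.fetch_from_store(photo_id)
        return node.photos[photo_id]


backing_store = {42: b"photo 42", 7: b"photo 7"}  # stand-in for the Store
cache = HaystackCacheSketch(
    ["cache-a", "cache-b", "cache-c"], backing_store.__getitem__
)
assert cache.get(42) == b"photo 42"  # miss: fetched from the Store
assert cache.get(42) == b"photo 42"  # hit: served from the cache node
assert cache.misses == 1
```

Keying on the photo id means every request for the same photo lands on the same node, so a photo is fetched from the Store at most once per node while it stays cached.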


DESIGN DETAILS
needs to be updated ..

D. Beaver, S. Kumar, H. C. Li, J. Sobel, and P. Vajgel. Finding a needle in Haystack: Facebook’s photo storage. In OSDI ’10