
Theo Schlossnagle (OmniTI) spoke about a solution for serving a massive amount of largely static images.
Why pay Akamai or NetApp lots of money for bandwidth, network infrastructure or webcache devices when you can roll your own?
Peer-based HA caching with something like Apache + mod_proxy (Reverse proxy + caching) might work well. There are some caveats - such as other people's caches, which might not respect cache directives.
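To make the reverse proxy + caching idea a bit more concrete, here is a toy sketch in Python of what the cache in front of the origin does. It is not how mod_proxy is implemented; the `CachingProxy` class, the `fetch_origin` callable and the fixed TTL are all made up for illustration, and a real cache would honour the origin's cache directives rather than one global TTL.

```python
import time
from typing import Callable, Dict, Tuple


class CachingProxy:
    """Toy stand-in for a caching reverse proxy in front of an origin server."""

    def __init__(
        self,
        fetch_origin: Callable[[str], bytes],  # hypothetical: fetches a path from the origin
        ttl_seconds: float = 300.0,            # assumed fixed lifetime for every object
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_origin = fetch_origin
        self._ttl = ttl_seconds
        self._clock = clock
        # path -> (expiry time, body)
        self._cache: Dict[str, Tuple[float, bytes]] = {}
        self.origin_hits = 0

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._cache.get(path)
        if entry is not None and entry[0] > now:
            # Fresh copy in the cache: the origin is not contacted at all.
            return entry[1]
        # Miss or expired: go back to the origin and remember the answer.
        body = self._fetch_origin(path)
        self.origin_hits += 1
        self._cache[path] = (now + self._ttl, body)
        return body
```

For largely static images almost every request after the first is served from the cache, which is why a handful of commodity boxes can absorb the load.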
The example given was a three-site HA solution. Finding the 'closest' image server was achieved by using local DNS servers, colocated with the image servers and reached via Anycast: all the DNS servers have the same IP address, and BGP takes care of finding out which is the 'closest' DNS server. DNS queries normally use UDP, a single request and a single reply, which means that it's safe to use Anycast for serving them. The image servers, however, use TCP connections, which will not work over Anycast if the internet topology/routing changes in the middle of a session and packets start arriving at a different site.
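A rough way to picture the DNS trick, as a Python sketch: BGP delivers the query to whichever DNS server is 'closest', and that server simply answers with the ordinary (unicast) address of the image server sitting next to it. The site names, addresses and route costs below are invented, `bgp_closest_site` is only a stand-in for real routing, and I'm assuming the image servers have unicast addresses since only DNS was described as Anycast.

```python
from typing import Dict

# Hypothetical three-site setup: each site has a DNS server on the shared
# anycast address plus an image server with its own unicast address.
ANYCAST_DNS_IP = "203.0.113.53"
IMAGE_SERVER_AT_SITE: Dict[str, str] = {
    "site-a": "192.0.2.10",
    "site-b": "198.51.100.20",
    "site-c": "203.0.113.30",
}


def bgp_closest_site(route_costs: Dict[str, int]) -> str:
    """Stand-in for BGP: pick the announced site with the lowest cost.

    A site that is down has withdrawn its route, so it is simply absent
    from route_costs. Ties are broken by site name to stay deterministic.
    """
    return min(route_costs, key=lambda site: (route_costs[site], site))


def resolve_image_host(route_costs: Dict[str, int]) -> str:
    """Answer a DNS query sent to ANYCAST_DNS_IP.

    The DNS server that receives the query returns the unicast address of
    its co-located image server, so the later TCP connection goes to a
    fixed address and is not affected by anycast routing changes.
    """
    site = bgp_closest_site(route_costs)
    return IMAGE_SERVER_AT_SITE[site]
```

If a site goes away, its route disappears and the next lookup lands on the next-closest site, which is where the HA part comes from.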
So if you have 3 DNS servers, each with the same IP address in 3 geographically diverse locations, 3 servers ready to serve static image content, your own AS for BGP routing and a large amount of static image content - you can use F/OSS software and commodity hardware to make a HA/LB solution that will handle an enormous load.
What about distributed reliable logging? Something like a Spread patch to syslog-ng allows logs to be written in 'real' time to multiple servers, reliably.
Blogs can benefit hugely from caching. Something like memcached might work for a read-heavy, write-light dataset. User preferences can be stored in user cookies, which makes every user's browser a storage node - that will provide all the nodes you ever need, along with all the resilience you'll ever want. If someone loses their cookie (or deletes it), you can just look up their preferences in your database and regenerate their cookie. If their cookie gets corrupted or their browser breaks, they only remove service from themselves.
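The cookie idea might look something like this in Python. This is a sketch under my own assumptions, not anything shown in the talk: the HMAC signature (to spot a corrupted cookie), the `SECRET_KEY`, the `PREFS_DB` dictionary standing in for the real database and all the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
from typing import Dict, Optional, Tuple

SECRET_KEY = b"example-secret"  # hypothetical signing key
PREFS_DB: Dict[str, dict] = {   # stand-in for the real database
    "alice": {"theme": "dark", "per_page": 20},
}


def _sign(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def encode_prefs_cookie(prefs: dict) -> str:
    """Pack preferences into a 'payload.signature' cookie value."""
    raw = json.dumps(prefs, sort_keys=True).encode()
    payload = base64.urlsafe_b64encode(raw).decode()
    return f"{payload}.{_sign(payload)}"


def decode_prefs_cookie(cookie: str) -> Optional[dict]:
    """Return the preferences, or None if the cookie is missing parts or corrupted."""
    try:
        payload, sig = cookie.rsplit(".", 1)
    except ValueError:
        return None
    if not hmac.compare_digest(sig.encode(), _sign(payload).encode()):
        return None
    try:
        return json.loads(base64.urlsafe_b64decode(payload.encode()))
    except ValueError:
        return None


def load_prefs(user_id: str, cookie: Optional[str]) -> Tuple[dict, str]:
    """Use the cookie when it is valid; otherwise rebuild it from the database."""
    if cookie:
        prefs = decode_prefs_cookie(cookie)
        if prefs is not None:
            return prefs, cookie  # common case: no database read at all
    prefs = PREFS_DB.get(user_id, {})
    return prefs, encode_prefs_cookie(prefs)
```

The common path never touches the database, and a lost or broken cookie only costs that one user a single lookup.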
Apologies for the terseness of this article, but information was coming hard and fast - and I can only typo at a certain rate ;-)
posted at: 16:39 | path: /technical
