What kind of hardware does Slashdot run on?
[Note: This writeup originally appeared as a Slashdot post. You can
see the original post, complete with comments, here.]
At the request of countless users, we're happy to finally present a
summary of the new setup over at Exodus. It's the result of over 6
months of work from a lot of people, so shout-outs to Adam, Kurt, and
Scoop, Team P: Pudge, PatG & Pater for the code, and Martin, BSD-Pat,
and Liz for getting the hardware and co-loc taken care of.
The original version of this document was written by Andover.Net
Alpha Geek Kurt Grey. The funny jokes are his. The stupid jokes are
mine.
The Backstory
We soon realized that our setup at Digital Nation was seriously
flawed. We were having great difficulty administering the machines
and making changes. But the real problem was that all the SQL
traffic was flowing over the same switch. We decided to move to
Exodus to solve these problems, and to go with a provider that would
let us scatter multiple data centers around the world when we were
ready to do so.
Meanwhile, Slashcode kicked
and screamed its way to v1.0 under the iron fists of CaptTofu (Patrick
Galbraith) and Pudge (Chris Nandor). The list of bug fixes stretches
many miles, and the world rejoiced, although Slashdot itself continued
to run the old code until we made the move.
The Co-Loc
Slashdot's new co-location site is now at Andover.Net's own (pinky
finger to the mouth) $1 million dedicated data center at the Exodus
network facility in Waltham, Mass., which has the added advantage
of being less than a 30-minute drive for most of our network admins
-- so they don't have to fly cross-country to install machines. We
have some racks sitting at Exodus. All boxes are networked
together through a Cisco 6509 with 2 MSFCs and a Cisco 3500, so we
can rearrange our internal network topology just by reconfiguring
the switch. Internet connectivity to/from the outside world all
flows through an Arrowpoint CS-800 switch (which replaced the
CS-100 that blew up last week), which acts as both a firewall and
a load balancer for the front-end Web servers. It also just so
happens that Arrowpoint shares the same office building with
Andover.Net in Acton, so whenever we need Arrowpoint tech support
we just walk upstairs and talk to the engineers. Like, say, last
week when the CS-100 blew up ;)
The Hardware
- 5 load-balanced Web servers dedicated to pages
- 3 load-balanced Web servers dedicated to images
- 1 SQL server
- 1 NFS server
All the boxes are VA Linux Systems FullOns running Debian, except
for the SQL box. Each of the FullOns has LVD SCSI with 10,000 RPM
drives, and they all have 2 Intel EtherExpress 100 LAN adapters.
The Software
Slashdot itself is finally running the latest release of Slashcode
(it was pretty amusing being out of date with our own code: for nearly
a year the code release lagged behind Slashdot, but my, how the tables
have turned).
Slashcode itself is based on Apache, mod_perl and MySQL. The MySQL
and Apache configs are still being tweaked -- part of the trick is
to keep the MaxClients setting in httpd.conf on each web server
low enough not to overwhelm the connection limits of the database,
which in turn depend on the process limits of the kernel -- all of
which can be tweaked until a state of perfect zen balance is
achieved ... this is one of the trickier parts. Run 'ab' (the
Apache benchmarking tool) with a few different settings, then tweak MySQL a
bit. Repeat. Tweak httpd a bit. Repeat. Drink coffee. Repeat until
dead. And every time you add or change hardware, you start over!
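To give you an idea of the arithmetic (the numbers below are made up
for illustration, not our actual settings): with 5 page servers, the
database has to be willing to accept MaxClients connections from
every one of them at once, plus some slack.

    # httpd.conf on each of the 5 page servers (illustrative values only):
    # no more than 60 Apache children per box, recycled periodically so
    # mod_perl memory use stays sane
    MaxClients          60
    MaxRequestsPerChild 1000

    # my.cnf on the SQL box -- has to cover 5 boxes x 60 children,
    # plus some headroom
    [mysqld]
    set-variable = max_connections=350

    # then hammer a page with apache bench, watch the numbers, tweak, repeat:
    #   ab -n 1000 -c 30 http://slashdot.org/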
The AdFu ad system has been replaced with a small Apache module
written in C for better performance, and that too will be open
sourced When It's Ready (tm). This was done to make things
consistent across all of Andover.Net (I personally prefer AdFu,
but since I'm not the one who has to read the reports and maintain
the list of ads, I don't really care what Slashdot runs).
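The ad module isn't released yet, so don't take this as its source,
but for the curious, a bare-bones Apache 1.3 content handler in C
looks roughly like the sketch below (the handler name and the banner
it spits out are invented for the example):

    #include "httpd.h"
    #include "http_config.h"
    #include "http_protocol.h"

    /* Trivial content handler: send back one hard-coded banner tag.
     * A real ad server would pick a banner from a rotation and log
     * the impression before answering. */
    static int ad_handler(request_rec *r)
    {
        r->content_type = "text/html";
        ap_send_http_header(r);
        if (r->header_only)
            return OK;
        ap_rputs("<img src=\"/banners/example.gif\" width=\"468\" height=\"60\">\n", r);
        return OK;
    }

    static const handler_rec ad_handlers[] = {
        { "ad-handler", ad_handler },
        { NULL, NULL }
    };

    module MODULE_VAR_EXPORT ad_module = {
        STANDARD_MODULE_STUFF,
        NULL,            /* module initializer */
        NULL, NULL,      /* per-directory config create/merge */
        NULL, NULL,      /* per-server config create/merge */
        NULL,            /* command table */
        ad_handlers      /* content handlers; the remaining hook
                            slots just default to NULL */
    };

You'd compile something like that with apxs and point a <Location>
at it with "SetHandler ad-handler" in httpd.conf.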
Fault tolerance was a big issue. We've started by load balancing
anything that could easily be balanced, but balancing MySQL is
harder. We're funding development efforts with the MySQL team to
add database replication and rollback capabilities to MySQL (these
improvements will of course be rolled into the normal MySQL release
as well).
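For the curious, replication as it's starting to appear in MySQL's
3.23 development releases is driven by the master's binary log plus
a few option-file settings, along these lines (a sketch with made-up
host names and credentials, not our config):

    # my.cnf on the master: give it an id and turn on the binary log
    [mysqld]
    server-id = 1
    log-bin

    # my.cnf on a replica: give it its own id and point it at the master
    [mysqld]
    server-id = 2
    master-host     = db-master.example.com
    master-user     = repl
    master-password = secret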
We're also developing some in-house software (code-named
"Odyssey") that will keep each Slashdot box synchronized with a
hot spare, so if a box suddenly dies it can automatically be
replaced by its hot spare -- kind of a RAID-for-servers solution
(imagine... a Beowulf cluster of these? rimshot). Yes, it'll also
be released as open source when it's functional.
Security Measures
The Matrix sits behind a firewalling BSD box and the Arrowpoint
load balancer. Each filters out certain kinds of attacks, which
frees up the httpd boxes to concentrate on just serving HTTP and
lets the dedicated hardware do what it does best. All
administrative access goes through a VPN (which is just another
box).
Hardware Details
- Type I (Web server)
  - VA Full On 2x2
  - Debian Linux (frozen)
  - PIII/600 MHz, 512 KB cache
  - 1 GB RAM
  - 9.1 GB LVD SCSI with hot-swap backplane
  - Intel EtherExpress Pro (built into the motherboard)
  - Intel EtherExpress 100 adapter
- Type II (kernel NFS with kernel locking)
  - VA Full On 2x2
  - Debian Linux (frozen)
  - Dual PIII/600 MHz
  - 2 GB RAM
  - (2) 9.1 GB LVD SCSI with hot-swap backplane
  - Intel EtherExpress Pro (built into the motherboard)
  - Intel EtherExpress 100 adapter
- Type III (SQL)
  - VA Research 3500
  - Red Hat Linux 6.2 (final release + tweaks)
  - Quad Xeon 550 MHz, 1 MB cache
  - 2 GB RAM
  - 6 LVD disks, 10,000 RPM (1 system disk, 5 disks for RAID 5)
  - Mylex Extreme RAID controller, 16 MB cache
  - Intel EtherExpress Pro (built into the motherboard)
  - Intel EtherExpress 100 adapter
Answered by: CmdrTaco
Last Modified: 6/13/00