Discussion:
spread[logd] scalability
BUSTARRET, Jean-Francois
2003-07-09 15:41:47 UTC
I am looking for a way to handle 5000+ req/s across 100+ servers. mod_log_spread looks like exactly the tool I need for this kind of task, but I am a little concerned about scalability.

I ran a quick and dirty test with 1 web server + 1 logger, and then with 2 web servers + 1 logger. The web servers and the logger are on different subnets, using multicast, with the multicast TTL changed in data_link.c. Multicast should be working: spmonitor reports 0 unicast retransmits.

With 1 web server, spreadlogd wrote about 260 lines/s, and about 370 lines/s with 2 web servers. Each web server was handling about 400 req/s.
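
To put these figures in perspective, here is a rough back-of-the-envelope calculation. It assumes one log line per request, which the test description does not state explicitly, so treat the percentages as an estimate only.

```python
# Rough estimate of how much of the offered load reached the log file.
# Assumption: every HTTP request produces exactly one log line.

def logged_fraction(lines_per_s: float, servers: int, req_per_s_each: float) -> float:
    """Fraction of offered requests that spreadlogd actually wrote."""
    offered = servers * req_per_s_each
    return lines_per_s / offered

one_server = logged_fraction(260, 1, 400)
two_servers = logged_fraction(370, 2, 400)

# Factor between the 5000 req/s target and the best measured write rate.
scale_up = 5000 / 370

print(f"1 web server:  {one_server:.0%} of requests logged")
print(f"2 web servers: {two_servers:.0%} of requests logged")
print(f"target is about {scale_up:.1f}x the measured write rate")
```

Under that assumption, the single logger was already falling behind with one web server, and the gap widened with two.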

How do Spread and spreadlogd scale? What would their limits be?

Jean-François Bustarret
eTF1 - Internet Architect
***@tf1.fr
Jay West
2003-07-09 21:08:34 UTC
I have a question and an issue with spreadlogd I can't get my head around. Perhaps someone on the list can point me in the right direction:

1) The mod_log_spread sample config lists 3 ways to handle mls setup: Basic, Virtualhost by servername, and Virtualhost by servername hash. In the section on the Hash setup, it shows two spreadlog.conf files and says you should have a separate config file and spreadlogd instance for each hash bucket. WHY? Couldn't you just put two Log {} groups in a single config file for any number of hash buckets? Just curious - and it has an impact on #2 below.

2) Assume we set mls in apache for hash, with a hash size of one. In spreadlogd, does VhostDir support using a pipe like the File directive does? What I'm really driving at is that we want to pipe the log writing through cronolog. This can be done using just the virtual host by servername (no hash) setup and piping to cronolog, no big deal (doing that already). But here's the real catch - I do NOT want to have to change the spreadlogd config file every time we add or remove a virtual host. It may shed some light to know we want the log files stored by cronolog to be something like /u1/logs/{virtualdomain}/{year}/{month}/{virtualdomain}.log. Of course I know how to make cronolog do this. The problem is the integration between spreadlogd and cronolog. I see no way to pass the host name that spreadlogd discovers through the pipe to cronolog, using either the vh hash or vh no-hash paradigm. Any thoughts?
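
For illustration, the directory layout described above could also be produced by a small filter sitting on the receiving end of a pipe. This is only a sketch under stated assumptions: it is not a feature of spreadlogd or cronolog, the names `log_path` and `demux` are hypothetical, and it assumes each piped line begins with the virtual host name followed by a space (for example, a log format starting with `%v`).

```python
import re
from datetime import datetime
from pathlib import Path

# Conservative whitelist so a hostile Host header cannot escape the log root.
_SAFE_HOST = re.compile(r"[A-Za-z0-9.-]+")

def log_path(vhost: str, when: datetime, root: str = "/u1/logs") -> Path:
    """Build root/{vhost}/{year}/{month}/{vhost}.log for one virtual host."""
    host = vhost.lower()
    if not _SAFE_HOST.fullmatch(host) or ".." in host:
        host = "_unknown"
    return Path(root) / host / f"{when:%Y}" / f"{when:%m}" / f"{host}.log"

def demux(lines, root: str = "/u1/logs", now=datetime.now) -> None:
    """Append each line to the file of the vhost named in its first field.

    Assumption: the first space-separated field is the virtual host name.
    The file is reopened for every line, which is simple but slow.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        vhost, _, rest = line.partition(" ")
        path = log_path(vhost, now(), root)
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a") as fh:
            fh.write(rest + "\n")
```

A filter like this would be driven with `demux(sys.stdin)`. New virtual hosts then get their directories created on first use, so no per-host configuration change is needed. Whether spreadlogd can feed such a pipe from VhostDir is exactly the open question here.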

Regards,

Jay West
George Schlossnagle
2003-07-10 21:31:40 UTC
Post by Jay West
I have a question and an issue with spreadlogd I can't get my head
around. Perhaps someone on the list can point me in the right
direction:

1) The mod_log_spread sample config lists 3 ways to handle mls setup.
Basic, Virtualhost by servername, and virtualhost by servername hash.
In the section on the Hash setup, it shows two spreadlog.conf files
and says you should have a separate config file and spreadlogd
instance for each hash bucket. WHY? Couldn't you just put two Log {}
groups in a single config file for any number of hash buckets? Just
curious - and it has an impact on #2 below.
Yes. The reason for multiple buckets is to support architectures with
a very large number (100k+) of files.
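
As a rough illustration of how hashing server names into buckets spreads a large number of log files across several loggers, here is a minimal sketch. The CRC32 hash and the `vhost_N` group naming are assumptions made for the example; mod_log_spread's actual hash function and group names may differ.

```python
import zlib

def bucket_for(servername: str, buckets: int) -> int:
    """Map a server name to a bucket index in the range [0, buckets)."""
    if buckets < 1:
        raise ValueError("buckets must be at least 1")
    # CRC32 is an assumption; any stable hash of the name would do.
    return zlib.crc32(servername.lower().encode("utf-8")) % buckets

def group_for(servername: str, buckets: int) -> str:
    """Hypothetical Spread group name that a given bucket's logger would join."""
    return f"vhost_{bucket_for(servername, buckets)}"

if __name__ == "__main__":
    for name in ("www.example.com", "shop.example.com", "blog.example.org"):
        print(name, "->", group_for(name, 4))
```

With a hash size of one, every server name lands in bucket 0, so a single logger receives all virtual hosts.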
Post by Jay West
 
2) Assume we set mls in apache for hash, with a hash size of one. In
spreadlogd, does the VhostDir support using a pipe like the File
directive does? What I'm really driving at is we want to pipe the log
writing through cronolog. This can be done using just the virtual host
by servername (no hash) and piping to cronolog, no big deal (doing
that already). But here's the real catch - I do NOT want to have to
change the spreadlogd config file every time we add or remove a
virtual host. It may shed some light to know we want the log files
stored by cronolog to be something like
/u1/logs/{virtualdomain}/{year}/{month}/{virtualdomain}.log. Of course
I know how to make cronolog do this. The problem is the integration
between spreadlogd and cronolog. I see no way to pass the host name
that spreadlogd discovers in the pipe to cronolog, using either the vh
hash or vh no-hash paradigm. Any thoughts?
I think it works via pipe. I don't have time to check right now, but
it should be easy to verify experimentally. If it doesn't, it's not
because it can't, which is why I think it does.

:)

George
