Topic: Apache hang due to rotatelogs problem |
|
Author |
|
rjung
Joined: 26 Aug 2015 Posts: 13
|
Posted: Thu 10 Sep '20 15:47 Post subject: Apache hang due to rotatelogs problem |
|
|
Hi there,
we experience hangs of the Apache child process when talking to rotatelogs.
In the setup there are about 20 VirtualHosts, each with one rotatelogs.exe for the error log and another one for the access log. The hangs are triggered by child process restarts due to reaching the configured MaxConnectionsPerChild limit. The parent process automatically starts a new child process as expected, but sporadically the new child process hangs when trying to send data to one of the rotatelogs processes. More precisely, the hang does not affect all of the rotatelogs processes, but usually only one of them. Which one varies, but so far we have only observed it for access log rotatelogs, not error log rotatelogs.
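For illustration only, one of the roughly 20 VirtualHosts in a setup like this might look as sketched below. The host name, paths, rotation interval, and MaxConnectionsPerChild value are assumptions, not the actual configuration.

```apache
# Hypothetical values throughout; only the overall shape matches the description above.
MaxConnectionsPerChild 100000

<VirtualHost *:80>
    ServerName vhost1.example.com

    # One rotatelogs.exe per log: rotate every 86400 seconds, -l uses local time
    ErrorLog  "|bin/rotatelogs.exe -l logs/vhost1-error.%Y-%m-%d.log 86400"
    CustomLog "|bin/rotatelogs.exe -l logs/vhost1-access.%Y-%m-%d.log 86400" combined
</VirtualHost>
```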
Once the problem starts, we lose child process threads: whenever a thread finishes a request belonging to the VirtualHost whose access log is handled by the hanging rotatelogs, that thread is lost, because it tries indefinitely to write to its rotatelogs. Over time we lose more and more threads until all are busy waiting to write to rotatelogs and the whole Apache becomes dysfunctional.
We do not know whether the rotatelogs process still exists or whether old rotatelogs processes are left behind as zombies.
I am aware of possible workarounds, like mod_log_rotate for access logs or doing nightly restarts and external rotation. Nevertheless, I would like to investigate the root cause and try to fix it. Before I try to strictly reproduce the issue and do live analysis, I wanted to check whether such a problem is known.
I did not find that type of problem in the ASF Bugzilla, nor here in a forum search.
Apache version is 2.4.43 VS 15 from Apache Lounge, platform is Windows Server 2012 R2.
Thanks for any hints,
Rainer |
|
Back to top |
|
tangent Moderator
Joined: 16 Aug 2020 Posts: 348 Location: UK
|
Posted: Thu 10 Sep '20 22:19 Post subject: |
|
|
You don't say what mpm you're using, presumably mpm_winnt on Windows Server 2012.
The ThreadsPerChild setting will be important here, especially since you've got 40 piped log connections through your VirtualHosts. The ThreadsPerChild default varies depending on the chosen mpm.
So as well as the mpm, what values are currently assigned to MaxConnectionsPerChild and ThreadsPerChild? |
|
Back to top |
|
rjung
Joined: 26 Aug 2015 Posts: 13
|
Posted: Fri 11 Sep '20 7:34 Post subject: |
|
|
MPM: mpm_winnt
ThreadsPerChild: 1200
The problem also occurred with smaller values. 1200 is not really needed, but the number was increased as a workaround. Since the threads get "lost" request by request for the VirtualHost with the hanging rotatelogs.exe, increasing the number buys us a little more time to react and restart before a full hang occurs. |
|
Back to top |
|
James Blond Moderator
Joined: 19 Jan 2006 Posts: 7371 Location: Germany, Next to Hamburg
|
|
Back to top |
|
tangent Moderator
Joined: 16 Aug 2020 Posts: 348 Location: UK
|
Posted: Fri 11 Sep '20 22:56 Post subject: |
|
|
Hmm, interesting. James' observations from Apache 2.2 days are worthy of consideration.
Also, what if any other child/proxy processes are involved here? Do you have a PHP or mod_jk back end, that could be blocking threads, or is all your VH content static?
I'd be interested to know what the process relationships are when you get blocked/hanging threads. Are the Apache child processes waiting on rotatelog processes completing, or do some of the rotatelogs processes end up being orphaned? Sysinternals Process Explorer would help you check this out.
A few other things to muse over.
1) Unless you believe there are memory leaks, I'd consider setting MaxConnectionsPerChild to 0, to prevent new child processes being spawned. You could always try this and keep an eye on the overall httpd process memory to see if it's creeping up. This approach might mean you have to restart Apache less frequently.
2) Have you tried setting AcceptFilter to none, as detailed on the MPM WinNT module documentation page?
AcceptFilter http none
AcceptFilter https none
The comments on that documentation page mention potential issues with new child processes.
3) Although it's no longer maintained, I always found cronolog more robust than rotatelogs. It might be worth a try. Cronolog 1.6.1 Win32 binary downloadable from here (base64 format) - https://apaste.info/WIez (I couldn't find a valid download link elsewhere on the web).
Decode that 101K paste at https://base64.guru/converter/decode/file, to get cronolog-1.6.1-win32.zip. |
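As a rough sketch of suggestions 1) and 3) above, the fragment below disables connection-based child recycling and pipes one VirtualHost's access log through cronolog instead of rotatelogs. The install path, host name, and log file template are assumptions for illustration.

```apache
# 1) Never recycle the child process based on connection count
MaxConnectionsPerChild 0

<VirtualHost *:80>
    ServerName vhost1.example.com

    # 3) cronolog takes a strftime-style file name template (hypothetical paths)
    CustomLog "|C:/tools/cronolog/cronolog.exe C:/Apache24/logs/vhost1-access.%Y-%m-%d.log" combined
</VirtualHost>
```

With MaxConnectionsPerChild at 0 it is worth watching httpd memory over time, as noted in point 1).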
|
Back to top |
|
James Blond Moderator
Joined: 19 Jan 2006 Posts: 7371 Location: Germany, Next to Hamburg
|
Posted: Tue 15 Sep '20 22:12 Post subject: |
|
|
Rainer, did you increase the LogLevel for more information? |
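A minimal sketch of what raising the log verbosity could look like, assuming the server-wide level is currently the default (warn). Whether debug output actually covers the piped-log writes in question is not confirmed here.

```apache
# Raise verbosity server-wide while reproducing the hang; revert afterwards,
# since debug logging can produce a lot of output on a busy server.
LogLevel debug
```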
|
Back to top |