mvuk
Joined: 21 Dec 2012 Posts: 8 Location: Germany
|
Posted: Wed 19 Feb '14 12:13 Post subject: [SOLVED] WinNT MPM question |
|
|
I am using Apache 2.4.6 (32-bit) on a Windows 2003 Server as a reverse proxy cache for a Tomcat sitting behind it. The binaries are from this website.
My monitoring tool tells me the server has a lot of hiccups (short time frames where it is not responding at all). Checking the Apache logs, I see these messages at the timestamps where the server stops responding (Timeout in httpd.conf is set to 300 seconds):
Code: |
[Tue Feb 18 11:07:00.553011 2014] [mpm_winnt:notice] [pid 15880:tid 632] AH00362: Child: Waiting 270 more seconds for 6 worker threads to finish.
[Tue Feb 18 11:07:33.309231 2014] [mpm_winnt:notice] [pid 15880:tid 632] AH00362: Child: Waiting 240 more seconds for 6 worker threads to finish.
[Tue Feb 18 11:08:06.065451 2014] [mpm_winnt:notice] [pid 15880:tid 632] AH00362: Child: Waiting 210 more seconds for 6 worker threads to finish.
[Tue Feb 18 11:08:38.821671 2014] [mpm_winnt:notice] [pid 15880:tid 632] AH00362: Child: Waiting 180 more seconds for 6 worker threads to finish.
[Tue Feb 18 11:09:11.577891 2014] [mpm_winnt:notice] [pid 15880:tid 632] AH00362: Child: Waiting 150 more seconds for 6 worker threads to finish.
[Tue Feb 18 11:09:44.334111 2014] [mpm_winnt:notice] [pid 15880:tid 632] AH00362: Child: Waiting 120 more seconds for 6 worker threads to finish.
[Tue Feb 18 11:10:17.090917 2014] [mpm_winnt:notice] [pid 15880:tid 632] AH00362: Child: Waiting 90 more seconds for 6 worker threads to finish.
[Tue Feb 18 11:10:49.849237 2014] [mpm_winnt:notice] [pid 15880:tid 632] AH00362: Child: Waiting 60 more seconds for 6 worker threads to finish.
[Tue Feb 18 11:11:22.607557 2014] [mpm_winnt:notice] [pid 15880:tid 632] AH00362: Child: Waiting 30 more seconds for 6 worker threads to finish.
[Tue Feb 18 11:11:55.365877 2014] [mpm_winnt:notice] [pid 15880:tid 632] AH00362: Child: Waiting 0 more seconds for 6 worker threads to finish.
[Tue Feb 18 11:11:55.475071 2014] [mpm_winnt:notice] [pid 15880:tid 632] AH00363: Child: Terminating 6 threads that failed to exit.
[Tue Feb 18 11:11:55.475071 2014] [mpm_winnt:notice] [pid 15880:tid 632] AH00364: Child: All worker threads have exited.
[Tue Feb 18 11:11:55.506270 2014] [mpm_winnt:notice] [pid 17332:tid 644] AH00428: Parent: child process 15880 exited with status 0 -- Restarting.
[Tue Feb 18 11:11:55.553067 2014] [mpm_winnt:notice] [pid 17332:tid 644] AH00455: Apache/2.4.6 (Win32) configured -- resuming normal operations
[Tue Feb 18 11:11:55.553067 2014] [mpm_winnt:notice] [pid 17332:tid 644] AH00456: Apache Lounge VC11 Server built: Jul 15 2013 20:13:45
[Tue Feb 18 11:11:55.553067 2014] [core:notice] [pid 17332:tid 644] AH00094: Command line: 'c:\\Apache24\\bin\\httpd.exe -d C:/Apache24 -f c:\\Apache24\\conf\\httpd.conf'
[Tue Feb 18 11:11:55.553067 2014] [mpm_winnt:notice] [pid 17332:tid 644] AH00418: Parent: Created child process 12032
[Tue Feb 18 11:11:56.333027 2014] [mpm_winnt:notice] [pid 12032:tid 632] AH00354: Child: Starting 1920 worker threads.
|
It seems that the process is killed and Apache restarted because 6 out of the 1920 threads failed to finish. While waiting for the timeout, Apache is not responding at all.
Reading about mpm_winnt at http://httpd.apache.org/docs/current/mod/mpm_winnt.html, it seems that this is expected behaviour.
So my question is basically:
Is Apache suitable for a production system on Windows at all, if 1 failed thread makes the whole web server unavailable, or am I missing something here?
Last edited by mvuk on Mon 24 Feb '14 12:43; edited 1 time in total |
|
Steffen Moderator
Joined: 15 Oct 2005 Posts: 3092 Location: Hilversum, NL, EU
|
Posted: Wed 19 Feb '14 12:33 Post subject: |
|
|
The log shows only notices during a restart, nothing special.
Wow... do you really need 1920 worker threads? They use memory and CPU, and are probably what causes the notices in your log.
For a quite busy site 250 should be enough (150 is the default). With mod_status you can tune it.
Do you have the following in httpd.conf?
AcceptFilter http none
AcceptFilter https none
EnableSendfile off
EnableMMAP off
Hiccups are mostly solved with these.
Sure, Apache on Windows is production ready; in fact, quite a few companies are switching to Apache. |
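For readers who want to try the mod_status tuning mentioned above, a minimal sketch of an httpd.conf fragment follows. The /server-status path, the Require local restriction and the value 250 are illustrative choices, not settings confirmed by anyone in this thread.
Code: |
# Assumption: mod_status ships in the modules directory of this build
LoadModule status_module modules/mod_status.so
# Show per-request details on the status page
ExtendedStatus On
<Location "/server-status">
    SetHandler server-status
    # Only allow access from the server itself
    Require local
</Location>
# Illustrative value; adjust after watching the busy/idle worker counts
ThreadsPerChild 250
|
Watching how many workers are busy at peak on the status page gives a rough basis for picking ThreadsPerChild.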
|
mvuk
Joined: 21 Dec 2012 Posts: 8 Location: Germany
|
Posted: Wed 19 Feb '14 13:48 Post subject: |
|
|
What I am wondering about is why the restart happened in the first place. It seems I left out the most important line in the log above.
Code: |
[mpm_winnt:crit] [pid 15880:tid 632] (OS 6)The handle is invalid. : AH00356: Child: WAIT_FAILED -- shutting down server
|
There is no log entry before that line for ~8 hours. This seems to happen a few times a day. I have set the log level to debug now, hoping to get more info. I guess if I can keep Apache from restarting, it should be fine.
AcceptFilter http none
AcceptFilter https none
EnableSendfile off
EnableMMAP off
These are in the conf.
I set ThreadsPerChild to 1920, since there is only one child process with the WinNT MPM. From my understanding this means that 1920 connections can be handled at the same time (correct me if I am wrong here). Since the Tomcat can take up to 10 seconds to respond (if there is no cache hit), I don't want to block the web server if there are too many hits. The server should easily be big enough for that (16 cores, 32 GB RAM, of which 3 GB are planned for Apache).
Thanks for the mod_status hint, I'll play around with it. |
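The debug logging mentioned above can be narrowed to the MPM so the error log does not fill up with output from every module. A sketch, assuming Apache 2.4's per-module LogLevel syntax and assuming the rest of the server should stay at the warn level:
Code: |
# Keep the global level at warn, but log mpm_winnt at debug
LogLevel warn mpm_winnt:debug
|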
|
Steffen Moderator
Joined: 15 Oct 2005 Posts: 3092 Location: Hilversum, NL, EU
|
Posted: Wed 19 Feb '14 14:06 Post subject: |
|
|
Do you have other things running in Apache, like Perl, PHP or other third-party stuff? |
|
mvuk
Joined: 21 Dec 2012 Posts: 8 Location: Germany
|
Posted: Wed 19 Feb '14 14:38 Post subject: |
|
|
No, it's a very basic Apache used only as a reverse proxy cache in front of a Tomcat. Here is the current config, if that helps.
xxx is the Tomcat servlet.
Mod note: moved config to http://pastebin.com/Z3YDRaYa |
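The actual configuration is at the pastebin link above. For orientation only, a generic reverse proxy cache in front of Tomcat might look roughly like this; the backend address localhost:8080, the cache directory and the module list are assumptions, not taken from the poster's config.
Code: |
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule cache_module modules/mod_cache.so
LoadModule cache_disk_module modules/mod_cache_disk.so
# Forward requests for /xxx to the Tomcat backend (hypothetical address)
ProxyPass "/xxx" "http://localhost:8080/xxx"
ProxyPassReverse "/xxx" "http://localhost:8080/xxx"
# Cache responses for /xxx on disk (hypothetical directory)
CacheRoot "C:/Apache24/cache"
CacheEnable disk "/xxx"
|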
|
mvuk
Joined: 21 Dec 2012 Posts: 8 Location: Germany
|
Posted: Thu 20 Feb '14 14:24 Post subject: |
|
|
Lowering ThreadsPerChild from 1920 to 1024 seems to solve the problem. It can be reproduced by putting the value back to 1920.
The environment is Windows Server 2008 with Apache 2.4.6. |
|
Qmpeltaty
Joined: 06 Feb 2008 Posts: 182 Location: Poland
|
Posted: Mon 24 Feb '14 14:51 Post subject: |
|
|
mvuk wrote: | Lowering ThreadsPerChild from 1920 to 1024 seems to solve the problem. It can be reproduced by putting the value back to 1920.
The environment is Windows Server 2008 with Apache 2.4.6. |
You can set ThreadsPerChild higher than 1920. This is possible by using the ThreadLimit directive together with ThreadsPerChild. |
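A sketch of what that combination could look like in httpd.conf. The value 2500 is purely illustrative; note that earlier in this thread the restarts were reproduced at 1920 and went away at 1024, so higher is not necessarily better.
Code: |
# ThreadLimit is the upper bound for ThreadsPerChild (1920 by default with mpm_winnt)
# Listed first as a precaution, so the limit is raised before ThreadsPerChild is read
ThreadLimit 2500
ThreadsPerChild 2500
|
A changed ThreadLimit only takes effect after a full stop and start of the server, not after a restart.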
|
Mg
Joined: 04 Jun 2014 Posts: 4
|
Posted: Wed 04 Jun '14 15:37 Post subject: |
|
|
I face the same problem as described here in a similar environment. It's an Apache reverse proxy.
[Fri May 23 07:53:58.344457 2014] [mpm_winnt:crit] [pid 42540:tid 380] (OS 6)The handle is invalid. : AH00356: Child: WAIT_FAILED -- shutting down server
ThreadsPerChild is 300 in my setup, so the suggested solution unfortunately doesn't apply to me.
The suggested settings are already active:
Quote: |
AcceptFilter http none
AcceptFilter https none
EnableSendfile off
EnableMMAP off
|
The problem occurred after the Apache upgrade from 2.4.4 to 2.4.9 (Heartbleed). First I used the VC10 compiled version, but the problem occurs with the VC11 version too.
For now I have reverted to 2.4.4 (with new OpenSSL libs) because the random restarts cause downtime.
Apache Version (from this site):
* Problem occurs on 2.4.9, both the VC10 and VC11 builds
* Problem does NOT occur on 2.4.4
Operating System:
* Microsoft Windows Server 2012 Standard
Typical log file fragment:
Code: |
[Fri May 23 07:53:58.344457 2014] [mpm_winnt:crit] [pid 42540:tid 380] (OS 6)The handle is invalid. : AH00356: Child: WAIT_FAILED -- shutting down server
[Fri May 23 07:54:03.757607 2014] [mpm_winnt:warn] [pid 42540:tid 4212] (OS 10038)An operation was attempted on something that is not a socket. : AH00344: accept() failed.
[Fri May 23 07:54:30.288807 2014] [mpm_winnt:notice] [pid 42540:tid 380] AH00362: Child: Waiting 270 more seconds for 5 worker threads to finish.
[Fri May 23 07:55:00.329552 2014] [mpm_winnt:notice] [pid 42540:tid 380] AH00362: Child: Waiting 240 more seconds for 1 worker threads to finish.
[Fri May 23 07:55:30.370221 2014] [mpm_winnt:notice] [pid 42540:tid 380] AH00362: Child: Waiting 210 more seconds for 1 worker threads to finish.
[Fri May 23 07:55:47.893020 2014] [mpm_winnt:notice] [pid 42540:tid 380] AH00364: Child: All worker threads have exited.
[Fri May 23 07:55:48.010131 2014] [mpm_winnt:notice] [pid 36544:tid 516] AH00428: Parent: child process 42540 exited with status 0 -- Restarting.
[Fri May 23 07:55:48.801890 2014] [mpm_winnt:notice] [pid 36544:tid 516] AH00455: Apache/2.4.9 (Win64) OpenSSL/1.0.1g configured -- resuming normal operations
[Fri May 23 07:55:48.801890 2014] [mpm_winnt:notice] [pid 36544:tid 516] AH00456: Apache Lounge VC10 Server built: Mar 17 2014 12:11:31
[Fri May 23 07:55:48.801890 2014] [core:notice] [pid 36544:tid 516] AH00094: Command line: 'C:\\Apache24\\bin\\httpd.exe -d C:/Apache24'
[Fri May 23 07:55:48.802891 2014] [mpm_winnt:notice] [pid 36544:tid 516] AH00418: Parent: Created child process 33888
[Fri May 23 07:55:50.263291 2014] [mpm_winnt:notice] [pid 33888:tid 380] AH00354: Child: Starting 300 worker threads.
[Fri May 23 11:58:09.232804 2014] [mpm_winnt:crit] [pid 33888:tid 380] (OS 6)The handle is invalid. : AH00356: Child: WAIT_FAILED -- shutting down server
[Fri May 23 11:58:10.710220 2014] [mpm_winnt:warn] [pid 33888:tid 5008] (OS 10038)An operation was attempted on something that is not a socket. : AH00344: accept() failed.
[Fri May 23 11:58:28.546422 2014] [mpm_winnt:warn] [pid 33888:tid 3320] (OS 10038)An operation was attempted on something that is not a socket. : AH00344: accept() failed.
[Fri May 23 11:58:41.169814 2014] [mpm_winnt:notice] [pid 33888:tid 380] AH00362: Child: Waiting 270 more seconds for 8 worker threads to finish.
[Fri May 23 11:59:11.208219 2014] [mpm_winnt:notice] [pid 33888:tid 380] AH00362: Child: Waiting 240 more seconds for 3 worker threads to finish.
[Fri May 23 11:59:41.245781 2014] [mpm_winnt:notice] [pid 33888:tid 380] AH00362: Child: Waiting 210 more seconds for 1 worker threads to finish.
[Fri May 23 12:00:11.277011 2014] [mpm_winnt:notice] [pid 33888:tid 380] AH00362: Child: Waiting 180 more seconds for 1 worker threads to finish.
[Fri May 23 12:00:41.305795 2014] [mpm_winnt:notice] [pid 33888:tid 380] AH00362: Child: Waiting 150 more seconds for 1 worker threads to finish.
[Fri May 23 12:01:11.335526 2014] [mpm_winnt:notice] [pid 33888:tid 380] AH00362: Child: Waiting 120 more seconds for 1 worker threads to finish.
[Fri May 23 12:01:15.840009 2014] [mpm_winnt:notice] [pid 33888:tid 380] AH00364: Child: All worker threads have exited.
[Fri May 23 12:01:15.953118 2014] [mpm_winnt:notice] [pid 36544:tid 516] AH00428: Parent: child process 33888 exited with status 0 -- Restarting.
[Fri May 23 12:01:16.820952 2014] [mpm_winnt:notice] [pid 36544:tid 516] AH00455: Apache/2.4.9 (Win64) OpenSSL/1.0.1g configured -- resuming normal operations
[Fri May 23 12:01:16.820952 2014] [mpm_winnt:notice] [pid 36544:tid 516] AH00456: Apache Lounge VC10 Server built: Mar 17 2014 12:11:31
[Fri May 23 12:01:16.820952 2014] [core:notice] [pid 36544:tid 516] AH00094: Command line: 'C:\\Apache24\\bin\\httpd.exe -d C:/Apache24'
[Fri May 23 12:01:16.820952 2014] [mpm_winnt:notice] [pid 36544:tid 516] AH00418: Parent: Created child process 24388
[Fri May 23 12:01:18.308375 2014] [mpm_winnt:notice] [pid 24388:tid 376] AH00354: Child: Starting 300 worker threads.
|
|