Topic: Serving large files that must be changed while being served
Author: marekful
Joined: 22 Nov 2012 Posts: 2 Location: UK, London
Posted: Fri 23 Nov '12 15:23 Post subject: Serving large files that must be changed while being served
Hi,
I'd like to ask for your advice to help me understand how Apache manages memory while serving a large file. For the sake of this example, "large" means anything over 2 MB, though the files may be significantly larger. Obviously, this most probably depends on many factors, such as the underlying operating system, so let's say we're talking about Apache/2.2.22 on Ubuntu SMP x86_64 with kernel version 3.2.0-25.
In particular, I'd like to understand what happens when the file being served is overwritten or otherwise changed on the file system before Apache finishes serving it.
I'm actually seeking advice on three scenarios:
a) Apache is solely responsible for serving the file (not involving any CGI).
b) PHP (5.3.10) is involved and the file is passed to Apache using PHP's readfile() function.
c) PHP reads the entire file into a variable using file_get_contents() and then prints this into the output buffer and flushes it.
I understand that b) depends heavily on how readfile() works internally, and I only really expect advice on a), but I welcome any useful input on b) and c) too.
My understanding is that c) is absolutely "safe" against the file being overwritten, since PHP holds its own in-memory copy of the file, so changes on the file system don't affect that copy. However, c) is inefficient for large files, especially if many requests for the same file are expected.
In the case of a), what is the size limit under which Apache will store the file in memory and serve it from there? Is it configurable?
---
I'm trying to find the best and safest way to handle the situation where a web application depends on serving large files, those files must sometimes be changed while traffic is high, and ideally the web server should stay up continuously.
One way to reduce the time it takes for the operating system to actually overwrite the file is to serve it through a symlink: write the new version of the file to a different location on the file system and then just switch the symlink to this new location. In this case no actual overwriting happens, but the content and possibly the size of the file still change from Apache's viewpoint after the symlink is switched.
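As a rough sketch of the symlink switch described above, assuming a POSIX file system such as the Ubuntu setup mentioned earlier: the new symlink is created under a temporary name and then renamed over the old one, so the switch is a single rename step. The helper name `switch_symlink`, the `demo` function and the file names are hypothetical, and a plain Python reader stands in for an in-flight request; nothing here has been checked against Apache itself.

```python
import os
import tempfile


def switch_symlink(link_path: str, new_target: str) -> None:
    """Repoint link_path at new_target in a single rename step (POSIX)."""
    link_dir = os.path.dirname(os.path.abspath(link_path))
    # Build the replacement symlink under a temporary name in the same
    # directory, so the final rename stays on one file system.
    tmp_link = os.path.join(link_dir, ".tmp-link-%d" % os.getpid())
    os.symlink(new_target, tmp_link)
    try:
        # rename() replaces the old symlink atomically: a new open() sees
        # either the old target or the new one, never a missing link.
        os.replace(tmp_link, link_path)
    except OSError:
        os.unlink(tmp_link)
        raise


def demo() -> tuple:
    """Return (bytes seen by an in-flight reader, bytes seen by a new reader)."""
    with tempfile.TemporaryDirectory() as root:
        old_file = os.path.join(root, "big-v1.bin")  # hypothetical names
        new_file = os.path.join(root, "big-v2.bin")
        link = os.path.join(root, "big.bin")

        with open(old_file, "wb") as f:
            f.write(b"A" * 4096)
        os.symlink(old_file, link)

        # An "in-flight request": opened before the switch, half read.
        in_flight = open(link, "rb")
        first_half = in_flight.read(2048)

        # Write the new version elsewhere, then switch the symlink.
        with open(new_file, "wb") as f:
            f.write(b"B" * 1024)
        switch_symlink(link, new_file)

        # The open handle still refers to the old file, not to the link.
        second_half = in_flight.read()
        in_flight.close()
        with open(link, "rb") as f:
            fresh = f.read()
        return first_half + second_half, fresh


if __name__ == "__main__":
    old_view, new_view = demo()
    print(old_view == b"A" * 4096)  # in-flight reader got the old version only
    print(new_view == b"B" * 1024)  # new reader got the new version only
```

In this sketch the reader that opened the file before the switch finishes with the old version, while a reader that opens the link afterwards gets the new one. Whether a given Apache configuration behaves the same way depends on it holding the file open for the whole response, which is an assumption here.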
---
I think there's a way to handle the issue with the following technique:
1) Use an environment variable to tell Apache where the file is.
2) Write the new version of the file to a different location.
3) Gracefully restart Apache.
4) Change the value of the environment variable to point to the new location.
In this case, requests in progress at the time of the graceful restart would finish using the old copy of the file, while new requests would be served the new copy. Obviously, what I want to avoid is a response being served partly from the old and partly from the new version of the file.
This is just a theory and I haven't implemented it yet. Also, I may be wrong in my assumptions, so please correct me if so. The drawbacks I can see with this technique are that it requires two copies of each large file and that I have to find a way to know when the "old" copies can be removed.
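On the question of when the "old" copies can be removed, here is a minimal sketch, again assuming a POSIX file system, of what happens to a reader that already has a file open when that file is deleted. The function name `read_after_unlink` and the file name are hypothetical, and the Python reader only stands in for an in-flight request; whether Apache 2.2 keeps its descriptor open in the same way is an assumption this sketch does not verify.

```python
import os
import tempfile


def read_after_unlink() -> bytes:
    """Open a file, delete it, then keep reading from the open handle."""
    with tempfile.TemporaryDirectory() as root:
        old_copy = os.path.join(root, "big-v1.bin")  # hypothetical name
        with open(old_copy, "wb") as f:
            f.write(b"old" * 1000)

        reader = open(old_copy, "rb")  # stands in for an in-flight request
        head = reader.read(300)

        os.unlink(old_copy)  # remove the "old" copy right away
        assert not os.path.exists(old_copy)

        # On POSIX the data stays readable until the last handle is closed.
        tail = reader.read()
        reader.close()
        return head + tail


if __name__ == "__main__":
    print(read_after_unlink() == b"old" * 1000)
```

If the server behaves like this reader, the old copy could be unlinked as soon as the switch is done, and the disk space would be released once the last in-flight request finishes. On systems without these semantics (for example Windows), the unlink call would typically fail while the file is open.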
---
I realize that this post is a bit "messy", but at the moment this is the best description of the issue I can come up with. English is only my second language, so sorry for that.
Thank you for your advice,
Marcell