Danll
Joined: 02 Aug 2013 Posts: 49 Location: USA, Houston
|
Posted: Fri 05 Jun '15 15:56 Post subject: flood of 302s after redirect |
|
|
I serve large professional documents, and sometimes links to them end up on social media. No big deal, but I think people clicking on them from those social media sites don't have a clue. They think they're being directed to a small page, when in fact they are downloading megabytes of PDF -- myfile.pdf. So what I've started to do is redirect requests from social media to an archive page, where they can see specifically what document they are trying to get, and recognize its size before they ask for it. No problem, right? I just do
Code: | # DENY facebook.com ACCESS
RewriteEngine on
RewriteCond %{HTTP_REFERER} facebook\.com [NC]
RewriteRule ^(.*)$ http://mysite/archive.htm |
So in this case, any GET request from Facebook will get sent to my archive page. Well, what I see in my logs is this
Code: | 12.34.56.78- - [04/Jun/2015:21:08:56 -0500] "GET myfile.pdf HTTP/1.1" 302 233 "http://l.facebook.com/blahblah"Useragent"
12.34.56.78- - [04/Jun/2015:21:08:56 -0500] "GET archive.htm HTTP/1.1" 302 233 "http://l.facebook.com/blahblah"Useragent"
12.34.56.78- - [04/Jun/2015:21:08:56 -0500] "GET archive.htm HTTP/1.1" 302 233 "http://l.facebook.com/blahblah"Useragent"
12.34.56.78- - [04/Jun/2015:21:08:56 -0500] "GET archive.htm HTTP/1.1" 302 233 "http://l.facebook.com/blahblah"Useragent"
12.34.56.78- - [04/Jun/2015:21:08:56 -0500] "GET archive.htm HTTP/1.1" 302 233 "http://l.facebook.com/blahblah"Useragent"
12.34.56.78- - [04/Jun/2015:21:08:56 -0500] "GET archive.htm HTTP/1.1" 302 233 "http://l.facebook.com/blahblah"Useragent"
12.34.56.78- - [04/Jun/2015:21:08:56 -0500] "GET archive.htm HTTP/1.1" 302 233 "http://l.facebook.com/blahblah"Useragent"
12.34.56.78- - [04/Jun/2015:21:08:56 -0500] "GET archive.htm HTTP/1.1" 302 233 "http://l.facebook.com/blahblah"Useragent"
12.34.56.78- - [04/Jun/2015:21:08:56 -0500] "GET archive.htm HTTP/1.1" 302 233 "http://l.facebook.com/blahblah"Useragent"
12.34.56.78- - [04/Jun/2015:21:08:57 -0500] "GET archive.htm HTTP/1.1" 302 233 "http://l.facebook.com/blahblah"Useragent"
12.34.56.78- - [04/Jun/2015:21:08:57 -0500] "GET archive.htm HTTP/1.1" 302 233 "http://l.facebook.com/blahblah"Useragent"
12.34.56.78- - [04/Jun/2015:21:08:57 -0500] "GET archive.htm HTTP/1.1" 302 233 "http://l.facebook.com/blahblah"Useragent"
12.34.56.78- - [04/Jun/2015:21:08:57 -0500] "GET archive.htm HTTP/1.1" 302 233 "http://l.facebook.com/blahblah"Useragent"
...
... |
Duh, whaaaat? This happens every time (though with a somewhat variable number of 302s). I expect ONE 302, not fifty.
Should I have something else for my RewriteRule? What causes all those 302s?? This happens for Twitter and GooglePlus as well. Sure makes a mess of my logs.
Does this have something to do with asking for a pdf file and getting sent to an html file? |
|
|
admin Site Admin
Joined: 15 Oct 2005 Posts: 692
|
Posted: Fri 05 Jun '15 16:19 Post subject: |
|
|
What happens when you change the line to:
RewriteRule (.*) http://mysite/archive.htm [R=301,L]
? |
|
|
Danll
Joined: 02 Aug 2013 Posts: 49 Location: USA, Houston
|
Posted: Fri 05 Jun '15 16:46 Post subject: |
|
|
No change. Still a flood of 302s. |
|
|
admin Site Admin
Joined: 15 Oct 2005 Posts: 692
|
Posted: Fri 05 Jun '15 17:08 Post subject: |
|
|
Did you restart Apache? |
|
|
Danll
Joined: 02 Aug 2013 Posts: 49 Location: USA, Houston
|
Posted: Fri 05 Jun '15 17:11 Post subject: |
|
|
Well, this is just an .htaccess file, so no restart of Apache should be required. But I'll try that. |
|
|
Danll
Joined: 02 Aug 2013 Posts: 49 Location: USA, Houston
|
Posted: Fri 05 Jun '15 17:25 Post subject: |
|
|
No difference after Apache restart.
I should add that I routinely do redirects, from one html file to another (on the basis of user agent or referral), and never see this kind of thing. |
|
|
Danll
Joined: 02 Aug 2013 Posts: 49 Location: USA, Houston
|
Posted: Fri 05 Jun '15 17:40 Post subject: |
|
|
Sigh. I tested the presumption that this was because the original request was for a pdf file, and I was redirecting to an html file. But I changed it, so the redirect was to another pdf file, and the same thing happens.
This is nuts.
Is it because I'm trying to redirect a request for a pdf file? |
|
|
glsmith Moderator
Joined: 16 Oct 2007 Posts: 2268 Location: Sun Diego, USA
|
Posted: Fri 05 Jun '15 20:00 Post subject: |
|
|
Yes, it is effen nuts until you think through your rewrite carefully. Remember that when the user is redirected, it's an entirely new request, and it has to pass through the rewrite again.
RewriteEngine on
RewriteCond %{HTTP_REFERER} facebook\.com [NC]
RewriteRule ^(.*)$ http://mysite/archive.htm
Someone coming from a link on Facebook tries to access the pdf and is redirected to archive.htm. That request is then redirected to archive.htm, which is redirected to archive.htm again, and on and on. The referrer stays the same on each redirected request (your log shows l.facebook.com on every line), so the condition keeps matching. You are sticking them in a never-ending loop, and at some point the browser just gives up. You can clearly (now at least) see that in your log snippet: only the first request is for the actual pdf file, then the dog chasing its tail starts.
It's a bad rewrite that is the problem, not the fact that it's originally a pdf file. Nothing in the example you have shown indicates that you are targeting just that pdf file. Quite the opposite: (.*) matches absolutely anything and everything, including the kitchen sink (and archive.htm), so the only condition is that the referrer contains facebook.com.
All requests with a referrer of facebook.com no matter what the URI are going to be redirected, and then redirected again, and again.
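Try Code: | <Files "myfile.pdf">
# Note: the mod_rewrite docs describe rewrite rules inside <Files> as unsupported, so the second form below is the safer one
RewriteEngine on
RewriteCond %{HTTP_REFERER} facebook\.com [NC]
RewriteRule ^(.*)$ http://mysite/archive.htm
</Files> | or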
Code: | RewriteEngine on
RewriteCond %{HTTP_REFERER} facebook\.com [NC]
RewriteRule ^myfile\.pdf$ http://mysite/archive.htm |
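If the real goal is to keep visitors from social sites off the large PDFs in general, here is a variation on the second form, sketched under some assumptions: the twitter.com and plus.google.com hostnames are guesses at what shows up in the Referer header (check your logs for the actual values), and the explicit [R=302,L] flags are added for clarity rather than taken from the thread.

```apache
RewriteEngine on
# Referer contains any of the listed hosts
# (hostnames other than facebook.com are assumptions, not taken from the logs above)
RewriteCond %{HTTP_REFERER} (facebook\.com|twitter\.com|plus\.google\.com) [NC]
# Only requests ending in .pdf are redirected; archive.htm does not match this pattern
RewriteRule \.pdf$ http://mysite/archive.htm [R=302,L]
```

Because the pattern no longer matches archive.htm, the redirected request falls through the rule and is served normally.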
|
|
|
Doug22
Joined: 02 Jun 2013 Posts: 57 Location: Houston TX
|
Posted: Fri 05 Jun '15 21:47 Post subject: |
|
|
I too had this problem.
But I think what the OP was trying to do (like me) was to send someone from Facebook to his archive page no matter what file they were asking for. That's what I was trying to do.
How do you do that? |
|
|
Danll
Joined: 02 Aug 2013 Posts: 49 Location: USA, Houston
|
Posted: Fri 05 Jun '15 22:00 Post subject: |
|
|
Yep, that's exactly right. No matter what they ask for (the kitchen sink), I want to send them (say, if they're coming from Facebook) to my archive.htm page. There ought to be a way to do that.
It's interesting that the rewrite is circular. I would have thought that once it got where it was told to go, it would escape the rewrite. |
|
|
glsmith Moderator
Joined: 16 Oct 2007 Posts: 2268 Location: Sun Diego, USA
|
Posted: Fri 05 Jun '15 22:33 Post subject: |
|
|
I admit it surprised me a little too, until I thought about what the rewrite is being told to do. Without following my own advice and reading the docs some more, the best I can figure is that it's the "http://" that's forcing the redirect.
You can possibly use a condition to ignore archive.htm, or it might be as simple as
RewriteRule ^(.*)$ archive.htm
You'll have to try some things, be creative. Rewriting is an art form. Hopefully you do not find too many ways it doesn't work. |
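A minimal sketch of the "condition to ignore archive.htm" idea, for anyone who wants the catch-all behavior. It assumes archive.htm sits at the site root and that the rules live in the root .htaccess; the explicit [R=302,L] flags are an addition for clarity, not something tested in this thread.

```apache
RewriteEngine on
# Only act on requests whose Referer mentions facebook.com
RewriteCond %{HTTP_REFERER} facebook\.com [NC]
# Skip the redirect target itself; otherwise the redirected request matches again
# (assumes archive.htm is at the site root)
RewriteCond %{REQUEST_URI} !^/archive\.htm$
# Everything else is sent to the archive page with an explicit external redirect
RewriteRule ^ http://mysite/archive.htm [R=302,L]
```

If archive.htm pulls in images or stylesheets, those requests carry archive.htm as the referrer rather than facebook.com, so they should pass through untouched.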
|
|