Topic: Conversion of internal links to correct URL
lugovsa



Joined: 22 Aug 2014
Posts: 3
Location: Israil

Posted: Fri 22 Aug '14 20:20    Post subject: Conversion of internal links to correct URL

A dynamic PHP+MySQL site was converted to static HTML by mirroring it with wget. As a result, each page became an HTML file named some-page.html, but the links still point to URIs like http://sitename/some-page. The site is large (more than 10,000 pages after the conversion), and changing each link manually is a crazy job. I tried to use rewrite rules in the .htaccess file in the root directory of the site. I made many attempts, all in vain. My last 'genius' idea was this:

Quote:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.+) http://sitename/$1

RewriteCond %{REQUEST_URI} !\.(html|css|js|less|jpg|png|gif|htm|net)$
RewriteRule ^(.*)$ $1.html


Unfortunately, it doesn't work as I want it to. The most annoying thing is that it ought to be quite easy. Basically, what I need is to convert

each_filename --> http://sitename/+FILENAME+.html

I would really, really appreciate someone's help.
glsmith
Moderator


Joined: 16 Oct 2007
Posts: 2268
Location: Sun Diego, USA

Posted: Fri 22 Aug '14 21:00

You forgot to take into account that the browser is being told to go to http://sitename and not http://your-sitename, so no, I doubt any rewrite would work.

Two options:
1. Write a script, in any of a number of scripting languages, that goes through each .html file and strips out http://sitename or replaces it with http://your-sitename.

2. Probably the easiest: use mod_substitute for that site.
http://httpd.apache.org/docs/2.4/mod/mod_substitute.html

I do hope you have permission to mirror this site and are not just stealing it and maybe making it look like yours.
lugovsa



Joined: 22 Aug 2014
Posts: 3
Location: Israil

Posted: Sat 23 Aug '14 5:09

Thanks for hoping that I am not necessarily a thief, I do appreciate it. It is always a pleasure to feel that the world consists not only of over-suspicious people who trust no one, but also of some kind Samaritans ready to admit that there are some honest people. [confused]

I may not have explained it clearly enough, but sitename and my-sitename are the same domain. It is my site, my domain and my ten years of work (I mean its content). I do make use of scripts of many kinds, but I am no programmer, and this was the main reason to ask for help.

Thank you, anyway.
glsmith
Moderator


Joined: 16 Oct 2007
Posts: 2268
Location: Sun Diego, USA

Posted: Sat 23 Aug '14 10:00

Good, I've just seen it before.

Really though, have a look at mod_substitute. You won't have to change anything in the files; the server will do the work.

Something like
Code:
Substitute "s|http://sitename/|http://newsitename/|i"
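
Substitute only takes effect once the SUBSTITUTE output filter is attached to the response. A minimal sketch along the lines of the mod_substitute documentation, assuming the server-level configuration can be edited (<Location> is not allowed in .htaccess) and that mod_substitute and mod_filter are loaded; sitename and newsitename remain the thread's placeholder host names:

```apache
# Assumption: placed in httpd.conf or the site's <VirtualHost> block.
<Location "/">
    # Run HTML responses through mod_substitute before they are sent.
    AddOutputFilterByType SUBSTITUTE text/html
    # Case-insensitive replacement; the trailing slash is kept on both
    # sides so the path separator survives the substitution.
    Substitute "s|http://sitename/|http://newsitename/|i"
</Location>
```

The files on disk are left unchanged; only the HTML delivered to the browser is rewritten.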
lugovsa



Joined: 22 Aug 2014
Posts: 3
Location: Israil

Posted: Sat 23 Aug '14 10:54

It's not that.

The problem is that instead of a full URL such as 'http://sitename/pagename.html', a link contains just 'pagename' (no 'http://...' and no '.html'). So if I try to use such a substitution, the server will replace EVERYTHING. Some links are already correct, and substituting them would produce duplicates like 'http://sitenamehttp://sitename/pagename.html.html'.

I first need to check whether a URL is incorrect (doesn't start with 'http://' and doesn't end with '.html') and only then substitute.
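
For the case described here (links that are bare page names, with mirrored files named pagename.html), one possible approach is to leave the links alone and let mod_rewrite map each extension-less request onto the matching file. A minimal sketch for the .htaccess file in the site root, assuming mod_rewrite is enabled, that overrides are allowed there, and that the relative links resolve to URLs like http://sitename/pagename:

```apache
RewriteEngine On

# Skip requests that already map to a real file (pagename.html, images, CSS)
RewriteCond %{REQUEST_FILENAME} !-f
# ...or to a real directory
RewriteCond %{REQUEST_FILENAME} !-d
# Act only when a mirrored page with the same name plus ".html" exists
RewriteCond %{REQUEST_FILENAME}.html -f
# Internal rewrite: /pagename is served from pagename.html, URL unchanged
RewriteRule ^(.+)$ $1.html [L]
```

Requests that already end in .html and exist on disk fail the first condition and pass through untouched, so correct links are not doubled up.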