Looking for something to occupy some time, I thought about how I might write a link maker. Here’s one attempt in JavaScript:
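function rep(url, remove = false) {
  const prefix = "https://mudcat.org";
  // Input without a scheme is resolved relative to prefix.
  const u = new URL(url, prefix);
  let pathpart = u.pathname;
  let hostpart;
  // Match mudcat.org itself or any subdomain (www., loki., ...).
  if (/(^|\.)mudcat\.org$/i.test(u.hostname)) {
    // A scheme-less "mudcat.org/..." input ends up inside the path.
    const mc = pathpart.indexOf("mudcat.org");
    if (mc >= 0 && remove) {
      pathpart = pathpart.slice(mc + "mudcat.org".length);
    }
    hostpart = prefix;
  } else {
    hostpart = u.protocol + "//" + u.hostname;
  }
  const link = hostpart + pathpart + u.search;
  return '<a href="' + link + '">' + link + '</a>(input url: ' + url + ')';
}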
And some test URLs, with the output for each:
<a href="https://www.google.co.uk/search?q=mudcat">https://www.google.co.uk/search?q=mudcat</a>(input url: https://www.google.co.uk/search?q=mudcat)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: /thread.cfm?threadid=172747)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: http://mudcat.org/thread.cfm?threadid=172747)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: http://www.mudcat.org/thread.cfm?threadid=172747)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: https://loki.mudcat.org/thread.cfm?threadid=172747)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: https://mudcat.org/thread.cfm?threadid=172747)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: https://www.mudcat.org/thread.cfm?threadid=172747)
(remove=false)
<a href="https://mudcat.org/mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/mudcat.org/thread.cfm?threadid=172747</a>(input url: mudcat.org/thread.cfm?threadid=172747)
<a href="https://mudcat.org/www.mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/www.mudcat.org/thread.cfm?threadid=172747</a>(input url: www.mudcat.org/thread.cfm?threadid=172747)
(remove=true)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: mudcat.org/thread.cfm?threadid=172747)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: www.mudcat.org/thread.cfm?threadid=172747)
Since I guess it represents "Mudcat best practice" these days, links to Mudcat pages are always returned starting with "https://mudcat.org", regardless of whether the input uses http or https or includes a subdomain (e.g. www.).
The last four examples are a bit of a puzzle to me: the ones where mudcat.org is supplied without http(s)://. By my current thinking, without the protocol the link is relative, so e.g. www.mudcat.org is part of the path and a result like https://mudcat.org/www.mudcat.org/thread.cfm?threadid=172747 is correct. On the other hand, I think this behaviour has been considered a bug. I’ve left the question open in this example: calling the function with remove set to true removes the apparent duplication.
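To illustrate the relative-link reasoning, here is a small sketch of how the standard URL constructor resolves the same address with and without a scheme. The describe helper and the variable names are just for this demonstration and are not part of rep; the base is the same "https://mudcat.org" prefix used above.

```javascript
// Hypothetical demo helper (not part of rep): report how the URL
// constructor splits an input when resolved against the Mudcat base.
const base = "https://mudcat.org";

function describe(input) {
  const u = new URL(input, base);
  return { hostname: u.hostname, pathname: u.pathname, search: u.search };
}

// With a scheme, the host is parsed as a host.
const withScheme = describe("https://www.mudcat.org/thread.cfm?threadid=172747");
// hostname: "www.mudcat.org", pathname: "/thread.cfm"

// Without a scheme, the whole input is treated as a relative path.
const withoutScheme = describe("www.mudcat.org/thread.cfm?threadid=172747");
// hostname: "mudcat.org", pathname: "/www.mudcat.org/thread.cfm"

// A leading "//" (protocol-relative) is enough to make it a host again.
const protocolRelative = describe("//www.mudcat.org/thread.cfm?threadid=172747");
// hostname: "www.mudcat.org", pathname: "/thread.cfm"

console.log(withScheme, withoutScheme, protocolRelative);
```

So the parser itself agrees with the "it's a relative link" reading. Whether a link maker should second-guess that for input that merely looks like a hostname is the open question that the remove flag leaves to the caller.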