While wondering what to do to occupy some time, I decided to think about what I might do with a link maker. Here’s one attempt at a JavaScript possibility:

function rep(url, remove = false) {
  const prefix = "https://mudcat.org";
  // Resolve against the Mudcat base so that relative links work too.
  const u = new URL(url, prefix);
  let pathpart = u.pathname;
  let hostpart;
  if (u.hostname.search(/mudcat\.org/i) >= 0) {
    // An input like "mudcat.org/..." with no protocol ends up inside the path.
    const mc = pathpart.indexOf("mudcat.org");
    if (mc >= 0 && remove) {
      pathpart = pathpart.slice(mc + 10); // 10 is the length of "mudcat.org"
    }
    hostpart = prefix;
  } else {
    hostpart = u.protocol + "//" + u.hostname;
  }
  const link = hostpart + pathpart + u.search;
  return '<a href="' + link + '">' + link + '</a>(input url: ' + url + ')';
}

And some test URLs, with the output for each:

<a href="https://www.google.co.uk/search?q=mudcat">https://www.google.co.uk/search?q=mudcat</a>(input url: https://www.google.co.uk/search?q=mudcat)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: /thread.cfm?threadid=172747)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: http://mudcat.org/thread.cfm?threadid=172747)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: http://www.mudcat.org/thread.cfm?threadid=172747)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: https://loki.mudcat.org/thread.cfm?threadid=172747)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: https://mudcat.org/thread.cfm?threadid=172747)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: https://www.mudcat.org/thread.cfm?threadid=172747)

(remove=false)
<a href="https://mudcat.org/mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/mudcat.org/thread.cfm?threadid=172747</a>(input url: mudcat.org/thread.cfm?threadid=172747)
<a href="https://mudcat.org/www.mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/www.mudcat.org/thread.cfm?threadid=172747</a>(input url: www.mudcat.org/thread.cfm?threadid=172747)

(remove=true)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: mudcat.org/thread.cfm?threadid=172747)
<a href="https://mudcat.org/thread.cfm?threadid=172747">https://mudcat.org/thread.cfm?threadid=172747</a>(input url: www.mudcat.org/thread.cfm?threadid=172747)

Since I guess it represents Mudcat "best practice" these days, links to Mudcat pages are returned starting with "https://mudcat.org", regardless of whether the input uses http or https or includes a subdomain (e.g. www.). The last 4 examples are a bit of a puzzle to me. These are the cases where mudcat.org is supplied without http(s)://. By my current thinking, without the protocol the link is relative, so www.mudcat.org, for example, is part of the path, and a result like https://mudcat.org/www.mudcat.org/thread.cfm?threadid=172747 is correct. On the other hand, I think this behaviour has been considered a bug. I’ve left the question open in this example: calling the function with remove set to true removes the apparent duplication.
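
For anyone who wants to poke at the puzzle, here is a rough sketch of how the standard URL constructor treats an input with and without the protocol, using the same base as rep. The addScheme helper at the end is purely hypothetical and not part of rep; it is just one guess at how the "bug" reading might be handled, by assuming that an input starting with a mudcat.org host name was meant to be absolute.

```javascript
// Same base that rep uses for resolving relative links.
const base = "https://mudcat.org";

// No protocol: the whole input is treated as a relative path.
const bare = new URL("www.mudcat.org/thread.cfm?threadid=172747", base);
console.log(bare.hostname); // host comes from the base
console.log(bare.pathname); // "www.mudcat.org" becomes the first path segment

// With a protocol, the same text is parsed as host plus path.
const full = new URL("https://www.mudcat.org/thread.cfm?threadid=172747", base);
console.log(full.hostname);
console.log(full.pathname);

// Hypothetical helper (an assumption, not part of rep): if the input starts
// with a mudcat.org host name followed by "/", guess that it was meant to be
// absolute and put a protocol in front of it. Anything else is left alone.
function addScheme(url) {
  return /^([a-z0-9-]+\.)*mudcat\.org(\/|$)/i.test(url) ? "https://" + url : url;
}

console.log(addScheme("www.mudcat.org/thread.cfm?threadid=172747"));
console.log(addScheme("/thread.cfm?threadid=172747"));
```

Running the input through something like addScheme before calling rep would be an alternative to the remove flag, at the cost of no longer treating such input as a genuine relative link.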