I have a “Single-Line Text” custom field called “Asker URL” with the template tag set to AskExpertURL. The values stored so far are a mixture of full URLs like:
http://movabletype.org
And shorthand URLs like:
movabletype.org
I am following the suggested template tag use on my site:
<mt:If tag="AskExpertURL">
Asker URL: <a href="<mt:AskExpertURL>">Asker Name</a>
</mt:If>
When I publish my site, sometimes the resulting HTML looks like:
Asker URL: <a href="movabletype.org">Asker Name</a>
But when I click that link on the published page, my browser thinks I want to visit the URL http://movabletype.org/some_directory/movabletype.org. This seems silly. How do I fix it?
For reference, here are the settings for my custom field:
Charlie Gorichanaz on October 16, 2013, 4:53 p.m.
This is expected browser behavior. Any href attribute value that does not begin with a protocol specifier (such as http://), two slashes (//), or a pound sign (#) is interpreted as a relative URL pointing to another location on the same server. It might seem silly, but a directory or file could legitimately be called “movabletype.org”, so the URL http://movabletype.org/some_directory/movabletype.org might be valid.
You have a number of options to deal with this. The third one is probably the easiest in your case.
Advise users to enter absolute URLs beginning with http://, https://, etc.
This is obviously not a foolproof solution, but it probably would not hurt, either.
Use form validation to force correct values
You could implement JavaScript validation to encourage the user to enter an absolute URL. This is also not foolproof, as it won’t work if the user has JavaScript disabled or intentionally circumvents it, either by submitting to the form action URL via another script or by changing the web page code using a browser’s developer tools.
You could also implement server-side validation and block form submissions that do not validate. This is the best way to ensure all your data is in the same format.
Standardize URLs through template code
Movable Type provides many convenient and powerful modifiers you can apply to the output of most template tags. In your case, you could use the modifier regex_replace to output an absolute URL, even if the data is a relative URL.
The pattern I recommend is this:
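Applied to the template from the question, the modifier might look like the sketch below. It assumes the usual two-argument form of regex_replace, "/pattern/","replacement", and uses the pattern and replacement explained in the next two sections. Treat it as a starting point and check the published HTML before relying on it.

```
<mt:If tag="AskExpertURL">
Asker URL: <a href="<mt:AskExpertURL regex_replace="/^(?![a-zA-Z0-9+.-]+://)(?!//)(.*)/","//$1">">Asker Name</a>
</mt:If>
```

With this in place, a stored value of movabletype.org should publish as href="//movabletype.org", while values that already begin with http:// or // should pass through unchanged.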
Pattern
The pattern contained within slashes is
/^(?![a-zA-Z0-9+.-]+://)(?!//)(.*)/
^ anchors the following expression to the beginning of the string. If you omitted that, the regular expression engine would attempt to match anywhere in the string. This might be useful if you are using this on something like the EntryBody tag, but then you should take care to avoid matching things that are not the start of a URL, such as by replacing ^ with the word boundary anchor \b.
The (?!…) pieces are negative lookaheads that cause the entire pattern to match only if the contents between the opening (?! and the closing ) do not match at that position. In this case, I am using two negative lookaheads immediately following the ^ anchor so that the replacement happens only on values that begin with neither thing.
[a-zA-Z0-9+.-] matches any ASCII lowercase letter, uppercase letter, digit, or one of the characters +, . and -. The + following the character class’s closing bracket makes the pattern match one or more such characters, with no upper limit.
:// matches those characters literally.
(?![a-zA-Z0-9+.-]+://) matches any string that does not begin with sequences like http://, HTTP://, ftp://, etc.
(?!//) is similar in that it matches any string that does not begin with //, which is how a protocol-relative URL starts.
(.*) then matches the remainder of the value.
Replacement
The replacement, which does not need slashes, is
//$1
// are just two literal slashes. Note you could replace this with http://, but the two slashes alone work fine in almost all cases and have the added benefit of not causing warnings when switching between SSL and standard unencrypted pages.
$1 is a backreference that outputs the characters that were matched by the first group marked by parentheses, (…). In this case, that should be “movabletype.org” if that was the original value.
Dan Wolfgang on October 18, 2013, 1:22 p.m.
It’s worth pointing out that there is a “URL” Custom Field type available. This field type requires that the URL start with the protocol (http://) and does not accept any other format, and that’s all it does. So, it’s a field designated for URLs but it’s not robust or smart enough to handle all of the variation that is likely to come in from a user-facing submission form. In other words, for this scenario Charlie’s answer is clearly the best solution.