What to look for, what to check.
This simple question opens up a veritable Pandora’s box when it comes to understanding URLs and what is safe to click on.
The concepts are simple, but how those concepts can be combined is complex, particularly if someone is attempting to deceive you.
I’ll try to make some sense of it all.

A URL or web address beginning with “http” has several components, the most important being the server or website address between the leading “//” and the next single “/”. Scammers often obfuscate this by making trusted websites (like paypal.com) appear as subdomains of a hacker’s domain (paypal.com.hackersdomain) or by using confusing encoding (hackersdomain%2fpaypal.com). Always examine the right-most portion of the full domain to confirm it’s going where you think. Https is available to fraudulent domains as well, so you cannot rely on it alone as an indicator of safety.

Three basic URL components
“URL” is short for Uniform Resource Locator. The most common one we know of is the web address — something like “https://askleo.com”.
There are three primary components to a URL. We’ll use this URL as our example for discussion:
http://www.somerandomservice.com/folder/page?parameter1=value1&parameter2=value2
- http://www.somerandomservice.com – Server. This identifies the protocol (http or https — the language of webpages) and the server that hosts the webpage. www.somerandomservice.com identifies a specific server on the internet from which what follows will be requested.
- folder/page – Page. The page specifies exactly what you are requesting from the server. Typically it’s a webpage, perhaps within a folder on that server, but it could also be a program to run on the server or a file to be downloaded.
- ?parameter1=value1&parameter2=value2 – Parameters. The question mark indicates that the rest of the URL contains parameters: additional information supplied to the page. Since “pages” can often be small (or large) computer programs, information from this part of a URL is given to those programs to use in ways that can affect the content of the page ultimately displayed.
Important note: The Server specification ends at the first “/” that occurs after the “http://” or “https://” start of the URL, and the Page specification ends at the first question mark after that. This rule is important to understanding whether a URL is valid, bogus, or misleading.
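As a rough illustration, the same three-way split can be reproduced with Python's standard urllib.parse module. This is only a sketch: the function name split_url is made up for this example, and the labels simply mirror the terms used above.

```python
from urllib.parse import urlsplit

def split_url(url):
    """Split a URL into the three parts discussed above."""
    parts = urlsplit(url)
    return {
        "server": f"{parts.scheme}://{parts.netloc}",  # protocol plus server name
        "page": parts.path.lstrip("/"),                # what is requested from the server
        "parameters": parts.query,                     # everything after the first "?"
    }

example = "http://www.somerandomservice.com/folder/page?parameter1=value1&parameter2=value2"
for name, value in split_url(example).items():
    print(f"{name}: {value}")
```

The server part is cut at the first “/” after “http://”, and the page part at the first question mark, which is the rule stated in the note above.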
The server matters, part 1
I’ll restate the first part of that rule to focus on what we care about (I’ll use “http” from here, but this all applies to both “http” and “https” unless otherwise specified):
The server being contacted begins after the “http://” and ends at the next “/”.
Or, in this URL, the “www.somerandomservice.com” part that follows “http://”:
http://www.somerandomservice.com/folder/page?parameter1=value1&parameter2=value2
That’s the part that matters, because that’s the part that tells your browser what internet server to connect to. Everything else is secondary. Important, yes, but not nearly as important.
Let’s look at one of the ways that phishing attempts try to fool you.
http://www.somerandomservice.com/www.paypal.com
It might be tempting to look at that quickly and say “oh, that ends in paypal.com, so it’s PayPal!”
No, it’s not. Look again:
http://www.somerandomservice.com/www.paypal.com
That URL loads a page named “www.paypal.com” (a valid page name) from the server www.somerandomservice.com.
Now, my example is pretty lame, as “www.somerandomservice.com” is big and obvious at the front of that URL. But scammers use all sorts of variations on this theme to make it look like you’re going someplace you trust when you’re not, and you’ll miss it if you don’t look closely.
The server matters, part 2
For this point, we need to pick apart the way server names are created and used.
Server names are built from right to left, and the individual components are separated by a period. Consider “www.somerandomservice.com”.
- “.com” is the top-level domain, and indicates which registry service is used to register the domain initially.
- “somerandomservice” is the domain name. This is the part you purchase when you register a domain name.
- “www.” is the subdomain. Once you own the domain, you can create as many of these subdomains as you like.
In general, fully qualified domain names like “www.somerandomservice.com” identify a server on the internet. “photos.somerandomservice.com” would typically be a different server, but it doesn’t have to be.
The choice between using something like “photos.somerandomservice.com” versus “somerandomservice.com/photos” is purely one of site design, and has no security implications. That’s just how the person building the website chose to do it. There are geeky pros and cons to each, but for a typical web user, it doesn’t really matter.
What does matter is how subdomains can be abused. For example, it’s perfectly possible for this to be a valid domain:
http://www.paypal.com.somerandomservice.com
Once again, with only a quick glance, you might think it was actually paypal.com, since it starts with “http://www.paypal.com”.
In that example, “www.paypal.com.” is just a subdomain created by the owner of “somerandomservice.com” and has nothing at all to do with the real paypal.com.
Here’s a worse example:
http://www.paypal.com————————————————————.somerandomservice.com
Once again, it’s designed to fool you by looking like paypal.com, but it’s not. The trick is especially effective if your browser shows only the first part of the URL in its status bar because the URL is so long.
Scammers use many different variations of this technique to trick you.
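The “read it from the right” idea can be sketched in a few lines of Python. The function name belongs_to is invented for this illustration, and “paypal.com” simply stands in for whatever domain you actually trust; the check only compares names and says nothing about whether a site is otherwise safe.

```python
from urllib.parse import urlsplit

def belongs_to(url, trusted_domain):
    """True if the URL's server is trusted_domain itself or a subdomain of it."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    trusted = trusted_domain.lower()
    # Compare from the right: the host must equal the domain or end with ".domain".
    return host == trusted or host.endswith("." + trusted)

print(belongs_to("http://www.paypal.com/signin", "paypal.com"))                     # True
print(belongs_to("http://www.paypal.com.somerandomservice.com", "paypal.com"))      # False
print(belongs_to("http://www.somerandomservice.com/www.paypal.com", "paypal.com"))  # False
```

Requiring the leading period in “.paypal.com” matters: without it, a look-alike registration that merely ends in the same letters would also pass.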
A slash is a slash is a … %2F?
This was brought up by a comment on this article (thanks, Ken!).
Characters in URLs can be “encoded” with a special representation that acts the same as the character it encodes. The format is a percent sign followed by a two-digit hexadecimal number (individual digits will be 0-9 or A-F).
A space character, for example, is %20, and you’ll actually see that in legitimate URLs from time to time, since an actual space character cannot be used.
%2F is the slash character “/”.
So this rule:
The server being contacted begins after the “http://” and ends at the next “/”.
still applies, but %2F could be seen in place of “/”. More correctly:
The server being contacted begins after the “http:” and the two slashes that follow it (each written as either “/” or “%2F”), and ends at the next “/” or “%2F”.
It gets ugly, but the thing to remember is just this: %2F is exactly the same as “/”.
Here’s an example of how it might be abused:
http://www.somerandomservice.com%2Fwww.paypal.com/
That is not PayPal. Replace the %2F with “/” and you’ll see instead:
http://www.somerandomservice.com/www.paypal.com/
Clearly, it goes to www.somerandomservice.com.
Any URL with a % notation in the server portion (between that first “http://” and the next “/”) is suspect. A % notation after the server portion (in the page, or more commonly the parameters) is typically OK.
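That last rule of thumb is easy to express as a check. The sketch below uses Python's urllib.parse; the function name server_has_percent is an assumption made for this example. Python's parser does not decode %2F on its own, so it treats everything up to the first literal “/” as the server portion, which is exactly the span we want to inspect for a “%”.

```python
from urllib.parse import urlsplit

def server_has_percent(url):
    """True if a % notation appears in the server portion of the URL."""
    # netloc is the raw text between "//" and the next literal "/", "?" or "#".
    return "%" in urlsplit(url).netloc

print(server_has_percent("http://www.somerandomservice.com%2Fwww.paypal.com/"))  # True
print(server_has_percent("http://www.somerandomservice.com/my%20page"))          # False
```

The second URL contains %20, but only in the page portion, so it is not flagged.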
Https and secure websites
All of the above applies whether the URL begins with http or https.
Https adds two important things:
- It encrypts the data flowing between your computer and the server.
- It validates that the server you connect to is, in fact, the server you requested.
Important: https doesn’t validate you’re connecting to the server you think you are; it validates that you’re connecting to the server you requested. Those are two different things.
For example, let’s say you fall for one of my lame examples above, and click on a link like this:
https://www.paypal.com.somerandomservice.com
That’s an https connection. It is very easy for the owner of somerandomservice.com to install a completely valid https certificate for www.paypal.com.somerandomservice.com.
Thus, when you click on that link, your browser will confirm that you are indeed connecting to what you asked for: www.paypal.com.somerandomservice.com. That might not be what you think you asked for, if you fell for a scammer’s trick, but that’s all that https can validate for you: you got what you asked for.
Staying safe
It’s unfortunate that something fairly simple is quite complex once you assume people will attempt to deceive you.
I’ll sum it up with this:
For any URL you are about to click on, pay close attention to the domain name: everything between “http://” or “https://” and the next “/”. Remember that domain names build from the right, so if it ends in, for example, “.paypal.com”, you can be assured that it’s a domain or sub-domain owned by paypal.com.
It’s also helpful to note that scammers will often create their HTML in such a way that what the link name is isn’t where you’re actually going. For instance [a href=”somerandomservice.com”]paypal.com[/a] (if you replace the square brackets with greater-than/less than signs) if viewed in html will look like you’re going to paypal.com, but will connect you to somerandomservice.com instead.
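A mismatch between what a link says and where it goes can be detected mechanically. The following is a minimal sketch using Python's standard html.parser; the names LinkCollector, host_of, and mismatched_links are invented here, and the comparison is deliberately strict (it only looks at link text that itself resembles a host name, and it treats “paypal.com” and “www.paypal.com” as different).

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

class LinkCollector(HTMLParser):
    """Collect (visible text, href) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None

def host_of(address):
    """Host name of an address, tolerating a missing http:// prefix."""
    if "//" not in address:
        address = "//" + address
    return (urlsplit(address).hostname or "").lower()

def mismatched_links(html):
    """Links whose visible text looks like a host but differs from the real target."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return [(text, href) for text, href in parser.links
            if "." in text and " " not in text and host_of(text) != host_of(href)]

sample = '<a href="http://somerandomservice.com">paypal.com</a>'
print(mismatched_links(sample))
```

For everyday use, hovering over a link and reading the address your browser or mail program displays accomplishes the same comparison by eye.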
And then there are IP numbers…
There is even a nastier technique using the username:password construction in URLs like: http://username:password@domain/path/page. Example:
http://www.paypal.com:looksverysafe@phishingsite.net
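To see how a URL parser reads that example, here is a small sketch with Python's urllib.parse. The helper name has_userinfo is an assumption for illustration. Everything before the “@” is treated as login information, and the server actually contacted is whatever follows it.

```python
from urllib.parse import urlsplit

def has_userinfo(url):
    """True if the server portion contains a username (and maybe password) before "@"."""
    return "@" in urlsplit(url).netloc

parts = urlsplit("http://www.paypal.com:looksverysafe@phishingsite.net")
print("username:", parts.username)  # www.paypal.com
print("password:", parts.password)  # looksverysafe
print("server:  ", parts.hostname)  # phishingsite.net
```

An “@” anywhere between “//” and the next “/” is a reasonable signal to stop and look more closely.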
All good points. Another “trick” phishers often use is the “%nn” method of specifying characters in hexadecimal. Beware of anything that includes such things outside of parameters (the part after “?”).
For example, “%2F” represents “/”, so the following does _not_ go to paypal.com:
http://www.example.com%2Fwww.paypal.com/login
So, strictly following the “between // and the first /” won’t work here, unless you know that %2F _is_ a “/”.
But, as I said, anything that uses “%nn” in the URL, especially in the server name part, should be looked at with great suspicion.
A very nasty technique is the use of international domain names, where non-Latin characters are used in the domain name. Here, characters are used that look like those in our ‘normal’ alphabet, but are different characters as far as the computer is concerned. An example could be: http://amazon.com where the ‘o’ in ‘amazon’ is not the letter ‘o’ from the Latin alphabet, but an ‘o’ from the Cyrillic (Russian) alphabet, or the omicron from the Greek.
Luckily, not all browsers support international domain names.
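Look-alike characters are invisible to the eye but not to a program. The sketch below, with invented helper names non_ascii_letters and punycode_host, lists any non-ASCII characters in a server name and shows the ASCII (“xn--”) form that such a name is converted to for lookup. It relies on Python's built-in “idna” codec and assumes the Cyrillic “o” (U+043E) as the substituted letter.

```python
import unicodedata
from urllib.parse import urlsplit

def non_ascii_letters(url):
    """Return (character, Unicode name) for each non-ASCII character in the server name."""
    host = urlsplit(url).hostname or ""
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in host if not ch.isascii()]

def punycode_host(url):
    """Return the ASCII-only form of the server name."""
    host = urlsplit(url).hostname or ""
    return host.encode("idna").decode("ascii")

lookalike = "http://amaz\u043en.com"   # \u043e is the Cyrillic letter "o"
print(non_ascii_letters(lookalike))    # [('о', 'CYRILLIC SMALL LETTER O')]
print(punycode_host(lookalike))        # an "xn--" name, visibly not amazon.com
```

A server name that is expected to be plain English but turns out to contain any non-ASCII character deserves suspicion.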
At the risk of adding even more confusion to Leo’s nice, simple explanation: Some anti-malware products use various means to verify the safety of a URL before letting you go there. I know of two vendors, Comodo and Sunbelt, who are testing DNS services as an efficient way to do this. You simply enter Comodo’s or Sunbelt’s DNS servers into your TCP/IP settings. (Internet 101: A PC can’t go to a URL without first querying a DNS server for the IP address of the URL.) If you try to go to a bad URL, these “smart” DNS servers return the IP address of a warning page, instead of the IP address of the bad website.
Use McAfee site advisor ( http://www.siteadvisor.com/ ). It will give you a red flag when a site is fishy.
I’ve always used the rule that it’s the item immediately before the .com (or .edu, etc.) that matters. Isn’t this a simpler, but valid, rule for checking site authenticity?
18-Aug-2010
Oops, from what you say my rule must include the first slash. That is, I should have said, “before the ‘.com/’ or ‘.com.countrycode/’”, where the slash is the first one. Won’t that work as a simple rule for looking for the critical domain name?
Another way to stay safe with PayPal is knowing that official email from PayPal always addresses you specifically by name, as in “Dear John Doe,” not “Dear PayPal User” or similar, as scammers usually do. Still, it’s a good idea to not click a link, even in an email apparently from PayPal, just to be sure. Use a Favorite shortcut, or simply type paypal.com into your browser, to ensure that you’re going to the genuine PayPal website and not some other site.
I use Firefox, and it has an add-on called WOT that I find useful. It rates sites and advises you whether the site is OK or whether you should leave. I think it can be downloaded free for use with other browsers.
18-Aug-2010
Thanks for your info.
For the less technical person, how about using AVK-Link Scanner? While not a 100% guarantee, it is very useful, and after several years of using, has never failed to reveal a ‘bad’ site.
18-Aug-2010
Why not use Calling ID toolbar? This checks any site you visit and informs you whether it is safe or not. It has never failed me yet; and it’s free!
18-Aug-2010
When I clicked on the link to this article from your home page I got a warning from Avira, Malware found, “HTML/Spoofing.Gen” was found in a file. It happens every time. The page loads, despite. I click on remove, but concerned I may have something on my computer. This last time the warning popped back up again when I posted this comment.
Thank you for the specific explanation of URL warning signs. As I am sure you know from experience whenever an expert such as you explains what the hackers are doing and how they do it, that sets up the “dropping of the gauntlet”. They accept the challenge and design detours from and around your warning post! So “Be Wise and Use Your Eyes” is a good motto to remember.
This works well: http://safeweb.norton.com/
Norton itself rates sites here, and allows users too to chime in. Works great, & has warned me off several so called “good websites.”
I’ve found the WOT (Web of Trust) Firefox plug in very useful, and it’s saved me from some bad sites.
The only problem is that sometimes a really nasty site can be rated with a green symbol in your Google search results. You have to hover over the green rating symbol and click through to see the user ratings. Sometimes a site with lots of “red” (danger) ratings still shows up as a “green” (safe) site.
23-Aug-2010
Hi. I saw an input from a reader that made this comment: “When I clicked on the link to this article from your page I got a warning from Avira, Malware found, “HTML/Spoofing.Gen” was found in a file. It happens every time. The page loads, despite. I click on remove, but concerned I may have something on my computer. This last time the warning popped back up again when I posted this comment.” I tried and my Kaspersky warned me off; you may have a Trojan.
Trojan-Spy.HTML.Fraud.gen Best wishes. WSS
In order to discuss this topic this page has on it examples of misleading URLs. Some overly-sensitive anti-malware tools are throwing a false positive because of that.
Again, there is no malware here.
26-Aug-2010
Good article. Some of both the article and comments are over my head. But, bottom line, I never, EVER, click on a link in an email when going to my bank, PayPal account, or any other site which requires name/rank/serial number/credit card/password.
I was hoping this article would answer a question I’ve had for some time but, unless I missed something, it didn’t. My question isn’t whether a particular website is “safe” but, rather, to which website will I be taken if I click on a link?
Admittedly, most of these links in question are in known spam emails. But I, occasionally, get curious and (knowing I am not going to ever enter any personal information at the site) will click on them.
Here is a made-up example. http://www.xyz.com/abc. Clicking on that link takes me to, say, adultfriendfinder.com. If my assumption is correct, “xyz” is the server and “abc” redirects to adultfriendfinder.
My question is: how can one determine the destination website in advance of clicking the link?
I actually know of no way for average users to determine the final destination of a URL that is being redirected without actually going to it. I tend to use a very geeky command line tool called “curl” which lists the domains it’s accessing as they’re redirected, but I don’t expect the average user to want that. Perhaps someone knows of such a service and will leave a comment here.
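For the curious, the redirect-following idea can be approximated in Python. This is a sketch under several assumptions: the function names next_hop and redirect_chain are made up, only ordinary HTTP redirects (a 3xx status with a Location header) are detected, and redirects done by JavaScript or page content are not. Like curl, it does contact each server in the chain; it just never loads or displays the page.

```python
import http.client
from urllib.parse import urlsplit, urljoin

def next_hop(url):
    """Make one HEAD request and return the redirect target, or None if there is none."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        location = response.getheader("Location")
        if response.status in (301, 302, 303, 307, 308) and location:
            return urljoin(url, location)  # Location may be relative
        return None
    finally:
        conn.close()

def redirect_chain(url, lookup=next_hop, max_hops=10):
    """List every URL visited, starting with the original, until redirects stop."""
    chain = [url]
    for _ in range(max_hops):
        target = lookup(chain[-1])
        if target is None:
            break
        chain.append(target)
    return chain

# Example use (performs real network requests):
# for hop in redirect_chain("http://www.xyz.com/abc"):
#     print(urlsplit(hop).hostname)
```

The lookup parameter exists so the chain logic can be tried out with a stand-in function instead of real network requests; max_hops guards against redirect loops.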
26-Aug-2010
I always use Google cache when entering a website I don’t know, and I don’t accept any cookies from unfamiliar websites or websites I am not part of.
Eset 5 works to stop you going to bad sites. It terminates that site. If you try to download bad stuff, it will auto clean.