How Do I Know a Web Address is Safe?

Security when clicking onto a website confounds me. Some sites put the section of the site you are wanting ahead of the web address. Example http://photos.kodak.com and some put the section after example http://kodak.com/photos. These examples are just made up but I hope you understand what I’m saying. How do I know if I’m on the secure website I’m supposed to be on? At times I see other addresses flashing by on the toolbar that are not the site I clicked on before the actual site appears.

This simple question opens up a veritable Pandora’s box when it comes to understanding URLs and what is safe to click on.

The concepts are simple, but how those concepts can be combined is complex, particularly if someone is attempting to deceive you.

I’ll try to make some sense of it all.

Three basic URL components

“URL” is short for Uniform Resource Locator. The most common one we know of is the web address — something like “https://askleo.com”.

There are three primary components to a URL. We’ll use this URL as our example for discussion:

http://www.somerandomservice.com/folder/page?parameter1=value1&parameter2=value2
  • http://www.somerandomservice.com – Server. This identifies the protocol (http or https, the language of webpages) and the server that hosts the webpage. www.somerandomservice.com identifies a specific server on the internet from which what follows will be requested.
  • folder/page – Page. The page specifies exactly what you are requesting from the server. Typically it’s a webpage, perhaps within a folder on that server, but it could also be a program to run on the server or a file to be downloaded.
  • ?parameter1=value1&parameter2=value2 – Parameters. The question mark indicates that the rest of the URL contains parameters — additional information supplied to the page. Since “pages” can often be small (or large) computer programs, information from this part of a URL is given to those programs to use in ways that can affect the content of the page ultimately displayed.

Important note: The Server specification ends at the first “/” that occurs after the “http://” or “https://” start of the URL, and the Page specification ends at the first question mark after that. This rule is important to understanding whether a URL is valid, bogus, or misleading.
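
Here's a minimal sketch of that split, using Python's standard urllib.parse module. The function name split_url is my own, made up for illustration, and the sketch assumes the URL begins with "http://" or "https://".

```python
from urllib.parse import urlsplit

def split_url(url):
    """Return the (server, page, parameters) pieces described above."""
    parts = urlsplit(url)
    server = f"{parts.scheme}://{parts.netloc}"  # protocol plus server name
    page = parts.path.lstrip("/")                # what is requested from the server
    parameters = parts.query                     # everything after the "?"
    return server, page, parameters

example = "http://www.somerandomservice.com/folder/page?parameter1=value1&parameter2=value2"
server, page, parameters = split_url(example)
print(server)      # http://www.somerandomservice.com
print(page)        # folder/page
print(parameters)  # parameter1=value1&parameter2=value2
```

The server is whatever sits before the first "/" that follows the protocol, no matter what the rest of the URL looks like.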

The server matters, part 1

I’ll restate the first part of that rule to focus on what we care about (I’ll use “http” from here, but this all applies to both “http” and “https” unless otherwise specified):

The server being contacted begins after the “http://” and ends at the next “/”.

Or, in this URL, the “www.somerandomservice.com” part:

http://www.somerandomservice.com/folder/page?parameter1=value1&parameter2=value2

That’s the part that matters, because that’s the part that tells your browser what internet server to connect to. Everything else is secondary. Important, yes, but not nearly as important.

Let’s look at one of the ways that phishing attempts try to fool you.

http://www.somerandomservice.com/www.paypal.com

It might be tempting to look at that quickly and say “oh, that ends in paypal.com, so it’s PayPal!”

No, it’s not. Look again:

http://www.somerandomservice.com/www.paypal.com

That URL loads a page named “www.paypal.com” (a valid page name) from the server www.somerandomservice.com.

Now, my example is pretty lame, as “www.somerandomservice.com” is big and obvious at the front of that URL. But scammers use all sorts of variations on this theme to make it look like you’re going someplace you trust when you’re not, and you won’t notice unless you look closely.
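
To see the difference between a quick glance and what actually gets contacted, here is a small sketch in Python. The variable names are mine, and the "quick glance" is simulated with a simple check of how the URL ends.

```python
from urllib.parse import urlsplit

url = "http://www.somerandomservice.com/www.paypal.com"

# A quick glance (or a naive check) only notices how the URL ends.
looks_like_paypal = url.endswith("paypal.com")

# What matters is the server that will actually be contacted.
actual_server = urlsplit(url).hostname

print(looks_like_paypal)  # True
print(actual_server)      # www.somerandomservice.com
```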

The server matters, part 2

For this point, we need to pick apart the way server names are created and used.

Server names are built from right to left, and the individual components are separated by periods. Consider “www.somerandomservice.com”.

  • “.com” is the top-level domain, and indicates which registry service is used to register the domain initially.
  • “somerandomservice” is the domain name. This is the part you purchase when you register a domain name.
  • “www.” is the subdomain. Once you own the domain, you can create as many of these subdomains as you like.
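
As a rough sketch, reading a server name from right to left looks like this in Python. The helper name labels_right_to_left is hypothetical, and it does nothing more than split on periods.

```python
def labels_right_to_left(server_name):
    """List the period-separated parts of a server name, rightmost first."""
    return server_name.lower().rstrip(".").split(".")[::-1]

print(labels_right_to_left("www.somerandomservice.com"))
# ['com', 'somerandomservice', 'www']
```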

In general, fully qualified domain names like “www.somerandomservice.com” identify a server on the internet. “photos.somerandomservice.com” would typically be a different server, but it doesn’t have to be.

The choice between using something like “photos.somerandomservice.com” versus “somerandomservice.com/photos” is purely one of site design, and has no security implications. That’s just how the person building the website chose to do it. There are geeky pros and cons to each, but for a typical web user, it doesn’t really matter.

What does matter is how subdomains can be abused. For example, it’s perfectly possible for this to be a valid domain:

http://www.paypal.com.somerandomservice.com

Once again, with only a quick glance, you might think it was actually paypal.com, since it starts with “http://www.paypal.com”.

In that example, “www.paypal.com.” is just a subdomain created by the owner of “somerandomservice.com” and has nothing at all to do with the real paypal.com.

Here’s a worse example:

http://www.paypal.com----------------------------------------.somerandomservice.com

Once again, it’s designed to fool you by looking like paypal.com when in fact it’s not, especially if your browser happens to show you only the first part of the URL in your status bar because it’s so long.

Scammers use many different variations of this technique to trick you.
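
One way to sketch the check that defeats this trick: compare the end of the server name, not the beginning, against the domain you expect. The function belongs_to and the name “notpaypal.com” are my own inventions for this example.

```python
def belongs_to(server_name, domain):
    """True if server_name is the domain itself or one of its subdomains."""
    server_name = server_name.lower().rstrip(".")
    domain = domain.lower().rstrip(".")
    # The leading "." matters: without it, "notpaypal.com" would match.
    return server_name == domain or server_name.endswith("." + domain)

print(belongs_to("www.paypal.com", "paypal.com"))                        # True
print(belongs_to("www.paypal.com.somerandomservice.com", "paypal.com"))  # False
print(belongs_to("notpaypal.com", "paypal.com"))                         # False
```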

A slash is a slash is a … %2F?

This was brought up by a comment on this article (thanks, Ken!).

Characters in URLs can be “encoded” with a special representation that acts the same as the character it encodes. The format is a percent sign followed by a two-digit hexadecimal number (individual digits will be 0-9 or A-F).

A space character, for example, is %20, and you’ll actually see that in legitimate URLs from time to time, since an actual space character cannot be used.

%2F is the slash character “/”.
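
If you want to see the encoding for yourself, Python's standard urllib.parse module can convert in both directions. This is only a small demonstration; “my%20photos” is a made-up value.

```python
from urllib.parse import quote, unquote

print(quote(" "))              # %20
print(quote("/", safe=""))     # %2F  (safe="" asks for the slash to be encoded too)
print(unquote("%2F"))          # /
print(unquote("my%20photos"))  # my photos
```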

So this rule:

The server being contacted begins after the “http://” and ends at the next “/”.

still applies, but %2F could be seen in place of “/”. More correctly:

The server being contacted begins after the “http:” and the two slashes that follow it (each of which may appear as “/” or “%2F”), and ends at the next “/” or “%2F”.

It gets ugly, but the thing to remember is just this: %2F is exactly the same as “/”.

Here’s an example of how it might be abused:

http://www.somerandomservice.com%2Fwww.paypal.com/

That is not PayPal. Replace the %2F with “/” and you’ll see instead:

http://www.somerandomservice.com/www.paypal.com/

Clearly, it goes to www.somerandomservice.com.

Any URL with a % notation in the server portion (between that first “http://” and the next “/”) is suspect. A % notation after the server portion (in the page, or more commonly the parameters) is typically OK.
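
That "suspect" rule can be sketched in a few lines of Python. Both function names are hypothetical, and the sketch assumes the URL starts with "http://" or "https://".

```python
def server_portion(url):
    """Return the text between the first "//" and the next "/"."""
    after_protocol = url.split("//", 1)[1]
    return after_protocol.split("/", 1)[0]

def has_encoded_server(url):
    """True if the server portion contains a % notation."""
    return "%" in server_portion(url)

print(has_encoded_server("http://www.somerandomservice.com%2Fwww.paypal.com/"))  # True
print(has_encoded_server("http://www.somerandomservice.com/my%20photos"))        # False
```

The second URL is not flagged because its % notation appears after the server portion, in the page.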

Https and secure websites

All of the above applies whether the URL begins with http or https.

Https adds two important things:

  • It encrypts the data flowing between your computer and the server.
  • It validates that the server you connect to is, in fact, the server you requested.

Important: https doesn’t validate you’re connecting to the server you think you are; it validates that you’re connecting to the server you requested. Those are two different things.

For example, let’s say you fall for one of my lame examples above, and click on a link like this:

https://www.paypal.com.somerandomservice.com

That’s an https connection. It is very easy for the owner of somerandomservice.com to install a completely valid https certificate for www.paypal.com.somerandomservice.com.

Thus, when you click on that link, your browser will confirm that you are indeed connecting to what you asked for: www.paypal.com.somerandomservice.com. That might not be what you think you asked for, if you fell for a scammer’s trick, but that’s all that https can validate for you: you got what you asked for.
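
The same limitation shows up in software. As an illustration (not a claim about how any particular browser is built), Python's standard ssl module sets up its default secure connection settings to verify the certificate against the server name that was requested. Nothing in it knows which server you meant to ask for.

```python
import ssl

# Default settings for connecting to a server securely.
context = ssl.create_default_context()

# The certificate must be valid, and must match the server name requested.
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
print(context.check_hostname)                    # True
```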

Staying safe

It’s unfortunate that something fairly simple is quite complex once you assume people will attempt to deceive you.

I’ll sum it up with this:

For any URL you are about to click on, pay close attention to the domain name: everything between “http://” or “https://” and the next “/”. Remember that domain names build from the right, so if it ends in, for example, “.paypal.com”, you can be assured that it’s a domain or subdomain owned by the owner of paypal.com.

20 comments on “How Do I Know a Web Address is Safe?”

  1. It’s also helpful to note that scammers will often create their HTML in such a way that what the link name is isn’t where you’re actually going. For instance [a href=”somerandomservice.com”]paypal.com[/a] (if you replace the square brackets with greater-than/less than signs) if viewed in html will look like you’re going to paypal.com, but will connect you to somerandomservice.com instead.

    And then there are IP numbers…

  2. All good points. Another “trick” phishers often use is the “%nn” method of specifying characters in hexadecimal. Beware of anything that includes such things outside of parameters (the part after “?”).

    For example, “%2F” represents “/”, so the following does _not_ go to paypal.com:

    http://www.example.com%2Fwww.paypal.com/login

    So, strictly following the “between // and the first /” won’t work here, unless you know that %2F _is_ a “/”.

    But, as I said, anything that uses “%nn” in the URL, especially in the server name part, should be looked at with great suspicion.

  3. A very nasty technique is the use of international domain names, where non-Latin characters are used in the domain name. With this technique, characters are used that look like those in our ‘normal’ alphabet, but are different characters as far as the computer is concerned. An example could be: http://amazon.com where the ‘o’ in ‘amazon’ is not the letter ‘o’ from the Latin alphabet, but an ‘o’ from the Cyrillic (Russian) alphabet, or the omicron from the Greek.

    Luckily, not all browsers support international domain names.

  4. At the risk of adding even more confusion to Leo’s nice, simple explanation: Some anti-malware products use various means to verify the safety of a URL before letting you go there. I know of two vendors, Comodo and Sunbelt, who are testing DNS services as an efficient way to do this. You simply enter Comodo’s or Sunbelt’s DNS servers into your TCP/IP settings. (Internet 101: A PC can’t go to a URL without first querying a DNS server for the IP address of the URL.) If you try to go to a bad URL, these “smart” DNS servers return the IP address of a warning page, instead of the IP address of the bad website.

  5. I’ve always used the rule that it’s the item immediately before the .com (or .edu etc.) that matters. Isn’t this a simpler, but valid, rule for checking site authenticity?

    By itself, no. As you can see in the examples above scammers often try and trick you into looking at a misleading “.com” somewhere in their URL.

    Leo
    18-Aug-2010

  6. Oops, from what you say my rule must include the first slash. That is, I should have said, “before the ‘.com/’ or ‘.com.countrycode/’”, where the slash is the first one. Won’t that work as a simple rule for looking for the critical domain name?

  7. Another way to stay safe with PayPal is knowing that official email from PayPal always addresses you specifically by name, as in “Dear John Doe,” not “Dear PayPal User” or similar, as scammers usually do. Still, it’s a good idea to not click a link, even in an email apparently from PayPal, just to be sure. Use a Favorite shortcut, or simply type paypal.com into your browser, to ensure that you’re going to the genuine PayPal website and not some other site.

  8. I use Firefox and they have an add-on called WOT that I find useful. Somehow it rates sites and advises you either that the site is OK or to get off. I think it can be downloaded free to use with other browsers.

    WOT is a good model – the problem is that “they” don’t rate sites, WOT users do. That means that ratings can be gamed and artificially misleading. Generally they’re not, but it can and does happen.

    Leo
    18-Aug-2010

  9. Thanks for your info.
    For the less technical person, how about using AVK-Link Scanner? While not a 100% guarantee, it is very useful, and after several years of using, has never failed to reveal a ‘bad’ site.

    If you mean AVG link scanner – while I won’t disrecommend it, I do typically advise people to turn it off as it is implicated in various browser problems and slowdown issues.

    Leo
    18-Aug-2010

  10. Why not use Calling ID toolbar? This checks any site you visit and informs you whether it is safe or not. It has never failed me yet; and it’s free!

    Not everyone wants yet another toolbar. (I’ve never heard of Calling ID, so I can’t speak to its trustworthiness either.)

    Leo
    18-Aug-2010

  11. When I clicked on the link to this article from your home page I got a warning from Avira, Malware found, “HTML/Spoofing.Gen” was found in a file. It happens every time. The page loads, despite. I click on remove, but concerned I may have something on my computer. This last time the warning popped back up again when I posted this comment.

  12. Thank you for the specific explanation of URL warning signs. As I am sure you know from experience whenever an expert such as you explains what the hackers are doing and how they do it, that sets up the “dropping of the gauntlet”. They accept the challenge and design detours from and around your warning post! So “Be Wise and Use Your Eyes” is a good motto to remember.

  13. I’ve found the WOT (Web of Trust) Firefox plug in very useful, and it’s saved me from some bad sites.

    The only problem is that sometimes a really nasty site can be rated with a green symbol in your Google search results. You have to hover over the green rating symbol and click through to see the user ratings. Sometimes a site with lots of “red” (danger) ratings still shows up as a “green” (safe) site.

    This is actually my biggest concern with validation sites that rely on user feedback for their ratings – they can be gamed so as to provide misleading results.

    Leo
    23-Aug-2010

  14. Hi. I saw an input from a reader that made this comment: “When I clicked on the link to this article from your page I got a warning from Avira, Malware found, ‘HTML/Spoofing.Gen’ was found in a file. It happens every time. The page loads, despite. I click on remove, but concerned I may have something on my computer. This last time the warning popped back up again when I posted this comment.” I tried and my Kaspersky warned me off; you may have a Trojan.
    Trojan-Spy.HTML.Fraud.gen Best wishes. WSS

    I don’t have a trojan.

    In order to discuss this topic this page has on it examples of misleading URLs. Some overly-sensitive anti-malware tools are throwing a false positive because of that.

    Again, there is no malware here.

    Leo
    26-Aug-2010

  15. Good article. Some of both the article and comments are over my head. But, bottom line, I never, EVER, click on a link in an email when going to my bank, PayPal account, or any other site which requires name/rank/serial number/credit card/password.

    I was hoping this article would answer a question I’ve had for some time but, unless I missed something, it didn’t. My question isn’t whether a particular website is “safe” but, rather, to which website will I be taken if I click on a link?

    Admittedly, most of these links in question are in known spam emails. But I, occasionally, get curious and (knowing I am not going to ever enter any personal information at the site) will click on them.

    Here is a made-up example. http://www.xyz.com/abc. Clicking on that link takes me to, say, adultfriendfinder.com. If my assumption is correct, “xyz” is the server and “abc” redirects to adultfriendfinder.

    My question is: how can one determine the destination website in advance of clicking the link?

    A concrete example might be hotmailtips.com which takes you to a completely different site.

    I actually know of no way for average users to determine the final destination of a URL that is being redirected without actually going to it. I tend to use a very geeky command line tool called “curl” which lists the domains it’s accessing as they’re redirected, but I don’t expect the average user to want that. Perhaps someone knows of such a service and will leave a comment here.

    Leo
    26-Aug-2010

  16. I always use Google cache for entering a website I don’t know, and I don’t accept any cookies from unfamiliar websites or websites which I am not part of.

  17. Eset 5 works to stop you going to bad sites. It terminates that site. If you try to download bad stuff it will auto clean.

Comments are closed.