How do I copy a copy protected web page?

Assuming your intent is legal - or at least moral - it's not hard to copy content from a web page that attempts to prevent it.


How do I copy/paste from sites that don’t permit it? There is info I’d like to send to a friend who doesn’t have a computer but has a machine that only sends/receives plain text. I want to send her stuff from this site as an example but they don’t permit copying/pasting. Is there any way around that?

As you might expect, the website in question is trying to protect its content from theft. They have valuable information and I’m sure that people try to steal and republish their content frequently. That is, of course, quite illegal and a violation of international copyright law.

So I’ll assume that’s NOT what you have in mind. (Though technically even what you have in mind – while morally acceptable in my opinion – may still be in violation of that law.)

Copy protection on websites – be it just for pictures or for entire pages of content – is in my opinion pretty close to useless. It keeps honest people honest and that’s about as far as it goes.

Web pages, emails, whatever: if it can be seen, it can be copied.

Above Board Techniques

By “Above Board” all I really mean is using normal website behaviour to gain access to the text in ways that perhaps the web site owner hadn’t thought to prevent … yet.

The most common: printing.


In this case, if you install a print-to-PDF printer driver such as PDFCreator and print that page to create a PDF, two interesting things happen:

  • You have a nice PDF of the page. Perhaps that might be enough to get your friend a copy of the page. Certainly it has the highest “fidelity” in that it’ll include the same formatting and images as the original web page.
  • That PDF may, itself, have copy enabled. In my test of the website in question, I was able to print to PDF, and then select the desired text from the PDF and copy it elsewhere.

Another approach is to use the File -> Save As… option in the browser when viewing the page, and save it in plain text format. The results may vary from browser to browser, but you’re likely to get a good starting point from which you can then copy the desired text.

Yet another approach is to use the “View Source” option available in most browsers, which will allow you to view the underlying HTML for the page and copy out the relevant content as needed. You’ll want to clean up the results, though, removing the HTML mark-up to make the results readable.
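That clean-up step can be automated. What follows is only a rough sketch in Python, using nothing but the standard library; the names TextExtractor and html_to_text are my own inventions for illustration, and the list of tags treated as line breaks is an assumption that won't suit every page. It drops the mark-up, skips anything inside script and style blocks, and keeps the visible text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text from HTML, skipping <script> and <style> content."""

    SKIPPED_TAGS = {"script", "style"}
    # Assumed set of tags whose end should start a new line of text.
    BLOCK_TAGS = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # greater than 0 while inside script/style
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag == "br":
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace within each line and drop empty lines.
        lines = (" ".join(line.split())
                 for line in "".join(self._chunks).splitlines())
        return "\n".join(line for line in lines if line)


def html_to_text(html):
    """Return the visible text of an HTML string as plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

To use it, paste what “View Source” shows into a file (say saved_page.html, a made-up name), read that file into a string, and pass the string to html_to_text. Expect to tidy up menus, footers and the like by hand afterwards.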

Underground Techniques

By “underground” I mean actually taking steps to actively disable whatever copy protection has been placed on the web page or image.

Two techniques come to mind:

  • Disable JavaScript. Many sites use JavaScript to implement copy protection. Disabling JavaScript, in turn, disables the copy protection completely. (That happened to be the case with the example site. It also disabled a number of popup ads as a bonus.) The easiest way is to use Firefox and the “NoScript” plugin, which allows you to enable or disable JavaScript on a site-by-site basis.
  • Disable or circumvent CSS. CSS, for Cascading Style Sheets, is actually an incredibly powerful approach to defining web page look and feel and behaviour. Using CSS it’s quite possible to disable or modify the way web pages behave. It’s also easy to turn off: in Firefox click on View, then Page Style, and then No Style. The page will be re-rendered without CSS and the result, while typically visually unappealing, may well be copy-able.

There may be other approaches as well, depending on the specific techniques used to disable copying, but those are probably the 95% solution.
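To make those two techniques a little more concrete, here is a hedged sketch in Python that works on a copy of the page you've already saved to disk. It assumes the protection takes two common forms: inline event-handler attributes such as oncontextmenu or onselectstart, and CSS declarations like user-select: none. I haven't confirmed that the example site uses either of these, the attribute list is my own guess and far from exhaustive, and unlock_saved_page is a made-up name. Handlers attached from separate script files aren't touched at all; for those, disabling JavaScript remains the simpler route.

```python
import re

# Inline event-handler attributes often used to block selection, copying,
# and the right-click menu. This list is an assumption, not exhaustive.
BLOCKING_ATTRS = ("oncopy", "oncut", "oncontextmenu", "onselectstart",
                  "ondragstart")

# Matches e.g. oncontextmenu="return false" with double-quoted,
# single-quoted, or unquoted values.
_ATTR_RE = re.compile(
    r"""\s+(?:{names})\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""".format(
        names="|".join(BLOCKING_ATTRS)),
    re.IGNORECASE,
)

# Matches "user-select: none" and vendor-prefixed variants such as
# "-webkit-user-select: none", with an optional "!important" and semicolon.
_USER_SELECT_RE = re.compile(
    r"(?:-\w+-)?user-select\s*:\s*none\s*(?:!important\s*)?;?",
    re.IGNORECASE,
)


def unlock_saved_page(html):
    """Return html with the assumed copy-blocking attributes and CSS removed."""
    html = _ATTR_RE.sub("", html)
    return _USER_SELECT_RE.sub("", html)
```

Run the saved page's HTML through unlock_saved_page, write the result to a new file, and open that file in your browser. Being regular-expression based, it can occasionally clip ordinary text that happens to look like one of these patterns, so treat it as a quick experiment rather than a robust tool.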

Off The Wall Techniques

“Off the wall” as in things that sound really stupid or something you’d never think of, but are last resort measures.

If nothing else they’re proof of my original statement: if it can be seen, it can be copied.

  • Take a picture. Get your digital camera and take a picture of the screen. Instant copy.
  • Take a screen shot. Tools like SnagIt will not only automatically “page down” to get an image of the entire page (in perfect resolution, unlike your camera), but also include a “copy text” option that may well copy text for which traditional clipboard copy has been disabled.
  • OCR. Short for “Optical Character Recognition”, OCR tools can take that “picture” of a web page (ideally the screen shot since it has the best quality) and extract from it all the visible text as editable text.

There are probably more odd and unique ways that I’m not thinking of.

If It Can Be Seen, It Can Be Copied.

I present this not as a “how to” for people wanting to make illegal copies of web sites, or even for people who want to do more acceptable things like share otherwise inaccessible content with others.

My intent here is really to point out the futility of copy protection schemes.

If you must present your information in a way that humans can read, listen to, or watch, then there exists a way for that content to be copied. Placing roadblocks just punishes those who would view or use your content in ways that are, ultimately, only beneficial to you, without stopping those who would steal it anyway.

If someone can see it, they can copy it, forward it, publish it, whatever.

That’s simply the nature of today’s technology.

Not that they should, but they can.

There are 37 comments:

  1. steven Reply

    Why not just press the Print Screen button, paste it into Paintbrush, and then save the file?

    That gets you a picture of the text, but not editable text that you could reformat and perhaps copy into Word or Notepad. (It also fails if the text goes below the bottom of the screen.)

    Leo
    06-Jul-2010

  2. Ted Reply

    Use the printscreen key on your keyboard. Then open Paint and paste. Rename and save in whatever format you desire. You can edit the file later, or keep it entirely, complete with webpage url…

  3. Steve Curling Reply

    Hi Leo,
    I have used Snagit for years and have never found anything it wouldn’t copy including videos.
    Thanks,
    Steve

  4. Douglas Gross Reply

    Turning off CSS may stop JavaScript from using CSS class names and IDs, but if you are already going to turn JavaScript off you don’t need to turn off CSS. Basically, the PDF solution is the best, because the only other choice is to learn how to remove the code for copyright “protection.” I am surprised anyone uses such techniques, because they are very easily circumvented. What the webmasters of such sites need to really do is put such material in a secured PDF, and any amount of information that is more than very brief belongs in a PDF instead of on a web page anyway.

  5. Doclocke Reply

    Anyone who uses either Windows Vista or Windows 7 has a nifty utility called “Snipping Tool” that simplifies copying of any part of what is displayed on the monitor. Try it, you’ll like it.

  6. Nicholas Gimbrone Reply

    In some very real sense, this is exactly the sort of sharing that digital restrictions management is aimed at stopping… text is just the degenerate case of a form of digital media, is all. It is an effort to ensure that only the immediate end consumer of digital data may view that digital data (and perhaps also to add time or count limits on such consumption).

  7. Paul Mierop Reply

    The easiest way I found was to highlight all the Text and copy with Ctrl-C and paste it in a new Word document and make all the changes I want, or just print it.

    The whole point here is that your “Ctrl+C” approach has been disabled by the web page author.

    Leo
    07-Jul-2010

  8. DSU Reply

    With Firefox you don’t need to turn off JavaScript. Simply click on (for Windows) Tools -> Options -> Content -> Advanced and uncheck Disable or Replace context menus.

  9. Jose Prudencio Reply

    There’s a variant for the “copy as” solution. One should copy all the page to a file in a local directory and then open it with a HTML editing program. After this, just select and copy the desired text to paste it elsewhere. I did it a couple of times successfully.

  10. John Reply

    Ashampoo make a program called Snap; I think they are up to version 4 now. What it cannot do is not worth mentioning!

  11. James Reply

    Has anyone noticed “has a machine that only sends/receives plain text”?
    I don’t see that you can do much better than ctrl-A, ctrl-C, ctrl-V, to copy all the text into a text editor.

    Of course, you’ve lost all the formatting, and you have to delete all the stuff you don’t want, but at least you’ve got plain text.

    The point of the article was that CTRL+C for copy had been disabled by the website author.

    Leo
    07-Jul-2010

  12. petrus Reply

    The easiest way to copy anything that you can see on your computer screen is by downloading and using a small program called “Fast Stone Capture”. I use it on a daily basis and have become very attached to it.

  13. Wheatridge Reply

    Leo, you forgot the absolute easiest way to copy that page. All someone has to do is to use that “prtScrn” button that has been on computers since the beginning in the 1970s. Using “Ctrl” + PrtScn, the computer places a copy of the screen into memory. Then open a graphics program and paste that image. Resize, etc. and save it as a .jpg, etc.

    Unfortunately that only gets you a picture of the page, not the text that the questioner was asking for. (And of course if the page is longer than a single screen it doesn’t get the whole thing.)

    Leo
    08-Jul-2010

  14. Charlie Griffith Reply

    This method can also be tried…
    …I’m using Win 7 Ultimate (if that is significant) ….and I copied (Control+C) some letters/comments from a newspaper site into Word, then moved that file into Open Office, and from Open Office I sent it in PDF form to myself as an email.

    Brilliant?….probably not, but it worked….previously the “comments” on that website did not survive emailing, but somehow the insertion of Word and Open Office as middle-steps eliminated any anti-copying measures.

    Cheers….

    This also misses the point of the article, where CTRL+C has been disabled by the website author.

    Leo
    09-Jul-2010

  15. James Reply

    Another easy way is to use a C++ or vbs script to access the email or the web page, there are countless SMTP and HTML libraries out there, and if it’s a simple web page, I even have native C++ and C# code that can do it without the use of a library.

    You can also right click-> view page source, and then extract whatever you can from that. However, CSS and frames can be used to prevent you from getting any useful info out of that.

  16. Lala Reply

    Thanks in advance, in case I need this at some point (but without ill intention).

  17. Steve from Montreal Reply

    As a web developer I get this question all the time “I want to put my pictures/text/pdf files/whatever on the Internet, but I don’t want people to copy them”

    To which I always respond “Then you shouldn’t be putting it on the Internet!”

    In the end we usually compromise for some basic protection from novice users.

    But I will keep this link to demonstrate how many different ways you can get around copy protection.

    (shameless plug, hope it’s OK: http://www.adeointernetmarketing.com/)

  18. allabarra Reply

    Another useful copy mechanism is using ALT + printscreen to copy an open box within a page, without copying the rest of the whole page behind. You can then paste this into whatever other program you use.

  19. Need your help! Reply

    Hey Leo, thank you for your guide, but I still can’t copy text from some sites, for example: moon.vn. They use JS code to load the page, so if I disable JS, it’s unable to load the text inside, too. I used Opera to disable JS after the page is loaded, but they no longer allow Opera. Is there any way to work around this?
    Thanks so much!

  20. Dana Childs Reply

    Snagit…best new software “toy”! Love it! Thanks for the tip! All the other tricks, which I already knew about, did not work…tried Snagit and it worked for getting the information I needed from a webpage. All I wanted to do was print the page so I could have the info in front of me for comparison when I make phone calls. Seems like the site I was looking at is being a little too cautious with protecting their information, not sure why. Like you said…if you can see it, they can copy it and use it. But with regards to Snagit, looks like it may have some other great uses as well. Thanks again!

  21. Kendo Reply

    Most copy protection techniques are not very secure, but there are some that cannot be exploited at all. For example ArtistScope provide a Site Protection System (ASPS) which uses a custom web browser, which unlike all other web browsers, has been designed to protect page media rather than expose it.

    I’m not at all familiar with the technology you mention, but my first reaction: I’ll bet I could copy everything using screen capture, or screen capture over remote access. Basically my position remains: if you can see it, it can be copied.

    Leo
    04-Aug-2011

  22. Ricky Reply

    Thanks dude… Really useful. I have a question: if I disable JavaScript, will the website admin find out that we are using this trick to copy and paste? Please help me… Again, thanks.

  23. James Reply

    I gave up trying to copy protect my material a loooong time ago.
    Too many people today are tech. savvy.

  24. Jon Reply

    I’m trying to do this right now so that I can copy the text of a user licence agreement for some software that I’m downloading. I want to be able to refer to the text afterwards, without having to refer to the website. So this has legitimate uses too!

  25. africancop Reply

    Right-click on the web page and use View Source. The new tab contains the HTML. Copy what you need, then remove the tags.

  26. User888 Reply

    Another way to get around is to Save the Webpage in text format, open the saved file using Notepad and copy the information.

    • Roland Reply

      Java trick worked here too
      Fantastic solution
      I just wanted to copy a recipe for cooking
      No idea why they protected it
      I can hardly bring my computer to the kitchen

      Roland

  27. biscuit Reply

    I tried to block the JavaScript of the website I wanted to copy, but had no success.
    I learned here that websites also use CSS to prevent copying/highlighting of text.
    So I used “Pendule”, an extension for Chrome that can view or disable the CSS of websites.
    Finally, I succeeded in highlighting the text. Now I can copy it… :)

  28. daniel wilianto Reply

    Oh yeah, I can’t fathom why some web programmers are so dumb that they think they can prevent people from copying the contents of their websites. Even now as I speak, many websites do it. So dumb. As if it can achieve anything. As you said, as long as it’s shown to people, it can be stolen. Even when I was just a computer newbie, I could already steal the pictures using the Print Screen button on the keyboard. I think I was still using Win 9x at that time.

  29. simon ezeh Reply

    Hi. Does it mean you can’t crack ASPS, i.e. ArtistScope Site Protection System? Someone said any site protected with it is beyond copying, printing, saving, downloading, etc. Is it true?

    • Leo Reply

      If it can be seen it can be copied in one form or another. Worst case you can take a photograph of the screen, but generally there are techniques that give better results as well. I’m not at all familiar with ASPS.
