As you might expect, the website in question is trying to protect its content from theft. They have valuable information, and I’m sure people try to steal and republish their content frequently. That is quite illegal, and a violation of international copyright law.
I’ll assume that’s NOT what you have in mind. (Though technically, even what you have in mind, while morally acceptable in my opinion, may still be in violation of that law.)
While I’ll answer your question, my real goal here is to point out to site owners just how futile website copy-protection schemes can be.
If it can be seen, it can be copied.
There are several techniques to copy text from websites trying to prevent it, including print to PDF, copying from that PDF, viewing the source of the webpage, disabling JavaScript, disabling CSS, or even taking photographs or screenshots and running those through OCR. Website and other digital-content owners need to realize that if it can be seen, it can be copied.
Above board techniques
By “above board”, I mean using normal website and browser behavior to gain access to text in ways the website owner perhaps hadn’t thought to prevent.
The most common: printing. Specifically, Print to PDF.
The result is a nice PDF of the page. Perhaps that’s enough for you to save. Certainly it has the highest “fidelity”, in that it’ll include all the formatting and images exactly as they appear on the original webpage.
If saving to PDF doesn’t meet your need, it’s possible the PDF is copy enabled. In my test of the website in question, for example, I was able to print to PDF and then select the desired text from the PDF to copy elsewhere.
Another approach is to use File -> Save As…[1] in the browser when viewing the page, and save it “as” plain text. The results will vary from browser to browser, but you’re likely to get a good starting point from which you can copy the desired text.
Yet another approach is to right-click on the webpage and use the “View Source” option available in most browsers. This allows you to view the underlying HTML for the page and copy the relevant content as needed. You’ll have to clean up the results, though, removing the HTML mark-up to make the results readable.
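The clean-up step can be automated. What follows is a minimal sketch, not a definitive tool: it assumes you have pasted the page source into a Python string, and it uses only the standard library's html.parser module. The names TextExtractor and html_to_text are made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []      # text fragments in document order
        self.skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(source):
    """Return the text fragments of an HTML string, one per line."""
    parser = TextExtractor()
    parser.feed(source)
    parser.close()
    return "\n".join(parser.parts)


if __name__ == "__main__":
    sample = "<html><body><h1>Title</h1><p>Some <b>bold</b> text</p></body></html>"
    print(html_to_text(sample))
```

Each run of text between tags lands on its own line, so inline formatting such as bold splits a sentence across lines. That is usually good enough as a starting point for copying.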
Other techniques
Here I mean taking steps to actively disable whatever copy protection has been placed on the webpage or image.
Two techniques come to mind.
- Disable JavaScript. Many sites use JavaScript to implement copy protection. Disabling JavaScript disables the copy protection completely. (That happened to be the case with my example site. As a bonus, it also disabled a number of popup ads.) The easiest way is to use Firefox and the “NoScript” add-on, which allows you to enable or disable JavaScript on a site-by-site basis.
- Disable or circumvent CSS. CSS, short for Cascading Style Sheets, is a powerful approach to defining how webpages look, feel, and behave. It’s also easy to turn off: in Firefox, click on View (you may need to press and release the ALT key to expose the menu bar first), Page Style, and then click on No Style. The page will be re-rendered without CSS and the result, while visually unappealing, may well be copy-able.
Depending on the specific techniques used to disable copying, there may be other approaches.
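The same two ideas can be tried offline on a saved copy of a page. The sketch below is an illustration under assumptions, not how any browser or add-on actually works: it removes script and style blocks, stylesheet links, and a short list of inline attributes that pages sometimes use to block selection or right-click. The attribute list and the function name strip_copy_blocks are assumptions for this example, and the regular expressions are deliberately simple, so they can misfire on unusual markup.

```python
import re

# Assumed, non-exhaustive list of inline attributes a page might use to block
# selecting, right-clicking, or copying. Real pages may use others.
BLOCKING_ATTRS = (
    "oncopy", "oncut", "oncontextmenu", "onselectstart",
    "ondragstart", "onmousedown", "style",
)


def strip_copy_blocks(html):
    """Return the HTML with scripts, stylesheets, and blocking attributes removed."""
    # Drop whole <script>...</script> and <style>...</style> blocks.
    html = re.sub(r"(?is)<(script|style)\b[^>]*>.*?</\1\s*>", "", html)
    # Drop <link ... stylesheet ...> references to external CSS.
    html = re.sub(r"(?is)<link\b[^>]*\bstylesheet\b[^>]*>", "", html)
    # Drop inline event handlers and style attributes (quoted or unquoted values).
    names = "|".join(BLOCKING_ATTRS)
    attr = r"(?is)\s+(?:" + names + r")\s*=\s*(?:\"[^\"]*\"|'[^']*'|[^\s>]+)"
    return re.sub(attr, "", html)


if __name__ == "__main__":
    page = '<body oncopy="return false" style="user-select:none"><p>Text</p></body>'
    print(strip_copy_blocks(page))
```

Opening the cleaned file in a browser gives roughly what “No Style” plus disabled JavaScript would: plain, unattractive, and usually selectable.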
Off-the-wall techniques
“Off the wall” as in things that sound really stupid or something you’d never think of, but are last-resort measures.
They’re also proof of my original statement: if it can be seen, it can be copied.
- Take a picture. Get your digital camera and take a picture of the screen: instant copy.
- Take a screen shot. Tools like SnagIt will not only automatically “page down” to get an image of the entire page (in perfect resolution, unlike your camera), but also include a “copy text” option that may well copy text for which the traditional clipboard copy has been disabled.
- OCR. Short for “Optical Character Recognition”, OCR tools take an image of a webpage (ideally the screenshot, since it has the best quality, but possibly also the photo) and extract all the visible text as editable text.
There are probably more odd and unique ways I’m not thinking of.
If it can be seen, it can be copied
Like I said, this isn’t intended as a “how to” for people wanting to make illegal copies of webpages, or even for people who want to do more acceptable things, like share otherwise inaccessible content with others.
That it turns out to be one, however, underscores my real point: copy-protection schemes are pretty futile. If you present your information in a way that humans can read, listen to, or watch, then there are ways for that content to be copied.
Placing roadblocks only punishes the innocent. It puts barriers in the way of those who would view or use your content in ways that are only beneficial to you, without really stopping those who are determined to steal it anyway.
If someone can see it, they can copy it, forward it, publish it, whatever. Not that they should, but they can.
That’s simply the nature of today’s technology.
Footnotes & References
1: If present. Edge doesn’t seem to have it. Also note that it may have moved in recent browsers to a sub-menu of the ellipsis (…) menu, and may be called something else similar, like “Save page as…”. Gotta love consistency.
Why not just press the Print Screen button, paste it into Paintbrush, and then save the file?
06-Jul-2010
You’re encouraging illegal activities which constitute a violation of international copyright law
You’re assuming what’s being copied is copyrighted. That’s not always the case. I’ve also seen cases where it’s the copyright owner that needs to do the copying.
You can, then, run an OCR program like Free OCR on the resulting image file and convert the image to text. The text file would need editing as OCR is never perfect. I used ABBYY Fine Reader for a month once when a trial version came with a scanner. It does a much better job but it’s still not perfect.
ABBYY Fine Reader contains a Screenshot Reader with optional OCR which is invaluable for extracting text from digital images such as photos of interpretation boards. The company sometimes give it away as a standalone product in marketing exercises; they are currently offering it on their site as a free trial or for a pretty reasonable €9.99.
I often have problems with printing websites to pdf, particularly after the first page and sometimes find online tools, such as html-to-pdf, work better.
For those sites which can be copied, I’d love to know how to make Word accept all the images instead of having to copy them in via Paint.
Can you post a link to the Fine Reader offer?
This usually works
http://www.pdfmyurl.com
Hi Leo,
I have used Snagit for years and have never found anything it wouldn’t copy including videos.
Thanks,
Steve
Turning off CSS may stop JavaScript from using CSS class names and IDs, but if you are already going to turn JavaScript off you don’t need to turn off CSS. Basically, the PDF solution is the best, because the only other choice is to learn how to remove the code for copyright “protection.” I am surprised anyone uses such techniques, because they are very easily circumvented. What the webmasters of such sites need to really do is put such material in a secured PDF, and any amount of information that is more than very brief belongs in a PDF instead of on a web page anyway.
Anyone who uses either Windows Vista or Windows 7 has a nifty utility called “Snipping Tool” that simplifies copying of any part of what is displayed on the monitor. Try it, you’ll like it.
The easiest way I found was to highlight all the Text and copy with Ctrl-C and paste it in a new Word document and make all the changes I want, or just print it.
07-Jul-2010
With Firefox you don’t need to turn off JavaScript. Simply click on (for Windows) Tools -> Options -> Content -> Advanced and uncheck “Disable or replace context menus”.
There’s a variant for the “copy as” solution. One should copy all the page to a file in a local directory and then open it with a HTML editing program. After this, just select and copy the desired text to paste it elsewhere. I did it a couple of times successfully.
Ashampoo make a program called Snap; I think they are up to version 4 now, and what it cannot do is not worth mentioning!
I use Ashampoo Snap, myself. They’re up to version 11. I got it on sale for 11 bucks. Ashampoo has a few good programs. Just wait for them to go on sale as the retail price is much higher.
The problem with that kind of program is that it only takes a picture of your screen and you’d have to run it through an OCR program to extract the text.
Has anyone noticed “has a machine that only sends/receives plain text”?
I don’t see that you can do much better than ctrl-A, ctrl-C, ctrl-V, to copy all the text into a text editor.
Of course, you’ve lost all the formatting, and you have to delete all the stuff you don’t want, but at least you’ve got plain text.
07-Jul-2010
The easiest way to copy anything that you can see on your computer screen is by downloading and using a small program called “FastStone Capture”. I use it on a daily basis and have become very attached to it.
Leo, you forgot the absolute easiest way to copy that page. All someone has to do is use the “PrtScn” button that has been on computers since the beginning in the 1970s. Pressing Ctrl + PrtScn places a copy of the screen in memory. Then open a graphics program and paste that image. Resize, etc., and save it as a .jpg, etc.
08-Jul-2010
If it’s text the questioner needs:
Using the PrtScn key formula:
Procedure (Windows environment):
1) Press Ctrl + PrtScn to copy the screen to the clipboard, or wherever. What matters is, you’ve got it captured!!
2) Open Run.
3) Type mspaint and press Enter. MS Paint will open (I’d be surprised if you expected a different program).
4) Paste the image.
5) Save the image (you can name it after me: WebExtremist).
If it’s an image the questioner wants, END!!
To get text:
1) Open the saved image using Microsoft OneNote.
2) Right-click the image (inside OneNote). There it is: “extract text from image”.
If the page is too long, scroll and take several shots, and repeat the process in a loop while shots remain.
Hello Mr. Web…
I’m not getting how to open the image inside OneNote.
Can you please explain it?
It may be that you’re using the OneNote MS Store App instead of the OneNote component of MS Office. The Store App doesn’t have the “extract text” option.
This method can also be tried…
…I’m using Win 7 Ultimate (if that is significant) ….and I copied (Control+C) some letters/comments from a newspaper site into Word, then moved that file into Open Office, and from Open Office I sent it in PDF form to myself as an email.
Brilliant?….probably not, but it worked….previously the “comments” on that website did not survive emailing, but somehow the insertion of Word and Open Office as middle-steps eliminated any anti-copying measures.
Cheers….
09-Jul-2010
Another easy way is to use a C++ or vbs script to access the email or the web page, there are countless SMTP and HTML libraries out there, and if it’s a simple web page, I even have native C++ and C# code that can do it without the use of a library.
You can also right click-> view page source, and then extract whatever you can from that. However, CSS and frames can be used to prevent you from getting any useful info out of that.
As a web developer I get this question all the time “I want to put my pictures/text/pdf files/whatever on the Internet, but I don’t want people to copy them”
To which I always respond “Then you shouldn’t be putting it on the Internet!”
In the end we usually compromise for some basic protection from novice users.
But I will keep this link to demonstrate how many different ways you can get around copy protection.
(shameless plug, hope it’s OK: {URL removed}
Another useful copy mechanism is using ALT + printscreen to copy an open box within a page, without copying the rest of the whole page behind. You can then paste this into whatever other program you use.
Snagit…best new software “toy”! Love it! Thanks for the tip! All the other tricks which I already knew about did not work…tried Snagit and it worked for getting the information I needed from a webpage. All I wanted to do was print the page so I could have the info in front of me for comparison when I make phone calls. Seems like the site I was looking at is being little too overly cautious with protecting their information, not sure why. Like you said…if you can see it, they can copy it and use it. But with regards to Snagit, looks like it may have some other great uses as well. Thanks again!
Most copy protection techniques are not very secure, but there are some that cannot be exploited at all. For example ArtistScope provide a Site Protection System (ASPS) which uses a custom web browser, which unlike all other web browsers, has been designed to protect page media rather than expose it.
04-Aug-2011
Thanks, dude… Really useful. I have a question: if I disable JavaScript, will the website admin find out that we are using this trick to copy and paste? Please help me… Thanks again.
@Ricky
There should be no problem. The website administrator can’t find out who you are. Even if they could get a hold of your IP address, there’s not much they could find out about you.
http://ask-leo.com/what_can_people_tell_from_my_ip_address.html
I gave up trying to copy protect my material a loooong time ago.
Too many people today are tech. savvy.
I’m trying to do this right now so that I can copy the text of a user licence agreement for some software that I’m downloading. I want to be able to refer to the text afterwards, without having to refer to the website. So this has legitimate uses too!
Another way to get around is to Save the Webpage in text format, open the saved file using Notepad and copy the information.
I tried to block the JavaScript of the website I wanted to copy, but with no success.
I have learned here that websites also use CSS to prevent copying or highlighting of text.
So I used “Pendule”, an extension for Chrome that can view or disable the CSS of websites.
Finally, I succeeded in highlighting the text. Now I can copy it… :)
Hi. Does it mean you cant crack ASPS i.e ArtistScope Site Protection System? Someone said any site protected with it is beyond copying, printing, saving, downloading, etc. Is it true?
If it can be seen it can be copied in one form or another. Worst case you can take a photograph of the screen, but generally there are techniques that give better results as well. I’m not at all familiar with ASPS.
My method’s very slow, but good for copying paragraphs of text, or if you don’t want to download any extensions- right click the text, select “Inspect element,” then find the text inside the body of the code, which will be inside … codes. Double click them and the text will show (I have only tried this in chrome so it may differ in other browsers). Or you could possibly delete the codes used to make it unable to be copied?? I haven’t figured this out yet though.
Hi, Another way under chrome is to use “Save text to google drive” extension. Best regards from Paris.
Using “no style” on firefox method worked great for me, thanks a lot
Thank you so much for the help, your work is very appreciated, God bless you !
Actually, I have a question: I am working online, but I am not getting the copy-and-paste option. Can you please tell me how I can copy and paste so that I can complete my work as soon as possible? If anybody can help with my query, please do.
If you want to protect your pdf files, you can use . With this tool you can enter DRM restrictions to your pdf file. It is pretty awesome
And people can still screen-shot it page by page.
Also, various people have also created programs that decrypt and display the password for a PDF file, enabling the removal of all DRM.
It just worked like a charm.
Save the web page as text data… then just open the text data and edit it. :) It is very simple.
In the search engine or anywhere else, right-click on the website and choose Print, and you will be able to copy from there.
PRINT TO FILE!
NOT ALWAYS AVAILABLE! DOESN’T ALWAYS PRINT TEXT!
Windows 10 and later has a “Save as PDF” option under “Printers”, so you no longer need a third party PDF printer.
Thank you! I had a mobile Facebook page post that wouldn’t select the text. Weird. Some glitch, Selecting print with the PDF Creator but not printing, makes the text show up there as selectable.
Yes, the View – “No Style” method in Firefox is perfect and the fastest. Thanks.
I found that sometimes the inability to copy text can be remedied by clicking on the Compatibility icon in Internet Explorer (or on Tools and then Compatibility View Settings), depending on the version of the browser.
I was having my PC scanned to detect problems. The online scan examined my computer and came up with technical information about my PC that I did not know. I would like to be able to copy and paste my PC’s information into MS Word. The scan was done online, automatically, by Reimage, which is a part of Major Geeks.com. I am not able to highlight or copy my own computer’s system information. I have asked Reimage three different times, via e-mail, over the past week to inform me how I can get a copy of all my PC’s information. They have not even been polite enough to respond to my inquiries. They were quick to take my money. I have tried turning off JavaScript, and I have used “Ctrl + U”, the F12 key (in Firefox), and “Ctrl + F”. None of them had any effect. Would you know how to get around their protection of my PC’s information? Thank you.
Reimage is not part of Major Geeks. You probably clicked on an ad on their page. It’s a similar situation to this article.
https://askleo.com/whats_the_difference_between_an_ad_and_your_recommendation/
If the program doesn’t allow you to copy and paste its information, your alternative is to take a screenshot.
https://askleo.com/whats_a_screen_shot_and_how_do_i_make_one/
I tried to go to the Reimage website but was blocked by Malwarebytes with this message:
Malwarebytes blocked a suspected bad URL or an unwanted program.
In general, Leo recommends against registry cleaners and boosters:
https://askleo.com/whats_the_best_registry_cleaner/
Can anyone copy part 2 of the post from the site http://securebit.xyz?
The only methods I know are OCR or taking a photo. With OCR, one has to manually check the content for recognition errors, which is more tedious than typing the content, and the latter method, including screenshots, outputs non-editable content. So those methods are not really copying.
Oh, and I forgot to mention that OCR, screenshots, the Snipping Tool, etc. could also be rendered useless. I have developed a script for that too. But the above webpage does not block OCR or screenshots.
It’s an interesting technique that certainly seems to make plain old copy/paste difficult. I didn’t look at it too closely, but from what I can see … all I’d need to do is a simple cipher substitution based on the custom font you’re using. (But the key seems to be that font.) While it would keep the casual user from copy/pasting, it does NOT *prevent* copy/pasting, or the results from being able to be restored to their un-“encrypted” form. I believe.
My claim is that screenshots cannot be blocked. (There are ways around every block I know of.) And with a nice high-resolution image you could get from a screen shot, OCR would be extremely accurate.
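As a rough illustration of the substitution point in the reply above: if a custom font simply draws one character in place of another, then copied text can be mapped back with a lookup table. The sketch below uses a made-up scrambled alphabet; the real mapping would have to be worked out from the font in question, and nothing here reflects how that particular plugin actually works.

```python
PLAIN = "abcdefghijklmnopqrstuvwxyz"
# Hypothetical mapping for illustration only: the character at each position
# is what the copied text contains where PLAIN's character is displayed.
SCRAMBLED = "qwertyuiopasdfghjklzxcvbnm"


def build_unscrambler(scrambled_alphabet, plain_alphabet):
    """Build a translation table from scrambled characters to readable ones."""
    if len(scrambled_alphabet) != len(plain_alphabet):
        raise ValueError("alphabets must be the same length")
    return str.maketrans(scrambled_alphabet, plain_alphabet)


def unscramble(text, table):
    """Map copied (scrambled) text back to what the reader sees on screen."""
    return text.translate(table)


if __name__ == "__main__":
    table = build_unscrambler(SCRAMBLED, PLAIN)
    print(unscramble("itssg", table))
```

Characters outside the table, such as spaces and punctuation, pass through unchanged. Changing the mapping per page means rebuilding the table, which is more work but the same idea.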
Yes, but the font could be scrambled randomly for each post. A simple script could be used for the substitution, even though I did it manually.
Random scrambling of the font at page render time could also be done, or a random key could be assigned to encrypt instead of simply replacing a character.
As mentioned earlier, the whole point of copying content is to paste it somewhere else with minimum effort. Deciphering each font is far more difficult than typing the whole content by hand.
I think all the other methods to date could be easily disabled in a few mouse clicks.
Regarding screenshots, I can say that you could only use a screen recorder and not an image. It’s a simple script combining 3 or 4 known methods.
A screenshot could be taken with a custom app that overcomes the limitation current software has.
All these methods combined will prevent 90% of content theft without compromising SEO.
I’m currently outside of my home country. When I get back I could demo the screenshot prevention script. I tested it with tools like the Snipping Tool and a bunch of other tools whose names I don’t remember. But you could still record the screen and take a snap from that.
90 to 95% of people who copy content would not want to go through all these processes, as they are lazy.
I would like to know how a veteran like you rates my idea on practical grounds.
Thanks
I hope my arguments were not offensive. The reason I came to this website to argue with you guys is that I want to improve the script to make it as good as possible. I’m aware of the method you already mentioned. I wanted to know whether there were more methods to defeat the logic.
I think my biggest objection is that you’re claiming to “prevent copy/paste”. That is not the case. You ARE making it harder, which can be valuable, particularly in a classroom setting where people are attempting to cheat. But it does not prevent copy/paste at all. (Thinking about it a little more whatever algorithm you come up with could be reverse engineered and an enterprising student could set up an un-scrambling page for his entire class. So the process to copy paste for lazy students would be copy from the original, paste into the unscrambler, and then copy from there.)
And if there is enough motivation — say someone tries to use this plugin to prevent copy/pasting of something both large and valuable — they, too, have incentive to perform the various work arounds discussed.
Screen shot: no recorder needed. Just remote desktop from one machine to another. Bring up the page on the remote machine, and then screen shot on the local.
Regarding remote desktop, I think there is no possible way to prevent copy and paste. But I doubt anyone would do that in the first place, since he would have to do it for all the posts on a website with loads of active users. It’s easier to just type it out, I think. Taking a screenshot via RDP would be hard if the content is lengthy.
Reverse engineering is possible, but in the real world, if the encryption method or keys change, would it be practically possible?
Every hash can be cracked, but that doesn’t mean every password is cracked. Brute-forcing a hard password could give you the result in years, but is that practically done?
What is not practical should be considered impossible at least for one or two generation (wink).
I disagree. Consider the classroom scenario… you have 30? 100? people tasked with typing what should not be copy/pasted. All it takes is one enterprising entrepreneur to reverse engineer it and provide it as a service to his classmates.
As for RDP screen shots — particularly when it’s long it’s easier to make a series of screen shots (snip, page down, snip, page down, etc….) than it would be to re-type it all. So again we disagree.
Again, you’re making it more difficult, but you are not preventing it. Anyway, we’re going around in circles. No more on this topic.
I was able to take a screenshot. Apparently the script doesn’t block third party screenshot software.
As I mentioned earlier, the website only blocks copying from source, using any devtools, or disabling JavaScript. The site was put up to showcase this ability. I said that I can add a small script to it to render screenshot software or key combinations useless.
Since it was a demo for a WordPress plugin, I did not add this extra feature, which could be added on demand.
Mark, did you find any other way to copy the content? Screenshots are not editable. Even if you OCR it, there may be errors, or I could even develop a custom font to defeat the OCR.
The whole point of copying is to make the job easy. So OCRing is not really copying. It’s pretty useless to copy the content, add a screenshot, and post it somewhere else.
Please let me know if you find a real method to copy the content.
I did read each word in article and comments. Just the two most objectionable copy protection cases: 1) Tiko, August 22, 2017, on software tools that offer free run – just for selling purpose, but do not allow saving their reports; and 2) Jon, March 8, 2012, the inexcusable case: software products that are regularly paid, present you a contract text you must agree – ‘sign’ – to be allowed to use what you have already paid, but the supplier makes it very, very difficult for you to save a copy of the contract you obligated yourself to‼! I cannot agree they are excusable by any means! Of course, as Leo goes to the point, you can ‘print screen’, even long texts (a number of screens), run an OCR, and get it. And they may have used even other resources besides all those to copy protect just web contents. Please, Leo, do you have any other suggestion for this case (number 2)? By the way, on smartphones, there is the “Universal Copy app”, working on Android. But I did not find anything similar for Windows (mine is 7, for a few days yet!).
For #2: stop using (and paying for) the software if you object to their terms.
If they use real copy protection like the ArtistScope Site Protection System (ASPS) all the methods mentioned here will be useless. Only way will be to take a photo of the computer screen from a remote camera. Even the media links are safe from packet sniffers and data cannot be extracted from memory.
If it can be seen, it can be copied.
Yandex was developed in Russia, it is still available, and it is fabulous. Yes, you can put a URL in the icon and reveal many sites. Go ahead and copy; I read Pinterest and other restricted sites regularly.
This looks like a good place to insert my similar question, and that is how to copy or download a video from a site where it appears to be no longer possible. I have for many years now taken some very expensive online courses. The videos were left on the Digital Chalk website so I could always go back. Just in case, I would always download a copy using the Video Download Helper browser add-on. But then they said that they would only leave the videos up for a year. And then it became increasingly difficult to download them using the plug-in. It required many tries to get a successful download and now it does not work at all. I have tried various programs, always looking for a free one, and even tried a paid one I think called Snag-it. The best I have been able to do is copy the video but with no sound, using my Thinkpad X1 Carbon. Any suggestions? I don’t feel like this is stealing because I have paid and paid a lot for these videos.
First, I’d reach out to the video’s owner with your conundrum. They may have a supported approach.
The brute-force way that any video can be captured is to simply use a screen recorder capable of also recording system audio. It’s today’s equivalent of using one VCR to record another. It works in real time — meaning a one hour video takes one hour to record.
Again, I do this not so as to encourage piracy (technically while what you are asking to do feels moral, it may still be illegal, which is why I’m not mentioning specific tools), but rather to point out to those who post videos, for example, that they’re still copy-able one way or another.
Thanks for your very fast reply, Leo. I think I have been following you for 20 years. I have even used a video camera to film my granddaughter dancing in world championship competitions on the internet. So I will try anything LOL
I’m surprised nobody has mentioned Nirsoft’s “cache view”
I’ve been using it for years.
Captures txt or pictures, probably video too.
Also using Ctrl U seems to work for txt. Although you “have to” remove all the scripts and other bits of code to tidy it up. I use it for recipes that have it disabled.
From my browser I use an extension from http://www.emailthis.me.
I find a page I like, then I just email it to myself to archive for future action. I find that, if I need it in editable form, the contents of the emailed version can be copied and pasted into, say, Word for future action.
Whenever I research a topic, I do a lot of small-scale copying and down-editing, often small snippets or even just single words or names to be sure to have the right spelling. I get rather angry with website owners who try to block this, mainly because I don’t see how they achieve anything other than making their websites unfriendly to casual users. The blocking seems impolite, inconsiderate, and thoughtless towards your potential client or customer, or people interested in your message. My small revenge is usually to avoid linking or referencing these web pages, unless they are somehow essential (which is rarely the case). The search engines ought to do the same: simply downgrade these pages. They make themselves less valuable.
Great article! I really like your mentality. I need computer help. Do you do remote computer work?
Sorry, we don’t do remote computer work.