
How Do I Copy a Webpage? Four Approaches and an Interesting Insight

How do I copy an entire webpage? I copy and paste, but not everything appears as I see it. For example, I’m copying and pasting a bank statement to Word, but portions of the page appear empty.

It depends on exactly what you’re trying to do. What is it that you intend to do with the result?

There are several approaches.

None of them are what I’d call clean, but depending on your goal, one or more of them might work for you.


Saving a webpage as a PDF, or printing it to paper, gives you the highest-fidelity copy. Copy/paste into another application, like a word processor, can work, but it often loses formatting, so it may be good only for portions of the page. Clipping tools like Evernote are good alternatives and often provide several approaches.

Print or save to PDF

If all you’re attempting to do is save a copy for your records, this is my top recommendation.

I do it myself for my banking records. I visit my bank’s website every month, display the statement, and then “print” it to a PDF file, which I then save.

PDF is perfect for several reasons. PDF files are easy to produce, and the format is so ubiquitous you probably already have a reader.

While viewing the webpage you want to copy, type CTRL+P to print, and then, depending on your system or browser, select “Save as PDF”, or perhaps “Print to PDF” as the “Printer” to print to.

Saving an Ask Leo! article as a PDF in Microsoft Edge.

It’ll ask you what to name and where to save the file. The result will be a PDF of the webpage saved for your records.
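If you would rather script this step than press CTRL+P every month, here is a minimal sketch of one possible approach. It assumes a Chromium-based browser is installed and reachable under the executable name "chrome" (that name is an assumption; on your system it may be "chromium", "msedge", or a full path), and it relies on the Chromium command-line switches --headless and --print-to-pdf. The function names are hypothetical. Note that a headless browser starts without your login session, so this is only likely to work for pages that don't require signing in, which rules out most bank statements.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdf_command(browser: str, url: str, output: Path) -> list[str]:
    """Build a command line asking a Chromium-based browser to print a URL to PDF."""
    return [
        browser,
        "--headless",                 # run without opening a window
        f"--print-to-pdf={output}",   # write the rendered page to this file
        url,
    ]


def save_page_as_pdf(url: str, output: Path, browser: str = "chrome") -> bool:
    """Try to save `url` as a PDF; return False if the browser cannot be found."""
    executable = shutil.which(browser)  # assumption: the browser is on PATH
    if executable is None:
        return False
    subprocess.run(build_pdf_command(executable, url, output), check=True, timeout=60)
    return output.exists()
```

Calling save_page_as_pdf("https://askleo.com/3034", Path("article.pdf")) would attempt to produce article.pdf in the current folder. The result is roughly what "Save as PDF" in the print dialog gives you, minus any choices you would normally make there.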

Print to paper

It’s probably not what you’re looking for, but it should be said. Sometimes for archival purposes, hard copy is the way to go.

Some HTML pages print differently than they appear on screen. This is controlled by how the webpage was designed. If you print this page, for example, items such as the advertisements and menu bar will not be printed. Ideally, printing will give you useful (but not necessarily identical) results. If your print function provides a preview, check it out to save yourself paper. Printing this page, for instance, requires 25 pieces of paper or more, because it includes the comments.

Copy/Paste

There are several approaches to copy/pasting a webpage, but there’s almost no chance of getting exactly what appears in your browser.

Depending on the page design and program you’re pasting into, many elements will not copy over, or will copy over differently. Consider that the same exact page viewed in two different browsers looks slightly different. You’ll see the same page, but not the same exact results. If different browsers, which are specifically designed for viewing webpages, don’t display identically, then the chances of other programs (such as Word) doing so are basically zero.

In your browser, copy the entire webpage by doing this:

  • Click anywhere within the webpage you want to copy.
  • Type CTRL+A to select everything on the page.
  • Type CTRL+C to copy that selection to the clipboard.
  • Switch to Word (or your word processing program of choice).
  • Type CTRL+V to paste.

If you do this with, say, the example article I used above, you’ll see it looks very different and quite wrong.

Article pasted into Word online.

It includes things like the menu (much of which is hidden on screen, but not hidden from copy/paste), and much more you likely don’t want.

You can, of course, then edit that result down to whatever you want, but that can be a chore.

Instead of CTRL+A, you might consider selecting only the portion of the webpage you want. Here’s the result of copy/pasting only the article itself:

Portions of a webpage copy/pasted into Word.

It’s certainly better, but far from perfect.

Once again, your exact results will vary dramatically depending on how the original webpage was constructed. It may be reasonably close, or it may result in a jumbled mess.

In general, copy/paste is a reasonable approach when you want to save only a portion of text from a webpage. Various limitations make it less than ideal for trying to save the entire page.

Clipping tools

Some programs include what they refer to as “clipping tools” designed to clip content from webpages onto the clipboard.

Evernote is the tool I use for this purpose. Using its provided browser add-on, when you tell it to clip a page, it presents a list of different approaches to try.

Evernote snipping a webpage.

Using the different options, you can see exactly what the clip will look like before Evernote saves it. Yet again, depending on how the page was designed, different approaches to clipping yield different results. It’s easy to try different ones to see which get the results you want.

In Evernote’s case, it then saves the result in a note, which may be enough, or may be something you can then copy/paste elsewhere, print, or export using Evernote’s built-in tools.

Clipping tools like this are often the best option, since they’re designed to do something close to what you’re looking for, and can attempt to understand the nuances of the webpage they’re operating on.

Save HTML?

For the heck of it, right-click on the page you want to save and click on View page source. You’ll see a jumble of arcane HTML and other code that represents the webpage.

View Source on a webpage.

You could save this. It would be a canonical, exact copy of the webpage.

Except, of course, it’s not. It includes references to many other files you normally don’t see or care about: the style sheets, fonts, and pictures used to build the page when displayed on screen. None of those are included in the source; only the references to them.
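To make that concrete, here is a small sketch that lists the external files a page's source points to. It uses only Python's standard html.parser module. The sample HTML, the class name, and the choice of which tags to watch are all made up for illustration; real pages reference files in more ways than these three.

```python
from html.parser import HTMLParser


class ReferenceCollector(HTMLParser):
    """Collect the URLs of separate files that an HTML page points to."""

    # (tag, attribute) pairs that commonly reference separate files.
    WATCHED = {("link", "href"), ("img", "src"), ("script", "src")}

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (tag, name) in self.WATCHED and value:
                self.references.append(value)


def list_references(html: str) -> list[str]:
    """Return referenced file URLs in the order they appear in the source."""
    collector = ReferenceCollector()
    collector.feed(html)
    collector.close()
    return collector.references


# Hypothetical page source: the text is in the HTML, the other files are not.
sample = """<html><head>
<link rel="stylesheet" href="style.css">
<script src="app.js"></script>
</head><body>
<img src="logo.png" alt="Logo">
<p>The statement text lives in the source itself.</p>
</body></html>"""

print(list_references(sample))  # ['style.css', 'app.js', 'logo.png']
```

Every item in that list is a file the saved source would still need to fetch from the original site in order to look right.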

Another approach in some browsers is a “Save page as…” function. Its location varies in different browsers; in Edge and Chrome it’s off the “More tools” menu.

Save page as…

This saves the raw HTML to disk, and also saves the supporting files it references into a sub-folder.
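As a rough illustration of what "saving the supporting files into a sub-folder" involves, the sketch below works out, for each referenced file, the full address to download it from and a local path to store it under. It is not how any particular browser implements the feature. The folder name "page_files", the function name, and the example URLs are all assumptions, and the sketch ignores complications such as two files sharing the same name.

```python
from pathlib import PurePosixPath
from urllib.parse import urljoin, urlparse


def plan_local_copies(page_url, references, folder="page_files"):
    """Map each referenced file's absolute URL to a path inside a local sub-folder."""
    plan = {}
    for reference in references:
        # Relative references are resolved against the address of the page itself.
        absolute = urljoin(page_url, reference)
        # Use the last path segment as the local file name ("index" if there is none).
        filename = PurePosixPath(urlparse(absolute).path).name or "index"
        plan[absolute] = f"{folder}/{filename}"
    return plan


plan = plan_local_copies(
    "https://example.com/articles/post.html",
    ["style.css", "/img/logo.png", "https://cdn.example.net/fonts/body.woff2"],
)
for url, local_path in plan.items():
    print(url, "->", local_path)
```

A complete tool would also download each file and rewrite the references in the saved HTML to point at the local copies.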

“View source” or “Save page as…” can be valuable to locate the plain text behind a webpage — say hidden or hard-to-read text — but you’d have to wade through or search the HTML to find it. Outside of that, it’s only valuable to those trying to figure out why their webpages aren’t displaying properly.
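If you do find yourself wading through saved HTML for a piece of text, a few lines of code can do the searching for you. This is a minimal sketch using Python's standard html.parser module. The class and function names and the sample markup are hypothetical. It skips the contents of script and style blocks but does not evaluate CSS, so text hidden on screen still shows up, which is the point here.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text found in an HTML document, ignoring script and style blocks."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def find_text(html: str, needle: str) -> list[str]:
    """Return every text chunk in the HTML that contains `needle`."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return [chunk for chunk in extractor.chunks if needle in chunk]


# Hypothetical saved source with an amount that is hidden on screen.
sample = (
    "<style>p { color: red }</style>"
    '<p>Balance: <span style="display:none">$1,234.56</span></p>'
    "<script>var x = 1;</script>"
)

print(find_text(sample, "1,234"))  # ['$1,234.56']
```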


Posted: May 29, 2020 in: Web Browsers
This is a major update to an article originally posted May 24, 2007
Shortlink: https://askleo.com/3034


Leo Who?

I'm Leo Notenboom and I've been playing with computers since I took a required programming class in 1976. I spent over 18 years as a software engineer at Microsoft, and after "retiring" in 2001 I started Ask Leo! in 2003 as a place to help you find answers and become more confident using this amazing technology at our fingertips. More about Leo.

52 comments on “How Do I Copy a Webpage? Four Approaches and an Interesting Insight”

  1. Vista and up-to-date XP machines have a third option: the XPS printer. Click File > Print > Printer Setup (how you get there depends on the program) and select the XPS printer as your printer. This is MS’s answer to PDF.

    That said, I haven’t used it myself.

    • It’s now 2020. The world has moved on… and unfortunately this year not in a good way.

      But, saving a webpage to pdf is astoundingly easy and quick.

      Use Opera browser. Right click the page and under the save as there is a save as pdf. Click, bang, DONE.

      The single biggest time saver of anything computer wise to date for me. And it keeps it as an absolutely faithful rendition of the page… but, there’s always a but, ensure that the page doesn’t have somewhere down where you click a “show more +” thingy to… well show more. A totally retrograde so called innovation as far as I’m concerned.

      To see what I mean by that go to {link removed} and click on any of the listings that show up. Pity I can’t add a screenshot to this post. And yes, Leo will have very good reasons for not allowing that I’m sure.

  2. Leo, you missed out taking a screenshot – Ctrl+Shift+Print Screen. That gets a faithful representation of (most of) what’s on the screen. Problematic, of course, if the webpage goes off the bottom or side of the screen, but nothing’s perfect.

  3. The easiest thing to do is use Firefox, and then install the extension that allows you to do it (I forget the name right now, but I know there is one). Besides, Firefox is a million times better than IE will EVER be. 🙂

  4. Another way:

    Copy the Web site address (URL), open a word processor (Ex. MSWord or Writer from OpenOffice.org; it may work with other processors).

    In the File menu, select Open, and paste the URL in the line where you type the name of the file that you want to open.

    Press Enter or Accept and wait.

    The word processor will open the site.

    In MSWord you have to “break the links” or something like that. I think that option is in the Edition menu.

    I believe this works especially with .html files.

    Thanks Leo for your site!

  5. Try the program Net Snippets. I use it a lot, in particular to save copies of web pages showing my receipts for things I have just purchased off the web. The program even makes a cute “snatching” sound as it snatches the content right off the screen and into a nice format that can be easily categorized, organized, searched, etc. It comes with a toolbar, and one of the buttons on the toolbar is “Add Entire Page” which, as you guessed, copies the entire page for you. Check it out.

  6. The best way to save a web page is to “save as” “Web Archive, single file, *.mht”. This is much better than using .htm or .html as it does not need to create sub directories and is much more compact. Firefox has an add-on called “Mozilla Archive Format” which saves as *.maf, which can be opened in either Firefox or IE. However, it is only available for Firefox 1.5x; it will work in version 2.x, but it has to be modified, which is a bit tricky to do.

  7. You can obtain a screenshot right down to the bottom of a web page even if it goes off the screen: use FastStone Capture, a great piece of freeware from http://www.faststone.org. SnagIt also does scrolling screenshots, but you have to pay for it.

  8. I enjoyed the article and downloaded the pdf creator which I had not known about — sounds handy, and I might go to paperless bank statements now also.

    However, I was curious why you didn’t mention the Save as Archive in IE (mht extension). That seems to get an exact copy of the page without the fuss of a separate file that contains the graphics, etc. I used to use it a lot, though now that I am using Firefox I don’t have that ability anymore.

  9. Now that I downloaded PDF Creator, I just tried to save this page using it. All it got was down to the first paragraph of the copy/paste portion. It also didn’t save ads, but that didn’t hurt my feelings. However the lost information would have bothered me. I think saving to the archive is safer!

  10. A great subject, a mini-tutorial reply by Leo, and useful, informed comments. Another simple consideration when using copy and paste from a web page, especially at your bank’s site: highlight the material you want to save and type Ctrl+C, then go to the page where you’re going to place the info, click Edit > Paste Special, and choose “Unformatted Text”. This places the saved info on your page without any of the annoying formatting from the web page, which lets you format it to match your existing font, color, size, etc., or just leave the copied results as they are.

  11. For those mentioning “mht”, or any other bundled-archive format: my concern is future compatibility. What tools can be used to read those formats, and will they really be around, or will they be everywhere and on every platform?

    PDF has become such a de facto standard for document production and archival that it seems the safest. I can read it on pretty much any machine and any OS, even my phone. And I expect it to remain viable for a long, long time.

    That being said, MHT and others are certainly viable alternatives as well if you’re comfortable with them.

    Leo

  12. I’m surprised no one’s mentioned HTTrack: “HTTrack is a free (libre/open source) and easy-to-use offline browser utility. Source code is available for Windows and Linux/Unix/BSD.”

    The website is here http://www.httrack.com/.

    Regards
    Stuzz

  13. Thanks Leo. Very informative article.
    Nevertheless, I seldom save to PDF. I find it too restrictive.
    It’s also as proprietary to Adobe as the .doc format is to Microsoft.
    Personally I don’t believe that the PDF format will survive, nor will the M$ doc format.
    There is a movement afoot to get away from proprietary formats for which royalties have to be paid.
    One example would be the Open Document format.
    Anyway, my preferred format for now is mht most of the time.
    Then html, if the page contains elements (pictures, etc.) that I may want to save separately.
    It’s easy to “lift” them out of an html file.
    Another advantage is that most browsers open an html or mht file.
    They will not open a pdf, unless you have the plugin.
    Additionally, most word processors from various companies can open htm, html and mht formats.

    Some programs were mentioned for saving web content.
    I’m sure the posters wrote this in all innocence,
    but NetSnippets is no longer available and FastStone Capture went shareware 2 days before the comment was posted.
    http://www.faststone.org/index.htm
    Regardless, Capture is still a great screen capture program.
    I also use easyWebSave from
    http://www.easywebaction.com/en/
    This is a great, low cost utility for saving web content.
    As always, just my point of view.
    (or 2 cents if you will 🙂)

    • The easywebaction.com link now comes up empty. Nothing on the page. However, FastStone has a latest version which is free and seems wonderful. I will be using it. I have been using Evernote, but it is not friendly for then copying a full version of the page to another format. I copy a lot of web pages with text and pictures. Here is what I do now. I copy sections in order. A section is a batch of pure text or a single picture. Don’t mix both in one copy. Then I use a trick I learned from Leo. Instead of using simple Ctrl+V, I use Shift+Ctrl+V and select unformatted copy. Works perfectly. A bit more work but not too much.

  14. I personally use the ScrapBook Addon with Firefox. It’s an excellent Addon to capture the current level of the current page (text, images et al), multiple levels of the current page, a selected portion of the current page, and last but not the least, the ability to save all pages currently open in tabs. The import/export function of saved pages is neat. All in all, it’s a very neat and handy tool for research work, where you need to save numerous web pages.

  15. I often want to save only a portion of text that I see on a web page, but on some web pages, I can’t select the exact portion; the selection includes either the section before or the section after. Do you know why that is?

  16. I’ve tried to use PDF Creator as mentioned, but it does not print the entire web page into the PDF. Can anyone help me solve this? Is there an additional setting I need for this?

    Regards,

    Brian

  17. I get “This web page could not be saved”. I also want to apply this code to my web page, but I don’t have the code for my web page. Please give me a solution and source code.

  18. With XP Home and XP Pro I use the Print Screen key. That copies the web page or whatever you are trying to save or print a copy of. Then you can paste it into WordPad or whatever word processor you have. I just tried it with Leo’s Home Page and came out with what looked to me like a perfect copy.

  19. PDF is not a proprietary standard. It is an open standard that was officially published on July 1, 2008 by the ISO as ISO 32000-1:2008.
    As for printing to pdf on Windows Vista, I recommend Smart PDF Converter. You can adjust the output pdf the same way you can change the printer settings for a ‘usual’ paper printer.

  20. I have to do mark ups on new web page design and use multiple monitors. Best way I found of copying a page and putting it to PDF or Powerpoint is to have the web page active and key Alt+Print Screen then key Ctrl+V to paste it to your end source.

  21. So now I understand how to save one page on a website, but what if I want to save the entire website. I am going to use file> save all (including images).
    Thanks

  22. I am having a brain fart here. Years ago I used to use a SAVE feature that let me select the number of levels deep to go. I used it to work offline and save me dial-up speed and time. I had to be careful that I didn’t do too many levels, as it would download pages from the links, etc. (it got exponentially larger). BUT I COULD SWEAR that was built into Internet Explorer??? Version 5 or ??? Was it not?

    I can say for sure that it wasn’t a paid program, it was free and easy to do. I can’t for the life of me remember where or what it was if not IE 5> PLEASE HELP

  23. It cut my last comment off? Anyway, go to Favorites, find the site (you have to have bookmarked it already), then right-click it, select “Make available offline”, and then select how many levels, etc. GREAT!

  24. If you’re actually trying to copy the entire web page, even the part that is not displayed on the screen unless you scroll down, there are programs [e.g., FireShot (IE and Firefox) and Screengrab (Firefox)] that will allow you to do this very easily.

  25. Use UnMHT with FireFox. Works EVERY time unlike the ever frustrating IE with the often messages of “this webpage could not be saved” with no reason why. I like IE – but MS is forcing me to use FF after not having solved this basic feature after all these YEARS.

  26. Hi Leo,

    Is there any free software that allows me to save webpages as a virtual book? Which I can organize by chapters etc. Instead of copy pasting everything to word etc.

    Thanks
    Akash

    Not that I’m aware of.

    Leo
    13-Feb-2010

  27. I am trying to copy a page I made myself so that I can read and edit the text areas later.

    I made an interactive web page that has tons of text areas and boxes that can be edited from the same page. Is there anyway to actually copy the exact content?

    P.S. I was successful in “saving as” with Firefox, but the place I created it for uses Internet Explorer, where I haven’t gotten it to work yet.

  28. After reading Leo’s answers to an awesomely bewildering array of questions for a couple of years now, it seems to be a miracle that the machines we call “computers” actually work at all. These infinite variations of just about everything so far conceived.

    I accept that timing the competition, turf guarding, copyright and such all enter into this complexity, but I wonder if there isn’t a better way out of, and away from, all of these endless questions that are created each time something “new” is created?

    These are such a Babel.

    Yes, I’ve read the article at the top of the page, that’s what has provoked this question.

  29. PicPick–Google it–It is a screen capture software. If all you are trying to do is save a statement for archival purposes you can use this to capture the page as an image. It will even auto scroll the page for you. You can save in multiple image formats as well. I use it regularly for bank statements, bills paid online, etc.

  30. tried the control + a on page i wanted to copy and THIS page and NEITHER time did it work – i have never ever worked on any computer where those types of commands actually … like the oxymoron they are called … function.

  31. Not just helpful responses but in clear easy to follow language. Giving an idea of how things work. Even if the info is a bit off for a specific situation, you can now google for a better search and find what you exactly need. A very smart site. I’m very savvy, but couldn’t match Leo.
    Yet I have to add something to this post. Since 1995 I’ve been downloading everything on a site. css, pics, everything. Locally it looks the same and has the same code. Nothing works perfectly but these site down-loaders are a necessity. Do a search like “Save entire website”, and find software like HTTrack or one of the many others.
    Also this site would be friendlier if “preview” of a comment wasn’t wiped. Not a biggie.
    – Arthur

  32. I tried the save as pdf option and it worked great. I was flubbing around, wondering how to use the information stored somewhere/somehow for the purpose of printing to produce an image and since I never make PDFs, I wouldn’t have thought of this. You’re pretty much my hero. Thanks!

  33. I just found my entire website in PDF form on a site called Printfu.com. WTF? Isn’t this a blatant invitation to copy my intellectual property?

  34. I am in need of saving a 100 + pages of a website before they pulled the plug. A quick internet search brought me here. What a simple and eloquent solution to print to PDF. Downloaded the driver and it worked like a charm.

    Thank you!

  35. Thanks Leo, I wanted to copy content from an html page and after googling a bit, found your advice and saved what I wanted!

  36. It has gotten even easier to print a webpage. Using the new Edge browser, all it takes is to right-click anywhere on the page and select Print. A window opens up with options to print as PDF or print to a printer. Also, Edge will clean out all the ads and print out only the main article. Just tried it on this article, came out to be 19 pages including comments. There are options to select which pages to print and whether or not to print background graphics.
    Firefox can also print out pages in PDF format.

  37. I use iPad / iPhone it’s easy.. Select the page.. Click the share button..Scroll down to print…. Scroll across the pages shown and tap the pages you dont want. Return to the first page… use two fingers to expand the page you see the menu again…. Tap the share button and then … Tap save to Books…. Then the possibilities are endless it is in your Books. I typed this from memory hope I did not miss something.
    Thanks Leo for your regular issue

  38. When you mentioned Evernote for clipping, I’m surprised you didn’t mention Windows’ own built-in Snipping Tool. While it may not be quite as robust, for most things I find it does what I need, and is especially useful when my employer does not allow me to install software on the work computer.

    • The snipping tool simply creates a screenshot: an image or picture of the region you choose. While that can be very useful for some things, it’s typically not what people want when they say “copy a web page”.

  39. Two other useful methods:

    1.- The Save As feature of your browser, provided it has a single-file html-type option. Internet Explorer had a wonderful format for that (I don’t know if it still exists). I have tons of webpages saved in it. They can still be opened by other browsers. It was called .mht, if I remember correctly.

    I now use Vivaldi, which has two such options: Save As Webpage, HTML only, which produces .html or .htm files. And Save As Webpage, Single File, which produces .mhtml files. The first one produces small files, but the webpage is not rendered to the full. The second one produces much bigger files, but the result is much closer to the original.

    2.- A dedicated browser extension, doing a similar job, but better. I use the Chrome extension Single File, which gives perfect results every time, and is dead simple to use.

    The huge advantage is, it produces a single file which is completely faithful to the original, and retains most of its interactive characteristics. Meaning, at least the links would be active. This is very important, since many web pages are now full-fledged computer programs, operating on your desktop.

    It also allows you to append the URL of the webpage to the file, which is a thing you’ll want to do most of the time, in order to be able to check if the page was updated since you saved it.

    You can also add annotations to the page, although I find this is not easy enough.

    There was a marvelous extension for that on Firefox, with its own standard-ish format, called .maff. Unfortunately, it died when Firefox changed its extension standard, like many other much-loved extensions. It was even more intuitive than Single File. Now, those .maff files cannot be opened anymore, at least in an easy way, again to the best of my recollection.

    But all .mht, .htm or .mhtml files should be future-proof, and readable by any browser.

