Ever since the Web got popular in the mid-90s, people have been in search of a tool that would help them grab Web pages or sites.
This could be for a variety of reasons:
- You're giving a presentation and want to show a Web site, but you won't have an Internet connection.
- You're not sure a Web site is going to stick around and you want to make sure you have access to its content.
- You want to study how a Web site was constructed.
- You're doing research and want to gather together lots of material.
Below is an alphabetical list of tools, ones I've read about or used, that let you grab pages or sites. I'll be honest up front: the only one I really consider worthwhile is wget. Every other tool I've used has had some problem.
Operating Systems: Windows
Never used, but some people recommend it. Here's how it describes itself: "Why leave yourself at the mercy of Web sites that may or may not be in business when you return to them? Schedule SuperBot to download entire Web sites onto your hard drive and view them when you're ready. You can also burn them onto auto-run CDs. SuperBot resumes partial file download, saving you the aggravation of having to download the same file twice. You can restrict SuperBot from downloading specific file types when saving the pages (.zip, .exe, and so on), as well as cookies, images, and sound files. You can also choose how far down SuperBot drills into the site, and whether or not to stay inside the server that the downloading began. It can also update only the files that have changed since the last page download."
Operating Systems: Windows
Never used it, but I know others are happy with it.
Operating Systems: Windows, Mac
Cost: Not worth it
This was one of the first programs to grab Web sites, but it is not recommended. The Windows client is clumsy; the Mac version is atrocious.
The best solution, easily. Works as advertised, and it's free! One caveat (which to me is actually a strength): it runs from the command line. This makes it faster and easier to use, in my opinion.
Operating Systems: Unix, Linux, Windows, Mac OS X
Here's an example. Let's say you want to download the entire site at http://www.foo.com. Just run this:
wget -m -p http://www.foo.com
This command grabs the entire site: -m mirrors it recursively, and -p pulls in the page requisites, meaning all images, CSS, etc. If you add the -k flag, wget will also convert absolute links into relative links, so the local copy browses correctly offline. Nice!
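Putting those flags together, a typical invocation looks like the sketch below. It reuses the article's placeholder URL (http://www.foo.com) and just assembles and prints the command rather than running it, so you can inspect it before kicking off a real download:

```shell
#!/bin/sh
# Placeholder URL from the example above; substitute a real site.
URL="http://www.foo.com"

# -m  mirror: recursive download with timestamping
# -p  page requisites: also fetch images, CSS, and other embedded files
# -k  convert links: rewrite absolute links so the local copy works offline
CMD="wget -m -p -k $URL"

# Print the command instead of executing it; nothing is downloaded here.
echo "$CMD"
```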
If you're on Windows, by the way, copy wget.exe into your Windows system directory (usually C:\Windows) so you can run it from the command line anywhere.
TIP: For a quicker grab, use the -rk options: -r = recursive, -k = convert links.
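When you don't need a full mirror, the same idea works with a depth limit. The sketch below uses standard wget options (-l sets the recursion depth, -np keeps wget from climbing above the starting directory) with the article's placeholder URL, and again only prints the command it builds:

```shell
#!/bin/sh
# Placeholder URL; substitute a real site.
URL="http://www.foo.com"

# -r    recursive download
# -k    convert links for offline browsing
# -l 2  limit recursion to two levels deep
# -np   never ascend to the parent directory
CMD="wget -rk -l 2 -np $URL"

# Print the command instead of executing it.
echo "$CMD"
```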