Need to Access a Whole Website Offline?

As I have continued to grow, learn, and devote myself to the academic world and the development of my mind, I have noticed subtle growth in abilities outside of just biology and chemistry. Ever so slowly I have collected small but powerful nuggets of knowledge in the computer sciences, and it has drastically changed my life for the better.

Computers are everywhere, whether you like it or not, and artificial intelligence, robotics, and coding are slowly taking over the world. It is inevitable. The writing is on the wall: DeepMind beat the Go champion, Google put Boston Dynamics up for sale, and multiple countries began discussing a universal basic income for their citizens. These are red flags. There are only four types of work, and until DeepMind exhibited its ability to actually learn and teach itself how to do a task by beating the Go player, we at least did not have to fear the onset of automation in two of the four types.

Things…things are’a changin’…

However ominous that may sound, we have to give it to technology: it continues to exceed our expectations and give us amazing things at an ever quicker pace. Now we come to the topic at hand, websites. The internet is beautiful for many reasons, one of them being that the billions of people on it are often talented, brilliant human beings who can create great things. If you know where to look you can find some of them, though sometimes you have to deserve it.

So the question is: can we save a whole website for offline access? I’m not talking about a single page here, no, we want the whole thing. Can we rip it off the nets? Is it possible?

.

.

.

The answer is yes. 

Not only can you access it, but with the right tools you can access most of its features too. Look no further; I’ll give you all you need right here.


 

Tools for downloading a whole website

  • HTTrack – This program copies the contents of an entire site. It can even grab the pieces needed to make active code content work offline.
  • Wget – This program is a classic command-line tool developed for this specific task. It comes preinstalled on many Unix/Linux systems and is also available for Windows (version 1.13.4 and newer). Despite being a free utility, it packs a lot of bang for the buck, allowing non-interactive downloads of files from the Web. It supports the HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies.
    • Note: Wget takes a bit more computer know-how. A good example run is the following command line:

      wget -r --no-parent http://site.com/songs/
      
      

      You might also try:

       --mirror instead of -r.

      and you may want to include:

      -L/--relative

      so as not to follow links to other servers. Here are some of the options explained; if you only get an index page, try adding -r.

      -p                 get all images, etc. needed to display HTML page.  
      --mirror           turns on recursion and time-stamping, sets infinite 
                         recursion depth and keeps FTP directory listings
      --html-extension   save HTML docs with .html extensions  
      --convert-links    make links in downloaded HTML point to local files. 

      There are additional details available as well; see the Wget Manual and its examples. I’ll leave more links in the references below.

       

  • Server Fault – Not a download tool itself, but a Q&A community where server administrators discuss backing up websites from the server side.
  • Internet Download Manager – This tool has a Site Grabber utility with a lot of options, some of which let you completely download any website you want.
  • ItSucks – Already a winner in my book, this program is a Java web spider (web crawler) with the ability to download (and resume) files. It can also be customized with regular expressions and download templates, and it offers both a GUI and a console interface.
  • WebSnake – A powerful offline browser for the Windows platform. Worked like a charm when I tried it out.
  • WebZip – Another good program worth noting.

A screenshot of the WebZip program doing its thing (4/8/16)
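Putting the Wget flags above together, a complete invocation might look like the following sketch. This is one reasonable combination under stated assumptions, not the only correct one: example.com is a placeholder domain, and --adjust-extension assumes a newer Wget (on older builds use --html-extension, as in the option list above).

```shell
# A sketch of a full offline mirror with wget (assumes wget 1.13 or newer;
# example.com is a placeholder, not a real target).
#
#   --mirror            recursion + time-stamping, infinite depth, keeps FTP listings
#   --page-requisites   also fetch the images, CSS, and scripts each page needs
#   --convert-links     rewrite links so the local copy browses offline
#   --adjust-extension  save pages with .html extensions (newer name for --html-extension)
#   --no-parent         never ascend above the starting directory
#   --wait=1            pause a second between requests, to be polite to the server
wget --mirror --page-requisites --convert-links --adjust-extension \
     --no-parent --wait=1 https://example.com/
```

The result lands in a directory named after the host (here, example.com/), and you can open its index.html straight from disk.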


Remember my site if you ever need access to things when you’re off the grid; you never know when you might need my page! As always, let me know if any links die, and if this helps you, feel free to comment below.

 

References

  • http://linuxreviews.org/quicktips/wget/ – Wget help #1
  • http://www.krazyworks.com/?p=591 – Wget help #2
  • http://brew.sh/ – Homebrew installs the tools Apple didn’t bother giving you, at least as far as they are concerned. I don’t have any Apple products at the moment to test it with, but I will try to set up an OS this summer to see what they are all about. The Homebrew formulae are all written in Ruby.
  • StackExchange – A recent conversation on Stack Exchange concerning a couple of these programs and this exact issue.

 
