Website Migration Using Wget

There are occasions when you need to move a website from one hosting provider to another and the more standard approach of using FTP to collect all of the files isn’t available.

This can happen when there has been a falling out between the website's owner and the existing web host, when the access details have been lost, when the web host can't be contacted, or when the migration is simply urgent.

Wget is a common Unix tool that is also available on Windows. It works from the command line and offers many configuration options to control exactly how much it downloads from the starting point it is given, and what it does with what it finds.

Wget works by starting at the homepage and trawling through the site, fetching a copy of every HTML or image file it can find a link to that is part of the website it started at.

We often use wget to completely mirror remote sites; when a new customer comes over to us from another web hosting provider, we often copy the site for them using wget. To use it on our server, log in via SSH. From the command prompt, run wget with the URL of the file you want to download, and the file will be saved directly onto our server. As a hosting provider we have to operate very fast internet connections, so using wget directly from our servers is much faster than downloading the files to your local machine and then re-uploading them to our servers.
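For example, to fetch a single archive straight onto the server (the URL below is a placeholder, not a real address):

```shell
# Download one file into the current directory on the server.
# https://www.example.com/site-backup.tar.gz is a hypothetical URL;
# substitute the real address of the file you want.
wget https://www.example.com/site-backup.tar.gz
```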

Another common use is, as I said, to mirror an entire site. Let's assume you are moving the Anchor website from hosting company A to hosting company B. You have your new account set up, and you have logged in via SSH to B's server. Now, to mirror your site, run

wget -r <your-website-url>

and wget will recursively download your website to the new account.

Now you should have a complete copy of your website, but be warned: wget does not read JavaScript, so all those fancy rollover effects will not work unless you copy the files they reference manually.
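If your wget is reasonably recent, the -p (--page-requisites) option is a partial remedy: it also fetches the images, stylesheets and script files that the HTML references directly. Files that the JavaScript itself loads at runtime are still invisible to wget and must be copied by hand. A sketch, with a placeholder URL:

```shell
# -p / --page-requisites also downloads assets referenced in the HTML
# (images, CSS, script files). Anything fetched from inside JavaScript
# at runtime is still missed and must be copied manually.
wget -r -np -p <your-website-url>
```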

By default wget will create a directory named after the site it is downloading. You probably want the files in the directory you are in at the moment, so add -nH (--no-host-directories) to the command. This tells wget to skip the host-named top-level directory while still preserving the site's own directory structure.

The final command should look something like this

wget -r -np -nH <your-website-url>

The -np (--no-parent) flag stops wget from wandering up into parent directories, keeping the download confined to the part of the site you started from.

Another word of warning concerns websites that are generated by a programming language. Wget is only useful for mirroring such sites in a narrow set of circumstances. If the website has been built with ASP, PHP, Perl, Java and so on, wget will only download the HTML that these programs render, not the original source files. This is important to take note of, since that server-side code may be performing tasks such as changing the content of the page based on the user, interacting with a database to collect statistics, or accepting orders.
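A quick way to see this for yourself, using a hypothetical PHP page, is to fetch it and inspect what was saved; what lands on disk is the rendered HTML, never the PHP source:

```shell
# Fetch a (hypothetical) PHP page; wget saves the HTML the server
# rendered, not the index.php source code.
wget -O page.html http://www.example.com/index.php
head page.html
```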

Once you’ve used wget to make a copy of your website, it’s important to test the files in the new location to ensure the site behaves the same way the original did.
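One simple first check, once the static files are in place, is to diff the mirrored tree against a second crawl taken from the new host. The sketch below fabricates two tiny site trees purely to illustrate the comparison; in practice both directories would come from wget:

```shell
# Illustration only: old-copy/ stands in for the original wget mirror,
# new-copy/ for a fresh crawl of the site at its new host
# (e.g. wget -r -np -nH -P new-copy <your-new-website-url>).
mkdir -p old-copy new-copy
echo '<html><body>Hello</body></html>' > old-copy/index.html
echo '<html><body>Hello</body></html>' > new-copy/index.html

# diff -r walks both trees and reports any file that differs;
# silence (exit status 0) means the copies match.
diff -r old-copy new-copy && echo "copies match"
```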
