How do I create my own personal archive of a website?
I want to save all of the information on a particular website. I suppose I could just copy-paste the text into a txt file, but that sounds tedious.
There must be a better way than just navigating through all of the pages of the website and using Ctrl+S.
The ScrapBook add-on for Firefox works pretty well. You just have to think about what you are doing.
For example, if your website has content like:
site.domain/1
site.domain/2
site.domain/3
…
site.domain/1000
then you can use something like Python to generate the list of all the URLs from 1 to 1000 (a quick sketch follows), paste them into ScrapBook, and it will save all of them.
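A minimal sketch of that, assuming the pages really are numbered 1 through 1000 under site.domain (the urls.txt filename is just a placeholder):

```python
# Generate the numbered URLs and write them to a file you can paste
# into ScrapBook (or feed to another downloader).
urls = [f"https://site.domain/{n}" for n in range(1, 1001)]

with open("urls.txt", "w") as f:
    f.write("\n".join(urls))

print(f"wrote {len(urls)} URLs to urls.txt")
```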
That will not pick up pages with other names, though.
You can of course manually save each page, but that will not be feasible on large sites.
There is also the option to follow links, up to a depth of 5: depth 1 means every link on the first page you save (say site.domain/3) will be followed and saved as well; depth 2 means every link on those linked pages will also be followed and saved, and so on. This can get really tricky if a page links to anything outside the site, and it can also produce duplicates. A rough sketch of this kind of depth-limited crawl is below.
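To make the idea concrete, here is a standard-library Python sketch of a depth-limited crawl that stays on the starting host. It is only an illustration of the technique, not how ScrapBook works internally; site.domain and the page-saving step are placeholders.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Collect href values from <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, max_depth=2):
    start_host = urlparse(start_url).netloc
    seen = set()                       # avoids saving duplicates
    queue = [(start_url, 0)]
    while queue:
        url, depth = queue.pop(0)
        if url in seen or urlparse(url).netloc != start_host:
            continue                   # skip repeats and off-site links
        seen.add(url)
        try:
            html = urlopen(url).read().decode("utf-8", errors="replace")
        except OSError:
            continue                   # unreachable page, move on
        # ...save `html` to disk here...
        if depth < max_depth:
            collector = LinkCollector()
            collector.feed(html)
            for link in collector.links:
                queue.append((urljoin(url, link), depth + 1))
    return seen

# Example: everything reachable within two clicks of site.domain/3
# crawl("https://site.domain/3", max_depth=2)
```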
I think the usual name for this is a "siterip"; maybe you could use that term to search for other tools.
You can also use command-line utilities like curl and wget to do it, but it is tricky getting the details correct; a rough example follows.
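Keeping with Python, here is a small wrapper around a typical wget mirroring command. The flags are standard wget options, but site.domain is just a placeholder and wget itself has to be installed on your system.

```python
# Drive wget from Python to mirror a whole site into the current directory.
import subprocess

subprocess.run([
    "wget",
    "--mirror",            # recurse and keep timestamps for re-runs
    "--convert-links",     # rewrite links so the local copy browses offline
    "--page-requisites",   # also grab images, CSS, and scripts pages need
    "--adjust-extension",  # save HTML pages with an .html suffix
    "--no-parent",         # do not wander above the starting directory
    "--wait=1",            # be polite: pause between requests
    "https://site.domain/",
], check=True)
```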