Author Topic: Download a Website?  (Read 2439 times)

Offline SorO_Lost

  • Epic Member
  • ****
  • Posts: 7197
  • Banned
Download a Website?
« on: September 09, 2016, 07:14:47 PM »
Pretty much as the thread title says; my first attempt botched things terribly.

Anyone know a good program, or how to configure HTTrack, to download the card-only section of DiceMastersDB? Essentially I need to go from their Catalog, to its linked specific set list, to the linked dice/card list, to the linked card & image, and then stop. No user-uploaded deck lists, no crawling on top links, etc.

The website goes down on the 22nd, so I'd like to get a copy before then.
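
The crawl rule described above (Catalog, then set list, then dice/card list, then card and image, then stop) amounts to a depth limit plus a path filter. Here is a rough Python sketch of that rule, not a finished crawler. The excluded path prefixes and the host are hypothetical, since the real DiceMastersDB URL layout isn't given in this thread.

```python
from urllib.parse import urlparse

# Hypothetical prefixes for the sections to skip; the real site layout
# is not given in the thread, so adjust these to match it.
EXCLUDED_PREFIXES = ("/decks", "/users")

# Catalog (depth 0) -> set list (1) -> dice/card list (2) -> card page and image (3)
MAX_DEPTH = 3


def should_follow(url, depth, start_host):
    """Return True if a link found at the given depth should be downloaded."""
    parts = urlparse(url)
    if depth > MAX_DEPTH:
        return False  # past the card page: stop
    if parts.netloc != start_host:
        return False  # never leave the site
    # str.startswith accepts a tuple, so this checks every excluded prefix
    return not parts.path.startswith(EXCLUDED_PREFIXES)
```

Most mirroring tools expose the same two knobs in some form: a maximum link depth and include/exclude filters on the URL path.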

Offline kitep

  • DnD Handbook Writer
  • ****
  • Posts: 1947
  • Lookout World!
Re: Download a Website?
« Reply #1 on: September 10, 2016, 11:18:13 AM »
I'm no expert, but I did play around with HTTrack a few months ago. I'll try some things and see if I can get it to do what I think you want.

In the meantime, I'd recommend going to the Wayback Machine (archive.org) and seeing if you can get it to copy DiceMastersDB. Then you won't have to worry about the Sept 22 cutoff date.

Good luck!

Offline Amechra

  • Epic Member
  • ****
  • Posts: 4560
  • Thread Necromancy a specialty
Re: Download a Website?
« Reply #2 on: September 10, 2016, 12:59:54 PM »
I'm messing around with wget at the moment - I'm lazy, so I'm just scraping it and grabbing the images.
"There is happiness for those who accept their fate, there is glory for those that defy it."

"Now that everyone's so happy, this is probably a good time to tell you I ate your parents."

Offline SorO_Lost

  • Epic Member
  • ****
  • Posts: 7197
  • Banned
Re: Download a Website?
« Reply #3 on: September 10, 2016, 03:05:47 PM »
I think HTTrack has a grab-images option; I should have tried that.

@Kite good idea, but I have no idea how to request a copy. Fortuitously someone already did it and there is a copy for the 9th, literally yesterday, so at least I don't have to worry about the deadline. Can I direct link to Archive's images for forum usage?
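
For reference, both questions above (requesting a copy, and linking straight to an archived image) come down to URL patterns on the Wayback Machine. This is a small Python sketch that only builds the URLs and fetches nothing. It assumes the Save Page Now path, the availability API, and the `im_` raw-image modifier behave the way I understand them to; the example page and image addresses are placeholders, not DiceMastersDB's real ones.

```python
from urllib.parse import urlencode


def save_page_now_url(page_url):
    """URL that asks the Wayback Machine to capture page_url when visited."""
    return "https://web.archive.org/save/" + page_url


def availability_api_url(page_url, timestamp):
    """URL of the JSON API that reports the capture closest to timestamp (YYYYMMDD)."""
    query = urlencode({"url": page_url, "timestamp": timestamp})
    return "https://archive.org/wayback/available?" + query


def archived_image_url(image_url, timestamp):
    """Direct link to an archived image; 'im_' requests the raw image without the Wayback toolbar."""
    return f"https://web.archive.org/web/{timestamp}im_/{image_url}"
```

With the capture from the 9th, `archived_image_url("http://example.com/card.png", "20160909")` gives `https://web.archive.org/web/20160909im_/http://example.com/card.png`, which is the form to put in an image tag.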

Offline Amechra

  • Epic Member
  • ****
  • Posts: 4560
  • Thread Necromancy a specialty
Re: Download a Website?
« Reply #4 on: September 10, 2016, 05:25:15 PM »


Edit: This is hot-linked from the Archive.

Offline altpersona

  • Legendary Member
  • ****
  • Posts: 2000
  • #78
Re: Download a Website?
« Reply #5 on: September 10, 2016, 10:39:47 PM »
wget -r --no-parent www.website.com/stuff/
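
A possible extension of the one-liner above, written as a Python wrapper so the options can carry comments. It's a sketch: the depth of 3 is my reading of the catalog-to-card chain from the first post, the one-second wait is just a politeness guess, and `www.website.com/stuff/` is the same placeholder as in the command.

```python
import subprocess


def build_wget_command(start_url, depth=3, wait_seconds=1):
    """Build a wget argument list for a depth-limited mirror of start_url."""
    return [
        "wget",
        "--recursive",             # same as -r: follow links
        "--no-parent",             # never climb above the starting directory
        f"--level={depth}",        # maximum link depth to follow
        "--page-requisites",       # also fetch images and stylesheets each page needs
        "--convert-links",         # rewrite links so the copy browses offline
        f"--wait={wait_seconds}",  # pause between requests
        start_url,
    ]


def run_mirror(start_url):
    """Run wget with the options above and return its exit code."""
    return subprocess.run(build_wget_command(start_url), check=False).returncode
```

Calling `run_mirror("www.website.com/stuff/")` starts the download in the current directory; print `build_wget_command(...)` first if you want to check the options before running it.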
The goal of power is power. - 1984
We are not descended from fearful men. - Murrow
The Final Countdown is now stuck in your head.

Anim-manga still sux.

Offline kitep

  • DnD Handbook Writer
  • ****
  • Posts: 1947
  • Lookout World!
Re: Download a Website?
« Reply #6 on: September 11, 2016, 10:09:23 AM »
Sorry to say, but I'm stumped. The "û" in "Faerûn" seems to be throwing HTTrack for a loop.