Program to automatically save info from the web

By NV30
Dec 21, 2002
  1. Is there one? I'd like it to save a web page/image every so often, say once every hour or whenever it's changed if that's possible. Would this be feasible?
  2. Nodsu

    Nodsu Newcomer, in training Posts: 9,431

    Under a _real_ OS like Linux/Unix, of course it's possible!

    Windows... You could use the Scheduling Agent with wget or something like that.
  3. Vehementi

    Vehementi TechSpot Paladin Posts: 3,199

    Tools -> Internet Options -> Temporary Internet Files -> Settings -> Check for newer versions of stored pages

    Is this what you're talking about?
  4. NV30

    NV30 Newcomer, in training Topic Starter Posts: 339

    Thanks for the replies. Vehementi: no, that's not what I meant. Nodsu: the Scheduling Agent in XP doesn't have that option; it can only run programs at certain times. Is there something you can download?
  5. Nodsu

    Nodsu Newcomer, in training Posts: 9,431

    You can run wget, say, every 10 minutes; it is smart enough not to download things it already has again.

    If you think the Windows scheduler is bad, there are other schedulers out there too..
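
    The trick is wget's timestamping option; a rough sketch of the command (the URL is just a placeholder):

        wget --timestamping http://www.example.com/page.html

    With --timestamping (-N), wget compares the copy it already saved against the one on the server and only downloads the page again if the server's version is newer.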
  6. NV30

    NV30 Newcomer, in training Topic Starter Posts: 339

    Thanks. I'll search a bit, but could you post a link?
  7. NV30

    NV30 Newcomer, in training Topic Starter Posts: 339

    n/m found it.
  8. NV30

    NV30 Newcomer, in training Topic Starter Posts: 339

    Well I got it installed, but how do I use it? The readme isn't too clear.
  9. Nodsu

    Nodsu Newcomer, in training Posts: 9,431

    Got what?

    Wget? I can sure help you with that..

    Oh, and wget does not schedule itself; you still need another program for that..
  10. NV30

    NV30 Newcomer, in training Topic Starter Posts: 339

    Yep, wget. What other program do you need?
  11. Nodsu

    Nodsu Newcomer, in training Posts: 9,431

    cron.. :p

    Really, I think cron is a great tool. But you'll need some *X OS or Cygwin and I suppose that may be a little overkill..

    For Win.. Dunno, try http://www.splinterware.com/products/wincron.htm
  12. NV30

    NV30 Newcomer, in training Topic Starter Posts: 339

    Thanks. I give up. :eek: What is wget used for anyway? How do I run it?
  13. Nodsu

    Nodsu Newcomer, in training Posts: 9,431

    You download it, fiddle with a few settings maybe, open up a console and type in the command line. The program will do the rest.

    http://www.gnu.org/software/wget/wget.html
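
    A minimal first run from the console might look something like this (the install folder and URL are just placeholders):

        C:\>cd \wget
        C:\wget>wget http://www.example.com/

    wget fetches the page, saves it as index.html in the current directory and prints its progress in the console window.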
     
  14. NV30

    NV30 Newcomer, in training Topic Starter Posts: 339

    Well it's installed and everything, but when I open it, it appears as a DOS window for two seconds before disappearing. It's then not available by Alt-Tabbing or any other method. I am using WinXP Home.
  15. Nodsu

    Nodsu Newcomer, in training Posts: 9,431

    Correct, wget is a console program. If you run it without parameters, it will complain and exit immediately.

    If you want to use it with a scheduling program, you enter the correct parameters in the scheduler or, even better, write a .bat file.
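
    For example, a one-line .bat file could look something like this (the install path and URL here are placeholders, not anything already on your machine):

        @echo off
        rem --timestamping (-N): only fetch the page again if it changed on the server
        "C:\wget\wget.exe" --timestamping http://www.example.com/index.html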

    To run this thing interactively, you will need to run "cmd" or "command" to get a console window first.


    A simple example:

    You want to check for a new version and download www.fluffykitties.com every 5 minutes.

    (Assuming you downloaded the Windows Scheduler from the link I posted above):
    Make a new event, name it Wget
    Set Application by browsing to wget.exe
    Set parameters as "--recursive --level=0 --timestamping www.fluffykitties.com"
    Set working dir to whatever
    Set schedule to Every hour/selected minutes, every 5 minutes
    Save and exit the event, then exit the program

    You will see Windows Scheduler in your system tray

    WS will run wget with given parameters every 5 minutes.

    By default wget will now recursively follow _all_ links on www.fluffykitties.com and download _everything_ under a folder called "www.fluffykitties.com" in Working dir.

    Because web pages usually have links to other servers too, I don't recommend using this example carelessly, or you may end up with the whole internet on your HD :D
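
    If you want to play it safer, you could tighten the parameters; this variation is just my sketch, not part of the example above:

        wget --recursive --level=1 --no-parent --timestamping www.fluffykitties.com

    --level=1 only follows links one step deep, and --no-parent keeps wget from climbing above the directory it started in.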

    I posted the documentation link for wget before, you can also run "wget --help" in command prompt to get a short listing of parameters. And of course you can post back and ask anything.

    Edited to not automatically parse URLs. Web page www.fluffykitties.com does not exist. Don't make any assumptions on my sexual alignment :p
  16. NV30

    NV30 Newcomer, in training Topic Starter Posts: 339

    Thanks for the help. Is there a way to just download a certain file? If you put in the URL pointing to a certain file, would it download just that file? Also, can it save information from a form/cgi script?
  17. Nodsu

    Nodsu Newcomer, in training Posts: 9,431

    Of course you can get a single file.

    I don't quite understand what you mean by downloading info from a form.

    Wget just parses the URL you give it and saves whatever the server serves. Just like any browser but without displaying the thing.
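
    For a single file you just give the full URL; a quick sketch (made-up address and filename):

        wget --timestamping http://www.example.com/files/report.pdf

    The file lands in the working directory under the same name, and with --timestamping it is only downloaded again when the server's copy changes.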
  18. NV30

    NV30 Newcomer, in training Topic Starter Posts: 339

  19. Nodsu

    Nodsu Newcomer, in training Posts: 9,431

    Ah I get it now.

    No such thing, sorry. I don't think any automated program can simulate click events or such for scripts.

    You can complain to the makers of the website and ask them to allow URL-based queries.
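
    By URL-based queries I mean forms where the input ends up in the address itself, so wget can fetch the result like any other page; a made-up example:

        wget "http://www.example.com/search.cgi?query=kittens"

    If the site only works through buttons and scripts, there is no address like that to hand to wget, which is the problem here.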
  20. NV30

    NV30 Newcomer, in training Topic Starter Posts: 339

    I see. Oh well, you can't have everything. ;)