8 Replies Latest reply: Nov 22, 2012 1:17 AM by Fabrizio Bartolomuci
Fabrizio Bartolomuci Level 1 (0 points)

I managed to capture the behavior of a complex web site in a webarchive. I would now like to turn that webarchive into a set of nested HTML directories. Yet when I tried, both with Waf and with commercial software bought on the Apple store, all I got was the nested directories with the HTML page at the bottom: no images, no CSS, and no working links. If you are interested, the webarchive document is at:

http://www.miafoto.it/it/GiroMilano.webarchive

while the poor result of the extraction is at:

http://www.miafoto.it/it/Giromilano/Pagine/default.aspx

and the empty directories above.

In addition to the different look, the webarchive displays the same behavior as the official web site when a listbox value is selected and the button is pushed, while the extracted version just produces a page with no contents, reloading itself rather than the official page. As you may see, the webarchive is over 1 MB while the product of the extraction is just a little over 1 KB.

What is wrong with it, and how may I perform such an apparently trivial task with usable results?

Thanks, Fabrizio


MacBook Pro, Mac OS X (10.7.1)
  • 1. Re: Converting a webarchive to html
    andyBall_uk Level 7 (20,320 points)

    http://sourceforge.net/projects/webarchivext/files/ seemingly works better than waf, or the later https://github.com/robrohan/WebArchiveExtractor/downloads, on the example file you gave. You do lose the perfect map image, although scrolling in that doesn't work offline even in the webarchive format.

     

    if the site is still live, Camino's save-as "webpage complete" option is easier
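
For readers who want to see what such an extractor does, here is a minimal Python sketch, not the code of any of the tools named above. It assumes the usual Safari layout, in which a .webarchive is a property list with a `WebMainResource` entry and a `WebSubresources` list, each resource carrying `WebResourceURL` and `WebResourceData`. The helper names `resource_path` and `extract_webarchive` are made up for illustration.

```python
import hashlib
import plistlib
from pathlib import Path
from urllib.parse import urlsplit


def resource_path(url: str, root: Path) -> Path:
    """Map a resource URL to a file path under root, laid out as host/path."""
    parts = urlsplit(url)
    rel = parts.path.lstrip("/")
    # A URL ending in "/" (or with no path) has no file name; use index.html.
    if not rel or rel.endswith("/"):
        rel += "index.html"
    # Resources that differ only in their query string (typical for .axd
    # script URLs) would otherwise overwrite each other, so add a short hash.
    if parts.query:
        digest = hashlib.sha256(parts.query.encode("utf-8")).hexdigest()[:8]
        rel += "." + digest
    return root / parts.netloc / rel


def extract_webarchive(archive_file: str, out_dir: str) -> list:
    """Write the main resource and every subresource of a webarchive to disk."""
    with open(archive_file, "rb") as fh:
        archive = plistlib.load(fh)  # reads both binary and XML plists

    resources = [archive["WebMainResource"]] + archive.get("WebSubresources", [])
    written = []
    for res in resources:
        target = resource_path(res["WebResourceURL"], Path(out_dir))
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(res["WebResourceData"])
        written.append(target)
    return written
```

Note that this sketch only writes the files out. It does not rewrite the links inside the HTML, does not descend into frames, and will fail if one URL is both a file and a directory prefix of another, so the extracted page may still not display or behave like the archive.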

  • 2. Re: Converting a webarchive to html
    Fabrizio Bartolomuci Level 1 (0 points)

    What I need from the HTML page, which the webarchive and the source page both provide but none of my experiments do, is the possibility of selecting a line and triggering the loading of a new page with the corresponding stations embedded. I am not interested in that page itself so much as in using it as a trampoline to reach the correct page and parse the stations that belong to the line. Have you checked whether your solution allows this? Even waf displays a page, but it leads nowhere upon selecting a line.

     

    Thanks, Fabrizio

  • 3. Re: Converting a webarchive to html
    andyBall_uk Level 7 (20,320 points)

    I think I follow what you want, & no, although the application produces nearly all of the referenced files, it doesn't create a page that will call up a new ...atm.it/it/Giromilano one like the webarchive does.

  • 4. Re: Converting a webarchive to html
    Fabrizio Bartolomuci Level 1 (0 points)

    I am astonished. Is it possible that a webarchive holds inside itself a sort of magic that may not be written in a book of spells? In that case I change my question: is it possible to modify the HTML part of a webarchive and have it keep working, and if so, how would I do it without corrupting the non-ASCII characters in the heading, and possibly elsewhere, as a normal editor would do?

  • 5. Re: Converting a webarchive to html
    andyBall_uk Level 7 (20,320 points)

    no magic, but as you've seen, the converter/folderizer apps aren't consistent.

     

    you can surely edit the html using any raw editor, or even a plist editor.

     

    since you rely on the site still existing/being accessible - can you use an iframe to display the actual site, instead ?
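
As an alternative to a raw or plist editor, the edit can be scripted so that the non-ASCII characters are left alone. The following is a minimal sketch under the assumption that the archive is a standard Safari property list whose `WebMainResource` holds the HTML bytes in `WebResourceData` and the character set in `WebResourceTextEncodingName`. The function name `edit_main_html` is invented for this example.

```python
import plistlib


def edit_main_html(archive_file: str, out_file: str, old: str, new: str) -> int:
    """Replace `old` with `new` in the archive's main HTML; return the match count."""
    with open(archive_file, "rb") as fh:
        archive = plistlib.load(fh)

    main = archive["WebMainResource"]
    # Decode with the encoding recorded in the archive so that accented
    # characters survive the round trip; UTF-8 is only a fallback guess.
    encoding = main.get("WebResourceTextEncodingName") or "utf-8"
    html = main["WebResourceData"].decode(encoding)

    count = html.count(old)
    main["WebResourceData"] = html.replace(old, new).encode(encoding)

    # Write a new file rather than overwriting the original archive.
    with open(out_file, "wb") as fh:
        plistlib.dump(archive, fh, fmt=plistlib.FMT_BINARY)
    return count
```

Because the bytes are decoded and re-encoded with the same character set, text outside the replaced string should come back unchanged. Whether Safari then treats the modified archive exactly like the original is something to verify on a copy.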

  • 6. Re: Converting a webarchive to html
    Fabrizio Bartolomuci Level 1 (0 points)

    As for the raw editor, have you got any suggestion for either Mac or Windows, possibly leaving aside vi or emacs?

    If all I needed were the site as-is, of course your option would be good, yet my need is to use that page to open a number of other ones by programmatically selecting a value in a listbox on the page. In practice I need to feed the page a select value, and the page should open the corresponding one as if it were manually selected. The problem is that the page presently performs this feat by using scripts stored in .axd virtual files, very Windows style, which I have no clue how to use, much less create.

    So having the possibility of modifying the webarchive would be just the start of the story.

     

    Of course that would be more of a Windows forum issue, but given that I am a Mac fan, that Apple people are smarter, and that Safari webarchives seem to cope with these scripts, I feel more comfortable raising it here :-)

  • 7. Re: Converting a webarchive to html
    andyBall_uk Level 7 (20,320 points)

    http://www.suavetech.com/0xed/0xed.html is quite nice on a mac, & free; textwrangler should be more useful, certainly easier to read.

     

    http://www.nightproductions.net/prefsetter.html will open .webarchives & show how they're arranged, just have to add .plist to the name for it to recognise them. I expect that other plist editors will do the same

     

    if your saved webarchive 'remembers' date & other settings ( seems to?), is it possible to save one for each required value?
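
The arrangement a plist editor shows can also be printed from a script. This is a small sketch, assuming the standard Safari keys (`WebMainResource`, `WebSubresources`, and nested `WebSubframeArchives` for frames); `list_resources` is a hypothetical helper name, not part of any tool mentioned here.

```python
import plistlib


def list_resources(archive_file: str) -> list:
    """Return (url, mime_type, size_in_bytes) for every resource in the archive."""
    with open(archive_file, "rb") as fh:
        archive = plistlib.load(fh)

    rows = []

    def walk(node: dict) -> None:
        main = node.get("WebMainResource")
        resources = ([main] if main else []) + node.get("WebSubresources", [])
        for res in resources:
            rows.append((res.get("WebResourceURL", ""),
                         res.get("WebResourceMIMEType", ""),
                         len(res.get("WebResourceData", b""))))
        # Frames are stored as nested archives with the same layout.
        for sub in node.get("WebSubframeArchives", []):
            walk(sub)

    walk(archive)
    return rows
```

Calling `list_resources("GiroMilano.webarchive")` and printing each row would show which scripts, including any .axd ones, were actually saved inside the archive.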

  • 8. Re: Converting a webarchive to html
    Fabrizio Bartolomuci Level 1 (0 points)

    My problem is that I would like the webarchive to shed some light on the inner workings of this Windows technique of building a nuclear facility just to open a page upon selecting a listbox! So if most of the things in the webarchive remain hidden, I am afraid not much light will be shed...

    What I am still looking for is a way to have that page or webarchive programmatically select one element of its listbox to open the appropriate page on the remote site, as the webarchive does, which raises my expectations that it is a doable thing.
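
One possible way to approach the programmatic selection, sketched in Python under a loud assumption: that the .aspx page is an ASP.NET Web Forms page, where selecting a value and pressing the button simply posts the form back to the server together with its hidden fields (such as `__VIEWSTATE` and `__EVENTVALIDATION`). The thread does not confirm this for the site in question. The field names `ddlLinea` and `btnCerca` below are purely hypothetical; the real names would have to be read from the page's HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs of hidden and text <input> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        name = attr.get("name")
        kind = (attr.get("type") or "text").lower()
        if name and kind in ("hidden", "text"):
            self.fields[name] = attr.get("value") or ""


def build_postback(html: str, select_name: str, value: str,
                   button_name: str, button_value: str) -> bytes:
    """Build the form-encoded body a browser would send for this selection."""
    collector = FormFieldCollector()
    collector.feed(html)
    fields = dict(collector.fields)
    fields[select_name] = value          # the listbox choice
    fields[button_name] = button_value   # the button that was "pushed"
    return urlencode(fields).encode("ascii")
```

The resulting bytes would be sent as the body of a POST request to the form's action URL, for instance with `urllib.request`. A real server may also expect cookies or other headers, so this is a starting point for experiments rather than a guaranteed recipe.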