Converting a webarchive to html

I managed to capture the behavior of a complex web site in a webarchive. Now I would like to turn that webarchive into a set of HTML files in nested directories. Yet when I tried, both with Waf and with a commercial program bought on the Apple store, all I got was the nested directories with the HTML page at the bottom: no images, no CSS, and no working links. If you are interested, the webarchive document is at:

http://www.miafoto.it/it/GiroMilano.webarchive

while the weak product of the extraction is at:

http://www.miafoto.it/it/Giromilano/Pagine/default.aspx

and the empty directories above.

Besides looking different, the two behave differently: the webarchive behaves like the official web site when a listbox value is selected and the button is pushed, while the extracted version just produces a page with no contents, reloading itself rather than the official page. As you can see, the webarchive is over 1 MB, while the product of the extraction is just a little over 1 KB.

What is going wrong, and how can I perform such an apparently trivial task with usable results?

Thanks, Fabrizio

MacBook Pro, Mac OS X (10.7.1)

Posted on Nov 20, 2012 1:24 AM

Question marked as Best reply

Posted on Nov 20, 2012 8:03 AM

http://sourceforge.net/projects/webarchivext/files/ seems to work better than waf, or the later https://github.com/robrohan/WebArchiveExtractor/downloads, on the example file you gave. You do lose the perfect map image, though scrolling in it doesn't work offline even in the webarchive format.

If the site is still live, Camino's save-as webpage complete is easier.
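
For readers who want to see what such an extractor does, here is a minimal Python sketch. It is not the code of either tool linked above. It assumes the webarchive is a property list whose resources sit under the keys WebMainResource, WebSubresources and WebSubframeArchives, each with WebResourceURL and WebResourceData. The function names local_path and extract are made up for this illustration. The sketch does not rewrite links inside the HTML, so the extracted page will still point at the original server.

```python
import hashlib
import plistlib
from pathlib import Path
from urllib.parse import urlsplit


def local_path(url, root):
    """Map a resource URL to a file path under root, mirroring host and URL path."""
    parts = urlsplit(url)
    # Drop empty, "." and ".." segments so nothing can be written outside root.
    segments = [s for s in parts.path.split("/") if s not in ("", ".", "..")]
    if not segments or parts.path.endswith("/"):
        segments.append("index.html")
    if parts.query:
        # URLs that differ only in their query string (e.g. .axd scripts)
        # would otherwise overwrite each other, so add a short hash.
        digest = hashlib.sha1(parts.query.encode("utf-8")).hexdigest()[:8]
        segments[-1] += "." + digest
    return root.joinpath(parts.netloc or "local", *segments)


def extract(archive, root):
    """Write every resource of a parsed webarchive below root; return the paths."""
    written = []
    main = archive.get("WebMainResource")
    for res in ([main] if main else []) + archive.get("WebSubresources", []):
        target = local_path(res["WebResourceURL"], root)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(res["WebResourceData"])
        written.append(target)
    # Assumption: frames are nested archives with the same layout.
    for sub in archive.get("WebSubframeArchives", []):
        written.extend(extract(sub, root))
    return written


def extract_file(path, out_dir):
    """Parse a .webarchive file and extract it into out_dir."""
    with open(path, "rb") as f:
        return extract(plistlib.load(f), Path(out_dir))
```

Calling extract_file("GiroMilano.webarchive", "out") would then produce one file per stored resource. A URL whose path is both a file and a directory prefix of another resource is not handled by this sketch.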

Nov 20, 2012 8:31 AM in response to andyBall_uk

What I need from the HTML page, which the webarchive and the source page provide but none of my experiments do, is the ability to select a line and trigger the loading of a new page with the corresponding stations embedded. I am not so much interested in that page itself as in using it as a trampoline to show the correct one, so that I can parse the stations belonging to the line. Have you checked whether your solution makes this possible? Even waf displays a page, but it leads nowhere when a line is selected.


Thanks, Fabrizio

Nov 20, 2012 1:01 PM in response to andyBall_uk

I am astonished. Is it possible that a webarchive holds a sort of magic that cannot be written down in a book of spells? In that case I will change my question: is it possible to modify the HTML part of a webarchive and have it keep working? And if so, how would I do it without corrupting the non-ASCII characters in the header, and possibly elsewhere, as a normal editor would?
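
One way to edit the HTML without damaging non-ASCII characters is to decode and re-encode it with the encoding recorded in the archive itself. The Python sketch below assumes the webarchive is a binary property list whose main page lives under WebMainResource, with the bytes in WebResourceData and the encoding name in WebResourceTextEncodingName. The function names are invented for this example, and whether Safari accepts the rewritten file has not been tested here.

```python
import plistlib


def edit_main_html(archive, transform):
    """Apply transform (str -> str) to the main HTML of a parsed webarchive, in place."""
    main = archive["WebMainResource"]
    # Assumption: the stored encoding name is one Python recognises;
    # fall back to UTF-8 when the key is missing.
    encoding = main.get("WebResourceTextEncodingName") or "utf-8"
    html = main["WebResourceData"].decode(encoding)
    # Re-encode with the same encoding so accented characters survive.
    main["WebResourceData"] = transform(html).encode(encoding)
    return archive


def rewrite_webarchive(src, dst, transform):
    """Read src, transform its main HTML, and write the result to dst."""
    with open(src, "rb") as f:
        archive = plistlib.load(f)
    edit_main_html(archive, transform)
    with open(dst, "wb") as f:
        plistlib.dump(archive, f, fmt=plistlib.FMT_BINARY)
```

For instance, rewrite_webarchive("GiroMilano.webarchive", "edited.webarchive", lambda h: h.replace("old text", "new text")) would write a modified copy and leave the original untouched; "edited.webarchive" and the replaced strings are placeholders.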

Nov 21, 2012 9:59 AM in response to andyBall_uk

As for a raw editor, do you have any suggestions for either Mac or Windows, preferably other than vi or emacs?

If anyone needed the site as-is, your option would of course be good. My need, however, is to use that page to open a number of other pages by programmatically selecting a value in the page's listbox. In practice I need to feed the page a select value, and the page should open the corresponding page as if the value had been selected manually. The problem is that the page currently performs this feat using scripts stored in .axd virtual files, very Windows style, which I have no clue how to use, much less create.

Being able to modify the webarchive would therefore be just the start of the story.


Of course this is more of an issue for a Windows forum, but given that I am a Mac fan, that Apple people are smarter, and that Safari webarchives seem to handle these files, I feel more comfortable raising it here :-)
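
A possible way to select a listbox value programmatically, without touching the .axd scripts, is to reproduce the form submission the browser makes. The Python sketch below assumes the page is an ASP.NET Web Forms page (suggested by the .aspx and .axd names, but not confirmed) that posts its form back with hidden state fields such as __VIEWSTATE. It collects the form fields from the saved HTML and overrides the chosen ones. The field names "line" and "go" are hypothetical; the real names have to be read from the page source, and the server may still reject a replayed submission.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormFields(HTMLParser):
    """Collect name/value pairs from <input> and <select> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._select = None  # name of the <select> currently being parsed

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "input" and a.get("name"):
            # Keep hidden and text inputs; buttons are sent only when clicked.
            if (a.get("type") or "text").lower() in ("hidden", "text"):
                self.fields[a["name"]] = a.get("value") or ""
        elif tag == "select" and a.get("name"):
            self._select = a["name"]
            self.fields.setdefault(self._select, "")
        elif tag == "option" and self._select and "selected" in a:
            self.fields[self._select] = a.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


def build_post_body(html, overrides):
    """Return a URL-encoded form body: the page's own fields plus overrides."""
    parser = FormFields()
    parser.feed(html)
    fields = dict(parser.fields)
    fields.update(overrides)
    return urlencode(fields)
```

The resulting string would be sent as the body of a POST request to the page's own URL, with overrides such as {"line": "2", "go": "Go"} standing in for the selected value and the pushed button.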

Nov 21, 2012 11:24 AM in response to Fabrizio Bartolomuci

http://www.suavetech.com/0xed/0xed.html is quite nice on a Mac, and free; TextWrangler should be more useful, and certainly easier to read.


http://www.nightproductions.net/prefsetter.html will open .webarchive files and show how they're arranged; you just have to add .plist to the name for it to recognise them. I expect other plist editors will do the same.


If your saved webarchive 'remembers' the date and other settings (it seems to?), is it possible to save one for each required value?
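
Since the file opens as a plist, its arrangement can also be listed with a few lines of Python instead of a plist editor. This is a minimal sketch, assuming the usual key names (WebMainResource, WebSubresources, WebSubframeArchives, and per resource WebResourceURL, WebResourceMIMEType, WebResourceData); the function names are invented for the example.

```python
import plistlib


def list_resources(archive):
    """Return (URL, MIME type, size in bytes) for every resource in a parsed webarchive."""
    rows = []
    main = archive.get("WebMainResource")
    for res in ([main] if main else []) + archive.get("WebSubresources", []):
        rows.append((
            res.get("WebResourceURL", ""),
            res.get("WebResourceMIMEType", ""),
            len(res.get("WebResourceData", b"")),
        ))
    # Assumption: frames are stored as nested archives with the same layout.
    for sub in archive.get("WebSubframeArchives", []):
        rows.extend(list_resources(sub))
    return rows


def load_webarchive(path):
    """Parse a .webarchive file (binary or XML plist) into a dict."""
    with open(path, "rb") as f:
        return plistlib.load(f)


# Example use, with the file from the original question:
# for url, mime, size in list_resources(load_webarchive("GiroMilano.webarchive")):
#     print(size, mime, url)
```

The listing shows which scripts, stylesheets and images were stored next to the main page, which is a quick way to see whether an extractor dropped any of them.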

Nov 22, 2012 1:17 AM in response to andyBall_uk

My problem is that I would like the webarchive to shed some light on the inner workings of this Windows technique, which builds a nuclear facility just to open a page when a listbox value is selected! So if most of the things in the webarchive remain hidden, I am afraid not much light will be shed...

What I am still looking for is a way to have that page, or the webarchive, programmatically select one element of its listbox and open the appropriate page on the remote site. The webarchive does exactly that, which raises my expectation that it is doable.
