
Move iWeb MobileMe Blog to Wordpress

I had to migrate my private blog about my two-year-old daughter from Apple's MobileMe servers to a WordPress account.

I wanted to keep all the comments too (Marla's birth...). So I wrote a Perl script to do the job. It is far from being perfect, but it works for me, so it may work for you too. You will have to adjust some things, that's for sure.


Features:

-Keep Comments

-Keep Images (hmm mostly)

-Sets email addresses for comments by name

-Sets tags based on text strings that occur in the title or body

-Free, but my daughter will get every euro you donate! THX!


Note: The script only works while the Apple servers are up. After June 30 all your comments are gone!



http://iweb2wordpress.weinschenks.com/Website/iweb2wordpress.html

Posted on Apr 21, 2012 1:13 PM

43 replies

Jun 25, 2012 11:30 AM in response to Justin Braem

Justin Braem wrote:


@MirkoW First, thanks so much for creating this script... what a huge help.


I've tried using it with some success, but there are a couple of issues I can't seem to resolve. I'm hoping there's an easy answer to my questions.


The blog I'm working with is currently mirrored in two places: The original site is at http://web.me.com/geoffreyg/Cars/Blog/Blog.html and I've recently reconfigured the iWeb project for a new hosting account, so the same content is mirrored at http://lanciainfo.com/Lancia_Info/Blog/Blog.html (sans comments).




I've tried running the script (v1.5) on the me.com site and it always results in the "0 postings" message that other users have experienced, although I'm sure the xml feed exists and is entered properly in the script.

You have the same problem as William McCallum1 (see his post "Re: Move iWeb MobileMe Blog to Wordpress").


The blog-archive.xml points to http://www.lanciainfo.com/

So you have to edit line 187:



# ------------ Find posts, this only works on web.me.com
@posts = ($res->content =~ m/<link>(http:\/\/web.me.com\/.*html)<\/link/g);


I tried this in my local script, but your blog-archive.xml seems to be totally messed up.

The links go to addresses like this:

http://www.lanciainfo.com/Cars/Blog/Entries/2012/5/22_Aurelia_brakes.html


But the page doesn't exist (404). You can verify this by pasting the address into your browser.

The script needs a proper XML file, otherwise it cannot get any content.


I have a solution which may work for you. You have to replace line 187


# ------------ Find posts, this only works on web.me.com
@posts = ($res->content =~ m/<link>(http:\/\/web.me.com\/.*html)<\/link/g);


with


# ------------ Find posts, this only works on web.me.com
my $the_content = $res->content;
$the_content =~ s/www\.lanciainfo\.com/web.me.com\/geoffreyg/gim;
print $the_content if $verbose;
@posts = ($the_content =~ m/<link>(http:\/\/web.me.com\/.*html)<\/link/g);


These lines try to repair the broken links in your XML by rewriting the host name before the posts are extracted.
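To try the rewrite without touching the live feed, the following is a minimal, self-contained sketch. The `$feed` string is a hypothetical stand-in for `$res->content`, and the second entry name is made up for illustration; only the substitution and the extraction regex follow the lines quoted above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $res->content: a feed whose links point at the wrong host.
my $feed = <<'XML';
<item><link>http://www.lanciainfo.com/Cars/Blog/Entries/2012/5/22_Aurelia_brakes.html</link></item>
<item><link>http://www.lanciainfo.com/Cars/Blog/Entries/2012/5/20_Made_up_entry.html</link></item>
XML

# Rewrite the host so the links point at the MobileMe copy again.
(my $the_content = $feed) =~ s/www\.lanciainfo\.com/web.me.com\/geoffreyg/gi;

# Same kind of extraction as in the script. Without the /s modifier the dot
# does not cross newlines, so each <link> line yields one match.
my @posts = ($the_content =~ m/<link>(http:\/\/web\.me\.com\/.*html)<\/link/g);

print scalar(@posts), " posts found\n";
print "$_\n" for @posts;
```

If this prints 0 posts for a real feed, the host name in the substitution probably does not match what the feed actually contains.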


I have better luck if I reconfigure the script and run it on the blog at lanciainfo.com – in this case all posts are successfully exported and I have been able to migrate them to a new wordpress installation at http://blog.lanciainfo.com. However, there are two significant issues: first, the post dates all seem to be reset to the same date, and second, the embedded image links are all broken. I suspect these two problems are connected since WP sorts image files into folders that reflect the date they are posted.


Can you offer any insight as to what's happening or how I might fix these issues?


Thanks again,

Justin


The same date: in the XML I generated locally (with the proposed modifications) everything seems OK.


embedded image links are all broken:

Set

# Add x to month of upload dir; normaly 1
my $add_upload_dir_month = -1;

and experiment with this value.


Hope that helps!

Mirko

Jun 25, 2012 4:32 PM in response to MirkoW

Mirko, thank you so much for all your work on this! I'm having a couple of weird issues. I'm trying to migrate www.jewsandothers.com. I figured out the edit in line 187:

@posts = ($res->content =~ m/<link>(http:\/\/www.jewsandothers.com\/.*html)<\/link/g);


Posts are now being found and processed. But the really weird thing is that in each post, the paragraphs come out in reverse order. Do you have any idea why that might be happening?


Also, some paragraphs—seemingly those that have a link or a <span> tag in them—have their first parts missing completely. They seem to be missing everything up to the beginning of the last link or span tag. The open <span> tag itself is missing, but the text inside is there, and the closing </span> is there in the XML.


In addition, embedded images aren't working. They don't appear to be in the XML file at all. Might that be related to the above issues?


Any help you can provide would be great. Otherwise I'll be doing a lot of copying and pasting in the next few days.


If you want to see the result (having transferred only 5 posts for testing):

http://dev.fuzzylines.com/jaowp/


Thanks!

Fuzzy

Jun 26, 2012 2:40 PM in response to MirkoW

Dear Mirko,


thanks for the great tool! You really put an effort into it and your solution is just plain perfect.


I used the script to generate the .xml for this blog http://web.me.com/aadr.witten/DieAndereSeite/Blog/Blog.html and imported it into WordPress. It works fine except for the last 5 postings. The comments of the last five postings are there, but the actual text is missing (as are the pictures, which is fine as I don't want to reproduce the entire blog layout in WordPress) ... Can you give me a hint as to what we have been doing wrong? This is the WordPress version:

http://urban-seifert.de/NZ/


Greetings from the Ruhr area 🙂


Urban

Jun 26, 2012 3:24 PM in response to linesarefuzzy


linesarefuzzy wrote:


Mirko, thank you so much for all your work on this! I'm having a couple of weird issues. I'm trying to migrate www.jewsandothers.com. I figured out the edit in line 187:

@posts = ($res->content =~ m/<link>(http:\/\/www.jewsandothers.com\/.*html)<\/link/g);


Posts are now being found and processed. But the really weird thing is that in each post, the paragraphs come out in reverse order. Do you have any idea why that might be happening?


Also, some paragraphs—seemingly those that have a link or a <span> tag in them—have their first parts missing completely. They seem to be missing everything up to the beginning of the last link or span tag. The open <span> tag itself is missing, but the text inside is there, and the closing </span> is there in the XML.


In addition, embedded images aren't working. They don't appear to be in the XML file at all. Might that be related to the above issues?


Any help you can provide would be great. Otherwise I'll be doing a lot of copying and pasting in the next few days.


If you want to see the result (having transferred only 5 posts for testing):

http://dev.fuzzylines.com/jaowp/


Thanks!

Fuzzy


Hi Fuzzy,


Reverse order:

Change line 102 from

my $para_attach_to_top = "paragraph_style_2";

to

my $para_attach_to_top = "";


You may want to adjust the line before it too:

my $exclude_para_pattern = "Out of the Box|iWebBlogPrev|iWebBlogNext";

You can exclude "Creative Commons Attribution" and "Fuzzy Lines Design" there.


Some paragraph parts missing:

Yes, this is a known problem with the script, but I can't fix it anymore. The problem is that the source contains something which is not enclosed by <p class="$para_pattern">, so the script cannot find the content. This often happens when images are dropped into paragraphs.

To fix that I would have to rewrite the script with some Readability-style functionality, and I can't do that on short notice. You have to check those postings by hand.


First embedded image is missing:

Try changing line 430 from


$post_paragraph =~ s/$last_link_or_para//gim;

to

#$post_paragraph =~ s/$last_link_or_para//gim;


Hope that helps,

Mirko

Jun 26, 2012 3:35 PM in response to Ruhrstadt

Ruhrstadt wrote:


Dear Mirko,


thanks for the great tool! You really put an effort into it and your solution is just plain perfect.


I used the script to generate the .xml for this blog http://web.me.com/aadr.witten/DieAndereSeite/Blog/Blog.html and imported it into WordPress. It works fine except for the last 5 postings. The comments of the last five postings are there, but the actual text is missing (as are the pictures, which is fine as I don't want to reproduce the entire blog layout in WordPress) ... Can you give me a hint as to what we have been doing wrong? This is the WordPress version:

http://urban-seifert.de/NZ/


Greetings from the Ruhr area 🙂


Urban


Hi Urban,


You have to change the pattern in line 97 from


my $para_pattern = "(paragraph_style.*|.*style_1|Body)";

to

my $para_pattern = "(paragraph_style.*|.*style_1|Body|Free_Form)";


If you have more postings where the content is missing, you have to look in the HTML source for

<p class="Free_Form">blabla</p>

and add the classname to $para_pattern.
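To check whether a class name is covered before re-running the whole export, something like the sketch below may help. The HTML fragment and the `paragraphs` helper are hypothetical, and the regex is a simplified version of the script's paragraph match, not a copy of it.

```perl
use strict;
use warnings;

# Hypothetical page fragment; real iWeb markup will differ.
my $html = <<'HTML';
<p class="paragraph_style_1">First</p>
<p class="Free_Form">blabla</p>
HTML

# Return the text of every <p> whose class attribute matches the given pattern.
sub paragraphs {
    my ($content, $para_pattern) = @_;
    my @found;
    while ($content =~ /<p[^>]*?class="(?:$para_pattern)">(.*?)<\/p>/g) {
        push @found, $1;
    }
    return @found;
}

my @old = paragraphs($html, 'paragraph_style.*|.*style_1|Body');
my @new = paragraphs($html, 'paragraph_style.*|.*style_1|Body|Free_Form');

print "old pattern: @old\n";   # only the paragraph_style_1 text
print "new pattern: @new\n";   # both paragraphs
```

Any class that shows up in the page source but not in the output of the new pattern is a candidate to add to $para_pattern.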


Since your blog uses German dates, you have to uncomment line 111 (after # Month) and comment out line 112.


Hope this gets you further!

Regards,

Mirko

Jun 27, 2012 6:30 AM in response to MirkoW

Thanks for your help. Is there a reason you don't just copy all the html in the post? Why search through for only <p> tags and process them individually? On my blog at least, the content of each post seems to always be contained in a unique div element with class "style". There must be an easy way to capture everything within <div class="style">...</div>, but I'm having trouble figuring it out.


Thanks,

Fuzzy


P.S. If this works out, I will make a donation on your website!

Jun 27, 2012 12:33 PM in response to christinaahmannnevill

christinaahmannnevill wrote:


Hi MirkoW,


Thanks for taking the time to develop the script. I'm having some difficulties running it. Would you mind helping me out?


My http://web.me.com/cahmann/ChristinaAhmann/Updates/rss.xml


Redirects to: www.christinaahmann.com/ChristinaAhmann/Updates/rss.xml


New blog: http://66.240.252.128/~christin/wordpress/

Christina,


I've just sent you an email with a special version for your blog. It doesn't make sense to post all the modifications here.

Just give it a try, there are only 3 (2?) days left.


Best regards,

Mirko

Jun 27, 2012 12:38 PM in response to linesarefuzzy

linesarefuzzy wrote:


Thanks for your help. Is there a reason you don't just copy all the html in the post? Why search through for only <p> tags and process them individually? On my blog at least, the content of each post seems to always be contained in a unique div element with class "style". There must be an easy way to capture everything within <div class="style">...</div>, but I'm having trouble figuring it out.


Thanks,

Fuzzy


P.S. If this works out, I will make a donation on your website!


Sorry, that is too much work with only 3 days left.


Maybe you will have success with this idea:


line 319

my @para = ($res->content =~ /<p.*?class="$para_pattern">(.*?)<\/p>|(<img.*?src=".*?png\".*?>)|(<img.*?src=".*?jpg\".*?>)/g);



into


my $the_content= $res->content;
$the_content =~ s/<\/*span.*?>//gim;
my @para = ($the_content =~ /<p.*?class="$para_pattern">(.*?)<\/p>|(<img.*?src=".*?png\".*?>)|(<img.*?src=".*?jpg\".*?>)/g);
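One plausible explanation for why stripping the span tags helps, shown as a small sketch: the greedy `.*` inside $para_pattern can run past the end of the class attribute and stop at the `">` of a nested span, which cuts off the start of the paragraph. The sample paragraph is hypothetical, and the match is reduced to the `<p>` part of the script's regex.

```perl
use strict;
use warnings;

my $para_pattern = "(paragraph_style.*|.*style_1|Body)";

# Hypothetical paragraph with a nested <span>.
my $html = '<p class="paragraph_style_1">Hello <span class="style_2">world</span>!</p>';

# Without stripping: the greedy .* swallows everything up to the span's closing ">.
my (undef, $before) = $html =~ /<p.*?class="$para_pattern">(.*?)<\/p>/;

# With the span tags removed first, as in the modified lines above.
(my $the_content = $html) =~ s/<\/*span.*?>//gi;
my (undef, $after) = $the_content =~ /<p.*?class="$para_pattern">(.*?)<\/p>/;

print "before: $before\n";   # before: world</span>!
print "after:  $after\n";    # after:  Hello world!
```

The same truncation can still occur with other nested tags that carry attributes, such as links, so those postings may still need a manual check.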



best regards,

Mirko
