File conversion to Word

I have hundreds/thousands of files that are listed by my MacBook as "Document", or, more ridiculously "Unix Executable File." These were all once upon a time (5-6 years ago), Word doc or docx documents. I can mass-rename them with a docx extension and access them that way, but they are not content-searchable. I can open them in Word with the recover text option, but that's going to take forever to do all those files one by one.


I have downloaded two apps from the App Store, both of which cost money and can't be tested without paying for them. I tried a bulk conversion with each, and neither worked for these files. Both are pending refund right now: The Document Converter, and Document Converter by RootRise.


Is there a way to bulk convert such files to docx in a way that makes them valid Word files that are content-searchable in Spotlight?


MacBook Pro 13 inch 2020, Big Sur 11.5




Posted on Aug 1, 2021 7:13 AM

8 replies

Aug 1, 2021 11:14 AM in response to masqthephlsphr

I probably have a script here that will prompt you for the parent folder containing your DOCX and DOC documents and, using the command-line tool from an installed LibreOffice v7.1.5, loop through the DOC files and convert them to DOCX in the same folder or a designated output folder. I do not see any need to convert DOCX to DOCX again. The DOC files are probably the priority and should be handled before dealing with the files bearing UNIX Executable icons.


I converted a 1995 and a 2003 DOC file to DOCX using the approach in the first paragraph, and Spotlight easily found search terms in the resulting DOCX files.
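For anyone following along, here is a minimal sketch of what a LibreOffice batch conversion could look like. It is not the script mentioned above. The soffice path is an assumed default install location, and convert_doc_folder is a hypothetical helper name; the --headless, --convert-to, and --outdir options are LibreOffice's own.

```shell
#!/bin/sh
# Sketch only: batch-convert legacy .doc files to .docx with LibreOffice.
# SOFFICE is an assumed install location; override it if yours differs.
SOFFICE="${SOFFICE:-/Applications/LibreOffice.app/Contents/MacOS/soffice}"

# convert_doc_folder SRC_DIR OUT_DIR  (hypothetical helper name)
convert_doc_folder() {
    src=$1
    out=$2
    mkdir -p "$out" || return 1
    for f in "$src"/*.doc; do
        # Skip the unexpanded pattern when the folder has no .doc files.
        [ -e "$f" ] || continue
        "$SOFFICE" --headless --convert-to docx --outdir "$out" "$f"
    done
}

# Example: convert_doc_folder "$HOME/Documents/old" "$HOME/Documents/converted"
```

Writing to a separate output folder keeps the originals untouched, so a failed conversion costs nothing. Try it on a copy of a few files first.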


In the Finder, locate one of your UNIX Executable Files, press the option key, right-click on it and select the secondary menu item Copy "..." as Pathname. This puts the full UNIX path on the clipboard.


Now, launch the Terminal application and at the command-line prompt, enter mdls followed by a space, and then paste the UNIX path from the clipboard. What is the entry for the kMDItemContentType?


mdls "$(pbpaste)"


Word DOCX documents have a content type of "org.openxmlformats.wordprocessingml.document" and Word DOC files have the string "com.microsoft.word.doc". If you get anything else or a dyn.... entry, it's anyone's guess what these files were originally.
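To run that check across a whole folder instead of one file at a time, something along these lines might work. It is a rough sketch: label_for_uti and report_types are hypothetical helper names, and report_types needs macOS because it calls mdls.

```shell
#!/bin/sh
# Sketch only: label a Spotlight content type (UTI) as described above.
# label_for_uti is a hypothetical helper name, not a system command.
label_for_uti() {
    case $1 in
        org.openxmlformats.wordprocessingml.document) echo "docx" ;;
        com.microsoft.word.doc)                       echo "doc" ;;
        dyn.*)                                        echo "unknown (dynamic type)" ;;
        *)                                            echo "other" ;;
    esac
}

# report_types DIR: print a label, the raw type, and the path for each file.
# Requires macOS, since it calls mdls.
report_types() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        uti=$(mdls -raw -name kMDItemContentType "$f")
        printf '%s\t%s\t%s\n' "$(label_for_uti "$uti")" "$uti" "$f"
    done
}

# Example: report_types "$HOME/Documents/old"
```

The output is tab-separated, so it can be sorted or filtered on the first column to pull out the files that are not recognized as Word documents.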

Aug 1, 2021 7:41 AM in response to Barney-15E

I'm pretty sure these files, created in 2015 and 2016, were docx, not doc. So I don't think that's the issue. They came over from my old computer, which was a 2011 MacBook that never got past Sierra.


What search terms do you suggest for finding the right Word script? Because all the Google searches I do seem to think I want to convert to PDF, which... no. I just want to convert Word files not recognized as such to Word docx.

Aug 1, 2021 7:36 AM in response to masqthephlsphr

Apple removed all the old file recognition they used in favor of the file extension method (and metadata). The Unix Executable File has no extension and no metadata, so that is the closest thing it can identify for file type.


If they were doc files and you made them docx, the text importer for spotlight may not recognize them.

It is also possible that Microsoft did not make a text importer for doc files (I don't have any to check).

If you rename them .doc, they should open in compatibility mode in Word, where you can save them as docx. However, that doesn't resolve the main problem of having too many to convert that way.


As another avenue for finding a converter, Word is scriptable, so you may be able to find a script that would open and re-save all of the files in the "new" Office Open XML format (docx).

Aug 2, 2021 8:56 AM in response to VikingOSX

I think that, with the exception of files truly created prior to 2003, most of the problems I encountered after simply adding the docx extension came down to the need to reindex Spotlight. So I am going through the files folder by folder, identifying files that are not listed as Kind=Microsoft Word docx, adding the docx extension, then reindexing and checking whether a content search works. So far that seems to be working, but I am going to bookmark this discussion in case I run into a scenario that stays stubborn in bulk.
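A rough sketch of how that folder-by-folder renaming could be automated in Terminal. It assumes the files really are docx, which it checks by looking for the ZIP signature that every docx file starts with. add_docx_extension is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: append .docx to extensionless files that start with the ZIP
# signature 50 4B 03 04 (docx is a ZIP container). add_docx_extension is a
# hypothetical helper name.
add_docx_extension() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        case ${f##*/} in
            *.*) continue ;;   # name already contains an extension
        esac
        # Read the first four bytes as hex and compare with the ZIP magic.
        sig=$(head -c 4 "$f" | od -An -tx1 | tr -d ' \n')
        [ "$sig" = "504b0304" ] || continue
        # Do not overwrite an existing file of the same name.
        [ -e "$f.docx" ] || mv "$f" "$f.docx"
    done
}

# Example: add_docx_extension "$HOME/Documents/old"
```

Any ZIP archive passes this check, not only Word documents, so run it on a copy of one folder first. Afterwards, running mdimport on the folder is one way to ask Spotlight to re-import the renamed files.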

Aug 2, 2021 10:40 AM in response to masqthephlsphr

Kind of glossed over the fact that you need to remember that third paragraph from my original post:


In the Finder, locate one of your UNIX Executable Files, press the option key, right-click on it and select the secondary menu item Copy "..." as Pathname. This puts the full UNIX path on the clipboard.


The UNIX pbpaste command would then place the full UNIX path from the clipboard onto the command-line permitting the xxd result.


Be careful adding extensions. Perform a Get Info (option+command+i) on the apparent extensionless file to see if its real extension is simply hidden ([√] Hide extension) by the Finder. If this is unset on that panel, then use the xxd syntax above to see if it is actually a Word document.
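The xxd command referred to here is not reproduced in this thread. As a stand-in, here is a sketch of a signature check that uses od instead of xxd. It relies on two well-known file signatures: docx files are ZIP containers that begin with the bytes 50 4B 03 04, and legacy doc files are OLE compound files that begin with D0 CF 11 E0 A1 B1 1A E1. word_signature is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: guess a file's container format from its first bytes.
# word_signature is a hypothetical helper name.
word_signature() {
    # Hex of the first 8 bytes, e.g. "504b0304..." for a ZIP container.
    sig=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
    case $sig in
        504b0304*)        echo "zip (possibly docx)" ;;
        d0cf11e0a1b11ae1) echo "ole (possibly legacy doc)" ;;
        *)                echo "unknown" ;;
    esac
}

# Example: word_signature "$(pbpaste)"
```

A match only identifies the container, so "possibly" is deliberate: other ZIP archives and other OLE files (old Excel or PowerPoint, for example) carry the same leading bytes.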


You can tell Spotlight to find all Word documents (both doc and docx) with extensions (not UNIX executable) via:

kind:word


