problem with import of XML Apple HealthKit Export Version: 12

For several years I have been exporting my Apple Health data to XML and then importing it into R for analysis. I just upgraded to iOS 16.0. I'm using the same code I used in August, but now I get the following error:


read_xml.character("~/Downloads/apple_health_export/export.xml") :  
ATTLIST: no name for Attribute [68]


Given that I just did the upgrade I thought I'd see if anyone else was having a new problem with the export. I've tried a couple of different R packages to do the import with no luck.


It's HealthKit Export Version: 12


My export.xml is now 1.7 GB so it makes it awkward to try to check it out with a text editor.
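
One way around the file size, as a minimal sketch: ATTLIST is a DTD declaration, so the part worth looking at is probably the document type definition at the top of the file rather than the 1.7 GB of records. Assuming the DTD is embedded in export.xml and is closed by a line containing only "]>", the Python below reads just that part without loading the rest. The function name `read_prolog` and the 1000-line cap are my own choices for illustration.

```python
from itertools import islice


def read_prolog(path, max_lines=1000):
    """Return the lines from the top of the file up to and including
    the line that closes the internal DTD ("]>").

    Assumption: the closing "]>" sits on a line of its own. If it is
    not found, the first max_lines lines are returned instead.
    """
    prolog = []
    with open(path, encoding="utf-8", errors="replace") as f:
        # islice stops early, so the bulk of the file is never read.
        for line in islice(f, max_lines):
            prolog.append(line)
            if line.strip() == "]>":
                break
    return prolog


# Example use (path taken from the error message above):
# import os
# for line in read_prolog(os.path.expanduser(
#         "~/Downloads/apple_health_export/export.xml")):
#     print(line, end="")
```

The result is small enough to paste into any text editor or to save as a separate file for comparison between exports.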


Any suggestions?


I've done quite a bit of work with the XML export previously. Here are some blog posts on that topic. I'm hoping I'm not alone having a new problem with this version. I just did the update this afternoon, so I suppose my next step is to try the most sophisticated technique I know of and turn my phone off and turn it back on and hope for something different.

iPhone 13 Pro

Posted on Sep 16, 2022 4:56 PM

Question marked as Best reply

Posted on Sep 18, 2022 1:45 PM

OK. I finally upgraded to iOS 16. I saved a copy of my iOS 15 Health Data for comparison.


I usually try to give Apple the benefit of the doubt. But in this case, Apple completely screwed this up. Not even close.


First of all, Apple is still using DTD files. That's surprising. Plus, it is embedded rather than referencing a URL. That's unusual. Most importantly, Apple scrambled the syntax so it won't ever work. Note to Apple, use a schema instead and do it correctly. No other way to say it.


Edit: in addition to the above, Apple flat-out did the DTD incorrectly. I fixed the scrambled syntax and was able to get "xmllint" happy, but it still failed validation. So I went ahead and fixed the DTD too.


Next, Apple scrambled the output of some elements, repeating the "startDate" attribute twice. The standard rule of thumb is to use a proper XML-processing library to generate XML for output. But those old XML libraries are cryptic and difficult to use. And sometimes, vendors who provide XML-processing libraries release buggy ones (that dig's for you, too, Apple 😄 ). So often, developers take shortcuts and just spit out text that they hope will be valid XML. Full disclosure, I do this myself. But if you roll your own, you've got to do it correctly. Apple failed at that.
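
To illustrate the repeated-attribute problem, here is a rough Python sketch that keeps the first occurrence of an attribute on a line and drops any later ones. It is only a sketch under assumptions: it assumes each element sits on a single line, that attribute values use double quotes, and that the repeated values are interchangeable. The function name `drop_repeated_attribute` is made up for this example.

```python
import re


def drop_repeated_attribute(line, name="startDate"):
    """Keep the first name="..." on the line and remove later repeats.

    Assumption: one element per line, double-quoted attribute values.
    """
    # Require leading whitespace so e.g. xstartDate="..." is not matched.
    pattern = re.compile(r'\s' + re.escape(name) + r'="[^"]*"')
    seen = False

    def keep_first(match):
        nonlocal seen
        if seen:
            return ""  # drop every occurrence after the first
        seen = True
        return match.group(0)

    return pattern.sub(keep_first, line)


# Example use on a whole export, streaming line by line:
# with open("export.xml", encoding="utf-8") as src, \
#         open("export-dedup.xml", "w", encoding="utf-8") as dst:
#     for line in src:
#         dst.write(drop_repeated_attribute(line))
```

Working line by line keeps memory use flat, which matters for an export this large.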


There are two distinct fixes you will have to apply. I will write this out as a longer sequence of steps just to make things crystal clear.

1) Export your health data. I saved mine to my iCloud Drive. You'll definitely need a Mac to fix this.

2) Go into your iCloud Drive, or wherever you saved the file, and expand the "export.zip" file. It will be expanded into a folder named "apple_health_export". For ease of description, I copied my file to "/tmp" and expanded it there.

3) In the Terminal, navigate to that folder using "cd /tmp/apple_health_export". (Not including quotes)

4) Create a new file called "patch.txt" with the following content:

--- export.xml	2022-09-18 15:17:09.000000000 -0400
+++ export-fixed.xml	2022-09-18 16:37:08.000000000 -0400
@@ -15,6 +15,7 @@
   HKCharacteristicTypeIdentifierBiologicalSex       CDATA #REQUIRED
   HKCharacteristicTypeIdentifierBloodType           CDATA #REQUIRED
   HKCharacteristicTypeIdentifierFitzpatrickSkinType CDATA #REQUIRED
+  HKCharacteristicTypeIdentifierCardioFitnessMedicationsUse CDATA #IMPLIED
 >
 <!ELEMENT Record ((MetadataEntry|HeartRateVariabilityMetadataList)*)>
 <!ATTLIST Record
@@ -39,7 +40,7 @@
   startDate     CDATA #REQUIRED
   endDate       CDATA #REQUIRED
 >
-<!ELEMENT Workout ((MetadataEntry|WorkoutEvent|WorkoutRoute)*)>
+<!ELEMENT Workout ((MetadataEntry|WorkoutEvent|WorkoutRoute|WorkoutStatistics)*)>
 <!ATTLIST Workout
   workoutActivityType   CDATA #REQUIRED
   duration              CDATA #IMPLIED
@@ -63,7 +64,7 @@
   duration             CDATA #IMPLIED
   durationUnit         CDATA #IMPLIED
 >
-<!ELEMENT WorkoutEvent EMPTY>
+<!ELEMENT WorkoutEvent (MetadataEntry?)>
 <!ATTLIST WorkoutEvent
   type                 CDATA #REQUIRED
   date                 CDATA #REQUIRED
@@ -79,6 +80,7 @@
   minimum              CDATA #IMPLIED
   maximum              CDATA #IMPLIED
   sum                  CDATA #IMPLIED
+  unit                 CDATA #IMPLIED
 >
 <!ELEMENT WorkoutRoute ((MetadataEntry|FileReference)*)>
 <!ATTLIST WorkoutRoute
@@ -153,6 +155,7 @@
   dateIssued       CDATA #REQUIRED
   expirationDate   CDATA #REQUIRED
   brand            CDATA #IMPLIED
+>
 <!ELEMENT RightEye EMPTY>
 <!ATTLIST RightEye
   sphere           CDATA #IMPLIED
@@ -203,13 +206,6 @@
   diameter         CDATA #IMPLIED
   diameterUnit     CDATA #IMPLIED
 >
-  device           CDATA #IMPLIED
-<!ELEMENT MetadataEntry EMPTY>
-<!ATTLIST MetadataEntry
-  key              CDATA #IMPLIED
-  value            CDATA #IMPLIED
->
->
 ]>
 <HealthData>
  <ExportDate/>

Ignore the colours. That is just something the forum software adds to a code block.

5) Run the following command to patch your XML export file: "patch < patch.txt". (Again, omit the quotes.) The output should look like this:

patching file export.xml
Hunk #6 succeeded at 206 with fuzz 2.

Don't worry about that hunk #6 warning. I had to manually edit the patch file because the original diff included two lines of my own data, which would be different in your export. Luckily, patch handles that gracefully.

All this command does is replace the DTD data with something correct.
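
If you want a quick check that the patched file now parses, here is a minimal Python sketch using the standard library's expat parser. It checks well-formedness only and does not validate against the DTD. It streams the file, so the size should not be a problem. The function name `find_xml_error` is hypothetical.

```python
import xml.parsers.expat


def find_xml_error(path):
    """Return None if the file parses, else a string describing the
    first error the parser reports.

    This is a well-formedness check only, not DTD validation.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # ParseFile reads in chunks, so no tree is built in memory.
        with open(path, "rb") as f:
            parser.ParseFile(f)
    except xml.parsers.expat.ExpatError as err:
        return f"line {err.lineno}, column {err.offset}: {err}"
    return None


# Example use:
# problem = find_xml_error("/tmp/apple_health_export/export.xml")
# print(problem or "parsed without errors")
```

The parser stops at the first problem it finds, so after fixing it you may need to run the check again to find the next one.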


35 replies

Oct 29, 2022 7:27 AM in response to svraka

svraka wrote:

Thanks etresoft!

For my export today on iOS 16.1 it was enough to apply your patch. There weren't any duplicate `startDate`s and the patch didn't throw any warnings either. The resulting XML is still not valid but I could process it the same way I did on iOS 15. Maybe Apple fixed a few things in iOS 16.1?


Try this export on 16.1 without the repair script. That would be the test to see if this was fixed.

Oct 30, 2022 8:33 AM in response to svraka

Updated to iOS 16.1 on my iPhone yesterday. Tried a new export of health data to XML. Had three errors in the initial (data definition) section of export.xml. BUT ... no more duplicate "startDate" attributes detected :-)

Based on my (and etresoft's) experience, the duplicate "startDate" error must have been fixed.


The errors in the front part were misplaced or duplicate closing brackets (">") around lines 156, 207 and 212, which prevented import and parsing with the Python 'etree' libs. I fixed the obvious errors manually.
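
For anyone who wants to look at those spots without opening the whole file in an editor, here is a small Python sketch that prints a few lines around a given line number. The line numbers are the ones from my export and may differ in yours. The function name `lines_around` is made up for illustration.

```python
from itertools import islice


def lines_around(path, lineno, context=3):
    """Return (line_number, text) pairs for the lines within
    `context` lines of `lineno` (1-based)."""
    first = max(1, lineno - context)
    last = lineno + context
    with open(path, encoding="utf-8", errors="replace") as f:
        # islice is 0-based and stops early, so only the top of the
        # file is read.
        window = islice(f, first - 1, last)
        return [(n, line.rstrip("\n"))
                for n, line in enumerate(window, start=first)]


# Example use:
# for n in (156, 207, 212):
#     for number, text in lines_around("export.xml", n):
#         print(f"{number:5d}  {text}")
#     print()
```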


Have not actually validated any of the data yet.
