Need to assistance unpacking a string

Hi happy weekend...

I'm still not fluent yet, brain still rewiring... but admiring scripts


So this project is one of a series of conversion, porting over scripts from google to apple. Reason being I like to slowing do it all in house in apple, to minimize export data to google processing and porting back results; besides I had to break it into chunks for google since it has a lower character limit than Apple Numbers.


Phase I

GOALS

  1. unpack Sheet 1 cell A1 string (or perhaps the entire column not limited to A1) in chunks of 3 to populate Sheet 2
  2. Count each occurrence of each group of three digit city codes LAX RDU YYZ e.t.c
  3. Lastly rank them in Sheet 3 account most repeated cities and the number first five ranks and the least repeat five ranks and count of occurrence. You should end up with ten cities


Phase II (not expecting to get this far due to complexity, I will be happy with the first phase )

Is to apply permutation to catch the occurrence disregarding the exact order. Reason to catch possible misspells. certain city share same letters only rearranged e.g ORD DOR JFK FKJ CDG LHR RHL

  1. Count occurrence using new criteria
  2. rank them
  3. save ranks to Sheet 4


That's basically the original js script did. I guess the main thing I would be loosing is the side bar with the menu that displays the results in addition to writing to the sheet.


My attempt not working, & incomplete too; looks like i'm far from setting it up. I've been tinkering a bit with it though.


-- Get the active sheet
tell front document of application "Numbers"
	set sourceTable to table "Table 1" of sheet "Sheet 1"
	set destTable to table "Table 1" of sheet "Sheet 2"
	set rankedData to table "Table 1" of "Sheet 3"
	-- Get the input string from cell A1 of the active sheet
	set inputStr to "A1" of table "Table 1" of sheet "Sheet 1"
	
	-- Unpack the string into 3 digits until exhausted
	set digits to {}
	repeat with i from 1 to the length of inputStr by 3
		set end of digits to text i thru (i + 2) of inputStr
	end repeat
	
	-- Write the digits to the new sheet
	tell table 1 of sheet 2
		repeat with i from 1 to the length of digits
			make new row at end with properties {values:{item i of digits}}
		end repeat
	end tell
	
end tell

My blue print


Sample of my google result


thanks as always, open to suggestions and what you have e.t.c

MacBook Air

Posted on Oct 29, 2023 8:35 PM

Reply
Question marked as Top-ranking reply

Posted on Oct 31, 2023 9:55 AM

I misunderstood from your original description that the content of the cell would be something like

LAX RDU YYZ e.t.c

but in reality it's

LAXRDUYYZ e.t.c


So I had erroneously changed the script from

repeat with i from 1 to the length of inputStr by 3

to

repeat with i from 1 to the length of inputStr by 4


Also some components like 0E8 are converted by Numbers to 0E+00, i.e. 0 x 10^0. To counteract that formatting the cell as text keeps the original value 0E8.


I was actually wondering if Sheet 2 was needed at all. Maybe you had put it just to count the occurrences of each code, but the script counts them so maybe that part could be avoided if that's the case.


So with all these musings I suggest the following version. See in next message.

Similar questions

21 replies
Question marked as Top-ranking reply

Oct 31, 2023 9:55 AM in response to Stvylife

I misunderstood from your original description that the content of the cell would be something like

LAX RDU YYZ e.t.c

but in reality it's

LAXRDUYYZ e.t.c


So I had erroneously changed the script from

repeat with i from 1 to the length of inputStr by 3

to

repeat with i from 1 to the length of inputStr by 4


Also some components like 0E8 are converted by Numbers to 0E+00, i.e. 0 x 10^0. To counteract that formatting the cell as text keeps the original value 0E8.


I was actually wondering if Sheet 2 was needed at all. Maybe you had put it just to count the occurrences of each code, but the script counts them so maybe that part could be avoided if that's the case.


So with all these musings I suggest the following version. See in next message.

Nov 1, 2023 9:59 PM in response to Recycleur

I confess I may not understand the aim here, and why a script is the preferred way (maybe hundreds of strings to uppack?), but here is a simple way to unpack the string and find the most commonly recurring values:





Table 2:


A2: input 1


A3, and filled down (drag down enlarge table until no value in B, in the example to row 210):


=$A2+3


B2:


=MID(Table 1::$A$2,$A2,3)


Then apply 'is distinct' filter on column B as in the screenshot.


Create a Table 3.


Select B and C of Table 2, and command-c to copy.


Click once in upper left cell in Table 3, the Edit > Paste Formula Results.


Sort descending on column B.


You can also skip creating a filter on Table 2 and get the sorted results quickly using a Pivot Table, like this:




Click in Table 2, from the menu choose Organize > Create Pivot Table > On Current Sheet.


Drag fields as shown in screenshot, change 'Count (Sum)' to 'Count (Average)' by clicking the circled i.


Sort descending on B.


SG

Oct 30, 2023 5:43 PM in response to Stvylife

PART 2


sort(codeList, 1, -1, {comparer:codeCount_ascending})

set leastUsedCodes to items 1 thru 5 of codeList
set mostUsedCodes to reverse of (items -5 thru -1 of codeList)

tell front document of application "Numbers"
	
	tell rankedData
		if row count < 11 then set row count to 11
		
		repeat with i from 1 to count of mostUsedCodes
			set value of cell (i + 1) of column 2 to (code of item i of mostUsedCodes)
			set value of cell (i + 1) of column 3 to (codeCount of item i of mostUsedCodes)
		end repeat
		
		repeat with i from 1 to count of leastUsedCodes
			set value of cell (i + 6) of column 2 to (code of item i of leastUsedCodes)
			set value of cell (i + 6) of column 3 to (codeCount of item i of leastUsedCodes)
		end repeat
	end tell
	
end tell


on addToList(theList, theItem)
	set foundCode to false
	
	repeat with anItem in theList
		if code of anItem is theItem then
			set anItem's codeCount to (anItem's codeCount) + 1
			set foundCode to true
		end if
	end repeat
	
	if not foundCode then
		set end of theList to {code:theItem, codeCount:1}
	end if
end addToList


script codeCount_ascending
	-- Compare two records by codeCount property.
	on isGreater(a, b)
		(a's codeCount > b's codeCount)
	end isGreater
end script


continued...

Oct 31, 2023 10:03 AM in response to Stvylife

middle of PART 1, version 2


-- keep original part here
	
	repeat with aCell in sourceTable's column 1's cells
		
		set inputStr to value of aCell
		
		if inputStr is not missing value then
			
			-- Unpack the string into 3 digits until exhausted
			
			set digits to {}
			repeat with i from 1 to length of inputStr by 3
				try -- in case the number of characters is not a multiple of 3
					-- if using sheet 2
					--set end of digits to text i thru (i + 2) of inputStr
					
					-- if not using sheet 2
					set aCode to text i thru (i + 2) of inputStr
					my addToList(codeList, aCode)
				end try
			end repeat
			
			(*
			-- not needed if going directly from Sheet 1 to Sheet 3
			-- Write the digits to the new sheet
			tell destTable
				set newRow to add row below last row
				set countDigits to count of digits
				if column count < countDigits then set column count to countDigits
				repeat with i from 1 to countDigits
					set format of newRow's cell i to "text" -- new line to take care of codes like 0E8
					set value of newRow's cell i to item i of digits
					my addToList(codeList, value of newRow's cell i)
				end repeat
			end tell
			*)
		end if
		
	end repeat
	
-- keeep original part here


Nov 1, 2023 9:15 AM in response to Stvylife

I can assure you that my list is the correct one based on the 627-character string you posted. I tested it four different ways, with Numbers Find, with the script, with COUNTMATCHES on the full string and with COUNTIF on the separated parts (on Sheet 2, reactivated part of the script). Note: COUNTMATCHES gives a higher result for AAA because there are a lot of them grouped together).


It looks like you haven't put or called the CustomQuicksort part yet. That's how it would look with it.



0E8 has 12 instances, hence it doesn't make the top-5.


Little improvement for speed to the addToList handler.


on addToList(theList, theItem)
	-- version 3
	repeat with anItem in theList
		if code of anItem is theItem then
			set anItem's codeCount to (anItem's codeCount) + 1
			return
		end if
	end repeat
	
	set end of theList to {code:theItem, codeCount:1}
end addToList


Finally, don't worry, ORD and DOR will be counted separately, they are different strings.



Oct 30, 2023 5:42 PM in response to Stvylife

Start with that.


PART 1


-- Get the active sheet
tell front document of application "Numbers"
	set sourceTable to table "Table 1" of sheet "Sheet 1" -- if you define an object use it, see in code
	set destTable to table "Table 1" of sheet "Sheet 2"
	set rankedData to table "Table 1" of sheet "Sheet 3"
	set codeList to {}
	
	set destTable's row count to 1
	
	repeat with aCell in sourceTable's column 1's cells
		
		set inputStr to value of aCell
		
		if inputStr is not missing value then
			
			-- Unpack the string into 3 digits until exhausted
			set digits to {}
			repeat with i from 1 to length of inputStr by 4
				set end of digits to text i thru (i + 2) of inputStr
			end repeat
			
			-- Write the digits to the new sheet
			tell destTable
				set newRow to add row below last row
				set countDigits to count of digits
				if column count < countDigits then set column count to countDigits
				repeat with i from 1 to countDigits
					set value of newRow's cell i to item i of digits
					my addToList(codeList, value of newRow's cell i)
				end repeat
			end tell
			
		end if
		
	end repeat
	
end tell


continued...

Oct 30, 2023 5:40 PM in response to Stvylife

PART 3


-- https://mac.softpedia.com/get/Utilities/A-Dose-of-Sorts.shtml
(* Quicksort — customisable version
Algorithm: S.A.R. (Tony) Hoare, 1960.
AppleScript implementation: Arthur J. Knapp and Nigel Garvey, 2003.
Very minor modifications by Nigel Garvey 2007 & 2010.
NG's version of the customisation, rethought 2010.

Parameters:
New: (list, range index 1, range index 2, record with optional 'comparer' and 'slave' properties). The 'comparer' value is a script object containing an isGreater(a, b) handler which determines if object a is "greater" than object b. The 'slave' value is a script object containing a swap(a, b) handler, which is called when items a and b of the list being sorted are swapped, and a shift(a, b) handler, which is called when the insertion sort moves item b of the list being sorted to position a, the intervening items being shifted up one position in the process.
Or, for compatibility with the customisation for the older version of CustomQsort: (list, range index 1, range index 2, script object containing the isLess(a, b), swap(a, b), and shift(a, b) handlers mentioned above)
Where the 'comparer' or 'slave' properties are omitted, or the customisation parameter isn't a record or a script object, the sort has default handlers which respectively compare items directly and do nothing.
*)

on CustomQuicksort(theList, l, r, customiser)
	script o
		property comparer : me
		property slave : me
		
		property lst : theList
		
		on qsrt(l, r)
			-- Get an item from the middle of the range to use as a pivot value.
			set pivot to my lst's item ((l + r) div 2)
			
			-- Traverse the range from both ends, swapping items from the low end whose values are greater than or equal to the pivot with items from the high end whose values are less than or equal to it. Stop when the traversals cross.
			set i to l
			set j to r
			repeat until (i > j)
				-- Search from the low end for the next greater-or-equal value.
				set u to my lst's item i
				repeat while (comparer's isGreater(pivot, u))
					set i to i + 1
					set u to my lst's item i
				end repeat
				


continued...

Oct 31, 2023 10:06 AM in response to Stvylife

PART 2, version 2


sort(codeList, 1, -1, {comparer:codeCount_ascending})

set leastUsedCodes to items 1 thru 5 of codeList
set mostUsedCodes to reverse of (items -5 thru -1 of codeList)

tell front document of application "Numbers"
	
	tell rankedData
		if row count < 11 then set row count to 11
		set format of columns 2 thru 3 to "text" -- new line to take care of codes like 0E8
		
		repeat with i from 1 to count of mostUsedCodes
			set value of cell (i + 1) of column 2 to (code of item i of mostUsedCodes)
			set value of cell (i + 1) of column 3 to (codeCount of item i of mostUsedCodes)
		end repeat
		
		repeat with i from 1 to count of leastUsedCodes
			set value of cell (i + 6) of column 2 to (code of item i of leastUsedCodes)
			set value of cell (i + 6) of column 3 to (codeCount of item i of leastUsedCodes)
		end repeat
	end tell
	
end tell

-- keep original part here


Version 2 updates complete

Oct 30, 2023 5:41 PM in response to Stvylife

PART 4


				-- Search from the high end for the next lesser-or-equal value.
				set w to my lst's item j
				repeat while (comparer's isGreater(w, pivot))
					set j to j - 1
					set w to my lst's item j
				end repeat
				
				if (i > j) then
				else
					-- Swap the two values (even if they're the same!) and continue.
					set my lst's item i to w
					set my lst's item j to u
					
					-- Tell the slave object this exchange has occurred.
					slave's swap(i, j)
					
					set i to i + 1
					set j to j - 1
				end if
			end repeat
			
			-- At this point, items l thru j are all greater than or equal to the pivot and items i thru r are all greater than or equal to it. There may be an item BETWEEN j and i, which will be an instance of the pivot in exactly the right place. This can be ignored.
			
			-- Recursively apply the process to the lesser-or-equal items and then to the greater-or-equal items. When both recursions have returned, the range will be fully sorted.
			if (j > l) then qsrt(l, j)
			if (i < r) then qsrt(i, r)
		end qsrt
		
		-- Default comparison and slave handlers for an ordinary, qSort-style sort.
		on isGreater(a, b)
			(a > b)
		end isGreater
		
		on swap(a, b)
		end swap
	end script
	


continued...

Oct 30, 2023 5:42 PM in response to Stvylife

PART 5


	-- Process the input parmeters.
	set listLen to (count theList)
	if (listLen > 1) then
		-- Negative and/or transposed range indices.
		if (l < 0) then set l to listLen + l + 1
		if (r < 0) then set r to listLen + r + 1
		if (l > r) then set {l, r} to {r, l}
		
		-- The customisation parameter for the earlier version of CustomQsort was a script object containing both comparison and slave handlers. Accept that, if presented; otherwise expect a record containing comparer and/or slave properties, whose values are script objects containing the comparison and slave handlers respectively. If either property is omitted from the record, or if the parameter isn't a script object or a record, use the appropriate default handers in the script object above.
		if (customiser's class is script) then
			try -- Use the customising script's isLess handler, if it has one.
				customiser's isLess
				set o's comparer to customiser
			end try
			try -- Use the customising script's slave handlers, if it has them.
				customiser's swap
				customiser's shift
				set o's slave to customiser
			end try
		else if (customiser's class is record) then
			set {comparer:o's comparer, slave:o's slave} to customiser & {comparer:o, slave:o}
		end if
		
		-- Do the sort.
		o's qsrt(l, r)
	end if
	
	return -- nothing.
end CustomQuicksort

property sort : CustomQuicksort



** COMPLETE **

Oct 31, 2023 8:48 AM in response to Recycleur

getting some errors from PART 1


Do we need to set a count for length of A1 perhaps that's what not carring through. The string in quote was echoed from A1, that's what I have in cell A1 Sheet 1


--error "Numbers got an error: Invalid row count" number -10000


--error "Can’t get text 41 thru 43 of \"Iahmsyhnlbna0A80B80C10D90E80F20G70H10I20J4\"." number -1728 from text 41 thru 43 of "Iahmsyhnlbna0A80B80C10D90E80F20G70H10I20J4"




Nov 1, 2023 8:34 AM in response to Recycleur

I apologize for not mentioning that the standard is three letters and that there were no delimiters. The code was a product of concatenation and as such, it had no splices. Normally, I would have been able to share the document, but I was in my own headspace. You’re right, 0E8 (represents CROWNPOINT, NM), which does throw a curveball. The original Sheet 2 was meant to count and troubleshoot the initial sample. If we were doing it manually with a formula, it might have been useful; however, it would be of no use at this point, especially on a large data sample. I appreciate the suggestion, and sure did run the updated version last night, and it sure went straight to Sheet 3 niece! beautiful indeed


I like the looks of the script, it might generally look neater than js version.


Anyway, I am try now to string rest of the parts together to test out the second phase. Do I simply paste them all in sequence, still trying to sort that in... with little troubles figure the point to join the scripts.


oh here is sample of the comparison of your script with the original version

I also changed the data sample with less repeats


"hnlhnliahiahiahsfomsymsyhnlbna0J40J40J40J40A80B80C10C10D90E80F20G70H10I20J40J40J40J40J4AAAAAAAAAGGGLLLRRR0C10D9LASbna0E80F20G70H10I20J4AAAGGGbna0A80B80C10D9LASbna0E80F20G70H1AAAGGGbna0A80B80C10D9LASbna0E80F20G70J4AAAGGGbna0A80B80C10D9LASsfo0E80F20G7GGG0A80B80C10D9LASsfo0E80F20G7AAAiah0B80C10D9LASsfo0E80F20G7AAAiahmsy0D9LAS0E80F2AAAiahmsyLASsfo0E80F2AAAiahmsyLASsfo0F2AAAiahmsyLASAAAiahmsyLASAAAiahLASAAAiahmsyLASiahmsyLASmsyLASmsymsymsymsymsyLLLiah0B80C10D9LASsfo0E80F20G7AAA0J4iahmsy0D9LASsfo0E80F2AAAAAAiahmsyLASsfo0E80F2AAAsfoiahmsyLASsfo0F2AAAmsyiahmsyLASAAAiahmsyLASAAAiahmsyLASAAAiahmsyLASiahmsyLASmsyLASmsymsymsymsymsy"


Next I will add "DOR ORD JFK FKJ KJF*" or even some none* real states to get the shared characters in, to see how it counts that. The rank of that should be separated.

This thread has been closed by the system or the community team. You may vote for any posts you find helpful, or search the Community for additional answers.

Need to assistance unpacking a string

Welcome to Apple Support Community
A forum where Apple customers help each other with their products. Get started with your Apple Account.