How do I find the most frequently occuring numbers in a set.

How do I find the most frequent occurring numbers in a set. Is there a function that will accomplish that task? I can find the mode using the function, but I was wondering if there was a way to find the top five or ten most occurring numbers in a set.


An example would be if you had 100 days of lottery drawings, 5 numbers per day and wanted to find the top ten most frequently drawn numbers. So basically there is a set with 500 numbers. I found the mode, by using the mode function and selecting the entire set of data that I am using but have not been able to figure out a way to find the most frequent numbers if I wanted the top five or ten occurring numbers.


I could not find any related posts. I appreciate your help.

MacBook

Posted on Apr 6, 2012 3:24 PM

Reply
5 replies

Apr 6, 2012 5:00 PM in response to Senor Hammer

Count the ocurrences of each possible number using COUNTIF.


Pick out the five (or ten) largest counts using LARGE. and LOOKUP


Here's a smaller sample showing results after nineteen draws of five numbers each from a pool of 25 possible numbers:

User uploaded file

The smaller table at the top, named Draws, records the five numbers drawn in each draw in a column. The numbers her were generated randomly, yours would be recorded after each draw. The colour fills were applied to the cells using conditional formatting, and are there as an aid to my checking of the counts reported in Counts.


On the second table, Counts:


Column A: This column contains the (25) numbers in the pool for each draw.


Column B: This column uses COUNTIF to count the number of times each number has been drawn in the nineteen draws recorded on Draws:


B2, and filled down to B26: =COUNTIF(Draws :: $B:$T,"="&A)-ROW()/10000


The second part of the formula subtracts a small amount from the count to prevent duplicate values in this column. The column is formatted to show no decimal places, but the actual value in each cell will always be slightly smaller than the integer displayed.


Column C uses LARGE to pick out the ten largest adjusted counts in column B, then uses those values to lookup the associated number in column A, and posts that number in column C.


C2, and filled down for as many rows as you want numbers returned: =LOOKUP(LARGE($B,ROW()-1),$B,$A)


Scaling you to your larger data collection is mostly a matter of making a larger Counts table.


Regards,

Barry

Apr 6, 2012 8:00 PM in response to Senor Hammer

Hola Señor,


B2: =COUNTIF(Draws :: $B:$T,"="&A)-ROW()/10000

B2: =COUNTIF(Draws :: $B:$F,"="&A2)


My formula on top, yours below.


$B:$F is a correct edit for your situation, where the drawn numbers are listed in five columns. $ is an operator that specifies the reference is 'absolute', columns, even if the formula is filled or copied to the left or right.


"="&A and "="&A2 are equivalent, provided the formula is on row 2.

When it is filled down to the next row, the first will look the same ("="&A), but will now reference the cell on row 3 of column A. the second ("="&A2) will change to "="&A3.


"="&A2 is a 'Relative" cell reference. Filled to the right, the column reference (A) will change as the formula is filled right (which it won't be in this case); filled down, the row reference will increase by 1 for each new row.


The expression is the condition statement that determines which value will be included in the count. The condition statement must be presented as a text string, accomplished here by placing quotes around the comparison operator ( = ), then using the concatenation operator ( & ) to attach the content of cell A2, as text to the =. For many people, me included, writing the condition in COUNTIF (or SUMIF) is the part giving the most difficulty.


So the first part of the formula looks fine.


Part 2, the part following the - sign is necessary if there's any possibility of two or more of the numbers having been drawn the same number of times. The probability leans heavily toward this happening, so this part of the formula is necessary. LOOKUP will find only the first occurrence of a repeated count, and will find it repeatedly. Without that adjustment, if the top two numbers are 4 and 9, and each was drawn 41 times, the Top 10 will show 9 in places 1 and 2.


Note that the adjustment does introduce a bias. If the there are three numbers tied for tenth place in the count, the adjustment as written will always assign the tenth place to the one closest to the top of the list (ie. the one whose cell has the lowest row number). Changing the minus sign to a plus sign will reverse the bias and favour numbers further down the list in cases of equal counts.


Now that you've waded through that here's the short answer: Your formula looks fine for getting the actual count, but needs to have the second part added to make the LOOKUP work.


Regards,

Barry

Apr 6, 2012 5:06 PM in response to Senor Hammer

Afterthought:


Check your Lottery administration's website. If it's similar to the Lotto 6/49 site in Canada, the operator will have posted a draw history going back many years, and will also have posted tables showing the statistics you are pulling from the table here.


If your lottery is operated in the same manner as the Canadian one, knowing these statistics should be little help in determining the likely winning numbers in a future draw. Lottery officials take great care to make the draws both random and unpredictable.


Regards,

Barry

Apr 6, 2012 7:18 PM in response to Barry

Barry,


This has been a GREAT help! Thank you. I got the formula to work but I used the formula =COUNTIF(Draws :: $B:$F,"="&A2) becuase I could not get it to work without referencing the cell with the number being counted. For Example, cell A2 is the number 1, A3 is the number 2 etc. I did not understand what the $ and " and & noted and how it affected the outcome. I also had the Table formatted with the Draws in the A Column and the numbers going across the rows. Basically the opposite. I guess it won't matter because it is counting individual numbers. Anyway, thanks again for the great help and if you don't mind I could use some input on wether my beginner version of your formula is correct.

This thread has been closed by the system or the community team. You may vote for any posts you find helpful, or search the Community for additional answers.

How do I find the most frequently occuring numbers in a set.

Welcome to Apple Support Community
A forum where Apple customers help each other with their products. Get started with your Apple Account.