Science Forums

Lottery Probability


Iron4ever


  • 2 years later...

Chaos and Certainty.

 

In a 6/48 lottery, a consecutive set such as 1-2-3-4-5-6 is less likely to be drawn than a random-numbered set, simply because the number of consecutive-numbered sets is much smaller than the sea of random-numbered sets. Thus we know that a random-numbered set is more likely, but which particular one, we cannot know.

 

The reason is that a consecutive-numbered set always exists as a 6 digit unit in a more limited domain. The random-numbered set exists throughout the random domain. The equality only exists in randomness, but not between randomness and limited domains. The equality lies within chaos, or within more certainty, but not between chaos and certainty.

 

Similarly, a 3 numbered consecutive subset of a six digit set has lower probability than any random set.

Link to comment
Share on other sites

Yes,

2-4-3-1 does count as a consecutive set, but its domain is limited to 1-2-3-4.

 

The probability presupposes that each drawing is an independent event. However, in addition to independent drawing, the drawing of sets must be accounted for (a drawing of a consecutive unit such as 1-2-3-4 in a domain of 49 numbers). Thus, a drawing of 6 balls independently is also a drawing of a six-ball set from the batch. The probability that a random set of balls, instead of a consecutive set of balls, is drawn must be taken into account.

 

Yes, we are drawing each ball individually, but we are also effectively drawing a handful of six balls at the same time.

Thus probability is defined by independent and dependent events.

 

The probability of a consecutive numbered set is smaller than a random set.

Link to comment
Share on other sites

Some mathematicians, or statisticians will say that a chance of 1-2-3-4-5-6 is the same as drawing 2-12-13-22-31-35 in a lottery. But they only take into account each individual drawing.

 

The chance of 1-2-3-4-5-6 is smaller than that of drawing any other random-numbered set, precisely because we must account for the set-drawing probability (consecutive set v. random-numbered set) to account for handful-drawing event.

 

Numbers nearer to the previously drawn number are less likely to be drawn, in reality, and when handful-drawing probability is accounted for as history backfeed loop.

Link to comment
Share on other sites

Welcome to hypography, lawcat!

The chance of 1-2-3-4-5-6 is smaller than that of drawing any other random-numbered set, precisely because we must account for the set-drawing probability (consecutive set v. random-numbered set) to account for handful-drawing event.
This is incorrect.
Some mathematicians, or statisticians will say that a chance of 1-2-3-4-5-6 is the same as drawing 2-12-13-22-31-35 in a lottery.
They, including me, say this because it can be proven and demonstrated to be true.
So, what is the probability of drawing 2-6-8-9 vs. 1-2-3-4, in mathematical terms? (2-6-8-9 was chosen randomly)
Assuming that each number is drawn uniformly at random, the probability of drawing either of these sequences, or any other sequence, including seemingly “bad” choices like 1-1-1-1-1-1, is the same: [math]\frac{1}{48^6} = \frac{1}{12230590464}[/math], or about one in twelve billion.

 

A way to understand this is to note that a sequence of 6 numbers from 1 to 48 is simply a six digit base 48 representation of a number. The likelihood of choosing a sequence of 6 numbers from 1 to 48 is the same as choosing a single number from 1 to 12230590464, a sequence of 2 numbers from 1 to 110592, of 3 numbers from 1 to 2304, a sequence of 3 numbers from 1 to 36 followed by 3 numbers from 1 to 64, etc. In other words, the numeral system used to represent the number has no impact on its probability of being randomly drawn.

 

It’s important not to confuse the probability of drawing a particular consecutive sequence (eg: 1-2-3-4-5-6) or non-consecutive sequence (eg: 17-27-9-4-39-30), which are identical, with the probability of some consecutive sequence being drawn vs. some non-consecutive sequence. Since there are only 48-6+1=43 consecutive sequences, the probability of a given draw being a consecutive sequence is [math]\frac{43}{12230590464}[/math]. The probability of a non-consecutive sequence being drawn is [math]\frac{12230590464-43}{12230590464}[/math]
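As a rough check of that distinction, here is a minimal Python sketch (Python is used only for illustration; the names `BALLS`, `PICKS` and `p_single` are made up for this example). It assumes each of the 6 numbers is drawn independently and uniformly from 1 to 48, and that only ascending runs such as 1-2-3-4-5-6 count as consecutive.

```python
from fractions import Fraction

BALLS = 48   # numbers run from 1 to BALLS
PICKS = 6    # numbers per drawing, each drawn independently (assumption)

total = BALLS ** PICKS            # all ordered sequences, repeats allowed
p_single = Fraction(1, total)     # probability of any one particular sequence

# Ascending consecutive runs: 1-2-3-4-5-6 up to 43-44-45-46-47-48.
consecutive_count = BALLS - PICKS + 1
p_some_consecutive = consecutive_count * p_single
p_not_consecutive = 1 - p_some_consecutive

print("one particular sequence:", p_single)
print("some consecutive run:   ", p_some_consecutive)
print("anything else:          ", p_not_consecutive)
```

The first figure is the same whichever sequence is named; only the size of the group being counted changes the other two.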

Numbers nearer to the previously drawn number are less likely to be drawn, in reality, and when handful-drawing probability is accounted for as history backfeed loop.
How can this be so, lawcat? Can you give a mathematical proof or numeric simulation demonstration of this claim? Or, can you provide a link or reference backing up your claim?

 

It’s actually a site rule that you be able to provide one of these to back up your claim. If you think about the problem systematically, though, or study an introductory probability and statistics text or website, I think you’ll change your mind, and agree with all the mathematicians and statisticians that the probability of a given sequential series being drawn is the same as that of a non-sequential sequence.

 

Here’s a short MUMPS program that draws random 6 number sequences of 1-10 (it’s necessary to use a smaller range of numbers than 1-48 in order to get any wins in a timely manner), and its output for 23 million drawings or so:

S B=10,N=6 K C S D=","
F I=0:1:B-N X "F J=1:1:N S $P(S,D,J)=I+J" S C(S)=1
F I=0:1:B-N X "F J=1:1:N S $P(S,D,J)=$R(B)+1" ZT:$D(C(S))  S C(S)=2
S (T(0),T(1),T(2),T1(1),T1(2))=0
F T1=1:1 S T1(1)=T1(1)+T(1),T1(2)=T1(2)+T(2) W $J(T(1),8),$J(T(2),8),$J(T1(1),8),$J(T1(2),8),! S (T(0),T(1),T(2))=0 F T=1:1:1000000 X "F J=1:1:N S $P(S,D,J)=$R(B)+1" S T(+$G(C(S)))=T(+$G(C(S)))+1
      0       0       0       0
      3       4       3       4
      5       7       8      11
      8       2      16      13
      6       7      22      20
      2       4      24      24
      2       2      26      26
      2       3      28      29
      7       2      35      31
      8       6      43      37
      3       2      46      39
      7       7      53      46
      4       4      57      50
      5       1      62      51
      3       5      65      56
      8       4      73      60
      5       8      78      68
      8       5      86      73
      4       2      90      75
      1       5      91      80
      2       6      93      86
      4       5      97      91
      0       4      97      95
      3       7     100     102

From left to right, the numbers are consecutive sequence wins, non-consecutive sequence wins, and cumulative consecutive and non-consecutive wins. Each line is 1 million drawings. There are 5 consecutive sequences and 5 randomly chosen non-consecutive sequences.
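For readers without a MUMPS system, roughly the same experiment can be sketched in Python. This is not a line-for-line port: the function names and the `seed` parameter are assumptions made for this example, and the defaults in the last line use a smaller range (3 numbers from 1-5) so it finishes quickly.

```python
import random

def consecutive_sequences(base, n):
    """All ascending runs of n consecutive numbers within 1..base, as tuples."""
    return [tuple(range(start, start + n)) for start in range(1, base - n + 2)]

def simulate(draws, base=10, n=6, seed=0):
    """Count wins for the consecutive picks and for equally many random picks."""
    rng = random.Random(seed)
    consecutive = set(consecutive_sequences(base, n))
    # Choose as many distinct non-consecutive sequences as there are consecutive ones.
    others = set()
    while len(others) < len(consecutive):
        pick = tuple(rng.randint(1, base) for _ in range(n))
        if pick not in consecutive:
            others.add(pick)
    wins_consecutive = wins_other = 0
    for _ in range(draws):
        # Each number is drawn independently, so repeats are possible.
        drawn = tuple(rng.randint(1, base) for _ in range(n))
        if drawn in consecutive:
            wins_consecutive += 1
        elif drawn in others:
            wins_other += 1
    return wins_consecutive, wins_other

if __name__ == "__main__":
    print(simulate(200_000, base=5, n=3))
```

With 3 numbers from 1-5 there are 125 possible sequences and 3 picks in each group, so both counts should land near 3/125 of the number of drawings.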

 

Of course, most real lottery drawings use physical ball-blowing machines, so are not truly random, so there are tiny differences in the actual probabilities of specific sequences. Describing these is left as an exercise for the reader ;)

Link to comment
Share on other sites

Craig,

 

I understand perfectly that factorial statistics show that the probabilities for all numbers are the same. My point is that those statistics have limited purpose related to independent individual drawings, but do not account for set probabilities; limited against random domain probabilities.

 

If you are drawing 6 balls, and you are drawing those 6 balls from numbers 1-20, your random domain is 1-20. Your six-ball limit must place a probability constraint in relationship to your random domain.

 

For example, if you are drawing 6 balls from the batch of numbers 1-6, there is a 100% chance that the numbers drawn will be 1-2-3-4-5-6, in any sequence.

 

However, as you pointed out above in response to Freeze, when you draw 6 balls from a batch of 50, the chances that the set will be consecutive are much smaller than that the set will be in a wider domain (i.e. numbers from 2-37), spread in the random domain.

 

Now, when you draw a 6-ball set from a random domain of 1-2,500,000, the chance that you will get a consecutive 6-ball set is almost null. The probability is that the 6-ball set will be from a much wider domain. Which one, we cannot know with precision in randomness.

 

Thus, the limit of your desired set, versus the size of the random domain must place a statistical probability constraint on numbers being drawn next to each other.

 

This is not to say that there is no probability that the numbers nearer each other will be drawn. But chances are smaller, even if negligibly so.

 

This constraint must be accounted for by history feed. (If you draw number 2, in a 1-2,500,000 random domain, the chances that the next number will be 3 or 1 are smaller than that the number will be 500,000, because of the constraint of the desired size of sequence, in this case 6; even though the chances of the draw are the same on an independent event basis.)

 

The logical explanation for the lower probability of the consecutive set versus non consecutive set is that the drawing is also a set drawing, even though it is done ball by ball. This distinction is not the same as black balls versus yellow balls. The consecutive sequence, unlike color, is a domain constraint intrinsic to the random domain that we are drawing from. We don't know whether there is higher probability that the six numbers will come from domain of 2-26, or 15-49 in the random domain of 1-50; but we do know that a consecutive domain has lower chance.

Link to comment
Share on other sites

Craig,

 

I understand perfectly that factorial statistics show that the probabilites for all numbers are the same.

That is not what I described in post #110.

 

Calculating the probability of drawing a particular sequence of [math]n[/math] uniformly distributed random integers from 1 to [math]m[/math] as [math]\frac1{m^n}[/math], as I’ve done above (eg: the probability of drawing any particular sequence of 6 numbers from 1 to 48 is [math]\frac1{48^6} = \frac1{12230590464}[/math]) does not use factorials.

 

Calculating the probability of drawing a particular sequence of [math]n[/math] members of a set of [math]m[/math] uniquely identified members, which is [math]\frac{(m-n)!}{m!}[/math], does, the “!” symbol denoting the factorial function. For example, the probability of drawing a particular possible sequence of 6 balls at random from a collection of 48 balls labeled 1 to 48 is [math]\frac{(48-6)!}{48!} = \frac1{48 \cdot 47 \cdot 46 \cdot 45 \cdot 44 \cdot 43} = \frac1{8835488640}[/math]. When drawing in this manner, some sequences, such as 1-1-2-3-4-5, are impossible, because there is only one ball labeled with each number.

 

From these two different calculations, we can calculate a third probability. The probability of drawing a sequence of [math]n[/math] uniformly distributed random integers from 1 to [math]m[/math] in which the numbers drawn are unique is [math]\frac{m!}{(m-n)! m^n}[/math] (eg: the probability of drawing 6 unique numbers from 1-48 is [math]\frac{8835488640}{12230590464} = \frac{2556565}{3538944} \approx 0.722409[/math]). Note how much easier this is to calculate than with techniques like enumerating all of the possible sequences containing repeated numbers and calculating and summing their probabilities.
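The three figures can be reproduced with a few lines of Python, offered here as a sketch rather than anything definitive. It uses the standard `math.perm` and `fractions.Fraction`; the variable names are chosen for this example only.

```python
from fractions import Fraction
from math import perm

m, n = 48, 6  # 48 numbers, 6 drawn

with_repeats = m ** n          # ordered sequences when each number is drawn independently
without_repeats = perm(m, n)   # m!/(m-n)!: ordered sequences of distinct balls
p_all_unique = Fraction(without_repeats, with_repeats)

print(with_repeats)
print(without_repeats)
print(p_all_unique, float(p_all_unique))
```

Using `Fraction` keeps the ratio exact, so the reduced fraction can be compared directly with the one quoted above.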

My point is that those statistics have limited purpose related to independent individual drawings, but do not account for set probabilities; limited against random domain probabilites.
Lawcat, you’re misusing several common terms, and, I think, failing to grasp an important idea of systematic counting and probability.

A set is an unordered collection of elements, ie: 1-2-3-4 and 4-1-3-2 are the same set. A sequence is an ordered collection of elements, ie: 1-2-3-4 and 4-1-3-2 are not the same sequence.

"Random domain probability", and several other phrases you’ve used in this thread aren’t, as best I can tell, conventionally meaningful phrases. A good test of whether your terms are recognizable is to search for them on the www. If, as in the case of this one (or its plural), the only reference found is your post, unless you intend to coin a new phrase, you likely shouldn’t be using it.

 

More importantly, I get the impression you believe that the probability of drawing 1-2-3-4-5-6 is not the same as that of drawing 2-12-13-22-31-35 in a lottery. Regardless of how the balls are drawn, assuming that no ball is unfairly weighted or sized to be more likely to be drawn than another (which fair lotteries strive to assure), the probability of drawing 1-2-3-4-5-6 is the same as the probability of drawing 2-12-13-22-31-35, and exactly equal to the values given above for a couple of different ball-drawing methods. The reason for this is an important isomorphism between numbers and their numeric representations: for example, the number of possible 6 digit base 48 numbers equals the number of possible 1 digit base 48 numbers to the 6th power, ie: [math]f(b,n)=f(b,1)^n[/math]

 

It’s worth noting that none of the major lotteries (eg: the US’s Powerball) use the simple scheme we’ve been discussing, in which order is important (eg: a Powerball ticket with the numbers 1-2-3-4-5--1 wins the jackpot if the numbers 2-4-3-1-5--1 are drawn). Actual Powerball machines select the first 5 numbers from a drum containing 55 uniquely labeled balls, and the last number from a drum containing 42 uniquely labeled balls. So the probability of winning a Powerball jackpot is [math]\frac{(55-5)! \cdot 5!}{55! \cdot 42} = \frac1{146107962}[/math].
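The Powerball figure can be checked the same way. This sketch takes the drum sizes from the description above (55 and 42) and assumes the order of the first 5 balls does not matter; the constant names are illustrative only.

```python
from fractions import Fraction
from math import comb

WHITE_BALLS, WHITE_PICKS = 55, 5   # first drum: 5 balls drawn, order ignored
RED_BALLS = 42                     # second drum: 1 ball drawn

# Number of distinct tickets = C(55, 5) unordered picks times 42 final balls.
tickets = comb(WHITE_BALLS, WHITE_PICKS) * RED_BALLS
p_jackpot = Fraction(1, tickets)

print(tickets, p_jackpot)
```

`comb(55, 5)` equals 55!/(50! 5!), which is the factorial expression above without the factor of 42.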

Link to comment
Share on other sites

Thanks for the explanation. It may very well be easier to assume that every number has the same probability of being drawn as any other number--therefore any particular desired set has the same probability of being drawn as any other set of same size.

 

My point is: This is not true. The desired set places a known constraint in relationship to the domain from which you are drawing numbers.

 

If Y is the domain of 1-100, and X represents the desired drawing of six balls from a range of 1-100.

 

The 6 consecutive number range (6x v. 1-100y) is narrower in area than the 2-75 range. The six consecutive number range is the smallest--we know this. Which range will be drawn we don't know. But the range of six consecutive numbers has the smallest area.

 

This information must be fed into the calculation to account for the lower probability of numbers next to each other being drawn.

 

That is my point.

Link to comment
Share on other sites

I understand the basics of lottery probability. But if we look at a lottery result, after the fact, with 20/20 hindsight, the predicted odds never reflect the reality of the result. For example, we have a lottery where the odds of winning are defined as 1 in a million. The next day, when the drawing is finalized, Tom wins. Based on after the fact, Tom turned out to have a probability of 1.0, since he can be proven to have won. With 20/20 hindsight, statistics turns into cause and effect. All the rest had no chance to win and their probability turned out to be 0.0 in terms of the final reality.

 

I realize odds are like a modern version of an oracle that tries to predict the future. But the numbers they generate never coordinate with the 100% certainty of the final results. The question I have is: is there any math that can bridge the probability prediction with the causality of 20/20 hindsight?

Link to comment
Share on other sites

Thanks for the explanation. It may very well be easier to assume that every number has the same probability of being drawn as any other number--therefore any particular desired set has the same probability of being drawn as any other set of same size.

 

My point is: This is not true.

It’s more than a convenient assumption – it’s both mathematically true and empirically demonstrable.

 

It’s important to be clear on what question is actually being asked.

 

The probability of randomly drawing the sequence (not set! Sets are not ordered, so cannot support the idea of sequential anything!) 1-2-3-4-5-6 in that order from a collection of 48 balls labeled 1 to 48 is exactly the same as that of drawing 9-14-17-26-41-43 in that order. Both have probability [math]\frac{42!}{48!}[/math]

 

The probability of randomly drawing any particular sequence of 6 balls labeled with consecutive numbers, that is, any given one of the 43 sequences 1-2-3-4-5-6, 2-3-4-5-6-7, … 43-44-45-46-47-48, is also exactly [math]\frac{42!}{48!}[/math]

 

This is not the same question as “what is the probability of drawing one of the 43 sequences above?”. That probability is simply 43 times the probability of drawing a particular one of them, [math]\frac{43 \cdot 42!}{48!}[/math]. As with any complement, the probability of drawing a sequence that is not one of the 43 sequences is [math]1 - \frac{43 \cdot 42!}{48!}[/math]

 

The same would be true of any collection of 43 sequences, regardless of whether they are labeled with consecutive numbers or not.
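Because 48 balls give too many sequences to list, this can be illustrated by brute force on a smaller stand-in. The sketch below assumes 3 balls drawn in order, without replacement, from 8; the sizes, the helper `is_consecutive`, and the arbitrary slice of the list are all choices made for this example.

```python
from fractions import Fraction
from itertools import permutations

BALLS, PICKS = 8, 3  # small stand-in for 48 and 6

# Every ordered draw of PICKS distinct balls, each equally likely.
sequences = list(permutations(range(1, BALLS + 1), PICKS))
p_each = Fraction(1, len(sequences))

def is_consecutive(seq):
    """True for ascending runs such as (2, 3, 4)."""
    return all(b - a == 1 for a, b in zip(seq, seq[1:]))

consecutive = [s for s in sequences if is_consecutive(s)]
# Any other group of the same size, here simply a slice of the list.
arbitrary = sequences[100:100 + len(consecutive)]

p_consecutive_group = len(consecutive) * p_each
p_arbitrary_group = len(arbitrary) * p_each

print(len(sequences), len(consecutive))
print(p_consecutive_group, p_arbitrary_group)
```

The two group probabilities come out equal because they depend only on how many sequences are in the group, not on what the labels look like.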

 

In short, there’s nothing special about the probability of any pick in the above lottery, whether it involves consecutive numbers or any other pattern.

 

I believe lawcat’s misunderstanding is illustrated by this quote:

But the range of six consecutive numbers has the smallest area.
The “area of the range” of six consecutive numbers is the size of the set of all possible sequences drawn from the pool that consist of sequences of balls labeled with consecutive numbers, which is 43. It is exactly the same as the “area of the range” of any selection of 43 elements.

 

Another useful mental aid in understanding this is to imagine that, rather than being labeled with the representations of numbers, the 48 balls were labeled with something we don’t think of as having an order, such as pictures of fruits. How could this change, or our inability to imagine some sequence of them as being consecutive, have any effect on a probability? Or imagine that the sequences are being chosen by a person who is unfamiliar with our numeral system and incorrectly believes that the order of the digits is 1, 2, 3, 4, 0, 9, 5, 7, 6, 8. Could his incorrect selection of a consecutive number labeled sequence of balls have any effect on the actual probability of that sequence being drawn?

Link to comment
Share on other sites

Everything is ordered--even fruits, or colors, or some other numerical system. Order is not imposed by us, but is intrinsic in any system. The fact that you choose to make all 49 balls unmarked (all are plain grey balls), does not remove the order from the system; only, you've chosen to ignore the order.

Link to comment
Share on other sites

Everything is ordered--even fruits, or colors, or some other numerical system. Order is not imposed by us, but is intrinsic in any system. The fact that you choose to make all 49 balls unmarked (all are plain grey balls), does not remove the order from the system; only, you've chosen to ignore the order.
Lawcat, please don’t attribute statements to me that I’ve not made, such as “you chose to make all 49 balls unmarked”. I don’t wish to engage in legalistic debate over mathematical concepts well understood by mathematicians, but to explain these concepts to non-mathematicians.

 

For the sake of clarity, Lawcat, are you claiming that the probability of drawing the sequence 1-2-3-4-5-6 from a collection of 48 balls labeled with the numbers 1-48 is lower than the probability of drawing the sequence 9-14-17-26-41-43? I’m not asking for a defense of your claim, only a yes/no response to the preceding question.

 

In the meanwhile, I'll attempt another mental aid in understanding why one should answer this question “no”:

 

Each member of the set of all sequences of six balls that can be drawn as described can be mapped to an integer between 1 and 8835488640 (48!/42!). A partial list of a possible such mapping is:

1-2-3-4-5-6 -> 1
1-2-3-4-5-7 -> 2
...
1-2-3-4-5-48 -> 43
1-2-3-4-6-5 -> 44
...
1-2-3-4-7-19 -> 100
...
1-2-3-4-28-15 -> 1000
...
1-2-3-9-17-29 -> 10000
...
1-2-4-11-42-29 -> 100000
...
1-2-14-37-27-40 -> 1000000
...
1-4-28-23-21-8 -> 10000000
...
1-27-26-28-7-19 -> 100000000
...
2-3-4-5-6-7 -> 188076197
...
3-4-5-6-7-8 -> 376152393
...
6-22-17-19-11-46 -> 1000000000
...
9-14-17-26-41-43 -> 1520813890
...
43-44-45-46-47-48 -> 7899200233
...
48-47-46-45-44-43 -> 8835488640

No uniformly randomly selected (how a fair lottery machine selects) integer from 1 to 8835488640 is more probable than another. So, due to the existence of the above described mapping (also known as an isomorphism), we can conclude the same thing about the sequences to which these integers map. Just as the integer 7899200233, which maps to a sequence of consecutive integers, is no more special, as relates to probability, than 1520813890, which maps to a sequence of non-consecutive integers, a sequence of consecutive integers is no more special in terms of probability than a sequence of non-consecutive integers.

 

PS: Here’s MUMPS code that implements the above mapping:

s M=48,N=6,D=","
s P0="" f I=1:1:M s $p(P0,D,I)=I ;assumed setup: P0 is not set in the code as posted; taken here to be the ball list "1,2,...,48"
s C0=1 f I=M-N+1:1:M s C0=C0*I ;C0 = 48!/42!, the number of possible sequences
f  r !,A q:'A  s A=A-1,C=C0,P=P0_D,WD=" -> " f I=M:-1:M-N+1 s C=C/I,J=A\C#I+1 w WD,$p(P,D,J) s $p(P,D,J,J+1)=$p(P,D,J+1),A=A#C,WD="-" ;number to sequence
f  r !,A q:'A  s C=C0,X=1,WD="-",P=P0_D x "f I=1:1:6 s C=C/(M-I+1),J=$l($e(P,1,$f(P,$p(A,WD,I))),D)-1,X=J-1*C+X,$p(P,D,J,J+1)=$p(P,D,J+1)" w " -> ",X ;sequence to number
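For those who don’t read MUMPS, the same mapping can be sketched in Python. This is an illustration rather than a port of the code above: the function names and default parameters are assumptions made for this example. At each position it looks up where the drawn ball sits among the balls not yet drawn and multiplies that position by the number of sequences sharing the prefix so far.

```python
from math import perm

def sequence_to_number(seq, balls=48):
    """Map an ordered draw of distinct balls to its 1-based position in the list."""
    remaining = list(range(1, balls + 1))
    block = perm(balls, len(seq))      # total number of sequences
    number = 1
    for i, ball in enumerate(seq):
        block //= balls - i            # sequences per choice at this position
        index = remaining.index(ball)  # 0-based rank among balls still in the drum
        number += index * block
        remaining.pop(index)
    return number

def number_to_sequence(number, balls=48, picks=6):
    """Inverse of sequence_to_number."""
    remaining = list(range(1, balls + 1))
    block = perm(balls, picks)
    offset = number - 1
    seq = []
    for i in range(picks):
        block //= balls - i
        index, offset = divmod(offset, block)
        seq.append(remaining.pop(index))
    return seq

print(sequence_to_number([9, 14, 17, 26, 41, 43]))
print(number_to_sequence(44))
```

Every integer from 1 to 8835488640 corresponds to exactly one sequence and back again, which is all the argument above relies on.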

Link to comment
Share on other sites

Again, we are drawing individual numbers independently. But, we are also drawing sets, as if we were grabbing a 6-number set with one hand. A separation in time between drawings is of no moment.

 

Your mapping, above, says nothing of range and its probability.

 

Take for example a 3-number set, out of the 1 through 6 domain. We have the following possible sets:

 

123 (range 1-3)

124 134 (range 1-4)

125 135 145 (range 1-5)

126 136 146 156 (range 1-6)

 

234

235 245

236 246 256

 

345

346 356

 

456

 

As you can see, the range of consecutive numbers has lower probability than any other range; even though, seemingly, 123 has the same probability as 246.

 

Range 1-6 has higher probability than the range 1-3. Even if we draw 1 and 2, the chance that the last number in the set will be 4,5 or 6 is higher (75%) than that the number will be a 3.

 

Thus, we do not know which range the set will come from, but we do know that a consecutive set has the smallest probability. This information must be accounted for.

It is a relationship between the desired set, and the size of our domain.
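For reference, the listing above can be reproduced with a short Python sketch. It assumes unordered sets of 3 drawn from 1-6 with every set equally likely, and measures "range" as largest number minus smallest plus one, which is one reading of the listing rather than a standard term. The names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# All unordered 3-number sets from 1..6, assumed equally likely.
sets = list(combinations(range(1, 7), 3))
p_each = Fraction(1, len(sets))

# Group the sets by the width of the range they occupy.
span_counts = Counter(max(s) - min(s) + 1 for s in sets)

for span in sorted(span_counts):
    count = span_counts[span]
    print(f"range width {span}: {count} sets, total probability {count * p_each}")
print("each individual set:", p_each)
```

Under that assumption each individual set, 1-2-3 included, comes out at 1/20; what differs from one range width to another is how many sets it contains.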

Link to comment
Share on other sites
