Suited Probability Conundrum

Discussion in 'Sidewalk Cafe' started by London Colin, Jun 15, 2014.

  1. London Colin

    London Colin Top Member

    I've once again managed to totally confuse myself, trying to figure out how to do a particular probability calculation. I'm not sure if what I am attempting is in fact trivial (but I'm going about it all wrong), or if it is genuinely as tricky as I am finding it, or perhaps not possible at all.

    What I have in mind is this -

    Suppose you know the distribution (by rank) of the remaining cards in a blackjack shoe, but you have no information about their suits. A two-card hand is about to be dealt. It's easy to calculate the probability of any particular blackjack hand being dealt, but (how) can you split this into the probability that the hand will be suited/non-suited?

    My first, naive thought was that the suited probability is simply 1/4 of the hand probability (i.e. any suit for the first card and a 1/4 chance that the next card will be of the same suit). But then it struck me that if the starting point is an assumption that there is an even distribution of the four suits, then, after the first card is dealt, this no longer holds true for the remaining cards; so an adjustment to allow for this may be needed.

    Also, I can't figure out if the adjustment for a pair would need to be different to that for a non-pair. That is, ought the assumption of an even distribution of the suits apply to each rank, as well as to the shoe as a whole? (In which case, dealing one card of a given rank has a greater impact on the probability that the next card will be of the same suit, if it is also of the same rank.)

    Any thoughts? I hope all the above makes sense. I can add some examples and formulae I considered, but I'm wary of leading you too far down the path of my own, seemingly faulty reasoning.
     
    Last edited: Jun 15, 2014
  2. gronbog

    gronbog Top Member

    Here are my thoughts. Hopefully someone will correct me if I'm wrong. We all know that one of the reasons I prefer simulation is that I find probability calculations to be tricky and error prone!

    So you know the exact distribution of the cards by rank. You therefore know the total number of cards remaining. Without any information of the distribution by suit, I think that the best you can do is to assume that they are evenly distributed, which is what you have done. Now this is only actually possible if there are a multiple of 4 cards remaining. So I think that the best you can do is to perform the calculations allowing averages or fractions of cards of each suit. So with N cards remaining, you would perform the calculations assuming that N/4 cards of each suit remain, even if that is not an integer.

    With respect to adjustments for particular hand compositions, like pairs -- since you don't know the distribution of the suits, then the second card of a pair is just as likely to be suited as any other particular card. I don't believe that any adjustment is necessary for this. It would be necessary if you knew the distribution of the suits within the ranks, however.

    So the probability of a particular 2 card hand is P(the first card makes the hand possible) x P(The second card completes the hand after the first card is dealt) x 2 which will end up looking something like (a/N) x (b/(N-1)) x 2. For a particular 2 card total, you would repeat this for each composition that could represent that total and add them all up.

    For a suited hand, I think that you should use (N/4) - 1 cards to choose from for the second card, even if N/4 is not an integer, so the calculation would be something like
    (a/N) x (b/((N/4) - 1)) x 2

    Note that this breaks down for N < 4, but then, so does any realistic assumption that the suits are evenly distributed, even on average.

    I hope this makes sense.
     
    Last edited: Jun 16, 2014
  3. London Colin

    London Colin Top Member

    Thanks for the response, Gronbog. My initial thoughts were broadly similar to yours. However, plugging my formulae into the code I am working on gave results that were slightly off. After some trial and error, I was able to get the correct results (at least, I know they are definitely correct for the full deck). Working backwards from this success, I think I can explain why the new formulae work.

    The key factor is that the starting point, before any cards are removed, is not just an equal number of each suit in total, it is also an equal number of each suit for each rank. So when you account for a card of a particular rank being dealt, you know it is only the suit distribution within that rank that is affected.

    As a result, it turns out you don't have to make an adjustment for the non-pair case, but you do for the pair.

    The second term above isn't right; you've made the probability bigger, when you meant to make it smaller. The formula should be -

    (a/N) x (b/4)/(N-1) x 2

    And since this is just a sequence of multiplication and division, it is equivalent to taking the probability of the hand and dividing by 4.


    But in the case of a pair, we have -

    (a/N) x (a/4 - 1)/(N-1)

    [Before the first card is dealt, 1/4 of the cards in this rank are of this particular suit. So after it is dealt, a/4-1 of them remain.]
    This is what still worries/confuses me. (It affects the pair case in my formulae.) I can't quite tell if I've created a precise calculation which just suddenly breaks down when you get down to 4 cards, or some kind of approximation which degrades as the number of cards gets smaller.

    It seems to me a precise calculation ought to be possible. I don't think that the assumption that the suits are evenly distributed exactly breaks down. It's just a mental model, another way of saying that the probability of any card being of a particular suit is 1/4.
     
  4. S. Yama

    S. Yama Active Member

    I will take a shot at it (but don’t laugh too hard if it is off beam), lol.

    Firstly, for some reason it reminded me of the Monty Hall problem. That doesn’t apply here, but I had to think for a moment about whether we assume an even distribution of suits before removing one card (the first dealt card) or after.

    Some confusion may arise that we have two distinct probabilities looking for a suited two cards – one is for a pair and another for suited different ranks.

    In Colin’s formula:
    (a/N) x (b/4)/(N-1) x 2
    This applies to suited cards of different ranks. Also, I don’t see the reason for the last part “x 2”.

    My less elegant approach, sorry for the poor notations, would be to weigh it by their frequencies and add the contributions:
    Suited pair: first card x number of the rank cards divided by four minus one seen card, times number of the rank cards minus the seen card divided by total cards less one.

    (a/N) x (a/4 – 1) x [(a-1)/(N-1)]

    plus, for non-pair suited hand:

    (a/N) x (N/4-2)/(N-a) x (N-a)/(N-1)

    Minus 2 is because we exclude both the first card and the suited pair card.

    Colin, you said: “It affects the pair case in my formulae”. How badly does it break the formula?
    Perhaps we have to include the original number of decks to make the numbers fit the formulas. Let's take a somewhat extreme example.
    Suppose all but the last two cards of an eight-deck shoe have been dealt. The chance of those two being suited is the same as for the first two cards off the top of the shoe. It is 103/415 = 24.819%
    If you played single deck it would be 12/51 = 23.53%
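
    The two figures above follow from the same count: a full shoe of d decks holds 13d cards of each suit, so after any first card, 13d - 1 of the remaining 52d - 1 cards match its suit. A minimal sketch of that check (the function name `p_first_two_suited` is purely illustrative):

```python
from fractions import Fraction

def p_first_two_suited(decks: int) -> Fraction:
    """Chance that the first two cards off a full shoe share a suit."""
    same_suit_left = 13 * decks - 1   # cards matching the first card's suit
    cards_left = 52 * decks - 1       # all cards left after the first one
    return Fraction(same_suit_left, cards_left)

for d in (1, 8):
    p = p_first_two_suited(d)
    print(f"{d} deck(s): {float(p):.5f}")
```

    As d grows the value climbs towards 1/4, which is the infinite-deck figure.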

    S. Yama
     
  5. London Colin

    London Colin Top Member

    Thanks for weighing in, Yama. At first glance, I think that you have made some mistakes, but I may be misunderstanding some things. It might be helpful if I expand on what I've produced, and provide some concrete examples...

    Suppose we have a total of 200 cards remaining, containing 13 aces and 14 deuces. (i.e. a=13, b=14, N=200). Ignoring suits altogether, the probabilities of the hands A,2 and A,A are -
    p(A,A) = 13/200 * 12/199 = 0.0039195
    p(A,2) = 13/200 * 14/199 * 2 = 0.0091457
    [the * 2 is because order does not matter; 2,A is the same hand as A,2.]

    For suited hands -
    p(A,2, suited) = 13/200 * (14/4) /199 * 2 [which is simply p(A,2)/4] = 0.0022864
    p(A,A,suited) = 13/200 * (13/4 - 1)/199 = 0.0007349

    [In both cases, the first card considered can be any of the 13 possible aces. The second card is then constrained to be of the same suit. So for the deuces that means 14/4 of them are available (still an even distribution), but for the aces, one has been used up, so (13/4 - 1) remain.]


    To calculate the non-suited hands, we could subtract the suited values from the "don't care about suit" values, or we can calculate them directly -
    p(A,2, non-suited) = 13/200 * (14*3/4) /199 * 2 [which is simply p(A,2) *3/4] = 0.0068592
    p(A,A,non-suited) = 13/200 * (13*3/4 )/199 = 0.0031846

    [This time, for A,A, three quarters of the original 13 aces still remain to be chosen from for the second card.]
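
    The arithmetic in this example can be reproduced with exact fractions. This is only a sketch of the first-attempt model described here (a quarter of each rank assumed to sit in each suit, fractional cards allowed); the variable names are illustrative.

```python
from fractions import Fraction

a, b, N = 13, 14, 200  # aces, deuces, total cards remaining

# "Don't care about suit"
p_AA = Fraction(a, N) * Fraction(a - 1, N - 1)
p_A2 = Fraction(a, N) * Fraction(b, N - 1) * 2

# Suited: a quarter of the second rank is assumed to be in the required suit
p_A2_suited = Fraction(a, N) * Fraction(b, 4) / (N - 1) * 2
p_AA_suited = Fraction(a, N) * (Fraction(a, 4) - 1) / (N - 1)

# Non-suited: three quarters of the second rank is in the other suits
p_A2_unsuited = Fraction(a, N) * Fraction(3 * b, 4) / (N - 1) * 2
p_AA_unsuited = Fraction(a, N) * Fraction(3 * a, 4) / (N - 1)

for name, p in [("p(A,A)", p_AA), ("p(A,2)", p_A2),
                ("p(A,2 suited)", p_A2_suited), ("p(A,A suited)", p_AA_suited),
                ("p(A,2 non-suited)", p_A2_unsuited),
                ("p(A,A non-suited)", p_AA_unsuited)]:
    print(f"{name} = {float(p):.7f}")
```

    The printed values are rounded to seven places, so the last digit can differ by one from truncated figures. In this model the suited and non-suited parts add back up to the "don't care" value exactly, for both the pair and the non-pair.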


    So the above are specific examples of my generalised formulae -

    'Don't care about suit' -
    non-pair:
    a/N * b/(N-1) *2

    pair:
    a/N * (a-1)/(N-1)

    Suited -
    non-pair:
    a/N * (b/4)/(N-1) * 2

    pair:
    (a/N) * (a/4 - 1)/(N-1)

    Non-Suited -
    non-pair:
    a/N * (b *3/4)/(N-1) * 2
    pair:
    (a/N) * (a*3/4)/(N-1)

    I've highlighted where the division by 4 or multiplication by 3/4 occurs in relation to the 'don't care' formulae. These are the points where I am effectively assuming an even distribution of suits, separately among both of the ranks, a and b, before the first of the two cards is dealt.


    The trouble is, when 'a' gets down to 4, the probability of a suited pair is apparently zero. And if 'a' is less than 4 it goes negative! Something is clearly not right! So is the formula 100% accurate when a=5, but then suddenly drops off a cliff at a=4, or does its accuracy gradually fall away for smaller and smaller numbers of cards? (Feel free to view that as a rhetorical question.:))

    Clearly the probability of a suited pair would be zero for a=4 if the starting point was a single deck, but not so otherwise. So I think you are right that some sort of accounting for the initial number of decks may be needed.
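
    The breakdown is easy to see by evaluating the first-attempt suited-pair formula for a shrinking number of cards of the rank. A small sketch, with N = 200 chosen arbitrarily and the function name invented for illustration:

```python
from fractions import Fraction

def suited_pair_first_attempt(a: int, N: int) -> Fraction:
    """First-attempt formula: (a/N) * (a/4 - 1)/(N-1)."""
    return Fraction(a, N) * (Fraction(a, 4) - 1) / (N - 1)

N = 200  # hypothetical total number of cards remaining
for a in (8, 5, 4, 3, 2):
    print(f"a={a}: {float(suited_pair_first_attempt(a, N)):+.7f}")
```

    The value is positive for a=5, exactly zero at a=4 and negative below that, whatever N is, because the (a/4 - 1) term changes sign there.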
     
  6. gronbog

    gronbog Top Member

    I think that it's a sudden threshold and it is for the reason I stated earlier. With the assumption of evenly distributed suits, then the probability of any suited hand is actually zero with 4 cards remaining, since at that point we know that there is exactly one card of each suit remaining.

    Similarly, the formula then breaks down completely for less than 4 cards remaining, because the cardinality of at least one suit must be zero at that point, which is to say that the assumption of being evenly distributed is grossly violated.

    For less than 4 cards, a completely different model is required. It also seems likely to me that whatever model works for less than 4 cards will also work for any deck size and composition.
     
    Last edited: Jun 17, 2014
  7. gronbog

    gronbog Top Member

    As I said above, I now think that a completely new approach may be warranted. Your original problem statement stated "but you have no information about their suits". We both then appear to have gone down the road of assuming that the suits are evenly distributed. However, by doing that, we have limited the possible suit composition of the remaining deck to a very small subset of what is possible, where the suits are distributed approximately evenly. In reality, "but you have no information about their suits" means that any composition of suits is possible, including the extreme possibility that all of the cards are of the same suit (for small remaining decks). That possibility is not represented by the original formulas and that is what led to the failure of the original formulas for 4 or fewer cards remaining.

    It also occurs to me that, because there are only a fixed number of cards of each rank in each suit in the original shoe, there must be an adjustment made for the probabilities of all required suited second cards of all hands, not just the pairs. The adjustment for each is similar but slightly different.

    For the discussion below, let's assume that
    • we've already selected the first card rank of the hand and computed its probability as P1
    • r is the number of cards of the required rank remaining for the second card (after the first card has been dealt) and is not zero (otherwise the probability of our target hand, suited or not, is already zero)
    • n is the total number of cards remaining (after the first card has been dealt) and is not zero (otherwise we cannot even deal the 2 card hand).
    • P2 is the probability of getting the required second card
    Then the probability for the target hand will be P1 x P2 x 2. I'll start with some concrete examples and then try to extrapolate to the general formulas for P2.

    The trivial example is an initial "shoe" of 1 deck. We know that there is one card of each rank in each suit to start with, and also one card of each rank in each suit for any depleted deck thereafter.
    • For the target hand in any suit, the probability of getting the required second card is P2 = r/n
    • For a suited non-pair, for each of the r available cards of the required rank for the second card, at most 1 of them can be of the required suit and it has a 1/4 chance of actually being suited. So the probability of getting the required card is P2 = 1/r/4
    • For a suited pair, for each of the r available cards of the required rank for the second card, at most 1 - 1 or zero of them can be of the required suit, so the probability of getting the required card is P2 = 0/r/4 or 0.
    So far so good. Let's try it for an initial "shoe" of 2 decks.
    • For the target hand in any suit, the probability of getting the required second card is still P2 = r/n
    • For a suited non-pair, for each of the r available cards of the required rank for the second card, at most 2 of them can be of the required suit and they each have a 1/4 chance of actually being suited. So the probability of getting the required card is P2 = min(2,r)/r/4
    • For a suited pair, for each of the r available cards of the required rank for the second card, at most 2 - 1 or 1 of them can be of the required suit, and it has a 1/4 chance of actually being suited, so the probability of getting the required card is P2 = 1/r/4
    Now let's generalize to an initial shoe of d decks.
    • For the target hand in any suit, the probability of getting the required second card is still P2 = r/n
    • For a suited non-pair, for each of the r available cards of the required rank for the second card, at most d of them can be of the required suit and they each have a 1/4 chance of actually being suited. So the probability of getting the required card is P2 = min(d,r)/r/4
    • For a suited pair, for each of the r available cards of the required rank for the second card, at most d - 1 of them can be of the required suit, and they each have a 1/4 chance of actually being suited, so the probability of getting the required card is P2 = min(d-1,r)/r/4
    As you can see, the general formulas reduce to the ones for the special cases of initial shoes of 1 and 2 decks. The adjustment for all of the suited second cards is that there is a limit on how many of them there can be out of the available cards of the required rank, which is defined by the initial size of the shoe. Also because of the min() terms, the formulas scale to card rank cardinalities of less than the number of initial decks, where the probability of the second suited card correctly decays to 1/4 of the probability of getting the required rank.
     
    Last edited: Jun 17, 2014
  8. London Colin

    London Colin Top Member

    I think the breakdown is a clue that I've made a fundamental mistake in my mathematics / logic. And after a little more thought, I think I see where I've gone wrong. As Yama and I managed to reason between us, any correct formula has to have the number of decks as a parameter. With one deck, there really is a zero probability of a suited pair. The more decks you start with, the closer you get to the infinite deck case.

    I think the key is that there is only ever one 'seen' card with regard to suit, and the impact of its removal has to be weighed against the constant number of unseen cards of the same rank. E.g., With 4 decks you have 16 aces, meaning 15 unseen aces (as far as suit is concerned), regardless of whether those aces have been played or are still in the shoe.

    So, given that both cards are the same rank, I think the probability that the second is of the same suit as the first is (d - 1)/(4*d - 1), where d is the number of decks -

    1D : (1 - 1)/3 = 0
    i.e. 0 of the remaining 3 aces is of the same suit.

    2D : (2 - 1)/7 = 0.1429
    i.e. 1 of the remaining 7 aces is of the same suit.

    4D : (4 - 1)/15 = 0.20
    i.e. 3 of the remaining 15 aces are of the same suit.

    [And infinite deck would be 0.25]

    So I think I need to apply this condition to the formula for a pair -
    a/N * (a-1)/(N-1) * (d - 1)/(4*d -1)

    I've not checked yet, but hopefully this gives the same result as my first attempt in the case of a full-shoe , but differs more and more as the shoe is depleted. (That's if I haven't made yet another blunder.:rolleyes:)
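
    That comparison can be run directly. The sketch below codes the deck-aware pair formula next to the earlier (a/4 - 1) version and evaluates both for a full shoe and for a depleted one; the function names and the depleted composition (4 aces among 100 cards from a 4-deck shoe) are made up for illustration.

```python
from fractions import Fraction

def same_suit_given_pair(d: int) -> Fraction:
    """P(second card matches the first card's suit | same rank), d decks."""
    return Fraction(d - 1, 4 * d - 1)

def suited_pair(a: int, N: int, d: int) -> Fraction:
    """Deck-aware formula: a/N * (a-1)/(N-1) * (d-1)/(4d-1)."""
    return Fraction(a, N) * Fraction(a - 1, N - 1) * same_suit_given_pair(d)

def suited_pair_first_attempt(a: int, N: int) -> Fraction:
    """Earlier formula: a/N * (a/4 - 1)/(N-1)."""
    return Fraction(a, N) * (Fraction(a, 4) - 1) / (N - 1)

for d in (1, 2, 4):
    print(f"{d}D: {float(same_suit_given_pair(d)):.4f}")

d = 4
# Full shoe: 4d cards of the rank among 52d cards
print(suited_pair(4 * d, 52 * d, d) == suited_pair_first_attempt(4 * d, 52 * d))
# Depleted shoe (hypothetical): 4 aces left among 100 cards
print(float(suited_pair(4, 100, d)), float(suited_pair_first_attempt(4, 100)))
```

    With a = 4d the factor (a - 1) cancels against (4d - 1), which is why the two versions coincide for the full shoe and only drift apart as cards of the rank are removed.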
     
    Last edited: Jun 17, 2014
  9. gronbog

    gronbog Top Member

    Heh -- looks like we were both working at the same time and that both of our revised approaches consider the initial number of decks. Great minds think alike or the blind leading the blind?

    I will try to find some time to review your approach soon. It would be cool if they were both equivalent!
     
  10. London Colin

    London Colin Top Member

    Thanks. I haven't quite taken in all the detail of your post yet. I've been looking at the screen too long, and my brainpower is fading! But I think you are right that both pair and non-pair need a similar approach. So I only did half the job in my version.
     
  11. London Colin

    London Colin Top Member

    I really should take a break from this; I'm going round in circles, forever contradicting myself! ...

    Right now, I'm back to thinking that I don't need to modify my non-pair formula.

    The logic is that since the only 'seen' card with respect to suit is from a different rank to the second card, the stock of each suit is definitely un-depleted with respect to the second card. So there is exactly a 1/4 chance of each suit, including the suit that matches the first card. And this is true regardless of the initial number of decks.
     
  12. gronbog

    gronbog Top Member

    I've thought about it some more and I think I now agree with you. At first I thought that the fewer cards of a given rank remaining, the higher the probability of getting a suited one. However, it is almost always useful to study the extreme cases and, in the extreme case of all of the cards of a given rank remaining, we know for a fact that the probability of getting a suited one is 1/4. The same holds true if only 1 card of that rank remains. Since the function representing the intermediate probabilities would be linear, those probabilities are also 1/4.
     
  13. KenSmith

    KenSmith Administrator Staff Member

    I'm not sure I want to go down this rabbit-hole. :D

    But, here's how I would approach the problem...
    For each rank, calculate the probability of each possible distribution of suits of that particular rank.
    Using that information you could arrive at a probability of each possible distribution of suits overall.

    For example, let's say we started with 6 decks and now there are only four aces left of the original 24, along with whatever other ranks remain.
    How many ways can you select 4 aces from the original 24? COMBIN(24,4) = 10,626

    There are 35 distinct ways those 4 aces could be split into the various suits.
    For each one of those ways, we can easily figure the exact probability.
    How likely are there to be two hearts, one diamond, and one spade?
    COMBIN(6,2) * COMBIN(6,1) * COMBIN(6,1) * COMBIN(6,0) = 540
    (There were six Aces of hearts to begin with, so how many ways can we choose the two that remain, etc.)

    Thus, this particular suit distribution for the Aces has probability 540 / 10,626.
    The same probability is true for any 2/1/1/0 group of the suits.
    Different probabilities apply for the 4/0/0/0, 3/1/0/0, 2/2/0/0, and 1/1/1/1 groups.

    You could use this process for each rank that remains in the deck, and then mash all the combinations together to get a big list of all the suit distribution possibilities that ignores the independent ranks.
    (No need to concern ourselves with pairs/non-pairs. They're just suits now.)

    Unfortunately, this approach is not something you can plug into a spreadsheet with the rank distributions and get an answer.
    It will take software. Unless I am overlooking a major simplification.
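
    For a single rank, the counting described above is only a few lines of code. The sketch below enumerates the suit splits for the 6-deck, 4-aces-left example; the suit order in the tuples (hearts, diamonds, clubs, spades) and all names are arbitrary choices for illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb, prod

decks, aces_left = 6, 4            # the example: 4 of the original 24 aces remain

total_ways = comb(4 * decks, aces_left)

# Every ordered split (hearts, diamonds, clubs, spades) that sums to aces_left
splits = [s for s in product(range(aces_left + 1), repeat=4)
          if sum(s) == aces_left]

def ways(split):
    """Number of ways to leave this many aces of each suit."""
    return prod(comb(decks, k) for k in split)

probability = {s: Fraction(ways(s), total_ways) for s in splits}

print(total_ways, len(splits))
print(ways((2, 1, 1, 0)), float(probability[(2, 1, 1, 0)]))
```

    The ways for all splits add up to the total number of ways of choosing the 4 aces, so the per-split probabilities sum to 1. Extending this to whole-shoe suit distributions means combining such tables across every rank, which is where the real work lies.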
     
    gronbog likes this.
  14. London Colin

    London Colin Top Member

    Thanks, Ken. That's the kind of complexity I started out thinking might be necessary, but then I convinced myself that there really is almost nothing to this problem. It seemed to me the thing to recognise is that, once you have the probability of selecting an ace from those remaining in the shoe, the ace that you get is then equally likely to be any of the aces whose suit you have not seen. (The fact that they are in two piles, the shoe and the discards, doesn't change anything.)

    So in the non-pair case you are selecting one of d aces from the total of (d*4), which is obviously always 1/4, regardless of the number of decks.

    And in the pair case, you have already accounted for one ace of the required suit, so you are now selecting one of the (d-1) remaining aces of that suit from the (4*d-1) unseen aces, i.e. a probability of (d-1)/(4*d-1).

    p(A,2, suited) = p(A,2)/4
    p(A,A, suited) = p(A,A) * (d-1)/(4*d-1)
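
    One way to probe these two factors is to average over every possible suit split of the cards left in a rank, weighting each split by the number of ways it can arise, and compare the result with 1/4 and (d-1)/(4*d-1). The sketch below does that with exact fractions. It assumes that the cards already removed are random with respect to suit and that the splits of two different ranks are independent; the function names are invented for illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb, prod

def suit_splits(decks, left):
    """Yield (split, probability) for `left` remaining cards of one rank."""
    total = comb(4 * decks, left)
    for s in product(range(left + 1), repeat=4):
        if sum(s) == left:
            w = prod(comb(decks, k) for k in s)
            if w:  # skip splits needing more than `decks` cards of one suit
                yield s, Fraction(w, total)

def p_pair_same_suit(decks, left):
    """P(two cards drawn from one rank share a suit), averaged over splits."""
    return sum(p * Fraction(sum(k * (k - 1) for k in s), left * (left - 1))
               for s, p in suit_splits(decks, left))

def p_nonpair_same_suit(decks, left_a, left_b):
    """P(one card from each of two ranks share a suit), averaged over splits."""
    return sum(pa * pb * Fraction(sum(x * y for x, y in zip(sa, sb)),
                                  left_a * left_b)
               for sa, pa in suit_splits(decks, left_a)
               for sb, pb in suit_splits(decks, left_b))

d = 6
for left in (2, 3, 4, 5):
    print(left, p_pair_same_suit(d, left), Fraction(d - 1, 4 * d - 1))
print(p_nonpair_same_suit(d, 4, 3))
```

    Under those assumptions the averaged values do not depend on how many cards of the rank are left, which is the property the two formulas rely on.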

    Sorry to drag you further down the rabbit hole, but can you see a flaw in the above logic?
     
    Last edited: Jun 20, 2014
  15. gronbog

    gronbog Top Member

    It could be that after summing the probabilities of each suit distribution, then results come out the same. It would be interesting to find out.

    Colin, you implied in an earlier post that you were working on some "code" associated with this. If so, could you add Ken's algorithm to it as a double check?
     
  16. London Colin

    London Colin Top Member

    I'm all for gathering empirical evidence, but the reasoning behind what I (finally) came up with is so straightforward and fundamental that if it's wrong in some way, I think it should be very evident at a theoretical level. (And very important for me to understand the nature of the error.)

    Not easily, I don't think.

    However, at some point I would like to enhance the combinatorial code I have, so that it can deal with suited cards. That is, at every point where I would currently get the prob of dealing a particular card [1..10], I would instead separately get the 4 probs of the suited versions of each card. (And it would also be useful to be able to split the 10s into T,J,Q,K as well.)

    With that, I could effectively do what Ken is describing, going through the probability of every possible suited/non-suited hand, one by one.

    It's the fact that my code doesn't currently support this that led me down the path of this thread. (But then, also, I realised that such code would take a lot longer to run; so it's better to do it the way I have outlined, if I can prove that my formulae are correct.)

    In the shorter term, I also have some simulation code on the go, which would provide another cross-reference possibility.
     
    Last edited: Jun 20, 2014
  17. KenSmith

    KenSmith Administrator Staff Member

    Colin, what you describe makes sense and I do not see the flaw if one exists.
    But, like you, I am disturbed by the breakdown at small subsets.
    I don't see a reason for that to happen if the idea is sound.

    Still thinking.
     
  18. London Colin

    London Colin Top Member

    There was a breakdown at small subsets in my earlier attempt, where I thought I had to relate the selection of the second card to the average number of remaining cards of the given rank and suit.

    But I wasn't aware of any breakdown in the new version. Have you spotted something?
     
  19. KenSmith

    KenSmith Administrator Staff Member

    You're right. There is no breakdown once you brought in the original number of decks.

    I believe your current version to be exactly correct. I consider the question resolved.
     
    gronbog and London Colin like this.
