Just be glad that you didn't go for the 2 envelopes problem which was her next. ...

hansvm · 2024-12-09T20:55:00 1733777700

The two reals are selected via some distribution, and the only way you can do better than chance is if you have some knowledge of that distribution.

The question leaves that distribution completely hidden, and your answer smuggles it back in. That feels less like a counter-intuitive math/stats question and more like a badly worded gotcha.

ColinWright · 2024-12-09T23:03:57 1733785437

This turns out not to be the case.

Let's play this game exactly once.

You choose two unequal real numbers. I don't know what they are, and I don't know the distribution from which you choose them. You write them down and put them in separate envelopes.

I'm allowed to choose one envelope and open it to see the number inside, and my job is then to say which envelope holds the larger number.

I claim I have a strategy now which lets me win strictly more than 50% of the time. My strategy is this.

I choose a real number R at random from a distribution that has dense support. In other words, for any two reals, L and U with L<U, P(L<R<U) > 0. This is easy to do ... one method is to list the rationals, positive and negative, then roll a die, discarding numbers until you get a 6.

Now I flip a coin and thereby choose an envelope at random. I proceed by assuming my chosen number is between your two numbers. There is a non-zero chance this is true ... call it e. So e>0.

If I'm wrong then my choice is 50% ... probability is 1-e.

If I'm right then my choice is 100%. ... probability is e.

Combined, my chance of being right is 0.5(1-e) + e = 0.5+e/2, which is strictly greater than 50%.

You can make it as small as you like, and if we play the game repeatedly then you can make it approach 50%. But as it stands, with a one-off game, I can win with a probability that depends on your chosen numbers, but which is strictly bigger than 50%.

ajnin · 2024-12-10T08:25:34 1733819134

I'm not a mathematician so please bear with me here, but I think a problem stems from the fact that the set of reals is "infinite". So, whatever interval you choose, there are infinitely more reals outside the interval as inside (by that I mean that you can fit an infinite number of copies of that interval up to infinity). So the probability e is not >0, it is effectively 0. The second problem is, what does it mean to choose a real at random ? There is an implication that you can choose such number, but as a human living in the finite universe there are limitations to your choice. Any number you can write using all the atoms in the universe is infinitely outnumbered by all numbers that you can't. So effectively it is impossible to pick a random real number. You have to pick a real in some interval, implicitly the interval of reals you can write in an envelope. Which is a different problem than stated originally and for which your "e" can be >0.

raincole · 2024-12-10T12:36:26 1733834186

> The second problem is, what does it mean to choose a real at random ?... So effectively it is impossible to pick a random real number

Yes, it's established there isn't "uniform distribution over all real numbers" without violating axiom of probability. You're 100% correct on this.

But it doesn't make Colin's solution wrong, because e > 0 for any* well-defined distribution.

> Which is a different problem than stated originally

There are two ways to inteprete the original problem:

A. The numbers are truly randomly picked over all real numbers.

B. The numbers are picked from a well-defined distribution which is unknown to the player.

Since A. is invalid mathematically speaking (without changing the commonly accepted definition of probability), it's reasonable to only consider B., in which case, Colin's solution is correct.

I made a more intuitive explantion on why a strategy better than coin toss exists here: https://news.ycombinator.com/item?id=42372972

*: More strictly, any distribution that guarantees the probability that the two numbers in envelope are the same = 0.

btilly · 2024-12-10T15:49:33 1733845773

How the numbers in the envelopes were picked doesn't matter. What's important is that they exist, and are specific numbers.

ColinWright · 2024-12-10T09:58:46 1733824726

I'm not a mathematician so please bear with me here

I am a mathematician, so please bear with me when I try to explain how this can work.

The rational numbers are countable, and that means that I can write a list of them. There are several ways of doing this, but personally I like the Calkin-Wilf tree[0]. That only gives the positive ones, but we can include zero and the negative ones by interleaving them.

So, whatever interval you choose, there are infinitely more reals outside the interval as inside (by that I mean that you can fit an infinite number of copies of that interval up to infinity). So the probability e is not >0, it is effectively 0.

One you have chosen the two numbers, L and U, I note that there are rational numbers in between. Choose one of those numbers, call it M.

M is in my list above. Now I roll a die, discarding numbers from the list until I get a 6. There is a non-zero probability that the number retained is M, so there is a non-zero probability that my chosen number is between L and U. So e is definitely non-zero.

The second problem is, what does it mean to choose a real at random?

It doesn't have to be uniformly at random -- that's the mistake nearly everyone makes -- and the above process does it perfectly well. It only ever chooses a rational number, but that's OK. It's still a real number, it's still a random number, and for any non-empty interval, there is a non-zero chance the chosen number is inside.

... as a human living in the finite universe there are limitations to your choice.

Yes, but that is accounted for in the explicit description of how to choose the number.

Any number you can write using all the atoms in the universe is infinitely outnumbered by all numbers that you can't.

Again, this is accounted for by the fact that we are not choosing uniformly at random.

[0] https://en.wikipedia.org/wiki/Calkin%E2%80%93Wilf_tree

vbarrielle · 2024-12-10T17:12:40 1733850760

Am I right to assume one could also sample from a Gaussian distribution for the method to work? Of course, the probability e of sampling between the two real numbers would be very small, but it would be nonzero.

ColinWright · 2024-12-11T20:19:00 1733948340

Yes, any distribution with dense support works. Your problem is to prove that your method of number selection really does have a non-zero probability of being between any two distinct reals.

So ... yes, choose from a Gaussian, but then you have to tell me exactly how.

That's tricky.

hansvm · 2024-12-10T00:34:30 1733790870

Why, exactly, are you allowed to know why some random guess is between the two numbers or not when computing your choice?

hansvm · 2024-12-10T02:57:12 1733799432

Sorry, too late to edit, I see you're describing how you know your choice was good/bad (I thought you were describing a probabilistic choice).

raincole · 2024-12-09T21:00:10 1733778010

Yes, this two real numbers question suffers from hidden conditions as the original version of Monty Hall problem does, but even more explicitly.

We can't have an uniform distribution over all real numbers, so it's quite pointless to discuss if looking into the envelope gives any new information, cause we don't even know the distribution yet.

btilly · 2024-12-09T21:49:22 1733780962

You don't need a uniform distribution to get guaranteed better than even odds. And many nonuniform distributions work just fine.

There are no hidden conditions. It is just a shocking result that we don't expect.

raincole · 2024-12-09T23:02:21 1733785341

https://www.alexirpan.com/2015/09/09/the-other-two-envelope-...

If your solution is the same as this article's, it's plain wrong. Even the natural number case is plain strong.

It's very easy to demostrate as well: consider a trivia case where the distribution is just {P(1)=1/3, P(2)=1/3, P(3)=1/3} and you see 2 in the first envelope. There is no strategy to get a better chance than 50%. Therefore, any strategy that gives a better chance than 50% must implicitly make an assumption over the initial distribution (and therefore excludes a distribution like {P(1)=1/3, P(2)=1/3, P(3)=1/3})

Actually the article is even "wronger" than this, because "started A" and "switched" aren't independent and one can't simply use the product of their probability. The above example is a quick way to demonstrate it's not a general strategy without assumption to get >50% winning chance. Similarily, one can just use {P(1)=1/3, P(2)=1/3, P(3)=1/3} (this is a valid distribution over real numbers!) to demonstrate the real number strategy isn't general.

Again, for both natural number and real number case, the discussion over strategies is only meaningful is we know something about the distribution.

Interestingly, this article is wrong more or less in the same way as believing switching does give you more expected value in the original "twice money in another envelope" variation.

Edit: For people who are interested in the switching strategy, check Randomized Switching in the Two-Envelope Problem (2009). Spoiler: full of discussion over the initial distribution.

LudwigNagasena · 2024-12-10T01:00:06 1733792406

> It's very easy to demostrate as well: consider a trivia case where the distribution is just {P(1)=1/3, P(2)=1/3, P(3)=1/3} and you see 2 in the first envelope.

Seeing 2 is only one of the many possible cases. You haven’t calculated the total probability.

btilly · 2024-12-09T23:52:06 1733788326

Sorry, but you're simply wrong. You can read the answer I gave. You can read the answer Colin Wright gave. You can trust that we both have math degrees and know what we are talking about. Or, aw heck, you can try it with an actual program at https://www.perlmonks.org/?node_id=39630. (Yes, I wrote that piece of hackery about a decade ago.)

I don't actually care how you convince yourself. But the explanation is right. If your random number is outside of the range, you've got even odds. If it is inside of the range, you've got 100% odds. As long as there is a positive probability of being between, you've got strictly better than even, by half of the probability of being between.

Many, many distributions guarantee positive odds of being in between. The one I chose for my program was:

    (log(rand) * (flip_coin() ? 10 : -10 ))

Which is the log of a random number between 0 and 1, times 10 times + or - with even odds. The various factors were chosen to fit well with normal human choices that most seek to test it with.

raincole · 2024-12-10T01:47:20 1733795240

You're correct. I realized that I completely misunderstoond the original problem after reading McDonnell's paper more carefully.

I thought you meant the strategy can make the winning chance always >50% even after the player opens the first envelope, which isn't possible.

However you actually meant the strategy can make the expected winning chance >50% before the player opens the first envelope, for any well-defined distribution of real number, even the distribution is not known to the player, which now I realize is true.

(I haven't thought through some edge case like Cantor distribution, but now I incline to it's true not just for "many distributions". Of course for a discrete distributions, we need to specifiy the two envelopes can't have the same number. Besides that, it seems to hold true for any distribution?)

btilly · 2024-12-10T15:51:57 1733845917

Exactly right. After you've picked the envelope, you may be nearly guaranteed to be wrong. For example both numbers are large positives and you picked the smaller. Now you're almost guaranteed to guess larger, and be wrong.

But before you pick, your odds were still bigger than 50%. Just not by much.

hansvm · 2024-12-10T00:44:46 1733791486

That code has the same problem that Colin Wright's explanation does. Your computation of the success rate explicitly uses the fact that you're between the two values, but till you've seen the second value you can't possibly know whether you are or are not and thus can't adjust your guess based on that fact.

LudwigNagasena · 2024-12-10T01:39:56 1733794796

The success is checked after the decision is made to, well, verify the decision. The guess isn't adjusted on whether the number is between two values.

The strategy is straightforward and bulletproof (if you allow a random generator of real numbers, otherwise you may keep tossing coins indefinitely): keep tossing coins until you get tails. If the number you saw is less than the number of heads you got, you don't switch.

For the simplest case assume that one envelope always contains 1 and another always contains 2. You choose one envelope randomly, so in 50% of cases you get 1, which you switch in 50% of cases. And in 50% of cases you get 2, which you switch in 25% of cases. Hence, you pick the higher number in 62.5% of cases. The same works with any numbers N, M; or any complex distributions; or even real numbers with a bit more complicated strategy. You don't have to know whether you are between two values in advance, you just have to guess.

btilly · 2024-12-10T15:54:03 1733846043

No, I'm not doing anything other than tracking it. I'm doing that to show the user, "How much of this success was chance 50% choices, how much was guaranteed?" And so people can see how much of a good or bad result was blind luck versus the strategy successfully sticking a thumb on the scales.

In other words, I'm merely trying to be informative.

raincole · 2024-12-10T01:48:27 1733795307

Edit Edit: I was wrong. I completely misunderstood the problem. See my other comment below.

hansvm · 2024-12-09T22:25:12 1733783112

You need some _known_ distribution though, and it's shocking because the distribution is ommitted from the question, and the presence of the same distribution is snuck into the answer.

ColinWright · 2024-12-09T23:05:03 1733785503

I think you are wrong ... see my answer here:

https://news.ycombinator.com/item?id=42371563

hansvm · 2024-12-10T00:48:23 1733791703

Not to be too dismissive, but the title text of this xkcd [0] seems relevant. No matter how nice the explanation is, the fact that the conclusion is wrong suggests that the reasoning has a flaw. I took a stab at what I think that flaw is when I responded to your rebuttal [1].

[0] https://xkcd.com/2217/

[1] https://news.ycombinator.com/item?id=42372285

raincole · 2024-12-10T02:11:03 1733796663

I dimissed it at first (you can see I made some embarrassing comments above), but the strategy is correct. I'd argue it's even intuitively correct once the problem in question is clarified.

It's not saying that after the player see the number in the first envelope, the strategy guarantees a >50% outcome.

It's saying that give any distribution, over all possible outcomes, >50% times the strategy will end up pick the larger number. You can say this >50% is the expected winning chance before the player see the number in the first envelope.

I'd say this is "intuitve" because, if your strategy can guarantee "when the player see a large number in the first envelope, he's less likely to switch than if he saw a small number", it would be better than blindly switching by coin toss. So intuitively such a strategy exists.

The only "trick" here is that since the player doesn't know the initial distribution, they can't tell "how large counts as large?" therefore they needs something that preserves some property over the whole real number line. That's why the strategy involves sampling from a another distribution whose PDF is non-zero everywhere.

hansvm · 2024-12-10T02:38:46 1733798326

Sorry, I was too dismissive. I'm not crazy about the explanation, but I think you're right.

raincole · 2024-12-09T20:13:56 1733775236

"Two Envelopes Problem" seems to be usaully referred to a very different problem: https://en.wikipedia.org/wiki/Two_envelopes_problem

btilly · 2024-12-09T20:22:12 1733775732

I was dealing with a more general variant of the version that Vos Savant gave. Which is that you were able to look at the contents of the envelope that you got. She claimed you didn't gain useful information. Calling it "information" is problematic, but it certainly is useful, because the stated strategy then works!

This is one reason that probability theorists have learned to be more careful in stating the double money version to rule out the solution that I gave. But my version has been in the literature since, I believe the 1950s. Under, as I first encountered it, the same exact name. (I encountered it from Dr Laurie Snell at Dartmouth College in the mid 1990s.)