Closed-box collecting

Summary: Collecting objects in sets is a popular pastime that can require a great deal of effort, and such collections have inspired mathematical investigations.

A set of closed-box collectibles is a set of similar objects that are sold interchangeably. The objects might be anything: cards, figurines, or trinkets, for example. The term “closed-box” means that the consumer purchases each item without knowing exactly which one it will be; the package contains a random item from the collection. Because some collectibles may be rarer or more valuable than others, and because individual preferences vary, side markets for these collectibles may emerge, with an identified item selling for many multiples of the price of a random one. Baseball cards, collectible card games, and other trading cards are familiar examples of closed-box collectibles. Toy prizes in cereal boxes and Kinder Surprise eggs are other examples. The question of how many purchases are needed to complete such a set is one of the classic problems of probability theory, known as the Coupon Collector Problem. It has many extensions and can be solved by many methods, including combinatorics and generating functions.

Promotional Contests

Many businesses have run promotional contests based on closed-box collecting. Two well-known examples are McDonald’s annual Monopoly game and Subway’s Scrabble game. In these cases, certain purchases come with one or more random game pieces, each of which is one of a large number of types. The game pieces come in various groups; a complete group of collectibles can be exchanged for a contest prize.

From the perspective of the business running such a promotion, the contest design creates certain mathematical problems. What proportion of the game pieces should be manufactured of each type? A main goal is to minimize unpredictability. If too many grand prizes are collected, the company may have to pay out a substantial amount of money; this might be too great a risk to tolerate even if it is very unlikely. On the other hand, if too few major prizes are awarded, the public may become dissatisfied, negating the public relations goals of the promotion. The usual solution for the significant prizes is to make one type of piece in each group extremely rare, manufacturing only as many of that piece as the number of prizes the company intends to pay out. The other types can then be made relatively common without risk. This system generally maintains public interest by giving a large number of people the “feeling” of getting closer to winning a big prize as they accumulate common tokens in the group, without risking a huge payout.

Expectations in Closed-Box Collecting

Suppose that a consumer is interested in one particular collectible from a set, and the consumer decides to purchase collectibles one at a time until getting the desired one. Assume each collectible purchased will be the desired kind with probability p, independently of the others. (In real life, this assumption will not be strictly valid, but the discrepancy is negligible if the number of collectibles purchased by an individual is small compared to the total number in existence.) The chance that a given purchase is not the kind desired is then 1-p, and the number of purchases X needed to get the desired item is modeled by a geometric random variable. The probability of getting the desired item on the first try is p, on the second try is p(1-p), on the third try is p(1-p)^2, and so on. Then the expectation is

$$E[X] = \sum_{k=1}^{\infty} k \, p \, (1-p)^{k-1}.$$

Standard techniques of basic analysis now show that this sum equals 1/p, so the expected number of purchases needed is 1/p.
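One way to carry out that step is sketched here. It assumes 0 < p ≤ 1, writes the expectation as the sum of k p (1-p)^(k-1) over all tries k = 1, 2, 3, ..., and introduces q = 1-p as shorthand for this sketch only. Differentiating the geometric series term by term, which is permitted for |q| < 1, gives

$$\sum_{k=1}^{\infty} k \, p \, q^{k-1} = p \, \frac{d}{dq} \sum_{k=0}^{\infty} q^{k} = p \, \frac{d}{dq} \, \frac{1}{1-q} = \frac{p}{(1-q)^{2}} = \frac{p}{p^{2}} = \frac{1}{p}.$$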

It should be emphasized that this is the expectation in the sense of probability theory and that there are some common misconceptions about what it means. If the probability of getting the desired item is 1/100, this does not mean that 100 is the most likely number of purchases, nor that the 100th item is any more likely to be the desired type than any other. It means that on average—in the long run—it will take 100 tries to get the desired item. This also means, for example, that when rolling a fair die, it will take an average of six tries to roll a 1, squaring well with intuition.
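The distinction between the average and the most likely outcome can be checked numerically. The following is a minimal Python sketch, not part of the original discussion: the function names, the seed, and the number of trials are illustrative choices, and the simulation assumes independent purchases with success probability p = 1/100.

```python
import random


def purchases_until_success(p, rng):
    """Count purchases up to and including the first desired item."""
    count = 1
    while rng.random() >= p:  # this purchase was not the desired item
        count += 1
    return count


def geometric_pmf(p, k):
    """Exact probability that the first desired item arrives on purchase k."""
    return p * (1 - p) ** (k - 1)


p = 0.01
rng = random.Random(1)  # fixed seed (arbitrary) so reruns give the same numbers
samples = [purchases_until_success(p, rng) for _ in range(20000)]

mean_purchases = sum(samples) / len(samples)
share_within_100 = sum(1 for s in samples if s <= 100) / len(samples)

print(f"simulated mean: {mean_purchases:.1f} (theory: {1 / p:.0f})")
print(f"share finished within 100 purchases: {share_within_100:.3f}")
print(f"P(success on purchase 1)   = {geometric_pmf(p, 1):.5f}")
print(f"P(success on purchase 100) = {geometric_pmf(p, 100):.5f}")
```

The simulated mean should land close to 100, while the exact probabilities show that the first purchase is the single most likely one to succeed and the 100th is less likely than any earlier one. In theory, only about 63% of collectors (1 - 0.99^100) finish within 100 purchases.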

Another important issue in understanding the dynamics of closed-box collecting is the expected number of purchases needed to collect a complete set. Suppose that there is a set of 100 collectibles, each item purchased being equally likely to be any of the hundred types. If a consumer purchases collectibles one at a time until obtaining a complete set, how many purchases will be made? It will take one purchase to get the first type. With one type in hand, each purchase will add to the collection with probability 0.99, so the consumer expects to purchase 100/99 more items to get the second type. With two types in hand, each purchase will add to the collection with probability 0.98, so the consumer expects to purchase 100/98 more items to get the third type. This process continues until the consumer has all the types but one; then each purchase will complete the collection with probability 0.01, so the consumer expects to purchase 100/1 more items to get the last type, completing the collection. This process indicates a total of

$$\frac{100}{100} + \frac{100}{99} + \frac{100}{98} + \cdots + \frac{100}{1} = 100 \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{100} \right)$$

purchases, about 519. In general, if there are n types, then the expectation is

$$\frac{n}{n} + \frac{n}{n-1} + \cdots + \frac{n}{1} = n \left( 1 + \frac{1}{2} + \cdots + \frac{1}{n} \right)$$

purchases. For large n a good approximation is

$$n \ln n + \gamma n + \frac{1}{2} + o(1)$$

purchases to collect all n objects. The constant γ ≈ 0.5772156649 is the Euler–Mascheroni constant, named for Leonhard Euler and Lorenzo Mascheroni. The symbol o(1) is not a constant but a notation, common in mathematics and computer science, for a quantity that converges to zero as n grows, so that it is negligible for very large n.
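The exact expectation and the large-n approximation can be compared with a short Python sketch. It assumes the exact value is n(1 + 1/2 + ... + 1/n), as in the step-by-step argument above, and takes the approximation to be n ln n + γn + 1/2 with the o(1) term dropped. The function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler–Mascheroni constant, as quoted in the text


def expected_purchases(n):
    """Exact expectation n/n + n/(n-1) + ... + n/1 for n equally likely types."""
    return sum(n / k for k in range(1, n + 1))


def approx_expected_purchases(n):
    """Large-n approximation n ln n + gamma*n + 1/2, dropping the o(1) term."""
    return n * math.log(n) + EULER_GAMMA * n + 0.5


for n in (6, 100, 1000):
    exact = expected_purchases(n)
    approx = approx_expected_purchases(n)
    print(f"n={n:5d}  exact={exact:10.3f}  approx={approx:10.3f}")
```

For n = 100 the two values agree to within a small fraction of one purchase, which illustrates why the approximation is useful even for moderately sized sets.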

This illustration gives insight into why further progress seems harder and harder the further a collector gets. In the example of collecting a complete set of 100 collectibles, with each purchase equally likely to be any of the hundred types, the expected number of purchases needed is about 519. Suppose now that one has accumulated a collection of 50 different items; is that really halfway to a complete collection? By a similar analysis, the expected number of additional purchases to collect the remaining 50 items is 100/50 + 100/49 + ... + 100/1, or about 450. So there is a meaningful sense in which 450/519 of the collecting task is still undone; a more accurate description of the progress is that the collection is about 13.3% complete. In the sense of expectation, one is not really halfway through collecting 100 items until obtaining the 93rd item. While the assumption that all types are equally likely does not usually hold in practice (some types are rarer, some more common), the qualitative conclusion applies in general, unless a few of the items are extremely rare.
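These progress figures can be reproduced with a few lines of Python. This is a sketch under the same equal-likelihood assumption; the function names are illustrative, and "halfway" is taken to mean that the expected number of remaining purchases is at most half of the expected total.

```python
def expected_remaining(n, owned):
    """Expected further purchases when `owned` of n equally likely types are held."""
    missing = n - owned
    return sum(n / k for k in range(1, missing + 1))


def halfway_point(n):
    """Smallest number of owned types at which at most half the expected effort remains."""
    total = expected_remaining(n, 0)
    for owned in range(n + 1):
        if expected_remaining(n, owned) <= total / 2:
            return owned


total = expected_remaining(100, 0)
remaining_at_50 = expected_remaining(100, 50)
progress_at_50 = 1 - remaining_at_50 / total

print(f"expected total purchases:        {total:.1f}")
print(f"expected remaining at 50 owned:  {remaining_at_50:.1f}")
print(f"share of effort done at 50:      {progress_at_50:.1%}")
print(f"halfway point (types owned):     {halfway_point(100)}")
```

The halfway point is found by a direct scan because the expected remaining effort only decreases as more types are owned.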
