Binomial mass distribution
For a discrete random variable $X$ define the probability mass function
$$p(x) = P(X = x).$$
If $X$ can be only one of $x_1, x_2, \dots, x_n$ then $p(x_i) \ge 0$ for $i = 1, \dots, n$ and $p(x) = 0$ if $x \notin \{x_1, \dots, x_n\}$. We also must have $\sum_{i=1}^{n} p(x_i) = 1$.
A single Bernoulli random variable is one for which $P(X=1) = p$ (success) and $P(X=0) = 1 - p = q$ (failure).
We construct the generating function (in statistical physics this is the partition function) for multiple trials of a single Bernoulli process (such as shooting arrows at a bullseye) using $x$ as a simple place-holder or enumerator: for one trial
$$Z_1 = q + px,$$
and for $N$ trials, since all trials are independent, all probabilities are multiplicative and
$$Z_N = (q + px)^N = \sum_{n=0}^{N} \binom{N}{n} p^n q^{N-n} x^n.$$
This should look very familiar; this is the binomial distribution. Note that in a generating function the coefficient of $x^n$ is the probability of the event characterized by $n$ successes.
Bernoulli-binomial random variable. The Bernoulli random variable $n$ (a non-negative integer) counting successes is also called the binomial. Calling its probability mass function $P(n)$,
$$P(n) = \binom{N}{n} p^n (1-p)^{N-n}$$
gives the probability of $n$ successes and $N-n$ failures in $N$ independent trials with individual success probability $p$. Note that (if $q = 1-p$)
$$\sum_{n=0}^{N} P(n) = (p+q)^N = 1.$$
Example. An archer has a probability of making a bull's eye of $p$ with each shot. What is her average score in a set of six shots?
The probability of $n$ bull's eyes in the set is
$$P(n) = \binom{6}{n} p^n (1-p)^{6-n},$$
so her average will be
$$\langle n \rangle = \sum_{n=0}^{6} n \binom{6}{n} p^n (1-p)^{6-n} = 6p.$$
You can see that performing the sum could be mathematically difficult if $P(n)$ is complicated.
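The sum above is easy to check numerically. A short Python sketch (the value $p = 0.3$ is an assumed illustration, not given in the problem) performs the sum directly:

```python
from math import comb

def binomial_pmf(n, N, p):
    """P(n) = C(N, n) p^n (1-p)^(N-n), the binomial mass function."""
    return comb(N, n) * p**n * (1 - p)**(N - n)

p = 0.3   # assumed bull's-eye probability, for illustration only
N = 6     # six shots
# average number of bull's eyes: sum over n of n * P(n)
mean = sum(n * binomial_pmf(n, N, p) for n in range(N + 1))
print(mean)   # agrees with N*p
```

The direct sum reproduces $\langle n \rangle = Np$ without carrying out the algebra.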
Example. The probability that a given atom out of $N$ trapped in volume $V$ will be in subvolume $v$ is $p = v/V$, so the probability that $n$ of the $N$ atoms are in $v$ is $P(n) = \binom{N}{n} (v/V)^n (1 - v/V)^{N-n}$. This was part of a qualifier problem.
Geometric random variable. Consider independent trials, each with a success probability $p$, performed until a success occurs. If $X$ is the number of trials required until success,
$$P(X = n) = (1-p)^{n-1} p, \qquad n = 1, 2, \dots$$
Note that
$$\sum_{n=1}^{\infty} P(X = n) = p \sum_{n=1}^{\infty} (1-p)^{n-1} = 1.$$
$X$ is a geometric random variable. A distribution is memory-less if $P(X > n + m \mid X > m) = P(X > n)$. The only distribution taking values in the positive integers that is memory-less is the geometric, since
$$P(X > n) = (1-p)^n,$$
and $(1-p)^{n+m} = (1-p)^n (1-p)^m$ clearly satisfies this.
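The memory-less property can be verified numerically from the tail probability $(1-p)^n$ alone (the value $p = 0.25$ and the choices of $n$, $m$ are assumed illustrations):

```python
p = 0.25                       # assumed success probability, for illustration

def tail(n):
    """P(X > n) = (1-p)^n for a geometric random variable."""
    return (1 - p)**n

n, m = 3, 5
# memoryless: P(X > n + m | X > m) = P(X > n + m) / P(X > m) = P(X > n)
conditional = tail(n + m) / tail(m)
print(conditional, tail(n))    # the two values coincide
```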
Poisson random variable. The Poisson variable $n$ (a non-negative integer) has mass function
$$P(n) = \frac{\lambda^n e^{-\lambda}}{n!}.$$
This is the limit $N \to \infty$ and $p \to 0$, with $\lambda = Np$ fixed, of the Bernoulli (binomial) variable.
Consider the simplest experiment that can be conceived of; within a vanishingly small time interval $\delta t$, a process either occurs (such as a nuclear decay), with probability $p$, or does not, with probability $q = 1 - p$. The probability that within a macroscopic time $t = N \, \delta t$, $n$ such events occur, is generated by
$$Z_N = (q + px)^N,$$
resulting in the binomial distribution
$$P(n) = \binom{N}{n} p^n q^{N-n}.$$
Notice that we say probability mass function rather than probability density function for a discrete distribution.
We call $Z_N(x)$ the generating function of the distribution, which is very useful in obtaining moments.
The expectation or mean of this distribution is
$$\langle n \rangle = \sum_{n=0}^{N} n \, P(n) = x \frac{dZ_N}{dx}\Big|_{x=1} = Np.$$
This distribution has two important limits.
First, if we consider the case of $N \to \infty$ but with $p \to 0$ such that $\lambda = Np$ remains constant, we get
$$\binom{N}{n} p^n = \frac{N(N-1)\cdots(N-n+1)}{n!} \, \frac{\lambda^n}{N^n} \to \frac{\lambda^n}{n!},$$
and then
$$q^{N-n} = \left(1 - \frac{\lambda}{N}\right)^{N-n} \to e^{-\lambda}$$
results in the Poisson distribution
$$P(n) = \frac{\lambda^n e^{-\lambda}}{n!}$$
with parameter $\lambda = Np$.
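The convergence of the binomial to the Poisson can be seen numerically; this sketch (with assumed values $\lambda = 2$, $N = 10^4$) compares the two mass functions term by term:

```python
from math import comb, exp, factorial

lam = 2.0                # fixed lambda = N * p
N = 10_000               # large N, small p
p = lam / N

for n in range(5):
    binom = comb(N, n) * p**n * (1 - p)**(N - n)
    poisson = lam**n * exp(-lam) / factorial(n)
    print(n, binom, poisson)    # columns agree to roughly 1/N
```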
Example. The probability that an ISP gets a call from a given subscriber in any given hour is $p$. The ISP has $N$ subscribers. What is the probability that the ISP will get $n$ calls in the next hour?
First we see that the most probable number of calls is
$$\lambda = Np$$
in an hour. Then
$$P(n) = \frac{\lambda^n e^{-\lambda}}{n!}.$$
The second important limit of the binomial distribution is used when we want to explore the region around the maximum value $n^* = Np$, which is the most probable value. We will expand the natural log of the distribution around $n^*$ by treating the discrete variable $n$ as if it were continuous, calling it $x$ for consistency with our previous examples;
$$\ln P(x) = \ln P(n^*) + (x - n^*) \frac{d \ln P}{dx}\Big|_{n^*} + \frac{(x - n^*)^2}{2} \frac{d^2 \ln P}{dx^2}\Big|_{n^*} + \cdots,$$
however the second term vanishes since $n^*$ is an extreme point of the distribution. Take the logarithm of the probability
$$\ln P(x) = \ln N! - \ln x! - \ln (N-x)! + x \ln p + (N - x) \ln q.$$
To evaluate the derivatives we use Stirling's approximation
$$\ln x! \approx x \ln x - x,$$
and so
$$\frac{d^2 \ln P}{dx^2}\Big|_{n^*} = -\frac{1}{n^*} - \frac{1}{N - n^*} = -\frac{1}{Npq},$$
which we can normalize via
$$\int_{-\infty}^{\infty} e^{-(x - Np)^2 / 2Npq} \, dx = \sqrt{2\pi Npq},$$
giving
$$P(x) = \frac{1}{\sqrt{2\pi Npq}} \, e^{-(x - Np)^2 / 2Npq},$$
which we recognize as the Bell curve (the Normal distribution), peaked around $Np$ with standard deviation $\sigma = \sqrt{Npq}$. Notice that the Maxwell-Boltzmann distribution is a normal distribution in velocity.
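How good is the Gaussian near the peak? A quick numerical comparison (the values $N = 1000$, $p = 1/2$ are assumed for illustration):

```python
from math import comb, exp, pi, sqrt

N, p = 1000, 0.5
q = 1 - p

def binom(n):
    """Exact binomial mass function."""
    return comb(N, n) * p**n * q**(N - n)

def gauss(x):
    """Normal density with mean N*p and variance N*p*q."""
    return exp(-(x - N * p)**2 / (2 * N * p * q)) / sqrt(2 * pi * N * p * q)

for n in (480, 500, 520):
    print(n, binom(n), gauss(n))   # nearly identical near the peak
```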
Fun fact: if , then .
Example (old format qualifier problem). A researcher performs counting of nuclear decays from a sample, making ten counts for ten seconds each, getting
$$N_1, N_2, \dots, N_{10}$$
counts. For how many seconds should she count in order to measure the decay rate accurate to a given fractional error $\epsilon$?
Nuclear decays are Poissonian random deviates, for which $\sigma = \sqrt{\langle N \rangle}$. We want to have
$$\frac{\sigma}{\langle N \rangle} = \frac{1}{\sqrt{\langle N \rangle}} \le \epsilon,$$
so we should count long enough so that the average number of counts gotten is $\langle N \rangle \ge 1/\epsilon^2$. The ten short counts give the current rate, and dividing $1/\epsilon^2$ by that rate gives the required counting time.
Example (Gibbs distribution in Physics 415/715).
Let the universe consist of a system $s$ and a reservoir $r$. The reservoir has $N_r$ boxes that can each hold $0$ or $1$ particle, and the system has $N_s$ such boxes. How many states does a universe containing $N$ particles have?
The number of ways that $N$ objects can be put into $N_s + N_r$ locations is
$$\Omega(N) = \binom{N_s + N_r}{N},$$
which is the coefficient of $x^N$ in
$$(1 + x)^{N_s + N_r} = (1 + x)^{N_s} (1 + x)^{N_r}.$$
Carefully expand the left side;
$$(1 + x)^{N_s} (1 + x)^{N_r} = \left( \sum_{n=0}^{N_s} \binom{N_s}{n} x^n \right) \left( \sum_{m=0}^{N_r} \binom{N_r}{m} x^m \right).$$
By comparing these two expressions and picking out the coefficient of $x^N$ on each side we obtain
$$\binom{N_s + N_r}{N} = \sum_{n} \binom{N_s}{n} \binom{N_r}{N - n} = \sum_{n} \Omega_s(n) \, \Omega_r(N - n),$$
where $\Omega_s(n) = \binom{N_s}{n}$ is the number of states of the system when populated with $n$ particles and $\Omega_r(N - n) = \binom{N_r}{N - n}$ is the number of states of the reservoir when populated with $N - n$ particles. These lead to mass functions since $n$ is a discrete variable; we would use densities if $n$ were continuous.
What is the probability that $s$ contains exactly $n$ particles?
This will be the ratio of the number of states with $n$ particles in $s$ and $N - n$ in $r$ to the total number of states
$$P(n) = \frac{\Omega_s(n) \, \Omega_r(N - n)}{\Omega(N)} = \frac{\binom{N_s}{n} \binom{N_r}{N - n}}{\binom{N_s + N_r}{N}}.$$
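Both the coefficient-matching identity and the normalization of this mass function are easy to verify numerically (the box counts below are assumed small values for illustration):

```python
from math import comb

Ns, Nr, N = 5, 20, 10          # assumed small box counts, for illustration

# coefficient matching: sum_n Omega_s(n) * Omega_r(N-n) = C(Ns+Nr, N)
total = sum(comb(Ns, n) * comb(Nr, N - n) for n in range(N + 1))
assert total == comb(Ns + Nr, N)

# P(n): probability the system holds exactly n particles
P = [comb(Ns, n) * comb(Nr, N - n) / comb(Ns + Nr, N) for n in range(N + 1)]
print(sum(P))                  # the mass function is normalized
```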
The Boltzmann entropy of the system containing $n$ particles is $S(n) = k_B \ln \big( \Omega_s(n) \, \Omega_r(N - n) \big)$, and {\bf Boltzmann's maximal entropy principle} says that the equilibrium population is the most probable. Use Stirling's formula to evaluate this: maximize the entropy with respect to $n$
$$\frac{1}{k_B} \frac{dS}{dn} = \ln \frac{N_s - n}{n} - \ln \frac{N_r - (N - n)}{N - n} = 0;$$
taking anti-logs
$$\frac{N_s - n}{n} = \frac{N_r - (N - n)}{N - n} \qquad \Longrightarrow \qquad \frac{n}{N_s} = \frac{N - n}{N_r},$$
we obtain the result that the most likely distribution of particles between system and reservoir is the one that makes their {\bf densities} $n/N_s$ and $(N - n)/N_r$ equal! Define the {\bf chemical potential}
$$\mu = -T \frac{\partial S}{\partial n},$$
which represents the work needed to move a particle from $r$ to $s$ when $s$ and $r$ are in thermal and matter equilibrium, and is at temperature $T$. Then for $s$
$$\mu_s = -T \frac{\partial S_s}{\partial n} = k_B T \ln \frac{n}{N_s - n}.$$
This is an interesting and exact result. Notice that the equilibrium condition says equilibrium between $s$ and $r$ occurs when
$$\mu_s = \mu_r;$$
then there is no energy cost to move particles from $s$ to $r$. Another observation is that if $\mu_s$ is very negative, then the density of $s$ (the number of particles in $s$) is low. Under those conditions matter would migrate from $r$ into $s$ until $\mu_s = \mu_r$; matter migrates from high $\mu$ to low $\mu$.
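The equal-density conclusion can be checked by brute force: find the $n$ that maximizes $\Omega_s(n)\,\Omega_r(N - n)$ and compare the two densities (the sizes below are assumed values for illustration):

```python
from math import comb

Ns, Nr, N = 30, 90, 40         # assumed sizes, for illustration

# weight of each partition of the N particles between system and reservoir
weights = {n: comb(Ns, n) * comb(Nr, N - n) for n in range(N + 1)}
n_star = max(weights, key=weights.get)

# at the maximum the densities match: n*/Ns == (N - n*)/Nr
print(n_star, n_star / Ns, (N - n_star) / Nr)
```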
Stirling’s approximation
The gamma function is an analytic continuation of the factorial to non-integer values
$$\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t} \, dt.$$
If $z = n + 1$ with $n$ an integer, then $\Gamma(n + 1) = n!$, which is easy to prove by integration by parts or by induction.
Recall
$$n! = \int_0^{\infty} t^n e^{-t} \, dt = \int_0^{\infty} e^{n \ln t - t} \, dt;$$
find the $t$-value that makes the exponent $f(t) = n \ln t - t$ maximal (it contributes the most to the integral), and expand around it: $f'(t) = n/t - 1$ vanishes at $t = n$, so
$$f(t) \approx n \ln n - n - \frac{(t - n)^2}{2n}.$$
Put this back into the integral, and use the Gaussian integral to obtain an approximate formula for $n!$
$$n! \approx \sqrt{2\pi n} \, n^n e^{-n}$$
when $n$ is very large. Finally you will need to use
$$\int_0^{\infty} e^{-(t - n)^2 / 2n} \, dt \approx \int_{-\infty}^{\infty} e^{-(t - n)^2 / 2n} \, dt = \sqrt{2\pi n}$$
for $n$ very large.
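A quick numerical check of how fast Stirling's formula converges:

```python
from math import e, factorial, pi, sqrt

def stirling(n):
    """Stirling's approximation n! ~ sqrt(2 pi n) (n/e)^n."""
    return sqrt(2 * pi * n) * (n / e)**n

for n in (10, 50, 100):
    print(n, stirling(n) / factorial(n))   # ratio approaches 1 as n grows
```

Even at $n = 10$ the formula is accurate to better than $1\%$; the leading correction is the $1/12n$ term of the full asymptotic series.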
Cumulants
Let $\{X \le x\}$ denote the event that $X$ does not exceed $x$. The cumulative distribution function
$$F(x) = P(X \le x)$$
is non-decreasing, approaches $0$ for $x \to -\infty$ and $1$ for $x \to \infty$, and is right-continuous.
Note that
$$P(a < X \le b) = F(b) - F(a),$$
or
$$P(X > a) = 1 - F(a),$$
and the calculation of $P(X < b)$ is a simple limit
$$P(X < b) = \lim_{\epsilon \to 0^+} F(b - \epsilon).$$
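These relations can be illustrated with a toy discrete distribution (the mass function values below are assumed, purely for illustration):

```python
# toy mass function, assumed values for illustration
pmf = {1: 0.2, 2: 0.5, 3: 0.3}

def F(x):
    """Cumulative distribution F(x) = P(X <= x)."""
    return sum(p for v, p in pmf.items() if v <= x)

# P(a < X <= b) = F(b) - F(a)
a, b = 1, 3
print(F(b) - F(a))   # equals pmf[2] + pmf[3]
```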
Example (Problem 4.1)
Two balls are randomly chosen from an urn containing 8 white, 4 black and 2 orange balls. You win $2$ dollars for each black ball and lose $1$ dollar for each white ball selected. Let $X$ denote your winnings. What are the possible values of $X$? Find the mass function for the distribution of this variable.
Probability of two white ($X = -2$): $\binom{8}{2}/\binom{14}{2} = 28/91$; probability of two black ($X = 4$): $\binom{4}{2}/\binom{14}{2} = 6/91$; probability of white, no black ($X = -1$): $8 \cdot 2/\binom{14}{2} = 16/91$; probability of black, no white ($X = 2$): $4 \cdot 2/\binom{14}{2} = 8/91$; probability of no white, no black ($X = 0$): $\binom{2}{2}/\binom{14}{2} = 1/91$; probability of one white, one black ($X = 1$): $8 \cdot 4/\binom{14}{2} = 32/91$.
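The mass function can be checked by brute-force enumeration of all $\binom{14}{2} = 91$ equally likely draws (the stakes in the dictionary match the listed outcomes):

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

balls = ['W'] * 8 + ['B'] * 4 + ['O'] * 2
win = {'W': -1, 'B': 2, 'O': 0}       # dollars per ball drawn

counts = Counter(win[a] + win[b] for a, b in combinations(balls, 2))
total = sum(counts.values())          # C(14, 2) = 91
pmf = {x: Fraction(c, total) for x, c in sorted(counts.items())}
print(pmf)
```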
Example
Two fair dice are rolled. Let $X$ equal the product of the two values shown. Compute $P(X = i)$ for each attainable $i$.
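Since the 36 ordered outcomes are equally likely, the mass function follows from a direct count:

```python
from collections import Counter
from fractions import Fraction

# X = product of two fair dice; count how many of the 36 outcomes give each i
counts = Counter(a * b for a in range(1, 7) for b in range(1, 7))
P = {i: Fraction(c, 36) for i, c in sorted(counts.items())}
for i, prob in P.items():
    print(i, prob)
```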
Example
Suppose that the number of accidents occurring on a highway each day is a Poisson random variable with parameter $\lambda$.
A. Find the probability of 3 or more accidents per day.
B. Find the probability of 3 or more accidents per day provided at least one accident has occurred.
Let $A$ be the event of $3$ or more accidents, and $B$ the event of $1$ or more accidents; then
$$P(A) = 1 - e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2}\right), \qquad P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)}{P(B)} = \frac{1 - e^{-\lambda}\left(1 + \lambda + \lambda^2/2\right)}{1 - e^{-\lambda}}.$$
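A numerical sketch of both parts (the rate $\lambda = 3$ is an assumed value; the original parameter did not survive in this excerpt):

```python
from math import exp, factorial

lam = 3.0                                   # assumed rate, for illustration

def pmf(n):
    """Poisson mass function with parameter lam."""
    return lam**n * exp(-lam) / factorial(n)

p_ge3 = 1 - (pmf(0) + pmf(1) + pmf(2))      # part A: P(3 or more)
p_ge1 = 1 - pmf(0)                          # P(at least one)
print(p_ge3, p_ge3 / p_ge1)                 # part B: P(A | B) = P(A) / P(B)
```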
Example
If you buy a lottery ticket in $N$ lotteries, in each of which your chance of winning a prize is $p$, what is the probability that you will win a prize at least once, exactly once, at least twice?
This is the limit of a binomial (Bernoulli) random number; many trials each with small probability, so the number of wins is approximately Poisson with $\lambda = Np$:
$$P(\text{at least once}) = 1 - e^{-\lambda}, \qquad P(\text{exactly once}) = \lambda e^{-\lambda}, \qquad P(\text{at least twice}) = 1 - e^{-\lambda} - \lambda e^{-\lambda}.$$
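A comparison of the exact binomial answers with the Poisson approximation ($N = 50$, $p = 0.01$ are assumed values, not given in the excerpt):

```python
from math import comb, exp

N, p = 50, 0.01                         # assumed values, for illustration
lam = N * p

# exact binomial
b0 = (1 - p)**N                         # no wins
b1 = comb(N, 1) * p * (1 - p)**(N - 1)  # exactly one win
# Poisson approximation with parameter lam
q0, q1 = exp(-lam), lam * exp(-lam)

print("at least once :", 1 - b0, 1 - q0)
print("exactly once  :", b1, q1)
print("at least twice:", 1 - b0 - b1, 1 - q0 - q1)
```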
Example
From the generic grand canonical partition function for identical particles with single-particle partition function $Z_1$,
$$\mathcal{Z} = \sum_{N=0}^{\infty} \frac{(e^{\mu/k_B T} Z_1)^N}{N!} = e^{e^{\mu/k_B T} Z_1},$$
determine the distribution function $P(N)$ for the number of particles in the system when in equilibrium with a heat and particle reservoir. What is $\langle N \rangle$, the average value?
This is a common qualifier problem these days.
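A hedged numerical sketch of the expected structure of the answer: taking each $N$-particle term of the grand sum over the total gives a Poisson distribution, whose mean is then $e^{\mu/k_B T} Z_1$ (the numerical value assigned to that combination below is an arbitrary assumption):

```python
from math import exp, factorial

zZ1 = 2.5      # assumed value of e^(mu / k_B T) * Z_1, for illustration

# P(N) = (zZ1)^N / N! * e^(-zZ1): each term of the grand sum, normalized
P = [zZ1**N / factorial(N) * exp(-zZ1) for N in range(100)]
mean = sum(N * p for N, p in enumerate(P))
print(sum(P), mean)      # normalized to 1, with <N> = zZ1
```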