Thursday, December 17, 2009

Poisson Processes

Goal

Probability distribution, at time $t$ for $N$ events in a subsequent time interval $[t, t+\tau]$: $P[N,\tau,t]$. As we shall see, as a practical matter, we primarily focus on calculating $P[1, \tau, t]$.

Assumptions

  1. For a small time interval $h$, as $h \to 0$, $P[1,h,t] \to \lambda h$ for all $t$ and $P[N>1,h,t] \to 0$. Therefore $P[0,h,t] \to 1 - \lambda h$.
  2. $P$ is independent of $t$: $P[N, \tau, t] = P[N, \tau]$ for all $t$. In other words, it is stationary.
  3. The probabilities of events in disjoint intervals are independent. If $t_2-t_1>\tau_1$ then,
    $P[N_2,\tau_2,t_2 ; N_1,\tau_1,t_1] = P[N_2,\tau_2,t_2] \cdot P[N_1,\tau_1,t_1]$
    Alternatively, consider four successive time points $t_4 \geq t_3 \geq t_2 \geq t_1$. Then:
    $P[N_2,t_4-t_3,t_3 ; N_1,t_2-t_1,t_1] = P[N_2,t_4-t_3,t_3] \cdot P[N_1,t_2-t_1,t_1]$
  4. $P$ is memoryless. We might feel as though having already waited a long time for an event to occur that the next event should happen sooner. However, since $P$ is memoryless, it does not remember how long it took for the last event to occur.
    Note that this is not the same as being stationary.

Derivation of $P[0,\tau]$

We wamt to derive an expression for $P[N,\tau]$ using the assumed properties listed above. To do so, we start by looking at probabilities for 0 events in time intervals. As an aid in understanding, think about the following situation: you start standing at a bus stop at a certain time $t_1$. You wait until a later time $t_2$ and observe that no bus has arrived. For some reason, you stick around until an even later time $t_3$ and notice that a bus has still not arrived. This is strange, because you know that the buses arrive in a way that obeys to assumptions of the Poisson process. Mathematically, consider three successive times $t_3>t_2>t_1$, and the joint probability
$P[0,t_3-t_1 ; 0, t_2 - t_1]$
By the independence of events in disjoint intervals,
$P[0,t_3-t_1 ; 0, t_2 - t_1] = P[0,t_3-t_2] \cdot P[0,t_2-t_1]$

Alternatively, due to the memoryless assumption, we have


$P[0,t_3-t_1 ; 0, t_2 - t_1] = P[0,t_3-t_1]$
This may seem counterintuitive, but it can be understood if we think abou the fact that we have arbitrarily divide the interval $t_3-t_1$ into two by inserting $t_2$. Doing so should not alter the probability! Therefore, we have the above expression.
Combining the two above, we have
$P[0,t_3-t_1] = P[0,t_3-t_2] \cdot P[0,t_2-t_1]$
This expression in hand, we now derive the analytical form for $P[0,\tau]$. To do so, we differentiate the equation by $t_2$. Along the way, we adopt some abbreviated notation that should be clear enough
$0 = p_0'(t_3-t_2) \cdot p_0(t_2-t_1) + p_0'(t_3-t_2) \cdot p_0(t_2-t_1)$

From this we conclude that


$\frac{1}{p_0(t_3-t_2)}\frac{dp_0(t_3-t_2)}{d(t_2)} = \frac{1}{p_0(t_2-t_1)}\frac{dp_0(t_2-t_1)}{d(t_2)}$

Using a standard technique in separation of variables, we note that the two sides of the equation are respectively functions of only $p_0(t_3-t_2)$ and $p_0(t_2-t_1)$. Therefore each side is separately equal to a constant, $\lambda$. Writing $t_2-t_1$ as $\tau$,


$\frac{1}{p_0(\tau)}\frac{dp_0(\tau)}{d\tau} = \lambda$

which has the unique solution


$p_0(\tau) = e^{-\lambda \tau + C}$

Since $p_0(0) = 0$,

$p_0(\tau) = e^{-\lambda \tau + C}$

Derivation of $P[N,\tau]$

From $p_0(\tau)$ we can proceed to calculate an analytical expression for $p_n(\tau)$ (we will use this notation from now on).
There are many approaches to deriving an analytical expression for $p_n(\tau)$. One way is to solve the "master equation", or "birth and death" equation by one of many methods. One can use generating functions, induction, or other approaches. I found a much more satisfying approach online notes.
Let $p_n(\tau)$ denote the probability of n events in an interval $\tau$. We have just derived that

$p_0(\tau) = e^{-\lambda \tau}$

Next, we derive $p_1(\tau)$ from $p_0(\tau)$. Consider a time interval $\tau=t_3-t_1$. Let the single event occur at a time $t_2$. We don't need to be explicit about when the event occurs. Therefore we can place the event, which takes place in a vanishingly small time $\delta \tau$ anywhere in the interval. Based on the independence of disjoint intervals, we can immediately write

$p_1(t_3-t_1 | \text{event at } t_2) = p_0(t_2-t_1) \cdot p_1(\delta t) \cdot p_0(t-3-t_2)$

To remove the conditioning, we integrate over $t_2$ from $0 \ldots \tau$.

$p_1(t_3-t_1) = \int_0^\tau p_1(t_3-t_1 | \text{event at } t_2) dt_2$

$=\int_0^\tau e^_{-\lambda t_2} \cdot \lambda \cdot e^{-\lambda(\tau-t_2)} dt_2$

$=\lambda \cdot e^_{-\lambda \tau} \cdot \int_0^\tau dt_2$

$=\lambda \tau e^_{-\lambda \tau}$

We can now extend this process for $N$ events by further subdividing the interval. Let $\tau = t_N-t_0$ and $t_i$ be the time of the $i^{th}$ event.

$p_N(\tau | t_1 \ldots t_N) = p_0(t_1-t_0)\Pi_{n \in 2\ldots N} \lambda \cdot p_0(t_n-t_{n-1})$

$=\lambda^N e^{-\lambda \tau}$

Next we remove the conditioning on the $t_1 \ldots t_N$.

$p_N(\tau) = \lambda^N e^{-\lambda \tau} int_{t_0}^{t_2}dt'_1 \int_{t_1}^{t_3}dt'_2 \cdots \int_{t_{n-1}}^{t_{n+1}}dt'_n \cdots \int_{t_{N-2}}^{\tau}dt'_{n-1}$

This integral is going to be nasty do evaluate. Therefore, we use a trick. Although I numbered the times of the events, $t_n$, consecutively, I will consider the indices to be arbitrary markers. This means that the $t_n$ can actually happen in any order. In order to keep the calculation correct, I will now also need to divide by the $N!$ different orders in which the events can happen. Now all of the $t_n$ can range from $t_0$ to $\tau$. This simplifies the integral so that we have

$p_N(\tau) = \frac{1}{N!}\lambda^N e^{-\lambda \tau} \times \ldots$

$int_{t_0}^{t_2}dt'_1 \int_{t_0}^{\tau}dt'_2 \cdots \int_{t_0}^{\tau}dt'_n \cdots \int_{t_0}^{\tau}dt'_{n-1}

which evaluates to

$ = \frac{1}{N!} \lambda^N \tau^N e^{-\lambda \tau} = \frac{(\lambda \tau)^N}{N!}e^{-\lambda \tau}$

Conclusion

This concludes a derivation of the Poisson process. From the three assumptions stated initially, we derived the probability distribution for $N$ events in a time interval $\tau$:

$p_N(\tau) = frac{(\lambda \tau)^N}{N!}e^{-\lambda \tau}$

The Poisson process has many useful applications and also many interesting and useful properties, some of which we will discuss in later posts.

Appendix A -- quick and dirty $p_0(\tau)$

I found a quicker approach to $p_0(\tau)$ here. Actually, I think it is the more common one.

$p_0(\tau + \Delta \tau) = P[\text{no event in } \tau ; \text{ no event in } \Delta \tau]$

By independence,

$p_0(\tau + \Delta \tau) = P[\text{no event in } \tau \cdot P[\text{ no event in } \Delta \tau]$

$\approx p_0(\tau) \cdot (1 - \lambda \Delta \tau)$

From this we can write

$p_0(\tau + \Delta \tau) - p_0(\tau) = - \lambda \Delta \tau \cdot p_0(\tau)$

which is leads to the differential equation

$\frac{dp_0(\tau)}{d\tau} = - \lambda \cdot p_0(\tau)$

with solution

$p_0(\tau) = e^{-\lambda \tau)$

I guess the main difference between this and what I did is $t_2 \to \tau-\Delta \tau$.

No comments:

Post a Comment