Book Search

Download this chapter in PDF format

Chapter34.pdf

Table of contents

How to order your own hardcover copy

Wouldn't you rather have a bound book instead of 640 loose pages?
Your laser printer will thank you!
Order from Amazon.com.

Chapter 34: Explaining Benford's Law

Solving in the Frequency Domain

Figure 34-5 is what we have been working toward, a systematic way of understanding the operation of Benford's law. The left three signals, the logarithmic domain, are pdf(g), sf(g) and ost(g). The particular examples in this figure are the same ones we used previously (i.e., Fig. 34-4). These three signals are related by convolution (Eq. 34-3), a mathematical operation that is not especially easy to deal with. To overcome this we move the problem into the frequency domain by taking the Fourier transform of each signal. Using standard DSP notation, we will represent the Fourier transforms of pdf(g), sf(g), and ost(g), as PDF(f), SF(f), and OST(f), respectively. These are shown on the right side of Fig. 34-5.

By moving the problem into the frequency domain we replace the difficult operation of convolution with the simple operation of multiplication. That is, the six signals in Fig. 34-5 are related as follows:

A small detail: The Fourier transform of pdf(g) is PDF(f), while the Fourier transform of pdf(-g) is PDF*(f). The star in PDF*(f) means it is the complex conjugate of PDF(f), indicating that all of the phase values are changed in sign. However, notice that Fig. 34-5 only shows the magnitudes; we are completely ignoring the phases. The reason for this is simple– the phase does not contain information we are interested in for this particular problem. This makes it unimportant if we use pdf(g) vs. pdf(-g), or PDF(f) vs. PDF*(f).

Notice how these signals represent the key components of Benford's law. First, there is a group of numbers or a probability density function that can generate a group of numbers. This is represented by pdf(g) and PDF(f). Second, we modify each number in this group or distribution by taking its leading digit. This action is represented by convolving pdf(g) with sf(g), or by multiplying PDF(f) by SF(f). Third, we observe that the leading digits often have an unusual property. This unusual characteristic is seen in ost(g) and OST(f).

All six of these signals have specific characteristics that are fixed by the definition of the problem. For instance, the value at f=0 in the frequency domain always corresponds to the average value of the signal in the logarithmic domain. In particular, this means that PDF(0) will always be equal to one, since the area under pdf(g) is unity. In this example we are using a Gaussian curve for pdf(g). One of the interesting properties of the Gaussian is that its Fourier Transform is also a Gaussian, one-sided in this case, as shown in Fig. (d). These are related by σf = 1/(2πσg).

Since sf(g) is periodic with a period of one, SF(f) consists of a series of spikes at f = 0, 1, 2, 3, ..., with all other values being zero. This is a standard transform pair, given by Fig. 13-10 in chapter 13. The zeroth spike, SF(0), is the average value of sf(g). This is equal to the fraction of the time that the signal is in the high state, or log(2) - log(1) = 0.301. The remaining spikes have amplitudes: SF(1) = 0.516, SF(2) = 0.302, SF(3) = 0.064, and so on, as calculated from the above reference.

Lastly we come to ost(g) and OST(f). If Benford's law is being followed, ost(g) will be a flat line with a value of 0.301. This corresponds to OST(0) = 0.301, with all other values in OST(f) being zero. However, if Benford's law is not being followed, then ost(g) will be periodic with a period of one, as show in Fig. (c). Therefore, OST(f) will be a series of spikes at f = 0, 1, 2, 3, ..., with the space between being zero.

Next Section: Solving Mystery #1