Physics307L:People/Smith/Notebook/7: Difference between revisions

From OpenWetWare

Revision as of 15:37, 6 December 2007

Lab 7: Poisson Statistics

Lab Partner: Kyle Martin

Preface

The lab manual we have been using for this class, which is last year's lab manual written by Dr. Gold, has a very sparse section for the Poisson Statistics lab. Kyle and I referred to it for the basic premise of the measurements we took and I referred to it for some basic ideas for data analysis.

Purpose

The overall goal of this lab is familiarization with multichannel analyzers and Poissonian data. Reading through my colleagues' notebooks (yes, there is something to be said for open science, and for the use of wikis), I came up with some important questions I wanted to answer in the course of this lab. Specifically,

  • Are the random, independent events of muons (see here for more information about muons and muon sources) striking a scintillation detector in our laboratory described accurately by a Poisson distribution?
    • Is the standard deviation of the number of events we measured described accurately by the standard deviation of a Poisson distribution?
    • Does the goodness-of-fit of the Poisson distribution change with the anticipated number of events?
  • Does a Gaussian distribution accurately represent random, independent events?
    • Does the goodness-of-fit of the Gaussian distribution change with the anticipated number of events?

Materials

For this lab, we used

  • A box of lead bricks
  • A thallium-doped sodium iodide crystal scintillator (see here, here and here for more about this)
  • A Photomultiplier tube (see here for more about this)
  • A preamp, amplifier and discriminator ("PAD")
  • A PC data acquisition card with "hydra breakout cable" (which is a connector with many "heads") (a photo would be nice here... I don't have one)
  • Multichannel Analyzer Software
  • A NIM-bin
  • A High-Voltage DC power supply for the PMT
  • Mathworks' MATLAB software for analyzing the data collected and answering my questions above

Setup

The photomultiplier tube (PMT) and scintillator are placed in the box of lead bricks, in order to block sources of radiation we don't wish to measure. The PMT has a large potential across it, provided by the high voltage DC power supply. The scintillator is attached to the PMT. Muons striking the scintillator will create scintillation photons which are "detected" by the PMT - incident photons will cause a drop in voltage across the PMT. The scintillation detector (the phototube and attached scintillator) is connected with a BNC cable to the amplifier and discriminator module, which in turn is connected to the data acquisition card in the PC. Voltage drops across the PMT, corresponding to incident muons, will be recorded by the multichannel analyzer software. We don't care about the energy of the muons here, so we don't bother to record the current produced by the PMT; we just record when the events occur.

For acquiring data, Kyle and I turned the amplifier down ("minimum gain") and didn't mess with the discriminator much. I wish I had remembered to record what settings we used, but they aren't really all that important. The goal here is to gather the times when some random events happen and examine their distribution; setting the amplifier higher would probably just increase the sensitivity of the muon detection, making the number of events we record higher for any given time interval, and setting the discriminator would do something similar by selecting the threshold voltage of events to pass along to the PC data acquisition card.

Methods

  • The multichannel analyzer software will record events it receives for a given time interval ("dwell time") in one "channel". Setting the MCA software to collect 512 channels at a dwell time of 1 millisecond, for instance, will record events which happen between 0-1ms in channel 1, events that happen between 1-2ms in channel 2 and so on until it records 512 channels. The MCA software will display the results on screen as a plot of number of events vs. channel number. Saving the results of a measurement of 512 channels will create an ASCII (or text, if you're like me and don't care for high-falutin' nonsense) file. This file will have several lines of text to start with which record some parameters of the MCA software, followed by 512 rows of 3 columns of numbers, separated by commas. The first column of this file is the channel number, the second column is the number of events recorded and the third column is something else (I've got no idea what it's for, maybe for pulse height analysis.)
  • Using the multichannel analyzer software on the PC, Kyle and I took several measurements using our setup. We measured the number of events occurring for dwell times of 1 ms, 2 ms, 4 ms, 8 ms, 10 ms, 20 ms, 40 ms, 80 ms, 100 ms, 200 ms, 400 ms, 800 ms, 1 s, 2 s and 4 s. The 1 ms, 2 ms, 4 ms, 8 ms, 10 ms, 20 ms, and 40 ms dwell-time measurements were taken with 1024 channels; the 80 ms, 100 ms, 200 ms, 400 ms, 800 ms, 1 s, 2 s, and 4 s dwell-time measurements were taken with 256 channels. The output of each of these measurements was saved to file.
  • I loaded these files in MATLAB to examine the distribution of events we recorded.
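
The file layout described above can be read with a few lines of code. What follows is only a minimal sketch in Python, not the MATLAB script I actually used. It assumes every data row has exactly three comma-separated numbers and that header lines do not; the function name parse_mca_text and the sample text are made up for illustration.

```python
def parse_mca_text(text):
    """Return per-channel event counts from MCA-style text.

    Assumed layout (taken from the description in this notebook, not from
    a verified file specification): free-form header lines, then rows of
    three comma-separated numbers: channel, event count, unused column.
    """
    counts = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 3:
            continue  # header or blank line
        try:
            _channel, events, _unused = (float(f) for f in fields)
        except ValueError:
            continue  # three fields, but not all numeric: treat as header
        counts.append(int(events))
    return counts


# Made-up sample text, only to show the shape of the input.
sample = """Dwell time: 1 ms
Channels: 4
1,0,0
2,1,0
3,0,0
4,2,0
"""

counts = parse_mca_text(sample)
print(counts)
```

Only the second column is kept, since the channel number is implied by the position in the list and the third column is not used in this lab.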

Data and Analysis

Using MATLAB, I wrote an "M-file", or MATLAB script, to load and examine our data. I used MATLAB's "Publish to HTML" function to save both the code and output (with figures, saved as .png files) as HTML files. I zipped these files, and uploaded the zip file. It can be downloaded here. I went a little bit overboard in scripting this: the script itself is more than 250 lines long. I'm sure if I were actually good at coding in MATLAB, this script would be significantly different (i.e. better!), but I'm just learning MATLAB.

I will post a description of what I did in my MATLAB script, along with relevant figures from its output. But, first, I will try to describe some relevant details about Poisson and Gaussian distributions.

About Gaussian Distributions

The Gaussian (or "normal") probability density function is

[math]\displaystyle{ \frac1{\sigma\sqrt{2\pi}}\; \exp\left(-\frac{\left(x-\mu\right)^2}{2\sigma^2} \right) \! }[/math]

Where [math]\displaystyle{ \sigma }[/math] is the standard deviation of the data, and [math]\displaystyle{ \mu }[/math] is the expected value. In our case, I believe the "expected value" is the mean of our data and the "standard deviation" is the standard deviation of our data.
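
As a rough illustration of the formula above, here is a small Python sketch (again, not my MATLAB code) that evaluates the Gaussian density at integer numbers of events. The function name gaussian_pdf is made up, and the mean and standard deviation in the example are simply the 800 ms values from Table 1.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Normal probability density at x for mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Example values: mean and standard deviation of the 800 ms run (Table 1).
mu, sigma = 5.86, 2.90
for k in range(0, 13, 3):
    print(k, gaussian_pdf(k, mu, sigma))
```

Note that the Gaussian is a density over a continuous variable, so evaluating it at integer k only approximates the probability of seeing exactly k events.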

About Poisson Distributions

The Poisson probability mass function (analogous to the probability density function, but for discrete values) is

[math]\displaystyle{ f(k;\lambda )=\frac{e^{-\lambda } \lambda^k}{k!},\,\! }[/math]

Where [math]\displaystyle{ \lambda }[/math] is the expected (mean) number of events per time interval and k is the number of events actually observed in one interval.
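
The mass function above can be evaluated directly, as in the Python sketch below (not my MATLAB code). The function name poisson_pmf is made up. Working with logarithms through math.lgamma is a design choice of this sketch, used to avoid overflow in the factorial for large k; the value of λ in the example is the 800 ms estimate from Table 1.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean number per interval is lam."""
    if k < 0 or lam < 0:
        raise ValueError("k and lam must be non-negative")
    if lam == 0:
        return 1.0 if k == 0 else 0.0
    # exp(k*ln(lam) - lam - ln(k!)) equals lam**k * exp(-lam) / k!
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


lam = 5.86  # example value: the 800 ms estimate from Table 1
probabilities = [poisson_pmf(k, lam) for k in range(60)]
total = sum(probabilities)
mean = sum(k * p for k, p in enumerate(probabilities))
variance = sum((k - mean) ** 2 * p for k, p in enumerate(probabilities))
print(total, mean, variance)
```

The printed mean and variance should both come out very close to λ, which is why the standard deviation of a Poisson distribution is the square root of λ.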

MATLAB Script Summary and Output

  • I first calculated the standard deviations of the number of events we recorded (separately for each run of different dwell times, of course.) These standard deviations will later be compared to the standard deviations of a Poisson distribution with parameters found from our data (λ, or number of events per time interval).
    • For calculating standard deviation, I used the equation [math]\displaystyle{ s = \left( \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2 \right) ^{\frac{1}{2}} }[/math], where s is the standard deviation, n is the number of items in the sample, [math]\displaystyle{ x_i }[/math] is the ith item in the sample x, and [math]\displaystyle{ \bar{x} }[/math] is the mean of the sample x.
  • I then plotted our data. I created 3 figures with 4 subplots and 1 figure with 3 subplots. The subplots show something similar to what was displayed by the MCA software: a plot of the number of events vs. the channel.
    • These figures are shown on this page as Figures 1-4.
  • In order to compare the distribution of our data to a Poisson distribution and a Gaussian distribution, I did the following:
    • I estimated the lambda for each run. These numbers can be seen in Table 1, and I made a plot of these numbers vs. dwell time (see Figure 5).
      • In order to estimate the lambda, I used the equation [math]\displaystyle{ \lambda_{MLE} = \frac{1}{n} \sum_{i=1}^n x_i }[/math], where [math]\displaystyle{ \lambda_{MLE} }[/math] is the "Maximum Likelihood Estimate" of lambda, n is the number of items in the sample, and [math]\displaystyle{ x_i }[/math] is the ith item in sample x. I tend to use [math]\displaystyle{ \lambda }[/math] in place of [math]\displaystyle{ \lambda_{MLE} }[/math] in my figures, so keep that in mind. If my method of determining the [math]\displaystyle{ \lambda_{MLE} }[/math] produces an inaccurate estimate of [math]\displaystyle{ \lambda }[/math], most of my figures will be affected.
    • I used the lambda estimated above to create vectors of the probability of seeing k number of events using a Poisson probability mass function, and to create vectors of the probability of seeing k number of events using a Gaussian probability density function.
      • In order to estimate the probability of seeing k number of events using a Poisson PMF with [math]\displaystyle{ \lambda }[/math] events per time interval, I used [math]\displaystyle{ y = \frac{\lambda^k}{k!}e^{-\lambda} }[/math], where [math]\displaystyle{ \lambda }[/math] is [math]\displaystyle{ \lambda_{MLE} }[/math] determined earlier and k is the number of events I wish to evaluate at.
      • In order to estimate the probability of seeing k number of events using a Gaussian PDF with [math]\displaystyle{ \lambda }[/math] as the mean and the standard deviation I calculated earlier, I used [math]\displaystyle{ y = \frac{1}{\sigma \sqrt{2\pi}} e^{\frac{-(k-\mu)^2}{2\sigma ^2}} }[/math], where [math]\displaystyle{ \mu }[/math] is the mean (also, interestingly, [math]\displaystyle{ \lambda_{MLE} }[/math]), [math]\displaystyle{ \sigma }[/math] is the standard deviation, and k is the number of events I wish to evaluate at.
    • I created a histogram vector from our data: the probability of seeing k number of events. To calculate the "probability" of seeing k number of events, I divided the frequency of k number of events (the number of channels that recorded exactly k events) by the total number of channels in the run. I think this makes sense, since the sum of this vector is 1.
    • I plotted the three distributions (our data pdf, Poisson pdf, and Gaussian pdf) vs. k number of events. I created separate figures for each run. These figures can be seen on this page as Figures 6-20.
    • For each run, I found what I think is the Chi-Square goodness-of-fit of the Poisson PDF to our data, and the Chi-Square goodness-of-fit of the Gaussian PDF to our data. I had a bit of trouble figuring out how to use the Chi-Square distribution to measure the goodness-of-fit of distributions to our data, and I may have ended up doing it wrong, but here's what I did:
      • In order to calculate the Chi-Square goodness-of-fit of the two different probability functions to our data, I used [math]\displaystyle{ \chi^2 = \sum_{i=1}^k \frac{(X_i - \mu_i)^2}{\sigma_i^2} }[/math], where [math]\displaystyle{ X_i }[/math] is the measured probability of seeing a given number of events, [math]\displaystyle{ \mu_i }[/math] is the probability 'prediction' for that number of events and [math]\displaystyle{ \sigma_i }[/math] is the standard deviation.
      • In order to determine whether the goodness-of-fit of a Poisson distribution and the goodness-of-fit of a Gaussian distribution vary with λ, I created a log-log plot of Chi-Square vs. Lambda. This plot can be seen as Figure 21.
  • In order to compare the standard deviation of our data to the standard deviation of a Poisson distribution, I created a log-log plot of the standard deviation of our data and the standard deviation of a Poisson distribution vs. dwell time. This figure can be seen as Figure 22.
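
The steps in this summary can be sketched in Python as follows. This is not my MATLAB script, and all the function names are made up. The chi-square here is the common Pearson form on counts, which assumes [math]\displaystyle{ \sigma_i^2 }[/math] equals the expected count in each bin. That is an assumption of this sketch and may not match what the script did, so the numbers it gives should not be expected to reproduce Table 2. The input counts are made up.

```python
import math
from collections import Counter


def sample_mean(xs):
    """Mean of the sample; this is also the maximum likelihood estimate of lambda."""
    return sum(xs) / len(xs)


def sample_std(xs):
    """Sample standard deviation with the 1/(n-1) normalization."""
    m = sample_mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def empirical_probabilities(xs):
    """Fraction of channels that recorded exactly k events, for k = 0..max(xs)."""
    freq = Counter(xs)
    return [freq.get(k, 0) / len(xs) for k in range(max(xs) + 1)]


def poisson_pmf(k, lam):
    """Poisson probability of exactly k events for mean lam."""
    if lam == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def pearson_chi_square(xs, lam):
    """Pearson chi-square of observed bin counts against Poisson expectations.

    Assumption of this sketch: sigma_i**2 is the expected count in bin i.
    """
    freq = Counter(xs)
    n = len(xs)
    chi2 = 0.0
    for k in range(max(xs) + 1):
        expected = n * poisson_pmf(k, lam)
        chi2 += (freq.get(k, 0) - expected) ** 2 / expected
    return chi2


counts = [0, 1, 0, 2, 1, 0, 0, 3, 1, 0]  # made-up events per channel
lam_mle = sample_mean(counts)
print("lambda_MLE:", lam_mle)
print("std of data:", sample_std(counts))
print("std of Poisson:", math.sqrt(lam_mle))
print("data probabilities:", empirical_probabilities(counts))
print("chi-square vs Poisson:", pearson_chi_square(counts, lam_mle))
```

With real data, bins with very small expected counts dominate a Pearson sum, so they are often merged before the statistic is computed; this sketch does not do that.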

Figures and Tables

Table 1: Maximum Likelihood Estimate of λ and standard deviations

Dwell Time | [math]\displaystyle{ \lambda_{MLE} }[/math] | Standard Deviation of Data | Standard Deviation of Poisson Distribution with [math]\displaystyle{ \lambda_{MLE} }[/math]
1 ms | 0.00586 | 0.088 | 0.077
2 ms | 0.0117 | 0.125 | 0.108
4 ms | 0.0264 | 0.183 | 0.162
8 ms | 0.0684 | 0.330 | 0.252
10 ms | 0.0723 | 0.311 | 0.269
20 ms | 0.147 | 0.503 | 0.384
40 ms | 0.287 | 0.644 | 0.536
80 ms | 0.641 | 1.02 | 0.800
100 ms | 0.648 | 1.05 | 0.805
200 ms | 1.55 | 1.46 | 1.24
400 ms | 2.75 | 2.22 | 1.66
800 ms | 5.86 | 2.90 | 2.42
1 s | 7.14 | 3.28 | 2.67
2 s | 14.6 | 4.83 | 3.82
4 s | 28.6 | 7.03 | 5.35
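
Since the standard deviation of a Poisson distribution is the square root of λ, the last column of Table 1 can be recomputed from the second. The Python sketch below (not part of my MATLAB script) does this with the values copied from Table 1 and also prints the ratio of the standard deviation of the data to the Poisson one. The variable names are made up.

```python
import math

# (dwell time, lambda_MLE, std of data, std of Poisson) copied from Table 1
table1 = [
    ("1 ms", 0.00586, 0.088, 0.077),
    ("2 ms", 0.0117, 0.125, 0.108),
    ("4 ms", 0.0264, 0.183, 0.162),
    ("8 ms", 0.0684, 0.330, 0.252),
    ("10 ms", 0.0723, 0.311, 0.269),
    ("20 ms", 0.147, 0.503, 0.384),
    ("40 ms", 0.287, 0.644, 0.536),
    ("80 ms", 0.641, 1.02, 0.800),
    ("100 ms", 0.648, 1.05, 0.805),
    ("200 ms", 1.55, 1.46, 1.24),
    ("400 ms", 2.75, 2.22, 1.66),
    ("800 ms", 5.86, 2.90, 2.42),
    ("1 s", 7.14, 3.28, 2.67),
    ("2 s", 14.6, 4.83, 3.82),
    ("4 s", 28.6, 7.03, 5.35),
]

for dwell, lam, sd_data, sd_table in table1:
    sd_poisson = math.sqrt(lam)  # Poisson standard deviation is sqrt(lambda)
    ratio = sd_data / sd_poisson
    print(f"{dwell:>7}  sqrt(lambda)={sd_poisson:.3f}  "
          f"table={sd_table:.3f}  data/Poisson={ratio:.2f}")
```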

Table 2: [math]\displaystyle{ \chi^2 }[/math] goodness-of-fit for Poisson and Gaussian distributions

Dwell Time | Poisson PDF [math]\displaystyle{ \chi^2 }[/math] to data histogram | Gaussian PDF [math]\displaystyle{ \chi^2 }[/math] to data histogram
1 ms | 0.0007 | 2267
2 ms | 0.0014 | 461.1
4 ms | 0.0023 | 62.49
8 ms | 0.0132 | 2.530
10 ms | 0.0092 | 2.498
20 ms | 0.0268 | 0.023
40 ms | 0.0323 | 0.1548
80 ms | 0.0259 | 0.1242
100 ms | 0.0318 | 0.1312
200 ms | 0.0081 | 0.0154
400 ms | 0.0027 | 0.0037
800 ms | 0.0007 | 0.0008
1 s | 0.0005 | 0.0007
2 s | 0.0003 | 0.0003
4 s | 0.0001 | 0.0001

Conclusions and Remarks

Links to other entries

My Wednesday Labs
  • [[../1|Oscilloscope Lab]]
  • [[../2|Balmer Series]]
  • [[../3|Planck's Constant]]
  • [[../4|Speed of Light]]
  • [[../5|Millikan Oil Drop]]
  • [[../6|Electron Spin Resonance]]
  • [[../7|Poisson Statistics]]