In an unusual turn of events, I haven't been hammering on about the importance of uncertainties yet in this lesson. We're going to look at them in more detail here.
Without an estimate of uncertainty, a scientific result is basically meaningless. I could tell you that the length of my dog is 2 meters. That sounds unlikely. But if I told you the length of my dog was $2 \pm 1$ meters because I tried to measure him while he was moving, you'd now know how reliable my measurement was.
For reference, this is the dog in question. He is not 2 m long.
While knowing the uncertainty on the length of my dog may not be the most important thing in the world, in science, uncertainties really are the most important thing.
One example of this is measurements of the Hubble constant (the expansion rate of the Universe). There are several ways of measuring this value, each of which has its own pros and cons, and its own uncertainties. Right now, the two main ways of measuring it (using supernovae and the cosmic microwave background) give different answers. The techniques have improved so much in the last few years that the uncertainties on each of them are now very small. So small that we can confidently say that the results don't agree with each other.
There are several teams working on repeating the measurements and taking new ones to try to resolve the problem. One reason for the difference could be that one of the measurements is wrong: there could have been a mistake in the analysis, for example. But the other reason is more exciting: we can make the measurements agree if we change some of the physics that goes into the model we use to fit the data. We could have discovered something completely unexpected about how the Universe evolves! BUT. Before we can claim that we've found some new physics, we have to be really sure that the results and their uncertainties are correct.
If you want to know more about what's happening with the Hubble constant, there's a nice article here.
We usually assume that the uncertainties follow a Gaussian distribution. For a Gaussian distribution, the probability of measuring a value $x$, when the true value is $\langle x \rangle$ and the uncertainty is $\sigma$, is shown in the figure below:
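Written out, that probability density is the familiar bell-curve formula:

$$P(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \langle x \rangle)^2}{2\sigma^2}\right)$$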
What this shows is that (statistically speaking) it's quite likely for us to measure a value that is $1.25\sigma$ away from the "true" value. There's approximately a 21% chance of measuring a difference greater than $\pm 1.25\sigma$. Note: if you want to calculate the probability of measuring a given $\sigma$ deviation, you can use a statistical table, such as this one.
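If you'd rather compute it than look it up, scipy can do this directly. A quick sketch (the 1.25 here is just the example value from above):

```python
from scipy import stats

# Two-sided probability of a deviation larger than 1.25 sigma.
# sf is the survival function, i.e. 1 - cdf, so this is the area in both tails.
p = 2 * stats.norm.sf(1.25)
print(f"{p:.1%}")  # approximately 21%
```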
So when is the difference significant? When would we say that the values don't agree? It's common in astronomy to use $3\sigma$ as the limit for statistical significance. As the figure shows, a measurement will land more than $3\sigma$ from the mean only about 0.3% of the time purely by chance. So we say that a measurement is consistent if it is within $3\sigma$ of the expected value, and discrepant if it lies further away than $3\sigma$.
So when you're comparing measurements you make in the lab to the "true" values, you can check whether your measurement agrees statistically with the published value by working out how many $\sigma$ away it is.
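As a sketch of what that check looks like in practice (the numbers here are invented for illustration):

```python
measured = 9.71   # your lab measurement (hypothetical value)
sigma = 0.05      # your uncertainty on it (hypothetical value)
published = 9.81  # the accepted "true" value, e.g. g in m/s^2

# How many sigma separate the measurement from the published value?
n_sigma = abs(measured - published) / sigma
print(f"Measurement is {n_sigma:.1f} sigma from the published value")
print("consistent" if n_sigma <= 3 else "discrepant")
```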
When we use `curve_fit` we get two arrays back: `popt` and `pcov`. `popt` has the values of the parameters we're fitting, and `pcov` is the covariance matrix.
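Here's a minimal sketch of that (the straight-line model and noisy synthetic data are my own illustrative choices, not from the lesson):

```python
import numpy as np
from scipy.optimize import curve_fit

# A model with two free parameters: a straight line
def line(x, m, c):
    return m * x + c

# Synthetic data: a known line plus Gaussian noise
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
y = line(x, 2.0, 1.0) + rng.normal(scale=0.5, size=x.size)

popt, pcov = curve_fit(line, x, y)
print(popt)  # best-fit values of m and c
print(pcov)  # the 2x2 covariance matrix
```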
Covariance measures the joint variability of two variables. A positive covariance means that high values of variable 1 correspond to high values of variable 2 (e.g. $y \propto x$). A negative covariance means the opposite: high values of variable 1 correspond to low values of variable 2. A covariance of zero means there is no linear correlation.
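To make that concrete, here's a quick numpy sketch with made-up random data:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=1000)

y_pos = x + rng.normal(scale=0.1, size=1000)   # y rises with x
y_neg = -x + rng.normal(scale=0.1, size=1000)  # y falls as x rises
y_ind = rng.normal(size=1000)                  # y unrelated to x

# np.cov returns a 2x2 matrix; the [0, 1] element is the covariance
print(np.cov(x, y_pos)[0, 1])  # positive
print(np.cov(x, y_neg)[0, 1])  # negative
print(np.cov(x, y_ind)[0, 1])  # close to zero
```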
If we fit a function with just one free parameter, we'd get a $1 \times 1$ covariance matrix back. Its single element, `pcov[0][0]`, would be $\sigma^2$ (the variance), so we take the square root to get the uncertainty.
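For example (a one-parameter model of my own choosing, fit to synthetic data):

```python
import numpy as np
from scipy.optimize import curve_fit

# A model with a single free parameter: y = a * x
def model(x, a):
    return a * x

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 50)
y = model(x, 3.0) + rng.normal(scale=0.5, size=x.size)

popt, pcov = curve_fit(model, x, y)
a_err = np.sqrt(pcov[0][0])  # the 1x1 covariance matrix holds the variance
print(f"a = {popt[0]:.2f} +/- {a_err:.2f}")
```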
If we were fitting two parameters, we'd get a $2 \times 2$ matrix back, where the elements on the diagonal, `pcov[0][0]` and `pcov[1][1]`, are $\sigma^2$ for the parameters `popt[0]` and `popt[1]` respectively.
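A handy shortcut for any number of parameters is to pull out the whole diagonal at once with `np.diag`. A sketch, with invented numbers standing in for a real `pcov`:

```python
import numpy as np

# Stand-in for the 2x2 covariance matrix curve_fit would return
# (these values are invented for illustration)
pcov = np.array([[0.04, 0.01],
                 [0.01, 0.09]])

# The diagonal holds the variances; square-root them for the 1-sigma uncertainties
perr = np.sqrt(np.diag(pcov))
print(perr)  # uncertainties on popt[0] and popt[1]
```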