%config InlineBackend.figure_format = 'retina'
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
plt.rcParams.update({'font.size':16})
If you've completed all of the last notebook, been and had a cup of tea and watched Emmerdale, you'll need to start here by reading in the big_cepheids_table.csv
file into a dataframe. Note: This isn't a file for you to download, you need to create it yourself.
cepheids_df = pd.read_csv('./data/big_cepheids_table.csv')
cepheids_df
cepheids_df
has the parallax measurement for each star. Parallax is a way to measure distance using trigonometry.
Gaia takes measurements of each star (almost 2 billion stars actually, but I'm not making you look at all of them, not this week anyway...) at different points in the year. It measures the apparent change in position of the nearby stars compared to a background of really distant objects (the angle $\theta$ in the figure). It also knows it's own position in the sky, so knows the length of the base of the triangle (2 astronomical units for observations taken six months apart). Because the angles involved are very small, the distance to the star in question goes to
$$ d = \dfrac{1}{\varpi}$$where $d$ is in parsecs and $\varpi$ is the parallax angle ($\varpi = \theta/2$) in arcseconds. The parallax_mas
column in cepheids_df
is $\varpi$ in milli-arcseconds ($10^{-3}$ arcseconds).
You should end up with your dataframe looking something like this with a new distance_pc
column:
cepheids_df
The dataframe contains two columns that tell us about reddening and extinction:
A_V
is the extinction, $A_V$ - it's how much fainter an object appears in the V bandE_B_V
is the colour excess, $E(B-V)$ - it tells us how much redder the object appears.The colour excess is defined as
$$ E(B-V) = A_B - A_V$$Combining these two pieces of information with some information about dust behaves, we can correct for extinction at any wavelength. The information about how dust interacts with light is contained in a reddening law. Reddening law's can get quite complex, but for our purposes we can approximate the reddening law at optical wavelengths as
$$ \dfrac{A_{\lambda}}{A_V} = \dfrac{0.7}{ \lambda} - 0.273 $$where $A_{\lambda}$ is the extinction at wavelength $\lambda$, and $\lambda$ is in microns ($10^{-6} \text{ m}$).
The wavelengths of the three bands we're looking at are
To correct for extinction, all we have to do is subtract the extinction value from the corresponding magnitude, i.e.
$$ V_0 = V_{app} - A_V $$and so on for $B$ and $I$.
(remember magnitudes go backwards, so subtracting the extinction makes the star brighter)
You should end up with something like this:
cepheids_df.head()
We now have all the ingredients we need to find the absolute magnitude of each star.
The absolute magnitude in the $V$ band is given by
$$ V_{abs} = V_{app} - A_V - \mu$$and similar for the $B$ and $I$ bands.
$\mu$ is the distance modulus, defined as
$$\mu = 5 \log_{10} d - 5 $$First we can define a function to calculate the distance modulus:
def dist_mod(distance):
mu = 5.*np.log10(distance) - 5.
return(mu)
We can define another function to calculate the absolute magnitude:
def abs_mag(app_mag, extinction, mu):
absolute_mag = app_mag - extinction - mu
return(absolute_mag)
Now we're going to use the apply
function in pandas
to calcuate the absolute magnitudes.
apply
makes it easier to to more complex calculations on dataframes.
One of the key parts of apply
is the lambda
parameter. Defining
apply(lambda x: ...)
means we can now refer to the dataframe just as x
. This makes things much easier when we've got a lot of things to include.
We can now combine the functions to find the absolute magnitudes:
cepheids_df['V_abs'] = cepheids_df.apply(lambda x: abs_mag(x.mag_V, x.A_V, dist_mod(x.distance_pc)), axis=1)
cepheids_df
Notice in the apply
call above, we're creating a new column not just by running a function, but calling another function within that function. Doing all that while writing out the full dataframe name every time is just asking for mistakes.
There's lots of other uses for apply
that you can read more about here. For now we'll stick with the simple case.
Your dataframe should now have 67 rows and 17 columns:
cepheids_df.columns
Our final task is to make a nice plot of the period-luminosity relation. When we did this with the Magellanic Clouds data before we were assuming all the stars were at the same distance. This time we've corrected for distance instead, and we'll be looking at the absolute magnitudes.
We can just make a quick plot:
fig = plt.figure(figsize=(10,6))
ax = fig.add_subplot(1,1,1)
ax.plot(cepheids_df['logP'], cepheids_df['B_abs'], 'o', label='B')
ax.plot(cepheids_df['logP'], cepheids_df['V_abs'], 'o', label='V')
ax.plot(cepheids_df['logP'], cepheids_df['I_abs'], 'o', label='I')
ax.set_xlabel('log P (days)')
ax.set_ylabel('Absolute Magnitude')
ax.invert_yaxis()
ax.legend()
plt.show()
That doesn't look very nice. It doesn't look how I expect it to. Let's look at just one set of data for now -- the $I$ band data
fig = plt.figure(figsize=(10,6))
ax = fig.add_subplot(1,1,1)
ax.plot(cepheids_df['logP'], cepheids_df['I_abs'], 'o', label='I')
ax.set_xlabel('log P (days)')
ax.set_ylabel('Absolute Magnitude')
ax.invert_yaxis()
ax.legend()
plt.show()
There's a lot more data in our dataframe than we're showing here. Perhaps something else is affecting the absolute magnitudes? My first guess is that it's extinction. We can look at this by colour coding the points using plt.scatter()
.
First we need to choose a colour map - this is done with get_cmap
. You can see all the available colour maps here. I'm using the plasma
colour map as it's a"perceptually uniform sequential" map - this means that a colour-blind person (or someone with a black and white printer) would still be able to interpret the colour scale.
We also need to have a colour bar so we can translate the colours into actual numbers when we look at the plot. We do this in two steps. First:
sc = ax.scatter(....
assigns the output of ax.scatter
to the variable sc
.
Then
cbar = plt.colorbar(sc)
makes the colour bar that corresponds to our data.
fig = plt.figure(figsize=(10,6))
ax = fig.add_subplot(1,1,1)
cm = plt.cm.get_cmap('plasma')
sc = ax.scatter(cepheids_df['logP'], cepheids_df['I_abs'], c=cepheids_df.A_I, marker='o', label='I', cmap=cm, edgecolor='Grey')
ax.set_xlabel('log P (days)')
ax.set_ylabel('Absolute Magnitude')
ax.invert_yaxis()
ax.legend()
cbar = plt.colorbar(sc)
cbar.set_label('$A_I$')
plt.show()
Now we can see that the stars that had a larger extinction correction (the red to yellow points) seem to lie farther away from the trend than the blue points.