Solutions¶

Mass histograms

Below is an example solution for stars with $M < M_{\odot}$. First I created a new dataframe with just the low mass stars. I then created the histograms and the caption.

In [11]:

low_mass_df = planet_df.where(planet_df['star_mass'] <=1.0).dropna()

In [12]:

low_mass_df

Out[12]:

	name	orbital_period	semi_major_axis	detection_type	star_mass
3	14 Her b	1773.4000	2.770000	Radial Velocity	0.900
6	1SWASP J1407 b	3725.0000	3.900000	Primary Transit	0.900
7	24 Boo b	30.3506	0.190000	Radial Velocity	0.990
13	42 Dra b	479.1000	1.190000	Radial Velocity	0.980
24	61 Vir b	4.2150	0.050201	Radial Velocity	0.950
...	...	...	...	...	...
3798	eps Ind A b	16509.0000	11.550000	Radial Velocity	0.762
3823	tau Cet e	162.8700	0.538000	Radial Velocity	0.783
3824	tau Cet f	636.1300	1.334000	Radial Velocity	0.783
3825	tau Cet g	20.0000	0.133000	Radial Velocity	0.783
3826	tau Cet h	49.4100	0.243000	Radial Velocity	0.783

1074 rows × 5 columns

In [13]:

fig = plt.figure(figsize=(10,6))
ax = fig.add_subplot(1,1,1)
ax.hist(low_mass_df['star_mass'].where(low_mass_df['detection_type']=='Primary Transit'), bins=15, label='Primary Transit', alpha=0.6, color='DeepPink', edgecolor='DeepPink')
ax.hist(low_mass_df['star_mass'].where(low_mass_df['detection_type']=='Radial Velocity'), bins=15, label='Radial Velocity', alpha=0.7, color='DodgerBlue', edgecolor='DodgerBlue')
ax.set_xlabel('Host star mass ($M_\odot$)')
ax.set_ylabel('Count')
ax.legend()
ax.set_title('Number of exoplanets detected using transit and radial velocity\n  techniques for host star masses below 1 $M_{\odot}$')
plt.show()

Number of exoplanets detected as a function of host star mass and detection technique for low mass ($M < 1 M_\odot$) stars. Pink bars denote exoplanets detected using primary transit observations. Blue bars denote those detected using radial velocity measurements. The fraction of stars detected using primary transit is higher for low mass stars.

low_mass_df = planet_df.where(planet_df['star_mass'] <=1.0).dropna()

low_mass_df = planet_df.where(planet_df['star_mass'] <=1.0).dropna()

low_mass_df

low_mass_df

fig = plt.figure(figsize=(10,6))
ax = fig.add_subplot(1,1,1)
ax.hist(low_mass_df['star_mass'].where(low_mass_df['detection_type']=='Primary Transit'), bins=15, label='Primary Transit', alpha=0.6, color='DeepPink', edgecolor='DeepPink')
ax.hist(low_mass_df['star_mass'].where(low_mass_df['detection_type']=='Radial Velocity'), bins=15, label='Radial Velocity', alpha=0.7, color='DodgerBlue', edgecolor='DodgerBlue')
ax.set_xlabel('Host star mass ($M_\odot$)')
ax.set_ylabel('Count')
ax.legend()
ax.set_title('Number of exoplanets detected using transit and radial velocity\n  techniques for host star masses below 1 $M_{\odot}$')
plt.show()

fig = plt.figure(figsize=(10,6))
ax = fig.add_subplot(1,1,1)
ax.hist(low_mass_df['star_mass'].where(low_mass_df['detection_type']=='Primary Transit'), bins=15, label='Primary Transit', alpha=0.6, color='DeepPink', edgecolor='DeepPink')
ax.hist(low_mass_df['star_mass'].where(low_mass_df['detection_type']=='Radial Velocity'), bins=15, label='Radial Velocity', alpha=0.7, color='DodgerBlue', edgecolor='DodgerBlue')
ax.set_xlabel('Host star mass ($M_\odot$)')
ax.set_ylabel('Count')
ax.legend()
ax.set_title('Number of exoplanets detected using transit and radial velocity\n  techniques for host star masses below 1 $M_{\odot}$')
plt.show()

Number of exoplanets detected as a function of host star mass and detection technique for low mass ($M < 1 M_\odot$) stars. Pink bars denote exoplanets detected using primary transit observations. Blue bars denote those detected using radial velocity measurements. The fraction of stars detected using primary transit is higher for low mass stars.

Clean your data

In [14]:

kepler_df = planet_df.where((planet_df['orbital_period'] <= 10000) & (planet_df['star_mass'] > 0.7)).dropna()

In [15]:

kepler_df

Out[15]:

	name	orbital_period	semi_major_axis	detection_type	star_mass
0	11 Com b	326.03000	1.2900	Radial Velocity	2.70
1	11 UMi b	516.22000	1.5400	Radial Velocity	1.80
2	14 And b	185.84000	0.8300	Radial Velocity	2.20
3	14 Her b	1773.40000	2.7700	Radial Velocity	0.90
4	16 Cyg B b	799.50000	1.6800	Radial Velocity	1.01
...	...	...	...	...	...
3827	tau Gem b	305.50000	1.1700	Radial Velocity	2.30
3828	ups And b	4.61711	0.0590	Radial Velocity	1.27
3829	ups And c	240.93700	0.8610	Radial Velocity	1.27
3830	ups And d	1281.43900	2.5500	Radial Velocity	1.27
3831	ups And e	3848.86000	5.2456	Radial Velocity	1.27

1719 rows × 5 columns

Check that we've actually got what we want by looking at the data:

In [16]:

kepler_df['orbital_period'].max()

Out[16]:

9018.0

In [17]:

kepler_df['star_mass'].min()

Out[17]:

0.707

In [18]:

kepler_df['orbital_period'].hist()

Out[18]:

<AxesSubplot:>

In [19]:

kepler_df['star_mass'].hist()

Out[19]:

<AxesSubplot:>

(Yes, I am aware the plots aren't pretty. These aren't supposed to be. Get in the habit of making quick and dirty plots to look at your data)

kepler_df = planet_df.where((planet_df['orbital_period'] <= 10000) & (planet_df['star_mass'] > 0.7)).dropna()

kepler_df = planet_df.where((planet_df['orbital_period'] <= 10000) & (planet_df['star_mass'] > 0.7)).dropna()

kepler_df

kepler_df

Check that we've actually got what we want by looking at the data:

kepler_df['orbital_period'].max()

kepler_df['orbital_period'].max()

9018.0

kepler_df['star_mass'].min()

kepler_df['star_mass'].min()

0.707

kepler_df['orbital_period'].hist()

kepler_df['orbital_period'].hist()

<AxesSubplot:>

kepler_df['star_mass'].hist()

kepler_df['star_mass'].hist()

<AxesSubplot:>

(Yes, I am aware the plots aren't pretty. These aren't supposed to be. Get in the habit of making quick and dirty plots to look at your data)

Add another column

The process is the same as for adding the $T^2$ column.

In [22]:

kepler_df['a_3_m'] = kepler_df['semi_major_axis']**3 / kepler_df['star_mass']

In [23]:

kepler_df

Out[23]:

	name	orbital_period	semi_major_axis	detection_type	star_mass	T_sq	a_3_m
0	11 Com b	326.03000	1.2900	Radial Velocity	2.70	1.062956e+05	0.795070
1	11 UMi b	516.22000	1.5400	Radial Velocity	1.80	2.664831e+05	2.029036
2	14 And b	185.84000	0.8300	Radial Velocity	2.20	3.453651e+04	0.259903
3	14 Her b	1773.40000	2.7700	Radial Velocity	0.90	3.144948e+06	23.615481
4	16 Cyg B b	799.50000	1.6800	Radial Velocity	1.01	6.392002e+05	4.694685
...	...	...	...	...	...	...	...
3827	tau Gem b	305.50000	1.1700	Radial Velocity	2.30	9.333025e+04	0.696353
3828	ups And b	4.61711	0.0590	Radial Velocity	1.27	2.131770e+01	0.000162
3829	ups And c	240.93700	0.8610	Radial Velocity	1.27	5.805064e+04	0.502581
3830	ups And d	1281.43900	2.5500	Radial Velocity	1.27	1.642086e+06	13.056201
3831	ups And e	3848.86000	5.2456	Radial Velocity	1.27	1.481372e+07	113.653232

1719 rows × 7 columns

kepler_df['a_3_m'] = kepler_df['semi_major_axis']**3 / kepler_df['star_mass']

kepler_df['a_3_m'] = kepler_df['semi_major_axis']**3 / kepler_df['star_mass']

kepler_df

kepler_df

In [26]:

kepler_df['orbital_period_sec'] = kepler_df['orbital_period'] * 24. * 60. * 60.
kepler_df['a_metres'] = kepler_df['semi_major_axis'] * 1.496E11
kepler_df['mass_kg'] = kepler_df['star_mass'] * 1.989E30

In [27]:

kepler_df['a_3_m'] = kepler_df['a_metres']**3 / kepler_df['mass_kg']
kepler_df['T_sq'] = kepler_df['orbital_period_sec']**2

kepler_df['orbital_period_sec'] = kepler_df['orbital_period'] * 24. * 60. * 60.
kepler_df['a_metres'] = kepler_df['semi_major_axis'] * 1.496E11
kepler_df['mass_kg'] = kepler_df['star_mass'] * 1.989E30

kepler_df['orbital_period_sec'] = kepler_df['orbital_period'] * 24. * 60. * 60.
kepler_df['a_metres'] = kepler_df['semi_major_axis'] * 1.496E11
kepler_df['mass_kg'] = kepler_df['star_mass'] * 1.989E30

kepler_df['a_3_m'] = kepler_df['a_metres']**3 / kepler_df['mass_kg']
kepler_df['T_sq'] = kepler_df['orbital_period_sec']**2

kepler_df['a_3_m'] = kepler_df['a_metres']**3 / kepler_df['mass_kg']
kepler_df['T_sq'] = kepler_df['orbital_period_sec']**2

Estimate $G \pm \sigma_G$

We're using curve_fit, so we need to import it. Remember to add the import statement to the top of your notebook if you don't already have it loaded.

First things first, define the function that we want to fit. We're estimating $G$ by rearranging the equation for Kepler's 3rd law:

$$ T^2 = \dfrac{4 \pi^2}{G} \dfrac{a^3}{M_*} $$

Our function is then

In [29]:

def keplers_law(a_3_m, g):
    return (4.0 * np.pi**2  * a_3_m / g)

Now we use curve_fit to find $G$ and $\sigma_G$

Note: The first (few...) times I tried this it didn't work. curve_fit got a very strange answer. By providing a starting guess with the p0 parameter the fit worked properly. The problem was the very large numbers ($10^6$ and $10^{17}$). As we can see from the units in the $T^2$ equation, $G$ will be approximately $\dfrac{10^6}{10^{17}} = 10^{-11}$ m$^{3}$ kg$^{-1}$ s$^{-2}$, so p0=1e-11 is a reasonable starting guess.

In [30]:

popt, pcov = curve_fit(keplers_law, (kepler_df['a_3_m']), (kepler_df['T_sq']), p0=1e-11)
g = popt[0] 
g_unc = np.sqrt(pcov[0][0])
print("G = {0:.3e} +/- {1:.3e}".format(g, g_unc))

G = 6.753e-11 +/- 1.591e-13

In [31]:

fig = plt.figure(figsize=(10,6))
ax = fig.add_subplot(1,1,1)
ax.plot(kepler_df['a_3_m']/1e6, kepler_df['T_sq'] / 1e17, 'k.',ls='None', label='Data')
ax.plot(kepler_df['a_3_m']/1e6, keplers_law(kepler_df['a_3_m'], g)/1e17, 'r--', label='Best fit')
ax.set_ylabel('$T^2$ ($10^{17}$ s$^2$)')
ax.set_xlabel('$a^3 / M_*$ ($10^6$ m$^3$ kg$^{-1}$)')
title_string = (r'$G = ({0:.3f} \pm {1:.3f}) \times 10^{{-11}}$ m$^{{3}}$ kg$^{{-1}}$ s$^{{-2}}$'.format((g / 1e-11), (g_unc / 1e-11)))
ax.set_title(title_string)
ax.legend()
plt.show()

def keplers_law(a_3_m, g):
    return (4.0 * np.pi**2  * a_3_m / g)

def keplers_law(a_3_m, g):
    return (4.0 * np.pi**2  * a_3_m / g)

Now we use curve_fit to find $G$ and $\sigma_G$

Note: The first (few...) times I tried this it didn't work. curve_fit got a very strange answer. By providing a starting guess with the p0 parameter the fit worked properly. The problem was the very large numbers ($10^6$ and $10^{17}$). As we can see from the units in the $T^2$ equation, $G$ will be approximately $\dfrac{10^6}{10^{17}} = 10^{-11}$ m$^{3}$ kg$^{-1}$ s$^{-2}$, so p0=1e-11 is a reasonable starting guess.

popt, pcov = curve_fit(keplers_law, (kepler_df['a_3_m']), (kepler_df['T_sq']), p0=1e-11)
g = popt[0] 
g_unc = np.sqrt(pcov[0][0])
print("G = {0:.3e} +/- {1:.3e}".format(g, g_unc))

popt, pcov = curve_fit(keplers_law, (kepler_df['a_3_m']), (kepler_df['T_sq']), p0=1e-11)
g = popt[0] 
g_unc = np.sqrt(pcov[0][0])
print("G = {0:.3e} +/- {1:.3e}".format(g, g_unc))

G = 6.753e-11 +/- 1.591e-13

fig = plt.figure(figsize=(10,6))
ax = fig.add_subplot(1,1,1)
ax.plot(kepler_df['a_3_m']/1e6, kepler_df['T_sq'] / 1e17, 'k.',ls='None', label='Data')
ax.plot(kepler_df['a_3_m']/1e6, keplers_law(kepler_df['a_3_m'], g)/1e17, 'r--', label='Best fit')
ax.set_ylabel('$T^2$ ($10^{17}$ s$^2$)')
ax.set_xlabel('$a^3 / M_*$ ($10^6$ m$^3$ kg$^{-1}$)')
title_string = (r'$G = ({0:.3f} \pm {1:.3f}) \times 10^{{-11}}$ m$^{{3}}$ kg$^{{-1}}$ s$^{{-2}}$'.format((g / 1e-11), (g_unc / 1e-11)))
ax.set_title(title_string)
ax.legend()
plt.show()

fig = plt.figure(figsize=(10,6))
ax = fig.add_subplot(1,1,1)
ax.plot(kepler_df['a_3_m']/1e6, kepler_df['T_sq'] / 1e17, 'k.',ls='None', label='Data')
ax.plot(kepler_df['a_3_m']/1e6, keplers_law(kepler_df['a_3_m'], g)/1e17, 'r--', label='Best fit')
ax.set_ylabel('$T^2$ ($10^{17}$ s$^2$)')
ax.set_xlabel('$a^3 / M_*$ ($10^6$ m$^3$ kg$^{-1}$)')
title_string = (r'$G = ({0:.3f} \pm {1:.3f}) \times 10^{{-11}}$ m$^{{3}}$ kg$^{{-1}}$ s$^{{-2}}$'.format((g / 1e-11), (g_unc / 1e-11)))
ax.set_title(title_string)
ax.legend()
plt.show()