Solutions

Checking which stars are missing data

You can check for missing data using isna(). It returns a dataframe of the same shape, with True wherever a value is missing.

In [6]:
optical_df.isna()
Out[6]:
Star_ID logP mag_B err_B mag_V err_V mag_I err_I
0 False False False False False False False False
1 False False False False False False False False
2 False False False False False False False False
3 False False False False False False False False
4 False False False False False False False False
5 False False False False False False False False
6 False False False False False False False False
7 False False False False False False False False
8 False False False False False False False False
9 False False False False False False False False
10 False False False False False False True True
11 False False False False False False False False
12 False False False False False False False False
13 False False False False False False False False
14 False False False False False False False False
15 False False False False False False False False
16 False False False False False False False False
17 False False False False False False False False
18 False False False False False False False False
19 False False False False False False False False
20 False False False False False False False False
21 False False False False False False False False
22 False False False False False False False False
23 False False False False False False False False
24 False False False False False False False False
25 False False False False False False False False
26 False False False False False False False False
27 False False False False False False False False
28 False False False False False False False False
29 False False False False False False False False
30 False False False False False False False False
31 False False False False False False False False
32 False False True True False False True True
33 False False False False False False False False
34 False False False False False False False False
35 False False False False False False False False
36 False False False False False False False False
37 False False False False False False False False
38 False False False False False False False False
39 False False False False False False False False
40 False False False False False False False False
41 False False False False False False False False
42 False False False False False False False False
43 False False False False False False False False
44 False False False False False False False False
45 False False False False False False False False
46 False False False False False False False False
47 False False False False False False False False
48 False False False False False False False False
49 False False False False False False False False
50 False False False False False False False False
51 False False False False False False False False
52 False False False False False False False False
53 False False False False False False False False
54 False False False False False False False False
55 False False False False False False False False
56 False False False False False False False False
57 False False False False False False False False
58 False False False False False False False False

To display only the rows with missing data, we need to use this as a conditional selection on the dataframe. Adding .any(axis=1) marks each row that has at least one missing value, i.e.

In [7]:
optical_df[optical_df.isna().any(axis=1)]
Out[7]:
Star_ID logP mag_B err_B mag_V err_V mag_I err_I
10 delta Cep 0.730 4.684 0.018 3.990 0.012 NaN NaN
32 V340 Nor 1.053 NaN NaN 8.407 0.005 NaN NaN

So the stars that are missing data are delta Cep and V340 Nor.
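
The full True/False table is hard to scan by eye, so it can help to summarise it. The following is a minimal sketch, assuming a small stand-in frame (`optical_sample`, a hypothetical name) built from three rows shown in this notebook rather than the real `optical_df`. It counts the missing values per column and lists which columns are missing for each star.

```python
import numpy as np
import pandas as pd

# Stand-in for optical_df: three rows copied from the tables in this notebook
optical_sample = pd.DataFrame({
    "Star_ID": ["XX Cen", "delta Cep", "V340 Nor"],
    "logP": [1.040, 0.730, 1.053],
    "mag_B": [8.882, 4.684, np.nan],
    "err_B": [0.019, 0.018, np.nan],
    "mag_V": [7.855, 3.990, 8.407],
    "err_V": [0.012, 0.012, 0.005],
    "mag_I": [6.754, np.nan, np.nan],
    "err_I": [0.008, np.nan, np.nan],
})

# Number of missing values in each column
missing_per_column = optical_sample.isna().sum()

# For each star with at least one gap, list the columns that are missing
mask = optical_sample.isna()
missing_by_star = {
    star: list(mask.columns[row])
    for star, row in zip(optical_sample["Star_ID"], mask.to_numpy())
    if row.any()
}

print(missing_per_column)
print(missing_by_star)
```

On the real data the same two expressions should work with `optical_df` in place of `optical_sample`.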

Read in the Gaia data and the reddening data

In [8]:
gaia_df = pd.read_csv("./data/gaia_distances.csv")
reddenings_df = pd.read_csv("./data/reddenings.csv")
In [9]:
gaia_df
Out[9]:
Star_ID parallax_mas
0 XX Cen 0.564
1 T Mon 0.733
2 TW Nor 0.362
3 CV Mon 0.602
4 RY Sco 0.754
5 TT Aql 0.994
6 QZ Nor 0.471
7 Y Oph 1.349
8 VW Cen 0.256
9 V340 Nor 0.490
10 GY Sge 0.346
11 WZ Car 0.281
12 CD Cyg 0.393
13 FF Aql 1.920
14 Y Lac 0.430
15 BB Sgr 1.194
16 Z Lac 0.510
17 BG Lac 0.582
18 UU Mus 0.308
19 KN Cen 0.240
20 RU Sct 0.521
21 DL Cas 0.581
22 RS Pup 0.587
23 CE Cas B 0.331
24 SV Vul 0.405
25 VY Car 0.564
26 V367 Sct 0.470
27 U Nor 0.622
28 LS Pup 0.213
29 AQ Pup 0.290
... ... ...
37 CS Vel 0.274
38 V Cen 1.413
39 CF Cas 0.315
40 T Vel 0.936
41 X Cyg 0.909
42 VZ Cyg 0.541
43 CE Cas A 0.331
44 SW Vel 0.409
45 SZ Aql 0.521
46 S Vul 0.231
47 RZ Vel 0.660
48 GH Lup 0.864
49 RY Vel 0.377
50 X Sgr 2.822
51 V496 Aql 0.976
52 V350 Sgr 0.806
53 U Vul 1.291
54 W Sgr 2.354
55 SU Cyg 1.036
56 S Sge 1.685
57 beta Dor 2.917
58 U Car 0.559
59 l Car 1.942
60 zeta Gem 3.064
61 Y Sgr 2.003
62 delta Cep 3.556
63 U Aql 1.748
64 eta Aql 3.674
65 RT Aur 1.841
66 SU Cru 0.211

67 rows × 2 columns

In [10]:
reddenings_df
Out[10]:
Star_ID E_B_V A_V
0 RT Aur 0.1844 0.5717
1 QZ Nor 1.2364 3.8329
2 SU Cyg 0.9989 3.0967
3 Y Lac 0.3880 1.2028
4 T Vul 0.1702 0.5278
5 FF Aql 0.4996 1.5486
6 T Vel 0.8183 2.5367
7 VZ Cyg 0.2543 0.7883
8 V350 Sgr 0.3277 1.0158
9 BG Lac 0.3374 1.0461
10 delta Cep 1.4914 4.6233
11 CV Mon 1.4373 4.4556
12 V Cen 1.1718 3.6327
13 Y Sgr 1.7647 5.4707
14 CS Vel 1.9023 5.8970
15 BB Sgr 0.2405 0.7455
16 V Car 0.1886 0.5846
17 U Sgr 0.7314 2.2672
18 V496 Aql 0.3182 0.9863
19 U Aql 0.3334 1.0337
20 eta Aql 0.1936 0.6001
21 W Sgr 0.5578 1.7292
22 S Sge 0.2888 0.8953
23 GH Lup 1.1761 3.6460
24 S Mus 0.2424 0.7513
25 S Nor 0.3401 1.0544
26 beta Dor 0.0539 0.1669
27 zeta Gem 0.0699 0.2166
28 Z Lac 0.6702 2.0775
29 XX Cen 0.5223 1.6191
30 V340 Nor 1.2831 3.9778
31 UU Mus 0.9659 2.9944
32 BN Pup 0.6313 1.9569
33 TT Aql 0.9672 2.9983
34 LS Pup 0.7634 2.3665
35 VW Cen 1.3431 4.1638
36 Y Oph 0.8138 2.5227
37 SZ Aql 1.3786 4.2736
38 VY Car 1.0360 3.2115
39 RY Sco 1.2228 3.7907
40 RZ Vel 1.1769 3.6484
41 SW Vel 0.9399 2.9137
42 T Mon 0.5997 1.8590
43 RY Vel 1.4953 4.6355
44 AQ Pup 0.7818 2.4236
45 KN Cen 1.5576 4.8286
46 l Car 0.2190 0.6788
47 RS Pup 0.8197 2.5412
48 S Vul 1.9095 5.9195
49 X Cyg 0.6879 2.1326
50 DL Cas 0.7936 2.4602
51 CE Cas A 0.8207 2.5440
52 CE Cas B 0.8207 2.5440
53 CF Cas 0.8207 2.5440
In [11]:
gaia_df.shape
Out[11]:
(67, 2)
In [12]:
gaia_df.describe()
Out[12]:
parallax_mas
count 67.000000
mean 0.968881
std 0.831380
min 0.211000
25% 0.399000
50% 0.602000
75% 1.242500
max 3.674000
In [13]:
reddenings_df.shape
Out[13]:
(54, 3)
In [14]:
reddenings_df.describe()
Out[14]:
E_B_V A_V
count 54.000000 54.000000
mean 0.791817 2.454624
std 0.492727 1.527478
min 0.053900 0.166900
25% 0.334400 1.036800
50% 0.787700 2.441900
75% 1.175025 3.642675
max 1.909500 5.919500
In [15]:
gaia_df[gaia_df.isna().any(axis=1)]
Out[15]:
Star_ID parallax_mas
In [16]:
reddenings_df[reddenings_df.isna().any(axis=1)]
Out[16]:
Star_ID E_B_V A_V
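
Neither table contains any NaN values, but their shapes differ (67 rows against 54), so they cannot list exactly the same stars. One way to see which stars appear in one table but not the other is to compare the Star_ID columns with isin(). This is a minimal sketch, assuming two small stand-in frames (`gaia_sample` and `reddenings_sample`, hypothetical names) built from rows shown above rather than the full files.

```python
import pandas as pd

# Stand-ins for gaia_df and reddenings_df, using rows from the tables above
gaia_sample = pd.DataFrame({
    "Star_ID": ["XX Cen", "T Mon", "TW Nor", "GY Sge"],
    "parallax_mas": [0.564, 0.733, 0.362, 0.346],
})
reddenings_sample = pd.DataFrame({
    "Star_ID": ["XX Cen", "T Mon"],
    "E_B_V": [0.5223, 0.5997],
    "A_V": [1.6191, 1.8590],
})

# True where a Gaia star also has an entry in the reddening table
in_reddenings = gaia_sample["Star_ID"].isin(reddenings_sample["Star_ID"])

# Stars with a parallax but no reddening entry
gaia_only = gaia_sample.loc[~in_reddenings, "Star_ID"].tolist()
print(gaia_only)
```

These are the stars that would end up with NaN in the reddening columns after a merge that keeps every Gaia row.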
In [17]:
df1 = pd.DataFrame({'l_id': ['foo', 'bar', 'baz', 'foo'],'number': [1, 2, 3, 5]})
df2 = pd.DataFrame({'r_id': ['foo', 'bar', 'baz', 'foo'],'number': [5, 6, 7, 8]})
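
These two toy frames have differently named key columns (l_id and r_id) and share a column called number. As a minimal sketch of one way to join them, assuming the intent is to match l_id against r_id, pd.merge accepts left_on and right_on. The name `toy_merged` is only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'l_id': ['foo', 'bar', 'baz', 'foo'], 'number': [1, 2, 3, 5]})
df2 = pd.DataFrame({'r_id': ['foo', 'bar', 'baz', 'foo'], 'number': [5, 6, 7, 8]})

# Match rows where l_id equals r_id (the default is an inner merge).
# 'foo' appears twice in each frame, so it produces 2 x 2 = 4 rows.
toy_merged = pd.merge(left=df1, right=df2, left_on='l_id', right_on='r_id')

# The shared 'number' column is disambiguated with the default
# suffixes: number_x (from df1) and number_y (from df2).
print(toy_merged)
```

Repeated keys multiply rows in a merge, which is worth keeping in mind before merging on Star_ID: each star should appear only once in each table.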

Add in the reddening columns

We want to do the same kind of merge again to add the columns from the reddening dataframe. This time our left frame will be cepheids_df and the right will be reddenings_df:

In [30]:
cepheids_df = pd.merge(left=cepheids_df, right=reddenings_df, on='Star_ID', how='outer')
In [31]:
cepheids_df
Out[31]:
Star_ID parallax_mas logP mag_B err_B mag_V err_V mag_I err_I E_B_V A_V
0 XX Cen 0.564 1.040 8.882 0.019 7.855 0.012 6.754 0.008 0.5223 1.6191
1 T Mon 0.733 1.432 7.436 0.022 6.187 0.014 5.005 0.010 0.5997 1.8590
2 TW Nor 0.362 NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 CV Mon 0.602 0.731 11.681 0.015 10.314 0.010 8.653 0.006 1.4373 4.4556
4 RY Sco 0.754 1.308 9.568 0.018 8.037 0.012 6.271 0.008 1.2228 3.7907
5 TT Aql 0.994 1.138 8.560 0.022 7.185 0.014 5.745 0.009 0.9672 2.9983
6 QZ Nor 0.471 0.578 9.782 0.007 8.875 0.004 7.871 0.003 1.2364 3.8329
7 Y Oph 1.349 1.234 7.573 0.011 6.161 0.007 4.543 0.005 0.8138 2.5227
8 VW Cen 0.256 1.177 11.754 0.022 10.306 0.014 8.802 0.009 1.3431 4.1638
9 V340 Nor 0.490 1.053 NaN NaN 8.407 0.005 NaN NaN 1.2831 3.9778
10 GY Sge 0.346 NaN NaN NaN NaN NaN NaN NaN NaN NaN
11 WZ Car 0.281 1.362 10.614 0.027 9.343 0.018 8.002 0.011 NaN NaN
12 CD Cyg 0.393 1.232 10.422 0.025 9.023 0.016 7.525 0.010 NaN NaN
13 FF Aql 1.920 0.650 6.159 0.007 5.383 0.005 4.503 0.004 0.4996 1.5486
14 Y Lac 0.430 0.636 9.921 0.016 9.163 0.011 8.312 0.007 0.3880 1.2028
15 BB Sgr 1.194 0.822 7.987 0.013 6.965 0.008 5.855 0.006 0.2405 0.7455
16 Z Lac 0.510 1.037 9.623 0.019 8.448 0.012 7.212 0.008 0.6702 2.0775
17 BG Lac 0.582 0.727 9.878 0.013 8.900 0.009 7.825 0.005 0.3374 1.0461
18 UU Mus 0.308 1.066 11.040 0.020 9.839 0.013 8.502 0.008 0.9659 2.9944
19 KN Cen 0.240 1.532 11.604 0.021 9.918 0.015 8.024 0.011 1.5576 4.8286
20 RU Sct 0.521 1.294 11.291 0.023 9.526 0.015 7.486 0.009 NaN NaN
21 DL Cas 0.581 NaN NaN NaN NaN NaN NaN NaN 0.7936 2.4602
22 RS Pup 0.587 1.617 8.580 0.019 7.057 0.013 5.507 0.009 0.8197 2.5412
23 CE Cas B 0.331 NaN NaN NaN NaN NaN NaN NaN 0.8207 2.5440
24 SV Vul 0.405 1.653 8.810 0.018 7.267 0.012 5.719 0.009 NaN NaN
25 VY Car 0.564 1.277 8.744 0.015 7.510 0.012 6.301 0.007 1.0360 3.2115
26 V367 Sct 0.470 NaN NaN NaN NaN NaN NaN NaN NaN NaN
27 U Nor 0.622 1.102 10.939 0.020 9.273 0.013 7.366 0.008 NaN NaN
28 LS Pup 0.213 1.151 11.790 0.020 10.500 0.013 9.087 0.008 0.7634 2.3665
29 AQ Pup 0.290 1.479 10.197 0.022 8.756 0.015 7.178 0.010 0.7818 2.4236
... ... ... ... ... ... ... ... ... ... ... ...
37 CS Vel 0.274 0.771 13.071 0.017 11.728 0.011 10.080 0.007 1.9023 5.8970
38 V Cen 1.413 0.740 7.769 0.017 6.857 0.011 5.805 0.007 1.1718 3.6327
39 CF Cas 0.315 NaN NaN NaN NaN NaN NaN NaN 0.8207 2.5440
40 T Vel 0.936 0.667 9.001 0.014 8.047 0.009 6.971 0.006 0.8183 2.5367
41 X Cyg 0.909 1.214 7.659 0.023 6.434 0.014 5.254 0.009 0.6879 2.1326
42 VZ Cyg 0.541 0.687 9.900 0.015 8.983 0.010 7.970 0.006 0.2543 0.7883
43 CE Cas A 0.331 NaN NaN NaN NaN NaN NaN NaN 0.8207 2.5440
44 SW Vel 0.409 1.370 9.435 0.022 8.189 0.014 6.875 0.009 0.9399 2.9137
45 SZ Aql 0.521 1.234 10.227 0.025 8.697 0.016 7.103 0.010 1.3786 4.2736
46 S Vul 0.231 NaN NaN NaN NaN NaN NaN NaN 1.9095 5.9195
47 RZ Vel 0.660 1.310 8.408 0.026 7.170 0.017 5.898 0.011 1.1769 3.6484
48 GH Lup 0.864 0.967 8.851 0.004 7.632 0.002 6.360 0.002 1.1761 3.6460
49 RY Vel 0.377 1.449 9.872 0.021 8.421 0.014 6.846 0.009 1.4953 4.6355
50 X Sgr 2.822 0.846 5.334 0.013 4.569 0.009 3.663 0.006 NaN NaN
51 V496 Aql 0.976 0.833 8.937 0.008 7.769 0.005 6.491 0.004 0.3182 0.9863
52 V350 Sgr 0.806 0.712 8.449 0.015 7.504 0.010 6.453 0.006 0.3277 1.0158
53 U Vul 1.291 0.903 8.492 0.016 7.149 0.011 5.609 0.007 NaN NaN
54 W Sgr 2.354 0.881 5.472 0.017 4.694 0.011 3.863 0.007 0.5578 1.7292
55 SU Cyg 1.036 0.585 7.493 0.015 6.890 0.011 6.208 0.007 0.9989 3.0967
56 S Sge 1.685 0.923 6.488 0.016 5.641 0.011 4.782 0.006 0.2888 0.8953
57 beta Dor 2.917 0.993 4.586 0.013 3.758 0.009 2.946 0.006 0.0539 0.1669
58 U Car 0.559 1.589 7.625 0.025 6.342 0.016 5.076 0.010 NaN NaN
59 l Car 1.942 1.551 5.048 0.017 3.749 0.011 2.564 0.007 0.2190 0.6788
60 zeta Gem 3.064 1.007 4.735 0.011 3.901 0.007 3.100 0.005 0.0699 0.2166
61 Y Sgr 2.003 0.761 6.657 0.016 5.766 0.010 4.801 0.006 1.7647 5.4707
62 delta Cep 3.556 0.730 4.684 0.018 3.990 0.012 NaN NaN 1.4914 4.6233
63 U Aql 1.748 0.847 7.536 0.016 6.457 0.011 5.279 0.007 0.3334 1.0337
64 eta Aql 3.674 0.856 4.744 0.017 3.918 0.011 3.036 0.007 0.1936 0.6001
65 RT Aur 1.841 0.571 6.120 0.017 5.487 0.011 4.822 0.006 0.1844 0.5717
66 SU Cru 0.211 1.109 11.613 0.015 9.802 0.008 7.658 0.003 NaN NaN

67 rows × 11 columns
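
An outer merge keeps every star that appears in either frame and fills the columns it cannot match with NaN. To see where each row came from, pd.merge has an indicator option that adds a _merge column. This is a minimal sketch, assuming two small stand-in frames (`left_sample` and `right_sample`, hypothetical names) built from rows shown above.

```python
import pandas as pd

# Stand-ins for cepheids_df (left) and reddenings_df (right)
left_sample = pd.DataFrame({
    "Star_ID": ["XX Cen", "TW Nor", "delta Cep"],
    "parallax_mas": [0.564, 0.362, 3.556],
})
right_sample = pd.DataFrame({
    "Star_ID": ["XX Cen", "delta Cep"],
    "E_B_V": [0.5223, 1.4914],
    "A_V": [1.6191, 4.6233],
})

# indicator=True adds a '_merge' column with the values
# 'both', 'left_only' or 'right_only' for every row
merged_sample = pd.merge(left=left_sample, right=right_sample,
                         on="Star_ID", how="outer", indicator=True)
source_counts = merged_sample["_merge"].value_counts()

# Stars found only in the left frame have NaN in the reddening columns
left_only_stars = merged_sample.loc[
    merged_sample["_merge"] == "left_only", "Star_ID"
].tolist()

print(merged_sample)
print(source_counts)
```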

We should check for missing data again too:

In [32]:
cepheids_df[cepheids_df.isna().any(axis=1)]
Out[32]:
Star_ID parallax_mas logP mag_B err_B mag_V err_V mag_I err_I E_B_V A_V
2 TW Nor 0.362 NaN NaN NaN NaN NaN NaN NaN NaN NaN
9 V340 Nor 0.490 1.053 NaN NaN 8.407 0.005 NaN NaN 1.2831 3.9778
10 GY Sge 0.346 NaN NaN NaN NaN NaN NaN NaN NaN NaN
11 WZ Car 0.281 1.362 10.614 0.027 9.343 0.018 8.002 0.011 NaN NaN
12 CD Cyg 0.393 1.232 10.422 0.025 9.023 0.016 7.525 0.010 NaN NaN
20 RU Sct 0.521 1.294 11.291 0.023 9.526 0.015 7.486 0.009 NaN NaN
21 DL Cas 0.581 NaN NaN NaN NaN NaN NaN NaN 0.7936 2.4602
23 CE Cas B 0.331 NaN NaN NaN NaN NaN NaN NaN 0.8207 2.5440
24 SV Vul 0.405 1.653 8.810 0.018 7.267 0.012 5.719 0.009 NaN NaN
26 V367 Sct 0.470 NaN NaN NaN NaN NaN NaN NaN NaN NaN
27 U Nor 0.622 1.102 10.939 0.020 9.273 0.013 7.366 0.008 NaN NaN
34 WZ Sgr 0.607 1.339 9.584 0.023 8.098 0.015 6.584 0.009 NaN NaN
39 CF Cas 0.315 NaN NaN NaN NaN NaN NaN NaN 0.8207 2.5440
43 CE Cas A 0.331 NaN NaN NaN NaN NaN NaN NaN 0.8207 2.5440
46 S Vul 0.231 NaN NaN NaN NaN NaN NaN NaN 1.9095 5.9195
50 X Sgr 2.822 0.846 5.334 0.013 4.569 0.009 3.663 0.006 NaN NaN
53 U Vul 1.291 0.903 8.492 0.016 7.149 0.011 5.609 0.007 NaN NaN
58 U Car 0.559 1.589 7.625 0.025 6.342 0.016 5.076 0.010 NaN NaN
62 delta Cep 3.556 0.730 4.684 0.018 3.990 0.012 NaN NaN 1.4914 4.6233
66 SU Cru 0.211 1.109 11.613 0.015 9.802 0.008 7.658 0.003 NaN NaN

It looks like some stars are missing reddening data too. For example, WZ Car and CD Cyg have photometry but no E_B_V or A_V values.
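
The combined check above mixes two kinds of gap: stars without photometry and stars without reddening. A minimal sketch of separating them, assuming a small stand-in frame (`cepheids_sample`, a hypothetical name) built from four rows of the merged table, is to test one column from each group.

```python
import numpy as np
import pandas as pd

# Stand-in for cepheids_df: four rows and a subset of columns from the merged table
cepheids_sample = pd.DataFrame({
    "Star_ID": ["XX Cen", "TW Nor", "WZ Car", "DL Cas"],
    "logP": [1.040, np.nan, 1.362, np.nan],
    "mag_V": [7.855, np.nan, 9.343, np.nan],
    "E_B_V": [0.5223, np.nan, np.nan, 0.7936],
    "A_V": [1.6191, np.nan, np.nan, 2.4602],
})

# Stars with no reddening entry
no_reddening = cepheids_sample.loc[
    cepheids_sample["E_B_V"].isna(), "Star_ID"
].tolist()

# Stars with no optical photometry (logP is missing for these rows)
no_photometry = cepheids_sample.loc[
    cepheids_sample["logP"].isna(), "Star_ID"
].tolist()

print("Missing reddening:", no_reddening)
print("Missing photometry:", no_photometry)
```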