You can check for missing data using `isna()`.
optical_df.isna()
| | Star_ID | logP | mag_B | err_B | mag_V | err_V | mag_I | err_I |
---|---|---|---|---|---|---|---|---|
0 | False | False | False | False | False | False | False | False |
1 | False | False | False | False | False | False | False | False |
2 | False | False | False | False | False | False | False | False |
3 | False | False | False | False | False | False | False | False |
4 | False | False | False | False | False | False | False | False |
5 | False | False | False | False | False | False | False | False |
6 | False | False | False | False | False | False | False | False |
7 | False | False | False | False | False | False | False | False |
8 | False | False | False | False | False | False | False | False |
9 | False | False | False | False | False | False | False | False |
10 | False | False | False | False | False | False | True | True |
11 | False | False | False | False | False | False | False | False |
12 | False | False | False | False | False | False | False | False |
13 | False | False | False | False | False | False | False | False |
14 | False | False | False | False | False | False | False | False |
15 | False | False | False | False | False | False | False | False |
16 | False | False | False | False | False | False | False | False |
17 | False | False | False | False | False | False | False | False |
18 | False | False | False | False | False | False | False | False |
19 | False | False | False | False | False | False | False | False |
20 | False | False | False | False | False | False | False | False |
21 | False | False | False | False | False | False | False | False |
22 | False | False | False | False | False | False | False | False |
23 | False | False | False | False | False | False | False | False |
24 | False | False | False | False | False | False | False | False |
25 | False | False | False | False | False | False | False | False |
26 | False | False | False | False | False | False | False | False |
27 | False | False | False | False | False | False | False | False |
28 | False | False | False | False | False | False | False | False |
29 | False | False | False | False | False | False | False | False |
30 | False | False | False | False | False | False | False | False |
31 | False | False | False | False | False | False | False | False |
32 | False | False | True | True | False | False | True | True |
33 | False | False | False | False | False | False | False | False |
34 | False | False | False | False | False | False | False | False |
35 | False | False | False | False | False | False | False | False |
36 | False | False | False | False | False | False | False | False |
37 | False | False | False | False | False | False | False | False |
38 | False | False | False | False | False | False | False | False |
39 | False | False | False | False | False | False | False | False |
40 | False | False | False | False | False | False | False | False |
41 | False | False | False | False | False | False | False | False |
42 | False | False | False | False | False | False | False | False |
43 | False | False | False | False | False | False | False | False |
44 | False | False | False | False | False | False | False | False |
45 | False | False | False | False | False | False | False | False |
46 | False | False | False | False | False | False | False | False |
47 | False | False | False | False | False | False | False | False |
48 | False | False | False | False | False | False | False | False |
49 | False | False | False | False | False | False | False | False |
50 | False | False | False | False | False | False | False | False |
51 | False | False | False | False | False | False | False | False |
52 | False | False | False | False | False | False | False | False |
53 | False | False | False | False | False | False | False | False |
54 | False | False | False | False | False | False | False | False |
55 | False | False | False | False | False | False | False | False |
56 | False | False | False | False | False | False | False | False |
57 | False | False | False | False | False | False | False | False |
58 | False | False | False | False | False | False | False | False |
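A full Boolean table like this is hard to scan by eye. As a minimal sketch, assuming a small made-up frame (`toy_df` and its values are hypothetical, not the real optical data), summing the result of `isna()` counts the missing entries in each column:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for optical_df
toy_df = pd.DataFrame({
    "Star_ID": ["A", "B", "C"],
    "mag_B": [4.7, np.nan, 8.9],
    "mag_I": [np.nan, np.nan, 6.8],
})

# True counts as 1 when summed, so this gives the number of
# missing values in each column
missing_per_column = toy_df.isna().sum()
print(missing_per_column)
```

The same call on `optical_df` would show at a glance which columns contain gaps, without printing every row.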
To display only the rows with missing data, we need to use the result of `isna()` as a conditional selection on the dataframe, keeping the rows where any column is missing, i.e.
optical_df[optical_df.isna().any(axis=1)]
| | Star_ID | logP | mag_B | err_B | mag_V | err_V | mag_I | err_I |
---|---|---|---|---|---|---|---|---|
10 | delta Cep | 0.730 | 4.684 | 0.018 | 3.990 | 0.012 | NaN | NaN |
32 | V340 Nor | 1.053 | NaN | NaN | 8.407 | 0.005 | NaN | NaN |
So the stars that are missing data are `delta Cep` and `V340 Nor`.
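To see why this selection works, it can help to build the mask in separate steps. The sketch below uses a hypothetical toy frame (`mask_demo_df` and its values are made up for illustration): `isna()` gives one Boolean per cell, `any(axis=1)` collapses each row to a single Boolean, and indexing with that Series keeps only the rows marked `True`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; not the real optical measurements
mask_demo_df = pd.DataFrame({
    "Star_ID": ["A", "B", "C"],
    "mag_V": [3.99, 8.41, 7.86],
    "mag_I": [np.nan, np.nan, 6.75],
})

# One Boolean per row: True if any column in that row is NaN
row_has_nan = mask_demo_df.isna().any(axis=1)

# Boolean indexing keeps only the rows where the mask is True
rows_with_missing = mask_demo_df[row_has_nan]
print(rows_with_missing["Star_ID"].tolist())
```

Using `all(axis=1)` instead of `any(axis=1)` would select only rows where every column is missing, which is rarely what is wanted here because `Star_ID` is always filled.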
gaia_df = pd.read_csv("./data/gaia_distances.csv")
reddenings_df = pd.read_csv("./data/reddenings.csv")
gaia_df
| | Star_ID | parallax_mas |
---|---|---|
0 | XX Cen | 0.564 |
1 | T Mon | 0.733 |
2 | TW Nor | 0.362 |
3 | CV Mon | 0.602 |
4 | RY Sco | 0.754 |
5 | TT Aql | 0.994 |
6 | QZ Nor | 0.471 |
7 | Y Oph | 1.349 |
8 | VW Cen | 0.256 |
9 | V340 Nor | 0.490 |
10 | GY Sge | 0.346 |
11 | WZ Car | 0.281 |
12 | CD Cyg | 0.393 |
13 | FF Aql | 1.920 |
14 | Y Lac | 0.430 |
15 | BB Sgr | 1.194 |
16 | Z Lac | 0.510 |
17 | BG Lac | 0.582 |
18 | UU Mus | 0.308 |
19 | KN Cen | 0.240 |
20 | RU Sct | 0.521 |
21 | DL Cas | 0.581 |
22 | RS Pup | 0.587 |
23 | CE Cas B | 0.331 |
24 | SV Vul | 0.405 |
25 | VY Car | 0.564 |
26 | V367 Sct | 0.470 |
27 | U Nor | 0.622 |
28 | LS Pup | 0.213 |
29 | AQ Pup | 0.290 |
... | ... | ... |
37 | CS Vel | 0.274 |
38 | V Cen | 1.413 |
39 | CF Cas | 0.315 |
40 | T Vel | 0.936 |
41 | X Cyg | 0.909 |
42 | VZ Cyg | 0.541 |
43 | CE Cas A | 0.331 |
44 | SW Vel | 0.409 |
45 | SZ Aql | 0.521 |
46 | S Vul | 0.231 |
47 | RZ Vel | 0.660 |
48 | GH Lup | 0.864 |
49 | RY Vel | 0.377 |
50 | X Sgr | 2.822 |
51 | V496 Aql | 0.976 |
52 | V350 Sgr | 0.806 |
53 | U Vul | 1.291 |
54 | W Sgr | 2.354 |
55 | SU Cyg | 1.036 |
56 | S Sge | 1.685 |
57 | beta Dor | 2.917 |
58 | U Car | 0.559 |
59 | l Car | 1.942 |
60 | zeta Gem | 3.064 |
61 | Y Sgr | 2.003 |
62 | delta Cep | 3.556 |
63 | U Aql | 1.748 |
64 | eta Aql | 3.674 |
65 | RT Aur | 1.841 |
66 | SU Cru | 0.211 |
67 rows × 2 columns
reddenings_df
| | Star_ID | E_B_V | A_V |
---|---|---|---|
0 | RT Aur | 0.1844 | 0.5717 |
1 | QZ Nor | 1.2364 | 3.8329 |
2 | SU Cyg | 0.9989 | 3.0967 |
3 | Y Lac | 0.3880 | 1.2028 |
4 | T Vul | 0.1702 | 0.5278 |
5 | FF Aql | 0.4996 | 1.5486 |
6 | T Vel | 0.8183 | 2.5367 |
7 | VZ Cyg | 0.2543 | 0.7883 |
8 | V350 Sgr | 0.3277 | 1.0158 |
9 | BG Lac | 0.3374 | 1.0461 |
10 | delta Cep | 1.4914 | 4.6233 |
11 | CV Mon | 1.4373 | 4.4556 |
12 | V Cen | 1.1718 | 3.6327 |
13 | Y Sgr | 1.7647 | 5.4707 |
14 | CS Vel | 1.9023 | 5.8970 |
15 | BB Sgr | 0.2405 | 0.7455 |
16 | V Car | 0.1886 | 0.5846 |
17 | U Sgr | 0.7314 | 2.2672 |
18 | V496 Aql | 0.3182 | 0.9863 |
19 | U Aql | 0.3334 | 1.0337 |
20 | eta Aql | 0.1936 | 0.6001 |
21 | W Sgr | 0.5578 | 1.7292 |
22 | S Sge | 0.2888 | 0.8953 |
23 | GH Lup | 1.1761 | 3.6460 |
24 | S Mus | 0.2424 | 0.7513 |
25 | S Nor | 0.3401 | 1.0544 |
26 | beta Dor | 0.0539 | 0.1669 |
27 | zeta Gem | 0.0699 | 0.2166 |
28 | Z Lac | 0.6702 | 2.0775 |
29 | XX Cen | 0.5223 | 1.6191 |
30 | V340 Nor | 1.2831 | 3.9778 |
31 | UU Mus | 0.9659 | 2.9944 |
32 | BN Pup | 0.6313 | 1.9569 |
33 | TT Aql | 0.9672 | 2.9983 |
34 | LS Pup | 0.7634 | 2.3665 |
35 | VW Cen | 1.3431 | 4.1638 |
36 | Y Oph | 0.8138 | 2.5227 |
37 | SZ Aql | 1.3786 | 4.2736 |
38 | VY Car | 1.0360 | 3.2115 |
39 | RY Sco | 1.2228 | 3.7907 |
40 | RZ Vel | 1.1769 | 3.6484 |
41 | SW Vel | 0.9399 | 2.9137 |
42 | T Mon | 0.5997 | 1.8590 |
43 | RY Vel | 1.4953 | 4.6355 |
44 | AQ Pup | 0.7818 | 2.4236 |
45 | KN Cen | 1.5576 | 4.8286 |
46 | l Car | 0.2190 | 0.6788 |
47 | RS Pup | 0.8197 | 2.5412 |
48 | S Vul | 1.9095 | 5.9195 |
49 | X Cyg | 0.6879 | 2.1326 |
50 | DL Cas | 0.7936 | 2.4602 |
51 | CE Cas A | 0.8207 | 2.5440 |
52 | CE Cas B | 0.8207 | 2.5440 |
53 | CF Cas | 0.8207 | 2.5440 |
gaia_df.shape
(67, 2)
gaia_df.describe()
| | parallax_mas |
---|---|
count | 67.000000 |
mean | 0.968881 |
std | 0.831380 |
min | 0.211000 |
25% | 0.399000 |
50% | 0.602000 |
75% | 1.242500 |
max | 3.674000 |
reddenings_df.shape
(54, 3)
reddenings_df.describe()
| | E_B_V | A_V |
---|---|---|
count | 54.000000 | 54.000000 |
mean | 0.791817 | 2.454624 |
std | 0.492727 | 1.527478 |
min | 0.053900 | 0.166900 |
25% | 0.334400 | 1.036800 |
50% | 0.787700 | 2.441900 |
75% | 1.175025 | 3.642675 |
max | 1.909500 | 5.919500 |
gaia_df[gaia_df.isna().any(axis=1)]
| | Star_ID | parallax_mas |
---|---|---|
reddenings_df[reddenings_df.isna().any(axis=1)]
| | Star_ID | E_B_V | A_V |
---|---|---|---|
df1 = pd.DataFrame({'l_id': ['foo', 'bar', 'baz', 'foo'], 'number': [1, 2, 3, 5]})
df2 = pd.DataFrame({'r_id': ['foo', 'bar', 'baz', 'foo'], 'number': [5, 6, 7, 8]})
We want to do the same kind of merge again to add the columns from the reddening dataframe. This time our left frame will be `cepheids_df` and the right will be `reddenings_df`:
cepheids_df = pd.merge(left=cepheids_df, right=reddenings_df, on='Star_ID', how='outer')
cepheids_df
| | Star_ID | parallax_mas | logP | mag_B | err_B | mag_V | err_V | mag_I | err_I | E_B_V | A_V |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | XX Cen | 0.564 | 1.040 | 8.882 | 0.019 | 7.855 | 0.012 | 6.754 | 0.008 | 0.5223 | 1.6191 |
1 | T Mon | 0.733 | 1.432 | 7.436 | 0.022 | 6.187 | 0.014 | 5.005 | 0.010 | 0.5997 | 1.8590 |
2 | TW Nor | 0.362 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
3 | CV Mon | 0.602 | 0.731 | 11.681 | 0.015 | 10.314 | 0.010 | 8.653 | 0.006 | 1.4373 | 4.4556 |
4 | RY Sco | 0.754 | 1.308 | 9.568 | 0.018 | 8.037 | 0.012 | 6.271 | 0.008 | 1.2228 | 3.7907 |
5 | TT Aql | 0.994 | 1.138 | 8.560 | 0.022 | 7.185 | 0.014 | 5.745 | 0.009 | 0.9672 | 2.9983 |
6 | QZ Nor | 0.471 | 0.578 | 9.782 | 0.007 | 8.875 | 0.004 | 7.871 | 0.003 | 1.2364 | 3.8329 |
7 | Y Oph | 1.349 | 1.234 | 7.573 | 0.011 | 6.161 | 0.007 | 4.543 | 0.005 | 0.8138 | 2.5227 |
8 | VW Cen | 0.256 | 1.177 | 11.754 | 0.022 | 10.306 | 0.014 | 8.802 | 0.009 | 1.3431 | 4.1638 |
9 | V340 Nor | 0.490 | 1.053 | NaN | NaN | 8.407 | 0.005 | NaN | NaN | 1.2831 | 3.9778 |
10 | GY Sge | 0.346 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
11 | WZ Car | 0.281 | 1.362 | 10.614 | 0.027 | 9.343 | 0.018 | 8.002 | 0.011 | NaN | NaN |
12 | CD Cyg | 0.393 | 1.232 | 10.422 | 0.025 | 9.023 | 0.016 | 7.525 | 0.010 | NaN | NaN |
13 | FF Aql | 1.920 | 0.650 | 6.159 | 0.007 | 5.383 | 0.005 | 4.503 | 0.004 | 0.4996 | 1.5486 |
14 | Y Lac | 0.430 | 0.636 | 9.921 | 0.016 | 9.163 | 0.011 | 8.312 | 0.007 | 0.3880 | 1.2028 |
15 | BB Sgr | 1.194 | 0.822 | 7.987 | 0.013 | 6.965 | 0.008 | 5.855 | 0.006 | 0.2405 | 0.7455 |
16 | Z Lac | 0.510 | 1.037 | 9.623 | 0.019 | 8.448 | 0.012 | 7.212 | 0.008 | 0.6702 | 2.0775 |
17 | BG Lac | 0.582 | 0.727 | 9.878 | 0.013 | 8.900 | 0.009 | 7.825 | 0.005 | 0.3374 | 1.0461 |
18 | UU Mus | 0.308 | 1.066 | 11.040 | 0.020 | 9.839 | 0.013 | 8.502 | 0.008 | 0.9659 | 2.9944 |
19 | KN Cen | 0.240 | 1.532 | 11.604 | 0.021 | 9.918 | 0.015 | 8.024 | 0.011 | 1.5576 | 4.8286 |
20 | RU Sct | 0.521 | 1.294 | 11.291 | 0.023 | 9.526 | 0.015 | 7.486 | 0.009 | NaN | NaN |
21 | DL Cas | 0.581 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.7936 | 2.4602 |
22 | RS Pup | 0.587 | 1.617 | 8.580 | 0.019 | 7.057 | 0.013 | 5.507 | 0.009 | 0.8197 | 2.5412 |
23 | CE Cas B | 0.331 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.8207 | 2.5440 |
24 | SV Vul | 0.405 | 1.653 | 8.810 | 0.018 | 7.267 | 0.012 | 5.719 | 0.009 | NaN | NaN |
25 | VY Car | 0.564 | 1.277 | 8.744 | 0.015 | 7.510 | 0.012 | 6.301 | 0.007 | 1.0360 | 3.2115 |
26 | V367 Sct | 0.470 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
27 | U Nor | 0.622 | 1.102 | 10.939 | 0.020 | 9.273 | 0.013 | 7.366 | 0.008 | NaN | NaN |
28 | LS Pup | 0.213 | 1.151 | 11.790 | 0.020 | 10.500 | 0.013 | 9.087 | 0.008 | 0.7634 | 2.3665 |
29 | AQ Pup | 0.290 | 1.479 | 10.197 | 0.022 | 8.756 | 0.015 | 7.178 | 0.010 | 0.7818 | 2.4236 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
37 | CS Vel | 0.274 | 0.771 | 13.071 | 0.017 | 11.728 | 0.011 | 10.080 | 0.007 | 1.9023 | 5.8970 |
38 | V Cen | 1.413 | 0.740 | 7.769 | 0.017 | 6.857 | 0.011 | 5.805 | 0.007 | 1.1718 | 3.6327 |
39 | CF Cas | 0.315 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.8207 | 2.5440 |
40 | T Vel | 0.936 | 0.667 | 9.001 | 0.014 | 8.047 | 0.009 | 6.971 | 0.006 | 0.8183 | 2.5367 |
41 | X Cyg | 0.909 | 1.214 | 7.659 | 0.023 | 6.434 | 0.014 | 5.254 | 0.009 | 0.6879 | 2.1326 |
42 | VZ Cyg | 0.541 | 0.687 | 9.900 | 0.015 | 8.983 | 0.010 | 7.970 | 0.006 | 0.2543 | 0.7883 |
43 | CE Cas A | 0.331 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.8207 | 2.5440 |
44 | SW Vel | 0.409 | 1.370 | 9.435 | 0.022 | 8.189 | 0.014 | 6.875 | 0.009 | 0.9399 | 2.9137 |
45 | SZ Aql | 0.521 | 1.234 | 10.227 | 0.025 | 8.697 | 0.016 | 7.103 | 0.010 | 1.3786 | 4.2736 |
46 | S Vul | 0.231 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.9095 | 5.9195 |
47 | RZ Vel | 0.660 | 1.310 | 8.408 | 0.026 | 7.170 | 0.017 | 5.898 | 0.011 | 1.1769 | 3.6484 |
48 | GH Lup | 0.864 | 0.967 | 8.851 | 0.004 | 7.632 | 0.002 | 6.360 | 0.002 | 1.1761 | 3.6460 |
49 | RY Vel | 0.377 | 1.449 | 9.872 | 0.021 | 8.421 | 0.014 | 6.846 | 0.009 | 1.4953 | 4.6355 |
50 | X Sgr | 2.822 | 0.846 | 5.334 | 0.013 | 4.569 | 0.009 | 3.663 | 0.006 | NaN | NaN |
51 | V496 Aql | 0.976 | 0.833 | 8.937 | 0.008 | 7.769 | 0.005 | 6.491 | 0.004 | 0.3182 | 0.9863 |
52 | V350 Sgr | 0.806 | 0.712 | 8.449 | 0.015 | 7.504 | 0.010 | 6.453 | 0.006 | 0.3277 | 1.0158 |
53 | U Vul | 1.291 | 0.903 | 8.492 | 0.016 | 7.149 | 0.011 | 5.609 | 0.007 | NaN | NaN |
54 | W Sgr | 2.354 | 0.881 | 5.472 | 0.017 | 4.694 | 0.011 | 3.863 | 0.007 | 0.5578 | 1.7292 |
55 | SU Cyg | 1.036 | 0.585 | 7.493 | 0.015 | 6.890 | 0.011 | 6.208 | 0.007 | 0.9989 | 3.0967 |
56 | S Sge | 1.685 | 0.923 | 6.488 | 0.016 | 5.641 | 0.011 | 4.782 | 0.006 | 0.2888 | 0.8953 |
57 | beta Dor | 2.917 | 0.993 | 4.586 | 0.013 | 3.758 | 0.009 | 2.946 | 0.006 | 0.0539 | 0.1669 |
58 | U Car | 0.559 | 1.589 | 7.625 | 0.025 | 6.342 | 0.016 | 5.076 | 0.010 | NaN | NaN |
59 | l Car | 1.942 | 1.551 | 5.048 | 0.017 | 3.749 | 0.011 | 2.564 | 0.007 | 0.2190 | 0.6788 |
60 | zeta Gem | 3.064 | 1.007 | 4.735 | 0.011 | 3.901 | 0.007 | 3.100 | 0.005 | 0.0699 | 0.2166 |
61 | Y Sgr | 2.003 | 0.761 | 6.657 | 0.016 | 5.766 | 0.010 | 4.801 | 0.006 | 1.7647 | 5.4707 |
62 | delta Cep | 3.556 | 0.730 | 4.684 | 0.018 | 3.990 | 0.012 | NaN | NaN | 1.4914 | 4.6233 |
63 | U Aql | 1.748 | 0.847 | 7.536 | 0.016 | 6.457 | 0.011 | 5.279 | 0.007 | 0.3334 | 1.0337 |
64 | eta Aql | 3.674 | 0.856 | 4.744 | 0.017 | 3.918 | 0.011 | 3.036 | 0.007 | 0.1936 | 0.6001 |
65 | RT Aur | 1.841 | 0.571 | 6.120 | 0.017 | 5.487 | 0.011 | 4.822 | 0.006 | 0.1844 | 0.5717 |
66 | SU Cru | 0.211 | 1.109 | 11.613 | 0.015 | 9.802 | 0.008 | 7.658 | 0.003 | NaN | NaN |
67 rows × 11 columns
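With an outer merge, every star from either frame is kept, and any column a star lacks is filled with `NaN`. A quick way to see which frame each row came from is the `indicator` option of `pd.merge`, which adds a `_merge` column. The sketch below uses two hypothetical toy frames (`left_df`, `right_df`, and their values are made up for illustration) rather than the real data:

```python
import pandas as pd

# Hypothetical toy frames standing in for cepheids_df and reddenings_df
left_df = pd.DataFrame({
    "Star_ID": ["A", "B", "C"],
    "parallax_mas": [0.5, 0.7, 0.3],
})
right_df = pd.DataFrame({
    "Star_ID": ["A", "C", "D"],
    "A_V": [1.6, 4.4, 0.5],
})

# indicator=True adds a '_merge' column whose values are
# 'left_only', 'right_only' or 'both'
merged = pd.merge(left=left_df, right=right_df, on="Star_ID",
                  how="outer", indicator=True)
print(merged[["Star_ID", "_merge"]])
```

Rows marked `left_only` have no match in the right frame, so their right-hand columns are `NaN`; rows marked `right_only` are the reverse.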
We should check for missing data again too:
cepheids_df[cepheids_df.isna().any(axis=1)]
| | Star_ID | parallax_mas | logP | mag_B | err_B | mag_V | err_V | mag_I | err_I | E_B_V | A_V |
---|---|---|---|---|---|---|---|---|---|---|---|
2 | TW Nor | 0.362 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
9 | V340 Nor | 0.490 | 1.053 | NaN | NaN | 8.407 | 0.005 | NaN | NaN | 1.2831 | 3.9778 |
10 | GY Sge | 0.346 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
11 | WZ Car | 0.281 | 1.362 | 10.614 | 0.027 | 9.343 | 0.018 | 8.002 | 0.011 | NaN | NaN |
12 | CD Cyg | 0.393 | 1.232 | 10.422 | 0.025 | 9.023 | 0.016 | 7.525 | 0.010 | NaN | NaN |
20 | RU Sct | 0.521 | 1.294 | 11.291 | 0.023 | 9.526 | 0.015 | 7.486 | 0.009 | NaN | NaN |
21 | DL Cas | 0.581 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.7936 | 2.4602 |
23 | CE Cas B | 0.331 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.8207 | 2.5440 |
24 | SV Vul | 0.405 | 1.653 | 8.810 | 0.018 | 7.267 | 0.012 | 5.719 | 0.009 | NaN | NaN |
26 | V367 Sct | 0.470 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
27 | U Nor | 0.622 | 1.102 | 10.939 | 0.020 | 9.273 | 0.013 | 7.366 | 0.008 | NaN | NaN |
34 | WZ Sgr | 0.607 | 1.339 | 9.584 | 0.023 | 8.098 | 0.015 | 6.584 | 0.009 | NaN | NaN |
39 | CF Cas | 0.315 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.8207 | 2.5440 |
43 | CE Cas A | 0.331 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.8207 | 2.5440 |
46 | S Vul | 0.231 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.9095 | 5.9195 |
50 | X Sgr | 2.822 | 0.846 | 5.334 | 0.013 | 4.569 | 0.009 | 3.663 | 0.006 | NaN | NaN |
53 | U Vul | 1.291 | 0.903 | 8.492 | 0.016 | 7.149 | 0.011 | 5.609 | 0.007 | NaN | NaN |
58 | U Car | 0.559 | 1.589 | 7.625 | 0.025 | 6.342 | 0.016 | 5.076 | 0.010 | NaN | NaN |
62 | delta Cep | 3.556 | 0.730 | 4.684 | 0.018 | 3.990 | 0.012 | NaN | NaN | 1.4914 | 4.6233 |
66 | SU Cru | 0.211 | 1.109 | 11.613 | 0.015 | 9.802 | 0.008 | 7.658 | 0.003 | NaN | NaN |
It looks like some stars are missing reddening data too.
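To separate the stars that lack reddening values from those that lack photometry, the check can be limited to specific columns. This is a minimal sketch on a hypothetical toy frame (`merged_demo_df` and its values are made up; only the column names follow the tables above):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data reusing the column names from the merged frame
merged_demo_df = pd.DataFrame({
    "Star_ID": ["A", "B", "C"],
    "mag_V": [9.34, np.nan, 7.86],
    "E_B_V": [np.nan, 0.79, 0.52],
    "A_V": [np.nan, 2.46, 1.62],
})

# Rows where the reddening column specifically is missing
no_reddening = merged_demo_df[merged_demo_df["E_B_V"].isna()]

# Rows that have both reddening columns; NaNs elsewhere are allowed
with_reddening = merged_demo_df.dropna(subset=["E_B_V", "A_V"])

print(no_reddening["Star_ID"].tolist())
print(with_reddening["Star_ID"].tolist())
```

`dropna(subset=...)` returns a new frame and leaves the original untouched, so the full table stays available for other checks.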