Getting different results from GroupBy for DataFrames of different sizes
Posted: 30 Apr 2025, 22:22
I am running the same functions on two DataFrames that are identical except for their length (same number of columns, same dtypes). When I run them on the larger one, I get exactly what I expect: a result grouped by date, with the index set to InterpOn and the Values 1 column interpolated. When I run them on the smaller df, the InterpOn values become the column headers instead. Is there anything about a groupby/set_index/interpolate chain that could cause the resulting DataFrame to be shaped differently?
# For each date, index the group by InterpOn and interpolate 'Values 1' along that index
df1 = merged_df.groupby('Dates').apply(
    lambda group: group.set_index('InterpOn')['Values 1'].interpolate(
        method='index', limit_direction='both', limit_area='inside'))
# Same operation for 'Values 2'
df2 = merged_df.groupby('Dates').apply(
    lambda group: group.set_index('InterpOn')['Values 2'].interpolate(
        method='index', limit_direction='both', limit_area='inside'))
Large df:
print(merged_df)
print(merged_df.index)
InterpOn Date Values 1 Values2
0 0.02367 2025-05-02 NaN 5.138635
1 0.02370 2025-05-02 5.915301 NaN
2 0.04735 2025-05-02 NaN 4.630094
3 0.07102 2025-05-02 NaN 4.304858
4 0.07109 2025-05-02 4.953734 NaN
... ... ... ... ...
8606 2.13260 2035-04-30 0.290885 NaN
8607 2.22667 2035-04-30 0.287620 NaN
8608 2.47405 2035-04-30 0.276654 NaN
8609 2.72641 2035-04-30 0.268110 NaN
8610 2.96886 2035-04-30 0.264625 NaN
[8611 rows x 4 columns]
RangeIndex(start=0, stop=8611, step=1)
Small df:
print(merged_df2)
print(merged_df2.index)
InterpOn Date Values 1 Values2
0 0.07102 2025-05-02 NaN 4.304858
1 0.07107 2025-05-02 4.839552 NaN
2 0.09469 2025-05-02 NaN 4.058323
3 0.09893 2025-05-02 4.519238 NaN
4 0.10000 2025-05-02 NaN 4.009879
... ... ... ... ...
1139 0.72500 2025-10-17 NaN 0.408334
1140 0.73387 2025-10-17 NaN 0.404007
1141 0.74206 2025-10-17 0.405744 NaN
1142 0.74570 2025-10-17 NaN 0.398373
1143 0.75000 2025-10-17 NaN 0.396369
[1144 rows x 4 columns]
RangeIndex(start=0, stop=1144, step=1)
Output of df1 and df2 when running the functions on the large df:
print(df1)
print(df2)
print(df1.reset_index())
print(df2.reset_index())
Dates InterOn
2025-05-02 0.02367 NaN
0.02374 5.795639
0.04735 5.327642
0.07102 4.858456
0.07122 4.854492
...
2035-04-30 2.13660 0.236831
2.23085 0.234302
2.47869 0.227912
2.73152 0.224016
2.97443 0.223111
Name: Values 1, Length: 8611, dtype: float64
Dates InterOn
2025-05-02 0.02367 5.138635
0.02374 5.137131
0.04735 4.630094
0.07102 4.304858
0.07122 4.302775
...
2035-04-30 2.13660 NaN
2.23085 NaN
2.47869 NaN
2.73152 NaN
2.97443 NaN
Name: Values 2, Length: 8611, dtype: float64
Dates InterpOn Values 1
0 2025-05-02 0.02367 NaN
1 2025-05-02 0.02374 5.795639
2 2025-05-02 0.04735 5.327642
3 2025-05-02 0.07102 4.858456
4 2025-05-02 0.07122 4.854492
... ... ... ...
8606 2035-04-30 2.13660 0.236831
8607 2035-04-30 2.23085 0.234302
8608 2035-04-30 2.47869 0.227912
8609 2035-04-30 2.73152 0.224016
8610 2035-04-30 2.97443 0.223111
[8611 rows x 3 columns]
Dates InterpOn Values 2
0 2025-05-02 0.02367 5.138635
1 2025-05-02 0.02374 5.137131
2 2025-05-02 0.04735 4.630094
3 2025-05-02 0.07102 4.304858
4 2025-05-02 0.07122 4.302775
... ... ... ...
8606 2035-04-30 2.13660 NaN
8607 2035-04-30 2.23085 NaN
8608 2035-04-30 2.47869 NaN
8609 2035-04-30 2.73152 NaN
8610 2035-04-30 2.97443 NaN
[8611 rows x 3 columns]
Output of df1 and df2 when running the functions on the small df:
print(df1)
print(df2)
print(df1.reset_index())
print(df2.reset_index())
InterpOn 0.07102 0.07107 0.09469 ... 0.74206 0.74570 0.75000
Dates ...
2025-05-02 NaN 4.839552 4.567986 ... 1.471894 NaN NaN
2025-05-09 NaN 2.899251 2.735597 ... 0.863719 NaN NaN
2025-05-16 NaN 2.525711 2.380305 ... 0.706109 NaN NaN
2025-05-23 NaN 2.128388 2.007021 ... 0.629245 NaN NaN
2025-05-30 NaN 1.844957 1.740870 ... 0.572882 NaN NaN
2025-06-06 NaN 1.717018 1.620191 ... 0.543621 NaN NaN
2025-06-20 NaN 1.545027 1.458648 ... 0.506780 NaN NaN
2025-07-18 NaN 1.367541 1.290628 ... 0.461288 NaN NaN
2025-08-15 NaN 1.259102 1.188406 ... 0.442713 NaN NaN
2025-09-19 NaN 1.160901 1.095712 ... 0.419598 NaN NaN
2025-10-17 NaN 1.110341 1.046494 ... 0.405744 NaN NaN
[11 rows x 104 columns]
InterpOn 0.07102 0.07107 0.09469 ... 0.74206 0.74570 0.75000
Dates ...
2025-05-02 4.304858 4.304337 4.058323 ... 1.311007 1.297876 1.282278
2025-05-09 2.724090 2.723761 2.568512 ... 0.839582 0.831265 0.821344
2025-05-16 2.350192 2.349904 2.214085 ... 0.690618 0.683802 0.675786
2025-05-23 2.070852 2.070599 1.950801 ... 0.616408 0.610850 0.604349
2025-05-30 1.861961 1.861733 1.754223 ... 0.563564 0.558743 0.553115
2025-06-06 1.683924 1.683721 1.587589 ... 0.533107 0.528862 0.523914
2025-06-20 1.537651 1.537465 1.449704 ... 0.496104 0.492512 0.488336
2025-07-18 1.345252 1.345090 1.268564 ... 0.452838 0.450054 0.446827
2025-08-15 1.256296 1.256143 1.184143 ... 0.435486 0.433208 0.430574
2025-09-19 1.155226 1.155086 1.089053 ... 0.413487 0.411552 0.409319
2025-10-17 1.087805 1.087674 1.025609 ... 0.400107 0.398373 0.396369
[11 rows x 104 columns]
InterpOn Dates 0.07102 0.07107 ... 0.74206 0.7457 0.75
0 2025-05-02 NaN 4.839552 ... 1.471894 NaN NaN
1 2025-05-09 NaN 2.899251 ... 0.863719 NaN NaN
2 2025-05-16 NaN 2.525711 ... 0.706109 NaN NaN
3 2025-05-23 NaN 2.128388 ... 0.629245 NaN NaN
4 2025-05-30 NaN 1.844957 ... 0.572882 NaN NaN
5 2025-06-06 NaN 1.717018 ... 0.543621 NaN NaN
6 2025-06-20 NaN 1.545027 ... 0.506780 NaN NaN
7 2025-07-18 NaN 1.367541 ... 0.461288 NaN NaN
8 2025-08-15 NaN 1.259102 ... 0.442713 NaN NaN
9 2025-09-19 NaN 1.160901 ... 0.419598 NaN NaN
10 2025-10-17 NaN 1.110341 ... 0.405744 NaN NaN
[11 rows x 105 columns]
InterpOn Dates 0.07102 0.07107 ... 0.74206 0.7457 0.75
0 2025-05-02 4.304858 4.304337 ... 1.311007 1.297876 1.282278
1 2025-05-09 2.724090 2.723761 ... 0.839582 0.831265 0.821344
2 2025-05-16 2.350192 2.349904 ... 0.690618 0.683802 0.675786
3 2025-05-23 2.070852 2.070599 ... 0.616408 0.610850 0.604349
4 2025-05-30 1.861961 1.861733 ... 0.563564 0.558743 0.553115
5 2025-06-06 1.683924 1.683721 ... 0.533107 0.528862 0.523914
6 2025-06-20 1.537651 1.537465 ... 0.496104 0.492512 0.488336
7 2025-07-18 1.345252 1.345090 ... 0.452838 0.450054 0.446827
8 2025-08-15 1.256296 1.256143 ... 0.435486 0.433208 0.430574
9 2025-09-19 1.155226 1.155086 ... 0.413487 0.411552 0.409319
10 2025-10-17 1.087805 1.087674 ... 0.400107 0.398373 0.396369
[11 rows x 105 columns]
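If the goal is the long (Dates, InterpOn) layout no matter how the groups line up, one possible sketch is to interpolate each group separately and join the pieces with `pd.concat`, rather than letting `groupby().apply()` choose the output shape. The helper name `interpolate_long` and the toy frame below are hypothetical, not part of the original code.

```python
import numpy as np
import pandas as pd


def interpolate_long(df, value_col):
    # Hypothetical helper: interpolate each date group on its own and
    # concatenate, so the result is always a Series indexed by
    # (Dates, InterpOn), even when every group shares the same InterpOn values.
    pieces = {
        date: group.set_index("InterpOn")[value_col].interpolate(
            method="index", limit_direction="both", limit_area="inside"
        )
        for date, group in df.groupby("Dates")
    }
    return pd.concat(pieces, names=["Dates"])


# Toy data (assumption): both dates share the same InterpOn values,
# the case that yields the wide layout with groupby().apply().
toy = pd.DataFrame({
    "Dates": ["2025-05-02"] * 3 + ["2025-05-09"] * 3,
    "InterpOn": [0.1, 0.2, 0.3, 0.1, 0.2, 0.3],
    "Values 1": [1.0, np.nan, 3.0, 2.0, np.nan, 4.0],
})

result = interpolate_long(toy, "Values 1")
print(result)
print(result.reset_index())
```

Calling `reset_index()` on this result should then give one row per (date, InterpOn) pair for both the large and the small df.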