I know I could run group_by on Weather + Windy and then aggregate, but I can't do that, because I have many other aggregations that I need to compute on the Weather group_by only.
Code:
import polars as pl

df = pl.DataFrame(
    data={
        "Weather": ["Rain", "Sun", "Rain", "Sun", "Rain", "Sun", "Rain", "Sun"],
        "Price": [1, 2, 3, 4, 5, 6, 7, 8],
        "Windy": ["Y", "Y", "Y", "Y", "N", "N", "N", "N"],
    }
)
Code:
df_agg = (
    df
    .group_by("Weather")
    .agg(
        pl.col("Windy")
        .value_counts()
        .alias("Price")
    )
)
Code:
shape: (2, 2)
┌─────────┬────────────────────┐
│ Weather ┆ Price              │
│ ---     ┆ ---                │
│ str     ┆ list[struct[2]]    │
╞═════════╪════════════════════╡
│ Sun     ┆ [{"Y",2}, {"N",2}] │
│ Rain    ┆ [{"Y",2}, {"N",2}] │
└─────────┴────────────────────┘
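Code:
# Pseudocode: custom_fun_on_other_col does not exist in Polars.
# It stands for "sum Price per distinct value of Windy".
df_agg = (
    df
    .group_by("Weather")
    .agg(
        pl.col("Windy")
        .custom_fun_on_other_col("Price", sum)
        .alias("Price")
    )
)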
Code:
shape: (2, 2)
┌─────────┬────────────────────┐
│ Weather ┆ Price              │
│ ---     ┆ ---                │
│ str     ┆ list[struct[2]]    │
╞═════════╪════════════════════╡
│ Sun     ┆ [{"Y",6},{"N",14}] │
│ Rain    ┆ [{"Y",4},{"N",12}] │
└─────────┴────────────────────┘