Preprocessing multivalue attributes in a DataFrame, similar to nominal ones
Posted: 02 Apr 2025, 19:44
Description:
- The input is a CSV file with multivalue columns. Each cell holds at least 1 and at most 5 values, separated by semicolons. The input is similar to this:
Job Perks | Insurance Benefits
Online Courses; Certification Programs; Cross Training | Life Insurance; Dental Insurance
Leadership Development Programs; Online Courses | Life Insurance; Accident Insurance
- Multivalue expected output:
Job Perks_Online Courses | Job Perks_Certification Programs | Job Perks_Cross Training | Job Perks_Leadership Development Programs | Insurance Benefits_Life Insurance | Insurance Benefits_Dental Insurance | Insurance Benefits_Accident Insurance
1 | 1 | 1 | 0 | 1 | 1 | 0
1 | 0 | 0 | 1 | 1 | 0 | 1
- The CSV file also has nominal columns. Sample input:
Gender | Marital Status
Male | Single
Female | Married
- Nominal Expected Output:
Gender_Male | Gender_Female | Marital Status_Single | Marital Status_Married
1 | 0 | 1 | 0
0 | 1 | 0 | 1
Sample code (for the nominal columns):
import pandas as pd
import numpy as np
import dtale  # interactive viewer, nicer than printing DataFrames to stdout

nominalColumns = ["Gender", "Marital Status", "Educational Attainment", "Employment Status", "Company Bonus Structure", "Company Medical Plan Type"]
multivalueColumns = ["Job Perks", "Professional Development Opportunities", "Insurance Benefits"]

df = pd.read_csv('ECP_Unedited.csv')

# Convert nominal columns: one 0/1 indicator column per category
newCols = pd.get_dummies(df[nominalColumns], dtype=int)
df = df.drop(columns=nominalColumns)
df = pd.concat([df, newCols], axis=1)
dtale.show(df)

# Convert multivalue columns
# INSERT CODE HERE!
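
One possible way to fill in the multivalue step is pandas' `Series.str.get_dummies`, which splits each cell on a separator and returns one 0/1 column per distinct value. The sketch below is only an illustration under a few assumptions: the values are separated by `;` with optional surrounding spaces (as in the sample above), the helper name `encode_multivalue` is made up for this example, and a small in-memory frame stands in for `ECP_Unedited.csv`.

```python
import re

import pandas as pd

# Hypothetical in-memory sample standing in for ECP_Unedited.csv
df = pd.DataFrame({
    "Job Perks": [
        "Online Courses; Certification Programs; Cross Training",
        "Leadership Development Programs; Online Courses",
    ],
    "Insurance Benefits": [
        "Life Insurance; Dental Insurance",
        "Life Insurance; Accident Insurance",
    ],
})
multivalueColumns = ["Job Perks", "Insurance Benefits"]


def encode_multivalue(frame, columns, sep=";"):
    """Replace each multivalue column with one 0/1 column per distinct value.

    Illustrative helper (not part of pandas); assumes `sep` separates the
    values inside a cell, possibly with spaces around it.
    """
    # Matches the separator together with any whitespace around it
    sep_with_spaces = r"\s*" + re.escape(sep) + r"\s*"
    encoded = []
    for col in columns:
        # Turn "a; b" into "a;b" so the split values carry no stray spaces;
        # missing cells become empty strings and produce all-zero rows
        cleaned = (
            frame[col]
            .fillna("")
            .str.replace(sep_with_spaces, sep, regex=True)
            .str.strip()
        )
        dummies = cleaned.str.get_dummies(sep=sep).astype(int)
        encoded.append(dummies.add_prefix(f"{col}_"))
    # Drop the original columns and append the indicator columns
    return pd.concat([frame.drop(columns=columns)] + encoded, axis=1)


result = encode_multivalue(df, multivalueColumns)
print(result.T)  # transposed so the long column names stay readable
```

Note that `str.get_dummies` orders the new columns alphabetically within each source column, so the column order may differ from the expected output shown above even though the 0/1 values match. In the original script the same call would be `df = encode_multivalue(df, multivalueColumns)` placed before `dtale.show(df)`.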