ML_Exercises

Wine Classification

Using Plotly

Wine quality classification, multiclass and binary

Loan prediction

Loan Prediction Exercise with Random Forest and Pipeline

Cleaning

For Self_Employed, there are 500 "No" values and 82 "Yes" values. I can either fill the null values with the mode, or inspect the rows with missing values myself to see which category they are closer to. After checking their statistics, I replace them with "No", since the counts make that the more likely value.

df['Self_Employed'] = df['Self_Employed'].fillna('No')
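A minimal sketch of the mode-based alternative mentioned above, run on a tiny made-up frame rather than the real dataset (the `toy` frame and its values are assumptions for illustration only):

```python
import pandas as pd

# Toy data standing in for the loan dataset (values are made up).
toy = pd.DataFrame({"Self_Employed": ["No", "No", "Yes", None, "No", None]})

# Count each category, including the missing values.
counts = toy["Self_Employed"].value_counts(dropna=False)
print(counts)

# mode() returns a Series of the most frequent non-null value(s); take the first.
most_common = toy["Self_Employed"].mode()[0]

# Fill the missing entries with the mode.
toy["Self_Employed"] = toy["Self_Employed"].fillna(most_common)
```

With a split as unbalanced as 500 to 82, the mode and the hard-coded 'No' give the same result; computing the mode just avoids typing the value by hand.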

Only 13 records have a null Gender, and their values are too high and too different from those of the two genders to assign them to either one, so I drop these records. Otherwise, I could have used the mode in this case too.

df = df.dropna(subset=['Gender'])
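One possible way to compare the rows with a missing Gender against the two known groups before deciding to drop them. This is only a sketch: the `toy` frame, its values, and the choice of `ApplicantIncome` as the column to compare are assumptions, not the checks actually run in the notebook.

```python
import pandas as pd

# Toy data standing in for the loan dataset (values are made up).
toy = pd.DataFrame({
    "Gender": ["Male", "Female", None, "Male", None, "Female"],
    "ApplicantIncome": [4000, 3000, 20000, 5000, 30000, 3500],
})

# Label missing genders explicitly so they form their own group.
groups = toy["Gender"].fillna("Missing")
summary = toy.groupby(groups)["ApplicantIncome"].mean()
print(summary)

# If the "Missing" group looks unlike both categories, drop those rows.
cleaned = toy.dropna(subset=["Gender"])
```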

Label encoding

The Dependents column contains the string '3+', so I replace it with 3 and then cast the whole column to int:

df['Dependents'] = df['Dependents'].replace('3+', 3)
df['Dependents'] = df['Dependents'].astype('int')
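Note that `astype('int')` raises an error if the column still contains missing values. A hedged sketch of one way around that, assuming nulls may be present (the toy Series below is made up), is to parse with `pd.to_numeric` and use pandas' nullable `Int64` dtype:

```python
import pandas as pd

# Toy column with a "3+" label and one missing value (made-up data).
s = pd.Series(["0", "1", "3+", None, "2"], name="Dependents")

# Replace "3+" with "3", parse to numbers, and keep the missing value as <NA>.
dependents = pd.to_numeric(s.replace("3+", "3")).astype("Int64")
print(dependents)
```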

from sklearn.preprocessing import LabelEncoder  # Label Encoder for the target
enc = LabelEncoder()

df['Loan_Status'] = enc.fit_transform(df['Loan_Status'])
enc_name_mapping = dict(zip(enc.classes_, enc.transform(enc.classes_)))
print(enc_name_mapping)  # dictionary mapping each target label to its code
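A small sketch of how the fitted encoder can also translate codes back into labels. The label values "Y" and "N" are an assumption used for illustration; the mapping printed above shows the actual ones.

```python
from sklearn.preprocessing import LabelEncoder

labels = ["Y", "N", "Y", "Y", "N"]  # made-up target values

enc = LabelEncoder()
encoded = enc.fit_transform(labels)

# classes_ is sorted, so the position of each class is its integer code.
mapping = {label: int(code) for code, label in enumerate(enc.classes_)}
print(mapping)

# inverse_transform recovers the original labels from the codes.
decoded = enc.inverse_transform(encoded)
```

Keeping the fitted `enc` object around makes it possible to turn model predictions back into readable labels later.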

Categorical features

categorical_features = df[['Gender', 'Married', 'Education', 'Self_Employed',
       'Property_Area']]  # categorical features, excluding the target

for col in categorical_features:
    print(df[col].unique())

['Male' 'Female']

['No' 'Yes']

['Graduate' 'Not Graduate']

['No' 'Yes']

['Urban' 'Rural' 'Semiurban']

I'll use map to convert the categorical values into numerical ones:

df['Gender'] = df['Gender'].map({'Male': 0, 'Female': 1})

I'll save them as dictionaries so I can have a legend:

Gender = {'Male':0, 'Female':1}
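The same idea can be extended to several columns by keeping one dictionary per column and reusing it both for `map` and as the legend. This is a sketch on made-up data: only the Gender codes come from the text above, while the `legends` name and the Married codes are assumptions.

```python
import pandas as pd

# Toy data standing in for the loan dataset (values are made up).
toy = pd.DataFrame({
    "Gender": ["Male", "Female", "Male"],
    "Married": ["No", "Yes", "Yes"],
})

# One dictionary per column doubles as the legend.
# Only the Gender codes come from the text; the Married codes are assumed.
legends = {
    "Gender": {"Male": 0, "Female": 1},
    "Married": {"No": 0, "Yes": 1},
}

for col, legend in legends.items():
    toy[col] = toy[col].map(legend)
    # map() turns any value missing from the legend into NaN, so check for that.
    assert toy[col].notna().all(), f"unmapped value in {col}"

# Invert a legend to decode the numbers again.
gender_names = {code: name for name, code in legends["Gender"].items()}
```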
