Using a dataset from Kaggle to predict London house prices, based on 4 features in the dataset.
The 4 features are:
- Number of Bedrooms
- Area of houses in sq ft.
- Number of Bathrooms
- Number of Receptions
The dataset can be found below.
# importing pandas to store and view my data
import pandas as pd

# getting the filepath of my dataset
file_path = './London.csv'
london_house_data = pd.read_csv(file_path)
Depending on your type of data, Deepnote offers up to three ways to view it: raw output, preview, and visualize.
# viewing the data
# london_house_data.head()
london_house_data.describe()
This is the "Prediction Target" (what I want my model to predict).
y = london_house_data.Price
The features that will be used to predict the price of a house.
features = ['No. of Receptions', 'Area in sq ft', 'No. of Bedrooms', 'No. of Bathrooms']
X = london_house_data[features]
Using the scikit-learn DecisionTreeRegressor to make a basic decision tree model.
# making my model
from sklearn.tree import DecisionTreeRegressor
from numpy import array

# random_state fixes the seed for the random choices made while building the tree,
# so I get the same model on every run
model = DecisionTreeRegressor(random_state=1)
model.fit(X, y)
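To make the role of `random_state` a bit more concrete, here is a minimal sketch on made-up data. The arrays `X_demo` and `y_demo` are synthetic stand-ins, not the Kaggle dataset, and the names `tree_a` and `tree_b` are only for illustration. It builds two trees with the same seed and checks whether they agree.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: 50 "houses" with 4 numeric features (not the real dataset).
rng = np.random.default_rng(0)
X_demo = rng.integers(1, 6, size=(50, 4)).astype(float)
y_demo = X_demo.sum(axis=1) * 100_000 + rng.normal(0, 10_000, size=50)

# Two trees built with the same seed on the same data.
tree_a = DecisionTreeRegressor(random_state=1).fit(X_demo, y_demo)
tree_b = DecisionTreeRegressor(random_state=1).fit(X_demo, y_demo)

# With an identical seed, both trees should give identical predictions.
same = np.array_equal(tree_a.predict(X_demo), tree_b.predict(X_demo))
print("Identical predictions:", same)
```

Leaving `random_state` unset still produces a valid model, but repeated runs are then not guaranteed to build exactly the same tree.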
The values here are the 4 features. I'm using them to make up a fake house and generate its price as a quick check of the model.
# the test_house with the 4 features, in the same order as the features list:
# receptions, area in sq ft, bedrooms, bathrooms
test_house = array([[2, 700, 2, 2]])
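Because a plain array relies on the values being in exactly the same order as the training columns, one alternative is to build the fake house as a one-row DataFrame with named columns. The sketch below is self-contained and uses a few made-up training rows (`train`, `demo_model`, and `named_house` are hypothetical names, and the prices are invented), so it only illustrates the idea.

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

features = ['No. of Receptions', 'Area in sq ft', 'No. of Bedrooms', 'No. of Bathrooms']

# Made-up training rows, only so the example runs on its own.
train = pd.DataFrame({
    'No. of Receptions': [1, 2, 2, 3],
    'Area in sq ft': [450, 700, 950, 1400],
    'No. of Bedrooms': [1, 2, 3, 4],
    'No. of Bathrooms': [1, 2, 2, 3],
    'Price': [350_000, 600_000, 850_000, 1_300_000],
})
demo_model = DecisionTreeRegressor(random_state=1).fit(train[features], train['Price'])

# Naming each value makes the column order explicit;
# indexing with [features] puts the columns in training order.
named_house = pd.DataFrame([{
    'No. of Bedrooms': 2,
    'No. of Bathrooms': 2,
    'No. of Receptions': 2,
    'Area in sq ft': 700,
}])[features]

demo_prediction = demo_model.predict(named_house)
print(demo_prediction)
```

Passing a DataFrame with the same column names as the training data also avoids the feature-name warning that scikit-learn emits when a model fitted on a DataFrame is given a bare array.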
Using the model to estimate a price.
# predict() returns an array with one value per input row, so take the first one
predicted_price = model.predict(test_house)[0]
print(f"The predicted price of this house is £{predicted_price:,.0f}")
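A single made-up house only gives a rough sanity check. One common way to put a number on how far off the predictions are is to hold back part of the data and compute the mean absolute error on it. The sketch below does this on synthetic data. The DataFrame `demo`, the price formula, and the default 75/25 split are all assumptions for illustration and are not part of the original notebook.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

features = ['No. of Receptions', 'Area in sq ft', 'No. of Bedrooms', 'No. of Bathrooms']

# Synthetic houses with the same column names as the notebook (values are invented).
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    'No. of Receptions': rng.integers(1, 4, n),
    'Area in sq ft': rng.integers(300, 3000, n),
    'No. of Bedrooms': rng.integers(1, 6, n),
    'No. of Bathrooms': rng.integers(1, 4, n),
})
demo['Price'] = demo['Area in sq ft'] * 800 + rng.normal(0, 50_000, n)

# Hold back 25% of the rows (the train_test_split default) for validation.
train_X, val_X, train_y, val_y = train_test_split(
    demo[features], demo['Price'], random_state=1
)

# Fit on the training rows only, then score on rows the model has not seen.
eval_model = DecisionTreeRegressor(random_state=1).fit(train_X, train_y)
mae = mean_absolute_error(val_y, eval_model.predict(val_X))
print(f"Validation MAE: £{mae:,.0f}")
```

With the real data, the same pattern would use `X` and `y` from above in place of `demo[features]` and `demo['Price']`.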