# Apartment Prices in Bogotá (Price - Size, Location, and Neighborhood)

In this project I build a model to predict apartment prices in Bogotá from the size, location, and neighborhood of each apartment.

First, I explore the data.

Import the libraries.

Read the data.

The data was obtained from the Properati website.

Only some of the listings are used. The property must be an apartment with a covered surface greater than 0, it must be located in Bogotá, the currency must be COP, the operation type must be Venta (sale), and the price must be different from zero. To remove outliers, only the properties between the 0.1 and 0.9 quantiles are kept. Duplicates are dropped.
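The filtering rules above can be sketched as a small pandas function. This is only an illustration: the function name `wrangle`, every column name (`property_type`, `surface_covered`, `city`, `currency`, `operation_type`, `price`), and the value `"Apartamento"` are assumptions, and the text does not say which variable the quantile trim is applied to, so the hypothetical `trim_col` argument defaults to the covered surface.

```python
import pandas as pd


def wrangle(df: pd.DataFrame, trim_col: str = "surface_covered") -> pd.DataFrame:
    """Keep apartments for sale in Bogotá, priced in COP (column names are assumed)."""
    mask = (
        (df["property_type"] == "Apartamento")
        & (df["surface_covered"] > 0)
        & (df["city"] == "Bogotá")
        & (df["currency"] == "COP")
        & (df["operation_type"] == "Venta")
        & (df["price"] != 0)
    )
    df = df[mask]

    # Keep rows between the 0.1 and 0.9 quantiles of trim_col to drop outliers.
    low, high = df[trim_col].quantile([0.1, 0.9])
    df = df[df[trim_col].between(low, high)]

    # Remove exact duplicate listings.
    return df.drop_duplicates()


# Tiny made-up frame, only to show the call.
raw = pd.DataFrame(
    {
        "property_type": ["Apartamento"] * 5 + ["Casa"],
        "surface_covered": [40, 60, 80, 100, 500, 120],
        "city": ["Bogotá"] * 6,
        "currency": ["COP"] * 6,
        "operation_type": ["Venta"] * 6,
        "price": [2e8, 3e8, 4e8, 5e8, 9e9, 6e8],
    }
)
clean = wrangle(raw)
print(clean)
```

Passing `trim_col="price"` would trim on price instead, if that is the variable the quantiles refer to.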

## Explore

We inspect the information of the dataframe.

We draw a Mapbox scatter map to see where the apartments are located. The map shows that prices are higher in some zones.

We calculate the correlation matrix and then draw a heatmap to see which variables are correlated.

There is no strong correlation between the variables.
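As a rough sketch of the correlation step, the matrix can be computed with `DataFrame.corr()`. The data below is randomly generated and the column names are assumptions, so the values say nothing about the real dataset.

```python
import numpy as np
import pandas as pd

# Made-up, independent columns standing in for the real features.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "surface_covered": rng.uniform(40, 150, size=200),
        "lat": rng.uniform(4.5, 4.8, size=200),
        "lon": rng.uniform(-74.2, -74.0, size=200),
    }
)

# Pearson correlation between every pair of numeric columns.
corr = df.corr()
print(corr.round(2))
```

The resulting square matrix is the kind of input a heatmap function such as seaborn's `heatmap` takes.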

## Model

First, the data is separated into the features X (covered surface, lat, and lon) and the target y (price).

Then X and y are split into X train and y train, and X test and y test.
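A minimal sketch of this split, using scikit-learn's `train_test_split`. The column names, the 80/20 proportion, and the `random_state` are assumptions, since the text does not state them, and the rows are made up.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up listings; column names are assumed.
df = pd.DataFrame(
    {
        "surface_covered": [45, 50, 62, 70, 78, 85, 90, 104, 110, 125],
        "lat": [4.60, 4.61, 4.63, 4.65, 4.66, 4.68, 4.70, 4.71, 4.73, 4.75],
        "lon": [-74.10, -74.09, -74.08, -74.07, -74.06, -74.05, -74.05, -74.04, -74.04, -74.03],
        "price": [2.1e8, 2.4e8, 3.0e8, 3.3e8, 3.9e8, 4.2e8, 4.6e8, 5.5e8, 5.9e8, 6.8e8],
    }
)

features = ["surface_covered", "lat", "lon"]
target = "price"
X = df[features]
y = df[target]

# Hold out 20% of the rows for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)
```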

## Build the model

First, the baseline is calculated.

Next, the mean absolute error of the baseline is calculated.
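The text does not say how the baseline is defined. A common choice, assumed here, is to always predict the mean price of the training set and measure the mean absolute error of that constant prediction. The prices below are made up.

```python
import pandas as pd
from sklearn.metrics import mean_absolute_error

# Made-up training prices in COP.
y_train = pd.Series([200e6, 300e6, 400e6, 500e6])

# Assumed baseline: predict the mean training price for every apartment.
y_mean = y_train.mean()
y_pred_baseline = [y_mean] * len(y_train)

mae_baseline = mean_absolute_error(y_train, y_pred_baseline)
print(f"Mean price: {y_mean:,.0f}")
print(f"Baseline MAE: {mae_baseline:,.0f}")
```

Any model worth keeping should reach a lower mean absolute error than this constant prediction.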

## Iterate

We build a pipeline with a one-hot encoder, a simple imputer, and a Ridge regression.

We fit the model.
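One possible way to assemble such a pipeline with scikit-learn is sketched below. It is not necessarily the original implementation: the `ColumnTransformer` wiring, the categorical column `neighborhood`, the other column names, and the default hyperparameters are all assumptions, and the data is made up.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Made-up training data with a few missing values.
X_train = pd.DataFrame(
    {
        "surface_covered": [50.0, 80.0, np.nan, 120.0, 65.0, 95.0],
        "lat": [4.60, 4.65, 4.70, np.nan, 4.62, 4.68],
        "lon": [-74.08, -74.05, -74.10, -74.06, -74.07, -74.04],
        "neighborhood": ["Chapinero", "Suba", "Usaquén", "Chapinero", "Suba", "Usaquén"],
    }
)
y_train = pd.Series([300e6, 350e6, 500e6, 700e6, 280e6, 600e6])

preprocess = ColumnTransformer(
    [
        # One-hot encode the categorical column; unseen categories become all zeros.
        ("onehot", OneHotEncoder(handle_unknown="ignore"), ["neighborhood"]),
        # Fill missing numeric values with the column mean (SimpleImputer's default).
        ("impute", SimpleImputer(), ["surface_covered", "lat", "lon"]),
    ]
)

model = make_pipeline(preprocess, Ridge())
model.fit(X_train, y_train)
print(model.predict(X_train).round(0))
```

Keeping the preprocessing inside the pipeline means the same encoding and imputation are applied automatically at prediction time.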

## Evaluate

First, we predict with X train.

The mean absolute error on the training data is calculated.

The model beats the baseline by 183441221 (in COP, the unit of the mean absolute error).

Finally, we evaluate the model with the test data.

The mean absolute error on the test data is calculated.
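The comparison between baseline, training, and test errors can be sketched as follows. The data is synthetic and a plain `Ridge` on one feature stands in for the full pipeline, so the numbers printed have no relation to the real results; the mean-price baseline is also an assumption.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic data: price grows with surface, plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(40, 150, size=(200, 1))
y = 4e6 * X[:, 0] + rng.normal(0, 2e7, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = Ridge().fit(X_train, y_train)

# Baseline: always predict the mean training price.
mae_baseline = mean_absolute_error(y_train, np.full_like(y_train, y_train.mean()))
mae_train = mean_absolute_error(y_train, model.predict(X_train))
mae_test = mean_absolute_error(y_test, model.predict(X_test))

print(f"Baseline MAE: {mae_baseline:,.0f}")
print(f"Training MAE: {mae_train:,.0f}")
print(f"Test MAE:     {mae_test:,.0f}")
```

A test error close to the training error suggests the model generalizes; a much larger test error would point to overfitting.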

## Communicate Results

Finally, the results are communicated. We write a function that makes a prediction from the data the user provides.
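A function of this kind might look like the sketch below. The name `make_prediction`, its arguments, the column names, and the output message are hypothetical, and a `Ridge` model fitted on made-up rows stands in for the trained pipeline.

```python
import pandas as pd
from sklearn.linear_model import Ridge

# Stand-in model fitted on made-up data.
X_train = pd.DataFrame(
    {
        "surface_covered": [50.0, 80.0, 120.0, 65.0],
        "lat": [4.60, 4.65, 4.70, 4.62],
        "lon": [-74.08, -74.05, -74.10, -74.07],
    }
)
y_train = pd.Series([300e6, 450e6, 700e6, 380e6])
model = Ridge().fit(X_train, y_train)


def make_prediction(surface_covered: float, lat: float, lon: float) -> str:
    """Return a readable price estimate for a single apartment."""
    # Build a one-row frame with the same columns the model was trained on.
    data = pd.DataFrame(
        {"surface_covered": [surface_covered], "lat": [lat], "lon": [lon]}
    )
    prediction = model.predict(data)[0]
    return f"Predicted apartment price: {prediction:,.0f} COP"


message = make_prediction(90.0, 4.66, -74.06)
print(message)
```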