Importing necessary libraries
Install and import libraries
First, install and import the libraries:
!pip install pandas numpy matplotlib seaborn
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
Loading your real estate data
Upload your data
Drag and drop your real estate data CSV file into the file explorer on the left side of the Deepnote interface.
Load data into a DataFrame
Load the data by adding the following code:
# Load the dataset
df = pd.read_csv('your_real_estate_data.csv')
df
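Before moving on, it can help to confirm that the file contains the columns the later steps rely on. The following is a minimal sketch, assuming the column names used further down in this walkthrough (date, location, price, annual_rent, net_operating_income). The inline sample CSV is made up for illustration and stands in for your own file.

```python
import io
import pandas as pd

# Made-up sample standing in for 'your_real_estate_data.csv'.
sample_csv = io.StringIO(
    "date,location,price,annual_rent,net_operating_income\n"
    "2023-01-15,Downtown,250000,18000,12500\n"
    "2023-02-01,Suburb,180000,12600,9000\n"
)

# Columns used by the later steps of this walkthrough.
required_columns = ["date", "location", "price", "annual_rent", "net_operating_income"]

sample_df = pd.read_csv(sample_csv)

# Collect any expected column that is absent from the file.
missing_columns = [col for col in required_columns if col not in sample_df.columns]

if missing_columns:
    raise ValueError(f"Missing expected columns: {missing_columns}")
print(sample_df.shape)
```

If your file uses different column names, either rename them with df.rename(columns=...) or adjust the names in the later steps.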
Data cleaning and preparation
Check for missing values
Run the following to count the missing values in each column:
df.isnull().sum()
Handle missing values
Fill or drop missing values based on your needs. The example below forward-fills, copying the last valid value down each column:
df = df.ffill()
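Forward fill copies values from the previous row, which only makes sense when row order is meaningful. For a table of unrelated listings, a per-column strategy may be a better fit. The sketch below is one possible approach, not the only one: it drops rows with no price and fills missing rent with the column median. The sample values are made up.

```python
import numpy as np
import pandas as pd

# Made-up sample with gaps in price and annual_rent.
sample_df = pd.DataFrame({
    "location": ["Downtown", "Suburb", "Suburb", "Downtown"],
    "price": [250000.0, np.nan, 180000.0, 300000.0],
    "annual_rent": [18000.0, 12600.0, np.nan, 21000.0],
})

# Drop rows where price is missing, since the metrics below depend on it.
cleaned = sample_df.dropna(subset=["price"]).copy()

# Fill the remaining gaps in annual_rent with the column median.
rent_median = cleaned["annual_rent"].median()
cleaned["annual_rent"] = cleaned["annual_rent"].fillna(rent_median)

print(cleaned.isnull().sum().sum())
```

Whether to drop or fill depends on how much data you can afford to lose and how sensitive your metrics are to imputed values.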
Convert data types
Ensure that your data types are correct, especially for dates:
df['date'] = pd.to_datetime(df['date'])
df.dtypes
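Real-world exports often contain unparseable dates or prices stored as text. The sketch below shows one way to handle both, assuming prices may carry a dollar sign and thousands separators. The sample values are made up, and your file may need different cleanup rules.

```python
import pandas as pd

# Made-up sample with messy date and price values.
sample_df = pd.DataFrame({
    "date": ["2023-01-15", "not a date", "2023-03-10"],
    "price": ["$250,000", "180000", "N/A"],
})

# Unparseable dates become NaT instead of raising an error.
sample_df["date"] = pd.to_datetime(sample_df["date"], errors="coerce")

# Strip dollar signs and commas, then convert; bad values become NaN.
price_text = sample_df["price"].str.replace(r"[$,]", "", regex=True)
sample_df["price"] = pd.to_numeric(price_text, errors="coerce")

print(sample_df.dtypes)
```

After coercing, rerun the missing-value check, since every value that failed to parse is now counted as missing.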
Exploratory data analysis (EDA)
Summary statistics
Generate summary statistics to understand your data:
df.describe()
Visualize data distributions
Create a histogram of property prices:
plt.figure(figsize=(10, 6))
sns.histplot(df['price'], kde=True)
plt.title('Distribution of property prices')
plt.show()
Compare prices by location
Use a boxplot to compare the distribution of property prices across locations:
plt.figure(figsize=(14, 8))
sns.boxplot(x='location', y='price', data=df)
plt.title('Property prices by location')
plt.xticks(rotation=45)
plt.show()
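To read exact numbers alongside the boxplot, you can compute per-location statistics with groupby. This is a small sketch on made-up data, using the same location and price column names as the plot.

```python
import pandas as pd

# Made-up sample with two locations.
sample_df = pd.DataFrame({
    "location": ["Downtown", "Downtown", "Suburb", "Suburb"],
    "price": [250000, 350000, 180000, 220000],
})

# Count, median, and mean price for each location.
location_summary = sample_df.groupby("location")["price"].agg(["count", "median", "mean"])
print(location_summary)
```

The count column is worth checking first: a median based on only a handful of listings says little about the location as a whole.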
Calculating key metrics
Gross rental yield
Calculate the gross rental yield for each property, defined as annual rent divided by price and expressed as a percentage:
df['gross_rental_yield'] = (df['annual_rent'] / df['price']) * 100
df[['price', 'annual_rent', 'gross_rental_yield']].head()
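As a worked example of the formula, a property priced at 250,000 with annual rent of 18,000 yields 18,000 / 250,000 × 100 = 7.2%. The sketch below applies the same calculation to made-up data and, as an added precaution not in the original step, treats non-positive prices as missing so they do not produce infinite yields.

```python
import pandas as pd

# Made-up sample; the last row has an invalid price of 0.
sample_df = pd.DataFrame({
    "price": [250000, 180000, 0],
    "annual_rent": [18000, 12600, 9000],
})

# Replace non-positive prices with NaN so the division yields NaN, not inf.
valid_price = sample_df["price"].where(sample_df["price"] > 0)

sample_df["gross_rental_yield"] = ((sample_df["annual_rent"] / valid_price) * 100).round(2)
print(sample_df)
```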
Cap rate
Calculate the cap rate for each property, defined as net operating income divided by price and expressed as a percentage:
df['cap_rate'] = (df['net_operating_income'] / df['price']) * 100
df[['price', 'net_operating_income', 'cap_rate']].head()
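If your dataset has no net_operating_income column, you can derive it as income minus operating expenses. The sketch below assumes a hypothetical annual_operating_expenses column, which is not part of the original dataset description, and treats annual rent as the only income. The sample values are made up.

```python
import pandas as pd

# Made-up sample; annual_operating_expenses is a hypothetical column.
sample_df = pd.DataFrame({
    "price": [250000, 180000],
    "annual_rent": [18000, 12600],
    "annual_operating_expenses": [5500, 4500],
})

# Net operating income: rental income minus operating expenses.
sample_df["net_operating_income"] = (
    sample_df["annual_rent"] - sample_df["annual_operating_expenses"]
)

# Cap rate: net operating income as a percentage of price.
sample_df["cap_rate"] = (
    (sample_df["net_operating_income"] / sample_df["price"]) * 100
).round(2)
print(sample_df[["price", "net_operating_income", "cap_rate"]])
```

Because operating expenses are subtracted first, the cap rate is always at or below the gross rental yield for the same property.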
Visualizing investment opportunities
Scatter plot of price vs gross rental yield
Create a scatter plot to visualize the relationship between property price and gross rental yield:
plt.figure(figsize=(10, 6))
sns.scatterplot(x='price', y='gross_rental_yield', data=df)
plt.title('Price vs gross rental yield')
plt.show()
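To put a number on the relationship shown in the scatter plot, you can compute the Pearson correlation between the two columns. This is a small sketch on made-up data in which yield falls steadily as price rises; your own data will likely be noisier.

```python
import pandas as pd

# Made-up sample: yield decreases linearly as price increases.
sample_df = pd.DataFrame({
    "price": [100000, 200000, 300000, 400000],
    "gross_rental_yield": [9.0, 8.0, 7.0, 6.0],
})

# Pearson correlation: -1 is a perfect negative linear relationship, +1 a perfect positive one.
correlation = sample_df["price"].corr(sample_df["gross_rental_yield"])
print(round(correlation, 2))
```

A value near zero suggests that price alone tells you little about yield in your dataset.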
Boxplot of cap rate by location
Visualize cap rate distributions by location:
plt.figure(figsize=(14, 8))
sns.boxplot(x='location', y='cap_rate', data=df)
plt.title('Cap rate by location')
plt.xticks(rotation=45)
plt.show()
Making data-driven decisions
Identify the best investment opportunities
Sort properties by gross rental yield in descending order and keep the top 10:
best_yield_properties = df.sort_values(by='gross_rental_yield', ascending=False).head(10)
best_yield_properties
Filter properties based on criteria
Filter properties with a cap rate greater than a specified threshold (5% in this example):
cap_rate_threshold = 5.0
good_investments = df[df['cap_rate'] > cap_rate_threshold]
good_investments
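In practice you will often combine several criteria. The sketch below adds a hypothetical budget limit (max_price is an assumed name and value, not part of the original walkthrough) and sorts the matches by yield. The sample values are made up.

```python
import pandas as pd

# Made-up sample with precomputed metrics.
sample_df = pd.DataFrame({
    "location": ["Downtown", "Suburb", "Suburb", "Downtown"],
    "price": [250000, 180000, 400000, 300000],
    "cap_rate": [5.5, 6.2, 6.5, 4.1],
    "gross_rental_yield": [7.2, 8.0, 8.4, 6.0],
})

cap_rate_threshold = 5.0
max_price = 350000  # hypothetical budget limit

# Wrap each condition in parentheses; & combines them row by row.
matches = (sample_df["cap_rate"] > cap_rate_threshold) & (sample_df["price"] <= max_price)

shortlist = sample_df[matches].sort_values(by="gross_rental_yield", ascending=False)
print(shortlist)
```

Use | instead of & when a property should pass if it meets either condition.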
Saving and sharing your work
Export your results
Save your results, including the calculated metrics, to a new CSV file:
df.to_csv('real_estate_analysis_results.csv', index=False)
Share your notebook or create a Deepnote app
- Click the “Share” button in Deepnote to share your notebook with others. You can provide view or edit access to your collaborators.
- Click "Create app" on the right side of the notebook, configure it, and share it.
Conclusion and next steps
Review findings
Summarize your key findings and insights from the analysis. Highlight the best investment opportunities and any significant patterns observed.
Further analysis
Consider more advanced analyses, such as predictive modeling for property prices, and incorporate external factors like market trends or economic indicators to enhance your analysis. This walkthrough provides a foundation for analyzing real estate data and making informed investment decisions.
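As a starting point for predictive modeling, the sketch below fits a straight line to price as a function of size using NumPy's least-squares polynomial fit. It assumes a hypothetical square_feet column that is not part of the dataset used above, and the sample values are made up to be perfectly linear. A real model would need more features and a proper train/test split.

```python
import numpy as np
import pandas as pd

# Made-up sample; square_feet is a hypothetical column.
sample_df = pd.DataFrame({
    "square_feet": [800, 1000, 1200, 1500],
    "price": [170000, 200000, 230000, 275000],
})

# Fit price = slope * square_feet + intercept by least squares.
slope, intercept = np.polyfit(
    sample_df["square_feet"].to_numpy(),
    sample_df["price"].to_numpy(),
    deg=1,
)

# Estimate the price of a hypothetical 1,100 square foot property.
predicted_price = slope * 1100 + intercept
print(round(predicted_price))
```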