How old are S&P 500 companies?
import re

import pandas as pd
import plotly.express as px
# Load the S&P 500 companies from a local CSV file
df_sp_500 = pd.read_csv("sp500.csv")
df_sp_500
# Extract the founding year: keep the first token before a space or a slash
df_sp_500["founded_number"] = df_sp_500["founded"].apply(lambda x: int(re.split(r" |/", x)[0]))
fig = px.histogram(df_sp_500, x="founded_number", color="sector")
fig.update_layout(bargap=0.2)
fig.show()
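The year-parsing step in the cell above can be tried in isolation. This is a minimal sketch: `parse_founded_year` is a hypothetical helper name, and the sample strings are made up to show the kinds of formats the split would handle. The actual values in `sp500.csv` are not shown here, so treat the formats as assumptions.

```python
import re


def parse_founded_year(value: str) -> int:
    """Return the first year in a 'founded' string (hypothetical helper)."""
    # Split on a space or a slash and keep the first token,
    # mirroring the lambda used for the founded_number column.
    return int(re.split(r" |/", value)[0])


# Made-up sample values (assumed formats, not taken from sp500.csv).
samples = ["1902", "2013 (1888)", "1994/2015"]
years = [parse_founded_year(s) for s in samples]
print(years)
```

Note that this keeps only the first year in the string, so a value that lists a later restructuring date first would make the company look younger than it is.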
Let's run a SQL query against the pandas DataFrame to get the youngest companies in the index:
SELECT symbol, name, sector, founded
FROM df_sp_500
ORDER BY founded_number DESC
LIMIT 10
Let's do the same for the oldest ones. Did you know that the company that makes Colgate toothpaste is over 200 years old? 😲
SELECT symbol, name, sector, founded
FROM df_sp_500
ORDER BY founded_number ASC
LIMIT 10
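If your notebook doesn't support SQL cells on DataFrames, the two queries above can probably be expressed in plain pandas. The sketch below uses a small toy DataFrame with made-up rows as a stand-in for `df_sp_500`, and takes the top 2 instead of the top 10 so the result is easy to check by eye.

```python
import pandas as pd

# Toy stand-in for df_sp_500; every row here is made up for illustration.
toy = pd.DataFrame(
    {
        "symbol": ["AAA", "BBB", "CCC", "DDD"],
        "name": ["Alpha", "Beta", "Gamma", "Delta"],
        "sector": ["Tech", "Health", "Tech", "Energy"],
        "founded": ["1990", "2015", "1850", "2001"],
        "founded_number": [1990, 2015, 1850, 2001],
    }
)
columns = ["symbol", "name", "sector", "founded"]

# ORDER BY founded_number DESC LIMIT 2 -> youngest companies first
youngest = toy.sort_values("founded_number", ascending=False).head(2)[columns]

# ORDER BY founded_number ASC LIMIT 2 -> oldest companies first
oldest = toy.sort_values("founded_number", ascending=True).head(2)[columns]

print(youngest)
print(oldest)
```

On the real data you would swap `toy` for `df_sp_500` and `head(2)` for `head(10)`.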
How long do you have to hustle until your company makes it to the top?
from datetime import date
date.today().year - df_sp_500["founded_number"].median()
That doesn't sound good. Let's look at the median age by sector:
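# Company age in years, computed as a vectorized subtraction
df_sp_500["age"] = date.today().year - df_sp_500["founded_number"]
# Select "age" before aggregating so non-numeric columns are not passed to median()
df_sp_500.groupby("sector")[["age"]].median()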