Guide to manipulating data with Python
Environment: darribas/gds
Preliminary: Load libraries
import numpy as np
import pandas as pd
import geopandas as gpd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib as mpl
import libpysal as ps
Import data
sh = gpd.read_file("gadm36_THA_1.shp")
tdata = pd.read_csv("Testing Cleaned data.csv")
Plot simple map
fig, ax = plt.subplots(figsize=(6,6))
sh.plot(ax=ax, edgecolor='black', facecolor='white')
Part 1: Clean CSV
Slice rows
tdata = tdata.iloc[2:85]
Select column
tdata = tdata.iloc[:, [0, 13]]
Add column name
tdata.columns = ['NAME_1', 'int_t']
Remove NAs
tdata = tdata.dropna()
tdata.head()
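The cleaning steps in Part 1 can be tried on a small stand-in table before running them on the real file. The sketch below is only an illustration: the toy DataFrame, its column names, and the row and column positions are assumptions, not the actual layout of "Testing Cleaned data.csv".

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the raw CSV: two junk rows on top,
# a name column at position 0 and a value column at position 2.
raw = pd.DataFrame({
    "col_a": ["header junk", "more junk", "Bangkok", "Chiang Mai", "Phuket"],
    "col_b": ["x", "x", "x", "x", "x"],
    "col_c": [np.nan, np.nan, 10.0, np.nan, 30.0],
})

clean = raw.iloc[2:5]                 # slice rows by position (end is exclusive)
clean = clean.iloc[:, [0, 2]]         # keep columns by position
clean.columns = ["NAME_1", "int_t"]   # rename to match the shapefile key
clean = clean.dropna()                # drop rows with a missing value
clean = clean.reset_index(drop=True)  # tidy the index after slicing

print(clean)
```

Note that dropna() removes any row with a missing value in either column, so a province with no recorded value disappears from the table rather than appearing as an empty polygon later.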
Part 2: Clean shapefile
sh.head()
Select column
sh = sh[['NAME_1','geometry']]
sh
Part 3: Merge shapefile and dataframe
df = tdata.merge(sh, on='NAME_1', how='left')
df
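A left merge keeps every row of the left table, so names that do not match the shapefile exactly end up with a missing geometry. One way to check for this is the indicator option of merge. The sketch below uses two small hypothetical tables (the names, values, and placeholder geometry strings are assumptions) and is not part of the original workflow.

```python
import pandas as pd

# Hypothetical tables standing in for tdata and the shapefile attributes.
# "Phuket " carries a trailing space, a common cause of failed matches.
left = pd.DataFrame({
    "NAME_1": ["Bangkok", "Chiang Mai", "Phuket "],
    "int_t": [10, 20, 30],
})
right = pd.DataFrame({
    "NAME_1": ["Bangkok", "Chiang Mai", "Phuket"],
    "geometry": ["g1", "g2", "g3"],  # placeholder strings, not real geometries
})

# indicator=True adds a "_merge" column recording where each row came from.
check = left.merge(right, on="NAME_1", how="left", indicator=True)
unmatched = check.loc[check["_merge"] == "left_only", "NAME_1"].tolist()
print(unmatched)

# Stripping whitespace from the key resolves this particular mismatch.
left["NAME_1"] = left["NAME_1"].str.strip()
fixed = left.merge(right, on="NAME_1", how="left", indicator=True)
n_unmatched_after = int((fixed["_merge"] == "left_only").sum())
print(n_unmatched_after)
```

Spelling differences between sources (for example alternative romanisations of province names) would need a manual mapping instead of a whitespace strip.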
Part 4: Save gpkg
df = gpd.GeoDataFrame(df, geometry='geometry', crs=sh.crs)
df.to_file("output.gpkg", driver="GPKG")
Other: Remove duplicate rows
sh = sh.drop_duplicates()
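Duplicates matter most in the merge key: if a name appears more than once in the right-hand table, a left merge repeats the matching left row once per occurrence. The sketch below, on a hypothetical table with placeholder geometry strings, counts duplicated keys and drops them. Keeping the first occurrence is an assumption; it may not be the right choice for every dataset.

```python
import pandas as pd

# Hypothetical shapefile attributes with one repeated province name.
shapes = pd.DataFrame({
    "NAME_1": ["Bangkok", "Bangkok", "Phuket"],
    "geometry": ["g1", "g1", "g3"],  # placeholder strings, not real geometries
})

# Count rows whose key repeats an earlier row.
n_dupes = int(shapes.duplicated(subset="NAME_1").sum())
print(n_dupes)

# Keep the first occurrence of each key and reset the index.
deduped = shapes.drop_duplicates(subset="NAME_1").reset_index(drop=True)
print(deduped)
```

Running a check like this before Part 3 rather than afterwards avoids repeated rows in the merged output.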