Working with Shapefiles

Shapefiles are an old file format, originally developed by ESRI, which has become a common way of working with geospatial data, much to the chagrin of ESRI, who have been trying to migrate users to a Geodatabase format ever since. A shapefile is identified by its .shp extension but can comprise up to 17 different files, each adding supporting information such as the projection or the attribute table (Z values, where present, are stored in the .shp itself). The four critical file extensions for a shapefile to function correctly are:

  • .shp – this contains the feature geometry
  • .shx – this contains the index of the geometry records in the .shp
  • .prj – this contains the projection information of the shapefile
  • .dbf – this contains the data table.
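A quick way to confirm that a shapefile has all four of the files listed above is to look for them next to the .shp. The following is a minimal sketch using only the standard library; the function name `missing_sidecars` is made up for illustration, and it assumes lowercase extensions.

```python
from pathlib import Path

# The four extensions this article treats as critical.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".prj", ".dbf")


def missing_sidecars(shp_path):
    """Return the critical extensions that are absent next to shp_path.

    Hypothetical helper: it only checks that the files exist,
    not that their contents are valid.
    """
    base = Path(shp_path)
    return [ext for ext in REQUIRED_EXTENSIONS
            if not base.with_suffix(ext).exists()]
```

Calling `missing_sidecars("my_folder/test003.shp")` returns an empty list when all four files are present, and otherwise lists the extensions that still need to be found or regenerated.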

Geometry types

ESRI publishes documentation on the shapefile format. In essence, a shapefile contains one geometry type only, which will be one of:

  • Points – a single xy coordinate
  • Lines – two or more xy coordinates
  • Polygons – a start xy, any number of intermediate xy and a closing xy, equal to the start, which completes the geometry.
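Because a shapefile holds a single geometry type, that type is recorded once in the file header. As a rough sketch based on the published ESRI header layout (a 100-byte header, a big-endian file code of 9994 at byte 0 and a little-endian shape type at byte 32), it can be read with the standard library alone. The function name `read_shape_type` is illustrative.

```python
import struct

# Shape type codes from the ESRI shapefile specification.
SHAPE_TYPES = {
    0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint",
    11: "PointZ", 13: "PolyLineZ", 15: "PolygonZ", 18: "MultiPointZ",
    21: "PointM", 23: "PolyLineM", 25: "PolygonM", 28: "MultiPointM",
    31: "MultiPatch",
}


def read_shape_type(shp_path):
    """Return the name of the geometry type declared in a .shp header."""
    with open(shp_path, "rb") as f:
        header = f.read(100)  # the main file header is 100 bytes long
    if len(header) < 100:
        raise ValueError("file is too short to be a shapefile")
    (file_code,) = struct.unpack(">i", header[0:4])  # big-endian integer
    if file_code != 9994:
        raise ValueError("not a shapefile: unexpected file code")
    (shape_type,) = struct.unpack("<i", header[32:36])  # little-endian integer
    return SHAPE_TYPES.get(shape_type, f"Unknown ({shape_type})")
```

Note that the specification calls lines "PolyLine", and that the Z and M variants are still a single type per file.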

There are some more advanced types, such as multipart polygons and donut polygons (polygons with holes), which are worth reading more about as this is a genuinely interesting subject within geospatial data.
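To make the donut case a little more concrete: in the ESRI specification a polygon is a set of closed rings, with outer rings wound clockwise and holes wound counter-clockwise. The sketch below uses the shoelace formula to tell the two apart. The function names and the example coordinates are illustrative assumptions, not part of any library.

```python
def signed_area(ring):
    """Shoelace formula: negative for clockwise rings, positive for
    counter-clockwise rings (with y increasing upwards)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def is_closed(ring):
    """A valid ring has at least four points and ends where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def ring_role(ring):
    """Classify a ring using the shapefile winding convention."""
    if not is_closed(ring):
        raise ValueError("ring must be closed")
    return "outer" if signed_area(ring) < 0 else "hole"


# A 10 x 10 square (clockwise) with a 2 x 2 hole (counter-clockwise).
outer = [(0, 0), (0, 10), (10, 10), (10, 0), (0, 0)]
hole = [(2, 2), (4, 2), (4, 4), (2, 4), (2, 2)]
donut_area = abs(signed_area(outer)) - abs(signed_area(hole))
```

Here `donut_area` is the area of the square minus the area of the hole. Libraries such as Shapely handle this bookkeeping for you, so a hand-rolled check like this is mainly useful for understanding what is stored in the file.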

5 Steps to creating a custom shapefile

Step 1 The easiest way to create a shapefile is to download the application QGIS, which runs on Mac, Linux and Windows.

Step 2 Open up QGIS and bring up the shapefile creation dialogue (Layer > Create Layer > New Shapefile Layer).

Step 3 Create a new folder to contain all of your shapefile's files and give the file a name. Mine here is test003.

Step 4 Toggle editing on the new layer, add vertices to draw your feature and save the edits.

Step 5 Go to the file system and you will see the new shapefile, together with its companion files, in the folder you created.

I should say that there are many reasons why a shapefile is not the ideal data format (field names are capped at 10 characters and each file at 2 GB, for example), but it is very useful for quick data edits or shaping a polygon.

Writing a .shp file with Python

To work with a shapefile programmatically, and outside of the ESRI ecosystem, you need to lean on a few libraries.

  • Shapely, which deals with geometry operations
  • Fiona, which handles the reading and writing, and, most terrifyingly,
  • pyproj for all your projections and transformations

However, we can also just go ahead and use GeoPandas, which brings all of the above libraries into the Pandas ecosystem for data munging.
