Python Geospatial Development Essentials

Utilize Python with open source libraries to build a lightweight, portable, and customizable GIS desktop application

Karim Bahgat

BIRMINGHAM - MUMBAI

Copyright © 2015 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: June 2015

Production reference: 1100615

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.

ISBN 978-1-78217-540-7

  

  

Credits

Author: Karim Bahgat

Reviewers: Gregory Giuliani, Jorge Samuel Mendes de Jesus, Athanasios Tom Kralidis, John Maurer, Adrian Vu

Commissioning Editor: Amarabha Banerjee

Acquisition Editors: Larissa Pinto, Rebecca Youé

Content Development Editor: Merwyn D'souza

Technical Editor: Prajakta Mhatre

Copy Editor: Charlotte Carneiro

Project Coordinator: Neha Bhatnagar

Proofreader: Safis Editing

Indexer: Rekha Nair

Production Coordinator: Manu Joseph

Cover Work: Manu Joseph

About the Author

Karim Bahgat holds an MA in peace and conflict transformation from the University of Tromsø in Norway, where he focused on the use of geographic information systems (GIS), opinion survey data, and open source programming tools in conflict studies. Since then, he has been employed as a research assistant for technical and geospatial work at the Peace Research Institute Oslo (PRIO) and the International Law and Policy Institute (ILPI). Karim was part of the early prototyping of the PRIO-GRID unified spatial data structure for social science and conflict research, and is currently helping develop a new updated version.

His main use of technology, as a developer, has been with Python programming, geospatial tools and mapping, the geocoding of textual data, data visualization, application development, and some web technology. Karim is the author of a journal article publication, numerous data- and GIS-oriented Python programming libraries, the Easy Georeferencer free geocoding software, and several related technical websites.

I am very grateful for the detailed feedback, suggestions, and troubleshooting of chapters from the reviewers; the encouragement and guidance from the publisher's administrators and staff; and the patience and encouragement from friends, family, colleagues, and loved ones (especially my inspirational sidekicks, Laura and Murdock). I also want to thank all my teachers at the Chapman University and University of North Dakota, who got me here in the first place. They helped me think out of the box and led me into this wonderful world of geospatial technology.

About the Reviewers

Gregory Giuliani is a geologist with a PhD in environmental sciences (theme:

spatial data infrastructure for the environment). He is a senior scientific associate at the

University of Geneva (Switzerland) and the focal point for spatial data infrastructure (SDI) at GRID-Geneva. He is the manager of the EU/FP7 EOPOWER project and the

work package leader in the EU/FP7 enviroGRIDS and AfroMaison projects, where he

coordinates SDI development and implementation. He also participated in the EU/

FP7 ACQWA project and is the GRID-Geneva lead developer of the PREVIEW Global

capacity building material on SDI for enviroGRIDS and actively participates and contributes to various activities of the Global Earth Observation System of Systems

(GEOSS). Specialized in OGC standards, interoperability, and brokering technology for

environmental data and services, he is the coordinator of the Task ID-02 "Developing

Institutional and Individual Capacity" for GEO/GEOSS.

  Jorge Samuel Mendes de Jesus has 15 years of programming experience in the field of Geoinformatics, with a focus on Python programming, OGC web services, and spatial databases.

  

He has a PhD in geography and sustainable development from Ben-Gurion University

of the Negev, Israel. He has been employed by the Joint Research Center (JRC), Italy, where he worked on projects such as EuroGEOSS, Intamap, and Digital Observatory for Protected Areas (DOPA). He continued his professional career at Plymouth

Marine Laboratory, UK, as a member of the Remote Sensing Group contributing to the

NETMAR project and actively promoting the implementation of the WSDL standard

in PyWPS. He currently works at ISRIC—World Soil Information in the Netherlands,

where he supports the development of Global Soil Information Facilities (GSIF).

  Athanasios Tom Kralidis is a senior systems scientist for the Meteorological Service of Canada, where he provides geospatial technical and architectural leadership in support of MSC's data. Tom's professional background includes key

involvement in the development and integration of geospatial standards, systems,

and services for the

Resources Canada. He also uses these principles in architecting in support of the WMO Global Atmospheric Watch.

(OGC) community, and was lead contributor to the OGC Web Map Context Documents Specification. He was also a member of the CGDI Architecture Advisory Board, as well as part of the Canadian Advisory Committee to ISO Technical Committee 211 Geographic information/Geomatics. open source software projects, and part of the MapServer Project Steering Committee. He holds a bachelor's degree in geography from York University, a GIS certification from Algonquin College, and a master's degree in geography and environmental studies (research and dissertation in geospatial web services/infrastructure) from Carleton University. Tom is a certified Geomatics Specialist (GIS/LIS) with the Canadian Institute of Geomatics.

  John Maurer is a programmer and data manager at the Pacific Islands Ocean Observing System (PacIOOS) in Honolulu, Hawaii. He creates and configures web interfaces and data services to provide access, visualization, and mapping of oceanographic data from a variety of sources, including satellite remote sensing,

forecast models, GIS layers, and in situ observations (buoys, sensors, shark tracking,

and so on) throughout the insular Pacific. He obtained a graduate certificate in remote sensing, as well as a master's degree in geography from the University of

Colorado at Boulder, where he developed software to analyze ground-penetrating

radar (GPR) for snow accumulation measurements on the Greenland ice sheet.

  

While in Boulder, he worked with the National Snow and Ice Data Center (NSIDC)

for 8 years, sparking his initial interest in earth science and all things geospatial; an unexpected but comfortable detour from his undergraduate degree in music, science, and technology at Stanford University.

  Adrian Vu is a web and mobile developer based in Singapore, and has over

10 years of experience working on various projects for start-ups and organizations.

  He holds a BSc in information systems management (majoring in business

intelligence and analytics) from Singapore Management University. Occasionally, he

likes to dabble in new frameworks and technologies, developing many useful apps

for all to use and play with.



Preface

Python has become the language of choice for many in the geospatial industry. Some use Python as a way to automate their workflows in software, such as ArcGIS or QGIS. Others play around with the nuts and bolts of Python's immense variety of third-party open source geospatial toolkits.

Given all the programming tools available and the people already familiar with geospatial software, there is no reason why you should have to choose either one or the other. Programmers can now develop their own applications from scratch to better suit their needs. Python is, after all, known as a language for rapid development.

By developing your own application, you can have fun with it, experiment with new visual layouts and creative designs, create platforms for specialized workflows, and tailor to the needs of others.

What this book covers

Chapter 1, Preparing to Build Your Own GIS Application, talks about the benefits of developing a custom geospatial application and describes how to set up your development environment and create your application folder structure.

Chapter 2, Accessing Geodata, implements the crucial data loading and saving capabilities of your application for both vector and raster data.

Chapter 3, Designing the Visual Look of Our Application, creates and puts together the basic building blocks of your application's user interface, giving you a first look at what your application will look like.

Chapter 4, Rendering Our Geodata, adds rendering capabilities so that the user can interactively view, zoom, and pan data inside the application.

Chapter 5, Managing and Organizing Geographic Data, creates basic functionality for splitting, merging, and cleaning both vector and raster data.

Chapter 6, Analyzing Geographic Data, develops basic analysis functionality, such as overlay statistics, for vector and raster data.

Chapter 7, Packaging and Distributing Your Application, wraps it all up by showing you how to share and distribute your application, so it is easier for you or others to use it.

Chapter 8, Looking Forward, considers how you may wish to proceed to further build on, customize, and extend your basic application into something more elaborate or specialized in whichever way you want.

What you need for this book

There are no real requirements for this book. However, to keep the book short and sweet, the instructions assume that you have a Windows operating system. If you are on Mac OS X or Linux, you should still be able to create and run the application, but you will have to figure out the equivalent installation instructions for your operating system. You may be forced to deal with compiling C++ code and face the potential of unexpected errors. All other installations will be covered throughout the book, including which Python version to use.

Who this book is for

This book is ideal for Python programmers and software developers who are tasked with or wish to make a customizable special-purpose GIS application, or are interested in expanding their knowledge of working with spatial data cleaning, analysis, or map visualization. Analysts, political scientists, geographers, and GIS specialists who are seeking a creative platform to experiment with cutting-edge spatial analysis, but who are still only beginners in Python, will also find this book beneficial. Familiarity with Tkinter application development in Python is preferable but not mandatory.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Download the Shapely wheel file that fits our system, looking something like Shapely-1.5.7-cp27-none-win32.whl."

A block of code is set as follows:

    class LayerGroup:
        def __init__(self):
            self.layers = list()
            self.connected_maps = list()

        def __iter__(self):
            for layer in self.layers:
                yield layer

        def add_layer(self, layer):
            self.layers.append(layer)

        def move_layer(self, from_pos, to_pos):
            layer = self.layers.pop(from_pos)
            self.layers.insert(to_pos, layer)

        def remove_layer(self, position):
            self.layers.pop(position)

        def get_position(self, layer):
            return self.layers.index(layer)

Any command-line input or output is written as follows:

    >>> import PIL, PIL.Image
    >>> img = PIL.Image.open("your/path/to/icon.png")
    >>> img.save("your/path/to/pythongis/app/icon.ico", sizes=[(255,255),(128,128),(64,64),(48,48),(32,32),(16,16),(8,8)])

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Click on the Inno Setup link on the left side."

Warnings or important notes appear in a box like this.

Tips and tricks appear like this.


Preparing to Build Your Own GIS Application

You are here because you love Python programming and are interested in making your own Geographic Information Systems (GIS) application. You want to create a desktop application, in other words, a user interface, that helps you or others create, process, analyze, and visualize geographic data. This book will be your step-by-step guide toward that goal.

We assume that you are someone who enjoys programming and being creative but are not necessarily a computer science guru, Python expert, or seasoned GIS analyst. To successfully proceed with this book, it is recommended that you have a basic introductory knowledge of Python programming that includes classes, methods, and the Tkinter toolkit, as well as some core GIS concepts. If you are a newcomer to some of these, we will still cover some of the basics, but you will need to have the interest and ability to follow along at a fast pace.

In this introductory chapter, you will do the following:

  • Learn some of the benefits of creating a GIS application from scratch.
  • Set up your computer, so you can follow the book instructions.
  • Become familiar with the roadmap toward creating our application.

Why reinvent the wheel?

The first step in preparing ourselves for this book is in convincing ourselves why we want to make our own GIS application, as well as to be clear about our motives. Spatial analysis and GIS have been popular for decades and there is plenty of GIS software out there, so why go through the trouble of reinventing the wheel? Firstly, we aren't really reinventing the wheel, since Python can be extended with plenty of third-party libraries that take care of most of our geospatial needs (more on that later).

For me, the main motivation stems from the problem that most of today's GIS applications are aimed at highly capable and technical users who are well-versed in GIS or computer science, packed with a dizzying array of buttons and options that will scare off many an analyst. We believe that there is a virtue in trying to create a simpler and more user-friendly software for beginner GIS users or even the broader public, without having to start completely from scratch. This way, we also add more alternatives for users to choose from, as supplements to the current GIS market dominated by a few major giants, notably ArcGIS and QGIS, but also others such as GRASS, uDig, gvSIG, and more.

Another particularly exciting reason to create your own GIS from scratch is to make your own domain-specific special-purpose software for any task you can imagine, whether it is a water flow model GIS, an ecological migrations GIS, or even a GIS for kids. Such specialized tasks, which would usually require many arduous steps in an ordinary GIS, could be greatly simplified into a single button and accompanied with suitable functionality, design layout, icons, and colors. One such example is the Crime Analytics for Space-Time (CAST) software produced by the GeoDa Center at Arizona State University, seen in the following picture:

Also, by creating your GIS from scratch, it is possible to have greater control of the size and portability of your application. This can enable you to go small, letting your application have faster startup time and travel the Internet or on a USB stick easily. Although storage space itself is not as much of an issue these days, from a user's perspective, installing a 200 MB application is still a greater psychological investment with a greater toll in terms of willingness to try it than a mere 30 MB application (all else being equal). This is particularly true in the realm of smartphones and tablets, a very exciting market for special-purpose geospatial apps. While the specific application we make in this book will not be able to run on iOS or Android devices, it will run on Windows 8-based hybrid tablets, and can be rebuilt around a different GUI toolkit in order to support iOS or Android (we will mention some very brief suggestions for this in Chapter 8, Looking Forward).

Finally, the utility and philosophy of free and open source software may be an important motivation for some of you. Many people today learn to appreciate open source GIS after losing access to subscription-based applications like ArcGIS when they complete their university education or change their workplace. By developing your own open source GIS application and sharing it with others, you can contribute back to and become part of the community that once helped you.

Setting up your computer

In this book, we follow the steps to build an application in a Windows environment. This does not mean that the application cannot be developed on Mac OS X or Linux, but those platforms may have slightly different installation instructions and may require compiling binary code, which is outside the scope of this book. Therefore, we leave that choice up to the reader. Because this book focuses on Windows, we avoid the problem of compiling altogether by using precompiled versions where possible (more on this later).

The development process itself will be done using Python 2.7, specifically the 32-bit version, though 64-bit can theoretically be used as well (note that this is the bit version of your Python installation and has nothing to do with the bit version of your operating system). Although many newer versions exist, version 2.7 is the most widely supported in terms of being able to use third-party packages. It has also been reported that version 2.7 will continue to be actively developed and promoted until the year 2020. It will still be possible to use after support has ended.
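Since the distinction between the bit version of Python and that of the operating system is easy to mix up, the following is a minimal sketch of how one might check which interpreter is actually running. The function name describe_interpreter is made up for this illustration and is not part of the book's code; the sketch relies only on the standard sys and struct modules.

```python
import struct
import sys


def describe_interpreter():
    """Return (major, minor, bits) for the running Python interpreter.

    Hypothetical helper for illustration only.
    """
    # The size of a C pointer ("P") in bytes reflects the interpreter
    # build: 4 bytes for a 32-bit Python, 8 bytes for a 64-bit Python.
    bits = struct.calcsize("P") * 8
    return sys.version_info[0], sys.version_info[1], bits


if __name__ == "__main__":
    major, minor, bits = describe_interpreter()
    print("Python %d.%d, %d-bit" % (major, minor, bits))
    if (major, minor) != (2, 7):
        print("Note: the instructions in this book assume Python 2.7.")
```

The reported bit count describes the Python build itself, so a 32-bit Python installed on a 64-bit Windows system would still report 32.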

If you do not already have version 2.7, install it now, by following these steps:

   .

2. Under Downloads, click to download the latest 32-bit version of Python 2.7 for Windows, which at the time of this writing is Python 2.7.9.

3. Download and run the installation program.

  For the actual code writing and editing, we will be using Python's built-in Integrated Development and Learning Environment (IDLE), but you may of course use any code editor you want. IDLE lets you write long scripts that can be saved to files and offers an interactive shell window to execute one line at a time. There should be a desktop or start-menu link to Python IDLE after installing Python.

Installing third-party packages

In order to make our application, we will have to rely on the rich and varied ecosystem of third-party packages that already exists for GIS usage.

The Python Package Index (PyPI) website currently lists more than 240 packages tagged Topic :: Scientific/Engineering :: GIS. For a less overwhelming overview of the more popular GIS-related Python libraries, check out the catalogue at the Python-GIS-Resources website created by the author.

  We will have to define which packages to use and install, and this depends on the type of application we are making. What we want to make in this book is a lightweight, highly portable, extendable, and general-purpose GIS application.

For these reasons, we avoid heavy packages like GDAL, NumPy, Matplotlib, SciPy,

and Mapnik (weighing in at about 30 MB each or about 150-200 MB if we combine

them all together). Instead, we focus on lighter third-party packages specialized for

each specific functionality.

  Dropping these heavy packages is a bold decision, as they contain a lot of functionality, and are reliable, efficient, and a dependency for many other packages. If you decide that you want to use them in an application where size is not an issue, you may want to begin now by installing the multipurpose NumPy and possibly SciPy, both of which have easy-to-use installers from their official websites. The other heavy packages will be briefly revisited in later chapters.

Specific installation instructions are given for each package in the chapter where they are relevant (see the following table for an overview) so that if you do not want certain functionalities, you can ignore those installations. Because we focus on making a basic and lightweight application, we will only be installing a small number of packages. However, we will provide suggestions throughout the book about other relevant packages that you may wish to add later on.

  

Chapter  Installation  Purpose
1        Python
1        PIL           Raster data, management, and analysis
1        Shapely       Vector management and analysis
2        PyShp         Data
2        PyGeoj        Data
2        Rtree         Vector data speedup
4        PyAgg         Visualization
7        Py2exe        Application distribution

The typical way to install Python packages is using pip (included with Python 2.7.9 and later), which downloads and installs packages directly from the Python Package Index website. Pip is used in the following way:

  • Step 1: open your operating system's command line (not the Python IDLE). On Windows, this is done by searching your system for cmd.exe and running it.
  • Step 2: in the black screen window that pops up, simply type pip install packagename. This will only work if pip is on your system's environment path. If this is not the case, a quick fix is to type the full path to the pip script, C:\Python27\Scripts\pip, instead of just pip.

  For C or C++ based packages, it is becoming increasingly popular to make them available as precompiled wheel files ending in .whl, which has caused some confusion on how to install them. Luckily, we can use pip to install these wheel files as well, by simply downloading the wheel and pointing pip to its file path.

Since some of our dependencies have multiple purposes and are not unique to just one chapter, we will install these ones now. One of them is the Python Imaging Library (PIL), which we will use for the raster data model and for visualization. Let's go ahead and install PIL for Windows now:

  1.

  2. Click on the latest .exe file link for our 32-bit Python 2.7 environment to download the PIL installer, which is currently Pillow-2.6.1.win32-py2.7.exe.

  3. Run the installation file.

  4. Open the IDLE interactive shell and type import PIL to make sure it was installed correctly.

Another central package we will be using is Shapely, used for location testing and geometric manipulation. To install it on Windows, perform the following steps:

  1. .

  2. Download the Shapely wheel file that fits our system, looking something like Shapely-1.5.7-cp27-none-win32.whl.

  3. As described earlier, open a command line window and type C:\Python27\Scripts\pip install path\to\Shapely-1.5.7-cp27-none-win32.whl to unpack the precompiled binaries.

  4. To make sure it was installed correctly, open the IDLE interactive shell and type import shapely.
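The import checks in the steps above can also be done for several packages at once. The following is a minimal sketch of that idea, assuming only the standard importlib module; the helper name check_installed is hypothetical and is not part of the book's code.

```python
import importlib


def check_installed(module_names):
    """Map each module name to True if it can be imported, else False.

    Hypothetical helper for illustration only.
    """
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    # "PIL" and "shapely" are the import names of the two packages
    # installed in this chapter.
    for name, ok in sorted(check_installed(["PIL", "shapely"]).items()):
        print("%-8s %s" % (name, "OK" if ok else "missing"))
```

A package reported as missing usually means that it was installed into a different Python installation than the one running the check.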

Imagining the roadmap ahead

Before we begin developing our application, it is important that we create a vision of how we want to structure it. In Python terms, we will be creating a multilevel package with various subpackages and submodules to take care of different parts of our functionality, independently of any user interface. Only on top of this underlying functionality do we create the visual user interface as a way to access and run that underlying code. This way, we build a solid system and allow power users to access all the same functionality via Python scripting for greater automation and efficiency, as exists for ArcGIS and QGIS.

To set up the main Python package behind our application, create a new folder called pythongis anywhere on your computer. For Python to be able to interpret the folder pythongis as an importable package, it needs to find a file named __init__.py in that folder. Perform the following steps:

  1. Open Python IDLE from the Windows start menu.

  2. The first window to pop up is the interactive shell. To open the script editing window, click on File and New.

  3. Click on File and then Save As.

  4. In the dialog window that pops up, browse into the pythongis folder, type __init__.py as the filename, and click on Save.

There are two main types of GIS data: vector (coordinate-based geometries such as

points, lines, and polygons) and raster (a regularly spaced out grid of data points or

cells, similar to an image and its pixels).
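To make the distinction concrete, here is a minimal sketch of how the two data types might be held in plain Python, without any of the libraries used later in the book. The variable and function names (polygon, grid, ring_area, cell_value) are invented for this illustration.

```python
# Vector: geometries are explicit coordinates. Here, a rectangular
# polygon is given as a closed ring of (x, y) pairs, where the first
# and last vertex are identical.
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]


def ring_area(ring):
    """Return the area of a closed ring using the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Raster: values sit on a regular grid, here 3 rows by 4 columns.
# A cell's location is implied by its row and column position.
grid = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [0, 0, 3, 2],
]


def cell_value(grid, row, col):
    """Return the value stored in the cell at the given row and column."""
    return grid[row][col]


if __name__ == "__main__":
    print(ring_area(polygon))      # area of the 4 by 3 rectangle
    print(cell_value(grid, 1, 2))  # value in the second row, third column
```

The vector geometry stores only its five vertices regardless of how large an area it covers, whereas the raster stores one value per cell, so its size grows with the extent and resolution of the grid.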

  

For a more detailed introduction to the differences between vector and raster data, and other basic GIS concepts, we refer the reader to the book Learning Geospatial Analysis with Python, by Joel Lawhead.

Since vector and raster data are so fundamentally different in all regards, we split our package in two, one for vector and one for raster. Using the same method as earlier, we create two new subpackage folders within the pythongis package: one called vector and one called raster (each with the same aforementioned empty __init__.py file). Thus, the structure of our package will look as follows (note that : package is not part of the folder name):

    pythongis : package
        vector : package
        raster : package

To make our new vector and raster subpackages importable by our top-level pythongis package, we need to add the following relative import statements in pythongis/__init__.py:

    from . import vector
    from . import raster
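The same folder structure can also be created with a short script instead of through the IDLE dialogs. The following is a minimal sketch under that assumption; the function name create_package_skeleton is hypothetical, and the script writes the two relative import statements into the top-level __init__.py file.

```python
import os


def create_package_skeleton(root):
    """Create pythongis/ with vector and raster subpackages under root.

    Hypothetical helper for illustration only. Returns the path to the
    top-level package folder.
    """
    package_dir = os.path.join(root, "pythongis")
    folders = (
        package_dir,
        os.path.join(package_dir, "vector"),
        os.path.join(package_dir, "raster"),
    )
    for folder in folders:
        if not os.path.isdir(folder):
            os.makedirs(folder)
        init_path = os.path.join(folder, "__init__.py")
        if not os.path.exists(init_path):
            # An empty __init__.py marks the folder as a package.
            open(init_path, "w").close()

    # The top-level __init__.py imports the two subpackages.
    with open(os.path.join(package_dir, "__init__.py"), "w") as f:
        f.write("from . import vector\nfrom . import raster\n")
    return package_dir


if __name__ == "__main__":
    import sys
    import tempfile

    root = tempfile.mkdtemp()  # throwaway location for the demo
    create_package_skeleton(root)
    sys.path.insert(0, root)
    import pythongis
    print(pythongis.vector)
    print(pythongis.raster)
```

The demo uses a temporary folder so that it does not touch an existing pythongis folder; pass your own folder path to create the package where you actually want it.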

  

Throughout the course of this book, we will build the functionality of these two data types as a set of Python modules in their respective folders. Eventually, we want to end up with a GIS application that has only the most basic of geospatial tools so that we will be able to load, save, manage, visualize, and overlay data, each of which will be covered in the following chapters.

As far as our final product goes, since we focus on clarity and simplicity, we do not put too much effort into making it fast or memory efficient. This comes from an often repeated saying among programmers, an example of which is found in Structured Programming with go to Statements, ACM Computing Surveys 6 (4):

    "premature optimization is the root of all evil"
    – Donald E. Knuth

This leaves us with software that works best with small files, which in most cases is good enough. Once you have a working application and you feel that you need support for larger files or faster processing, then it's up to you if you want to put in the extra effort of optimization.

The GIS application you end up with at the end of the book is simple but functional, and is meant to serve as a framework that you can easily build on. To leave you with some ideas to pick up on, we placed various information boxes throughout the book with ways that you can optimize or extend your application. For any of the core topics and features that we were not able to cover earlier in the book, we give a broader discussion of missing functionality and future suggestions in the final chapter.

Summary

In this chapter, you learned why you would want to create a GIS application using Python, set up your programming environment, installed some recurring packages, and created your application structure and framework.

In the next chapter, you will take the first step toward making a geospatial application,

by creating a simple yet powerful module for loading and saving some common geospatial data formats from scratch.

  

Accessing Geodata

All GIS processing must start with geographic data, so we begin our application by

building the capacity to interact with, load, and save various geographic file formats.

This chapter is divided into a vector section and a raster section, and in each section, we will cover the following:

  • Firstly, we create a data interface, which means understanding data structures and how to interact with them.
  • Secondly and thirdly, any format-specific differences are outsourced to separate loader and saver modules.

  

This is a lot of functionality to fit into one chapter, but by working your way through,

you will learn a lot about data structures and file formats, and end up with a solid foundation for your application.

The approach

In our efforts to build data access in this chapter, we focus on simplicity, understanding, and lightweight libraries. We create standardized data interfaces

for vector and raster data so that we can use the same methods and expect the same

results on any data, without worrying about file format differences. They are not necessarily optimized for speed or memory efficiency as they load entire files into memory at once.

In our choice of third-party libraries for loading and saving, we focus on format-specific ones, so that we can pick and choose which formats to support and thus maintain a lightweight application. This requires some more work but allows us to learn intricate details about file formats.
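To make the idea of a standardized interface over format-specific loaders more concrete, here is a minimal sketch of dispatching on file extension. It is not the loader module built later in the book: the names load_geojson, LOADERS, and from_file are hypothetical, the sketch assumes Python 3, and it handles only GeoJSON through the standard json module. Further formats would be supported by registering additional format-specific loader functions.

```python
import json
import os
import tempfile


def load_geojson(filepath, encoding="utf8"):
    """Hypothetical format-specific loader for GeoJSON feature collections."""
    with open(filepath, encoding=encoding) as fileobj:
        collection = json.load(fileobj)
    features = collection["features"]
    # collect every property name that occurs, in a stable (sorted) order
    fieldnames = sorted({name for feat in features for name in feat["properties"]})
    rows = [[feat["properties"].get(name) for name in fieldnames] for feat in features]
    geometries = [feat["geometry"] for feat in features]
    return fieldnames, rows, geometries


# hypothetical registry mapping file extensions to format-specific loaders
LOADERS = {".geojson": load_geojson, ".json": load_geojson}


def from_file(filepath, **kwargs):
    """Pick a loader from the file extension and return its standardized result."""
    extension = os.path.splitext(filepath)[1].lower()
    if extension not in LOADERS:
        raise ValueError("Unsupported file format: %r" % extension)
    return LOADERS[extension](filepath, **kwargs)


# usage: write a one-feature GeoJSON file (made-up values) and load it back
example = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"name": "Example", "population": 100},
        "geometry": {"type": "Point", "coordinates": [10.0, 60.0]},
    }],
}
path = os.path.join(tempfile.mkdtemp(), "example.geojson")
with open(path, "w", encoding="utf8") as fileobj:
    json.dump(example, fileobj)

fieldnames, rows, geometries = from_file(path, encoding="utf8")
print(fieldnames)
print(rows)
print(geometries[0]["type"])
```

Whatever the file format, the caller always receives the same three items (field names, rows, and geometries), which is what lets the rest of the application ignore format differences.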

  


  If size is not an issue in your application, you may wish to instead use the more powerful GDAL library, which can single-handedly load and save a much wider range of both vector and raster formats. To use GDAL, I suggest downloading and installing a precompiled version. On top of GDAL, the packages Fiona and Rasterio provide a convenient and Pythonic interface to GDAL's functionality for vector and raster data, respectively.

  Vector data

We begin by adding support for vector data. We will be creating three submodules inside our vector package: data, loader, and saver. To make these accessible from their parent vector package, we need to import them in vector/__init__.py as follows:

  from . import data
  from . import loader
  from . import saver

A data interface for vector data

The first thing we want is a data interface that we can conveniently interact with.

This data interface will be contained in a module of its own, so create this module now and save it as vector/data.py. We start off with a few basic imports, including compatibility functions for Shapely (which we installed in Chapter 1, Preparing to Build Your Own GIS Application) and the spatial indexing abilities of Rtree, a package we will install later. Note that vector data loading and saving are handled by separate modules that we have not yet created, but since they are accessed through our data interface, we need to import them here:

  # import builtins
  import sys, os, itertools, operator
  from collections import OrderedDict
  import datetime

  # import shapely geometry compatibility functions
  # ...and rename them for clarity
  # (note: asShape was removed in Shapely 2.0, where
  # shapely.geometry.shape is the replacement)
  import shapely
  from shapely.geometry import asShape as geojson2shapely

  # import rtree for spatial indexing
  import rtree

  # import internal modules
  from . import loader
  from . import saver


The vector data structure

Geographic vector data can be thought of as a table of data. Each row in the table is an observation (say, a country), and holds one or more attributes, or pieces of information for that observation (say, population). In a vector data structure, rows are known as features, and have additional geometry definitions (coordinates that define, say, the shape and location of a country). An overview of the structure may therefore look something like this:

  [Figure: overview of the vector data structure]

In our implementation of the vector data structure, we therefore create the interface as a VectorData class. To create and populate a VectorData instance with data, we can give it a filepath argument that it loads via the loader module that we create later. We also allow for optional keyword arguments to pass to the loader, which as we shall see includes the ability to specify text encoding. Alternatively, an empty VectorData instance can be created by not passing it any arguments. While creating an empty instance, it is possible to specify the geometry type of the entire data instance (meaning, it can only hold either polygon, line, or point geometries), otherwise it will set the data type based on the geometry type of the first feature that is added.
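The table-like structure described above can be pictured with plain Python objects. The following is only an illustrative sketch and not the VectorData class itself: the values are made up, the geometries are written as GeoJSON-style dictionaries, and the helper names get_attribute and infer_geometry_type are hypothetical.

```python
# field names are stored once; each feature holds one row plus a geometry
fieldnames = ["name", "population"]

# made-up illustrative values; geometries use GeoJSON-style dictionaries
features = [
    {"row": ["Country A", 1200000],
     "geometry": {"type": "Polygon",
                  "coordinates": [[(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]]}},
    {"row": ["Country B", 530000],
     "geometry": {"type": "Polygon",
                  "coordinates": [[(5, 1), (7, 1), (7, 2), (5, 2), (5, 1)]]}},
]


def get_attribute(feature, fieldname):
    """Look up one attribute of a feature by its field name."""
    return feature["row"][fieldnames.index(fieldname)]


def infer_geometry_type(features, declared_type=None):
    """Use the declared type if given, else the type of the first feature."""
    if declared_type is not None:
        return declared_type
    if not features:
        return None
    return features[0]["geometry"]["type"]


print(get_attribute(features[1], "population"))
print(infer_geometry_type(features))
print(infer_geometry_type([], declared_type="Point"))
```

The last two calls mirror the two ways of settling the geometry type: taking it from the first feature, or declaring it up front for an empty data instance.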

  


In addition to storing the fieldnames and creating features from rows and geometries, a VectorData instance remembers the filepath origin of the loaded data if applicable, and the Coordinate Reference System (CRS), which defaults to unprojected WGS84 if not specified. To store the features, rather than using lists or dictionaries, we use an ordered dictionary that allows us to identify each feature with a unique ID, sort the features, and perform fast and frequent feature lookups. To ensure that each feature in VectorData has a unique ID, we define a unique ID generator and attach independent ID generator instances to each VectorData instance.
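The combination of an ordered dictionary and a per-instance ID generator can be sketched in a few lines. This is a simplified stand-in, not the book's implementation: the class name FeatureStore and its add method are hypothetical, and itertools.count is just one possible way to generate unique IDs.

```python
import itertools
from collections import OrderedDict


class FeatureStore:
    """Hypothetical container keeping features in an ordered dictionary."""

    def __init__(self):
        # each instance gets its own independent ID generator
        self._id_generator = itertools.count()
        self.features = OrderedDict()

    def add(self, row, geometry):
        """Store a feature under a new unique ID and return that ID."""
        feature_id = next(self._id_generator)
        self.features[feature_id] = (row, geometry)
        return feature_id


store_a = FeatureStore()
store_b = FeatureStore()

first = store_a.add(["Country A"], {"type": "Point", "coordinates": (0, 0)})
second = store_a.add(["Country B"], {"type": "Point", "coordinates": (1, 1)})
other = store_b.add(["Country C"], {"type": "Point", "coordinates": (2, 2)})

# deleting a feature does not free its ID for reuse
del store_a.features[first]
third = store_a.add(["Country D"], {"type": "Point", "coordinates": (3, 3)})

print(first, second, third)
print(other)
print(list(store_a.features))
```

Since the generator only ever counts upward, an ID is never handed out twice within one instance, while separate instances each start counting from zero.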

To let us interact with the VectorData instance, we add various magic methods to enable standard Python operations such as getting the number of features in the data, looping through them, and getting and setting them through indexing their ID. Finally, we include convenient add_feature and copy methods. Take a look at the following code: