4.1. Package Fundamentals: Structure, Configuration, and Building
Story Problem: The Auckland Utilities You Keep Rewriting
The Problem: Copy-Paste Hell
You've written code to load Auckland SA2 boundaries three times now:
# project1/load_data.py
import geopandas as gpd

def load_sa2():
    gdf = gpd.read_file('data/auckland_sa2.gpkg')
    # Reproject to NZTM2000 (EPSG:2193) if the file uses another CRS
    if gdf.crs != 'EPSG:2193':
        gdf = gdf.to_crs('EPSG:2193')
    return gdf

# Months later in project2/analysis.py
# ... copy-pasted the same function ...
# But wait, you improved the validation in project1!
# Now you need to update everywhere...
The Solution: One Package, Infinite Projects
# Once: create and publish the package, then install it
pip install auckland-gis
# Forever: Use everywhere
from auckland_gis import load_sa2
sa2 = load_sa2()  # Always validated, always correct
What You Will Learn
This section covers the complete package creation workflow:
- Why packaging matters for GIS workflows
- Project structure that scales
- Configuration with pyproject.toml
- Building distributions with uv
- Local testing before publication
By the end, you'll build your first package: geohello-yourname
Part 1: Why Packaging Matters for GIScience
The Research Reality
Without packages:
- 5 projects, 5 copies of the same CRS validation function
- Find a bug? Fix it in 5 places (and remember where they all are)
- Collaborator wants your code? Send a zip file and hope it works
- Reproduce an analysis? "Which version did I use again?"
With packages:
- One function, maintained in one place
- Fix once, improve everywhere
- Share with pip install your-package
- Pin a version: your-package==1.2.0 records exactly which code your analysis used
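A pin is only useful if you can check what is actually installed. The sketch below uses the standard library's importlib.metadata for that check. The helper names installed_version and check_pin are illustrative, and "your-package" is the placeholder name from the list above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


def check_pin(distribution_name, expected):
    """Return True only if the installed version equals the pinned one."""
    return installed_version(distribution_name) == expected


# "your-package" is a placeholder, so this will normally print None
print(installed_version("your-package"))
```

Recording the result of installed_version() alongside your outputs is one simple way to answer "which version did I use?" later.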
Real-World GIScience Use Cases
1. Research Lab Toolkit
Instead of:
# Everyone has their own version
from utils import calculate_accessibility  # Which utils?
Use:
pip install lab-spatial-toolkit==2.1.0
from lab_spatial_toolkit import calculate_accessibility
2. Course Infrastructure
Instead of:
# Students struggle with data loading
gdf = gpd.read_file('../../data/maybe/here/boundaries.gpkg')
Use:
pip install gisci343-toolkit
from gisci343_toolkit import load_assignment_data
pedestrians, boundaries = load_assignment_data(assignment=2)
3. Consulting Deliverable
Instead of:
- Send zip file
- Hope they have the right Python version
- Hope they install dependencies correctly
- Cross fingers
Use:
# In your project report:
# "To reproduce: pip install auckland-transport-analysis==1.0.0"
from auckland_transport_analysis import generate_all_figures
figures = generate_all_figures('client_data/')
GIS-Specific Benefits
Standardise CRS Handling
# Every function handles CRS consistently
from auckland_gis import ensure_nztm
def calculate_area(gdf):
    gdf = ensure_nztm(gdf)  # Always EPSG:2193
    return gdf.geometry.area
Validate Spatial Joins
from auckland_gis import validate_join
# Catches when 5 SA2 zones silently disappear
result, report = validate_join(boundaries, census, key='SA2_code')
if not report['all_matched']:
    print(f"WARNING: {report['unmatched_count']} zones missing!")
Reusable Visualisations
from auckland_gis.viz import create_choropleth
# Same colour scale, same layout, every time
fig = create_choropleth(gdf, column='density', title='Population Density')
Python Package Terminology
Before we build, let's clarify terms:
Module: A single Python file
# metrics.py is a module
def calculate_density(gdf):
    pass
Package: A directory of modules with __init__.py
auckland_gis/          # This is a package
├── __init__.py        # Makes it importable
├── data.py            # Module
├── metrics.py         # Module
└── viz.py             # Module
Distribution: The bundled, installable version
# These files on PyPI are distributions:
auckland_gis-1.0.0-py3-none-any.whl    # Wheel (fast install)
auckland_gis-1.0.0.tar.gz              # Source distribution
Library vs Application:
- Library: Imported by other code (geopandas, osmnx)
- Application: Run directly (qgis, jupyter)
We're building libraries in this course.
Part 2: Project Structure That Scales
The src/ Layout (Best Practice)
Modern Python packaging uses the src/ layout:
geohello/                  # Project root
├── src/                   # Source code here
│   └── geohello/          # Actual package
│       ├── __init__.py    # Makes it a package
│       └── core.py        # Your code
├── tests/                 # Tests separate from source
│   └── test_core.py
├── docs/                  # Documentation
├── pyproject.toml         # Package configuration
├── README.md              # Project description
└── LICENSE                # Open source license
Why src/ instead of flat layout?
Flat layout (old way):
geohello/
├── geohello/              # Package
│   ├── __init__.py
│   └── core.py
├── tests/
└── pyproject.toml
Problems:
- Tests might import from the local directory instead of the installed package
- Hard to tell whether tests pass because the code works or because of accidental local imports
src/ layout (modern way):
geohello/
├── src/
│   └── geohello/          # Package
├── tests/
└── pyproject.toml
Benefits:
- Forces tests to use the installed package
- Catches import errors earlier
- Clearer separation of concerns
- Industry standard
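To see which copy of a package Python would actually import, you can ask the import system where it resolves the name. The following is a minimal sketch using only the standard library. The helper names import_origin and is_imported_from are illustrative and not part of any tool mentioned in this section.

```python
import importlib.util
from pathlib import Path


def import_origin(module_name):
    """Return the file a module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin)


def is_imported_from(module_name, directory):
    """Return True if the module resolves to a file somewhere inside `directory`."""
    origin = import_origin(module_name)
    if origin is None:
        return False
    return Path(directory).resolve() in origin.resolve().parents


# A standard-library module is used here so the example runs anywhere
print(import_origin("json"))
```

After a regular (non-editable) install, import_origin("geohello") should point into the environment's site-packages directory rather than into your working directory. With the flat layout, running the same check from the project root may report the local folder instead.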
Directory Structure Explained
Let's build a real GIS package structure:
auckland-gis/                  # GitHub repo name (with hyphen)
├── src/
│   └── auckland_gis/          # Python package name (with underscore)
│       ├── __init__.py        # Package initialisation
│       ├── data.py            # Data loading functions
│       ├── metrics.py         # Calculation functions
│       └── viz.py             # Visualisation functions
├── tests/
│   ├── __init__.py
│   ├── test_data.py           # Test data loading
│   ├── test_metrics.py        # Test calculations
│   └── fixtures/              # Test data files
│       └── sample_sa2.gpkg
├── docs/
│   ├── index.md               # Documentation homepage
│   ├── quickstart.md
│   └── api-reference.md
├── examples/
│   └── basic_usage.ipynb      # Example notebook
├── pyproject.toml             # Package configuration
├── README.md                  # Project description
├── LICENSE                    # e.g., MIT License
└── .gitignore                 # Git ignore file
What Goes in __init__.py?
The __init__.py file marks a directory as a Python package and controls what is importable from its top level:
Minimal __init__.py (for starters):
"""Auckland GIS utilities for urban analytics."""
__version__ = "0.1.0"
Better __init__.py (export key functions):
"""Auckland GIS utilities for urban analytics."""
__version__ = "0.1.0"

from .data import load_sa2, load_pedestrian_counts
from .metrics import calculate_density, calculate_accessibility
from .viz import create_choropleth

__all__ = [
    "load_sa2",
    "load_pedestrian_counts",
    "calculate_density",
    "calculate_accessibility",
    "create_choropleth",
]
Now users can:
from auckland_gis import load_sa2  # Clean!
# Instead of:
from auckland_gis.data import load_sa2  # More verbose
Module Organisation Principles
1. Group by functionality, not type
Don't do this:
src/auckland_gis/
├── functions.py    # Too vague!
├── classes.py      # Too vague!
└── utilities.py    # Too vague!
Do this:
src/auckland_gis/
├── data.py         # Data loading/validation
├── metrics.py      # Calculations
├── network.py      # Network analysis
└── viz.py          # Visualisation
2. Keep modules focused
Each module should have one clear purpose:
# data.py - Data loading and validation
def load_sa2(path=None): ...
def load_census(year=2023): ...
def validate_geometry(gdf): ...
# metrics.py - Calculations only
def calculate_density(gdf, pop_col='population'): ...
def calculate_accessibility(gdf, poi_gdf, max_distance_m=800): ...
# viz.py - Visualisation only
def create_choropleth(gdf, column, cmap='YlOrRd'): ...
def plot_network(G, node_color='red'): ...
3. Use subpackages for complex projects
For larger packages:
src/auckland_gis/
├── __init__.py
├── data/
│   ├── __init__.py
│   ├── boundaries.py
│   └── census.py
├── analysis/
│   ├── __init__.py
│   ├── spatial.py
│   └── temporal.py
└── viz/
    ├── __init__.py
    ├── maps.py
    └── charts.py
Data in Packages: Best Practices
Don't ship large data files
# Bad: 50MB shapefile in the package
src/auckland_gis/
└── data/
    └── boundaries_large.shp    # Makes pip install slow!
Do ship small reference data
# Good: Small lookup tables, schemas
src/auckland_gis/
└── data/
    └── sa2_schema.json         # 2KB - fine!
Do provide download functions
# auckland_gis/data.py
import geopandas as gpd

def load_sa2(path=None):
    """
    Load Auckland SA2 boundaries.

    Parameters
    ----------
    path : str, optional
        Path to local .gpkg file. If None, downloads from
        data repository.
    """
    if path is None:
        # download_boundaries() is a helper you write yourself;
        # it downloads on first use and returns the local path
        path = download_boundaries()
    return gpd.read_file(path)
Do use data repositories
- Host large files on Zenodo, Figshare, or GitHub releases
- Download on first use
- Cache locally
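The "download on first use, cache locally" pattern can be sketched with the standard library alone. Everything specific here is an assumption made for illustration: the URL is a placeholder, the cache directory is an arbitrary choice, and the fetch parameter exists only so the function can be tried without network access.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Hypothetical URL and cache location, for illustration only
BOUNDARIES_URL = "https://example.org/auckland_sa2.gpkg"
CACHE_DIR = Path.home() / ".cache" / "auckland_gis"


def download_boundaries(url=BOUNDARIES_URL, cache_dir=CACHE_DIR, fetch=urlretrieve):
    """Return a local path to the file, downloading it only if it is not cached."""
    cache_dir = Path(cache_dir)
    target = cache_dir / url.rsplit("/", 1)[-1]  # file name taken from the URL
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        fetch(url, target)  # urlretrieve(url, filename) writes to `target`
    return target
```

This sketch does not handle interrupted downloads: a partial file would be treated as cached. A sturdier version might download to a temporary name and rename it once complete, or verify a checksum.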
Part 3: pyproject.toml Configuration
What is pyproject.toml?
The pyproject.toml file is your package's configuration hub (the file was introduced by PEP 518; the [project] metadata table is standardised by PEP 621):
- Package metadata (name, version, description)
- Dependencies
- Build system
- Tool configurations (pytest, mypy, etc.)
One file to rule them all: it replaces the old setup.py and setup.cfg, and for a library it is also where dependencies are declared instead of a requirements.txt.
Minimal pyproject.toml
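This is roughly what uv init --lib geohello creates. The default build backend depends on your uv version; this section assumes hatchling, which you can request explicitly with uv init --lib geohello --build-backend hatchling. The description line is one you fill in yourself:
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "geohello"
version = "0.1.0"
description = "A simple geospatial greeting package"
readme = "README.md"
requires-python = ">=3.10"
dependencies = []
Let's break it down: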
[build-system]: How to build your package
[build-system]
requires = ["hatchling"] # Build tool to use
build-backend = "hatchling.build"  # Backend API
Modern options:
- hatchling (modern, simple; recommended)
- setuptools (traditional, complex)
- flit (minimalist)
- poetry (opinionated)
[project]: Package metadata
[project]
name = "geohello" # PyPI package name (unique!)
version = "0.1.0" # Semantic versioning
description = "A simple greeting" # One-line summary
readme = "README.md" # Long description from file
requires-python = ">=3.10"         # Python version requirement
Complete pyproject.toml for GIS Package
Here's a full example for an Auckland GIS package:
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "auckland-gis"
version = "0.2.0"
description = "Geospatial utilities for Auckland urban analytics"
readme = "README.md"
requires-python = ">=3.10"
license = {text = "MIT"}
authors = [
{name = "Your Name", email = "your.email@auckland.ac.nz"}
]
keywords = ["gis", "urban analytics", "auckland", "geospatial"]
classifiers = [
"Development Status :: 3 - Alpha",
"Intended Audience :: Science/Research",
"Topic :: Scientific/Engineering :: GIS",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
]
# Runtime dependencies
dependencies = [
"geopandas>=0.14.0",
"shapely>=2.0.0",
"pandas>=2.0.0",
"matplotlib>=3.7.0",
]
# Optional features
[project.optional-dependencies]
viz = ["folium>=0.15.0", "plotly>=5.0.0"]
network = ["osmnx>=1.9.0", "networkx>=3.0"]
all = ["auckland-gis[viz,network]"]
# Development dependencies
dev = [
"pytest>=7.0",
"pytest-cov>=4.0",
"black>=23.0",
"ruff>=0.1.0",
]
[project.urls]
Homepage = "https://github.com/yourusername/auckland-gis"
Documentation = "https://auckland-gis.readthedocs.io"
Repository = "https://github.com/yourusername/auckland-gis"
Issues = "https://github.com/yourusername/auckland-gis/issues"
# Tool configurations
[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
python_functions = ["test_*"]
addopts = "--cov=auckland_gis --cov-report=term-missing"
[tool.black]
line-length = 100
target-version = ["py310", "py311", "py312"]
[tool.ruff]
line-length = 100
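# E = pycodestyle errors, F = Pyflakes, I = isort (import sorting)
# Note: recent ruff releases expect select under [tool.ruff.lint]
select = ["E", "F", "I"]
Understanding Dependencies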
Runtime dependencies (required to use your package):
dependencies = [
    "geopandas>=0.14.0",      # Minimum version
    "pandas>=2.0.0,<3.0.0",   # Version range
]
Optional dependencies (for extra features):
[project.optional-dependencies]
viz = ["folium>=0.15.0"]
# Users can install with (quoted so shells such as zsh do not expand the brackets):
# pip install "auckland-gis[viz]"
Development dependencies (for package development):
[project.optional-dependencies]
dev = [
    "pytest>=7.0",
    "black>=23.0",
]
# Install with: uv pip install -e ".[dev]"
Package Naming Conventions
PyPI name (project name in pyproject.toml):
- Use hyphens: auckland-gis
- Must be unique on PyPI
- Case-insensitive
- Can include numbers: r5py
Python package name (directory under src/):
- Use underscores: auckland_gis
- Must be a valid Python identifier
- Lowercase by convention
- Matches import: import auckland_gis
Example mapping:
PyPI:   auckland-gis → pip install auckland-gis
Python: auckland_gis → import auckland_gis
GitHub: auckland-gis → github.com/user/auckland-gis
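The mapping between the two names can be written down as a small helper. The normalisation rule below (runs of hyphens, underscores and dots collapse to one hyphen, then lowercase) is the one PyPI uses when comparing project names. The function names and the "hyphens become underscores" suggestion are illustrative conventions, not something pip enforces.

```python
import re


def normalise_project_name(name):
    """Normalise a project name the way PyPI compares names."""
    return re.sub(r"[-_.]+", "-", name).lower()


def suggest_import_name(project_name):
    """Suggest an import name: the normalised name with hyphens as underscores."""
    candidate = normalise_project_name(project_name).replace("-", "_")
    if not candidate.isidentifier():
        raise ValueError(f"{candidate!r} is not a valid Python identifier")
    return candidate


print(normalise_project_name("Auckland_GIS"))  # auckland-gis
print(suggest_import_name("auckland-gis"))     # auckland_gis
```

Because of this normalisation, Auckland_GIS, auckland.gis and auckland-gis all refer to the same project on PyPI, which is why a name must be unique after normalising, not just as typed.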
Part 4: Building with uv
Why uv?
uv is a modern, fast Python package manager and project manager:
Traditional workflow:
python -m venv venv
source venv/bin/activate
pip install -e .
pip install pytest
python -m build
twine upload dist/*
With uv:
uv init --lib mypackage    # Create project
uv build                   # Build distributions
uv publish                 # Publish to PyPI
Benefits:
- 10-100x faster than pip (the figure reported by the uv project)
- Unified tool (no need for pip, venv, build, twine separately)
- Better dependency resolution
- Built in Rust for speed
Installing uv
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
# With pip (any platform)
pip install uv
# Verify installation
uv --version
Creating Your First Package
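Let's build geohello:
Step 1: Initialize
uv init --lib geohello
cd geohello
This creates (details vary slightly between uv versions):
geohello/
├── src/
│   └── geohello/
│       ├── __init__.py
│       └── py.typed
├── pyproject.toml
├── README.md
├── .python-version
└── .gitignore
uv does not create a tests/ directory; add one yourself when you start writing tests.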
Step 2: Write code
Edit src/geohello/__init__.py:
"""A simple geospatial greeting package."""
__version__ = "0.1.0"

def hello(place="Auckland"):
    """
    Generate a greeting for a place.

    Parameters
    ----------
    place : str, default "Auckland"
        Place name to greet

    Returns
    -------
    str
        Greeting message

    Examples
    --------
    >>> hello()
    'Hello from Auckland!'
    >>> hello("Tokyo")
    'Hello from Tokyo!'
    """
    return f"Hello from {place}!"
Step 3: Update metadata
Edit pyproject.toml:
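[project]
name = "geohello-yourname"  # Make it unique!
version = "0.1.0"
description = "A simple geospatial greeting"
readme = "README.md"
requires-python = ">=3.10"
authors = [
    {name = "Your Name", email = "you@example.com"}
]
dependencies = []  # No dependencies for this simple example

# The project is now called geohello-yourname but the import package is still
# src/geohello, so with the hatchling backend say explicitly what to ship
[tool.hatch.build.targets.wheel]
packages = ["src/geohello"]
Step 4: Test locally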
# Run your function directly
uv run python -c "from geohello import hello; print(hello())"
# Output: Hello from Auckland!
# Or open interactive Python
uv run python
>>> from geohello import hello
>>> print(hello("Tokyo"))
Hello from Tokyo!
Building Distributions
Build both wheel and source distribution:
uv build
This creates dist/ with two files:
dist/
├── geohello_yourname-0.1.0-py3-none-any.whl    # Wheel (fast install)
└── geohello_yourname-0.1.0.tar.gz              # Source (backup)
What's the difference?
Wheel (.whl):
- Built distribution, installed without a build step
- Fast to install (pre-built)
- Platform-specific or universal
- Example: package-1.0.0-py3-none-any.whl
  - py3: Python 3
  - none: No ABI (pure Python)
  - any: Any platform
Source Distribution (.tar.gz):
- Contains source code
- Built during installation
- Slower to install, and needs the build backend (plus any compilers) on the installing machine
- Fallback when no compatible wheel is available
For pure Python GIS packages: Use universal wheels (py3-none-any)
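The tags in a wheel filename can be read mechanically, which is how installers decide whether a wheel fits the current machine. The sketch below splits a filename into its components, assuming the standard wheel naming scheme of name, version, optional build tag, Python tag, ABI tag and platform tag. The function name parse_wheel_filename is illustrative, and the second filename is a made-up example of a platform-specific wheel.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its tagged components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
        # Pure-Python wheels have no ABI requirement and run on any platform
        "pure_python": abi_tag == "none" and platform_tag == "any",
    }


universal = parse_wheel_filename("geohello_yourname-0.1.0-py3-none-any.whl")
compiled = parse_wheel_filename("example-1.0.0-cp312-cp312-win_amd64.whl")
print(universal["pure_python"], compiled["pure_python"])
```

Splitting on hyphens works because hyphens in the project name are replaced with underscores inside the filename, which is also why geohello-yourname appears as geohello_yourname in dist/.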
Testing the Built Package
Option 1: Install from wheel
uv pip install dist/geohello_yourname-0.1.0-py3-none-any.whl
# Test import (in the same, activated environment)
python -c "from geohello import hello; print(hello())"
Option 2: Install in development mode
uv pip install -e .
# Changes to source code immediately reflected
Option 3: Install in fresh environment
# Create clean test environment
uv venv test-env
source test-env/bin/activate  # On Windows: test-env\Scripts\activate
# Install and test (uv venv does not add pip to the environment by default, so use uv pip)
uv pip install dist/geohello_yourname-0.1.0-py3-none-any.whl
python -c "from geohello import hello; print(hello())"
Common Build Issues
Issue 1: Module not found
ModuleNotFoundError: No module named 'geohello'
Solution: Check src/ structure is correct:
src/
└── geohello/
    └── __init__.py    # Must exist!
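The structure check above can be automated before you build. This is a minimal sketch, assuming the src/ layout described in this section. The function name check_src_layout and the wording of the messages are illustrative.

```python
from pathlib import Path


def check_src_layout(project_root, package_name):
    """Return a list of problems with the expected src/ layout (empty means OK)."""
    root = Path(project_root)
    problems = []
    if not (root / "pyproject.toml").is_file():
        problems.append("pyproject.toml is missing from the project root")
    package_dir = root / "src" / package_name
    if not package_dir.is_dir():
        problems.append(f"src/{package_name}/ does not exist")
    elif not (package_dir / "__init__.py").is_file():
        problems.append(f"src/{package_name}/__init__.py is missing")
    if not package_name.isidentifier():
        problems.append(f"{package_name!r} is not a valid import name")
    return problems
```

From the project root, check_src_layout(".", "geohello") returning an empty list means the files the build backend looks for are in place. It does not prove that the package installs correctly, so a test install is still worthwhile.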
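Issue 2: Wrong package name
# pyproject.toml
name = "geohello-yourname"   # PyPI name (hyphens are fine here)
# But the import package directory must be a valid Python identifier:
src/geohello_yourname/       # Underscores, never hyphens
If the directory name is not the project name with hyphens replaced by underscores, the build backend must be told where the package lives.
Issue 3: Missing dependencies
[project]
dependencies = [
    "geopandas>=0.14.0",  # Specify all runtime dependencies!
]
Week 9 Lab: Build Your First Package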
Lab Exercise: geohello-yourname
Goal: Create, build, and test a simple geospatial package locally
Time: 90 minutes
Steps:
1. Initialize (10 mins)
   uv init --lib geohello-yourname
   cd geohello-yourname
2. Write function (20 mins)
   - Implement hello(place) function
   - Add docstring
   - Test it works
3. Configure (15 mins)
   - Update pyproject.toml metadata
   - Set your name, unique package name
   - Improve README
4. Build (10 mins)
   uv build
   # Check dist/ folder created
   ls dist/
5. Test locally (20 mins)
   # Test import works (uv init --lib geohello-yourname creates the import package geohello_yourname)
   uv run python -c "from geohello_yourname import hello; print(hello())"
   # Test with different input
   uv run python -c "from geohello_yourname import hello; print(hello('Paris'))"
6. Submit (5 mins)
   - Screenshot of successful build
   - Screenshot of successful import
   - Reflection (100-150 words)
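For the "Test it works" step, one lightweight option is to run the examples already written in the docstring. The sketch below does that with the standard library's doctest module. The helper name run_docstring_examples is illustrative, and hello is repeated here with a shortened docstring so the block is self-contained.

```python
import doctest


def hello(place="Auckland"):
    """
    Generate a greeting for a place.

    Examples
    --------
    >>> hello()
    'Hello from Auckland!'
    >>> hello("Tokyo")
    'Hello from Tokyo!'
    """
    return f"Hello from {place}!"


def run_docstring_examples(func):
    """Run the >>> examples in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # The examples need the function itself in their namespace
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


failed, attempted = run_docstring_examples(hello)
print(f"{attempted} examples run, {failed} failed")
```

A result of zero failures means the docstring and the code agree. If you later change the greeting text, the check fails until the docstring examples are updated to match.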
Deliverable (Completion Credit)
Due: End of Week 9 lab (or Friday 15 May, 5pm)
Submit to Canvas:
1. Screenshot showing successful uv build with both files in dist/
2. Screenshot showing successful import and execution
3. Brief reflection:
   - What did you learn about package structure?
   - What was surprising or challenging?
   - How might you use this for Assignment 3?
Grading: Completion credit (pass/fail, not percentage)
Purpose: Ensure everyone can build packages before Week 10
Assignment 3 Planning
Use remaining lab time (30 mins) to plan your final package:
Brainstorm:
- What reusable functions have you written?
- What urban analytics topic interests you?
- Micromobility? Accessibility? Walkability? Air quality?
Sketch structure:
your-package/
└── src/
    └── your_package/
        ├── __init__.py
        ├── data.py        # What data loading functions?
        ├── analysis.py    # What calculations?
        └── viz.py         # What visualisations?
Consider scope:
- What's achievable in 3-4 weeks?
- What builds on your existing work?
- What would be useful to others?
Summary
You've learned the foundations of Python packaging:
Why package?
- Reusability across projects
- Standardisation of workflows
- Reproducibility of research
- Sharing with community
Project structure:
- src/ layout best practice
- Clear module organisation
- Separation of concerns
- Proper data handling
Configuration:
- pyproject.toml metadata
- Dependency management
- Tool configurations
- Package naming
Building:
- uv workflow
- Wheel vs source distributions
- Local testing
- Troubleshooting
Next steps: In sec-testing-quality, you'll learn to test, document, and publish to TestPyPI.
Further Reading
- Python Packaging Guide: https://packaging.python.org/
- PEP 621 - pyproject.toml: https://peps.python.org/pep-0621/
- uv documentation: https://docs.astral.sh/uv/
- src/ layout explanation: https://hynek.me/articles/testing-packaging/
- Semantic Versioning: https://semver.org/
Practice Exercises
1. Create auckland-utils: Package with functions to load SA2 boundaries and validate CRS
2. Add submodules: Split your code into data.py, metrics.py, viz.py
3. Configure dependencies: Add geopandas, matplotlib to pyproject.toml
4. Build and test: Create distributions and verify local installation
5. Document: Write clear docstrings for all functions
Ready to add tests and publish? Continue to sec-testing-quality!