
Create New Geoprocessing Project

This tutorial walks you through designing and creating your own geoprocessing report. It covers many of the questions and decisions you might face along the way.

This tutorial assumes:

  • Your system setup is complete
  • You completed the sample project tutorial
  • Your geoprocessing virtual environment is currently running (Devcontainer or WSL)
  • You have VSCode open in your virtual environment with a terminal pane open

Where Do I Start?

Creating a geoprocessing project is not linear; it's iterative. You don't need to have all the answers for your project or understand every feature of the framework up front. Here's one approach:

Explore and Design

UI component library

Look at other SeaSketch Reports

Here is a design template the SeaSketch team uses. A document like this is a good place to capture thinking, solicit feedback, and record decisions. This is invaluable later when you try to remember why you built it the way you did, or if a new person needs to come up to speed on the project.

Build

Start Simple

The geoprocessing framework is a set of building blocks. Which ones you use is up to you.

If your planning process is simple:

  • a single planning boundary or none at all
  • straightforward objectives
  • smaller datasets
  • short-running analysis
  • no classification of sketch types (e.g. protection levels)
  • no need to handle overlapping sketch polygons

Then your geoprocessing project can be kept simple.

  • no precalculation needed
  • manual prep and publish of datasources to S3, or even direct import of GeoJSON files in geoprocessing functions.
  • simple metrics calculated directly using libraries Turf and Geoblaze
  • simple reports rendering a few values, a table, a chart

The Oregon SeaSketch reports are a good example of this.

Then Get Complicated

As your planning process gets more complex:

  • multiple planning boundaries (offshore/nearshore) with combined objectives
  • multiple objectives with targets
  • large datasets with multiple subclasses
  • long-running analysis requiring precalculation
  • sketch classification system (e.g. protection levels)
  • enforcing rules about overlapping sketches

Then your project may benefit from more sophisticated building blocks, sometimes at the cost of flexibility:

  • Project Datasource records, managed via data:import and data:publish with automated import, transform, and publish to S3.
  • Geography records representing project planning boundaries
  • Metric records for representing common multi-dimensional analysis results.
  • Objective records representing objective targets per sketch class.
  • MetricGroup records connecting metric results to their data classes, datasource, objective target, etc.
  • toolbox for calculating overlay analysis metrics at the collection level in many dimensions simultaneously - by data class, by protection level, by planning boundary. It can even handle sketches that overlap themselves, without double-counting overlay stats.
  • UI components that can work with all of these record types - ClassTable, SketchClassTable, GeographySwitcher, RbcsMpaObjective
  • precalc command pre-calculating overlay stats ahead of time for combinations of Datasources and Geographies.
  • worker functions to spread processing across more Lambdas to run in parallel.
  • Language translation workflow and library of pre-translated UI components.
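To make the Metric record concrete: it is a flat row with one value and a set of optional dimensions, which makes multi-dimensional results easy to filter and aggregate. The field names below follow the framework's general pattern but should be treated as approximate; ids and values are invented.

```typescript
// Approximate shape of a Metric record: one flat row per result,
// with null for dimensions that don't apply (verify against your version).
interface Metric {
  metricId: string;           // which analysis produced it, e.g. "areaOverlap"
  sketchId: string | null;    // the sketch or collection the value belongs to
  classId: string | null;     // data class dimension, e.g. "coral"
  groupId: string | null;     // classification dimension, e.g. protection level
  geographyId: string | null; // planning boundary dimension
  value: number;              // the computed statistic
}

// A single result row: coral habitat overlap for one sketch,
// within a hypothetical nearshore boundary
const row: Metric = {
  metricId: "areaOverlap",
  sketchId: "sketch-1",
  classId: "coral",
  groupId: "high-protection",
  geographyId: "nearshore",
  value: 5_200_000, // square meters
};

// Because rows are flat, slicing by any dimension is a simple filter
const forClass = (rows: Metric[], classId: string) =>
  rows.filter((r) => r.classId === classId);

console.log(forClass([row], "coral").length); // 1
```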

Examples of more complex projects:

Create SeaSketch Project

First, follow the instructions to create a new SeaSketch project. This includes defining the planning bounds and creating a sketch class. You will want to create a Polygon sketch class with a name that makes sense for your project (e.g. MPA for Marine Protected Area), and then a Collection sketch class to group instances of your polygon sketch class. Note that sketch classes are where you will integrate your geoprocessing services to view reports, but you will not do that at this time.

Initialize New Project

Start with initializing a new project:

cd /workspaces
npx @seasketch/geoprocessing@7.0.0-experimental-7x-simplify.67 init 7.0.0-experimental-7x-simplify.67

Tips:

  • the answers to all of the init questions can be changed later, so don't worry if you don't know the answer.
  • SeaSketch uses a BSD-3 license (the default choice). You can choose any license, including UNLICENSED, meaning proprietary or "all rights reserved".
  • The most common AWS region is us-west-1 or us-east-2. Choose the region closest to your project.

Learn more about your project's structure

Choose how to bring data into your workspace.

Import Datasources

Methods:

  • Use import:data
  • Manually prepare and copy your data to datasets bucket

Write a Geoprocessing Function

Let's start with src/functions/simpleFunction and build it up to use a datasource.

Methods:

  • Directly import geojson file in function
  • Use datasource record and getDatasource and getFeatures
  • Load from local bucket using load function, url, and bbox
  • Load from third-party using load function, url, and bbox

If the data you'll use in analysis is already published online, publicly accessible, and in flatgeobuf or cloud-optimized geotiff format, then you can directly access them.
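What makes remote access efficient is the bounding box: the load helpers take your sketch's bbox and fetch only the intersecting portion of a flatgeobuf (or cloud-optimized GeoTIFF) using HTTP range requests. In practice you would get the bbox from @turf/bbox or a bbox already present on the sketch; here is a self-contained stand-in so the idea is runnable:

```typescript
// Compute a [minX, minY, maxX, maxY] bounding box for a polygon ring,
// the value you pass to a bbox-aware load function (a stand-in for
// @turf/bbox so this sketch is self-contained).
type Position = [number, number];

function ringBbox(ring: Position[]): [number, number, number, number] {
  let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (const [x, y] of ring) {
    if (x < minX) minX = x;
    if (y < minY) minY = y;
    if (x > maxX) maxX = x;
    if (y > maxY) maxY = y;
  }
  return [minX, minY, maxX, maxY];
}

// A small square sketch off a hypothetical coast
const ring: Position[] = [
  [-125.1, 43.2],
  [-125.0, 43.2],
  [-125.0, 43.3],
  [-125.1, 43.3],
  [-125.1, 43.2],
];

console.log(ringBbox(ring)); // bbox: [-125.1, 43.2, -125, 43.3]
```

The smaller the bbox, the less data is transferred, which is why fetching by sketch window scales to large datasets that would be too big to download whole.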

Smoke Test With Examples

Methods to generate examples:

  • genRandomPolygon
  • geojson.io
  • export sketch geojson from SeaSketch project

Assuming you have a SeaSketch project with a Polygon sketch class, follow the instructions for sketching tools to draw one or more polygon sketches. You can also create a collection and group your sketches into the collection.

Finally, export your sketches and sketch collections as GeoJSON and move them into your geoprocessing project's examples/sketches folder.

/examples/
  sketches/ # <-- examples used by geoprocessing functions
  features/ # <-- examples used by preprocessing functions

Once you add your example sketches and collections to this folder, run your smoke tests.

npm run test

The smoke test for your geoprocessing function will run the function against every sketch example, whether a single Sketch or a SketchCollection, and output the results to examples/output. Look at this output and ensure it is as expected.

Learn more about testing and debugging in testing

Write Report Client

Build and Deploy to AWS

Deploy your project

Debugging build failure

If the build step fails, you will need to look at the error message and figure out what to do. Did it fail in building the functions or the clients? 99% of the time you should be able to catch these errors sooner: if VSCode finds invalid TypeScript code, it will warn you with files marked in red in the Explorer panel, or with red marks and squiggly underlines in the files themselves.

If you're still not sure, try some of the following:

  • Run your smoke tests, see if they pass
  • When was the last time your build succeeded? You can be sure the error is caused by a change made since then, whether in your project code, by upgrading your geoprocessing library version without fully migrating, or by changing something on your system.
  • You can stash your current changes or commit them to a branch so they are not lost. Then sequentially check out previous commits until you find one that builds properly. Now you know that the next commit caused the build error.

Connect to SeaSketch Project and Test

  • Choose clipToOcean as preprocessor
  • Choose MpaTabReport as report client

Test different sketch and collection scenarios. When you find one that errors or does something unexpected, export that sketch to your project's examples/sketches directory and run your smoke tests. If that succeeds and produces the expected output, then load your storybook and see if you can reproduce the problem in your report client.

Expand and Iterate

There are more advanced features available if you need them.

Project Client

It has a lot of shortcut methods for working with datasources, geographies, precalc metrics, objectives, etc. It's not meant to be a black box; you can look at what it does.

[Link to project client ]

Configure Geography

Precalc Metrics

At the very least you should import your planning boundaries, preferably as individual files, or as individual layers within a file package.

Any file-based format that OGR and GDAL support out of the box will work.

npm run precalc:data

? Do you want to precalculate only a subset?
Yes, by datasource
Yes, by geography
Yes, by both
❯ No, just precalculate everything (may take a while)

What's happening is that the precalc script starts a local web server on port 8001 that serves up the datasources in data/dist.

The precalc script then gathers all of your project datasources with precalc: true and all of your project geographies with precalc: true, and calculates area, sum, and count metrics for each combination of datasource and geography.

Once complete, project/precalc.json will have been updated with the new metric values.

If your datasource has classKeys defined in its record, precalc will also calculate area, sum, and count for each unique class value found within the classKey.
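For illustration only, the updated entries in project/precalc.json are flat metric records keyed by data class and geography; a hypothetical fragment (ids and values invented) might look like:

```json
[
  {
    "metricId": "area",
    "classId": "benthic_habitat-total",
    "geographyId": "eez",
    "sketchId": null,
    "groupId": null,
    "value": 348000000000
  },
  {
    "metricId": "count",
    "classId": "benthic_habitat-coral",
    "geographyId": "eez",
    "sketchId": null,
    "groupId": null,
    "value": 412
  }
]
```

Reports typically use these totals as denominators, e.g. dividing a sketch's overlap area by the precalculated total area to show percent of habitat protected.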

You must re-run precalc:data every time you change a geography record or a datasource.

  • To learn more advanced use, see the precalc guide.
  • To learn more about use of precalculated metrics, see the report client guide.

Metric Groups

How you intend to use your data will determine what form the data needs to be in.

Examples By Use Case

  • Do you have vector data?
    • Does it have a single data class?
      • Is it one file with one data class?
      • Is it one file with multi-class attribute, of which you only need one?
        • create a new dataset with
    • Does it have multiple data classes?
      • Is it one data class per file?
      • Is it one data class per layer within a file?
      • Does it have multiple data classes within one layer with an attribute to differentiate them?
  • Do you have raster data?
    • Does it have a single data class?
      • Is it one file with one data class?
    • Does it have multiple data classes?
      • Is it one file, one data class per raster band?
      • Is it multiple files, one data class per file?
      • Is it a categorical raster with unique cell value for each class?

[ToDo: provide metric group example for each leaf in tree]
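As a partial illustration for one leaf of the tree (vector data, multiple classes within one layer, differentiated by an attribute), a MetricGroup-style record might look like the following. Field names are approximate and the ids are hypothetical; verify against your framework version.

```typescript
// Approximate MetricGroup shape: connects a metric to its datasource
// and data classes (field names approximate, ids hypothetical).
interface DataClass {
  classId: string; // value found in the classKey attribute
  display: string; // label shown in reports
}

interface MetricGroup {
  metricId: string;
  type: string;         // e.g. "areaOverlap"
  datasourceId: string; // single datasource shared by all classes
  classKey: string;     // attribute that differentiates classes
  classes: DataClass[];
}

// One vector datasource, one layer, classes split on the "habitat" attribute
const habitatGroup: MetricGroup = {
  metricId: "habitatAreaOverlap",
  type: "areaOverlap",
  datasourceId: "benthic_habitat",
  classKey: "habitat",
  classes: [
    { classId: "coral", display: "Coral" },
    { classId: "seagrass", display: "Seagrass" },
    { classId: "sand", display: "Sand" },
  ],
};

// Reports use the group to map a raw classId to its display label
const displayOf = (mg: MetricGroup, classId: string) =>
  mg.classes.find((c) => c.classId === classId)?.display ?? classId;

console.log(displayOf(habitatGroup, "seagrass")); // "Seagrass"
```

For the one-datasource-per-class leaves, the datasourceId would typically move onto each class entry instead of living at the top level.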

Create Report

  • Edits to the statistic you want calculated (e.g. calculating an average instead of a sum) should happen in your function (src/functions/benthicHabitat.ts).
  • Edits to the way the analytics are displayed (e.g. changing labels, converting units, adding text context) should happen in your component (src/components/BenthicHabitat.tsx).
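A tiny, self-contained sketch of that division of labor (file names mirror the examples above; the result shape and labels are hypothetical): the function returns the raw statistic in base units, and the component owns all presentation decisions.

```typescript
// Function side (src/functions/benthicHabitat.ts): compute and return
// the raw statistic in base units, here square meters.
export interface HabitatResults {
  areaSqM: number;
}

// Component side (src/components/BenthicHabitat.tsx): unit conversion,
// labels, and display text live here, not in the function.
const formatHabitatArea = (results: HabitatResults): string =>
  `${(results.areaSqM / 1_000_000).toFixed(1)} km² of benthic habitat protected`;

console.log(formatHabitatArea({ areaSqM: 2_540_000 })); // "2.5 km² of benthic habitat protected"
```

Keeping display logic out of the function means you can relabel or change units without redeploying the Lambda, while changing the statistic itself does require a function change.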

Language Translation

Language translation takes effort to maintain. It is suggested that you get your reports close to final in English, and then add translations.

What Next

Still have more questions? Start a discussion on GitHub.