Create New Geoprocessing Project
This tutorial walks you through designing and creating your own geoprocessing report. It covers many of the questions and decisions you might face along the way.
This tutorial assumes:
- Your system setup is complete
- You completed the sample project tutorial
- Your geoprocessing virtual environment is currently running (Devcontainer or WSL)
- You have VSCode open in your virtual environment with a terminal pane open
Where Do I Start?
Creating a geoprocessing project is iterative, not linear. You don't need to have all the answers for your project or understand every feature of the framework before you begin. Here's one approach:
- Explore and design
  - Explore report building blocks
  - Look at reports in other SeaSketch projects
  - Create a rough design
- Build
- Expand and iterate
  - Add more features
  - Preprocessing function
  - Geographies and MetricGroups
- What next?
Explore and Design
Look at other SeaSketch Reports
Here is a design template the SeaSketch team uses. A document like this is a good place to capture thinking, solicit feedback, and record decisions. This is invaluable later when you try to remember why you built it the way you did, or if a new person needs to come up to speed on the project.
Build
Start Simple
The geoprocessing framework is a set of building blocks. Which ones you use is up to you.
If your planning process is simple:
- a single planning boundary or none at all
- straightforward objectives
- smaller datasets
- short running analysis
- no classification of sketch types (e.g. protection levels)
- no need to handle overlapping sketch polygons
Then your geoprocessing project can be kept simple:
- no precalculation needed
- manual prep and publish of datasources to S3, or even direct import of GeoJSON files in geoprocessing functions
- simple metrics calculated directly using the Turf and Geoblaze libraries
- simple reports rendering a few values, a table, a chart
A good example of this is Oregon SeaSketch reports.
Then Get Complicated
As your planning process gets more complex:
- multiple planning boundaries (offshore/nearshore) with combined objectives
- multiple objectives with targets
- large datasets with multiple subclasses
- long running analysis with required precalculation
- sketch classification system (e.g. protection levels)
- enforcing rules about overlapping sketches
Then your project may benefit from more sophisticated building blocks, sometimes at the cost of flexibility:
- Project Datasource records, managed via data:import and data:publish with automated import, transform, and publish to S3.
- Geography records representing project planning boundaries.
- Metric records for representing common multi-dimensional analysis results.
- Objective records representing objective targets per sketch class.
- MetricGroup records connecting metric results to their data classes, datasource, objective target, etc.
- toolbox for calculating overlay analysis metrics at the collection level in many dimensions simultaneously: by data class, by protection level, by planning boundary. Can even handle sketches overlapping themselves and not double counting overlay stats.
- UI components that can work with all of these record types: ClassTable, SketchClassTable, GeographySwitcher, RbcsMpaObjective.
- precalc command pre-calculating overlay stats ahead of time for combinations of Datasources and Geographies.
- worker functions to spread processing across more Lambdas to run in parallel.
- Language translation workflow and library of pre-translated UI components.
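To make the relationships among these record types more concrete, here is a minimal sketch. The interfaces and most field names (DatasourceRecord, GeographyRecord, MetricGroupRecord, datasourceId, classId, objectiveId) are illustrative assumptions and not the framework's actual type definitions; only precalc and classKeys are mentioned elsewhere in this tutorial. The helper simply checks that every geography and metric group class points at a datasource that exists.

```typescript
// Hypothetical, simplified record shapes for illustration only.
// These are NOT the framework's real type definitions.
interface DatasourceRecord {
  datasourceId: string;
  precalc: boolean;
  classKeys: string[];
}

interface GeographyRecord {
  geographyId: string;
  datasourceId: string; // datasource assumed to hold the boundary polygon
  precalc: boolean;
}

interface MetricGroupRecord {
  metricId: string;
  classes: { classId: string; datasourceId: string; objectiveId?: string }[];
}

// Returns every datasourceId referenced by a geography or a metric group
// class that has no matching datasource record.
function missingDatasourceIds(
  datasources: DatasourceRecord[],
  geographies: GeographyRecord[],
  groups: MetricGroupRecord[],
): string[] {
  const known = new Set(datasources.map((d) => d.datasourceId));
  const missing: string[] = [];
  for (const geography of geographies) {
    if (!known.has(geography.datasourceId)) missing.push(geography.datasourceId);
  }
  for (const group of groups) {
    for (const cls of group.classes) {
      if (!known.has(cls.datasourceId)) missing.push(cls.datasourceId);
    }
  }
  return missing;
}

// Made-up example records.
const exampleDatasources: DatasourceRecord[] = [
  { datasourceId: "nearshore", precalc: true, classKeys: [] },
  { datasourceId: "habitat", precalc: true, classKeys: ["type"] },
];
const exampleGeographies: GeographyRecord[] = [
  { geographyId: "nearshore", datasourceId: "nearshore", precalc: true },
];
const exampleGroups: MetricGroupRecord[] = [
  {
    metricId: "benthicHabitat",
    classes: [
      { classId: "reef", datasourceId: "habitat", objectiveId: "protect-20" },
      { classId: "seagrass", datasourceId: "seagrassBeds" },
    ],
  },
];

console.log(missingDatasourceIds(exampleDatasources, exampleGeographies, exampleGroups));
```

The point of the sketch is the direction of the references: geographies and metric groups refer to datasources by id, so a datasource has to exist before anything that depends on it.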
Examples of more complex projects:
- California - multiple planning geographies, worker functions
- Bermuda - IUCN classification system with metrics calculated overall, per protection level, and per sketch. worker functions
- Blue Azores nearshore - user switching between planning geographies.
- Samoa Reports
- Azores Nearshore Reports.
Create SeaSketch Project
First, follow the instructions to create a new SeaSketch project. This includes defining the planning bounds and creating a Sketch class. You will want to create a Polygon sketch class with a name that makes sense for your project (e.g. MPA for Marine Protected Area) and then also a Collection sketch class to group instances of your polygon sketch class into. Note that sketch classes are where you will integrate your geoprocessing services to view reports, but you will not do that at this time.
Initialize New Project
Start by initializing a new project:
cd /workspaces
npx @seasketch/geoprocessing@7.0.0-experimental-7x-simplify.67 init 7.0.0-experimental-7x-simplify.67
Tips:
- the answers to all of the init questions can be changed later, so don't worry if you don't know the answer.
- SeaSketch uses a BSD-3 license (the default choice). You can choose any license, including UNLICENSED, meaning proprietary or "All rights reserved".
- The most common AWS regions are us-west-1 and us-east-2. Choose the region closest to your project.
Learn more about your project's structure
Link Data Into Workspace
Choose how to bring data into your workspace.
Import Datasources
Methods:
- Use import:data
- Manually prepare and copy your data to the datasets bucket
Write a Geoprocessing Function
Let's start with src/functions/simpleFunction and build it up to use a datasource.
Methods:
- Directly import geojson file in function
- Use datasource record and getDatasource and getFeatures
- Load from local bucket using load function, url, and bbox
- Load from third-party using load function, url, and bbox
If the data you'll use in analysis is already published online, publicly accessible, and in FlatGeobuf or cloud-optimized GeoTIFF format, then you can access it directly.
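The url-and-bbox methods above rely on fetching only the data near the sketch. The following sketch shows one way to derive a bounding box from a polygon sketch and test it against a dataset's extent. It is a standalone illustration: the PolygonSketch type and both helper names are assumptions, and it does not call the framework's load function, whose signature is not shown in this tutorial.

```typescript
type Position = [number, number]; // [longitude, latitude]
type BBox = [number, number, number, number]; // [minX, minY, maxX, maxY]

// Hypothetical minimal shape of a polygon sketch (a GeoJSON Feature).
interface PolygonSketch {
  type: "Feature";
  properties: Record<string, unknown>;
  geometry: { type: "Polygon"; coordinates: Position[][] };
}

// Scan every vertex of every ring and track the extremes.
function sketchBBox(sketch: PolygonSketch): BBox {
  let minX = Infinity;
  let minY = Infinity;
  let maxX = -Infinity;
  let maxY = -Infinity;
  for (const ring of sketch.geometry.coordinates) {
    for (const [x, y] of ring) {
      if (x < minX) minX = x;
      if (y < minY) minY = y;
      if (x > maxX) maxX = x;
      if (y > maxY) maxY = y;
    }
  }
  return [minX, minY, maxX, maxY];
}

// Two boxes overlap when they overlap on both axes (touching edges count).
function bboxesOverlap(a: BBox, b: BBox): boolean {
  return a[0] <= b[2] && b[0] <= a[2] && a[1] <= b[3] && b[1] <= a[3];
}
```

A bounding box is a cheap first filter. Features it returns still need a precise overlay against the sketch geometry.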
Smoke Test With Examples
Methods to generate examples:
- genRandomPolygon
- geojson.io
- export sketch geojson from SeaSketch project
Assuming you have a SeaSketch project with a Polygon sketch class, follow the instructions for sketching tools to draw one or more polygon sketches. You can also create a collection and group your sketches into the collection.
Finally, export your sketches and sketch collections as GeoJSON, and move them into your geoprocessing project's examples/sketches folder.
/examples/
sketches/ # <-- examples used by geoprocessing functions
features/ # <-- examples used by preprocessing functions
Once you add your example sketches and collections to this folder, run your smoke tests.
npm run test
The smoke test for your geoprocessing function will run the function against every sketch example, whether a single Sketch or a SketchCollection, and output the results to examples/output. Review this output to make sure it is what you expect.
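Conceptually, a smoke test is a loop over examples. The sketch below mimics that loop in memory; it is not the framework's test harness. The types, the sketchCount stand-in function, and runSmokeTest are all hypothetical names, and results are collected in an object instead of being written to examples/output.

```typescript
// Hypothetical minimal shapes: a Sketch is a Feature, a SketchCollection
// is a FeatureCollection of Sketches.
interface Sketch {
  type: "Feature";
  properties: { name: string };
  geometry: null;
}
interface SketchCollection {
  type: "FeatureCollection";
  properties: { name: string };
  features: Sketch[];
}
type Example = Sketch | SketchCollection;

// Stand-in for a geoprocessing function: reports how many sketches it saw.
function sketchCount(input: Example): { sketchCount: number } {
  return { sketchCount: input.type === "FeatureCollection" ? input.features.length : 1 };
}

// Run the function against every example and key the results by example name.
function runSmokeTest<T>(examples: Example[], fn: (example: Example) => T): Record<string, T> {
  const output: Record<string, T> = {};
  for (const example of examples) {
    output[example.properties.name] = fn(example);
  }
  return output;
}
```

Because the same function receives both single sketches and collections, it is worth keeping at least one example of each in the examples folder.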
Learn more about testing and debugging in the testing guide.
Write Report Client
Build and Deploy to AWS
Debugging build failure
If the build step fails, read the error message to work out what went wrong. Did it fail while building the functions or the clients? Most of the time you should be able to catch these errors sooner: if VSCode finds invalid TypeScript code, it warns you with files marked in red in the Explorer panel, or with red markers and squiggly underlines in the files themselves.
If you're still not sure try some of the following:
- Run your smoke tests, see if they pass
- When was the last time your build succeeded? The error is caused by a change made since then: in your project code, by upgrading your geoprocessing library version without fully migrating, or by a change on your system.
- You can stash your current changes or commit them to a branch so they are not lost. Then sequentially check out previous commits until you find one that builds properly. The next commit after that one caused the build error.
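The commit-by-commit search in the last step can be sketched as a small function. This is only a model of the reasoning: findBreakingCommit and the builds callback are hypothetical, and in practice "builds" means checking out the commit and running your build by hand.

```typescript
// Walk commits from newest to oldest. The first commit that builds is the
// last good one, so the commit checked just before it (the next newer one)
// introduced the failure.
function findBreakingCommit(
  newestFirst: string[],
  builds: (commit: string) => boolean,
): string | undefined {
  let newer: string | undefined;
  for (const commit of newestFirst) {
    if (builds(commit)) return newer; // undefined if the newest commit already builds
    newer = commit;
  }
  return undefined; // no commit in the list builds, so the culprit is older
}
```

For long histories, git bisect automates the same search using a binary search over commits.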
Connect to SeaSketch Project and Test
Choose clipToOcean as preprocessor
Choose MpaTabReport as report client
Test different sketch and collection scenarios. When you find one that errors or does something unexpected, export that sketch to your project's examples/sketches directory and run your smoke tests. If they succeed and produce the expected output, load your storybook and see if you can reproduce the problem in your report client.
Expand and Iterate
There are more advanced features available if you need them.
Project Client
The project client has many shortcut methods for working with datasources, geographies, precalc metrics, objectives, etc. It's not meant to be a black box; you can look at what it does.
[Link to project client ]
Configure Geography
Precalc Metrics
At the very least you should import your planning boundaries, preferably as individual files, or as individual layers within a file package.
You can use any file-based format that OGR and GDAL support out of the box.
npm run precalc:data
? Do you want to precalculate only a subset?
Yes, by datasource
Yes, by geography
Yes, by both
❯ No, just precalculate everything (may take a while)
What's happening is that the precalc script starts a local web server on port 8001 that serves up the datasources in data/dist.
The precalc script then gets all your project datasources with precalc: true and all your project geographies with precalc: true, and calculates area, sum, and count metrics for each combination of datasource and geography.
Once complete, project/precalc.json will have been updated with the new metric values.
If your datasource has classKeys defined in its record, precalc will also calculate area, sum, and count for each unique class value found within the classKey.
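The combination logic described above can be sketched as follows. This only enumerates which metrics would be calculated; it does not compute any values and is not the precalc script itself. The record shapes, the PlannedMetric type, and planPrecalc are assumptions made for illustration, and per-class metrics from classKeys are left out for brevity.

```typescript
// Hypothetical, simplified records: only the fields this sketch needs.
interface PrecalcDatasource {
  datasourceId: string;
  precalc: boolean;
}
interface PrecalcGeography {
  geographyId: string;
  precalc: boolean;
}
interface PlannedMetric {
  metricId: "area" | "sum" | "count";
  datasourceId: string;
  geographyId: string;
}

// One area, sum, and count metric per (datasource, geography) pair,
// skipping any record that does not have precalc set to true.
function planPrecalc(
  datasources: PrecalcDatasource[],
  geographies: PrecalcGeography[],
): PlannedMetric[] {
  const stats = ["area", "sum", "count"] as const;
  const planned: PlannedMetric[] = [];
  for (const ds of datasources.filter((d) => d.precalc)) {
    for (const geog of geographies.filter((g) => g.precalc)) {
      for (const metricId of stats) {
        planned.push({ metricId, datasourceId: ds.datasourceId, geographyId: geog.geographyId });
      }
    }
  }
  return planned;
}
```

The number of planned metrics grows with the product of datasources and geographies, which is why the subset options in the prompt above are useful on larger projects.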
You must re-run precalc:data every time you change a geography record or a datasource.
- To learn more advanced use, see the precalc guide.
- To learn more about use of precalculated metrics, see the report client guide.
Metric Groups
How you intend to use your data will determine what form the data needs to be in.
Examples By Use Case
- Do you have vector data?
  - Does it have a single data class?
    - Is it one file with one data class?
    - Is it one file with multi-class attribute, of which you only need one?
      - create a new dataset with
  - Does it have multiple data classes?
    - Is it one data class per file?
    - Is it one data class per layer within file?
    - Does it have multiple data classes within one layer with an attribute to differentiate them?
- Do you have raster data?
  - Does it have a single data class?
    - Is it one file with one data class?
  - Does it have multiple data classes?
    - Is it one file, one data class per raster band?
    - Is it multiple files, one data class per file?
    - Is it a categorical raster with unique cell value for each class?
[ToDo: provide metric group example for each leaf in tree]
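As a partial illustration of one leaf in the tree (vector data with multiple data classes in one layer, differentiated by an attribute), the sketch below tallies features per class value. It is a plain TypeScript example, not a MetricGroup definition; the ClassedFeature type, countByClass, and the habitat attribute used in the usage note are hypothetical.

```typescript
// Hypothetical minimal feature: only the attribute table matters here.
interface ClassedFeature {
  properties: Record<string, string | number | null>;
}

// Count features for each unique value of the class attribute.
// Features missing the attribute (or with a null value) are skipped.
function countByClass(features: ClassedFeature[], classKey: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const feature of features) {
    const value = feature.properties[classKey];
    if (value === undefined || value === null) continue;
    const classId = String(value);
    counts[classId] = (counts[classId] ?? 0) + 1;
  }
  return counts;
}
```

For example, calling countByClass(features, "habitat") on a layer with two reef features and one seagrass feature yields one entry per class, which maps naturally onto one data class per unique attribute value.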
Create Report
- Edits to the statistic you want calculated (e.g. calculating an average instead of a sum) should happen in your function (src/functions/benthicHabitat.ts).
- Edits to the way the analytics are displayed (e.g. changing labels, converting units, adding text context) should happen in your component (src/components/BenthicHabitat.tsx).
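The split between calculation and display can be sketched with two small helpers. Both are hypothetical and stand in for code that would live in the function file and the component file respectively; they are not taken from benthicHabitat.ts or BenthicHabitat.tsx.

```typescript
type Statistic = "sum" | "mean";

// Function side: decides WHAT is calculated. Switching from a sum to an
// average is a change here. An empty input yields 0 for either statistic.
function summarize(values: number[], stat: Statistic): number {
  if (values.length === 0) return 0;
  const total = values.reduce((acc, v) => acc + v, 0);
  return stat === "sum" ? total : total / values.length;
}

// Component side: decides HOW the value is shown. Unit conversion and
// labels change here without touching the calculation.
function formatAreaSqKm(squareMeters: number): string {
  return `${(squareMeters / 1_000_000).toFixed(1)} km²`;
}
```

Keeping raw values (here, square meters) in the function output and converting only at display time means a change in units or wording does not require re-running the analysis.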
Language Translation
Language translation takes effort to maintain. We suggest getting your reports close to final in English, and then adding translations.
What Next
Still have more questions? Start a discussion on GitHub.