1. Unpack Faces Data

Unpack faces data needed for xwalk production

mkdir ~/workspace/postgis_tiger/otm_tiger_current
cd ~/workspace/postgis_tiger/otm_tiger_current
ln -s /data/production/prod/current/extiger/us/tl_2019*faces*gz .
gunzip *.gz -f
cd ..

2. Set environment variables

This file defines a standard set of environment variables used by the geography processing modules. In order to keep distinct vintages separate there exists a schema prefix which will be appended to the beginning of the postgresql schema. For example, if 'otm2018' is set as the schema prefix, data will be stored in 'otm2018public', 'otm2018final', 'otm2018names' and so on. Set the new schema prefix and environment variables prior to running any new vintage as it will drop/overwrite existing schemas and tables.

vi 00_setenvvars.bash

3. Load faces

runtime: 3 hours

This program loads the faces or facesah shape and dbf files into the prefixed public postgres schema. All files in the provided directory that match the form of faces.shp or facesah.dbf will be loaded.

time ./01_facesloader.bash /path/to/files faces

4. Format faces for crosswalking

runtime: 3.5 hours

This program takes the raw faces data in the prefixed public schema and formats the fields into crosswalkable identifiers. Fields are given 9 flags and state fips codes if necessary.

time ./02_xwalk.bash faces

5. Unpack shapefile data

unpack the shape data needed

cd ~/workspace/postgis_tiger/otm_tiger_current
ln -s /data/production/prod/current/extiger/us/tl_2019*{aiannh,aitsn,anrc,arealm,bg,cbsa,cd116,county,cousub,elsd,mil,necta,place,pointlm,rails,scsd,sldl,sldu,state,tract,unsd,zcta510}*gz .
gunzip *.gz -f
cd ..

6. Load shp→ db

runtime: 30 minutes

This program loads TIGER shapefile into the public schema. Providing a path (/path otm) and type will load all application files expected for that type. Providing a path and single layer will load a single layer.

time ./04_shapeloader.bash otm_tiger_current otm

7. Get WIB csv

unpack the wib csv data

mkdir wib_csvs
cd wib_csvs
ln -s /data/production/prod/current/exwib/us/*2019.csv.gz .
gunzip *.gz -f
chmod u+w *.csv
cd ..

8. Load WIB Defs

runtime: 10 seconds

This program loads the wib csv definitions into the working.wib_definitions table. This process will also correct expected encoding issues in the wib csv definitions.

time ./05_wibcsvloader.bash path/to/wib_csvs/

9. Create public WIBs

runtime: 1 minute

This program creates a wib table of the component parts from county, county sub, and place tables in the prefixed public schema. A scratch table called working.wib_geometry is also used.

time ./06_wibshape.bash wib

10. Validate Schemas

This program validates the table schemas in postgres vs the csv schemas in the ./schemas directory. The postgis tables checked are in the otmgeo database. The schemas verified are of raw tables only - those loaded in the public schema. Check that the output lists the expected tables as "validated". Any errors that occur here will require manual adjustment of downstream processes.

./07_checkschemas_qa.bash [otm|qwi]

11. Project Geography

runtime: process dependent

This program projects tables in the public schema into new tables in the working schema. All non-geomtry data is copied over unchanged. A set of predefined tables and projections are associated with the 'otm' and 'qwi' keywords but single layers can be projected as needed. See the program help instructions for more info.

time ./08_project.bash [otm|qwi]

12. Remove undefined

runtime: 10 seconds

This program removes undefined congressional districts, state legislative districts (upper and lower), and county subdivisions

time ./09_removeundefined.bash all

13. Remove unpopulated

runtime: 6 minutes

This program removes unpopulated and unused areas from the geography tables, either all tables or individual. Guam, American Samoa, and the volcanic Hawiian chain are removed. A working.invalid table is created that contains all of the geographies to be deleted. This table persists so it can be used to delete elements from the tables that created it.

The block definitions for the AK islands is now required. Located at: ./input_data/ak_island_blocks.csv

time ./10_removeunpopulated.bash otm

14. Create Schools

runtime: 20 seconds

This program creates the schools layer from the unsd and elsd tables in the prefixed working schema

./11_createschools.bash schools

15. Format names

runtime: 1 second

This program adjusts names on individual layers. The current iteration only makes adjustments to the mil layer.

./12_formatnames.bash mil

16. Crosswalk

runtime: 3 days

This program creates the takes the faces data and chooses the centermost face on a block to use for an internal point. The result data, block_xwalk_intpt in the prefixed working schema, represents the block level data. All non-centermost faces have been deleted in this file.

time ./13_intptxwalk.bash xwalk

runtime: 20 seconds

17. CD Block Splits

The crosswalk assigns higher level geographies based on the centermost face. Congressional districts get special handling because of their political import. Geo produces a pdf file that specifies a handful of assignments that we use wholesale.
DO NOT USE THE FULL CSV FILE OF BLOCK ASSIGNMENTS

Verify that this is the correct congress session, then download the CD116 Block Splits pdf file and convert to a csv of the form:

State Name State Code County Name County Code Tract Code Block CD’s within Block Data Tabulated to CD

Kentucky

21

Harrison

097

950100

1068

4,6

4

Kentucky

21

Harrison

097

950100

1070

4,6

4

Kentucky

21

Harrison

097

950400

2024

4,6

6

Kentucky

21

Harrison

097

950400

2032

4,6

6

Kentucky

21

Jefferson

111

010316

1013

3,4

4

Use the following commands to load and patch the crosswalk with the updated congressional districts.

time ./14_assigncdxwalk.bash path/to/CD.csv

18. Create the formatted names

runtime: 10 days

Create the formatted names for all OnTheMap layers.

time ./15_createnames.bash otm

19. Assign exoids to layers

Take the exoids from the crosswalk and update the references in each layer table

runtime: 45 minutes

time ./16_assignexoid.bash all

20. Generate water mask

Generate the coastal mask from the difference between the CBF and TIGER.

The CBF layer is hardcoded as it’s not released on the same schedule as tiger. Verify the most recent CBF release and update the year reference in the program below if necessary.
time ./17_generatecbfwatermask.bash otm 

21. Finalize the layer tables

Create a final version of all polygon layer tables in the expected layout.

time ./18_finalizegeo.bash otm

22. Export the Crosswalk CSV

Create a directory to store the final crosswalk csv and export all relevant columns. There are a few different versions of the crosswalk available. Choose as needed.

time ./19_exportxwalks.bash /some/folder blk_xwalk_wide

23. Create high, noncostal water reference

This program creates the high level, low resolution water features for OnTheMap. This is a new layer that will be a mix of water and polygon features

time ./20_highlevelwater.bash highlevel

24. Load and project areawater files

Unpack the areawater shapefiles

mkdir ~/workspace/postgis_tiger/areawaterfiles
cd ~/workspace/postgis_tiger/areawaterfiles
ln -s /data/production/prod/current/extiger/us/tl_2018*areawater*gz .
gunzip *.gz -f
cd ..

Load and project the areawater shapefiles in the water schema

time ./20_lowlevelwater.bash ~/workspace/postgis_tiger/areawaterfiles

25. Intermediate water layer

25.1. Load data

Load the necessary polygons into the water schema. This will also begin the processing of the midlevel coastline data that needs manual editing.

time ./22_midlevelwater.bash midlevel

25.2. Make manual edits

  1. remove all standalone line segments (look at the attribute table) and then run "unsplit lines". (TODO) This step can be automated in the future by removing lines with no coincident points at the start or end of a segment.

  2. use editor tools to merge endpoints that should be connected as needed then rerun "unsplit lines" WARNING: At this point there should be 5 linesegments - AK, Atlantic, Pacific, Great Lakes, and one small feature near Seattle

  3. create a continental boundary from the us_states by disolving all features

  4. convert the continental boundary to a line feature

  5. hand edit the coastal boundary so that all segments cross the continental boundary edges

  6. run feature to polygon on the continental lines and hand edited coast line features

  7. delete the state of AK, the CONUS, and that little area in seattle.

  8. repair geometry on both features to remove self intersections

  9. 'erase' the water.coastline_polygons (islands) from the coastline mask

  10. to create the intermediate water merge in the USGS small scale waterbody features and dissolve

  11. then clip by the US boundary just to be safe

  12. union layer to merge neighbors if exists

  13. finally intersect with the state boundaries just so individual polygons aren’t too big

  14. ftp back to database server

Run the following command to load the shapefile back into the database and to drop unnecessary columns

time ./22_midlevelwater_2.bash /path/to/intermediate_water.shp

26. Create the ancillary layers

Open the bash program below and verify that all URLs listed near the top of the table are still available and the most recent. Once complete, process all layers.

time ./23_ancillarylayers.bash all
Check that all of the following tables appear in the landmarks schema. On occasion a filename changes which causes a failure in processing a layer
  1. greenspace

  2. airport_area

  3. airport_point

  4. amtrak_stations

  5. rails

  6. univ_ccol

  7. nces_schools

  8. onestop

27. Set table ownership

Once all layers are complete, hand ownership of tables over to the otmgeo user

./00_setpermissions.bash

28. Delivery

28.1. Xwalks

The wide and data versions of the xwalk will need to be delivered to production. Ask staff for delivery location and rsync.

./19_exportxwalks.bash ./path/to/direcoty blk_xwalk_wide
./19_exportxwalks.bash ./path/to/direcoty blk_xwalk_data

28.2. Watermaps

The production system uses flagged geoids to determine which locations are ineligible for residence considerations. These are generated and delivered to production along with the xwalks.

./25_xwalksaswatermaps.bash ./path/to/direcoty all

28.3. Shapefiles

The polygon files, landmarks, and water files will need to be delivered offsite for application development. Export them and rsync to the dropbox.

./24_export.bash ./path/to/direcoty otm
./24_export.bash ./path/to/direcoty landmarks
./24_export.bash ./path/to/direcoty water