1. Unpack Faces Data
Unpack faces data needed for xwalk production
mkdir ~/workspace/postgis_tiger/otm_tiger_current
cd ~/workspace/postgis_tiger/otm_tiger_current
ln -s /data/production/prod/current/extiger/us/tl_2019*faces*gz .
gunzip *.gz -f
cd ..
2. Set environment variables
This file defines a standard set of environment variables used by the geography processing modules. In order to keep distinct vintages separate there exists a schema prefix which will be appended to the beginning of the postgresql schema. For example, if 'otm2018' is set as the schema prefix, data will be stored in 'otm2018public', 'otm2018final', 'otm2018names' and so on. Set the new schema prefix and environment variables prior to running any new vintage as it will drop/overwrite existing schemas and tables.
vi 00_setenvvars.bash
3. Load faces
runtime: 3 hours
This program loads the faces or facesah shape and dbf files into the prefixed public postgres schema. All files in the provided directory that match the form of faces.shp or facesah.dbf will be loaded.
time ./01_facesloader.bash /path/to/files faces
4. Format faces for crosswalking
runtime: 3.5 hours
This program takes the raw faces data in the prefixed public schema and formats the fields into crosswalkable identifiers. Fields are given 9 flags and state fips codes if necessary.
time ./02_xwalk.bash faces
5. Unpack shapefile data
unpack the shape data needed
cd ~/workspace/postgis_tiger/otm_tiger_current
ln -s /data/production/prod/current/extiger/us/tl_2019*{aiannh,aitsn,anrc,arealm,bg,cbsa,cd116,county,cousub,elsd,mil,necta,place,pointlm,rails,scsd,sldl,sldu,state,tract,unsd,zcta510}*gz .
gunzip *.gz -f
cd ..
6. Load shp→ db
runtime: 30 minutes
This program loads TIGER shapefile into the public schema. Providing a path (/path otm) and type will load all application files expected for that type. Providing a path and single layer will load a single layer.
time ./04_shapeloader.bash otm_tiger_current otm
7. Get WIB csv
unpack the wib csv data
mkdir wib_csvs
cd wib_csvs
ln -s /data/production/prod/current/exwib/us/*2019.csv.gz .
gunzip *.gz -f
chmod u+w *.csv
cd ..
8. Load WIB Defs
runtime: 10 seconds
This program loads the wib csv definitions into the working.wib_definitions table. This process will also correct expected encoding issues in the wib csv definitions.
time ./05_wibcsvloader.bash path/to/wib_csvs/
9. Create public WIBs
runtime: 1 minute
This program creates a wib table of the component parts from county, county sub, and place tables in the prefixed public schema. A scratch table called working.wib_geometry is also used.
time ./06_wibshape.bash wib
10. Validate Schemas
This program validates the table schemas in postgres vs the csv schemas in the ./schemas directory. The postgis tables checked are in the otmgeo database. The schemas verified are of raw tables only - those loaded in the public schema. Check that the output lists the expected tables as "validated". Any errors that occur here will require manual adjustment of downstream processes.
./07_checkschemas_qa.bash [otm|qwi]
11. Project Geography
runtime: process dependent
This program projects tables in the public schema into new tables in the working schema. All non-geomtry data is copied over unchanged. A set of predefined tables and projections are associated with the 'otm' and 'qwi' keywords but single layers can be projected as needed. See the program help instructions for more info.
time ./08_project.bash [otm|qwi]
12. Remove undefined
runtime: 10 seconds
This program removes undefined congressional districts, state legislative districts (upper and lower), and county subdivisions
time ./09_removeundefined.bash all
13. Remove unpopulated
runtime: 6 minutes
This program removes unpopulated and unused areas from the geography tables, either all tables or individual. Guam, American Samoa, and the volcanic Hawiian chain are removed. A working.invalid table is created that contains all of the geographies to be deleted. This table persists so it can be used to delete elements from the tables that created it.
The block definitions for the AK islands is now required. Located at: ./input_data/ak_island_blocks.csv
time ./10_removeunpopulated.bash otm
14. Create Schools
runtime: 20 seconds
This program creates the schools layer from the unsd and elsd tables in the prefixed working schema
./11_createschools.bash schools
15. Format names
runtime: 1 second
This program adjusts names on individual layers. The current iteration only makes adjustments to the mil layer.
./12_formatnames.bash mil
16. Crosswalk
runtime: 3 days
This program creates the takes the faces data and chooses the centermost face on a block to use for an internal point. The result data, block_xwalk_intpt in the prefixed working schema, represents the block level data. All non-centermost faces have been deleted in this file.
time ./13_intptxwalk.bash xwalk
runtime: 20 seconds
17. CD Block Splits
The crosswalk assigns higher level geographies based on the centermost face. Congressional districts get special handling because of their political import. Geo produces a pdf file that specifies a handful of assignments that we use wholesale. |
DO NOT USE THE FULL CSV FILE OF BLOCK ASSIGNMENTS |
Verify that this is the correct congress session, then download the CD116 Block Splits pdf file and convert to a csv of the form:
State Name | State Code | County Name | County Code | Tract Code | Block | CD’s within Block | Data Tabulated to CD |
---|---|---|---|---|---|---|---|
Kentucky |
21 |
Harrison |
097 |
950100 |
1068 |
4,6 |
4 |
Kentucky |
21 |
Harrison |
097 |
950100 |
1070 |
4,6 |
4 |
Kentucky |
21 |
Harrison |
097 |
950400 |
2024 |
4,6 |
6 |
Kentucky |
21 |
Harrison |
097 |
950400 |
2032 |
4,6 |
6 |
Kentucky |
21 |
Jefferson |
111 |
010316 |
1013 |
3,4 |
4 |
Use the following commands to load and patch the crosswalk with the updated congressional districts.
time ./14_assigncdxwalk.bash path/to/CD.csv
18. Create the formatted names
runtime: 10 days
Create the formatted names for all OnTheMap layers.
time ./15_createnames.bash otm
19. Assign exoids to layers
Take the exoids from the crosswalk and update the references in each layer table
runtime: 45 minutes
time ./16_assignexoid.bash all
20. Generate water mask
Generate the coastal mask from the difference between the CBF and TIGER.
The CBF layer is hardcoded as it’s not released on the same schedule as tiger. Verify the most recent CBF release and update the year reference in the program below if necessary. |
time ./17_generatecbfwatermask.bash otm
21. Finalize the layer tables
Create a final version of all polygon layer tables in the expected layout.
time ./18_finalizegeo.bash otm
22. Export the Crosswalk CSV
Create a directory to store the final crosswalk csv and export all relevant columns. There are a few different versions of the crosswalk available. Choose as needed.
time ./19_exportxwalks.bash /some/folder blk_xwalk_wide
23. Create high, noncostal water reference
This program creates the high level, low resolution water features for OnTheMap. This is a new layer that will be a mix of water and polygon features
time ./20_highlevelwater.bash highlevel
24. Load and project areawater files
Unpack the areawater shapefiles
mkdir ~/workspace/postgis_tiger/areawaterfiles
cd ~/workspace/postgis_tiger/areawaterfiles
ln -s /data/production/prod/current/extiger/us/tl_2018*areawater*gz .
gunzip *.gz -f
cd ..
Load and project the areawater shapefiles in the water schema
time ./20_lowlevelwater.bash ~/workspace/postgis_tiger/areawaterfiles
25. Intermediate water layer
25.1. Load data
Load the necessary polygons into the water schema. This will also begin the processing of the midlevel coastline data that needs manual editing.
time ./22_midlevelwater.bash midlevel
25.2. Make manual edits
-
remove all standalone line segments (look at the attribute table) and then run "unsplit lines". (TODO) This step can be automated in the future by removing lines with no coincident points at the start or end of a segment.
-
use editor tools to merge endpoints that should be connected as needed then rerun "unsplit lines" WARNING: At this point there should be 5 linesegments - AK, Atlantic, Pacific, Great Lakes, and one small feature near Seattle
-
create a continental boundary from the us_states by disolving all features
-
convert the continental boundary to a line feature
-
hand edit the coastal boundary so that all segments cross the continental boundary edges
-
run feature to polygon on the continental lines and hand edited coast line features
-
delete the state of AK, the CONUS, and that little area in seattle.
-
repair geometry on both features to remove self intersections
-
'erase' the water.coastline_polygons (islands) from the coastline mask
-
to create the intermediate water merge in the USGS small scale waterbody features and dissolve
-
then clip by the US boundary just to be safe
-
union layer to merge neighbors if exists
-
finally intersect with the state boundaries just so individual polygons aren’t too big
-
ftp back to database server
Run the following command to load the shapefile back into the database and to drop unnecessary columns
time ./22_midlevelwater_2.bash /path/to/intermediate_water.shp
26. Create the ancillary layers
Open the bash program below and verify that all URLs listed near the top of the table are still available and the most recent. Once complete, process all layers.
time ./23_ancillarylayers.bash all
Check that all of the following tables appear in the landmarks schema. On occasion a filename changes which causes a failure in processing a layer |
-
greenspace
-
airport_area
-
airport_point
-
amtrak_stations
-
rails
-
univ_ccol
-
nces_schools
-
onestop
27. Set table ownership
Once all layers are complete, hand ownership of tables over to the otmgeo user
./00_setpermissions.bash
28. Delivery
28.1. Xwalks
The wide and data versions of the xwalk will need to be delivered to production. Ask staff for delivery location and rsync.
./19_exportxwalks.bash ./path/to/direcoty blk_xwalk_wide
./19_exportxwalks.bash ./path/to/direcoty blk_xwalk_data
28.2. Watermaps
The production system uses flagged geoids to determine which locations are ineligible for residence considerations. These are generated and delivered to production along with the xwalks.
./25_xwalksaswatermaps.bash ./path/to/direcoty all
28.3. Shapefiles
The polygon files, landmarks, and water files will need to be delivered offsite for application development. Export them and rsync to the dropbox.
./24_export.bash ./path/to/direcoty otm
./24_export.bash ./path/to/direcoty landmarks
./24_export.bash ./path/to/direcoty water