Phase 1: Residential Building Consent Dataset

INPUTS

  1. Geocoded individual consent data
    Various information (specified below under OUTPUTS) needs to be added to each observation in these two datasets.
  2. LINZ spine from phase 0

OUTPUTS

CSV file of building consents with the data fields 1-9 above. The matching process for assigning each consent to a LINZ parcel is described below. The following additional data fields are added; they are flags that describe how each consent was matched to the LINZ dataset.

  1. Ranged Address Indicator
    a. Ranged_Address_indicator: indicator (1 or 0) for a LINZ parcel that is part of a ranged-address consent
  2. Matching Type Indicators:
    a. LINZ_MATCH_CODE
    b. LINZ_2ND_MATCH_CODE

PROCEDURE FOR MATCHING CONSENTS TO LINZ PARCELS

MATCHING FOR NON-RANGED ADDRESSES:

  1. Find the LINZ parcel of the geo-coordinate of the consent.
    a. Check whether the address of the consent and the address of the LINZ parcel match, by matching on number and first word. If so, stop and set LINZ_MATCH_CODE = 1. If not, go on to 2:

  2. Find all the LINZ parcels within radius r of the geo-coordinate of the consent.
    a. Search for a match of the consent address within the set of LINZ parcels within the radius r. If a match is found, stop and set LINZ_MATCH_CODE = 2. If there is no match, go on to 3:

  7. Identify the LINZ parcel of the geo-coordinate of the consent. If the road name in the address of the LINZ parcel matches the road name of the address given in the consent dataset, set the parcel to be the LINZ parcel of the geo-coordinate of the consent. Set LINZ_MATCH_CODE = 7 and stop. If not, proceed to 8:

  8. Identify the LINZ parcel of the geo-coordinate of the consent and use this. Set LINZ_MATCH_CODE = 8 and stop. If there is no parcel under the geo-coordinate, go to step 9:

  9. Set LINZ_MATCH_CODE = 9. This indicates no match, even when using only the consent long-lat.
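
The cascade above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation. The Parcel type, the function names, and the address-parsing rules (a leading street number followed by the road name) are hypothetical. The spatial lookups (the parcel under the geo-coordinate and the parcels within radius r) are assumed to be computed elsewhere and passed in. Only the steps described in this section are covered.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Parcel:
    """Hypothetical stand-in for a LINZ parcel record."""
    parcel_id: str
    address: str  # e.g. "12 Queen Street"


def address_key(address: str) -> Optional[Tuple[str, str]]:
    """Return (number, first word after the number), or None if unparseable.

    Assumption: addresses start with a street number such as "12" or "12A".
    """
    m = re.match(r"\s*(\d+[A-Za-z]?)\s+([A-Za-z']+)", address)
    return (m.group(1).upper(), m.group(2).upper()) if m else None


def road_name(address: str) -> str:
    """Return everything after the leading street number, upper-cased."""
    return re.sub(r"^\s*\d+[A-Za-z]?\s+", "", address).strip().upper()


def match_non_ranged(
    consent_address: str,
    parcel_at_point: Optional[Parcel],
    parcels_within_r: Sequence[Parcel],
) -> Tuple[Optional[Parcel], int]:
    """Return (matched parcel or None, LINZ_MATCH_CODE)."""
    key = address_key(consent_address)

    # Step 1: the parcel under the geo-coordinate has a matching address.
    if (
        key is not None
        and parcel_at_point is not None
        and address_key(parcel_at_point.address) == key
    ):
        return parcel_at_point, 1

    # Step 2: some parcel within radius r has a matching address.
    if key is not None:
        for parcel in parcels_within_r:
            if address_key(parcel.address) == key:
                return parcel, 2

    # Codes between 2 and 7 are not covered by this sketch.

    if parcel_at_point is not None:
        # Step 7: only the road name matches the parcel under the coordinate.
        if road_name(parcel_at_point.address) == road_name(consent_address):
            return parcel_at_point, 7
        # Step 8: fall back to the parcel under the coordinate regardless.
        return parcel_at_point, 8

    # Step 9: no parcel lies under the geo-coordinate.
    return None, 9
```

Passing the spatial lookups in as arguments keeps the decision logic separate from whichever GIS tooling performs the point-in-parcel and radius queries.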

MATCHING FOR RANGED ADDRESSES:

For ranged addresses, follow the same approach for each individual address, using the same geo-coordinate for each address in step 1. However, we set r in step 2 to a larger value in order to search a wider area. If no match is found for an address at step 2, but at least one other address in the range was matched at step 1 or step 2, go to step 5. This avoids matching parcels that are far away from the rest of the group. Only if none of the addresses in the range is matched at either step 1 or step 2 do we proceed to step 3.
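
The routing rule for ranged addresses can be sketched as below. This covers only the decision about which step each address goes to after steps 1 and 2; the function name and the input layout (a mapping from each address in the range to its code so far) are assumptions made for illustration.

```python
from typing import Dict, Optional


def next_step_for_range(
    codes_after_step_2: Dict[str, Optional[int]],
) -> Dict[str, Optional[int]]:
    """Decide the next step for each address in one ranged-address consent.

    Input values are 1 or 2 for addresses matched at those steps, and None
    for addresses still unmatched after step 2. Output values are None for
    addresses that are already matched, otherwise the step to go to next.
    """
    any_matched = any(code in (1, 2) for code in codes_after_step_2.values())
    # If a sibling address matched, skip ahead to step 5; otherwise step 3.
    fallback_step = 5 if any_matched else 3
    return {
        address: (None if code in (1, 2) else fallback_step)
        for address, code in codes_after_step_2.items()
    }
```

For example, a range in which one address matched at step 1 and another did not would send the unmatched address to step 5, while a range with no matches at all would send every address to step 3.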

We include an additional flag for each ranged address that indicates the best (i.e. lowest) LINZ_MATCH_CODE for that range. For example, for a ranged address with five addresses, if at least one of them had LINZ_MATCH_CODE = 2 and all others had LINZ_MATCH_CODE > 2, then the best LINZ_MATCH_CODE would be 2. This is called LINZ_2ND_MATCH_CODE below. Each address (in the ranged address set) is assigned its own unique LINZ parcel. Each parcel is then assigned the information given in 1 through 11 below. Because the parcels share the same consent ID, we can tell that the different parcels pertain to the same consent, thereby avoiding double counting.
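
As a rough sketch of how the best code per range might be computed, the following takes one row per matched address and attaches the lowest LINZ_MATCH_CODE found among rows with the same consent ID. The row layout and the "consent_id" key are hypothetical; the actual column name for the consent ID may differ.

```python
from typing import Dict, List


def add_second_match_code(rows: List[dict]) -> List[dict]:
    """Attach LINZ_2ND_MATCH_CODE, the lowest code within each consent."""
    best: Dict[str, int] = {}
    for row in rows:
        consent_id = row["consent_id"]  # hypothetical column name
        code = row["LINZ_MATCH_CODE"]
        best[consent_id] = min(best.get(consent_id, code), code)
    # Return new rows so the input list is left unchanged.
    return [
        {**row, "LINZ_2ND_MATCH_CODE": best[row["consent_id"]]}
        for row in rows
    ]


def count_consents(rows: List[dict]) -> int:
    """Count distinct consents, so multi-parcel consents are counted once."""
    return len({row["consent_id"] for row in rows})
```

Counting distinct consent IDs rather than rows is what prevents a ranged-address consent spread over several parcels from being counted more than once.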