Map legend: Land Value / m², scaled from £0 through £150, £1,200 and £5,000 up to £20,000+
Compare the taxes you currently pay to a Land Value Tax system
| Step | Step Name | Simple Description | Technical Description |
|---|---|---|---|
| 1 | Data Loading and Initial Cleaning | Download and tidy up the four raw datasets — Land Registry sales, property addresses, energy certificates, and plot boundaries — ready for processing. | Load LR PPD, OS Open UPRN, EPC (348 LA files), and INSPIRE GML batches; standardise columns, deduplicate EPC on UPRN, batch-reproject INSPIRE to WGS84, write clean parquets and GeoPackages. |
| 2 | Address Normalisation | Reformat addresses in both the sales data and the energy certificates so they can be reliably compared and matched against each other. | Uppercase, strip punctuation, expand abbreviations (ST→STREET etc.), and concatenate component fields into a single normalised string per dataset; postcode excluded from string and used as match key. |
| 3 | Land Registry to EPC Fuzzy Match | Find the energy certificate that belongs to each property sale, so that floor area and building details can be linked to the sale price. | Two-pass match within postcode: exact string match first, then token-ratio fuzzy match on residuals; confidence threshold applied; unmatched records flagged for postcode centroid fallback. |
| 4 | Assign Coordinates | Give each property sale a precise location — either the exact address point if matched, or the centre of its postcode otherwise. | Key-join EPC UPRN to OS Open UPRN for precise lat/lon; postcode centroid from ONS Postcode Directory as fallback; coordinate type flag added to all records. |
| 5 | Attach EPC Attributes | Add the physical building details — floor area, property type, age, and energy rating — to each matched sale record. | Carry forward floor_area_m2, property_type, built_form, construction_age_band, and energy_rating from matched EPC certificates; unmatched records are flagged and assigned a floor area imputed by property type. |
| 6 | Spatial Join to INSPIRE Polygons | Work out which registered land parcel each property sits on, and calculate the size of that plot. | GeoPandas spatial join across 7 INSPIRE batch GeoPackages; assigns inspire_id and plot_area_m2 from polygon geometry; records categorised as polygon-matched, UPRN-only, or postcode-centroid. |
| 7 | Plot Area Filter | Remove implausibly tiny or very large plots from the training data, as they give unreliable signals about land value per square metre. | Lower bound ~10 m² removes artefact slivers; upper bound ~10,000 m² excludes very large sites from model training only; large polygons retained for display; thresholds set from Step 6 size distribution. |
| 8 | Building Value Model | Estimate what the building alone is worth, using prices paid in cheap areas where land adds little value. The remainder of the sale price is treated as the land’s value. | Gradient boosting model trained on North East + Wales transactions below national P30 price/m²; features: log floor area, property type, built form, age band, energy rating; no location inputs. Applied nationally: land_value_est = price − building_value_est. Flats and leaseholds excluded throughout. |
| 9 | Grid Construction | Divide England and Wales into a regular grid of 100 m × 100 m squares, clipped to the coastline. | 100 m raster constructed in British National Grid, clipped to OS land boundary; each cell assigned centroid lat/lon and a data-presence category (INSPIRE-matched, postcode fallback, or empty). |
| 10 | Grid Surface Assembly | Spread each property’s estimated land value across the grid using a distance-decay formula, blending towards a low agricultural baseline in areas with sparse sales data. | FFT convolution on 100 m raster; exponential decay h = 0.3 km, radius 3 km; Bayesian prior (W_PRIOR = 2.0, floor £2/m²); sqrt plot-area normalisation (alpha ≈ 0.5); negative LV clamped to zero but retained in denominator. |
| 11 | Grid Surface Validation | Check how accurately the finished grid predicts actual sale prices by comparing the grid’s values back against the transactions used to build it. | Reconstruct predicted_price = grid_lv × sqrt(REF_AREA × effective_plot_area) + building_value_est per transaction; report R², RMSE, MAE, MdAPE, and bias by quality band, region, and price decile. |
| 12 | PMTiles Generation | Package the grid and all land parcel boundaries into a single map tile file for the web app, covering the whole country at every zoom level. | Three-pass tippecanoe: Pass A — background grid z6–13 (no-drop); Pass B — grid + parcels z6–16 (drop-densest); Pass C — tile-join merge to land_values.pmtiles (7.56 GB). Parcels assigned LV/m² by sampling grid at polygon centroid. |
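
Steps 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's actual code. The abbreviation table (beyond ST→STREET), the 0.85 threshold, and the function names are assumptions. The standard-library `difflib.SequenceMatcher` applied to sorted tokens stands in for whatever token-ratio scorer the pipeline uses.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; the table above only names ST -> STREET.
ABBREVIATIONS = {"ST": "STREET", "RD": "ROAD", "AVE": "AVENUE"}


def normalise_address(*parts: str) -> str:
    """Uppercase, strip punctuation, expand abbreviations, join the parts."""
    text = " ".join(p for p in parts if p).upper()
    text = re.sub(r"[^A-Z0-9 ]", " ", text)  # punctuation becomes whitespace
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)


def token_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1] computed on alphabetically sorted tokens."""
    sorted_a = " ".join(sorted(a.split()))
    sorted_b = " ".join(sorted(b.split()))
    return SequenceMatcher(None, sorted_a, sorted_b).ratio()


def match_within_postcode(sale_addr, epc_addrs, threshold=0.85):
    """Two-pass match against EPC addresses sharing the sale's postcode.

    Returns (index, score). The index is None when nothing clears the
    threshold, which is where the postcode centroid fallback would apply.
    """
    # Pass 1: exact string match.
    for i, candidate in enumerate(epc_addrs):
        if candidate == sale_addr:
            return i, 1.0
    # Pass 2: fuzzy match on the residuals.
    best_index, best_score = None, 0.0
    for i, candidate in enumerate(epc_addrs):
        score = token_ratio(sale_addr, candidate)
        if score > best_score:
            best_index, best_score = i, score
    if best_score >= threshold:
        return best_index, best_score
    return None, best_score
```

Because the postcode is the match key, `epc_addrs` should hold only the normalised EPC addresses for one postcode. That keeps the fuzzy pass small and avoids matching similar street names in different towns.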
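
The arithmetic in Steps 8 and 11 is compact enough to write out. The sketch below implements the two formulas given in the table, plus a median absolute percentage error (MdAPE) helper. `REF_AREA` is not given a value in the table, so the 200 m² used here is a placeholder assumption, and the function names are illustrative.

```python
import math
from statistics import median

REF_AREA = 200.0  # hypothetical reference plot area in m²; not stated in the table


def land_value_est(price: float, building_value_est: float) -> float:
    """Step 8: residual land value = sale price minus modelled building value."""
    return price - building_value_est


def predicted_price(grid_lv: float, effective_plot_area: float,
                    building_value_est: float, ref_area: float = REF_AREA) -> float:
    """Step 11: rebuild a sale price from the grid's land value per m²."""
    land_component = grid_lv * math.sqrt(ref_area * effective_plot_area)
    return land_component + building_value_est


def mdape(actual, predicted) -> float:
    """Median absolute percentage error, in percent."""
    return 100.0 * median(abs(p - a) / a for a, p in zip(actual, predicted))
```

Under this formula, a plot whose area equals `REF_AREA` contributes exactly `grid_lv × REF_AREA` of land value, while larger plots scale with the square root of their area rather than linearly. That matches the sqrt plot-area normalisation named in Step 10.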
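
The blending described in Step 10 can be illustrated for a single grid cell. The sketch uses direct summation instead of the FFT convolution the pipeline applies, and it assumes that the prior enters as a pseudo-observation of weight `W_PRIOR` at the floor value. The table does not spell out that form, so treat it as one plausible reading. The constants come from the table; the function name and input layout are hypothetical.

```python
import math

H_KM = 0.3        # exponential decay length from Step 10
RADIUS_KM = 3.0   # kernel cut-off radius from Step 10
W_PRIOR = 2.0     # prior weight from Step 10
FLOOR_LV = 2.0    # prior (floor) land value in £/m² from Step 10


def cell_land_value(cell_xy, samples):
    """Distance-weighted land value for one cell, shrunk towards the floor.

    cell_xy: (x_km, y_km) of the cell centroid.
    samples: iterable of (x_km, y_km, lv_per_m2) for nearby transactions.
    """
    numerator = W_PRIOR * FLOOR_LV
    denominator = W_PRIOR
    for x, y, lv in samples:
        distance = math.hypot(x - cell_xy[0], y - cell_xy[1])
        if distance > RADIUS_KM:
            continue  # outside the kernel radius
        weight = math.exp(-distance / H_KM)
        # Negative land values are clamped to zero in the numerator,
        # but their weight still counts in the denominator.
        numerator += weight * max(lv, 0.0)
        denominator += weight
    return numerator / denominator
```

With no transactions in range the cell returns the £2/m² floor, and each additional nearby sale pulls the estimate further from it. The FFT approach computes the same weighted sums for every cell at once, which is what makes a national 100 m grid tractable.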