Archived Engagement Document

Scope of Work & Approach

From proof of concept to production analytics platform — ongoing data consultancy for WeBuyVintage.

Prepared for Liam Vasey & Andrew Wood, Co-Founders, The Antiques Collective Ltd, April 2026

1. Executive Summary

We ran a proof-of-concept sprint against your Google Sheets data. It confirmed that your data supports all the models we proposed. The PoC gave us directional answers; this document sets out the scope for building it properly.

This isn't a one-off software build. It's an ongoing data consultancy: the models improve as more events happen, and your team gets a self-serve AI tool they can ask questions to in plain English. Everything is hosted at majiai.co with password-protected access.

Immediate priority: what makes a great area, and where haven't you been yet? You've visited 80-95% of your target areas. The most valuable work right now is finding the remaining opportunities and working out when to go back to places you already know.


2. What We've Already Proven

During the PoC sprint we ingested all 59 Google Sheets tabs into a structured database and built prototype versions of all five original models. Key findings:

  • Commodity normalisation is essential. Gold is ~80% of revenue and has risen 2.4x since 2023. Raw revenue numbers are misleading. We built a blended commodity index to strip out price movement and see real operational performance.
  • Seasonality is real but moderate. Commodity-normalised, Jan/Feb peak at 1.26-1.29; July trough at 0.77. Worth scheduling around, but not the biggest lever.
  • Diminishing returns vary by area. Some areas hold up well through 4-5 visits; others drop off sharply after the 2nd. The repeat timing model captures this per-area.
  • Dealer performance is measurable once you control for area quality. We found clear tiers and category specialisms across 50 dealers.
  • 859 white-space opportunities identified in postcode districts you've never visited, next to your strongest-performing areas.
PoC status: All five prototype models are running and queryable. They validate the approach but have not been backtested, cross-validated, or tuned for production use. The work below takes them from directional to reliable. Full charts, dealer rankings, and white-space lists in the PoC Analysis.

3. Scope of Work

A. What Makes a Great Area & Where to Go Next (Priority)

Delivered first. This is the most valuable piece of work right now, and it becomes the foundation for Model 3 (Area Intelligence), which keeps the scoring and target lists up to date as new event data comes in.

What Makes a Great Area

"What do our 100 best-performing areas have in common?"

  • Statistical analysis of top vs bottom quartile areas across all measurable characteristics
  • Sweet spot identification: population size, demographics, wealth indicators, community character, housing age, competition density
  • First-visit baseline analysis. Normalise margin per customer on visit #1 to find the "gold seam" independent of commodity prices
  • Profit tree per customer (median margin/customer, not averages, to avoid outlier distortion)
  • How performance evolves across visit types (1st, 2nd, 3rd, 4th+). Do good areas start good or always get better?
  • External data enrichment: ONS demographics, house prices, population age profiles, deprivation indices, Mosaic.tech customer profiling
  • Scoring model that rates any UK postcode 0-100 for roadshow potential
Core insight from Liam: "They're not selling gold, they're selling everything else. The gold comes with it." The best areas are market towns (2-20k pop) with strong community spirit, conservative demographics, and old money or former wealth. Wisbech is a good example: it used to have money, and the stock is still there even though the area is now poorer. Smaller villages (~2k) benefit from network effects where everyone knows what's happening; in 6k+ areas, that effect disappears.
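To make the 0-100 scoring model concrete, here is a minimal sketch of how normalised area features might combine into a single score. The feature names, weights, and values are illustrative placeholders for this document, not the fitted model, which would be learned from the top-vs-bottom quartile analysis above.

```python
# Illustrative sketch of a 0-100 postcode scoring model.
# Feature names, weights, and values are placeholders, not fitted outputs.

# Hypothetical normalised features in [0, 1] for one postcode district.
features = {
    "population_fit": 0.9,   # closeness to the 2k-20k sweet spot
    "wealth_index": 0.7,     # house prices / "old money" proxy
    "community_score": 0.8,  # market-town / village character
    "competition": 0.3,      # density of competing buyers (lower is better)
}

weights = {
    "population_fit": 0.35,
    "wealth_index": 0.30,
    "community_score": 0.25,
    "competition": -0.10,    # competition subtracts from the score
}

def score_postcode(feats, wts):
    """Weighted sum of normalised features, scaled and clamped to 0-100."""
    raw = sum(wts[k] * feats[k] for k in wts)
    max_positive = sum(w for w in wts.values() if w > 0)
    return round(max(0.0, min(1.0, raw / max_positive)) * 100, 1)

print(score_postcode(features, weights))  # one number per postcode, 0-100
```

The production version would replace the hand-set weights with coefficients estimated from historical event performance, but the shape of the output is the same: one comparable number per UK postcode.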

New Area Target Lists

"Where should we go that we've never been?"

  • List 1: Towns & villages - populations 2,500-20,000 across England & Wales. Ranked high to low priority, spread geographically. Includes: area name, county, local population, predicted score.
  • List 2: City suburbs - 200 suburbs within large towns and cities (30,000+ population) that haven't been visited, spread nationwide. Includes: suburb, parent town/city, county, suburb population, city population.
Data sources: ONS Census 2021, National Statistics Postcode Lookup, OS Open Data, Mosaic.tech consumer segmentation, Index of Multiple Deprivation, house price data from Land Registry.

B. Analytical Models (Ongoing)

Eight models that refresh as new data comes in. The original five from the proposal, evolved with PoC learnings, plus three new ones identified during discovery.

Model 1: Event Repeat Timing Engine

"When should we go back, and is it still worth going back?"

  • Commodity-normalised revenue analysis per area per visit number
  • Optimal cooldown period per area, factoring in remaining target counts, dealer capacity, and previous performance
  • Diminishing returns curves per area, so you know when to stop going back
  • Seasonality adjustment (within-year index, stripped of commodity and growth trends)
  • Rolling calendar showing which areas are ready for a return visit this month
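The diminishing-returns bullet above can be sketched in a few lines: given commodity-normalised revenue per visit number for one area, find the last visit before returns fall below a stop threshold. The revenue figures and the 40% threshold here are illustrative assumptions, not calibrated values.

```python
# Sketch of a diminishing-returns check for one area.
# Revenue figures and the stop threshold are illustrative.

visit_revenue = [9000, 7200, 6500, 3100, 1500]  # visits 1..5, normalised £
stop_threshold = 0.40  # stop once a visit yields < 40% of visit 1

def last_worthwhile_visit(revenues, threshold):
    """Return the last visit number before revenue drops below the threshold."""
    baseline = revenues[0]
    for visit, rev in enumerate(revenues, start=1):
        if rev < baseline * threshold:
            return visit - 1  # the previous visit was the last worthwhile one
    return len(revenues)

print(last_worthwhile_visit(visit_revenue, stop_threshold))
```

In production the curve is fitted per area and combined with the seasonality index, but the decision it drives is this simple: keep the area on the rolling calendar or retire it.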

Model 2: Dealer Performance Ranking

"Who are our best dealers, controlling for where they work?"

  • Composite score: location-adjusted performance, consistency, margin %, volume, category expertise
  • Category expertise mapping per dealer (gold, silver, watches, cameras, medals, coins, jewellery)
  • Trend detection so you spot declining performance early
  • Keeping all dealers within an acceptable performance band is the baseline; the key long-term metrics are customer satisfaction and repeat rates

Model 3: Area Intelligence & Postcode Scoring

"Where are our best customers, and where's the untapped demand?"

This is the ongoing version of the priority area analysis above. The Week 1 target lists and scoring model are the first output. Model 3 then keeps everything current as new events happen.

  • Postcode scoring model refreshed with each new batch of event data
  • Demand heatmaps and spend quality segmentation (platinum / gold / silver / bronze)
  • White space tracking, re-ranked as you visit new areas and results come in
  • Target numbers per week per region based on dealer days available
  • Regional benchmarking across all 17+ operational regions

Model 4: Dealer Capacity Planning

"Are we using our dealer network efficiently?"

  • ~35 active dealers, 4 days/week each, 2 per venue = ~220 realistic events/month (~78% utilisation)
  • Regional supply vs demand mapping. Where is capacity the constraint?
  • Revenue impact modelling for adding a dealer in each region
  • Travel radius and availability modelling per dealer
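The headline capacity figure in the first bullet is simple arithmetic, reproduced below so the assumptions are explicit. The ~4 working weeks per month is our simplifying assumption; the dealer count, days per week, dealers per venue, and utilisation figure come from the bullet above.

```python
# Back-of-envelope dealer capacity, reproducing the headline numbers above.
# Assumes ~4 working weeks per month; other inputs are from the PoC.

dealers = 35           # active dealers
days_per_week = 4      # working days each
weeks_per_month = 4    # simplifying assumption
dealers_per_venue = 2

dealer_days = dealers * days_per_week * weeks_per_month   # 560 dealer-days
max_events = dealer_days // dealers_per_venue             # 280 events/month
utilisation = 0.78
realistic_events = round(max_events * utilisation)        # ~218, i.e. ~220

print(max_events, realistic_events)
```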

Model 5: Pattern Analysis

"What distinguishes a £30k event from a £3k event?"

  • Ongoing scoring of all areas (A/B/C/D tiers), refreshed as new data arrives
  • Predictive factors ranked by correlation strength
  • Area scoring model for any UK postcode, before you've ever been there

Model 6: Event Attendance Predictor (New)

"How many customers will show up at this specific event?"

  • Predicts footfall for each individual event based on: Facebook ad spend, time of year, number of RSVPs, leaflet distribution, local population, previous area performance, whether it's a repeat visit
  • Operational decisions it drives: whether to send a 3rd dealer, how much cash float is needed, whether an assistant is required
  • Can also be used to evaluate potential new target areas before committing
New data needed: Individual event Facebook spend, RSVP counts, leaflet distribution numbers. To be captured in Jotform or provided via Google Sheets.

Model 7: Estimate vs Actual Tracker (New)

"How accurate are each dealer's estimates, and who's gaming the system?"

  • Tracks estimation accuracy per dealer across three categories: gold (superseded by gold dealer price within ~6 days), silver (longer lead time), and SKU items (tracked to actual sale, can take up to 9 months)
  • Replaces the blanket 30% reduction with per-dealer accuracy profiles. Fair to honest estimators, catches those inflating
  • Estimation accuracy as a component of the dealer performance score
  • Trend monitoring: is a dealer's accuracy improving or deteriorating over time?
Why this matters: The current blanket 30% reduction penalises accurate estimators and rewards those who inflate. With a 9-month lag on SKU results, honest dealers get their commission cut unfairly while gamers learn to just add 30%. Per-dealer calibration fixes this.
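A minimal sketch of the per-dealer calibration idea: instead of a blanket 30% reduction, take the median actual-to-estimate ratio from each dealer's history and apply that. The sample figures below are hypothetical; real inputs would come from the gold dealer price and SKU sale linkage described above.

```python
from statistics import median

# Sketch: per-dealer estimate calibration to replace the blanket 30% cut.
# Sample figures are hypothetical; real data comes from the database.

# (estimate, actual realised value) pairs per dealer
history = {
    "AB": [(100, 98), (200, 195), (150, 149)],   # accurate estimator
    "CD": [(100, 70), (300, 210), (150, 105)],   # consistently ~30% high
}

def calibration_factor(pairs):
    """Median actual/estimate ratio; multiply future estimates by this."""
    return median(actual / estimate for estimate, actual in pairs)

for dealer, pairs in history.items():
    print(dealer, round(calibration_factor(pairs), 2))
```

The median makes the factor robust to one-off misses, and tracking it over time gives the trend monitoring bullet above for free.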

Model 8: Rota Tool (Live, Bounty)

"Can a machine do the rota as well as Liam?"

Shipped: Phase 1 (rota analysis) and Phase 2 (CP-SAT optimiser) are live. The tool generates optimised dealer assignments, before/after comparison, and CSV export.

Scheduling tool that balances Liam's 10 prioritised goals:

  1. Fill all seats at every event
  2. Minimise average distance travelled, keeping standard deviation low (even spread across dealers is better than low average with outliers)
  3. Minimise road trips. Only use them when they help goals 1-2
  4. Keep travel time under 90 minutes wherever possible
  5. Match each dealer's ideal number of events per month
  6. Match each dealer's ideal number of events per week
  7. Fill the 3rd seat at twin events and high-traffic roadshows
  8. Get inexperienced dealers into 3rd seats for training
  9. Limit road trips to ≤2/month for willing dealers, ≤1/month for reluctant, zero for those with exemptions
  10. Respect individual dealer constraints (family, availability, preferences)
Commercial model: Delivered on a bounty basis. If the tool fully replaces manual scheduling: £1,500/month. If it handles the majority but still needs manual adjustment: £750/month. Assessment after a 4-week trial period.

C. Platform & AI Interface

Dashboards, recurring analysis pages, and an AI tool your team can ask questions to directly.

Self-Serve AI Query Tool

"Ask any question about your data in plain English."

  • Claude-powered conversational interface embedded on majiai.co
  • Connected to the live database, so answers use current data
  • Usable by anyone on the team; no technical skills required
  • Example queries: "Which areas in Yorkshire are ready for a return visit?", "How did Matt Case perform last month vs his average?", "Show me the top 20 untapped postcodes near Leeds"
  • Token usage costs included in the monthly fee (budgeted for normal operational use)

Fixed Analysis Pages

  • Recurring analyses that Liam has asked for, published as fixed pages on the platform that automatically refresh as new data comes in
  • Event performance by month, dealer leaderboard, area rankings, capacity dashboard, attendance predictions
  • Each page updates on a schedule (daily/weekly), no manual effort
  • Accessible via the password-protected WeBuyVintage portal at majiai.co

Data Pipeline

  • Data source → PostgreSQL automated daily sync. Currently Google Sheets, but the pipeline will be built to adapt if WBV moves to a different system (CRM, ERP, or custom tooling)
  • Jotform webhook integration for real-time event data capture
  • Validation rules on ingest: dealer initials, currency parsing, date checks, anomaly flagging
  • Commodity price feed for ongoing normalisation
  • External data refresh (ONS, house prices, demographic updates)

4. Approach & Methodology

How we turn the raw data into something you can actually rely on.

Commodity Normalisation

Gold is ~80% of your revenue and has risen 2.4x since 2023. Silver (~9%) has risen 3.1x. Any analysis using raw revenue numbers is measuring commodity markets, not your operational performance. We build a blended price index weighted by each event's actual gold/silver/other revenue mix, then normalise all revenue to constant commodity terms. This helps separate true seasonality, area quality, and dealer performance from metal price movement. It's worth noting that higher gold prices also attract more sellers, so operational growth and commodity growth are interlinked. The normalised view is a guide, not a clean split.
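A minimal sketch of the normalisation described above. The index values mirror the figures quoted (gold 2.4x, silver 3.1x since the 2023 baseline); the event revenue mix is illustrative.

```python
# Sketch of commodity normalisation: deflate event revenue by a blended
# price index weighted by that event's gold/silver/other revenue mix.
# Index values mirror the quoted figures; the event itself is illustrative.

# Price indices relative to the 2023 baseline (baseline = 1.0).
price_index = {"gold": 2.4, "silver": 3.1, "other": 1.0}

def normalise(revenue_by_category):
    """Return revenue restated in constant (2023) commodity terms."""
    total = sum(revenue_by_category.values())
    # Blend the index by each category's share of this event's revenue.
    blended = sum(
        (rev / total) * price_index[cat]
        for cat, rev in revenue_by_category.items()
    )
    return total / blended

event = {"gold": 8000, "silver": 900, "other": 1100}  # £10k raw revenue
print(round(normalise(event)))  # same event in constant 2023 terms
```

A gold-heavy £10k event in 2026 deflates to well under half that in constant terms, which is exactly why raw revenue comparisons across years are misleading.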

Per-Customer Economics

The profit tree is framed per customer, not per event. An event's revenue is footfall x margin per customer. We use median margin per customer (not average) to avoid distortion from outlier transactions. This lets you see whether your 2026 events are genuinely better, or whether you just have more customers coming through the door because of commodity prices.
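The median-vs-average point is easy to see with numbers. The margins below are made up for illustration: five typical customers and one outlier transaction.

```python
from statistics import mean, median

# Why median margin per customer, not the mean: a single outlier
# transaction distorts the average. Figures are illustrative.

margins = [12, 15, 9, 14, 11, 850]  # £ margin per customer; one big outlier

print(round(mean(margins)))   # dragged far above the typical customer
print(median(margins))        # robust view of the typical customer
```

One £850 transaction pulls the average to over ten times the typical customer's margin, while the median barely moves.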

First-Visit Baseline

The first visit to any area is the cleanest signal of that area's potential, before repeat effects or marketing build-up. By normalising first-visit margin per customer across all areas, we build a "gold seam" map showing where the richest untapped stock sits, regardless of how many times you've been there.

Backtesting & Validation

Every model is tested against held-out data before going live. We hold out the most recent 6 months, train on history, predict the held-out period, and measure accuracy. We also cross-validate across regions to make sure models work everywhere, not just in the areas they were trained on. Every recommendation comes with a confidence score so your team knows how much weight to give it.
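The split described above can be sketched as follows. The events, cutoff date, and naive baseline model are illustrative; the real backtests use the full event history and the production models.

```python
from datetime import date

# Sketch of the backtest split: train on history, hold out the most
# recent ~6 months, then measure prediction error on the held-out period.
# Events, cutoff, and the baseline model are illustrative.

events = [
    (date(2025, 3, 1), 4200),
    (date(2025, 9, 10), 5100),
    (date(2026, 1, 15), 4800),
    (date(2026, 3, 2), 5300),
]
cutoff = date(2025, 10, 1)  # ~6 months before the latest event

train = [e for e in events if e[0] < cutoff]
holdout = [e for e in events if e[0] >= cutoff]

# Naive baseline: predict the training-set average for every event,
# then score it on the held-out period with mean absolute error.
prediction = sum(r for _, r in train) / len(train)
mae = sum(abs(r - prediction) for _, r in holdout) / len(holdout)
print(len(train), len(holdout), round(mae))
```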

External Data Enrichment

  • ONS Census 2021: population, age profiles, home ownership, household composition
  • Land Registry: house prices by postcode (proxy for "old money" areas)
  • Index of Multiple Deprivation: finds areas that are economically deprived but may still have legacy wealth in stock
  • Mosaic.tech: consumer segmentation to find where your ideal customer profiles cluster
  • Rural/urban classification: market town identification, village network mapping

5. Delivery Plan

16-week phased delivery, with the highest-value strategic work front-loaded.

Phase 1 - Weeks 1-4

Strategic Intelligence

The priority work. "What makes a good area" analysis and new target lists delivered by end of Week 1. KPI workshops and remaining strategic work through Weeks 2-4.

  • Week 1: "What Makes a Great Area" deep analysis + postcode scoring model
  • Week 1: New area target lists (towns 2.5-20k + city suburbs 30k+)
  • Weeks 2-4: KPI workshops + data schema design
Phase 2 - Weeks 3-7

Data Engineering

Proper pipeline replacing the PoC's one-off CSV import.

  • Data source integration (Google Sheets initially, flexible to whatever WBV moves to)
  • PostgreSQL deployment with production schema
  • Data validation, reconciliation rules, and audit trails
  • Historical data backfill through the production pipeline
  • External data enrichment (ONS, Mosaic, house prices, deprivation indices)
Phase 3 - Weeks 5-11

Model Validation & Tuning

Take the PoC prototypes to production. Test them properly, tune them, add confidence scores.

  • Backtest all 8 models against held-out data
  • Cross-validation across regions and time periods
  • Add confidence intervals to all model outputs
  • Edge-case handling (new venues, new dealers, sparse data)
  • Estimate vs actual model build (new, requires gold dealer price linkage)
  • Attendance predictor model build (new, requires FB spend + RSVP data)
  • Model performance benchmarks documented
  • Rank event performance by month (commodity-normalised, with % drop-off from peak)
  • Rank counties (performance index controlling for visit frequency and commodity prices)
  • Ideal regional split at 400 events/month
  • Top factors for a good event (venue type, day of week, dealer combination, seasonality, marketing spend)
  • Best venue types analysis
Phase 4 - Weeks 9-14

Platform Build & AI Interface

Web dashboard + self-serve AI query tool on majiai.co.

  • Fixed analysis pages (auto-refreshing with new data)
  • Self-serve Claude AI query interface
  • Event calendar, dealer leaderboard, area explorer, capacity dashboard
  • Jotform webhook integration for real-time data flow
  • Password-protected client portal
Phase 5 - Weeks 14-16

Testing, Training & Handover

  • UAT with ops team
  • Team training on all platform features and AI query tool
  • Documentation: user guide, data dictionary, model methodology
  • Automated weekly email digest setup (areas ready for revisit, dealer flags, opportunities)
  • Go-live
Ongoing - from Month 1

Data Consultancy & Maintenance

  • Monthly model retraining on latest data
  • Quarterly model performance review and tuning
  • Up to 4 hours/month ad-hoc analysis and support
  • Platform hosting, data pipeline monitoring, bug fixes
  • AI query interface token costs
  • Scheduling automation development (bounty - runs in parallel)

6. Investment

Build Fee

£20,000
Payable in 4 monthly instalments of £5,000 + VAT

Covers: all strategic analyses, 8 production models, data pipeline, platform build, AI interface, training & handover. Invoiced monthly April–July 2026.

Monthly Service Fee

£2,000/month
36-month term from June 2026

Covers: cloud hosting (~£100/month), automated data sync, monthly model retraining, AI query tokens (~£100/month for moderate use), 4 hours/month support, quarterly reviews, pipeline maintenance, bug fixes.

Token budget is based on moderate usage (50-80 queries/day across the team). If usage grows significantly we'd revisit, but this covers normal operational use comfortably. Hosting covers PostgreSQL database and web application infrastructure.

Scheduling Automation Add-On

  • Full replacement - the tool completely replaces manual scheduling: £1,500/month + VAT
  • Assisted - handles the majority of scheduling, still needs manual tweaks: £750/month + VAT

Assessment after a 4-week trial period. Bounty model: you only pay if it works. Development runs in parallel with the main platform build.

Total Contract Value

Year 1: £20,000 build + £14,000 monthly (7 months Jun–Dec) = £34,000

3-Year Total: £20,000 build + £72,000 monthly (36 months) = £92,000

Against ~£28m annual gross margin, even a 1% improvement in event yield would mean £280,000 in additional annual margin: more than 8x the Year 1 investment, and roughly 3x the full 3-year cost, in a single year.


7. What We Need From You

  • Data source access - read-only access to your event data, currently in Google Sheets. If you move to a different system during the engagement, we'll adapt the pipeline accordingly.
  • Jotform account access - to add required fields (category dropdowns, postcode validation, dealer ID standardisation) and set up webhooks.
  • Facebook ad spend data - per-event spend, RSVP counts, and leaflet distribution numbers. Can be added to Jotform or provided via a spreadsheet.
  • 30 minutes with your ops lead - to understand the current scheduling process, dealer preferences, and pain points.
  • Scheduling constraints document - dealer home postcodes, availability patterns, ideal event counts, road trip preferences, and any exemptions.
  • Confirm hosting approach - we'll discuss the best option during discovery.

8. Next Steps

  • Review & sign-off on this scope of work and the accompanying services agreement.
  • First invoice - £5,000 + VAT, April 2026. Work begins immediately.
  • Data access - credentials for your data sources and Jotform login provided within first week.
  • Strategic intelligence delivered by end of Week 4. Area analysis, target lists, and county rankings in your hands within the first month.

This scope of work is valid for 30 days from the date above. All fees are exclusive of VAT. This document should be read in conjunction with the Data Analytics & Platform Services Agreement which sets out the formal commercial terms.