WordPress Taxonomy Pipeline Pt.1: Fetch Tags + spaCy NER in Python & Docker
Part 1 of 2 | AI-Powered WordPress Taxonomy Migration
In this video I walk through the first half of a 4-step Python pipeline
that semantically enriches WordPress tags using NLP and knowledge graphs —
all running locally in Docker.
What you will see:
Step 1 — Connect Python directly to a WordPress MySQL database and fetch
100 taxonomy terms (post_tag) with post counts. No plugin required — raw
SQL via Python.
Command used:
python source/pipeline/001_step_1_list_tags_categories_wp.py \
--limit 100 --taxonomy post_tag --no-dry-run
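For readers who want to see what Step 1 boils down to, here is a minimal sketch of the kind of query such a script might run. It is not the code from the repository: it assumes the default wp_ table prefix, and the names TERMS_SQL and fetch_terms are hypothetical. The function accepts any DB-API connection, for example one created with mysql.connector.connect(...).

```python
# Sketch only: TERMS_SQL and fetch_terms are hypothetical names, and the
# "wp_" table prefix is an assumption (WordPress lets you change it).

TERMS_SQL = """
SELECT t.term_id, t.name, t.slug, tt.count
FROM wp_terms AS t
JOIN wp_term_taxonomy AS tt ON tt.term_id = t.term_id
WHERE tt.taxonomy = %s
ORDER BY tt.count DESC
LIMIT %s
"""


def fetch_terms(conn, taxonomy="post_tag", limit=100):
    """Return terms of one taxonomy as dicts, most-used first."""
    cur = conn.cursor()
    try:
        # Parameters are passed separately so the driver escapes them.
        cur.execute(TERMS_SQL, (taxonomy, limit))
        rows = cur.fetchall()
    finally:
        cur.close()
    keys = ("term_id", "name", "slug", "count")
    return [dict(zip(keys, row)) for row in rows]
```

The count column in wp_term_taxonomy is where WordPress stores the number of posts attached to each term, so no extra join on the posts table is needed for this listing.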
Step 2 — Run spaCy Named Entity Recognition (NER) on each tag to classify
them automatically: is “Amazon” a company (ORG), a location (GPE), or a
product (PRODUCT)? NER answers that without any manual labelling.
Command used:
python source/pipeline/002_step_2_spacy_ner.py \
--limit 100 --taxonomy post_tag --no-dry-run
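The idea behind Step 2 can be sketched in a few lines. This is an illustration under stated assumptions, not the repository script: load_model and classify_tag are hypothetical names, and keeping the longest entity when spaCy returns several is my own simplification.

```python
# Sketch only: load_model and classify_tag are hypothetical helpers,
# not functions from the pipeline repository.

def load_model(name="en_core_web_md"):
    """Load a spaCy pipeline (requires spaCy and the model to be installed)."""
    import spacy  # imported lazily so classify_tag has no hard dependency
    return spacy.load(name)


def classify_tag(nlp, tag):
    """Return the NER label found for a tag, or None if no entity is detected."""
    doc = nlp(tag)
    if not doc.ents:
        return None
    # Assumption: if several entities are detected, keep the longest span.
    longest = max(doc.ents, key=lambda ent: len(ent.text))
    return longest.label_
```

With the model installed, calling classify_tag(load_model(), "Amazon") returns whatever label the model assigns. Tags are short strings with no surrounding sentence, so some of them may come back as None.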
Step 3 starts — Wikidata enrichment kicks off for 100 tags. Because the
Wikidata API queries take several minutes to complete, I leave Step 3
running and pick up the results in Part 2 (video 001_b).
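To give a feel for what a Wikidata lookup per tag can look like, here is a hedged sketch using the public wbsearchentities action of the Wikidata API. Whether the actual Step 3 script uses this endpoint is an assumption on my part; build_search_url, first_match, lookup and the User-Agent string are all hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Sketch only: the helper names and the User-Agent value are hypothetical.
API_URL = "https://www.wikidata.org/w/api.php"


def build_search_url(term, language="en", limit=1):
    """Build a wbsearchentities URL for one tag."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def first_match(payload):
    """Extract id, label and description of the first hit, or None."""
    hits = payload.get("search", [])
    if not hits:
        return None
    hit = hits[0]
    return {
        "id": hit["id"],
        "label": hit.get("label"),
        "description": hit.get("description"),
    }


def lookup(term, timeout=10):
    """Query Wikidata for one tag (performs a network call)."""
    request = urllib.request.Request(
        build_search_url(term),
        headers={"User-Agent": "tags-enrichment-sketch/0.1 (hypothetical)"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return first_match(json.load(response))
```

A loop that calls lookup once per tag sends its requests one after another, which is one plausible reason why enriching 100 tags takes minutes rather than seconds.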
Stack used:
– Python 3.9 managed via Anaconda (conda env: tags_treatment)
– Docker + WordPress + phpMyAdmin running locally on ports 8080 / 8081
– spaCy NER model: en_core_web_md
– MySQL connector for direct WordPress DB access
Why this matters for SEO:
Most WordPress sites accumulate taxonomy chaos — hundreds of tags with
zero semantic meaning attached. This pipeline is the first step toward
converting raw taxonomy strings into structured, AI-readable breadcrumb
proposals. That is the foundation for GEO (Generative Engine
Optimization): making your content legible to LLMs and AI agents, not
just to search-engine crawlers.
Full article: https://wp.me/p3Vuhl-3rb
GitHub: https://github.com/bflaven/ia_usages/tree/main/ia_seo_ia_semantic_breadcrumb_webmcp
Continue in Part 2 (001_b): Wikidata results + breadcrumb JSON +
WordPress plugin demo.
Tags: AI-generated, Anaconda, artificial intelligence, P.O, PO, POC, Product Owner, Python, WordPress, Workflow
Categories: Agile, Anaconda, Development, Experiences, Python, Tutorials, Videos
