Modernizing my 150-line Python search engine: Yahoo! dumps -> Hugging Face 🤗
A few years ago I wrote a full-text search engine in 150 lines of Python. The Wikipedia data source it relied on has since been discontinued, and the tooling around it was showing its age. I wanted to (finally) write a follow-up about semantic search, but I realized that I had to get the old repository in a working state first. It’s now using Hugging Face (🤗) datasets, uv, ruff, pytest, and GitHub Actions, without touching the core search logic. ...