The web is a horrible data source

Screaming At My Screen by Timo Zimmermann

As I was working on new tools for my personal assistant this week, I ended up adding web search, and with it fetching the content of a website, to the toolbox. Automating anything web-related in 2026 is a pretty annoying experience, and with good reason: considering the state of the web, it is understandable that website owners add scraping safeguards. Luckily, when you build tools to behave well, things are manageable.

There are two parts to adding "the web" as a data source to an LLM, whether it is a personal assistant or a coding agent. First you need to find the information you are looking for, then you need to get the data in a format your LLM likes.

I was considering using Kagi's search API, now that it is available. I might actually start using it when I put my research agent together, but for now $12 for 1000 requests is a bit steep, especially as long as I do not know how wild my assistant and agent will go when presented with the opportunity to slurp in infinite information.

Exa was…
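To make the two-part split described above concrete (find the information, then get it into a format an LLM likes), here is a minimal sketch. It is not the author's implementation. The names `web_lookup`, `html_to_text`, and `search_cost_usd` are hypothetical, and the `search` and `fetch` callables are placeholders for whatever search API and HTTP client you end up using. The HTML-to-text step uses only Python's standard `html.parser` and is deliberately crude.

```python
from html.parser import HTMLParser
from typing import Callable, Iterable


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Part 2: turn raw HTML into plain text, one text fragment per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def web_lookup(
    query: str,
    search: Callable[[str], Iterable[str]],  # hypothetical: query -> URLs
    fetch: Callable[[str], str],             # hypothetical: URL -> HTML
    limit: int = 3,
) -> dict[str, str]:
    """Part 1 finds URLs, part 2 fetches each one and converts it to text."""
    urls = list(search(query))[:limit]
    return {url: html_to_text(fetch(url)) for url in urls}


def search_cost_usd(requests: int, usd_per_1000: float = 12.0) -> float:
    """Plain arithmetic on the $12 per 1000 requests price quoted in the post."""
    return requests * usd_per_1000 / 1000
```

Passing `search` and `fetch` in as arguments keeps the sketch independent of any particular provider, and makes it easy to cap `limit` while you are still finding out how many requests an assistant actually issues.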