10 hours ago · 7 min read · 1340 words · Tech

The work that led to this is a research project that I will soon publish on aifoc.us, where I am analysing whether the presence of a URL in a prompt influences the output, based on the latent "knowledge" the model holds about that URL. For this project I needed to test a large number of URLs to see whether their data was in the model, and I hit a heap of problems. The more time I spent in that data, the more I realised I couldn't answer a basic question about a large share of the traffic fetching those URLs: when one of the many agents, indexers, and scrapers loads a page, what does it actually do? Does it download the HTML and stop? Parse the CSS? Follow the font linked from inside that CSS? Does it run the JavaScript, or just fetch the .js file and move on?

So I built ua-tracer to answer it. I think it will be useful for many web developers trying to understand what any browser or bot does when it accesses their site. Click through any row in that list and you land on the…
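To make the question concrete, here is a minimal sketch of how one could sort user agents by how deep they go into a page. It is not how ua-tracer works internally. It assumes a hypothetical request log of `(user_agent, path)` pairs and a hypothetical `/beacon` endpoint that only gets requested when the page's JavaScript actually runs; the file names and the `classify` helper are made up for illustration.

```python
from collections import defaultdict

# Ordered stages of "how far did this client go". The path patterns and the
# /beacon endpoint are assumptions for this sketch, not part of any real tool.
STAGES = [
    ("html", lambda p: p.endswith("/") or p.endswith(".html")),
    ("css", lambda p: p.endswith(".css")),
    ("font", lambda p: p.endswith((".woff", ".woff2"))),
    ("js_fetched", lambda p: p.endswith(".js")),
    ("js_executed", lambda p: p.startswith("/beacon")),
]


def classify(requests):
    """Map each user agent to the list of stages it reached, in STAGES order."""
    seen = defaultdict(set)
    for user_agent, path in requests:
        path = path.split("?", 1)[0]  # ignore the query string
        for stage, matches in STAGES:
            if matches(path):
                seen[user_agent].add(stage)
    return {
        ua: [name for name, _ in STAGES if name in stages]
        for ua, stages in seen.items()
    }


# Hypothetical log entries: one scraper that stops at the HTML,
# and one full browser that loads everything and runs the script.
sample_log = [
    ("ExampleScraper/1.0", "/"),
    ("ExampleBrowser/2.0", "/"),
    ("ExampleBrowser/2.0", "/style.css"),
    ("ExampleBrowser/2.0", "/fonts/body.woff2"),
    ("ExampleBrowser/2.0", "/app.js"),
    ("ExampleBrowser/2.0", "/beacon?src=js"),
]

for ua, stages in classify(sample_log).items():
    print(ua, "->", ", ".join(stages))
```

The beacon is what separates "fetched the .js file" from "ran it": a client that only downloads the script never requests the URL that the script itself would call.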
