3 days ago · Tech

This project continues my previous autoresearch project, which optimized a reranking model to fit under 10 MB. Digging deeper by hand, I took the size reduction much further while still outperforming reranking models that are 30x larger. In the end, I reduced the payload from 11.4 MB to 2.79 MB gzipped, making it roughly 4x smaller. You can see it in action on my resume page. Each square represents 1 kB. Most of the overall size reduction came from removing the ORT dependency, while the other changes enabled much better representation quality than the baseline.
