SQLite on Git, Part II: Unlocking Zlib's Lesser-Known Feature


15 days ago · Tech

In the previous post, we followed the white rabbit down into Git's .git folder and learned that Git uses zlib's deflate algorithm to compress objects.

The problem: each range of zlib-compressed data depends on all the data before it. To access a single row of a 1 GB SQLite database compressed with zlib, you have to read the whole file [1].

In this article we're going to build a mental model of how zlib compresses data, and we're going to use Z_FULL_FLUSH to compress data in a way that stays compatible with Git and gives us random access to data in objects stored in Git.

What has happened so far

If you're just joining us, here's the quick recap:

Prologue: We looked at random access, what it is, and why it is essential for running SQLite databases on top of Git's storage.
Part 1: We explored Git's .git folder, learned how loose objects are stored, and discovered that Z_FULL_FLUSH could be the key to enabling random access.

Let's look behind the curtain of zlib, the compression library used by Git's…
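To make the Z_FULL_FLUSH idea from the introduction concrete, here is a minimal Python sketch using the standard zlib module. It is not the code from this series: the function names compress_with_flush_points and read_chunk, the chunking scheme, and the offsets list are assumptions made for illustration. It also leaves out Git's loose-object header and only shows the compression side of the idea.

```python
import zlib

# Hypothetical helper (not from the article): compress chunks into ONE zlib
# stream and issue a Z_FULL_FLUSH after every chunk. A full flush pads the
# output to a byte boundary and resets the compressor's history window, so the
# data that follows does not refer back to anything before the flush point.
def compress_with_flush_points(chunks, level=6):
    comp = zlib.compressobj(level)
    out = bytearray()
    # A zlib stream without a preset dictionary starts with a 2-byte header,
    # so the deflate data of the first chunk begins at offset 2.
    offsets = [2]
    for chunk in chunks:
        out += comp.compress(chunk)
        out += comp.flush(zlib.Z_FULL_FLUSH)
        offsets.append(len(out))  # start of the next chunk
    out += comp.flush()  # Z_FINISH: final block plus Adler-32 trailer
    return bytes(out), offsets


# Hypothetical helper: decompress only chunk `index`, without touching the
# bytes before it. Mid-stream there is no zlib header, so we ask for raw
# deflate by passing negative wbits.
def read_chunk(stream, offsets, index):
    raw = zlib.decompressobj(wbits=-zlib.MAX_WBITS)
    return raw.decompress(stream[offsets[index]:offsets[index + 1]])


if __name__ == "__main__":
    pages = [b"page-0 " * 100, b"page-1 " * 100, b"page-2 " * 100]
    stream, offsets = compress_with_flush_points(pages)

    # The result is still an ordinary zlib stream for any normal reader.
    assert zlib.decompress(stream) == b"".join(pages)

    # Random access: read the last chunk directly from its offset.
    assert read_chunk(stream, offsets, 2) == pages[2]
```

The trade-off in this sketch is that every flush point costs a few extra bytes and throws away the compression history, so smaller chunks give finer-grained access at the price of a worse compression ratio. The offsets also have to be stored somewhere outside the stream, since zlib itself does not record them.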
