This post is the convergence of two ideas that have been floating around in my head for about a year: can we learn messy stochastic models, then use algorithmic and algebraic tools to rein in their complexity and make them interpretable?

A notebook with a visual TLDR is available as an export of the marimo notebook. You can import the raw notebook into molab to try it out yourself.

The first idea comes from a problem I kept running into at work.

Boosting

I spent most of my career working in fraud detection, a space dominated by XGBoost and similar techniques. XGBoost approximates a function by training small decision trees that incrementally chip away at a residual. No single tree does a great job on its own, but together they pick up each other's slack, each contributing a piece of the whole picture.

There's an old parable about blind men touching different parts of an elephant. One feels the trunk and says it's a snake, another feels the leg and says it's a tree, a third feels the side and says…
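The residual-chipping loop described above can be sketched in a few lines. This is a minimal, illustrative toy, not the post's notebook or XGBoost's actual algorithm: the weak learners are depth-1 regression stumps, and the function names (`fit_stump`, `boost`) are my own.

```python
# Toy gradient boosting: each round fits a depth-1 stump to the current
# residuals, then adds a damped copy of its prediction to the ensemble.

def fit_stump(xs, residuals):
    """Pick the split threshold that best reduces squared error."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boost(xs, ys, rounds=50, lr=0.3):
    """Additively combine stumps; each one corrects the ensemble so far."""
    pred = [0.0] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, pred)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        pred = [p + lr * stump(x) for p, x in zip(pred, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

# Each stump alone is a crude step function, but the ensemble
# tracks a smooth target like y = x^2 on training data.
xs = [i / 10 for i in range(-20, 21)]
ys = [x * x for x in xs]
model = boost(xs, ys)
```

Each individual stump only "feels" one split of the input, like one blind man at the elephant; the learning rate `lr` keeps any single stump from claiming too much of the picture.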