I tried understanding the gist of the paper, and I’m not really convinced there’s anything meaningful here. It just looks like a variation of the transformer architecture inspired by biology, but no real innovation or demonstrated results.> BDH is designed for interpretability. A