Thanks Ariye. What does group risk think about this paper?
I imagine these metrics would be good to include in the MI but are you confident that the methods being proposed are adequate to convince regulators on both sides of the Atlantic?
Thank you for reading. One of the main reasons we've written the paper is to help with model validation of LLM usage in our highly regulated industry. We are also engaging with regulators.
The industry at the moment is mostly using closed-source vendor models that are very hard to validate or interpret. We are pushing to move to models with open-source weights, where we can apply our interpretability methods.
Current validation approaches are still very behavioral in nature, and we want to move them into the world of mechanistic interpretability.