Hacker News

Because the problem is not "get a list of what a user can access" but "the AI that was trained on a dataset must not leak it to a user who doesn't have access to it."

There is no feasible way to track that during training (at least not yet), so the only current solution would be to train the AI agent only on data the user can access, and that is costly.



Who said it must be done during training? Most enterprise data is accessed after training, via RAG or MCP tool calls. I can see how the techniques I mentioned above could be applied during RAG (in vector stores adopting Apache Accumulo's visibility-label ideas) or in MCP servers (MCP OAuth + RFC 8693 OAuth 2.0 Token Exchange + Zanzibar/Biscuit to faithfully replicate the authz constraints of the systems the data is retrieved from).
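To make the Accumulo idea concrete: each stored chunk carries a visibility expression, and retrieval results are filtered against the querying user's authorization set before they ever reach the model. The sketch below is a simplified illustration, not Accumulo's actual parser (real Accumulo requires parentheses when mixing `&` and `|`; here `&` simply binds tighter), and the document/label names are invented for the example:

```python
import re

def visible(expr: str, auths: set[str]) -> bool:
    """Evaluate a simplified Accumulo-style visibility expression
    ('&' = AND, '|' = OR, parentheses for grouping) against a user's
    authorization set. Each bare token is true iff it is in `auths`."""
    tokens = re.findall(r"[A-Za-z0-9_-]+|[&|()]", expr)
    pos = 0

    def parse_or() -> bool:
        nonlocal pos
        val = parse_and()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            rhs = parse_and()       # always consume, even if val is True
            val = val or rhs
        return val

    def parse_and() -> bool:
        nonlocal pos
        val = parse_atom()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            rhs = parse_atom()      # always consume, even if val is False
            val = val and rhs
        return val

    def parse_atom() -> bool:
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            val = parse_or()
            pos += 1                # consume the closing ')'
            return val
        return tok in auths

    return parse_or()

def filter_hits(hits: list[dict], auths: set[str]) -> list[dict]:
    """Drop retrieved chunks whose visibility label the user fails,
    before they are handed to the LLM as context."""
    return [h for h in hits if visible(h["visibility"], auths)]

# Hypothetical retrieval results from a vector store:
hits = [
    {"text": "Q3 revenue memo", "visibility": "finance&(us|eu)"},
    {"text": "Public FAQ",      "visibility": "public"},
]
user_auths = {"finance", "us"}
allowed = filter_hits(hits, user_auths)
```

The same check belongs server-side, at the retrieval layer, so a prompt-injected model cannot bypass it: the unauthorized chunks never enter the context window at all.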



