2023-05-24 13:40:44
On the Impossible Safety of Large AI Models
The hype around LLMs has spread beyond NLP-related fields into the everyday work of professionals in many other areas. However, I personally have not seen a single use case where a model reaches 100%, or 99.999%, or even 99.9% accuracy.
Theoretical proof that it is impossible to build an arbitrarily accurate AI model:
https://arxiv.org/abs/2209.15259
Why? TL;DR:
* User-generated data: it is mostly unverified and potentially highly sensitive.
* High-dimensional memorization: want a better score on more data? You need far more parameters. But the set of possible contexts is limitless, so do we need an infinite number of parameters? The complexity of “fully satisfactory” language processing might be orders of magnitude larger than today’s LLMs, in which case we may still obtain greater accuracy with larger models.
* Highly heterogeneous users: the distribution of texts generated by one user greatly diverges from that of another user. More users means more data and more contexts, which are difficult to fully capture and generalize over.
* Sparse heavy-tailed data per user: even for a single user, the data is not dense enough to generalize from. We should expect an especially large empirical heterogeneity in language data, as the samples we obtain from a user can completely stand out from the user’s language distribution.
As a result, LAIM (large AI model) training is unlikely to be easier than mean estimation. The usual ML objective can be viewed as estimating a distribution; in the simplest case the distribution is assumed to be normal and all we need to estimate is its mean. How many combinations of such distributions can we realistically predict?
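To get a feel for why even plain mean estimation is hard with unverified data, here is a toy sketch. It is not the construction from the paper: the function names, the 90/10 split between honest and poisoned users, the poison value and the dimensions are all assumptions chosen for illustration. Honest users send samples from a standard normal distribution with true mean zero, poisoned users send a fixed large vector, and we compare the plain average with a coordinate-wise median.

```python
import math
import random
import statistics


def l2_error(estimate):
    """Distance of the estimate from the true mean (the zero vector)."""
    return math.sqrt(sum(x * x for x in estimate))


def simulate(dim, n_honest=90, n_poisoned=10, poison_value=100.0, seed=0):
    """Return (error of plain mean, error of coordinate-wise median).

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # Honest users: one sample each from N(0, 1) in every coordinate.
    honest = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_honest)]
    # Poisoned (unverified) users: the same large value in every coordinate.
    poisoned = [[poison_value] * dim for _ in range(n_poisoned)]
    columns = list(zip(*(honest + poisoned)))
    mean_est = [sum(col) / len(col) for col in columns]
    median_est = [statistics.median(col) for col in columns]
    return l2_error(mean_est), l2_error(median_est)


for dim in (1, 10, 100):
    mean_err, median_err = simulate(dim)
    print(f"dim={dim:3d}  mean error={mean_err:8.3f}  median error={median_err:6.3f}")
```

In this toy setup the plain mean is dragged far away by the poisoned users, while the median holds up much better. The median is still biased in every coordinate, though, so its total error keeps growing as the dimension increases.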
+ We need to find a balance between accuracy and privacy.
Pretty challenging task. Will we be able to solve it anyway?