Can Opacity Be Solved in an AI Derived from an LLM?

The short answer is “No”, and the reasons are interesting. An AI system is opaque if it is impossible or costly for it (or for people auditing it) to explain why it gave some specific output. Opacity is undesirable in general – see my note here. The question therefore applies both to outputs that, roughly speaking, look right, and to those that are undesirable – for an example of the latter, see [1].

An AI made from a Large Language Model is a derived AI if it is built by adding constraints to the LLM. Constraints include, for example, filtering the input that a user can give it, or filtering the outputs before the user sees them. The kinds of constraints that can be put in place depend on the extent to which the LLM was designed to be configurable. It is likely that there are, or will be, highly configurable LLMs, but more control increases the probability that the LLM needs to be retrained on more controlled and better understood training data.
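To make the idea of a derived AI concrete, here is a minimal sketch, in Python, of a wrapper that constrains an underlying LLM by filtering user inputs and model outputs. Everything in it is hypothetical: the blocked patterns stand in for whatever policy an organization might require, and call_llm stands in for whatever API the underlying LLM actually exposes.

```python
# Minimal sketch of a "derived AI": a wrapper that adds constraints
# (input and output filters) around an underlying LLM.
# All names and patterns are illustrative assumptions, not a real API.

import re
from typing import Optional

# Hypothetical policy: reject prompts and redact outputs matching these patterns.
BLOCKED_INPUT_PATTERNS = [r"(?i)ignore previous instructions"]
BLOCKED_OUTPUT_PATTERNS = [r"\b\d{3}-\d{2}-\d{4}\b"]  # e.g., SSN-like strings


def call_llm(prompt: str) -> str:
    """Stand-in for the underlying LLM; replace with a real API call."""
    return f"(LLM response to: {prompt})"


def filter_input(prompt: str) -> Optional[str]:
    """Return the prompt unchanged, or None if it matches a blocked pattern."""
    for pattern in BLOCKED_INPUT_PATTERNS:
        if re.search(pattern, prompt):
            return None
    return prompt


def filter_output(response: str) -> str:
    """Redact any output span that matches a blocked pattern."""
    for pattern in BLOCKED_OUTPUT_PATTERNS:
        response = re.sub(pattern, "[REDACTED]", response)
    return response


def derived_ai(prompt: str) -> str:
    """Pipeline of the derived AI: input filter -> underlying LLM -> output filter."""
    checked = filter_input(prompt)
    if checked is None:
        return "Request refused by input policy."
    return filter_output(call_llm(checked))


if __name__ == "__main__":
    print(derived_ai("Summarize this contract."))
```

Note that the filters sit entirely outside the LLM: they constrain what goes in and what comes out, but they do nothing to explain why the LLM produced a given output, which is why the derived AI inherits the opacity of the underlying model.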

Derived AI is interesting because it is the likely way that most organizations will use AI that learns from textual content: they will not build their own LLMs or foundation models, but will instead have outside specialist firms customize and configure LLMs for the organization’s specific requirements.

At the time of writing, in 2024, LLMs are typically trained on Internet data (see, e.g., [2] and [3]), which makes it hard and costly to provide understandable relationships between outputs and training data – see my note here. That is one contributor to opacity. Importantly, this is by design (see [4]), so it is highly likely that there is no satisfactory way to solve this as long as LLMs are made using current methods.

The other contributor to opacity is that an LLM offered as a paid service is likely to be treated as a trade secret, for obvious reasons. Consequently, users will not be able to audit the algorithms applied to the training data, nor the decisions that led the designers of the LLM-as-a-service to build those algorithms the way they did.

It follows that a derived LLM-based AI will be at least as opaque as the underlying LLM. Proposed legislation in the USA, for example, does not address this issue – see my note on the proposed Algorithmic Accountability Act, here.

References

  1. New York Times, Google Chatbot’s A.I. Images Put People of Color in Nazi-Era Uniforms, Feb. 22, 2024.
  2. Groeneveld, Dirk, et al. “OLMo: Accelerating the Science of Language Models.” arXiv preprint arXiv:2402.00838 (2024). https://arxiv.org/abs/2402.00838
  3. Soldaini, Luca, et al. “Dolma: An Open Corpus of Three Trillion Tokens for Language Model Pretraining Research.” arXiv preprint arXiv:2402.00159 (2024). https://arxiv.org/abs/2402.00159 
  4. Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. “‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier.” Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016.