Can LLM AI Be a Source of Competitive Advantage?

Let’s start with the optimistic “yes”, and see if it remains acceptable. Before we get carried away, a few reminders. For an LLM to be a source of competitive advantage, it needs to be a resource that enables products or services of a firm “to perform at a higher level than others in the same…

What Is the Depth of Expertise of an AI Training Dataset?

I use “depth of expertise” as a data quality dimension of AI training datasets. It describes how much of the expertise in a knowledge domain a dataset reflects. This is not a common data quality dimension in other contexts, and I haven’t seen it used as such in discussions of, say, quality of data used for…

AI for the sake of AI :-) L’art pour l’art

Just as l’art pour l’art, or art for the sake of art, was the bohemian creed in the 19th century, it looks like there’s now an “AI for the sake of AI” creed when building general-purpose AI systems based on Large Language Models. Let’s say that the aim for a sustainable business is happy, paying,…

Black Box Approach to AI Governance

As currently drafted (2024), the Algorithmic Accountability Act does not require the algorithms and training data used in an AI System to be available for audit. (See my notes on the Act, starting with the one here.) The way that an auditor learns about the AI System is from documented impact assessments, which involve descriptions…

Ambiguity of “Artificial Intelligence”

Artificial Intelligence, if incorrectly defined, is even more confusing than it needs to be. Sometimes, it is considered a technology, which itself is problematic: is it a technology on par with database management systems, for example, which are neutral with respect to the data they are implemented to manage in their specific instances? Or, is it…

Which Problems Is It Hard to Design AI for?

The less data there is, or the lower the quality of the data that is available, the more difficult it is to build AI based on statistical learning. For scarce data domains, the only way to design AI is to elicit knowledge from experts, design rules that represent that knowledge, parameterize them so that they apply to…

Perplexing Secrecy of AI Designs

If AI is made for profit, then should its design be confidential? This choice is part of AI product strategy. The decision depends on at least the following. What is the relationship of each of these to AI confidentiality? Correctness: The more likely the AI / algorithm is to make errors, the more…

Can Opacity Be Solved in an AI Derived from an LLM?

The short answer is “No”, and the reasons for it are interesting. An AI system is opaque if it is impossible or costly for it (or people auditing it) to explain why it gave some specific outputs. Opacity is undesirable in general – see my note here. So this question applies for both those outputs…

Opaque, Complex, Biased, and Unpredictable AI

Opacity, complexity, bias, and unpredictability are key negative nonfunctional requirements to address when designing AI systems. Negative means that if you have a design that reduces opacity, for example, relative to another design, the former is preferred, all else being equal. The first thing is to understand what each term refers to in general, that…

Valuation of an AI Training Dataset

If there is a market for AI training datasets, then the price will be determined by supply and demand. How does the supplier set the price, and how does the buyer evaluate if the price is right? The question behind both of these is this: how to estimate the value of a training dataset? We…

AI Growth through Expert Communities

In the creator economy, the creative individual sells content. The more attention the content captures, the more valuable it is. The incentive for the creator is status and payment for consumption of their content. Distribution channels are Internet platforms, where content is delivered as intended by the author; the platform does not transform it (other…

What Does a Training Data Market Mean for Authors?

If any text can be training data for a Large Language Model, then any text is a training dataset that can be valued through a market for training data. Which datasets have high value? Wikipedia, StackOverflow, Reddit, and Quora are examples that have value for different reasons, that is, because they can be used to train…

Preconditions for a Market for High Quality AI Training Data

There is no high quality AI without high quality training data. A large language model (LLM) AI system, for example, may seem to deliver accurate and relevant information, but verifying that may be very hard – hence the effort put into explainable AI, among others. If I wanted accurate and relevant legal advice, how much risk…

AI Compliance at Scale via Embedded Data Governance

There are, roughly speaking, three problems to solve for an Artificial Intelligence system to comply with AI regulations in China (see the note here) and likely future regulation in the USA (see the notes on the Algorithmic Accountability Act, starting here):  Using available, large-scale crawled web/Internet data is a low-cost (it’s all relative) approach to…

Can an Artificial Intelligence Trained on Large-Scale Crawled Web Data Comply with the Algorithmic Accountability Act?

If an artificial intelligence system is trained on large-scale crawled web/Internet data, can it comply with the Algorithmic Accountability Act?  For the sake of discussion, I assume below that (1) the Act is passed, which it is not at the time of writing, and (2) the Act applies to the system (for more on applicability,…

Does the EU AI Act apply to most software?

Does the EU AI Act apply to most, if not all, software? It is probably not what was intended, but it may well be the case. The EU AI Act, here, applies to “artificial intelligence systems” (AI system), and defines AI systems as follows: ‘artificial intelligence system’ (AI system) means software that is developed with…

What is AI Governance for?

If an AI is not predictable by design, then the purpose of governing it is to ensure that it gives the right answers (actions) most of the time, and that when it fails, the consequences are negligible, or that it can only fail on inconsequential questions, goals, or tasks.

Machine/AI as Inventor? Notes on Thaler v. USPTO

Can “an artificial intelligence machine be an ‘inventor’ under the Patent Act”? According to the Memorandum Opinion filed on September 2, 2021, in the case 1:20-cv-00903, the US Patent and Trademark Office (USPTO) requires that the inventor be one or more people [1]. An “AI machine” cannot be named an inventor on a patent that…