How Do Data Availability and Cost Relate to AI Differentiation?

When someone pitches me an #ai/#MachineLearning idea, I always (also) ask about #data availability, data cost, and how they relate to their product differentiation and #aitechnology. Here’s how I see them, roughly speaking. #strategy #AIstrategy #AIeconomics pic.twitter.com/v6yb8JOHwi
— ivanjureta (@ivanjureta) February 19, 2018
There are, roughly speaking, three problems to solve for an Artificial Intelligence system to comply with AI regulations in China (see the note here) and likely future regulation in the USA (see the notes on the Algorithmic Accountability Act, starting here): Using available, large-scale crawled web/Internet data is a low-cost (it’s all relative) approach to…
In a previous note, here, I wrote that one of the requirements for Generative AI products/services in China is that if a product or service uses data that contains personal information, the consent of the holder of that personal information needs to be obtained. It seems self-evident that this needs to be a requirement. It is also not…
The less data there is, or the lower the quality of the available data, the more difficult it is to build AI based on statistical learning. For data-scarce domains, the only way to design AI is to elicit knowledge from experts, design rules that represent that knowledge, parameterize them so that they apply to…
In the creator economy, the creative individual sells content. The more attention the content captures, the more valuable it is. The incentive for the creator is status and payment for consumption of their content. Distribution channels are Internet platforms, where content is delivered as intended by the author; the platform does not transform it (other…
IP compliance requirements on generative AI reduce the amount of training data that is readily and cheaply available, with a few consequences for how product development and product operations are done.