
The Five Key Takeaways of This Blog

  • Most major tech companies are updating the terms and conditions for their products and services (one example being Adobe’s creative suite) regarding A.I. data collection. These changes reflect the companies’ plans to use the data those products and services generate, which can help train the companies’ A.I. systems. 
  • Publicly available data is already easy pickings for most U.S.-based tech companies interested in A.I. data collection. Social media posts on public accounts are one example of this. 
  • Private data is stuff like your (and, if you are a business owner, your employees’ and clients’/customers’) personal emails and text messages. 
  • For business owners, this means that these tech companies will most likely train A.I. systems on publicly available data, and, in some cases, potentially on your company’s private data stored on certain tech companies’ platforms. This could make your trade secrets and other sensitive information much more vulnerable to outsider access. 
  • Data collection practices tend to be much more lax in the United States than elsewhere. For instance, the European Center for Digital Rights filed enough complaints to prevent Meta (which owns Facebook and Instagram) from using public social media posts from its platforms for data scraping purposes. 
Training Day

Why do the biggest tech companies stay so dominant in the A.I. race? Why haven’t (relatively) smaller companies with game-changing A.I. products managed to topple the Goliaths? 

One reason is that A.I. is incredibly expensive to run. So those smaller companies end up taking funding from the large companies, which in exchange gain some degree of control over the smaller company. Quid pro quo is the name of the game here. 

Another is that those large tech companies, a good number of which have been operating for several decades at this point, are sitting on an unbelievable amount of data. 

And data is the fuel that powers A.I. systems: if you do not give an A.I. system quality data, it will not give you quality results. 

That data can be split into two different classes: publicly available data and privately available data. 



What Is Publicly Available Data?

If you read the Key Takeaways section above, then you know that most tech companies, at least in America, are already well on their way to scraping users’ publicly available data. 

This data is the kind of stuff that you could easily find online. Your company’s videos on social media platforms, for instance, would be an example of this. 

This content can be useful for training generative A.I. platforms, which need visual and text material to learn how to generate such content. 

Some companies work hard to develop a distinctive brand aesthetic in both their written and visual content. Having that aesthetic become fodder for an A.I. that may regurgitate parts of it may understandably upset some business owners. 

What Is Privately Available Data?

This really is the forbidden fruit that most big tech companies are looking to feed to their data-hungry A.I. systems. 

Private data is what you could not just find with a quick Google search. You can find a celebrity’s public Instagram posts, yes. But you could not just gain similarly instant access to that celebrity’s text messages between friends and family. Not without some kind of hacking know-how, at least. 

But why are tech companies so eager to tilt the rules and regulations in their favor? 

Think, for instance, of just how many personal emails and text messages are sent every minute. That could include confidential communications between your employees and clients. 

How much more powerful would an A.I. system be if it was able to feast on that massive amount of content? 

Quite powerful. Adobe’s recent controversy even offers an example of this. 

The Adobe Controversy

Perhaps the most useful example that this writer could pull out for this blog’s readers is that of Adobe’s bold terms and conditions update. 

An update, it must be said upfront, that Adobe quickly chose to backpedal from in light of quite a bit of controversy from all corners of the creative community. 

What Adobe tried in its update was, essentially, to tell users of its products that the company would have outright access to users’ content for the sake of training A.I. 

The company had to back down from this stance because enough people interpreted it as Adobe giving itself permission to, whether automatically or manually, pilfer users’ creative content for the sake of training its generative A.I. platform, Firefly. 

For this writer, at least, the content that someone, even at a company, creates in an Adobe platform and does not share online (say, a private graphic design project) should be considered private. 

So, as a business owner, imagine that one of your employees was using Adobe Photoshop to create a visual graphic for the next team meeting. On that graphic is sensitive information that should be kept internal. It should be worrying, then, that tech companies like Adobe may try to access that content and add it to a pool of data used to train a content-generating A.I. system.