Let’s be honest: we’re in the middle of a gold rush. Not for minerals or precious metals, but for your data. Since the release of ChatGPT and competing models, we’ve seen a level of hype around artificial intelligence that rivals, and now exceeds, the hype and investment of the dot-com boom. The difference is that the dot-com boom built up over several years, while AI has reached and surpassed that level in just one. Everything is suddenly “AI-powered.” Every day, I get yet another invite to a webinar claiming to have added AI to pick-your-acronym (WMS, YMS, ERP, and so on).
As an old-school AI researcher and developer, and as part of a company focused on delivering enterprise solutions, I’ve had to confront a reality that few are talking about, and fewer have the experience to address: data containment.
Our industry has talked about data governance for years, but containment — where your data lives, who touches it, and how it’s exposed — is the less glamorous yet far more critical issue, and it grows more urgent as AI adoption accelerates.
Years ago — long before large language models reached today’s sophistication — we asked ourselves how best to architect our AI-powered solutions without sacrificing data control. The easiest route would have been to follow the SaaS playbook: build a minimum viable product (MVP), push everything to the cloud, ship quick features, send your customers’ data to an LLM API such as OpenAI’s, and start touting AI capabilities while iterating over time. Many providers operate under this exact model today, with little thought given to all of this operational data of YOURS that they are forwarding to train models.
That is not the enterprise path.
With a background building solutions for highly controlled environments like defense and healthcare — both governed by strict federal regulations for data privacy and, some would say, extreme controls over system testing and data protection — our analysis and experience drove us to adopt a data-protection-first philosophy, and that led us to design on-premises-first solutions. Your data stays in your environment, inside your operational systems; we avoid sending data to our systems, and most definitely never send it to a third-party data broker.
Contrast this with what’s happening across the industry today: in the rush to bolt on AI features, many providers simply send your operational data to a large language model (LLM) API, receive a response, and call it “AI-powered.” In many cases, there’s little to no real processing happening inside their own solution. That new “AI” feature in a Yard Management System? It may be nothing more than your sensitive data being exported to train someone else’s model — with minimal or no added value to your business.
The truth is: many of these features don’t even require LLMs to begin with. For example — and without making this piece about our own solutions — years ago we developed an OCR machine-learning classifier that converts documents to text, classifies them, and extracts structured data. All of that processing happens entirely on-device, inside your network, under your full control. No external APIs. No cloud transfers. No third-party exposure.
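To make the containment point concrete, here is a minimal, purely illustrative sketch of local document classification. This is an assumption for illustration, not our actual product: it uses a trivial keyword-scoring approach in place of a trained ML model, and it assumes text has already been extracted by a local OCR engine. What matters is the property it demonstrates: every step runs in-process, inside your network, with no external API calls.

```python
from collections import Counter
import re

# Hypothetical keyword profiles per document type; a real on-device
# classifier would be a trained model, but the containment property is
# identical: nothing leaves the local environment.
PROFILES = {
    "invoice": {"invoice", "total", "due", "amount", "bill"},
    "bill_of_lading": {"shipper", "consignee", "carrier", "freight", "lading"},
    "purchase_order": {"purchase", "order", "quantity", "vendor", "po"},
}

def classify(text: str) -> str:
    """Score locally extracted OCR text against each profile and return
    the best-matching document type. Runs entirely in-process: no
    external APIs, no cloud transfers, no third-party exposure."""
    tokens = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {
        doc_type: sum(tokens[w] for w in words)
        for doc_type, words in PROFILES.items()
    }
    return max(scores, key=scores.get)

sample = "Invoice #1042: total amount due upon receipt of this bill."
print(classify(sample))  # invoice
```

A production system would swap the keyword profiles for a trained classifier, but the architecture — and the guarantee that your documents never leave your network — stays the same.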
Yet increasingly, we see SaaS providers announcing similar features — except that their implementations require shipping your documents out to an LLM and returning the results. It’s not necessary. The technology exists to keep your data fully contained while still delivering intelligent automation. But the reality is that this approach is simple for a startup: call an LLM API and claim you’ve delivered AI functionality.
Unfortunately, the trend is now moving in the opposite direction: your data is rapidly becoming the next fuel source for commercial LLMs. Analysts have for years been talking about the value of enterprise data, and yet many companies are being enticed to hand over their operational data — essentially for free — in exchange for buzzwords and shiny AI add-ons. LLM providers argue that you’re receiving value in return. In reality, you are giving away far more than you’re gaining.
As widely reported, large language model providers like OpenAI have consumed much of the public internet to train their models. What’s left? Your internal operational data. This is the next prize. Any LLM provider that secures large volumes of enterprise data will become exponentially more valuable — but that value will accrue to the model provider, not to you.
A Data Coup: The Grok AI Example
One little-known but significant development occurred during the U.S. government’s DOGE initiative: the integration of Elon Musk’s Grok AI platform into government systems. As Grok-powered tools became part of daily workflows, large volumes of previously protected government data were introduced into Grok’s training environment — data not generally accessible to the public and unavailable to competing LLMs. Over time, this exclusive access may give Grok a distinct advantage as its models are trained on information other providers cannot obtain.
Protect Your Valuables. Deploy with Purpose — and with Control.
AI models remain in their infancy. Your operational data is extremely valuable — both to your business and as fuel for future LLM growth. Yet typical security audits still focus only on technical controls: whether data is encrypted at rest or in transit. That’s not the real question.
The real question enterprise leaders need to ask is:
Why is my TMS, WMS, or ERP provider sending operational data to a third-party LLM or data broker at all?
Data is your competitive advantage.
By keeping your operational data fully contained and securely leveraged inside your enterprise, you maintain control, preserve strategic advantage, and protect one of your most valuable corporate assets.
Allow the market to mature. Allow LLM training to occur on others’ data. Be deliberate in how you adopt AI — prioritize solutions that maintain full data containment and privacy.
Eventually, enterprise-grade LLMs will emerge. But in the meantime, don’t give away your data to help build someone else’s model.
References
- Apple@Work Podcast: Using Apple in logistics and manufacturing
- AI Firms will soon exhaust most of the internet’s data
Phillip Avelar is a managing partner at Advanced Solutions, based in Chicago. He works with SAP enterprises to optimize their supply chains, increase productivity, and challenge the status quo. He shares his passion for solving customers’ problems in his blog posts, industry articles, and conference talks.