Before you try any AI tool: 5 questions to ask about your data

A short, practical checklist to run through before pasting project files, client lists, or contracts into any AI tool.

Lotfy · 26 April 2026 · 4 min read

The fastest way to get burned by AI in contracting is also the most common: paste a sensitive document into a free chatbot, get a useful answer, and walk away thinking nothing happened. Something almost always happens. You just don't see it.

Most AI tools today are powered by external services — large language models hosted in the cloud, owned by someone other than you. When you paste text in, that text travels somewhere. Where exactly, what gets stored, and who can see it later: that's the part most people don't check before hitting Enter.

Here are the five questions I run through every time, before any project document, client name, contract clause, or pricing figure goes near an AI tool.

1. Where does my input go?

Read the tool's privacy page or terms of service. You're looking for the answer to one specific question: which company's servers receive the text you type in?

If the answer is unclear, treat it as a red flag. The big providers (OpenAI, Anthropic, Google, Microsoft) are mostly upfront about it. Smaller "AI for X" wrappers built on top of those are sometimes vague — and "vague" usually means "the data goes to one of the big providers via our middle layer, and we don't want to spell that out."

A useful follow-up: in which country are those servers located? For projects under Saudi Arabia's PDPL, residency can matter.

2. Will my input be used to train future models?

This is the one that catches teams out the most.

Many free tiers, by default, allow the provider to use your inputs as training data. That means clauses from your private contract could end up subtly influencing how the model talks to other users — or, in worst-case scenarios, get reproduced verbatim if someone asks the right question.

The fix is usually one of:

  • Switch to a paid tier that explicitly opts you out of training (most major providers offer this).
  • Toggle a "data controls" or "improve the model for everyone" setting off in your account.
  • Use the API instead of the chat product — API traffic is generally not used for training by default.

Whichever path you take, document the choice somewhere your team can see it.
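To make the API-versus-chat distinction concrete, here is a minimal sketch of what an API request looks like when you assemble it yourself. The endpoint and field names follow OpenAI's public chat-completions API as one example; other providers use different shapes, so treat the details as illustrative and confirm your provider's current documentation and data-usage policy before relying on it. No request is actually sent here.

```python
import json

# Illustrative only: payload shape follows OpenAI's public
# chat-completions API. Other providers differ. Nothing is sent.
API_URL = "https://api.openai.com/v1/chat/completions"

def build_request(prompt: str, model: str = "gpt-4o") -> dict:
    """Assemble the JSON body you would POST along with your API key."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Serialise the body exactly as it would travel to the provider's servers.
body = json.dumps(build_request("Summarise this anonymised clause: ..."))
```

The point of seeing the request spelled out like this: with the API, you know exactly what leaves your machine, and the major providers' API terms generally exclude that traffic from training by default.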

3. Is the input retained, and for how long?

Even when your input isn't used for training, it's often retained — kept on the provider's servers for some window of time, for things like abuse monitoring, debugging, or legal hold.

A common retention window is 30 days, but windows vary. Some enterprise plans offer zero-retention modes. Some smaller tools retain inputs indefinitely without telling you.

Ask: if a retention period exists, who can read this data during that window? The honest answer is usually "a small number of company employees and any law enforcement that compels us." That might be acceptable for a site safety checklist. It probably isn't for your client's BOQ (bill of quantities).

4. What happens if I close my account?

Test the offboarding story before you onboard.

A good AI vendor lets you:

  • Export your conversation history.
  • Delete your data on demand, with a clear timeline.
  • Confirm deletion in writing.

A worrying vendor either makes deletion hard, makes it slow ("up to 90 days"), or quietly keeps a copy "for legal purposes." Treat that the same way you'd treat a subcontractor who won't sign a clear scope of works: with caution.

5. Is what I'm pasting actually mine to paste?

This one is rarely on the privacy page, because it's not a tech question — it's a contract question.

Many of the documents that flow through a contracting business aren't entirely yours. Client BOQs are usually owned by the client. Subcontractor proposals are owned by the subcontractor. Tender documents often have explicit confidentiality clauses banning third-party processing.

Before you paste, ask: does my contract with the document's owner permit me to send their data to a third party? If you don't know, the safe assumption is no. Either redact and anonymise the document first, or use a tool that runs locally on your machine without uploading anything externally.
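A crude first pass at redaction can be automated. The sketch below replaces a few obvious identifier types with placeholders before a document goes anywhere near an external tool. The patterns and placeholder names are my own assumptions for illustration; they are nowhere near a complete anonymiser, and person names in particular still need a manual pass.

```python
import re

# Hypothetical redaction pass: patterns and placeholders are assumptions,
# tuned loosely to documents in a KSA contracting business. Extend the
# list to match the identifiers that actually appear in your files.
REDACTIONS = [
    # Company names of the form "Alpha Contracting Co"
    (re.compile(r"\b[A-Z][a-z]+ (?:Contracting|Trading|Holding) Co\b"), "[COMPANY]"),
    # Saudi riyal figures like "SAR 120,000"
    (re.compile(r"\bSAR\s?[\d,]+(?:\.\d+)?\b"), "[AMOUNT]"),
    # KSA mobile numbers like "+966512345678"
    (re.compile(r"\+?9665\d{8}\b"), "[PHONE]"),
    # Email addresses
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
]

def redact(text: str) -> str:
    """Replace client names, prices, and contact details with placeholders."""
    for pattern, placeholder in REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text
```

Note what this deliberately does not catch: personal names, project names, and anything phrased unusually. Treat a script like this as a seatbelt, not a substitute for reading the document before you paste it.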

A short rule of thumb

If you wouldn't paste it into a public Google Doc, don't paste it into a free AI tool either. Both can end up somewhere you didn't intend.

The good news: most of the genuinely useful AI work in contracting can be done with anonymised inputs, paid tiers with proper data controls, or local-first tools. The convenience tax is small. The risk reduction is large.

If you want to dig further, the next post in this series will look at the specific settings you can flip on the major providers to make them safer to use day-to-day.

Lotfy

Engineer · Contracting · Riyadh, KSA
