Are we taking sufficient steps to safeguard client information, PII, PHI, and legal strategy? Consider the following questions and recommendations:
What Counts as Defensible Anonymization for the State Bar of Michigan
From a cautious, SBM‑compliant angle, we want something closer to HIPAA‑style de‑identification than informal anonymization:
HIPAA’s expert‑determination standard requires that an expert conclude there is a “very small risk” that the information could be used, alone or with reasonably available data, to identify an individual.
Applied to legal facts, that means systematically stripping or tokenizing: names, identifiers, unique dates, specific dollar figures, highly specific locations, rare job titles, and any combination that makes the fact pattern obviously about a particular client/matter.
Best practice that’s emerging: a technical anonymization layer that intercepts text before it hits the AI, replaces all sensitive entities with deterministic tokens (e.g., CLIENT_A, COMPANY_X, DATE_1), and then reverses the mapping locally after the AI responds. In that workflow, the provider never sees the underlying identities at all.
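The intercept-tokenize-reverse workflow described above can be sketched in a few lines. This is a minimal illustration, not a production anonymizer: the `Anonymizer` class and its entity lists are hypothetical, and a real deployment would pair this with NER and curated, matter-specific dictionaries rather than hand-supplied literals.

```python
class Anonymizer:
    """Minimal sketch of a local, reversible tokenization layer.
    Hypothetical illustration: entity lists are supplied by hand here;
    a production system would detect entities automatically."""

    def __init__(self):
        self.forward = {}   # original value -> token (deterministic per session)
        self.reverse = {}   # token -> original value
        self.counters = {}  # per-category counters (CLIENT_1, CLIENT_2, ...)

    def _token(self, category, value):
        # Reuse the same token for a repeated value so references stay consistent.
        if value not in self.forward:
            n = self.counters.get(category, 0) + 1
            self.counters[category] = n
            token = f"{category}_{n}"
            self.forward[value] = token
            self.reverse[token] = value
        return self.forward[value]

    def anonymize(self, text, entities):
        """entities: mapping of category -> list of literal strings to replace.
        Longer strings are replaced first to avoid partial-match collisions."""
        for category, values in entities.items():
            for value in sorted(values, key=len, reverse=True):
                text = text.replace(value, self._token(category, value))
        return text

    def deanonymize(self, text):
        """Reverse the mapping locally after the AI responds.
        Longer tokens first, so CLIENT_10 is not clobbered by CLIENT_1."""
        for token, value in sorted(self.reverse.items(),
                                   key=lambda kv: len(kv[0]), reverse=True):
            text = text.replace(token, value)
        return text


anon = Anonymizer()
prompt = ("Draft a demand letter for Jane Doe against Acme Corp "
          "re the 2024-03-01 incident.")
masked = anon.anonymize(prompt, {
    "CLIENT": ["Jane Doe"],
    "COMPANY": ["Acme Corp"],
    "DATE": ["2024-03-01"],
})
# masked == "Draft a demand letter for CLIENT_1 against COMPANY_1 re the DATE_1 incident."
# The provider sees only `masked`; `anon.deanonymize(response)` restores names locally.
```

The key property is that the mapping never leaves the local machine, so the provider sees tokens only and the round trip is lossless.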
Even with strong anonymization, what we still must check:
Even if we’re comfortable that the text is non‑identifiable:
Is User still exposing confidential legal strategy or work product?
Privilege and MRPC 1.6 cover more than just “who the client is.” Sharing unique litigation strategy, settlement posture, or internal risk assessments with a third‑party AI can still be a confidentiality issue, even if the person is anonymized.
What do the provider’s terms say?
Michigan’s AI FAQ stresses that many AI tools “utilize the information entered to learn,” so inputs may be stored and regurgitated. If the provider keeps and reuses “anonymized” data, that can still prejudice clients or expose work product, even if they can’t easily attach a name.
Could a subpoena or breach hurt User's clients in any way?
Ethics and practice‑management guidance points out that if opposing counsel could subpoena the AI provider for “all prompts related to industry X mergers in 2025,” users might still be uncomfortable: even without explicit names, strategies or deal structures could be reconstructed.
So anonymization is necessary but not sufficient; we still need a risk assessment of what users are actually revealing and under what contractual/security regime.
Practical, Conservative Rule Set
If we want a clean, defensible line for Users:
Category 1 – Truly anonymized & generic:
- Matter is converted via a robust token‑based system or equivalent, operated locally.
- No combination of facts would reasonably allow identification of the client or matter by an outsider.
- Users are not transmitting unique strategy or internal mental impressions, just generic drafting/structuring questions. → Using mainstream AI tools here is relatively low risk, though you should still prefer providers with no‑training and strong security terms.
Category 2 – Anonymized but fact‑specific or strategy‑rich:
- User has stripped names, but the fact pattern is unusual, high‑profile, or includes distinctive dollar amounts/timing; or you’re discussing concrete litigation/negotiation strategy. → Treat as still confidential; only use if (a) provider is on an enterprise, no‑training, contractually locked‑down tier, and (b) you’d be comfortable explaining the use to the client and to a judge.
Category 3 – Identifiable or sensitive by context:
- The story is obviously about a particular person or organization (local public figure, single major employer in a small town, etc.), or involves particularly sensitive categories (health, crime, immigration, harassment with rare facts). → Do not send to general‑purpose cloud AI at all; use local/self‑hosted models or specialized legal AI tools with SOC2‑grade assurances and explicit no‑retention/no‑training language.
How are we currently “anonymizing” and “de‑anonymizing” client information for legal research and writing, client communications, etc. (manual edits, search‑and‑replace, a script, or something like a proxy layer)? Whatever the mechanism, it should be able to pass a HIPAA‑style “very small risk” test and align with the recent privilege cases, even though our users are unlikely to be covered entities under HIPAA.