PII guard for Claude Code to keep customer data out of context | noirdoc


@robert_douglass No, a dictionary would be too brittle. It uses a mix of tools.

Presidio handles regex-based PII (IBAN, email, tax IDs). For names, we use three NER models, all local:

  • The NER component of spaCy’s de_core_news_lg pipeline (called via Presidio)

  • Flair’s de-ner-large (dedicated NER model, run as a separate pass; catches the “Schmidt, Lisa” comma form and lowercase legal text)

  • GLiNER (zero-shot; add custom entity types at runtime without retraining)

Each NER model fails differently, so the three vote together, because the union of their results has higher recall than any single model.
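To make the union idea concrete, here is a minimal sketch in plain Python. It is not noirdoc's actual code and it does not call spaCy, Flair, or GLiNER: the `Span` type, the `union_spans` and `redact` helpers, and the hard-coded detector outputs are all hypothetical, chosen only to show how spans from several detectors could be merged so that a name found by any one of them gets masked.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Span:
    """A detected entity (hypothetical type, for illustration only)."""
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str   # e.g. "PERSON"
    source: str  # which detector(s) reported it


def union_spans(results: Iterable[Iterable[Span]]) -> List[Span]:
    """Take the union of all detector outputs, fusing overlapping spans."""
    ordered = sorted((s for r in results for s in r),
                     key=lambda s: (s.start, s.end))
    merged: List[Span] = []
    for s in ordered:
        if merged and s.start < merged[-1].end:
            last = merged[-1]
            sources = "+".join(sorted(set(last.source.split("+")) | {s.source}))
            # Simplification: the fused span keeps the first span's label.
            merged[-1] = Span(last.start, max(last.end, s.end),
                              last.label, sources)
        else:
            merged.append(s)
    return merged


def redact(text: str, spans: List[Span]) -> str:
    """Replace each (sorted, non-overlapping) span with a <LABEL> placeholder."""
    out, pos = [], 0
    for s in spans:
        out.append(text[pos:s.start])
        out.append(f"<{s.label}>")
        pos = s.end
    out.append(text[pos:])
    return "".join(out)


text = "Schmidt, Lisa hat Herrn Max Weber angerufen."

# Hard-coded, made-up outputs standing in for the three models.
spacy_hits = [Span(24, 33, "PERSON", "spacy")]    # misses the comma form
flair_hits = [Span(0, 13, "PERSON", "flair"),
              Span(24, 33, "PERSON", "flair")]
gliner_hits = [Span(28, 33, "PERSON", "gliner")]  # only the surname

merged = union_spans([spacy_hits, flair_hits, gliner_hits])
redacted = redact(text, merged)
print(redacted)  # <PERSON> hat Herrn <PERSON> angerufen.
```

In this sketch a single detector is enough to mask a span, which favors recall over precision. A stricter scheme, such as requiring two of three detectors to agree, would trade some of that recall for fewer false positives.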

