Hi Hacker News. I've been working on a new project that I'd like to share: Anonymize_Excel.
Anonymize_Excel.py is a Python script that anonymizes an Excel file and synthesizes new data in its place.
It uses Microsoft Presidio to achieve this.
Currently it can recognize and synthesize the following data types: Name, Phone Number, Location, Email, Date/Time, Credit Card Number. Support for more data types is planned.
Natural-language recognition is handled by spaCy.
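
For anyone curious what "recognize, then synthesize" looks like in practice, here is a rough sketch of the general idea. It is not the script's actual code: simple regular expressions stand in for Presidio and spaCy so that it runs with only the standard library, a list of rows stands in for the Excel sheet, and every name in it (anonymize_rows, fake_email, and so on) is made up for illustration. It only covers emails and one US-style phone format.

```python
import random
import re

# Hypothetical stand-in recognizers: plain regexes instead of Presidio/spaCy.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")


def fake_email(rng):
    """Synthesize a random address on the reserved example.com domain."""
    user = "".join(rng.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(8))
    return f"{user}@example.com"


def fake_phone(rng):
    """Synthesize a random phone number in the form 555-NNN-NNNN."""
    return f"555-{rng.randint(100, 999)}-{rng.randint(1000, 9999)}"


def anonymize_cell(value, rng):
    """Replace every detected email and phone number in a single cell."""
    if not isinstance(value, str):
        return value  # numbers, dates, etc. pass through untouched
    value = EMAIL_RE.sub(lambda m: fake_email(rng), value)
    value = PHONE_RE.sub(lambda m: fake_phone(rng), value)
    return value


def anonymize_rows(rows, seed=None):
    """Anonymize a sheet given as a list of rows (each row a list of cells)."""
    rng = random.Random(seed)
    return [[anonymize_cell(cell, rng) for cell in row] for row in rows]


# Stand-in for a sheet loaded from Excel.
rows = [
    ["Contact", "Amount"],
    ["Reach me at jane.doe@corp.org or 212-555-0199", 42],
]
clean = anonymize_rows(rows, seed=0)
print(clean)
```

Passing a seed makes the synthesized values repeatable, which is handy for testing. A real recognizer such as Presidio also handles names, locations, and dates, which regexes like these cannot do reliably.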
Hope this helps someone!
---
Before this project, I had never worked with Microsoft Presidio, Pandas, spaCy, or Excel from Python. This has been a great learning experience!