One of my many responsibilities at work is contacting all the users found in the various data breach lists that our InfoSec team get their hands on (typically, these are the same lists that eventually make their way to HIBP). Not surprisingly, there is a significant amount of overlap between some of these lists, and one of the things I do is make sure that I do not contact users about passwords I’ve already talked to them about.
(A quick aside here: to prevent further breach of PII, I split the email addresses and passwords into separate Excel spreadsheets, and add them to an encrypted archive when not working with them – the archive is stored on our internal file server, which has been certified for this use by our InfoSec team.)
One of the reasons I use Excel for this purpose is that it has some pretty decent data validation tools, conditional formatting among them. Somewhat unintuitively, conditional formatting is found under the Home tab rather than the Data tab, where all of the other data validation and manipulation tools live, which is why I often have to hunt for it.
At any rate, under the Home tab, choose Conditional Formatting, then Highlight Cells Rules, and then Duplicate Values. Once I’ve identified the duplicate values and removed them from the new data set, I notify the users before using the Remove Duplicates tool (found under the Data tab) to remove the duplicates.
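For anyone who would rather script the comparison than click through Excel, the same duplicate check can be sketched in a few lines of Python. This is only a rough illustration of the idea, not the Excel process described above: the function name `split_new_and_seen`, the `normalize` parameter, and the sample addresses are all made up for the example, and it assumes each list has already been loaded into memory as plain strings.

```python
def split_new_and_seen(previous, incoming, normalize=str.strip):
    """Split `incoming` into entries not seen before and entries that repeat.

    `previous` holds the entries already handled in earlier lists.
    `normalize` decides what counts as "the same" entry; the default only
    trims surrounding whitespace, so the comparison stays case-sensitive.
    """
    seen = {normalize(entry) for entry in previous}
    new_entries, repeats = [], []
    for entry in incoming:
        key = normalize(entry)
        if key in seen:
            repeats.append(entry)      # already contacted, or repeated in this list
        else:
            new_entries.append(entry)  # first time this entry shows up
            seen.add(key)              # so later repeats in `incoming` are caught too
    return new_entries, repeats


# Hypothetical sample data, for illustration only.
previous = ["alice@example.com", "bob@example.com"]
incoming = [
    "bob@example.com",
    "carol@example.com",
    "carol@example.com ",  # trailing space, still treated as a repeat
    "dave@example.com",
]

to_notify, already_handled = split_new_and_seen(previous, incoming)
print(to_notify)
print(already_handled)
```

The set lookup plays the role of the Duplicate Values highlight, and adding each new key to `seen` as it goes covers what the Remove Duplicates tool would otherwise clean up afterwards. If the entries being compared are email addresses, passing `normalize=lambda s: s.strip().lower()` may be a reasonable choice; for passwords, the case-sensitive default is the safer assumption.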